publications

Publications by category in reverse chronological order. Generated by jekyll-scholar.

2026

  1. MSE-Break: Steering Internal Representations to Bypass Refusals in Large Language Models
    Ashwin Saraswatula, Pranav Balabhadra, and Pranav Dhinkar
    International Conference on Machine Learning (ICML) Actionable Interpretability, 2025; under review at ICLR Main Track, 2026
  2. Data Whitening Improves Sparse Autoencoder Learning
    Ashwin Saraswatula and David Klindt
    Association for the Advancement of Artificial Intelligence (AAAI) XAI4Science, 2026
  3. Bridging The Von Neumann Gap: Why LLMs Haven’t Made Novel Discoveries
    Ashwin Saraswatula
    Association for the Advancement of Artificial Intelligence (AAAI) XAI4Science, 2026