2026 Submitted to COLM 2026
S. Pandey, et al.
SELFDOUBT: Uncertainty Quantification for Reasoning LLMs via the Hedge-to-Verify Ratio
- Proposed an O(1) black-box uncertainty framework that extracts behavioral hedge/verify signals from reasoning traces, significantly outperforming Semantic Entropy on discrimination (p=0.001) at 10× lower cost; a zero-hedge gate achieved 96.1% precision across 7 models and 3 benchmarks.
2026 Submitted to UAI 2026
S. Raghu, S. Pandey
Don't Blink: Evidence Collapse during Multimodal Reasoning
- Identified a universal evidence-collapse phenomenon in reasoning VLMs, observing visual attention drops of up to 90.8% during generation and a task-conditional failure regime in which confident but visually disengaged predictions are hazardous on sustained visual-reference tasks but benign on symbolic tasks.
Under Review at the Journal of Systems and Software
S. Pandey, et al.
Repair of Thought: Advancing Automated Program Repair through a Dual-Model Reasoning Framework
- Introduced a function-level APR framework achieving a state-of-the-art 83.1% plausible-repair rate on Defects4J, with an automated verification pipeline combining AST alignment, control-flow symbolic analysis, and semantic checks.