Reliable Control-Point Selection for Steering Reasoning in Large Language Models
Published on arXiv, 2026
We find that 93.3% of keyword-detected reasoning boundaries are behaviorally unstable. Our stability-filtering method retains only genuine behavioral signals, achieving 78.4% accuracy on MATH-500 (+5.0 points over SEAL) with cross-model transfer.
Recommended citation: Haomin Zhuang, Hojun Yoo, Xiaonan Luo, Kehan Guo, Xiangliang Zhang. (2026). "Reliable Control-Point Selection for Steering Reasoning in Large Language Models." arXiv:2604.02113. https://arxiv.org/abs/2604.02113
