Master's Student @ Department of Statistics, University of Chicago
E-mail: haolinyang2001 [at] uchicago.edu
Links:
GitHub
Google Scholar
ORCID
CV
I am currently a master's student in statistics at the University of Chicago. Previously, I graduated from Tsinghua University, where I majored in English and minored in Economics & Finance and in Statistics. My research focuses on using mathematically sound and theoretically inspired methods to understand the internal mechanisms of Large Language Models that underlie their external behaviors, primarily in-context learning, and to control those behaviors robustly and efficiently through such understanding. As an undergraduate, I also investigated the differences between the strategies applied by pretrained language models and those suggested by traditional translation studies for improving raw machine translations of English academic texts into Chinese. I have multiple papers accepted at or submitted to top-tier international conferences, including NeurIPS and ICLR.
I am actively seeking productive research collaborations in the areas above or related fields. If you are interested in working together, please feel free to contact me.
I am seeking Ph.D. positions starting in Fall 2026.
Research Interests
Keywords: Mechanistic Interpretability, In-context Learning, Large Language Models
Publications
Total Publications: 7, Cumulative Impact Factor: 224.6, Total Pages: 337.
International Conferences
- Unifying Attention Heads and Task Vectors via Hidden State Geometry in In-Context Learning
Haolin Yang, Hakaze Cho, Yiqiao Zhong, Naoya Inoue
Annual Conference on Neural Information Processing Systems (NeurIPS). 2025. 45 pages. [h5=371, IF=23.3]
[PDF] [arXiv] [Poster] [Abstract] [Bibtex]
The unusual properties of in-context learning (ICL) have prompted investigations into the internal mechanisms of large language models. Prior work typically focuses on either special attention heads or task vectors at specific layers, but lacks a unified framework linking these components to the evolution of hidden states across layers that ultimately produce the model's output. In this paper, we propose such a framework for ICL in classification tasks by analyzing two geometric factors that govern performance: the separability and alignment of query hidden states. A fine-grained analysis of layer-wise dynamics reveals a striking two-stage mechanism: separability emerges in early layers, while alignment develops in later layers. Ablation studies further show that Previous Token Heads drive separability, while Induction Heads and task vectors enhance alignment. Our findings thus bridge the gap between attention heads and task vectors, offering a unified account of ICL's underlying mechanisms.
@inproceedings{yang2025unifying,
title={Unifying Attention Heads and Task Vectors via Hidden State Geometry in In-Context Learning},
author={Yang, Haolin and Cho, Hakaze and Zhong, Yiqiao and Inoue, Naoya},
booktitle={The Thirty-ninth Annual Conference on Neural Information Processing Systems},
year={2025},
url={https://openreview.net/forum?id=FIfjDqjV0B}
}
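For a concrete sense of the two geometric quantities in this paper, here is a minimal Python sketch of how one might probe them: separability as linear-probe accuracy on per-layer query hidden states, and alignment as how well those states already read out the correct label through its unembedding direction. The shapes, names, and metric choices below are illustrative assumptions, not the paper's exact definitions.

# Minimal sketch (not the paper's exact metrics): probe per-layer query hidden
# states for (1) separability -- can a linear classifier split the classes? --
# and (2) alignment -- do the states already point toward the correct label's
# unembedding direction? Shapes and names here are illustrative assumptions.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

def separability(hidden, labels):
    """hidden: (n_queries, d_model) states at one layer; labels: (n_queries,)."""
    probe = LogisticRegression(max_iter=1000)
    return cross_val_score(probe, hidden, labels, cv=5).mean()

def alignment(hidden, labels, label_unembed):
    """label_unembed: (n_classes, d_model) unembedding rows of the label tokens.
    Alignment here = accuracy of reading the label directly off the hidden state."""
    logits = hidden @ label_unembed.T            # (n_queries, n_classes)
    return (logits.argmax(axis=1) == labels).mean()

# Usage idea: loop over layers and watch separability rise early while
# alignment rises late, mirroring the two-stage picture described above.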
Journals
- Does Checking-In Help? Understanding L2 Learners’ Autonomous Check-In Behavior in an English-Language MOOC Through Learning Analytics
Yining Zhang, Fang Yang, Haolin Yang, Shuyuan Han
ReCALL. 2024. 16 pages. [IF=5.7]
[PDF] [Abstract] [Bibtex]
Concerns over the quality of teaching in massive open online courses devoted to language learning (LMOOCs) have prompted extensive research on learning behavior in such courses. The purpose of this study is to gain a better understanding of autonomous learning check-ins – that is, individuals sharing their own learning records and/or other information about their learning-related experience – a novel behavior that has not been studied in previous LMOOC research. Using learning analytics, we found that just 6.2% (n = 699) of a sample of 11,293 learners autonomously engaged in check-in behavior, and that the content of these learners’ check-ins varied considerably according to their contexts and the language skills they were seeking to acquire. We further found (1) a positive association between check-in behavior and LMOOC completion; (2) that students who chose to check in earned relatively low grades on unit quizzes, especially in their early stage of learning, but outperformed the non-check-in group significantly in final exam scores; and (3) that those who checked in engaged with a significantly wider array of in-LMOOC learning components than those who did not, and thus accessed a wider system of language-learning experiences. Taken together, these results confirm that check-in behavior can aid the process of learning in an LMOOC and further highlight this behavior’s wider potential to aid self-directed autonomous online learning.
@article{zhang2024does,
title={Does checking-in help? Understanding L2 learners’ autonomous check-in behavior in an English-language MOOC through learning analytics},
author={Zhang, Yining and Yang, Fang and Yang, Haolin and Han, Shuyuan},
journal={ReCALL},
volume={36},
number={3},
pages={343--358},
year={2024},
publisher={Cambridge University Press}
}
Pre-prints
- Localizing Task Recognition and Task Learning in In-Context Learning via Attention Head Analysis
Haolin Yang, Hakaze Cho, Naoya Inoue
Pre-print. Submitted to International Conference on Learning Representations (ICLR). 2026. 45 pages. [h5=362, IF=48.9]
[PDF] [arXiv] [Abstract] [Bibtex]
We investigate the mechanistic underpinnings of in-context learning (ICL) in large language models by reconciling two dominant perspectives: the component-level analysis of attention heads and the holistic decomposition of ICL into Task Recognition (TR) and Task Learning (TL). We propose a novel framework based on Task Subspace Logit Attribution (TSLA) to identify attention heads specialized in TR and TL, and demonstrate their distinct yet complementary roles. Through correlation analysis, ablation studies, and input perturbations, we show that the identified TR and TL heads independently and effectively capture the TR and TL components of ICL. Using steering experiments with geometric analysis of hidden states, we reveal that TR heads promote task recognition by aligning hidden states with the task subspace, while TL heads rotate hidden states within the subspace toward the correct label to facilitate prediction. We further show how previous findings on ICL mechanisms, including induction heads and task vectors, can be reconciled with our attention-head-level analysis of the TR-TL decomposition. Our framework thus provides a unified and interpretable account of how large language models execute ICL across diverse tasks and settings.
@article{yang2025localizingtaskrecognitiontask,
title={Localizing Task Recognition and Task Learning in In-Context Learning via Attention Head Analysis},
author={Yang, Haolin and Cho, Hakaze and Inoue, Naoya},
journal={arXiv preprint arXiv:2509.24164},
year={2025}
}
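As an illustration of the head-scoring idea, the sketch below scores each attention head by the label-logit contribution of its output after projection onto a task subspace, which is one plausible reading of Task Subspace Logit Attribution; the subspace basis and all tensor names are assumptions, and the paper's actual procedure may differ.

# A hedged sketch in the spirit of Task Subspace Logit Attribution (TSLA):
# score each head by the label-logit effect of its output restricted to an
# (assumed, externally estimated) task subspace. All names are illustrative.
import torch

def head_scores(head_outputs, task_basis, W_U_labels):
    """
    head_outputs: (n_layers, n_heads, d_model) per-head output at the final token.
    task_basis:   (d_model, k) orthonormal basis of the task subspace.
    W_U_labels:   (d_model, n_labels) unembedding columns of the label tokens.
    Returns (n_layers, n_heads) attribution scores.
    """
    proj = task_basis @ task_basis.T          # projector onto the task subspace
    projected = head_outputs @ proj           # keep only the task-subspace part
    logits = projected @ W_U_labels           # (n_layers, n_heads, n_labels)
    return logits.norm(dim=-1)                # magnitude of the label-logit effect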
- Task Vectors, Learned Not Extracted: Performance Gains and Mechanistic Insight
Haolin Yang, Hakaze Cho, Kaize Ding, Naoya Inoue
Pre-print. Submitted to International Conference on Learning Representations (ICLR). 2026. 48 pages. [h5=362, IF=48.9]
[PDF] [arXiv] [Abstract] [Bibtex]
Large Language Models (LLMs) can perform new tasks from in-context demonstrations, a phenomenon known as in-context learning (ICL). Recent work suggests that these demonstrations are compressed into task vectors (TVs), compact task representations that LLMs exploit for predictions. However, prior studies typically extract TVs from model outputs or hidden states using cumbersome and opaque methods, and they rarely elucidate the mechanisms by which TVs influence computation. In this work, we address both limitations. First, we propose directly training Learned Task Vectors (LTVs), which surpass extracted TVs in accuracy and exhibit superior flexibility, acting effectively at arbitrary layers, positions, and even with ICL prompts. Second, through systematic analysis, we investigate the mechanistic role of TVs, showing that at the low level they steer predictions primarily through attention-head OV circuits, with a small subset of 'key heads' most decisive. At a higher level, we find that despite Transformer nonlinearities, TV propagation is largely linear: early TVs are rotated toward task-relevant subspaces to improve logits of relevant labels, while later TVs are predominantly scaled in magnitude. Taken together, LTVs not only provide a practical approach for obtaining effective TVs but also offer a principled lens into the mechanistic foundations of ICL.
@article{yang2025taskvectorslearnedextracted,
title={Task Vectors, Learned Not Extracted: Performance Gains and Mechanistic Insight},
author={Yang, Haolin and Cho, Hakaze and Ding, Kaize and Inoue, Naoya},
journal={arXiv preprint arXiv:2509.24169},
year={2025}
}
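To make the Learned Task Vector idea concrete, the sketch below trains a single vector added to the residual stream of a frozen Llama-style HuggingFace model at one layer and position; the model id, layer index, injection position, and training loop are illustrative assumptions rather than the paper's exact setup.

# Minimal sketch of training a Learned Task Vector (LTV): one d_model vector
# added to the residual stream at a chosen layer/position, optimized with the
# frozen LM's cross-entropy on zero-shot queries. Layer path assumes a
# Llama-style model; all specifics here are assumptions, not the paper's setup.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "meta-llama/Llama-2-7b-hf"     # placeholder model id
tok = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)
model.requires_grad_(False)                 # LM stays frozen; only the vector trains

layer_idx = 15                              # illustrative injection layer
ltv = torch.nn.Parameter(torch.zeros(model.config.hidden_size))
opt = torch.optim.Adam([ltv], lr=1e-3)

def add_ltv(module, inputs, output):
    hidden = output[0] if isinstance(output, tuple) else output
    hidden = hidden.clone()
    hidden[:, -1, :] = hidden[:, -1, :] + ltv   # inject at the last position
    return (hidden,) + output[1:] if isinstance(output, tuple) else hidden

hook = model.model.layers[layer_idx].register_forward_hook(add_ltv)

def train_step(query_text, label_token_id):
    ids = tok(query_text, return_tensors="pt").input_ids
    logits = model(ids).logits[:, -1, :]        # next-token logits for the query
    loss = torch.nn.functional.cross_entropy(logits, torch.tensor([label_token_id]))
    opt.zero_grad(); loss.backward(); opt.step()
    return loss.item()

# hook.remove() once training is done; ltv can then be re-injected at inference.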
- Binary Autoencoder for Mechanistic Interpretability of Large Language Models
Hakaze Cho, Haolin Yang, Brian M. Kurkoski, Naoya Inoue
Pre-print. Submitted to International Conference on Learning Representations (ICLR). 2026. 36 pages. [h5=362, IF=48.9]
[PDF] [arXiv] [Abstract] [Bibtex]
Existing works are dedicated to untangling atomized numerical components (features) from the hidden states of Large Language Models (LLMs) to interpret their mechanisms. However, they typically rely on autoencoders constrained by some implicit training-time regularization on single training instances (i.e., normalization, top-k function, etc.), without an explicit guarantee of global sparsity among instances, causing a large number of dense (simultaneously inactive) features that harm feature sparsity and atomization. In this paper, we propose a novel autoencoder variant that enforces minimal entropy on minibatches of hidden activations, thereby promoting feature independence and sparsity across instances. For efficient entropy calculation, we discretize the hidden activations to 1-bit via a step function and apply gradient estimation to enable backpropagation, and we therefore term it the Binary Autoencoder (BAE). We empirically demonstrate two major applications: (1) Feature set entropy calculation. Entropy can be reliably estimated on binary hidden activations, which we empirically evaluate and leverage to characterize the inference dynamics of LLMs and in-context learning. (2) Feature untangling. Like typical methods, BAE can extract atomized features from LLMs' hidden states. To robustly evaluate this feature extraction capability, we refine traditional feature-interpretation methods to avoid unreliable handling of numerical tokens, and show that BAE avoids dense features while producing the largest number of interpretable ones among baselines, confirming the effectiveness of BAE as a feature extractor.
@article{cho2025binary,
title={Binary Autoencoder for Mechanistic Interpretability of Large Language Models},
author={Cho, Hakaze and Yang, Haolin and Kurkoski, Brian M. and Inoue, Naoya},
journal={arXiv preprint arXiv:2509.20997},
year={2025}
}
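The sketch below illustrates the core trick described above: binarizing hidden activations with a step function while passing gradients straight through, plus one possible minibatch entropy surrogate. It is a toy illustration of the idea, not the paper's architecture or loss.

# Toy sketch of a Binary Autoencoder: 1-bit codes via a step function with a
# straight-through gradient estimate, and a per-feature Bernoulli entropy
# computed over the minibatch. Architecture and loss here are assumptions.
import torch
import torch.nn as nn

class StraightThroughStep(torch.autograd.Function):
    @staticmethod
    def forward(ctx, x):
        return (x > 0).float()          # 1-bit activations
    @staticmethod
    def backward(ctx, grad_out):
        return grad_out                 # identity gradient estimate

class BinaryAutoencoder(nn.Module):
    def __init__(self, d_model, n_features):
        super().__init__()
        self.enc = nn.Linear(d_model, n_features)
        self.dec = nn.Linear(n_features, d_model)
    def forward(self, h):
        codes = StraightThroughStep.apply(self.enc(h))   # (batch, n_features) in {0, 1}
        return self.dec(codes), codes

def batch_entropy(codes, eps=1e-6):
    """Mean per-feature Bernoulli entropy over the minibatch (one possible
    entropy surrogate; the paper's estimator may differ)."""
    p = codes.mean(dim=0).clamp(eps, 1 - eps)
    return -(p * p.log() + (1 - p) * (1 - p).log()).mean()

# Training objective sketch: reconstruction + entropy penalty on the minibatch.
# recon, codes = bae(hidden_batch)
# loss = ((recon - hidden_batch) ** 2).mean() + lam * batch_entropy(codes)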
- Mechanism of Task-oriented Information Removal in In-context Learning
Hakaze Cho, Haolin Yang, Gouki Minegishi, Naoya Inoue
Pre-print. Submitted to International Conference on Learning Representations (ICLR). 2026. 67 pages. [h5=362, IF=48.9]
[PDF] [arXiv] [Abstract] [Bibtex]
In-context learning (ICL) is an emerging few-shot learning paradigm built on modern Language Models (LMs), yet its inner mechanism remains unclear. In this paper, we investigate this mechanism through a novel perspective of information removal. Specifically, we demonstrate that in the zero-shot scenario, LMs encode queries into non-selective hidden-state representations that contain information for all possible tasks, leading to arbitrary outputs that do not focus on the intended task and resulting in near-zero accuracy. Meanwhile, we find that selectively removing specific information from the hidden states via a low-rank filter effectively steers LMs toward the intended task. Building on these findings, and by measuring the hidden states with carefully designed metrics, we observe that few-shot ICL effectively simulates such task-oriented information removal, selectively removing redundant information from entangled non-selective representations and improving the output based on the demonstrations, which constitutes a key mechanism underlying ICL. Moreover, we identify essential attention heads that induce the removal operation, termed Denoising Heads. Ablation experiments that block the information removal operation during inference significantly degrade ICL accuracy, especially when the correct label is absent from the few-shot demonstrations, confirming the critical role of both the information removal mechanism and the Denoising Heads.
@article{cho2025mechanism,
title={Mechanism of Task-oriented Information Removal in In-context Learning},
author={Cho, Hakaze and Yang, Haolin and Minegishi, Gouki and Inoue, Naoya},
journal={arXiv preprint arXiv:2509.21012},
year={2025}
}
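As a concrete picture of the "information removal" operation, the sketch below applies a low-rank filter that projects a chosen subspace out of a hidden state (or keeps only that subspace); the basis U is assumed to be estimated elsewhere, and the paper's filter construction is more involved.

# Minimal sketch of a low-rank information-removal filter on hidden states.
# U is an assumed orthonormal basis of the directions to remove; the paper's
# construction of the filter and the choice of subspace are more involved.
import torch

def remove_subspace(hidden, U):
    """hidden: (..., d_model); U: (d_model, k) orthonormal basis.
    Returns hidden with the low-rank component along U projected out."""
    return hidden - (hidden @ U) @ U.T

def keep_subspace(hidden, U):
    """Complementary filter: keep only the low-rank part lying in span(U)."""
    return (hidden @ U) @ U.T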
Thesis
- On Automatic Post-Editing Models in the English-Chinese Translation of Biomedical Journal Articles
Haolin Yang
Outstanding Undergraduate Thesis @ Tsinghua University. 2024. 80 pages.
Resume
Awards
- Outstanding Undergraduate Thesis @ School of Humanities, Tsinghua University. 2024.
- Outstanding Undergraduate @ Tsinghua University. 2024.
- Academic Excellence Scholarship @ Tsinghua University. 2022-2023.