I am a Ph.D. candidate in Computer Science at Harvard University, advised by Prof. Hima Lakkaraju. Before that, I was an undergraduate student in Computer Science at the University of Illinois Urbana-Champaign, where I was fortunate to be advised by Prof. Bo Li.
Research Interests: My current research interest is Trustworthy Machine Learning.
Publications
Selected Publications
- Monitorability as a Free Gift: How RLVR Spontaneously Aligns Reasoning
Zidi Xiong, Shan Chen, Himabindu Lakkaraju
ICML 2026
- How Memory Management Impacts LLM Agents: An Empirical Study of Experience-Following Behavior
Zidi Xiong*, Yuping Lin*, Wenya Xie*, Pengfei He, Zirui Liu, Jiliang Tang, Himabindu Lakkaraju, Zhen Xiang
ACL 2026
Featured by MIT Technology Review China
- Measuring the Faithfulness of Thinking Drafts in Large Reasoning Models
Zidi Xiong, Shan Chen, Zhenting Qi, Himabindu Lakkaraju
NeurIPS 2025
- RigorLLM: Resilient Guardrails for Large Language Models against Undesired Content
Zhuowen Yuan, Zidi Xiong, Yi Zeng, Ning Yu, Ruoxi Jia, Dawn Song, Bo Li
ICML 2024
Cited in the Llama Guard 2 model card
- DecodingTrust: A Comprehensive Assessment of Trustworthiness in GPT Models
Boxin Wang*, Weixin Chen*, Hengzhi Pei*, Chulin Xie*, Mintong Kang*, Chenhui Zhang*, Chejian Xu, Zidi Xiong, Ritik Dutta, Rylan Schaeffer, Sang T. Truong, Simran Arora, Mantas Mazeika, Dan Hendrycks, Zinan Lin, Yu Cheng, Sanmi Koyejo, Dawn Song, Bo Li
NeurIPS 2023
Oral Presentation
Outstanding Paper Award
For full publications, please see my Google Scholar.
Awards
- NeurIPS 2023 Outstanding Paper Award
Education
- Harvard University - Ph.D. in Computer Science 2024 - 2029 (expected)
- University of Illinois Urbana-Champaign - B.S. in Mathematics and Computer Science 2019 - 2023
Industrial Experience
- Anuttacon - Research Intern
2026.01-2026.03
- Microsoft, Azure Responsible AI - Research Intern
2024.03-2024.06
- Baidu Research, CCL Lab - Research Intern
2021.11-2022.06
Email: zidixiong@g.harvard.edu