Zidi Xiong

I am a Ph.D. candidate in Computer Science at Harvard University, advised by Prof. Hima Lakkaraju. Before that, I was an undergraduate student in Computer Science at the University of Illinois Urbana-Champaign, where I was fortunate to be advised by Prof. Bo Li.

Research Interests: My current research interest is Trustworthy Machine Learning.

Publications

Selected Publications

Monitorability as a Free Gift: How RLVR Spontaneously Aligns Reasoning
Zidi Xiong, Shan Chen, Himabindu Lakkaraju
ICML 2026
How Memory Management Impacts LLM Agents: An Empirical Study of Experience-Following Behavior
Zidi Xiong*, Yuping Lin*, Wenya Xie*, Pengfei He, Zirui Liu, Jiliang Tang, Himabindu Lakkaraju, Zhen Xiang
ACL 2026
Oral Presentation
Featured by MIT Technology Review China
Measuring the Faithfulness of Thinking Drafts in Large Reasoning Models
Zidi Xiong, Shan Chen, Zhenting Qi, Himabindu Lakkaraju
NeurIPS 2025
RigorLLM: Resilient Guardrails for Large Language Models against Undesired Content
Zhuowen Yuan, Zidi Xiong, Yi Zeng, Ning Yu, Ruoxi Jia, Dawn Song, Bo Li
ICML 2024
Cited in the Llama Guard 2 model card
DECODINGTRUST: A Comprehensive Assessment of Trustworthiness in GPT Models
Boxin Wang*, Weixin Chen*, Hengzhi Pei*, Chulin Xie*, Mintong Kang*, Chenhui Zhang*, Chejian Xu, Zidi Xiong, Ritik Dutta, Rylan Schaeffer, Sang T. Truong, Simran Arora, Mantas Mazeika, Dan Hendrycks, Zinan Lin, Yu Cheng, Sanmi Koyejo, Dawn Song, Bo Li
NeurIPS 2023
Oral Presentation
Outstanding Paper Award

For full publications, please see my Google Scholar.

Awards

NeurIPS 2023 Outstanding Paper Award

Education

Harvard University - Ph.D. in Computer Science 2024 - 2029 (expected)
University of Illinois Urbana-Champaign - B.S. in Mathematics and Computer Science 2019 - 2023

Industrial Experience

Anuttacon - Research Intern
2026.01-2026.03
Microsoft, Azure Responsible AI - Research Intern
2024.03-2024.06
Baidu Research, CCL Lab - Research Intern
2021.11-2022.06

Contact

Email: zidixiong@g.harvard.edu