I currently work at Salesforce as a research scientist. Before that, I worked at Amazon as an applied scientist. I received my Ph.D. from the MIT Department of Electrical Engineering and Computer Science (EECS), where I was advised by Prof. Julie Shah and also received my Master of Science degree in 2019. I received my Bachelor of Science in Engineering degree from Duke University, with a double major in Computer Science and Electrical & Computer Engineering. As an undergraduate, I conducted robotics research with Prof. George Konidaris and Prof. Kris Hauser.
My long-term research goal is to make machine learning models and systems reliable and responsible. These days, I am mainly working on three particular directions:
- LLM reasoning, especially around LLM-as-judges,
- (Mechanistic) interpretability of LLMs,
- Trustworthiness and societal implications of LLMs.
* Equal Contribution
-
Evaluating Judges as Evaluators: The JETTS Benchmark of LLM-as-Judges as Test-Time Scaling Evaluators
Yilun Zhou*, Austin Xu*, Peifeng Wang, Caiming Xiong, Shafiq Joty
International Conference on Machine Learning (ICML), 2025
[Paper] [Code]
-
Search Engines in the AI Era: A Qualitative Understanding to the False Promise of Factual and Verifiable Source-Cited Responses in LLM-based Search
Pranav Narayanan Venkit, Philippe Laban, Yilun Zhou, Yixin Mao, Chien-Sheng Wu
ACM Conference on Fairness, Accountability, and Transparency (FAccT), 2025
[Paper]
-
BingoGuard: LLM Content Moderation Tools with Risk Levels
Fan Yin, Philippe Laban, Xiangyu Peng, Yilun Zhou, Yixin Mao, Vaibhav Vats, Linnea Ross, Divyansh Agarwal, Caiming Xiong, Chien-Sheng Wu
International Conference on Learning Representations (ICLR), 2025
[Paper]
-
Direct Judgement Preference Optimization
Peifeng Wang*, Austin Xu*, Yilun Zhou, Caiming Xiong, Shafiq Joty
arXiv preprint: 2409.14664, 2024
[Paper]
-
Shared Imagination: LLMs Hallucinate Alike
Yilun Zhou, Caiming Xiong, Silvio Savarese, Chien-Sheng Wu
arXiv preprint: 2407.16604, 2024
[Paper] [Website]
-
MobileAIBench: Benchmarking LLMs and LMMs for On-Device Use Cases
Rithesh Murthy*, Liangwei Yang*, Juntao Tan*, Tulika Manoj Awalgaonkar*, Yilun Zhou, Shelby Heinecke, Sachin Desai, Jason Wu, Ran Xu, Sarah Tan, Jianguo Zhang, Zhiwei Liu, Shirley Kokane, Zuxin Liu, Ming Zhu, Huan Wang, Caiming Xiong, Silvio Savarese
arXiv preprint: 2406.10290, 2024
[Paper]
-
A Practical Review of Mechanistic Interpretability for Transformer-Based Language Models
Daking Rai, Yilun Zhou, Shi Feng, Abulhair Saparov, Ziyu Yao
arXiv preprint: 2407.02646, 2024
[Paper] [Website]
-
CHAMP: A Competition-Level Dataset for Fine-Grained Analyses of LLMs' Mathematical Reasoning Capabilities
Yujun Mao, Yoon Kim, Yilun Zhou
Annual Meeting of the Association for Computational Linguistics (ACL) Findings, 2024
Preliminary version in NeurIPS 2023 Workshop on Mathematical Reasoning and AI (MATH-AI)
[Paper] [Code] [Website]
-
Evaluating the Utility of Model Explanations for Model Development
Shawn Im, Jacob Andreas, Yilun Zhou
NeurIPS Workshop on Attributing Model Behavior at Scale (ATTRIB), 2023
[Paper]
-
Can Large Language Models Explain Themselves? A Study of LLM-Generated Self-Explanations
Shiyuan Huang, Siddarth Mamidanna, Shreedhar Jangam, Yilun Zhou, Leilani H. Gilpin
arXiv preprint: 2310.11207, 2023
[Paper]
-
Iterative Partial Fulfillment of Counterfactual Explanations: Benefits and Risks
Yilun Zhou
AAAI/ACM Conference on Artificial Intelligence, Ethics, and Society (AIES), 2023
[Paper]
-
Improving Generalization in Language Model-Based Text-to-SQL Semantic Parsing: Two Simple Semantic Boundary-Based Techniques
Daking Rai, Bailin Wang, Yilun Zhou, Ziyu Yao
Annual Meeting of the Association for Computational Linguistics (ACL), 2023
[Paper] [Code]
-
Techniques for Interpretability and Transparency of Black-Box Models
Yilun Zhou
MIT Ph.D. Thesis, 2023
[Thesis]
-
The Solvability of Interpretability Evaluation Metrics
Yilun Zhou, Julie Shah
Conference of the European Chapter of the Association for Computational Linguistics (EACL) Findings, 2023
[Paper] [Code] [Website]
-
Explaining Large Language Model-Based Neural Semantic Parsers
Daking Rai, Yilun Zhou, Bailin Wang, Ziyu Yao
AAAI Conference on Artificial Intelligence: Student Abstract and Poster Program, 2023
[Paper]
-
ExSum: From Local Explanations to Model Understanding
Yilun Zhou, Marco Tulio Ribeiro, Julie Shah
Conference of the North American Chapter of the Association for Computational Linguistics - Human Language Technologies (NAACL-HLT), 2022
[Paper] [Code] [Video] [Website] [MIT News]
-
The Irrationality of Neural Rationale Models
Yiming Zheng, Serena Booth, Julie Shah, Yilun Zhou
NAACL Workshop on Trustworthy Natural Language Processing (TrustNLP), 2022
[Paper] [Poster] [Code]
-
Do Feature Attribution Methods Correctly Attribute Features?
Yilun Zhou, Serena Booth, Marco Tulio Ribeiro, Julie Shah
AAAI Conference on Artificial Intelligence (AAAI), 2022
Preliminary version in NeurIPS 2021 Workshop on Explainable AI Approaches for Debugging and Diagnosis
[Paper] [Poster] [Code] [Video] [Website] [MIT News]
-
Long-Term Resource Allocation Fairness in Average Markov Decision Process (AMDP) Environment
Ganesh Ghalme*, Vineet Nair*, Vishakha Patil*, Yilun Zhou*
International Conference on Autonomous Agents and Multi-Agent Systems (AAMAS), 2022
[Paper] [Code] [Website]
-
Latent Space Alignment Using Adversarially Guided Self-Play
Mycal Tucker, Yilun Zhou, Julie Shah
International Journal of Human-Computer Interaction (IJHCI), 2022
[Paper]
-
RoCUS: Robot Controller Understanding via Sampling
Yilun Zhou, Serena Booth, Nadia Figueroa, Julie Shah
Conference on Robot Learning (CoRL), 2021
[Paper] [Poster] [Code] [Video] [Website]
-
Bayes-TrEx: a Bayesian Sampling Approach to Model Transparency by Example
Serena Booth*, Yilun Zhou*, Ankit Shah, Julie Shah
AAAI Conference on Artificial Intelligence (AAAI), 2021
Preliminary version in AAAI 2020 Workshop on Statistical Relational AI
[Paper] [Poster] [Code] [MIT News]
-
Towards Understanding the Behaviors of Optimal Deep Active Learning Algorithms
Yilun Zhou, Adithya Renduchintala, Xian Li, Sida Wang, Yashar Mehdad, Asish Ghoshal
International Conference on Artificial Intelligence and Statistics (AISTATS), 2021
[Paper] [Poster] [Code] [Video]
-
Learning Household Task Knowledge from WikiHow Descriptions
Yilun Zhou, Julie Shah, Steven Schockaert
International Joint Conference on Artificial Intelligence (IJCAI) Workshop on Semantic Deep Learning, 2019
[Paper] [Code]
-
Predicting ConceptNet Path Quality Using Crowdsourced Assessments of Naturalness
Yilun Zhou, Steven Schockaert, Julie Shah
The Web Conference (WWW), 2019
[Paper] [Code]
-
Representing, Learning, and Controlling Complex Object Interactions
Yilun Zhou, Benjamin Burchfiel, George Konidaris
Autonomous Robots (AuRo), 2018
Original version in Robotics: Science and Systems (RSS), 2016
[Paper] [Video]
-
6DOF Grasp Planning by Optimizing a Deep Learning Scoring Function
Yilun Zhou, Kris Hauser
Robotics: Science and Systems (RSS) Workshop on Revisiting Contact - Turning a Problem into a Solution, 2017
[Paper] [Poster]
-
Incorporating Side-Channel Information into Convolutional Neural Networks for Robotic Tasks
Yilun Zhou, Kris Hauser
IEEE International Conference on Robotics and Automation (ICRA), 2017
[Paper] [Code]
-
Asymptotically Optimal Planning by Feasible Kinodynamic Planning in a State-Cost Space
Kris Hauser, Yilun Zhou
IEEE Transactions on Robotics (TRO), 2016
[Paper] [Code] [Website]