Publications and Preprints

2026

  1. Preprint
    The Confidence Dichotomy: Analyzing and Mitigating Miscalibration in Tool-Use Agents
    Weihao Xuan*, Qingcheng Zeng*, Heli Qi, Yunze Xiao, Junjue Wang, and Naoto Yokoya†
    2026
  2. EACL
    The Pragmatic Mind of Machines: Tracing the Emergence of Pragmatic Competence in Large Language Models
    Kefan Yu*, Qingcheng Zeng*†, Weihao Xuan, Wanxin Li, Jingyi Wu, and Rob Voigt
    2026
  3. EACL Findings
    DeepSieve: Information Sieving via LLM-as-a-Knowledge-Router
    Minghao Guo, Qingcheng Zeng, Xujiang Zhao, Yanchi Liu, Wenchao Yu, Mengnan Du, Haifeng Chen, and Wei Cheng
    2026
  4. Preprint
    NAACL: Noise-AwAre Verbal Confidence Calibration for LLMs in RAG Systems
    Jiayu Liu, Rui Wang, Qing Zong, Qingcheng Zeng, Tianshi Zheng, Haochen Shi, Dadi Guo, Baixuan Xu, Chunyang Li, and Yangqiu Song
    2026
  5. Preprint
    Toward Global Large Language Models in Medicine
    Rui Yang, Huitao Li, Weihao Xuan, Heli Qi, Xin Li, Kunyu Yu, Yingjian Chen, Rongrong Wang, Jacques Behmoaras, Tianxi Cai, Bibhas Chakraborty, Qingyu Chen, Lionel Tim-Ee Cheng, Marie-Louise Damwanza, Chido Dzinotyiwei, Aosong Feng, Chuan Hong, Yusuke Iwasawa, Yuhe Ke, Linah Kitala, and 30 more authors
    2026

2025

  1. EMNLP
    Thinking Out Loud: Do Reasoning Models Know When They’re Right?
    Qingcheng Zeng*†, Weihao Xuan*, Leyang Cui, and Rob Voigt
    In Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing, Nov 2025
  2. EMNLP
    Seeing is Believing, but How Much? A Comprehensive Analysis of Verbalized Calibration in Vision-Language Models
    Weihao Xuan*, Qingcheng Zeng*, Heli Qi, Junjue Wang, and Naoto Yokoya†
    In Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing, Nov 2025
  3. EMNLP
    MMLU-ProX: A Multilingual Benchmark for Advanced Large Language Model Evaluation
    Weihao Xuan, Rui Yang, Heli Qi, Qingcheng Zeng, Yunze Xiao, Aosong Feng, Dairui Liu, Yun Xing, Junjue Wang, Fan Gao, Jinghui Lu, Yuang Jiang, Huitao Li, Xin Li, Kunyu Yu, Ruihai Dong, Shangding Gu, Yuekang Li, Xiaofei Xie, Felix Juefei-Xu, and 12 more authors
    In Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing, Nov 2025
  4. EMNLP
    Good Intentions Beyond ACL: Who Does NLP for Social Good, and Where?
    Grace LeFevre, Qingcheng Zeng, Adam Leif, Jason Jewell, Denis Peskoff, and Rob Voigt
    In Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing, Nov 2025
  5. CIKM
    Uncertainty Quantification for Multiple-Choice Questions is Just One-Token Deep
    Qingcheng Zeng*, Mingyu Jin*, Qinkai Yu, Zhenting Wang, Wenyue Hua, Guangyan Sun, Yanda Meng, Shiqing Ma, Qifan Wang, Felix Juefei-Xu, Fan Yang, Kaize Ding†, Ruixiang Tang, and Yongfeng Zhang
    In Proceedings of the 34th ACM International Conference on Information and Knowledge Management, Nov 2025
  6. CIKM
    Fact or Facsimile? Evaluating the Factual Robustness of Modern Retrievers
    Haoyu Wu*, Qingcheng Zeng*, and Kaize Ding†
    In Proceedings of the 34th ACM International Conference on Information and Knowledge Management, Nov 2025
  7. ACL
    Leveraging Human Production-Interpretation Asymmetries to Test LLM Cognitive Plausibility
    Suet-Ying Lam*, Qingcheng Zeng*†, Jingyi Wu, and Rob Voigt
    In Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers), Jul 2025
  8. XLLM
    Exploring Multilingual Probing in Large Language Models: A Cross-Language Analysis
    Daoyang Li, Haiyan Zhao, Qingcheng Zeng, and Mengnan Du
    In Proceedings of the 1st Joint Workshop on Large Language Models and Structure Modeling (XLLM 2025), Aug 2025
  9. COLING
    Exploring Concept Depth: How Large Language Models Acquire Knowledge and Concept at Different Layers?
    Mingyu Jin*, Qinkai Yu*, Jingyuan Huang*, Qingcheng Zeng*, Zhenting Wang, Wenyue Hua, Haiyan Zhao, Kai Mei, Yanda Meng, Kaize Ding, Fan Yang, Mengnan Du, and Yongfeng Zhang
    In Proceedings of the 31st International Conference on Computational Linguistics, Jan 2025
  10. Autism
    Pre-trained artificial intelligence language model represents pragmatic language variability central to autism and genetically related phenotypes
    Joseph CY Lau, Emily Landau, Qingcheng Zeng, Ruichun Zhang, Stephanie Crawford, Rob Voigt, and Molly Losh
    Autism, Jan 2025
  11. AACL Findings
    Sympathy over Polarization: A Computational Discourse Analysis of Social Media Posts about the July 2024 Trump Assassination Attempt
    Qingcheng Zeng*†, Guanhong Liu*, Zhaoqian Xue, Diego Ford, Rob Voigt, Loni Hagen, and Lingyao Li†
    In Proceedings of the 14th International Joint Conference on Natural Language Processing and the 4th Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics, Dec 2025
  12. SSRN
    A Century of Inflation Narratives
    Mourad Heddaya, Chenhao Tan, Rob Voigt, Qingcheng Zeng, and Alexander Zentefis
    Available at SSRN 5259107, Dec 2025
  13. Preprint
    Position: The Hidden Costs and Measurement Gaps of Reinforcement Learning with Verifiable Rewards
    Aaron Tu, Weihao Xuan, Heli Qi, Xu Huang, Qingcheng Zeng, Shayan Talaei, Yijia Xiao, Peng Xia, Xiangru Tang, Yuchen Zhuang, Bing Hu, Hanqun Cao, Wenqi Shi, Tianang Leng, Rui Yang, Yingjian Chen, Ziqi Wang, Irene Li, Nan Liu, Huaxiu Yao, and 7 more authors
    Dec 2025
  14. NeurIPS
    CARES: Comprehensive Evaluation of Safety and Adversarial Robustness in Medical LLMs
    Sijia Chen, Xiaomin Li, Mengxue Zhang, Eric Hanchen Jiang, Qingcheng Zeng, and Chen-Hsiang Yu
    Dec 2025
  15. Preprint
    Large Language Models Are More Persuasive Than Incentivized Human Persuaders
    Philipp Schoenegger, Francesco Salvi, Jiacheng Liu, Xiaoli Nan, Ramit Debnath, Barbara Fasolo, Evelina Leivada, Gabriel Recchia, Fritz Günther, Ali Zarifhonarvar, Joe Kwon, Zahoor Ul Islam, Marco Dehnert, Daryl Y. H. Lee, Madeline G. Reinecke, David G. Kamper, Mert Kobaş, Adam Sandford, Jonas Kgomo, Luke Hewitt, and 20 more authors
    Dec 2025
  16. NeurIPS
    ThinkBench: Dynamic Out-of-Distribution Evaluation for Robust LLM Reasoning
    Shulin Huang, Linyi Yang, Yan Song, Shuang Chen, Leyang Cui, Ziyu Wan, Qingcheng Zeng, Ying Wen, Kun Shao, Weinan Zhang, Jun Wang, and Yue Zhang
    Dec 2025
  17. Preprint
    VeriGUI: Verifiable Long-Chain GUI Dataset
    Shunyu Liu, Minghao Liu, Huichi Zhou, Zhenyu Cui, Yang Zhou, Yuhao Zhou, Wendong Fan, Ge Zhang, Jiajun Shi, Weihao Xuan, Jiaxing Huang, Shuang Luo, Fang Wu, Heli Qi, Qingcheng Zeng, Ziqi Ren, Jialiang Gao, Jindi Lv, Junjie Wang, Aosong Feng, and 12 more authors
    Dec 2025
  18. Preprint
    Toward Equitable Access: Leveraging Crowdsourced Reviews to Investigate Public Perceptions of Health Resource Accessibility
    Zhaoqian Xue, Guanhong Liu, Chong Zhang, Kai Wei, Qingcheng Zeng, Songhua Hu, Wenyue Hua, Lizhou Fan, Yongfeng Zhang, and Lingyao Li
    Dec 2025

2024

  1. EMNLP
    Adaptive Axes: A Pipeline for In-domain Social Stereotype Analysis
    Qingcheng Zeng, Mingyu Jin, and Rob Voigt
    In Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, Nov 2024
  2. WNU
    Causal Micro-Narratives
    Mourad Heddaya, Qingcheng Zeng, Alexander Zentefis, Rob Voigt, and Chenhao Tan
    In Proceedings of the 6th Workshop on Narrative Understanding, Nov 2024
  3. BioNLP
    KG-Rank: Enhancing Large Language Models for Medical QA with Knowledge Graphs and Ranking Techniques
    Rui Yang, Haoran Liu, Edison Marrese-Taylor, Qingcheng Zeng, Yuhe Ke, Wanxin Li, Lechao Cheng, Qingyu Chen, James Caverlee, Yutaka Matsuo, and Irene Li
    In Proceedings of the 23rd Workshop on Biomedical Natural Language Processing, Aug 2024
  4. ACL Findings
    Evaluating Large Language Models on Wikipedia-Style Survey Generation
    Fan Gao, Hang Jiang, Rui Yang, Qingcheng Zeng, Jinghui Lu, Moritz Blum, Tianwei She, Yuang Jiang, and Irene Li
    In Findings of the Association for Computational Linguistics: ACL 2024, Aug 2024
  5. NEJM AI
    Clinical Text Datasets for Medical Artificial Intelligence and Large Language Models — A Systematic Review
    Jiageng Wu, Xiaocong Liu, Minghui Li, Wanxin Li, Zichang Su, Shixu Lin, Lucas Garay, Zhiyun Zhang, Yujie Zhang, Qingcheng Zeng, Jie Shen, Changzheng Yuan, and Jie Yang
    NEJM AI, Aug 2024
  6. JMIR
    Ascle—A Python Natural Language Processing Toolkit for Medical Text Generation: Development and Evaluation Study
    Rui Yang*, Qingcheng Zeng*, Keen You, Yujie Qiao, Lucas Huang, Chia-Chun Hsieh, Benjamin Rosand, Jeremy Goldwasser, Amisha Dave, Tiarnan Keenan, Yuhe Ke, Chuan Hong, Nan Liu, Emily Chew, Dragomir Radev, Zhiyong Lu, Hua Xu, Qingyu Chen, and Irene Li†
    J Med Internet Res, Oct 2024

2023

  1. IJCAI
    GreenPLM: Cross-lingual Transfer of Monolingual Pre-trained Language Models at Almost No Cost
    Qingcheng Zeng*, Lucas Garay*, Peilin Zhou*, Dading Chong, Yining Hua, Jiageng Wu, Yikang Pan, Han Zhou, Rob Voigt, and Jie Yang†
    In Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence, Oct 2023
  2. ACL Findings
    Large Language Models Are Partially Primed in Pronoun Interpretation
    Suet-Ying Lam*, Qingcheng Zeng*†, Kexun Zhang*, Chenyu You, and Rob Voigt
    In Findings of the Association for Computational Linguistics: ACL 2023, Jul 2023
  3. ICASSP
    Masked Spectrogram Prediction for Self-Supervised Audio Pre-Training
    Dading Chong, Helin Wang, Peilin Zhou, and Qingcheng Zeng
    In ICASSP 2023 - 2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Jul 2023

2022

  1. COLING
    A Survey in Automatic Irony Processing: Linguistic, Cognitive, and Multi-X Perspectives
    Qingcheng Zeng and An-Ran Li
    In Proceedings of the 29th International Conference on Computational Linguistics, Oct 2022
  2. Interspeech
    Low-resource Accent Classification in Geographically-proximate Settings: A Forensic and Sociophonetics Perspective
    Qingcheng Zeng, Dading Chong, Peilin Zhou, and Jie Yang
    In Interspeech 2022, Oct 2022
  3. Interspeech
    Calibrate and Refine! A Novel and Agile Framework for ASR Error Robust Intent Detection
    Peilin Zhou, Dading Chong, Helin Wang, and Qingcheng Zeng
    In Interspeech 2022, Oct 2022
  4. Preprint
    Filter and Evolve: Progressive Pseudo Label Refining for Semi-Supervised Automatic Speech Recognition
    Zezhong Jin, Dading Zhong, Xiao Song, Zhaoyi Liu, Naipeng Ye, and Qingcheng Zeng
    Oct 2022