Publications and Preprints
2026
- [Preprint] The Confidence Dichotomy: Analyzing and Mitigating Miscalibration in Tool-Use Agents. 2026
- [EACL] The Pragmatic Mind of Machines: Tracing the Emergence of Pragmatic Competence in Large Language Models. 2026
- [EACL Findings] DeepSieve: Information Sieving via LLM-as-a-Knowledge-Router. 2026
- [Preprint] NAACL: Noise-AwAre Verbal Confidence Calibration for LLMs in RAG Systems. 2026
- [Preprint] Toward Global Large Language Models in Medicine. 2026
2025
- [EMNLP] Thinking Out Loud: Do Reasoning Models Know When They’re Right? In Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing, Nov 2025
- [EMNLP] Seeing is Believing, but How Much? A Comprehensive Analysis of Verbalized Calibration in Vision-Language Models. In Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing, Nov 2025
- [EMNLP] MMLU-ProX: A Multilingual Benchmark for Advanced Large Language Model Evaluation. In Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing, Nov 2025
- [EMNLP] Good Intentions Beyond ACL: Who Does NLP for Social Good, and Where? In Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing, Nov 2025
- [CIKM] Uncertainty Quantification for Multiple-Choice Questions is Just One-Token Deep. In Proceedings of the 34th ACM International Conference on Information and Knowledge Management, Nov 2025
- [CIKM] Fact or Facsimile? Evaluating the Factual Robustness of Modern Retrievers. In Proceedings of the 34th ACM International Conference on Information and Knowledge Management, Nov 2025
- [ACL] Leveraging Human Production-Interpretation Asymmetries to Test LLM Cognitive Plausibility. In Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers), Jul 2025
- [XLLM] Exploring Multilingual Probing in Large Language Models: A Cross-Language Analysis. In Proceedings of the 1st Joint Workshop on Large Language Models and Structure Modeling (XLLM 2025), Aug 2025
- [COLING] Exploring Concept Depth: How Large Language Models Acquire Knowledge and Concept at Different Layers? In Proceedings of the 31st International Conference on Computational Linguistics, Jan 2025
- [Autism] Pre-trained artificial intelligence language model represents pragmatic language variability central to autism and genetically related phenotypes. Autism, Jan 2025
- [AACL Findings] Sympathy over Polarization: A Computational Discourse Analysis of Social Media Posts about the July 2024 Trump Assassination Attempt. In Proceedings of the 14th International Joint Conference on Natural Language Processing and the 4th Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics, Dec 2025
- [SSRN] A Century of Inflation Narratives. Available at SSRN 5259107, Dec 2025
- [Preprint]
- [NeurIPS] CARES: Comprehensive Evaluation of Safety and Adversarial Robustness in Medical LLMs. Dec 2025
- [Preprint] Large Language Models Are More Persuasive Than Incentivized Human Persuaders. Dec 2025
- [NeurIPS] ThinkBench: Dynamic Out-of-Distribution Evaluation for Robust LLM Reasoning. Dec 2025
- [Preprint] VeriGUI: Verifiable Long-Chain GUI Dataset. Dec 2025
- [Preprint] Toward Equitable Access: Leveraging Crowdsourced Reviews to Investigate Public Perceptions of Health Resource Accessibility. Dec 2025
2024
- [EMNLP] Adaptive Axes: A Pipeline for In-domain Social Stereotype Analysis. In Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, Nov 2024
- [WNU]
- [BioNLP] KG-Rank: Enhancing Large Language Models for Medical QA with Knowledge Graphs and Ranking Techniques. In Proceedings of the 23rd Workshop on Biomedical Natural Language Processing, Aug 2024
- [ACL Findings] Evaluating Large Language Models on Wikipedia-Style Survey Generation. In Findings of the Association for Computational Linguistics: ACL 2024, Aug 2024
- [NEJM AI] Clinical Text Datasets for Medical Artificial Intelligence and Large Language Models — A Systematic Review. NEJM AI, Aug 2024
- [JMIR] Ascle—A Python Natural Language Processing Toolkit for Medical Text Generation: Development and Evaluation Study. J Med Internet Res, Oct 2024
2023
- [IJCAI] GreenPLM: Cross-lingual Transfer of Monolingual Pre-trained Language Models at Almost No Cost. In Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence, Oct 2023
- [ACL Findings] Large Language Models Are Partially Primed in Pronoun Interpretation. In Findings of the Association for Computational Linguistics: ACL 2023, Jul 2023
- [ICASSP] Masked Spectrogram Prediction for Self-Supervised Audio Pre-Training. In ICASSP 2023 - 2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Jul 2023
2022
- [COLING] A Survey in Automatic Irony Processing: Linguistic, Cognitive, and Multi-X Perspectives. In Proceedings of the 29th International Conference on Computational Linguistics, Oct 2022
- [Interspeech] Low-resource Accent Classification in Geographically-proximate Settings: A Forensic and Sociophonetics Perspective. In Interspeech 2022, Oct 2022
- [Interspeech] Calibrate and Refine! A Novel and Agile Framework for ASR Error Robust Intent Detection. In Interspeech 2022, Oct 2022
- [Preprint] Filter and Evolve: Progressive Pseudo Label Refining for Semi-Supervised Automatic Speech Recognition. Oct 2022