Jielin Qiu

Senior Research Scientist
Salesforce AI Research

Email / Google Scholar / GitHub / LinkedIn

About

My research interests lie in multimodal machine learning. The central goal of my research is to design scalable inference and learning algorithms that connect language, perception, and control for robust multimodal learning. My current work focuses on the foundations of multimodal learning, with applications in multimedia, computer vision, natural language processing, healthcare, and embodied AI.

I received my Ph.D. from the Computer Science Department in the School of Computer Science at Carnegie Mellon University, where I was extremely fortunate to be advised by Prof. Lei Li and Prof. Christos Faloutsos. Before that, I received my B.Eng. from Shanghai Jiao Tong University, advised by Prof. Bao-Liang Lu. I have worked as a research intern at Google, Meta, Microsoft, Amazon Web Services, and Adobe. My research was generously supported by CMU CSD fellowships and funding from DARPA, NSF, Adobe, Allegheny Health Network, and Cleveland Clinic.

News

[2025-04] We release Higgs-Audio, a powerful model for audio understanding and generation.
[2024-11] MMWatermark Robustness gets accepted by the Journal of Data-centric Machine Learning Research (DMLR) 2024.
[2024-09] SnapNTell gets accepted by EMNLP 2024 Findings.
[2024-04] Defended my Ph.D. thesis. Huge thanks to my amazing advisors Prof. Lei Li and Prof. Christos Faloutsos, and to my thesis committee members Prof. Yonatan Bisk and Prof. William Wang.
[2024-04] MMSum dataset gets accepted by CVPR 2024 as Poster Highlight (Top 11.9%). Check out our MMSum dataset!
[2024-03] Embodied Policy Learning with Language-based Scene Summarization gets accepted by NAACL 2024.
[2024-01] MMRobustness gets accepted as the very first paper at the Journal of Data-centric Machine Learning Research (DMLR) 2024. Check out our MMRobustness benchmark!
[2023-11] One paper about cardiovascular record retrieval gets accepted by PMLR ML4H 2023.
[2023-10] Start a research internship at Google.
[2023-10] One paper about human languages and brain signals gets accepted by EMNLP Findings 2023.
[2023-06] One paper accepted as spotlight by ICML 2023 Workshop on Interactive Learning with Implicit Human Feedback.
[2023-06] Two papers accepted by ICML 2023 Workshop on Machine Learning for Multimodal Healthcare Data.
[2023-05] Start a research internship at Meta.
[2023-05] One paper about multimodal summarization via optimal transport gets accepted by ACL Findings 2023.
[2023-04] One paper about data augmentation on geodesics gets accepted by ICML 2023.
[2023-04] Invited talk at Microsoft Research Cambridge.
[2023-02] One paper accepted by CVPR 2023.
[2023-02] One paper accepted by ICASSP 2023.
[2023-01] Start a research internship at Microsoft.
[2023-01] One paper accepted by EACL Findings 2023.
[2023-01] One paper accepted by AISTATS 2023.
[2022-10] One paper accepted by WACV 2023.
[2022-10] One paper accepted by NeurIPS 2022 Workshop on Distribution Shifts.
[2022-10] Named a Top Reviewer at NeurIPS 2022.
[2022-06] One paper accepted by MLHC 2022.
[2022-05] Start a research internship at AWS AI.
[2022-05] One paper accepted by ICML 2022 Workshop on Principles of Distribution Shift.
[2022-04] One paper accepted by ICLR 2022 Workshop on Socially Responsible Machine Learning.
[2021-09] Receive gift funding from Adobe. Thanks, Adobe!
[2021-05] Start a research internship at Adobe Research.

Selected Publications

* denotes equal contribution

MoDoMoDo: Multi-Domain Data Mixtures for Multimodal LLM Reinforcement Learning
Yiqing Liang, Jielin Qiu, Wenhao Ding, Zuxin Liu, James Tompkin, Mengdi Xu, Mengzhou Xia, Zhengzhong Tu, Laixi Shi, Jiacheng Zhu
Under Review
[paper]

Evaluating Durability: Benchmark Insights into Image and Text Watermarking
Jielin Qiu*, William Han*, Xuandong Zhao, Shangbang Long, Christos Faloutsos, Lei Li
Journal of Data-centric Machine Learning Research (DMLR) 2024
[paper] [code]

SnapNTell: Enhancing Entity-Centric Visual Question Answering with Retrieval Augmented Multimodal LLM
Jielin Qiu, Andrea Madotto, Zhaojiang Lin, Paul Crook, Ethan Xu, Luna Dong, Christos Faloutsos, Lei Li, Babak Damavandi, Seungwhan Moon
EMNLP 2024 Findings
[paper]

Embodied Executable Policy Learning with Language-based Scene Summarization
Jielin Qiu*, Mengdi Xu*, William Han*, Seungwhan Moon, Ding Zhao
NAACL 2024
ICML 2023 Workshop on Interactive Learning with Implicit Human Feedback (spotlight)
[paper] [code]

MMSum: A Dataset for Multimodal Summarization and Thumbnail Generation of Videos
Jielin Qiu, Jiacheng Zhu, William Han, Aditesh Kumar, Karthik Mittal, Claire Jin, Zhengyuan Yang, Linjie Li, Jianfeng Wang, Ding Zhao, Bo Li, Lijuan Wang
CVPR 2024 (Poster Highlight, Top 11.9%)
[paper] [website] [dataset] [code]

Benchmarking Robustness of Multimodal Image-Text Models under Distribution Shift
Jielin Qiu, Yi Zhu, Xingjian Shi, Florian Wenzel, Zhiqiang Tang, Ding Zhao, Bo Li, Mu Li
Journal of Data-centric Machine Learning Research (DMLR) 2024
[paper] [website] [code]

Entity6K: A Large Open-Domain Evaluation Dataset for Real-World Entity Recognition
Jielin Qiu, William Han, Winfred Wang, Zhengyuan Yang, Linjie Li, Jianfeng Wang, Christos Faloutsos, Lei Li, Lijuan Wang
Under Review
[paper]

Semantics-Consistent Cross-domain Summarization via Optimal Transport Alignment
Jielin Qiu, Jiacheng Zhu, Mengdi Xu, Franck Dernoncourt, Trung Bui, Zhaowen Wang, Bo Li, Ding Zhao, Hailin Jin
ACL 2023 Findings
[paper] [press]

Can Brain Signals Reveal Inner Alignment with Human Languages?
William Han*, Jielin Qiu*, Jiacheng Zhu, Mengdi Xu, Douglas Weber, Bo Li, Ding Zhao
EMNLP 2023 Findings
[paper] [code]

Automated Cardiovascular Record Retrieval by Multimodal Learning between Electrocardiogram and Clinical Report
Jielin Qiu*, Jiacheng Zhu*, Shiqi Liu, William Han, Jingqi Zhang, Chaojing Duan, Michael Rosenberg, Emerson Liu, Douglas Weber, Ding Zhao
PMLR Proceedings of Machine Learning for Health 2023
[paper] [code]

Multimodal Representation Learning of Cardiovascular Magnetic Resonance Imaging
Jielin Qiu*, Peide Huang*, Makiya Nakashima, Jaehyun Lee, Jiacheng Zhu, Wilson Tang, Pohao Chen, Christopher Nguyen, Byung-Hak Kim, Debbie Kwon, Douglas Weber, Ding Zhao, David Chen
ICML 2023 Workshop on Machine Learning for Multimodal Healthcare Data
[paper]

Transfer Knowledge from Natural Language to Electrocardiography: Can We Detect Cardiovascular Disease Through Language Models?
Jielin Qiu*, William Han*, Jiacheng Zhu, Mengdi Xu, Michael Rosenberg, Emerson Liu, Douglas Weber, Ding Zhao
EACL 2023 Findings
[paper] [code]

Cardiac Disease Diagnosis on Imbalanced Electrocardiography Data Through Optimal Transport Augmentation
Jielin Qiu*, Jiacheng Zhu*, Mengdi Xu, Peide Huang, Michael Rosenberg, Douglas Weber, Emerson Liu, Ding Zhao
ICASSP 2023
[paper]

LiveSeg: Unsupervised Multimodal Temporal Segmentation of Long Livestream Videos
Jielin Qiu, Franck Dernoncourt, Trung Bui, Zhaowen Wang, Ding Zhao, Hailin Jin
WACV 2023
[paper] [press]

Interpolation for Robust Learning: Data Augmentation on Geodesics
Jiacheng Zhu, Jielin Qiu, Aritra Guha, Zhuolin Yang, XuanLong Nguyen, Bo Li, Ding Zhao
ICML 2023
[paper]

Align and Attend: Multimodal Summarization with Dual Contrastive Losses
Bo He, Jun Wang, Jielin Qiu, Abhinav Shrivastava, Trung Bui, Zhaowen Wang
CVPR 2023
[paper] [code]

Benchmarking Robustness under Distribution Shift of Multimodal Image-Text Models
Jielin Qiu, Yi Zhu, Xingjian Shi, Zhiqiang Tang, Ding Zhao, Bo Li, Mu Li
NeurIPS 2022 Workshop on Distribution Shifts
[paper] [press] [code]

GeoECG: Data Augmentation via Wasserstein Geodesic Perturbation for Robust Electrocardiogram Prediction
Jiacheng Zhu*, Jielin Qiu*, Zhuolin Yang, Douglas Weber, Michael Rosenberg, Emerson Liu, Bo Li, Ding Zhao
MLHC 2022
[paper]

Group Distributionally Robust Reinforcement Learning with Hierarchical Latent Variables
Mengdi Xu, Peide Huang, Yaru Niu, Visak Kumar, Jielin Qiu, Chao Fang, Kuan-Hui Lee, Xuewei Qi, Henry Lam, Bo Li, Ding Zhao
AISTATS 2023
[paper] [code]

Data Augmentation via Wasserstein Geodesic Perturbation for Robust Electrocardiogram Prediction
Jiacheng Zhu*, Jielin Qiu*, Zhuolin Yang, Michael Rosenberg, Emerson Liu, Bo Li, Ding Zhao
ICLR 2022 Workshop on Socially Responsible Machine Learning (SRML)
[paper]

Comparing Recognition Performance and Robustness of Multimodal Deep Learning Models for Multimodal Emotion Recognition
Wei Liu, Jielin Qiu, Wei-Long Zheng, Bao-Liang Lu
IEEE Transactions on Cognitive and Developmental Systems 2021
[paper] [code]

Visual Sequence Learning in Hierarchical Prediction Networks and Primate Visual Cortex
Jielin Qiu, Ge Huang, Tai Sing Lee
NeurIPS 2019
[paper]

Investigating Sex Differences in Classification of Five Emotions from EEG and Eye Movement Signals
Lan-Qing Bao, Jielin Qiu, Hao Tang, Wei-Long Zheng, Bao-Liang Lu
EMBC 2019
[paper] [code]

Multi-view Emotion Recognition Using Deep Canonical Correlation Analysis
Jielin Qiu, Wei Liu, Bao-Liang Lu
ICONIP 2018
[paper] [code]

Services

Area Chair: ACL Rolling Review (ARR)

Conference Reviewer: ICML, NeurIPS, ICLR, CVPR, ECCV, ICCV, WACV, ACL Rolling Review (ARR), ACL, EMNLP, EACL, AAAI, ACM MM, KDD, AISTATS, ICASSP, CHIL, MICCAI, MLHC

Journal Reviewer: Transactions on Pattern Analysis and Machine Intelligence (TPAMI), Transactions on Machine Learning Research (TMLR), Journal of Data-centric Machine Learning Research (DMLR), IEEE Transactions on Neural Networks and Learning Systems.

Committee: NeurIPS 2022 virtual deep-dive session chair, CMU RISS Committee.

Teaching

Teaching Assistant for CMU 16-824 Visual Learning and Recognition, Instructor: Prof. Jun-Yan Zhu, Fall 2021

Teaching Assistant for CMU 11-777 Multimodal Machine Learning, Instructor: Prof. Yonatan Bisk, Spring 2021