📧 Email: [email protected], [email protected]
🔗 Links: Google Scholar, DBLP, GitHub
📜 Publications
Leopard: A Vision Language Model For Text-Rich Multi-Image Tasks
Mengzhao Jia, Wenhao Yu, Kaixin Ma, Tianqing Fang, Zhihan Zhang, Siru Ouyang, Hongming Zhang, Meng Jiang, Dong Yu
Multimodal Activation: Awakening Dialog Robots without Wake Words
Liqiang Nie, Mengzhao Jia, Xuemeng Song, Ganglu Wu, Harry Cheng, Jian Gu (SIGIR 2021)
Knowledge-enhanced Memory Model for Emotional Support Conversation
Mengzhao Jia, Qianglong Chen, Liqiang Jing, Dawei Fu, Renyu Li
Query-Oriented Micro-Video Summarization
Mengzhao Jia, Yinwei Wei, Xuemeng Song, Teng Sun, Min Zhang, Liqiang Nie (TPAMI)
⚒️ Internship & Activities Experience
Research Intern, Tencent, Seattle, US
05/2024 - 09/2024
- Conducted large-scale training to develop a multimodal large language model for text-rich, multi-image scenarios.