Publications
2025
Mitigating Hallucination in Multimodal LLMs with Layer Contrastive Decoding
Bingkui Tong, Jiaer Xia, Kaiyang Zhou
NeurIPS 2025 Workshop on Multimodal Algorithmic Reasoning
pdf | code
Measuring Epistemic Humility in Multimodal Large Language Models
Bingkui Tong, Jiaer Xia, Sifeng Shang, Kaiyang Zhou
arXiv
pdf | code | dataset | blog
Bootstrapping Grounded Chain-of-Thought in Multimodal LLMs for Data-Efficient Model Adaptation
Jiaer Xia, Bingkui Tong, Yuhang Zang, Rui Shao, Kaiyang Zhou
International Conference on Computer Vision (ICCV), 2025 (Highlight)
pdf | code | blog
Training-Free Watermarking for Autoregressive Image Generation
Yu Tong, Zihao Pan, Shuai Yang, Kaiyang Zhou
arXiv
pdf | code | blog
Visionary-R1: Mitigating Shortcuts in Visual Reasoning with Reinforcement Learning
Jiaer Xia, Yuhang Zang, Peng Gao, Yixuan Li, Kaiyang Zhou
arXiv
pdf | code | model | blog
Fine-tuning Quantized Neural Networks with Zeroth-order Optimization
Sifeng Shang, Jiayi Zhou, Chenyu Lin, Minxian Li, Kaiyang Zhou
arXiv
pdf | code | blog