About
Multimodal AI Foundations is a long-term research initiative dedicated to advancing multimodal artificial intelligence. We develop foundational machine learning methods and systems that integrate vision, language, and other modalities, with a strong emphasis on robustness, efficiency, and safety. By improving how AI models perceive, reason, and interact across diverse data types, we aim to build generalizable and trustworthy AI systems that operate reliably in complex real-world environments.
This initiative is led by Prof. Kaiyang Zhou and his team at Hong Kong Baptist University.