News

Lipreading aims to predict speech content from lip movement without relying on audio. This paper focuses on Task 2 of the Grand Challenge on chat-scenario Chinese lipreading at ICME 2024, ...
Large Multimodal Models (LMMs) excel in English multimedia tasks but face challenges in adapting to other languages due to linguistic diversity, limited non-English multimodal data, and high training ...