About us

Challenge Organizers

Wei Xue

The Hong Kong University of Science and Technology

Contact: weixue@ust.hk

Wei Xue is an Assistant Professor at the Hong Kong University of Science and Technology. His research interests include audio processing, AI music, foundation models, generative AI, and multimodal audio understanding and generation.

Junlan Feng

China Mobile

Contact: fengjunlan@cmjt.chinamobile.com

Junlan Feng is an IEEE Fellow and Chief Scientist at China Mobile. Her research interests include artificial intelligence, big data, and AI technologies for the communications industry, with a particular focus on the development of the Jiutian AI Platform.

Shilei Zhang

Jiutian Artificial Intelligence Technology (Beijing) Co., Ltd., China Mobile

Contact: zhangshilei@cmjt.chinamobile.com

Shilei Zhang is the speech technology lead at China Mobile Jiutian Research. His research interests include speech recognition, voiceprint recognition, and speech large language models, with a particular focus on speech technology development for the Jiutian AI Platform.

Yue Wang

China Mobile (Hong Kong) Innovation Research Institute

Contact: yuewang@cmi.chinamobile.com

Yue Wang is an AI Technical Research Senior Executive at China Mobile (Hong Kong) Innovation Research Institute. Her research interests include speech-to-speech translation and numerical optimization algorithms for large language models, with a particular focus on audio translation and efficient LLM optimization.

Ruosong Yang

China Mobile (Hong Kong) Innovation Research Institute

Contact: yangruosong@cmi.chinamobile.com

Ruosong Yang is an AI Technical Research Senior Executive at China Mobile (Hong Kong) Innovation Research Institute. His research interests include large language models and multi-agent systems, with a particular focus on LLM-based multi-agent applications.

Bei Liu

The Hong Kong University of Science and Technology

Contact: beiliu@ust.hk

Bei Liu is currently a Postdoctoral Fellow at The Hong Kong University of Science and Technology. His research interests include speech processing, spoken dialogue systems, large language models, model compression, and inference optimization, with a particular focus on efficient speech and language modeling.

Liumeng Xue

Nanjing University

Contact: lmxue@nju.edu.cn

Liumeng Xue is an Assistant Professor at Nanjing University. Her research interests include speech, music, and general audio understanding and generation, with a particular focus on controllable, emotional, and expressive speech generation.

Sitong Cheng

The Hong Kong University of Science and Technology

Contact: schengaq@connect.ust.hk

Sitong Cheng is a Ph.D. student at The Hong Kong University of Science and Technology. His research focuses on text-to-speech synthesis and speech-to-speech translation, with the goal of enabling more natural, seamless, and expressive voice communication.

Jiahao Pan

The Hong Kong University of Science and Technology

Contact: jpanbb@connect.ust.hk

Jiahao Pan is a Ph.D. student at The Hong Kong University of Science and Technology. His research focuses on audio and music understanding and generation, covering music generation, speech enhancement, music separation, and multimodal audio-visual generation.

Weizhen Bian

The Hong Kong University of Science and Technology

Contact: weixue@ust.hk

Weizhen Bian is a Ph.D. student at The Hong Kong University of Science and Technology. His research focuses on text-to-speech synthesis, speech generation, and spoken language processing.

Boyi Kang

The Hong Kong University of Science and Technology

Contact: bkangaa@connect.ust.hk

Boyi Kang is a Ph.D. student at The Hong Kong University of Science and Technology. His research interests include multimodal intelligence and the understanding and generation of speech, music, and general audio.

Bin Long

Hong Kong Generative AI Research & Development Center

Contact: lblongbin@163.com

Bin Long is a backend engineer at Hong Kong Generative AI Research & Development Center. His work focuses on scalable AI model service infrastructure, in particular unified access and deployment for LLM, TTS, ASR, and other speech and language model services.