Researchers from the Hong Kong University of Science and Technology and Moonshot AI have developed a new AI model called AudioX. AudioX is a unified model that generates audio and music from multimodal inputs such as text, video, images, music, and audio. It supports a range of tasks, including text-to-audio, text-and-video-to-audio, and video-to-audio generation. With the model, the researchers aim to address the scarcity of high-quality multimodal training data and advance the field of multimodal audio generation.