ByteDance’s next-gen AI model can generate clips based on text, images, audio, and video

The Verge
ByteDance launched Seedance 2.0, a next-generation AI video generator supporting multimodal prompts including text, images, audio, and video.

Summary

ByteDance, the company behind TikTok, has introduced Seedance 2.0, its next-generation AI model for video generation. The model supports multimodal inputs, allowing users to combine text, images, video, and audio in prompts. The company claims significant improvements in generation quality, especially for complex scenes and instruction following, enabling clips of up to 15 seconds that account for camera movement and motion. Users can refine prompts with up to nine images, three video clips, and three audio clips. The launch places Seedance 2.0 in competition with recent advancements such as Google's Veo 3 and OpenAI's Sora 2. Early user demonstrations show the model generating diverse content, including realistic action sequences and anime styles, though the copyright implications remain unclear. Currently, Seedance 2.0 is accessible only via ByteDance's Dreamina AI platform and its assistant, Doubao, and its future availability on TikTok is uncertain.

(Source: The Verge)