ByteDance's new OmniHuman AI can turn a single image into a video

ByteDance, the company behind TikTok, has unveiled a new AI video generator called OmniHuman-1. From a single photo, it can produce a lifelike video of a person singing, playing musical instruments, talking, gesturing, dancing, and more.

OmniHuman was trained on 18,700 hours of human video data.

In a paper posted on arXiv titled “OmniHuman-1: Rethinking the Scaling-Up of One-Stage Conditioned Human Animation Models,” the researchers propose an end-to-end, multimodality-conditioned framework that generates human videos from a single image and motion signals such as audio or video. The framework is their response to the issues encountered by earlier models.

“OmniHuman supports various visual and audio styles. It can generate realistic human videos at any aspect ratio and body proportion (portrait, half-body, full body all in one), with realism stemming from comprehensive aspects including motion, lighting, and texture details,” according to OmniHuman’s GitHub repository.

The project page also includes a wide array of samples demonstrating OmniHuman’s capabilities. Besides human subjects, the model can animate cartoons, artificial objects, and animals.

ByteDance’s OmniHuman has potential applications in education, sales and marketing, gaming, and entertainment. It could also be integrated into TikTok. However, it raises ethical concerns about the technology being used for scams and fake political videos.

Please note that as of this writing, OmniHuman-1 has not been officially released as a service or a download, so be wary of fake and potentially harmful downloads online.
