Text-to-Video is the new thing with Gen2!

I've been playing with image synthesis for a year now, mostly around text-to-image. I've created worlds with Midjourney, DALL-E, Stable Diffusion, and anything else I could get my hands on.

Recently, however, I was accepted into the RunwayML Gen-2 beta, which is a Midjourney-style (Discord interface) prompt-to-video system. It's really exciting to get in and play around, but I tell you, it's nearly impossible to control or to get anything that doesn't look contorted and warped. Those are artifacts of the current iteration of the medium. If early Midjourney had distorted hands, Gen-2 turns that distortion up to 11! LOL.

Regardless, I love it. I am embracing the artifacts, as I did in the early days of Disco Diffusion and Midjourney. I'm using them as trademarks of the style. It's still very challenging to control composition, cameras, or pretty much anything else right now. But with patience and LOTS of iterations you can eke out some usable synthesized footage.

To that end, my first serious swing at it yielded these two sequences.

The Hitman

and Invasion!

Both were created one shot at a time using nothing but text prompts describing the scene, the performance, the camera, and the Noir or retro Sci-Fi style I wanted to capture.

Although these are raw, they are unmistakably unique in their synthesis. They capture the feeling of these styles while being uncanny in every way. I love them!!

It's already impressive to me how far this has come at this stage, and I know it'll improve greatly in the coming months. Stay tuned!

j.doodles
