For almost a year now I've been playing with various applications of image-synthesis workflows.
At the start of 2023, I wanted to create something positive and encouraging, and this felt like a great opportunity to combine that goal with these tools. That's where this idea originated.
The process was something like this:
I reached out and reconnected with a friend I've known for a long, long time: Buddy Wakefield, a successful spoken word poet. I told him my plan and asked if he had any inspiring work that I could build visuals to.
"We Were Emergencies" is part of a longer piece Buddy performs when he tours, and I thought it was great for what I had planned.
So first, I fed Buddy's audio performance into the "Video Killed the Radio Star" Colab, which uses a Whisper model to transcribe the audio into lines and then uses each line as the prompt for a frame. It then takes those frames and builds an edit to the audio, in a sort of 'board-o-matic'. The output styleframes can be seen here:
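The core of that step is mapping each transcribed line to the frame where it begins. Here's a minimal sketch of that mapping, assuming Whisper-style segments (dicts with `start` and `text` keys); the function name `build_prompt_schedule` and the `fps` parameter are mine, not from the actual Colab:

```python
# Sketch of the transcription-to-prompt step, assuming Whisper-style
# segments: dicts with "start" (seconds) and "text". The function name
# and fps value are illustrative, not taken from the Colab itself.

def build_prompt_schedule(segments, fps=12):
    """Map each transcribed line to the frame where it begins,
    yielding a {frame: prompt} schedule for one-frame-per-line synthesis."""
    schedule = {}
    for seg in segments:
        frame = int(seg["start"] * fps)
        schedule[frame] = seg["text"].strip()
    return schedule

# Example with mocked segments; a real run would pull these from
# whisper.load_model(...).transcribe("performance.mp3")["segments"]:
segments = [
    {"start": 0.0, "text": " We were emergencies."},
    {"start": 2.4, "text": " We can stop burning down."},
]
print(build_prompt_schedule(segments))
# {0: 'We were emergencies.', 28: 'We can stop burning down.'}
```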
Once I had an edit and the synthesized styleframes, I went to work building out cameras for each shot. Some were very simple pans and dolly moves; others I built out in Maya or motion captured from my phone. All of it fed into the Stable Diffusion variant Deforum Diffusion. Each sequence was processed and rendered out to give me the final resulting video:
"We Were Emergencies"
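For anyone curious about the camera step above: Deforum drives its camera with keyframed parameter strings like `"0:(0), 90:(4)"` for settings such as `translation_x`. A simple pan-and-dolly move could be sketched like this; the `keyframes` helper and the specific values are mine, purely for illustration:

```python
# Deforum expresses camera motion as keyframe strings, e.g. "0:(0), 90:(4)".
# This helper (the name is mine) turns (frame, value) pairs into that format.

def keyframes(pairs):
    return ", ".join(f"{frame}:({value})" for frame, value in pairs)

# A simple pan plus dolly-in, sketched as Deforum-style settings. The
# parameter names translation_x / translation_z exist in Deforum's config;
# the values here are just an example move over 90 frames:
camera = {
    "translation_x": keyframes([(0, 0), (90, 4)]),   # slow pan right
    "translation_z": keyframes([(0, 0), (90, 10)]),  # dolly toward subject
}
print(camera["translation_x"])  # 0:(0), 90:(4)
```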