I love the Olympics! Whenever I get a chance, I try to catch as many of the gymnastics events as possible. I've been a huge fan since I was a kid. And of course, with Simone Biles competing this year, I was even more glued to the screen! It got me thinking: could generative AI capture even a little bit of Simone's incredible talent?
My goal: recreate the opening sequence from Simone Biles's gold-medal-winning floor routine.
Prompting the Model
I decided to use a combination of Runway and Gemini to help me rise to the challenge. First, I needed to translate Simone's moves into a detailed description, so I used Gemini 1.5 Flash to describe the opening sequence. My first prompt was all about getting that description:
"This is a video of the winning gymnastics floor routine from the Olympics this year. My goal is to recreate the floor routine in this video using a video generation model. To do this, I want you to watch the entire video then please describe the video in great detail so that someone who has never seen it can imagine what is happening in it and recreate it.”
The resulting description was good, but it wasn't formatted in a way that would work well with Runway's video generation model. Runway provides a guide to follow, so I gave Gemini another prompt, steering it toward the desired format:
"The video generation model I'm going to use suggests writing a prompt in the following structure:
[camera movement]: [establishing scene]. [additional details].
An example of this would be:
Low angle static shot: The camera is angled up at a woman wearing all orange as she stands in a tropical rainforest with colorful flora. The dramatic sky is overcast and gray."
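If you end up iterating on prompts as much as I did, it can help to assemble them from the three parts of that structure instead of retyping the whole thing each time. Here's a minimal sketch in Python. The `build_prompt` function and its parameter names are my own invention for illustration; they are not part of Runway's or Gemini's tooling, and all the code does is join strings.

```python
def build_prompt(camera_movement: str, scene: str, details: str = "") -> str:
    """Assemble a prompt shaped like
    '[camera movement]: [establishing scene]. [additional details].'

    Hypothetical helper for illustration only; it does not call any API.
    """

    def as_sentence(text: str) -> str:
        # Trim whitespace and make sure the text ends with punctuation.
        text = text.strip()
        if text and text[-1] not in ".!?":
            text += "."
        return text

    parts = [f"{camera_movement.strip()}: {as_sentence(scene)}"]
    if details.strip():
        # The additional details are optional in this sketch.
        parts.append(as_sentence(details))
    return " ".join(parts)


prompt = build_prompt(
    "Low angle static shot",
    "The camera is angled up at a woman wearing all orange as she stands "
    "in a tropical rainforest with colorful flora",
    "The dramatic sky is overcast and gray",
)
print(prompt)
```

Running this prints the same example prompt quoted above. Keeping the camera movement, scene, and details separate makes it easier to change one piece at a time between attempts.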
The AI's Interpretation
The resulting prompt was promising, but Runway's safety filters had other ideas. My first five attempts at rewriting the prompt got flagged as violating their content policy. This wasn't too surprising, since I know that moderation and filters are hard to get right. That said, I made sure to file lots of feedback! After even more tweaking, I landed on a prompt that worked:
“Wide shot, tracking: A woman in a sparkly, star-themed gymnastics outfit, stands on a blue and pink gymnastics floor mat in a massive arena. She powerfully runs and leaps across the floor, executing a double pike twist, a back handspring layout back tuck, and a stunning arabian double front. The crowd is roaring and cheering, the arena lights are bright, and confetti falls from the ceiling.”
And here's the AI's masterpiece:
Let's just say Simone's skills remain unmatched, even in the digital realm. It seems the nuances of Olympic-level gymnastics are as challenging for AI to grasp as they are for us mere mortals!