I Made a 60-Second AI Short Film in a Weekend. Here’s the Stack.

The experiment

I gave myself a weekend (48 hours, Saturday morning to Sunday night) to make a 60-second short film using only AI tools. The constraint was that it had to be something I’d be willing to put on my LinkedIn without lying about how it was made. No “directed by” credit; I’m not pretending I’m a filmmaker now. But no apologetic disclaimer in the description either. Just a real 60 seconds of cinematic content that holds up.

The story was a sci-fi vignette I’d been turning over in my head for months: a kid in a near-future city finds a paper map. That was it. Beginning, middle, end. 60 seconds.

Day 0 baseline

Going in, I’d used Midjourney plenty for stills, Runway and Veo for short clips, ElevenLabs for voice projects, Suno for music. What I’d never done was stitch all of them into a finished piece. I had a working DaVinci Resolve install, an iPad for sketching, and a spreadsheet budget of roughly $200 for AI generation.

Saturday — writing and shotlist

Spent the first ninety minutes writing the script. 60 seconds of screen time is roughly 90-120 words of voice-over plus visuals — about half a page. I wrote it in Claude with a “rewrite this for cinematic voiceover, no exposition, all sensory” prompt cycle. The third revision was the keeper.
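If you want to sanity-check a draft against that word budget before recording, here is a minimal sketch. It assumes the 90-120 words-per-minute narration pace implied above; the function names are mine, not from any tool in the stack.

```python
def voiceover_word_budget(duration_s, wpm_low=90, wpm_high=120):
    """Return the (low, high) word count that fits in duration_s seconds.

    The default pace of 90-120 words per minute is the slow, cinematic
    narration range assumed in this post, not a universal constant.
    """
    return (round(duration_s * wpm_low / 60), round(duration_s * wpm_high / 60))


def estimated_runtime_s(script, wpm):
    """Estimate how long a script takes to read aloud at a given pace."""
    word_count = len(script.split())
    return word_count * 60 / wpm


low, high = voiceover_word_budget(60)
print(f"60 seconds of voice-over: {low}-{high} words")
```

Running `estimated_runtime_s(draft, 90)` on each revision is a quick way to catch a script that reads long before any voice credits get spent.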

Then I drew a 12-shot shotlist on the iPad. Wide establishing, push-in on the kid, close on the map, montage of the city, lift on the discovery, beat, end card. Old-fashioned but the only way I know to keep ambitious shots from breaking the budget.

Saturday afternoon — image and video generation

This is where the weekend’s money mostly went.

Stills first, then animation. Midjourney v6 for all 12 starting stills, at about 4-5 takes each. Style reference set to keep consistency. Then I fed the best still per shot into Runway Gen-4 with motion direction: “slow push-in, camera at eye level, drone-like float.”

Veo 3 for the dialogue beat. The kid’s “oh” reaction shot was the only one with synced audio. Veo handled this without me having to dub in post.

The reality of iteration costs. I expected to generate each shot 4-5 times. Some needed 12. Two needed me to give up and recompose with a different angle. By Saturday night I’d spent $147 on generation and had 11 of 12 shots locked.

Style consistency was the hardest part. The kid in shot 3 doesn’t look quite like the kid in shot 8 unless you lean hard on style references. Runway Gen-4’s character reference was the killer feature here.

Sunday morning — audio and voiceover

Voice in ElevenLabs. I used a licensed voice — not a clone of anyone real — that fit the slightly-melancholic narrator I wanted. Three takes of the script, picked the best, used the “emotion control” sliders to add a beat of breath at the right moments.

Music in Suno. I prompted Suno v4.5 with “cinematic, sparse piano with ambient pads, melancholy but hopeful, 60 seconds, no vocals.” First generation was too busy. Third was beautiful. Total cost: $0.50 in credits.

Sound design from Freesound. Footsteps, paper crinkle, distant city hum, all from the Creative Commons library. Mixed in DaVinci’s audio editor.

Sunday afternoon — edit and color

Cut everything in DaVinci Resolve over four hours. The voice-over and music carry the timing. I matched cuts to the rhythm of the score, not the other way around.

Topaz Video AI for upscaling the three shots that came out at 720p when I needed 1080p. Worth the $1.50 of credits.

DaVinci’s AI color match across the 12 shots. I tuned three by hand; the rest matched automatically. Before the match, the 12 shots looked like 12 shots; after it, they looked like one film.

Final render at 1080p, H.264, 8 Mbps. The file was 31MB.
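A quick way to check render settings against the file you get back is the bitrate arithmetic below. This is a rough sketch that ignores audio and container overhead and uses decimal megabytes; the function names are mine. At a steady 8 Mbps, a film of about 65 seconds would come out around 65 MB, so a 31 MB file suggests the encoder averaged closer to 4 Mbps, which would fit 8 Mbps acting as a target or cap under variable-bitrate encoding. That last part is my assumption, not something I verified in the render log.

```python
def estimated_size_mb(bitrate_mbps, duration_s):
    """Approximate video size in decimal MB for a constant bitrate.

    Megabits per second times seconds gives megabits; divide by 8 for megabytes.
    """
    return bitrate_mbps * duration_s / 8


def average_bitrate_mbps(size_mb, duration_s):
    """Back out the average bitrate a finished file actually used."""
    return size_mb * 8 / duration_s


print(f"8 Mbps for 65 s: about {estimated_size_mb(8, 65):.0f} MB")
print(f"31 MB over 65 s: about {average_bitrate_mbps(31, 65):.1f} Mbps average")
```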

The data

| Resource | Spend |
|---|---|
| Midjourney credits | $24 |
| Runway Gen-4 | $98 |
| Veo 3 | $32 |
| ElevenLabs voice | $11 |
| Suno music | $0.50 |
| Topaz Video AI | $1.50 |
| DaVinci Resolve | $0 (Studio license I already had) |
| Total cost | $167 |
| Total time | ~16 hours over 2 days |
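If you track your own run in a spreadsheet, the same numbers are easy to slice in a few lines. The sketch below uses the spend figures above, the roughly $200 budget from Day 0, and the finished runtime of about 65 seconds; the variable names are just illustrative.

```python
# Spend per tool in USD, copied from the table above.
spend = {
    "Midjourney credits": 24.00,
    "Runway Gen-4": 98.00,
    "Veo 3": 32.00,
    "ElevenLabs voice": 11.00,
    "Suno music": 0.50,
    "Topaz Video AI": 1.50,
    "DaVinci Resolve": 0.00,  # existing Studio license, no new spend
}
BUDGET = 200.00   # rough generation budget set before the weekend
RUNTIME_S = 65    # approximate length of the finished film

total = sum(spend.values())
video_share = (spend["Runway Gen-4"] + spend["Veo 3"]) / total

# Print tools from most to least expensive with their share of the total.
for tool, cost in sorted(spend.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{tool:<20} ${cost:>7.2f}  {cost / total:6.1%}")

print(f"Total: ${total:.2f} (${BUDGET - total:.2f} under budget)")
print(f"Cost per finished second: ${total / RUNTIME_S:.2f}")
print(f"Share spent on video generation: {video_share:.0%}")
```

The split is the useful part: video generation is where most of the money goes, so that is the line to watch when iteration counts climb.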

The finished film is about 65 seconds. I posted it to LinkedIn with the spend and process disclosed. It got 47K views and a number of “wait, that was AI?” replies that I take as the metric of success.

Would I do it again?

Yes. With three changes. I’d write the script over a week, not 90 minutes — the script was the bottleneck on emotional impact, and AI tools couldn’t fix bad writing no matter how good the visuals looked. I’d budget another weekend, because the second weekend would be all about iteration and polish, not generation. And I’d commit to a longer piece next time — 3 minutes — because the workflow taught me that the marginal cost of one more minute is mostly story development time, not AI spend.

If you’re a filmmaker reading this and feeling threatened: don’t. The shots I made still need a person who can write a script worth watching. AI gave me the camera. I still had to know what to point it at.