To be honest, I think the performance of Gemini Omni Flash is still not as good as Seedance 2.0. You can try using both models on this platform. https://omnivideoai.co
While at a cursory glance it looks as impressive as always, the subtle spatial errors, and geometry that changes as it goes out of sight and comes back again, hint at the fact that Google has yet to solve the problem of deep spatial understanding.
Considering just how pretty and detailed this whole thing looks, that imo points at a fundamental issue with how these things are trained - it's as if there's no structure to its knowledge and training. An artist learning to draw would first try to understand simple 2D composition, then perspective, then light and shadow, mastering each concept and gradually building up a hierarchical understanding. This model seems like it's trying to learn everything at once.
I would rather see an AI model that I could give a floorplan of a building and it would generate an accurate flythrough on any path, even if it looked like butt.
I'm not just talking out of my arse. I did work for a while in data science/engineering, and one of the big lessons people needed to be reminded of is to clean/downsample the data - a dataset consisting of a million samples could very well take 1000x as long to process as the same data downsampled to just a couple of thousand samples, and we could reach the same conclusions with a fraction of the time/effort.
I'm sure there's a similar logic in RL, that if you dump a trillion samples into the datacenter that consumes the same power as a city, what the model learns is what it could've learned with a much more curated training set and directed approaches.
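To make the downsampling point concrete, here is a minimal TypeScript sketch. Everything in it is an assumption for illustration: the data is synthetic (uniform values from a small seeded generator), the "conclusion" is just the mean, and `parkMiller`, `mean`, and `subsample` are made-up helper names. It only shows that a random sample of a couple of thousand points estimates an aggregate about as well as the full million, since sampling error shrinks roughly with the square root of the sample size. It says nothing about rare events or model training.

```typescript
// Small seeded generator (Park-Miller "minstd") so the run is repeatable.
function parkMiller(seed: number): () => number {
  let state = seed % 2147483647;
  if (state <= 0) state += 2147483646;
  return () => {
    state = (state * 48271) % 2147483647;
    return (state - 1) / 2147483646; // value in [0, 1)
  };
}

function mean(values: number[]): number {
  return values.reduce((sum, v) => sum + v, 0) / values.length;
}

// Draw `size` values uniformly at random (with replacement).
function subsample(values: number[], size: number, rand: () => number): number[] {
  const picked: number[] = [];
  for (let i = 0; i < size; i++) {
    picked.push(values[Math.floor(rand() * values.length)]);
  }
  return picked;
}

const rand = parkMiller(42);

// Hypothetical "full" dataset: one million synthetic measurements in [0, 100).
const full: number[] = [];
for (let i = 0; i < 1000000; i++) {
  full.push(rand() * 100);
}

// Downsampled dataset: two thousand randomly chosen measurements.
const small = subsample(full, 2000, rand);

const fullMean = mean(full);
const smallMean = mean(small);

console.log("full mean:", fullMean.toFixed(2));
console.log("sample mean:", smallMean.toFixed(2));
```

The two printed means should land close together even though the second one touches 500x fewer values. Whether the same holds for a given real dataset depends on what you are trying to learn from it.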
On first use I'm not impressed. I've probably spent a couple grand on Seedance 2 to date, and from running a handful of samples through the system I can't find anything Google's Omni Flash does better than Seedance. You can find some of the videos I've made in my HN bio link.
> Prompt: Make it look like the weird shape of my hand hole super zooms and magnifies the ground it's looking at in sharper quality.
There's got to be a reason this is phrased so insanely, right?
What's the end goal of video generation? It feels unnecessary. Text generation leads to AI that can replace workers. Video generation is bad and only for video content generation, like movie and tv show production?
At the bottom there is a "Try in Youtube Shorts" button.
Oh god...
We could be solving fusion power and instead we’re generating videos of birds in space or something. The market is a harsh mistress sometimes.
I'm an AI optimist. But AI video is probably the one thing that does depress me. Seeing that we can make anything visually, there's nothing that impresses me visually. I watch a video that two years ago I would've thought was really cool, and now my first thought is, "Yawn, is this AI?".
Video, more than anything else, is the place where I really care if something is AI or not. If I could get a TikTok that had no AI usage -- I'd be in. Which is weird for me, because I'm typically the guy who is all-in on AI.
> I can create more videos as soon as your limit resets. Check your usage in Settings.
I have not used Gemini in a month.
> I can create more videos as soon as your limit resets. Check your usage in Settings
I did not create any videos yet.
Google, building great AI that nobody can try out.
But thx for the press release.
My browser crashes while scrolling because of all the autoplaying videos. Please use IntersectionObserver to pause each video when it is not in view.
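For what it's worth, the pause-when-offscreen idea could look roughly like this. It is a minimal TypeScript sketch, not the site's actual code: `PlayableVideo`, `syncPlayback`, and `attachAutoPause` are names made up for illustration. The only real browser APIs used are `IntersectionObserver` (with its `isIntersecting` and `target` entry fields) and the video element's `paused`, `play()`, and `pause()`.

```typescript
// Minimal structural type so the sketch does not depend on DOM typings.
// A real HTMLVideoElement satisfies it.
interface PlayableVideo {
  readonly paused: boolean;
  play(): Promise<void> | undefined;
  pause(): void;
}

type PlaybackAction = "play" | "pause" | "none";

// Decide what to do with one video given whether it is currently visible.
function syncPlayback(video: PlayableVideo, isVisible: boolean): PlaybackAction {
  if (!isVisible && !video.paused) {
    video.pause();
    return "pause";
  }
  if (isVisible && video.paused) {
    const result = video.play();
    // Browsers may reject play() under autoplay policies; ignore that here.
    if (result) result.catch(() => undefined);
    return "play";
  }
  return "none";
}

// Wire the decision above to an IntersectionObserver when one exists.
// Returns null outside a browser (for example under Node).
function attachAutoPause(videos: PlayableVideo[]): { disconnect(): void } | null {
  const Observer = (globalThis as any).IntersectionObserver;
  if (typeof Observer !== "function") return null;
  const observer = new Observer((entries: any[]) => {
    entries.forEach((entry) => syncPlayback(entry.target, entry.isIntersecting));
  });
  videos.forEach((video) => observer.observe(video));
  return observer;
}
```

In a page this would be called as something like `attachAutoPause(Array.from(document.querySelectorAll("video")))`. One design caveat: as written it also resumes a video the user paused by hand once it scrolls back into view, so a real version would probably want to track that.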
It's funny how they specifically use the phrase "output that follows real-world physics" to describe the marble rolling video. At the end of the zigzag track, the marble jumps up for no reason. In a couple of other places it speeds up with no apparent energy source. It's still an amazing result, but they could have picked a better example for this claim!
What I'm hoping/waiting for is IMDB users creating alternative endings of movies.
It could make the comments section even more fun.
I think Hollywood is in for a rough era. The disruption is happening at breakneck speed.
Interestingly, the `o` in GPT-4o stood for Omni too (which I never realized until yesterday when reading random 3rd party documentation).
I don't have words to express how impressive this capability looks, but I am genuinely scared of the harmful use cases of this.
So it's really good, and we have reason to never again believe anything that happens in a video. Unless there's a super-product somewhere to authenticate footage?
When I click the link, the website crashes on my iPhone 13 iOS Chrome lol
Who is creative enough to drive this in any meaningful way?
Certainly not me - you have to be a great artist/designer to even imagine what to do with it.
Does anyone else feel like Google is just always a dollar short and a day late here? Maybe not a dollar short, but it's like they've consistently been focused on the wrong thing. First they missed chatbots, now they're missing coding agents while they double down on chatbots and video gen (which OpenAI has already basically abandoned). Maybe this strategy is actually genius and I'm too stupid to grasp it.
The people that think this output looks good are the same people that "don't get" art.
From a technical perspective, it's very impressive, no doubt. But from an artistic perspective I think all of the examples on the site look bad.
In my day job I program rigid body behaviour in real time, amongst other simulations. I think rigid body contact is hard to learn as it is inherently discontinuous, something you discover when trying to code a solver.
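A toy illustration of that discontinuity, as a hedged TypeScript sketch: a single ball dropped onto a floor in one dimension, with a simple velocity flip on contact. This is an assumption-heavy simplification, not how a production rigid body solver works, and `step`, `simulate`, and `largestVelocityJump` are hypothetical names. The point is only that velocity changes smoothly during free fall and then jumps within a single time step at the moment of contact.

```typescript
interface BallState {
  height: number;   // metres above the floor
  velocity: number; // metres per second, positive is up
}

// Advance one time step with semi-implicit Euler and a simple floor contact.
function step(state: BallState, dt: number, restitution: number, gravity = -9.81): BallState {
  let velocity = state.velocity + gravity * dt;
  let height = state.height + velocity * dt;
  if (height < 0 && velocity < 0) {
    // Contact: clamp to the floor and reverse the velocity in a single step.
    height = 0;
    velocity = -restitution * velocity;
  }
  return { height, velocity };
}

function simulate(start: BallState, dt: number, steps: number, restitution: number): BallState[] {
  const trajectory: BallState[] = [start];
  let current = start;
  for (let i = 0; i < steps; i++) {
    current = step(current, dt, restitution);
    trajectory.push(current);
  }
  return trajectory;
}

// Largest change in velocity between two consecutive states.
function largestVelocityJump(trajectory: BallState[]): number {
  let largest = 0;
  for (let i = 1; i < trajectory.length; i++) {
    const jump = Math.abs(trajectory[i].velocity - trajectory[i - 1].velocity);
    if (jump > largest) largest = jump;
  }
  return largest;
}

// Drop a ball from 1 m and simulate one second at 1 ms steps.
const trajectory = simulate({ height: 1, velocity: 0 }, 0.001, 1000, 0.8);
const freeFallJump = Math.abs(trajectory[1].velocity - trajectory[0].velocity);
const contactJump = largestVelocityJump(trajectory);

console.log("per-step velocity change in free fall:", freeFallJump);
console.log("largest per-step velocity change (at contact):", contactJump);
```

The free-fall change per step is tiny, while the change at contact is several metres per second in one step. A tower of bricks has many such contacts switching on and off at once, which is the part that is hard to get right.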
As such I always use this prompt as a test: "A video of a jenga brick tower falling over as a brick is removed. The physics of each brick must be realistic."
It gave me a video where bricks suddenly disappear or morph into others[1]. The linked video is after 2-3 iterations of me insisting on realistic physics. If you are just glancing at this, you would believe it is realistic.
That said, this is still very impressive and one more step towards ... IDK what. But I am a bit reassured that at least my job won't be fully replaced with AI :)
[1] https://streamable.com/2em1r3