Great release! It's awesome that they trained smaller models. With some effort I was able to get them running on my generative sampler/groovebox project this morning (shameless plug: https://engram.audio)
Also appreciate the attention to detail with licensing in the training set. This is an important sticking point – both commercially and ethically – for any product that integrates this type of model.
"Two early 20th century authors are talking while walking downtown Paris, occasionally noticing landmarks, while we hear horse hooves as well as a few cars"
https://stableaudio.com/1/share/b4eeaa11-cf29-4e09-88cd-a058...
Are the released models useful? I'm worried about their description of the output. I haven't had a chance to try them yet, and I'm unable to run them on Hugging Face at the moment. The sample on their website (https://stableaudio.com/) could be used as source material for song making, but it definitely lacks the frequency range expected of a final product today.
It is insanely fast. Less than 2 seconds for 120 seconds of audio on my 3090.
It sounds too much like General MIDI. It is better for electronica than for any other genre.
Impressive nonetheless
"We also support inpainting, enabling targeted audio editing and the continuation of short recordings."
I didn't know there were models for that. Very cool!
How I yearn for an open source alternative to Suno.AI, and something that can create super niche sound effects. This feels like Suno 1.0 levels of quality but maybe it can get there?
This is a very small item, but I found it interesting that the paper does not credit Stability AI in the author bylines.
hmm stableaudio.com seems to be dead tho? at least it is for me
Stability.ai is still around?
I thought they died because they gave away everything for free with no revenue model.
Emad trained a lot of really great models, but he just gave them away. This cost enormous sums of money.
I wish for a world where Stability gave away the weights, but had a monetization loop to keep going. Imagine if we had OpenAI, Anthropic, and Stability to counter Google. And imagine if the US had a sizable open weights company.
Blog post: https://stability.ai/news-updates/meet-stable-audio-3-the-mo...
A bit bizarre that there's not a single audio example in that post. But the model is available on their gen-AI service: https://stableaudio.com/