The article makes it unclear if they are building a new model or if it's just the API. But I am guessing it's the API.
So it's "release to developers" rather than "new AI model". They cannot ship the API.
I would assume you would just provide an OpenAI compatible endpoint or two? But maybe they are not doing it that way.
Who knows what they are doing though. Maybe Meta has some kind of global API mesh thing and they can't quite make it work with vLLM or Sglang or something. Maybe they are building out a whole metered cloud IaaS for AI from scratch and that's just how long it takes. Maybe it's not technical complexity and just one of the managers is a problem.
Maybe they are delaying the API release until another more competitive model finishes training and testing.
API server is not hard problem and not make sense for indefinite postpone. I think the more likely explanation is model quality.
Too bad for Meta, and very sad day Llama.