Realistically, any 'huge' frontier model that takes a rack of H100s to run inference on is probably going to have downtime no matter who runs it.
Downtime is always going to 'scale' poorly for workloads that need a lot of hardware thrown at them, even with lots of good fail-over: the more machines a single deployment needs, the more ways it can go down. It's probably worse for the small vendors, because they don't have the contracts that get them hardware first, so availability is already at a premium for them.
So I guess I'm saying: yeah, I hope frontier-level models make it out into the open arenas soon, but I suspect the same or a similar level of exclusivity will exist as long as they take that much compute to operate.
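
To put rough numbers on the "downtime scales poorly with hardware" point, here's a back-of-the-envelope sketch. Everything in it is an assumption for illustration: the 0.999 per-GPU availability is made up, failures are treated as independent (real outages are often correlated), and the function names are hypothetical.

```python
def replica_availability(per_gpu_availability: float, gpus_needed: int) -> float:
    """Chance that every GPU one replica needs is up at the same time.

    Assumes independent failures, which is optimistic for real clusters.
    """
    return per_gpu_availability ** gpus_needed


def service_availability(replica_avail: float, replicas: int) -> float:
    """Chance that at least one of several identical replicas is up (fail-over)."""
    return 1 - (1 - replica_avail) ** replicas


if __name__ == "__main__":
    per_gpu = 0.999  # assumed figure, not a measured one
    for gpus in (1, 8, 64):
        alone = replica_availability(per_gpu, gpus)
        with_spare = service_availability(alone, replicas=2)
        print(f"{gpus:>3} GPUs per replica: {alone:.4f} alone, "
              f"{with_spare:.6f} with one spare replica")
```

Fail-over helps in this toy model, but the spare replica costs a second full set of GPUs, which is exactly the thing the smaller vendors have trouble getting.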