logoalt Hacker News

stavrosyesterday at 12:26 AM1 replyview on HN

OpenAI offers that, or at least used to. You can batch all your inference and get much lower prices.


Replies

airspressoyesterday at 10:31 AM

Still do. Great for workloads where it's okay to bundle a bunch of requests and wait some hours (up to 24h, usually done faster) for all of them to complete.