Hacker News

Cloudflare's AI Platform: an inference layer designed for agents

256 points by nikitoci yesterday at 1:17 PM | 60 comments

Comments

mips_avatar yesterday at 8:30 PM

So it's basically just OpenRouter with Cloudflare Argo networking? I feel like they could do some much more interesting stuff with their Replicate acquisition. Application-specific RL is getting so good, but there's no good way to deploy these models in a scalable way. Even providers like Fireworks, which claim to let you deploy LoRAs in a scalable way, can't do it. For now I literally have to host the base load of my application on a rack of 3090s in my garage, which seems silly, but it saves me $1k a month.

whereistejas yesterday at 2:38 PM

This actually looks very useful. Cloudflare seems to be bringing together a great set of tools. Not to mention, D1 is literally the only SQLite-as-a-service solution out there whose reliability is great and whose free tier limits are generous.

james2doyle yesterday at 6:35 PM

I find it really confusing that the Workers AI models listed here: https://developers.cloudflare.com/workers-ai/models/ do not fully overlap with the ones listed here: https://developers.cloudflare.com/ai/models/

Yes, you can see the same "hosted" ones on there, but when you look at the models endpoint, there are far fewer options under the "workers-ai/*" namespace. Is that intentional?

RITESH1985 yesterday at 9:44 PM

The inference layer question is getting solved fast. The harder problem coming next is the governance layer — what agents are authorised to do and proving it later. Curious if Cloudflare is thinking about this layer too.

datadrivenangel yesterday at 6:36 PM

Good to see their purchase of Replicate paying off!

bm-rf yesterday at 2:01 PM

Not seeing any pricing info on the models[1] page. Wonder how much of a lift this is over paying providers directly. Perhaps Cloudflare is doing this at cost? Also interesting that zero data retention is not on by default, and is not supported with all providers[2]. Finally, would be great if this could return OpenAI AND Anthropic style completions.

[1] https://developers.cloudflare.com/ai/models/

[2] https://developers.cloudflare.com/ai-gateway/features/unifie...

messh yesterday at 8:25 PM

So, is this similar to OpenRouter?

kinnth yesterday at 11:52 PM

OpenRouter works perfectly well for me when called from Cloudflare Workers. OpenRouter also has superior cascading and waterfalling when models are offline. Not sure Cloudflare has that working in V1.

I love everything about OpenRouter. So I'm kind of a fanboy.
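
For readers unfamiliar with the term, the "cascading" behaviour mentioned above can be sketched client-side roughly as follows. This is only an illustration under stated assumptions, not how OpenRouter or Cloudflare implement it: `complete_with_fallback`, `call_model`, `fake_call_model`, and the model names are all hypothetical, and `call_model` stands in for whatever provider client you actually use.

```python
from typing import Callable, List, Sequence, Tuple


class AllModelsFailed(Exception):
    """Raised when every model in the fallback chain has failed."""


def complete_with_fallback(
    prompt: str,
    models: Sequence[str],
    call_model: Callable[[str, str], str],
) -> Tuple[str, str]:
    """Try each model in order; return (model_name, completion) from the first success."""
    errors: List[str] = []
    for model in models:
        try:
            return model, call_model(model, prompt)
        except Exception as exc:  # real code should catch only availability/transport errors
            errors.append(f"{model}: {exc}")
    raise AllModelsFailed("; ".join(errors))


def fake_call_model(model: str, prompt: str) -> str:
    """Hypothetical stand-in for a provider call; pretends the primary model is offline."""
    if model == "primary-model":
        raise ConnectionError("model offline")
    return f"[{model}] echo: {prompt}"


if __name__ == "__main__":
    chain = ["primary-model", "backup-model"]
    print(complete_with_fallback("hello", chain, fake_call_model))
```

A hosted router does the same walk server-side, which is why it matters whether the gateway supports it out of the box.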

pprotas yesterday at 1:38 PM

Can't wait for the free tier!

ramesh31 yesterday at 2:41 PM

Big, could be a viable Bedrock alternative. Probably better uptime than Anthropic or AWS, too.

throwpoaster yesterday at 2:09 PM

Anthropic gonna acquire Cloudflare for stock. Solves their infrastructure problems in one shot.

Jack5500 yesterday at 1:42 PM

Sadly, no mention of regions.

6thbit yesterday at 2:09 PM

don’t attach to a single AI provider when you can attach to cloudflare as your single AI gateway provider!

rant aside, they are very well positioned network-wise to offer this service. i wonder about their pricing and potential markup on top of token usage?

i presume they won't let you “manage all your AI spend in one place” for free.

wahnfrieden yesterday at 2:18 PM

No spending limit / no ability to set a budget, unlike Google or OpenAI. Be prepared for an eye-watering invoice if you have a bug or get hacked.

edit: Why downvote? It's correct, and it's a risk that competitors handle better, including for their CDN products (compared to Bunny CDN). Maybe you are just used to the risk and haven't felt the burn yourself yet. Or you have the mistaken notion that there is no price at which temporary downtime is worthwhile to avoid paying.

reconnecting yesterday at 9:33 PM

`Unified inference layer` is a polite way to say: "proxy that knows every prompt and every response".

ernsheong yesterday at 2:48 PM

What is Cloudflare trying to be? Everything everywhere all at once?

mbtrucks yesterday at 3:02 PM

Can I set a hard cost limit? Otherwise I'm not interested. Don't be like Google's mess of billing.

mbtrucks yesterday at 3:02 PM

Can I set a hard cost limit per day? With no drift; otherwise I'm not interested.
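
Absent a provider-side cap, one workaround is a guard in your own code. The sketch below is an assumption-laden illustration, not a Cloudflare feature: `DailyBudget`, `reserve`, and `BudgetExceeded` are hypothetical names, the counter lives in a single process (several workers would need a shared store), and the worst-case cost estimate per request is yours to supply. It counts in integer cents so rounding cannot accumulate.

```python
import datetime as dt
from typing import Callable


class BudgetExceeded(Exception):
    """Raised when a reservation would push the day's spend over the cap."""


class DailyBudget:
    """Client-side hard cap on daily spend, tracked in integer cents."""

    def __init__(self, limit_cents: int, today: Callable[[], dt.date] = dt.date.today):
        self.limit_cents = limit_cents
        self._today = today  # injectable clock, which also makes the class testable
        self._day = today()
        self._spent_cents = 0

    def reserve(self, worst_case_cents: int) -> int:
        """Reserve the worst-case cost BEFORE sending a request; return cents left today."""
        day = self._today()
        if day != self._day:  # new calendar day: reset the counter
            self._day, self._spent_cents = day, 0
        if self._spent_cents + worst_case_cents > self.limit_cents:
            raise BudgetExceeded(
                f"{self._spent_cents} + {worst_case_cents} exceeds {self.limit_cents} cents"
            )
        self._spent_cents += worst_case_cents
        return self.limit_cents - self._spent_cents


if __name__ == "__main__":
    budget = DailyBudget(limit_cents=100)
    print(budget.reserve(60))  # cents remaining after the first reservation
```

Reserving the worst case up front, rather than recording the actual cost afterwards, is what makes the cap hard: a runaway loop is refused before the request is sent.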

stult yesterday at 3:08 PM

A few weeks ago, I ran into a bug with Cloudflare's DNS server not detecting when I updated the records with the registrar. The bug was 100% on their end, entirely unsolvable by me, yet they have made it literally impossible to contact them to file a bug report. Their standard user help workflow dead-ended by forcing me to talk to their absolutely useless AI help chatbot, which proceeded to regurgitate their FAQ (inaccurately, uselessly), then referred me to a phone number that was disconnected/not in service, then gave me an email address that auto-replied it was no longer in use, then just looped back to the FAQ. There was no way for me to even send them an email to let them know they have a major bug.

I immediately pulled all my sites off of Cloudflare and I will never use that godawful nightmare of a company for anything ever again. If they can't even host a generic help bot without screwing it up that badly, why would I ever use them for anything at all, never mind an AI platform?
