Hacker News

profsummergig yesterday at 9:45 PM

IMHO, the biggest problem with the future of open weights models is that, currently, they are the result of philanthropy by some private org (e.g. DeepSeek).

The spigot can be turned off at any time.

Until there's some sort of "community owned hardware", open weights models are always at risk of being discontinued.


Replies

NitpickLawyer yesterday at 9:53 PM

Yeah, but the biggest plus for open models is that they can never be taken away. In other words, whatever capabilities they reach (even if there is never another model), those stay forever. That can't be said for API-based models, where a provider can sunset models whenever they feel like it (e.g. gpt5-mini will soon be gone, replaced by a more expensive 5.4-mini, same for goog, etc).

And there will always be incentivised parties that release models. Nvda for one has every incentive to keep the nemotron line going, as they're directly profiting from people running this. And the models aren't really far from open SotA anyway.

Goog will probably continue to release the small models, since they'll use them for browser stuff anyway, and know that they'll leak. So for them it's a win-win to release the small models and gain some dev market share.

And the Chinese labs also have incentives to keep releasing models, and will likely continue to get gov support to do so (yay commercial wars between nations).

mirekrusin today at 8:25 PM

Closed weight models are a result of philanthropy by some private investors.

c0rruptbytes today at 8:01 AM

DeepSeek isn't philanthropy, it's a hedge fund trying to short the western AI market by saying "hey, we can do 90% of what they can (arguably better on a density metric) for 1/10th of the cost"

it's my theory at least, the Hindenburg Research of AI

jamiedborin1 today at 9:38 AM

I am the original author of the post - thanks for reading it!

I think the future of open weights models will be similar to fabless chip design companies. There will be companies that can train models and they will licence those models to inference companies that manage the APIs.

The inference companies need much less capital, and the training companies don't need to divert resources from training to inference.

Some of the Chinese model training companies are already doing this and licencing their models to inference providers.

fridder yesterday at 9:59 PM

We need a SETI@Home but for model training

throwawayffffas yesterday at 11:45 PM

I don't think that's the case, it's not philanthropy, they are getting something out of it. The labs are learning from one another from the shared models.

Plus I am certain it makes financial sense. I am guessing here, but fully utilizing a subscription's limits probably costs the operator more money than the subscription revenue. That is why Anthropic is making such a big stink about the Chinese data harvesting. By releasing the weights, you relieve yourself of that burden, because the competition does not need to hammer your subscription service. They can just download your model, analyze it, and run it all day.

Also, for the largest models it makes no sense to run them yourself unless you are a major player. Renting the hardware is ludicrously more expensive than their subscription: tens of thousands of dollars. And buying the hardware to run them costs hundreds of thousands of dollars.

Shitty-kitty yesterday at 9:53 PM

It's just a smart business decision that allows their models to compete and gain market-share against much pricier private models. No philanthropy there.

alecco today at 5:58 AM

> Until there's some sort of "community owned hardware"

The hardware is already available for renting at reasonable prices. We need community funding. I wish people pooled a fraction of the money they burn on local GPU rigs into funding training/testing/etc.

A big problem is the same as in open source: it's way too atomized. Just one competitive ground-up community LLM would require tens of millions of dollars. But who gets to pick?

IMHO the only chance is highly specialized and smaller LLMs instead. And this is still millions to train.

And remember, LLMs are competitive for only a handful of months.

recursive yesterday at 10:04 PM

This seems backwards. Access to Fable can be removed. I don't see how an open weight model can ever be put back into the bag though.

matheusmoreira today at 6:50 AM

I wish we had some kind of distributed training capability... Like Folding@home, but for LLMs.

UncleOxidant yesterday at 10:38 PM

> The spigot can be turned off at any time.

True. And it's possible that this has already happened at Alibaba Qwen - at least for the smaller models that people had a chance of running at home (122B and smaller).

notnullorvoid yesterday at 10:17 PM

> Until there's some sort of "community owned hardware"

Or until some bright people figure out drastically more efficient means of training.

Eridrus today at 5:21 AM

I think the bigger issue is the ever increasing capital requirements, which may cause even the closed weight companies to fall away from the frontier, e.g. Google & Meta are barely hanging on. For Google it feels a bit existential to remain at the frontier, but even then they're barely there.

I hope that we find ways of continuing to improve these models other than exponentially increasing capex spend until all but one of the competitors falls away.

40four today at 4:16 AM

We should address the elephant in the room. The problem with the future of open weight models is not that they are created as a result of philanthropy by some private org. All of the top contenders are created by the Chinese government.

I don’t think we should describe these companies as simply releasing these highly capable open weight models out of the goodness of their hearts.

gwerbin today at 3:44 AM

Isn't this also true of a lot of FOSS software and libraries? TensorFlow and PyTorch for example, among many others.

slashdave yesterday at 11:13 PM

Training these models is not a "hardware" problem.

ForHackernews yesterday at 10:15 PM

It's not pure philanthropy: https://gwern.net/complement

jmyeet yesterday at 11:17 PM

How is this a complaint? Once you have the model, you have the model. Download DeepSeek-R1 671B and you have it. You might not get improvements in the future, just like you may not ever get a future release of an open source project. Is that an indictment of open source?

But consider the alternative. OpenAI and Anthropic can shut off your account or API key at any time for any reason. How is this better? You have way more security when you're running your own model.

alfiedotwtf today at 5:07 AM

Exactly my worry. I’m optimistic that in the future the EU, the EFF, GNU, or the Linux Foundation could be the umbrella to run a LARGE open model for everyone.

It’s sad to think that Mozilla spent years and millions on virtual reality and AI. They would have been perfect for this, but let’s face it - who knows if Mozilla will be around even 5 years from now.

krater23 today at 12:01 PM

Have no fear, after the bubble bursts there will be more than enough cheap hardware, especially for this.

willmadden today at 5:40 PM

[flagged]