Norway's 2 petabytes of Huawei flash storage and LLM training

316 points • by rbanffy • yesterday at 7:37 PM • 198 comments

Comments

I'm a Norwegian, and I use the national library almost every day for searching through texts. They have truly one of the best working user interfaces (and functionality) for searching through the massive amounts of text.

KeplerBoy • yesterday at 10:11 PM

How true is this statement: "He asserted that any country with its own language that did not have a sovereign LLM trained in that language was at a disadvantage as a globally trained, English-speaking LLM would not know about that country’s history, news and culture that was described in the local language."

I thought all big players already train on basically everything remotely available to them no matter the language or quality, so his take sounds like an opinion formed in the early days of generally available LLMs.

solenoid0937 • yesterday at 8:39 PM

> The Olivia system is an HPE Cray Supercomputing EX system, with 448 GPUs and 64,512 CPU cores.

Training a sovereign LLM with this meager hardware as opposed to a LORA on some open source model seems like a huge mistake and a potential red flag.

There is no way these people have the resources to train a fully fledged LLM, so claiming that is their goal makes me think they don't intend for the LLM to be useful.

Which begs the question, whose money are they wasting - and why?

timmg • yesterday at 9:14 PM

I wonder if instead (or in parallel), Norway should build a set of training data and share it (for free) with all the model builders.

Seems like making the frontier models know Norwegian and their culture is a better (or additional!) way to reach the end they are going for here.

rafram • today at 2:54 AM

> Marius Husnes, the Head of IT Platform at the library (Nasjonalbiblioteket) discussed the project at Huawei’s ID Forum 2026 in Paris, saying that no commercial LLM provider was developing a local (Norwegian) language LLM. He asserted that any country with its own language that did not have a sovereign LLM trained in that language was at a disadvantage as a globally trained, English-speaking LLM would not know about that country’s history, news and culture that was described in the local language.

I am not overly confident that Marius Husnes knows what he’s talking about here.

seanvk • today at 5:28 AM

The Welsh language getting LLM training with Nemotron

https://www.bangor.ac.uk/news/2025-09-15-reaching-across-the...

rldjbpin • today at 9:26 AM

may not be the most efficient way to go about things, but there remains a seemingly obvious use case for non-latin languages to do things from scratch.

see sarvam.ai and their tokenisation improvements on local languages [1]. not every llm needs to help with coding, nor does it need to already be a Babel fish.

language is culture, so i can see the motivation behind their initiative. it must be nice to afford to do this yourself.

[1] https://www.sarvam.ai/blogs/sarvam-30b-105b

danborn26 • today at 9:28 AM

This is a massive storage deployment. Given the I/O demands of LLM training, especially for checkpointing, moving to this scale of NVMe flash makes sense compared to traditional disk arrays.

Levitz • yesterday at 8:37 PM

>As Husnes put it; Norway is a small country solving a problem every non-English-speaking nation will face: how do you build AI that reflects your language, your culture and your history? AI needs custodians, not just builders.

I'm afraid the answer is, mostly you don't.

Such a thing requires strong political will that, at least in my environment, seems basically impossible to align.

The costs are prohibitive, but beyond that, the type of person who cares about local representation like that is either completely fine with letting foreign companies implement it (after all, you can use ChatGPT in Basque if you want to) or is against the idea of AI altogether.

yokoprime • today at 6:06 AM

The wording in this article is a bit strange, why the extreme focus on the brand of storage media? Also, the term LLM seems to be used in a very broad way here, are they actually building a language model from scratch, or are they fine-tuning?

dmos62 • today at 8:01 AM

Huawei? You'd think that the recent European aversion to using overseas providers would have reached Norway's public sector too.

postepowanieadm • today at 6:51 AM

Norway isn't in the EU (no restrictions on Huawei) and has cheap electricity, could become an ai powerhouse.

dalemhurley • yesterday at 8:44 PM

How about that, they actually asked for permission to use data and the companies said yes.

arjie • yesterday at 8:53 PM

This can’t be right. 2 PB of flash is like $200k. It’s within reach of many individuals. Then again I guess you don’t need that much storage so maybe it is.

Den_VR • yesterday at 8:31 PM

> He asserted that any country with its own language that did not have a sovereign LLM trained in that language was at a disadvantage as a globally trained, English-speaking LLM would not know about that country’s history, news and culture that was described in the local language.

I don’t know that this is true. But whatever sounds true enough and gets funding seems to be what flies these days.

petterroea • today at 6:34 AM

As a Norwegian I have never needed a Norwegian language model. Doing most things in Norwegian puts you at a disadvantage internationally anyways. Maybe this has value in schools, but wouldn't it just give kids more trust in relying on LLMs? My friends who work in education report that group work has become insufferable because many do not think critically and ask an LLM to verify everything. I really don't see a benefit, but maybe they will find one - that is what research is for.

I am reminded that we recently concluded that our experiment of forcing things to be digital in schools was a flop. These things have a cost if we are wrong.

Schlagbohrer • today at 7:48 AM

Sapir-Whorf hypothesis but for AI

kvam • yesterday at 8:40 PM

As a Norwegian this sounds like a mistake. Who will use this LLM? Where? For what? The underlying data could be made more easily searchable and digestible for agents in general if the goal is better knowledge of Norwegian culture.

6510 • today at 5:41 AM

What is called culture here will increasingly be propaganda. It reminds me of people cheering twitter as a replacement for RSS or using facebook to communicate with your customers rather than email. You don't know which company will win or who might control it in the future, and we can't predict what it will cost. It doesn't take much to be very annoying.

ipsum2 • yesterday at 8:33 PM

This is how much storage the average r/datahoarder user has in their basement. Fewer than 100 hard drives.

DeathArrow • today at 6:07 AM

I thought the US had already coerced most countries into not buying hardware from Huawei.

At least in my country, Chinese companies have been barred from official tenders and procurement.

kreyenborgi • yesterday at 8:38 PM

Ad for Huawei?

dzhiurgis • yesterday at 9:50 PM

That's about 350MB per capita. Humans can produce 2-6kb per hour. That's 13 years of non-stop typing. Wonder where it all comes from. I guess it's websites that aren't compressed / extracted.
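
A rough sanity check of those numbers, as a sketch only. The population of about 5.6 million and the decimal definition of a petabyte are assumptions, not figures from the thread, and the typing rates are just the 2-6 kB/hour range quoted above plus a midpoint.

```python
# Back-of-the-envelope check of the per-capita storage figures.
# ASSUMPTIONS (not from the thread): Norway's population is ~5.6 million,
# and 1 PB = 10**15 bytes (decimal units).

STORAGE_BYTES = 2 * 10**15      # 2 PB of flash
POPULATION = 5_600_000          # assumed population
HOURS_PER_YEAR = 24 * 365       # non-stop typing, no sleep


def per_capita_bytes(storage_bytes: float, population: int) -> float:
    """Storage divided evenly across the population."""
    return storage_bytes / population


def years_of_typing(total_bytes: float, bytes_per_hour: float) -> float:
    """Years of continuous typing needed to produce total_bytes."""
    return total_bytes / bytes_per_hour / HOURS_PER_YEAR


share = per_capita_bytes(STORAGE_BYTES, POPULATION)
print(f"{share / 1e6:.0f} MB per capita")

# 2 and 6 kB/hour are the quoted bounds; 3 kB/hour is an assumed midpoint.
for rate in (2_000, 3_000, 6_000):
    print(f"{rate} B/hour -> {years_of_typing(share, rate):.1f} years")
```

Under these assumptions the per-capita share lands close to the 350MB quoted, and the 13-year figure corresponds to a rate of roughly 3 kB per hour rather than either end of the 2-6 range.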

jauntywundrkind • yesterday at 8:28 PM

384 core cpu cluster? 2 petabytes?

Dell just launched a 2U that fits almost 10 petabytes in it. It's probably not 384 core capable but that is very doable right now, Epyc chips are 192 cores each! https://www.techradar.com/pro/dell-launches-record-shatterin...

7e • yesterday at 8:24 PM

2 PB? They will not come close to training on that amount. Maybe years from now.

yanhangyhy • today at 2:22 AM

so now Huawei is not a threat to 'democracy' anymore?

dakolli • yesterday at 11:23 PM

Even entire governments are captured by a mild LLM psychosis. Which is sad in the case of Norway. I lived in Norway for two years and always found their government to be highly rational, this is not a rational use of public funds (but I suppose they have plenty of capital).

Western society is completely captured by this form of psychosis and it's going to bite us in the a* very soon.

I firmly believe all the Boomer leaders throughout the world are being sold a bag of lies by technocrats that "AI", specifically LLMs, is going to cure disease and death, and therefore they are willing to hand over all control to the technocrats. Fckin croakers at it again.

sspoisk • today at 9:35 AM

[flagged]

hottrends • yesterday at 10:48 PM

[flagged]

huss-mo • yesterday at 9:53 PM

[flagged]

hank808 • yesterday at 9:36 PM

Ehhh. None of this sounds right. Translation problems maybe. Lack of technical understanding maybe... I don't know. Probably not news.
