vintermann • today at 4:15 AM • 1 reply

Foreign LLMs are probably not trained on the Norwegian National Library. I regularly find things in there (with regular keyword search, for genealogy) that neither search engines nor language models know about.

Of course I then usually put the information I'm interested in somewhere AI could scrape it. But it would take a long, long time to get everything interesting out of there.

Replies

intronic • today at 4:27 AM

Yep, the article says: "... the National Library ... has the single largest digital collection of Norwegian books, newspapers, web pages ... it is entitled to receive copies of every published book and broadcasted content. Its legal deposit mandate in this area extended beyond books, as it was duty-bound to collect and preserve all of Norway’s cultural heritage ... an agreement with Norwegian newspapers permitted LLM training on copyrighted content."

Husnes said: “No private company has this.”

So yeah, they seem to have proprietary data...
