sgt • yesterday at 8:48 PM • 1 reply

I think they won't train on the full 2TB, but will use it as the data lake to start from and then apply a more targeted approach.

Replies

winddude • yesterday at 9:15 PM

If you read the article, the 2 PB is flash storage in the data pipeline, used to dedupe, clean, normalize, etc. the 60 PB of raw data before training.
