This is why local AI is so important

reactordev • today at 11:00 AM • 9 replies

Replies

It's already being trained on "public" (ethical or otherwise) data. So, it already has ingested that kind of "optimization" during pre-training and training.

I don't think you can fine-tune your way out of it.

rplnt • today at 11:10 AM

That doesn't solve this particular problem. Your local model was trained on reddit comments written by bots.

soloto • today at 11:14 AM

Local AI will have the bias that existed at the time of its training, which is different from no bias. For stuff that needs to be current, a local LLM would need to search the net regardless.

Schweigerose • today at 11:13 AM

How do you make sure that the model you run locally is not tainted? Is there even a way to confirm this without providing the complete training set?

jondea • today at 11:11 AM

It's less compromised, but it's still basing the answer on compromised queries. This is why I pay for independent reviews (e.g. Which), whose incentives are more aligned with yours.

rdtsc • today at 11:14 AM

Not if the models come from Google. The ads will be implicit in the model. "X is better than Y and Z" would be easy to add to the training set.

FergusArgyll • today at 11:10 AM

How does that help if it's using search? You get whatever the search engine outputs

weird-eye-issue • today at 11:53 AM

Local AI models pull in search results just like ChatGPT does ...

And they are trained on web data just like any other model...
