Hacker News

reactordev today at 11:00 AM

This is why local AI is so important


Replies

bayindirh today at 11:10 AM

It's already being trained on "public" (ethical or otherwise) data. So, it already has ingested that kind of "optimization" during pre-training and training.

I don't think you can fine-tune your way out of it.

rplnt today at 11:10 AM

That doesn't solve this particular problem. Your local model was trained on reddit comments written by bots.

soloto today at 11:14 AM

Local AI will have the bias that existed at the time of its training, which is different from no bias. For stuff that needs to be current, a local LLM would need to search the net regardless.

Schweigerose today at 11:13 AM

How do you make sure that the model you run locally is not tainted? Is there even a way to confirm this without providing the complete training set?

jondea today at 11:11 AM

It's less compromised, but it's still basing the answer on compromised queries. This is why I pay for independent reviews (e.g. Which), whose incentives are more aligned with yours.

rdtsc today at 11:14 AM

Not if the models come from Google. The ads will be implicit in the model. "X is better than Y and Z" would be easy to add to the training set.

FergusArgyll today at 11:10 AM

How does that help if it's using search? You get whatever the search engine outputs.

weird-eye-issue today at 11:53 AM

Local AI models pull in search results just like ChatGPT does...

And they are trained on web data just like any other model...