Hacker News

anon373839 yesterday at 10:51 PM

> China's distillation labs

This notion that Chinese labs are merely distilling frontier models is quite an unwarranted slur. Those labs have published WAY more useful research than US labs on RL techniques, novel model architectures, training pipelines, etc. They have also hit intelligence-per-parameter densities that US labs have yet to attain.

Apart from that, merely training a model on another model's outputs, off-policy and without the logits, doesn’t really work that well.
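
To make the "without the logits" point concrete, here is a minimal sketch with toy numbers. The function names and the 3-token vocabulary are made up for illustration and are not taken from any lab's pipeline. It contrasts the signal you get from a teacher's full distribution with the signal you get from a single sampled token, which is all an API transcript gives you.

```python
import math

def softmax(logits):
    # Numerically stable softmax over a plain list of floats.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def soft_label_loss(student_logits, teacher_logits):
    # KL(teacher || student): requires the teacher's full distribution (logits).
    p = softmax(teacher_logits)
    q = softmax(student_logits)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def hard_label_loss(student_logits, sampled_token):
    # Cross-entropy against one sampled token id: no teacher distribution needed.
    q = softmax(student_logits)
    return -math.log(q[sampled_token])

# Hypothetical values for a single position in a sequence.
teacher_logits = [2.0, 1.5, -1.0]
student_logits = [0.0, 0.0, 0.0]

print(soft_label_loss(student_logits, teacher_logits))
print(hard_label_loss(student_logits, sampled_token=0))
```

The soft-label loss tells the student how the teacher weighs every token at that position, including the near-tie between the first two. The hard-label loss only says "token 0 was sampled this time", so the student sees far less information per position.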

The Chinese labs know how to build frontier-level models. GLM-5.2 shows that they no longer even need Nvidia chips to do it.


Replies

trollbridge today at 1:18 AM

It's one of those lies people tell themselves to make themselves feel better. "Oh, they're just copying my stuff."

Chinese labs are basically telling everyone, out in the open, what they're doing and how to do it. The answer from American frontier labs is "Well, they couldn't possibly be getting the results they're getting without just distilling our models." Meanwhile, the American labs aren't even trying some of the same things, like DS's aggressive caching to get costs down.

Vaslo yesterday at 11:54 PM

I recently watched a video about one of these “Chinese models”; it kept insisting it was Claude when the user asked. Sorry, there’s no “slur” here, just legit suspicion.

halJordan yesterday at 11:53 PM

But have they? I understand that the Chinese side is illuminated and the American side is dark. I disagree that the Chinese labs have created anything that isn't already in an American research lab or production data center. Sure, the Chinese labs have published their findings, and not for nothing. But are they novel? Unlikely, IMO.
