Hacker News

stevefan1999, today at 12:45 PM

Well, knowledge distillation requires a teacher model and a student model. The student tries to learn, extract, and (preferably) compress the information held by the teacher, so model collapse is possible if the signal-to-noise ratio between the two is poor [1].
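
To make the teacher/student setup concrete, here is a minimal sketch of one common form of distillation loss: the KL divergence between temperature-softened teacher and student output distributions. The function names, the temperature value, and the example logits are my own assumptions for illustration, not anything a particular lab is known to use.

```python
import math

def softmax(logits, temperature=1.0):
    """Turn raw logits into probabilities, softened by the temperature."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL(teacher || student) on softened distributions, scaled by T^2.

    Hypothetical helper: real training code would average this over a
    batch and usually mix it with a loss on the ground-truth labels.
    """
    p = softmax(teacher_logits, temperature)  # teacher's soft targets
    q = softmax(student_logits, temperature)  # student's predictions
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
    return temperature ** 2 * kl

# Made-up logits: a student that disagrees with the teacher gets a
# positive loss, and one that matches it exactly gets zero.
teacher = [4.0, 1.0, 0.5]
print(distillation_loss(teacher, [1.0, 3.0, 0.5]))
print(distillation_loss(teacher, teacher))
```

The student only ever sees the teacher's outputs here, never its weights, which is why this is a lossy way to copy a model.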

What I suggested instead is stealing the (possibly intermediate) weights by sniffing the network communication bus, i.e. a MITM attack that recovers the exact values. That would work unless it turns out OpenAI or Anthropic use homomorphic encryption; otherwise I'm not sure how Anthropic could safely let Mythos run on AWS infrastructure it doesn't control.

[1]: https://en.wikipedia.org/wiki/Knowledge_distillation