So, what I'm trying to understand, and I can't find any clear information about it in the article, is how they "compiled" e.g. the Sudoku solver into a Transformer's weights. Did they do it manually? Say, they took the source of a hand-coded Sudoku solver and put it through their code-to-weights compiler, and thus compiled the code into Transformer weights? Or did they go the Good, Old-Fashioned, Deep Learning way and train their Transformer to learn a ("100% correct"!) Sudoku solver from examples? And if the latter, where are the details of the training? What did they train with? What did they train on? How did they train? etc etc.
Very light on details that article is.
The article states that they trained the Transformer to act as a WASM interpreter, with programs represented as WASM bytecode.
My interpretation is that they built a simple virtual machine directly into the weights, compiled a WASM runtime to that machine's instruction set, and then compiled the Sudoku solver to WASM to run on that runtime.
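To make the "interpreter as a step function" idea concrete, here's a minimal sketch of a toy stack machine. This is purely illustrative: the opcodes and state layout are invented, not the actual WASM encoding from the article. The point is that interpretation reduces to repeatedly applying a state-transition function (code, pc, stack) → (code, pc', stack'), which is the kind of bounded, deterministic update a Transformer layer could in principle encode in its weights.

```python
# Toy stack-machine interpreter (hypothetical opcodes, NOT real WASM).
# Illustrates interpretation as repeated application of a step function.

def step(state):
    """Apply one interpreter step: (code, pc, stack) -> (code, pc', stack')."""
    code, pc, stack = state
    op = code[pc]
    if op == "push":                       # push immediate operand
        stack = stack + [code[pc + 1]]
        pc += 2
    elif op == "add":                      # pop two values, push their sum
        stack = stack[:-2] + [stack[-2] + stack[-1]]
        pc += 1
    elif op == "halt":                     # jump past end to terminate
        pc = len(code)
    return (code, pc, stack)

def run(code):
    """Run the program to completion and return the final stack."""
    state = (code, 0, [])
    while state[1] < len(code):
        state = step(state)
    return state[2]

print(run(["push", 2, "push", 3, "add", "halt"]))  # -> [5]
```

Note that `step` is a pure function on a small, discrete state, so "learning the interpreter" amounts to learning this transition table rather than learning any particular program; the program itself stays data (bytecode), which matches the layering described above.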