13rac1 • yesterday at 5:47 AM • 1 reply

> Generated in 0.008s • 14,293 tok/s

Chat Jimmy runs ~300X faster than the ~50 tok/s you are used to. What could you do differently if you could generate code 3,000-30,000X as fast as you could write it yourself? What if it was all good-quality code? What would you do differently if it were 100,000X faster? Mtok/s? Gtok/s?

Replies

cyanydeez • yesterday at 12:29 PM

Refine that to: what if your harness grew to encompass a larger, slower model and adapted to both the model and the project? That's where I expect the harness to go.

Use the big models to code an adaptive small model. Train it to use and build tools. Give it a standard template language for any project and bake it into a chip.

Right now, LLMs are great because they don't need much data pruning, but once they break through to the functional components, the first thing to do is train a well-scoped harness builder.
