Hacker News

danieltanfh95 · last Monday at 12:13 PM

https://github.com/danieltanfh95/replsh

LLMs are surprisingly bad at using REPLs, so I made a CLI that handles sync, streaming, and async REPL evals over Docker, SSH, or local sessions, supporting Python and Clojure. I'd also proudly claim it has been successful in maintaining agent quality, because the agents stay grounded in code.

https://github.com/danieltanfh95/agent-lineage-evolution/ aka `succession`

My solution for infinite context and persistent instruction following (very important for replsh and grounding, since LLMs are very bad at using tools outside of their training harness) is to build a persistent, self-resolving identity for the agent.

These two tools now power my day and are crucial in letting me use Claude models beyond their supposed "nerfs":

1. succession handles instruction drift, which will only get worse as LLMs get better at reasoning (this seems counterintuitive until you realise that CLAUDE.md etc. is only injected at the start of the context, and the distance from it keeps growing as the conversation lengthens)

2. replsh grounds the LLM, avoiding pure mental tracing and hallucination while letting it test code as it writes it.

3. Clojure is, surprisingly, the most productive language I use LLMs with, given its package of data-driven design, domain-driven design, emphasis on data shape and layers, lack of syntax, and overall less code written, which leads to fewer bugs.
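The grounding idea in point 2 can be sketched generically. This is not replsh's actual implementation (its sync/streaming/async transports over Docker and SSH are far more involved); it's just a minimal illustration in Python of the core principle: keep one long-lived interpreter session so the agent's later snippets run against state it created earlier, and every claim about the code is checked by a real evaluation instead of mental tracing.

```python
import code
import contextlib
import io


class GroundedRepl:
    """Minimal persistent REPL session (illustrative only, not replsh).

    State survives across evals, so an agent can define a function in
    one step and actually test it in the next, seeing real output.
    """

    def __init__(self):
        # A single interpreter holds the namespace for the whole session.
        self.interp = code.InteractiveInterpreter(locals={})

    def eval(self, src: str) -> str:
        # Capture stdout/stderr so the agent is shown what really
        # happened, including tracebacks from failed snippets.
        buf = io.StringIO()
        with contextlib.redirect_stdout(buf), contextlib.redirect_stderr(buf):
            self.interp.runsource(src, symbol="exec")
        return buf.getvalue()


repl = GroundedRepl()
repl.eval("def add(a, b):\n    return a + b")
print(repl.eval("print(add(2, 3))"))  # the agent sees the real result, not a guess
```

The key design point is the persistent namespace: a fresh subprocess per snippet would force the model to re-derive context every time, while a session lets it build up and probe state incrementally, the way a human works in a REPL.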