The main reason I am building my own agentic environment is that I need full control and reproducibility of what I am building.
Post November and post openclaw agentic environments need to be built differently, and for selfhosting models the context size problem really requires a strong harness which intelligently helps reduce context size.
Planner/orchestrator architecture, agent to agent summarizer, specification based tools (fck all this markdown memory bullshit btw), tool call shrinking, and workflow management are all really important because of the context size problem.
Nobody has enough VRAM for the large K/V caches, and nobody can afford f16/f32 caches in terms of memory, which are also necessary for longer conversations. MoE 30b models have improved so much though, qwen 3/3.6 coder is the real champion doing almost the same things with less than 1/10th the memory requirements. Just think about that in terms of engineering and what your bet is going to be. Haiku pales in comparison.
Currently my focus with exocomp is trying to figure out how I can record, replay, restart, and debug workflow sessions of agents in a better manner so that I as a human can understand what's going on. Currently I think that UI will be something like a gantt chart where you have a graph with connections representing agent to agent communication. And yes, that's a lot of fiddling with SVG as it turns out, so I'm not quite there yet.
Anyways, in case you're interested. I'm manually building this env and trying to unit test the critical parts. [1]