Have you thought about ways to include agents' sessions / reasoning traces in this storage layer? I can imagine that a RAG system on top of those, plus LLM-written publications, could help future agents figure out how to get around problems that previous runs ran into.
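A minimal sketch of what that lookup could feel like, assuming the store keeps (problem summary, resolution) pairs from past runs. All names here are hypothetical, and the toy bag-of-words similarity stands in for a real embedding model:

```python
from collections import Counter
import math

def embed(text):
    # Toy bag-of-words vector; a real system would use an embedding model.
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

class TraceStore:
    """Hypothetical store of (problem summary, resolution) pairs
    distilled from previous agent sessions."""
    def __init__(self):
        self.entries = []

    def add(self, problem, resolution):
        self.entries.append((embed(problem), problem, resolution))

    def lookup(self, problem, k=3):
        # Retrieve the k most similar past problems and how they were solved.
        q = embed(problem)
        ranked = sorted(self.entries, key=lambda e: cosine(q, e[0]), reverse=True)
        return [(p, r) for _, p, r in ranked[:k]]
```

So a future agent hitting "compile error: missing header" could pull up that a prior run fixed a similar failure by installing a dev package, before burning tokens rediscovering it.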
It could also serve as an annealing step: retrying a different earlier branch of reasoning if new information increases the estimated value of that path.
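One way that annealing step could be sketched: keep abandoned branches in a pool, re-score them when new information arrives, and jump back if a shelved branch now outranks the current path. This is just an illustrative sketch; `BranchPool` and its methods are made up:

```python
import heapq

class BranchPool:
    """Hypothetical pool of abandoned reasoning branches. When new
    information arrives, branches are re-scored, and the agent can
    resume the best one if it now beats the current path."""
    def __init__(self):
        self._heap = []  # max-heap via negated scores

    def shelve(self, branch_id, score):
        # Park a branch with its current estimated value.
        heapq.heappush(self._heap, (-score, branch_id))

    def rescore(self, adjust):
        # `adjust` maps (branch_id, old_score) -> new score given new info.
        self._heap = [(-adjust(bid, -s), bid) for s, bid in self._heap]
        heapq.heapify(self._heap)

    def best(self):
        # Highest-value shelved branch, or None if the pool is empty.
        return None if not self._heap else (self._heap[0][1], -self._heap[0][0])
```

The interesting policy question is the acceptance rule, i.e. how much better a shelved branch has to look before it's worth paying the cost of switching back to it.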