Cool. I made this thing a while back, but I really like your fork spawn parallelism:
https://github.com/waynenilsen/crumbler
This uses recursive task decomposition but is single-threaded by design. Honestly, that's fast enough for me, and it makes things easier to reason about.