Eh, reduction counting isn't magic. Golang manages similar preemption semantics without counting that many operations (some tight loops do have barriers inserted every so often, but that's the exception and not the rule). And reduction counting has some serious costs! It slows the runtime down a shitload (and the BEAM is already in the bottom half of interpreted language runtimes by speed) and makes lots of JIT-flavored runtime optimizations slower or harder to implement.
I like immutability too; I wish Java and Golang did more of it. It costs a lot in terms of unexpected copies in the BEAM though, there's less copy-elision optimization than you'd think. That especially bites if you're doing a ton of message passing, because of how process heaps are implemented and how garbage collection (traditional or ETS/ThreadProgress-based) works.
I think what I want is something like Golang but with goroutine-based ownership semantics (or Rust with the Go runtime and goroutines): en excellent scheduler for extremely light-weight green threads, no refcounting or reduction counting, and all the clever optimizations around channel sending and copy elision--but no ability to use a value after it's sent to a channel, and only channel-based access to shared global state. That'd get most of the benefits of process-local heaps but without the (copying, cache/memory fragmentation) drawbacks.
These are all true and I have recognized those as innate limitations of the BEAM VM. For now I am OK with those but I am already skirting at the limit and I am starting to want to jump to Golang and Rust again.