SRV is essentially a simple layer of abstraction that provides (via one approach) the required end result (reachability + UX) and is easy to add to any $PROTO client without. Supporting ESNI, by contrast, would significantly complicate the actual lib/protocol, increase the dev and maintenance work required all around, and require more infrastructure and more invasive integration than any DNS-enabled service already uses.
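
To give a rough sense of how little client-side logic SRV needs, here is a minimal sketch of the target-selection step described in RFC 2782 (lowest priority first, weighted random choice within a priority). It assumes the records have already been fetched by whatever resolver the client uses; `SrvRecord`, `order_srv`, and the `example.org` hostnames are hypothetical names for illustration, not part of any particular $PROTO library.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class SrvRecord:
    """One SRV answer: the four RDATA fields defined in RFC 2782."""
    priority: int
    weight: int
    port: int
    target: str


def order_srv(records, rng=None):
    """Return records in the order a client should try them.

    Lower priority values come first. Within one priority, records are
    drawn by weighted random selection, as RFC 2782 describes.
    """
    rng = rng or random.Random()
    ordered = []
    for prio in sorted({r.priority for r in records}):
        group = [r for r in records if r.priority == prio]
        # Zero-weight records go first so they keep a small chance of selection.
        group.sort(key=lambda r: r.weight != 0)
        while group:
            total = sum(r.weight for r in group)
            pick = rng.randint(0, total)
            running = 0
            for i, rec in enumerate(group):
                running += rec.weight
                if running >= pick:
                    ordered.append(group.pop(i))
                    break
    return ordered


# Hypothetical answers for something like _proto._tcp.example.org
answers = [
    SrvRecord(20, 0, 5000, "backup.example.org"),
    SrvRecord(10, 60, 5000, "a.example.org"),
    SrvRecord(10, 40, 5001, "b.example.org"),
]
for rec in order_srv(answers):
    print(rec.target, rec.port)
```

The client then connects to each target and port in turn until one succeeds, which is the whole integration; nothing in the protocol itself has to change.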