Accessibility problems are something both retained-mode and immediate-mode UIs have generally. I've found that you can kind-of hack it together but the best route is to incorporate the actual accessibility frameworks of the operating system(s) your targeting. Egui was doing this at one point I think but I'm pretty sure it's either broken now or just doesn't work all that well.
There are libs that function as a messenger between your "hack" and the OS on this.
You don't lose anything.
This is but my opinion, but toolkits tend to be opinionated piles of **. Talking directly with the GPU to paint stuff is often just saner to me.
The problem with immediate mode is that most a11y frameworks expect all UI elements to have a stable identity.
Imagine you have a to-do list of 100 items. What happens when a remote user drags and drops item 100 to lie between items 1 and 2? In retained mode, that's clearly communicated to the UI toolkit; the widget representing that todo is told to change its position in the list. In immediate mode, you can't just destroy and re-create the a11y tree on each render. You need some kind of tree diffing algorithm to figure out that it's one op (move item) instead of ~200 ops (change the name and checkbox state for all items between 2 and 100).