so you replace one costly sweeping with a costly sweeping. i wouldn't call that an advantage in any way over junping n bytes.
what you describe is the bare minimum so you even know what you are searching for while you scan pretty much everything every time.
What do you mean? What would you suggest instead? Fixed-length encoding? It would take a looot of space given all the character variations you can have.