Hacker News

barrkel yesterday at 11:22 PM, 4 replies

You usually want O(1) indexing when you're implementing views over a large string. For example, when a string contains a possibly multi-megabyte text file, you want to avoid copying out of it and work with slices where possible. That covers anything from editors to parsing.

I agree though that usually you only need iteration, but string APIs need to change to return some kind of token that encapsulates both logical and physical index. And you probably want to be able to compute with those - subtract to get length and so on.
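A minimal sketch of such a token, assuming the string is stored as valid UTF-8 bytes. The names `Pos` and `positions` are hypothetical and not taken from any existing string API; the point is only that one value can carry both the code point index and the byte offset, and that subtracting two of them gives a length in both units.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pos:
    """Hypothetical position token: logical and physical index together."""

    cp: int    # logical index, counted in code points
    byte: int  # physical index, counted in bytes of the UTF-8 buffer

    def __sub__(self, other: "Pos") -> "Pos":
        # Distance between two positions, in both units at once.
        return Pos(self.cp - other.cp, self.byte - other.byte)


def positions(buf: bytes):
    """Yield a Pos at every code point boundary of valid UTF-8, including the end."""
    cp = 0
    for byte, b in enumerate(buf):
        if b & 0xC0 != 0x80:  # not a continuation byte, so a code point starts here
            yield Pos(cp, byte)
            cp += 1
    yield Pos(cp, len(buf))


buf = "na\u00efve caf\u00e9".encode("utf-8")
marks = list(positions(buf))
start, end = marks[6], marks[10]   # boundaries around the second word
length = end - start               # 4 code points, 5 bytes
word = buf[start.byte:end.byte].decode("utf-8")
```

Producing the tokens here is still a linear scan, which fits the point that iteration is the common case; once you hold two tokens, slicing by `byte` is O(1).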


Replies

ori_b today at 12:37 AM

You don't particularly want indexing for that, but cursors. A byte offset (wrapped in an opaque type) is sufficient for that need.
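One way this could look, as a rough sketch: the cursor is just a wrapped byte offset, and the buffer is the only thing that reads it. `Cursor`, `Text`, and their methods are hypothetical names for illustration, and the buffer is assumed to hold valid UTF-8.

```python
from typing import Optional


class Cursor:
    """Hypothetical opaque position; callers pass it around but never read the offset."""

    __slots__ = ("_off",)

    def __init__(self, off: int):
        self._off = off  # byte offset into the UTF-8 buffer


class Text:
    """Hypothetical text buffer that hands out cursors and zero-copy views."""

    def __init__(self, s: str):
        self._buf = s.encode("utf-8")

    def begin(self) -> Cursor:
        return Cursor(0)

    def next(self, c: Cursor) -> Cursor:
        """Advance one code point by skipping UTF-8 continuation bytes."""
        off = min(c._off + 1, len(self._buf))
        while off < len(self._buf) and self._buf[off] & 0xC0 == 0x80:
            off += 1
        return Cursor(off)

    def find(self, needle: str, start: Cursor) -> Optional[Cursor]:
        """Byte-level search; in valid UTF-8 a match starts on a code point boundary."""
        off = self._buf.find(needle.encode("utf-8"), start._off)
        return None if off < 0 else Cursor(off)

    def view(self, a: Cursor, b: Cursor) -> memoryview:
        """Slice between two cursors without copying the underlying bytes."""
        return memoryview(self._buf)[a._off:b._off]


t = Text("key = v\u00e4lue\n")
eq = t.find("=", t.begin())
after = t.next(t.next(eq))        # step over "=" and the following space
nl = t.find("\n", after)
value = bytes(t.view(after, nl)).decode("utf-8")
```

Nothing here needs code point indexing: searching, stepping, and slicing all work on byte offsets, and the copy only happens at the end when the view is decoded.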

naniwaduni today at 1:34 AM

You really just very rarely want codepoint indexing. A byte index is totally fine for view slices.

nostrademons yesterday at 11:56 PM

Sure, but for something like that whatever constructs the view can use an opaque index type like Animats suggested, which under the hood is probably a byte index. The slice itself is kinda the opaque index, and then it can just have privileged access to some kind of unsafe_byteIndex accessor.

There are a variety of reasons why unsafe byte indexing is needed anyway (zero-copy?), it just shouldn’t be the default tool that application programmers reach for.

MrBuddyCasino today at 7:14 AM

If you have multi-MB strings in an editor, that’s the problem right there. People use ropes instead of strings for a reason.
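For readers who have not met ropes, here is a deliberately tiny sketch of the idea: the text lives in small leaf chunks joined by concatenation nodes, so an insert rebuilds only the path to the edit point instead of copying the whole buffer. All names are hypothetical, there is no rebalancing, and real editor ropes are considerably more involved.

```python
from dataclasses import dataclass
from typing import Tuple, Union


@dataclass(frozen=True)
class Leaf:
    text: str


@dataclass(frozen=True)
class Concat:
    left: "Rope"
    right: "Rope"
    length: int  # cached total length of both subtrees


Rope = Union[Leaf, Concat]


def length(r: Rope) -> int:
    return len(r.text) if isinstance(r, Leaf) else r.length


def concat(a: Rope, b: Rope) -> Rope:
    return Concat(a, b, length(a) + length(b))


def split(r: Rope, i: int) -> Tuple[Rope, Rope]:
    """Split at index i; only the single leaf containing i is copied."""
    if isinstance(r, Leaf):
        return Leaf(r.text[:i]), Leaf(r.text[i:])
    n = length(r.left)
    if i < n:
        l, m = split(r.left, i)
        return l, concat(m, r.right)
    if i > n:
        m, rr = split(r.right, i - n)
        return concat(r.left, m), rr
    return r.left, r.right


def insert(r: Rope, i: int, s: str) -> Rope:
    l, rr = split(r, i)
    return concat(concat(l, Leaf(s)), rr)


def to_str(r: Rope) -> str:
    return r.text if isinstance(r, Leaf) else to_str(r.left) + to_str(r.right)


doc = concat(Leaf("hello "), Leaf("world"))
doc2 = insert(doc, 6, "big ")
```

Because the nodes are immutable, `doc` is untouched after the insert and shares its leaves with `doc2`, which is also what makes undo cheap in this style of structure.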