Things that have bugged me for 40 years...
* NUL terminated strings (and now, non UTF-8 encoded strings on input/output)
* Using LF or CR or CRLF as line terminators, and pipe/comma-delimited fields, when there were other unambiguous ASCII characters that could have been used (e.g., GS, FS, RS). That would have made encoding/decoding of line termination purely an I/O concern and kept HT/VT/CR/LF/FF as literal print-related codes.
Now with Unicode we actually have even more:
NEL Next line, U+0085 (a C1 control code; EBCDIC's NL maps to it)
LS Line separator, U+2028 (invented by Unicode)
PS Paragraph separator, U+2029 (same)
The Unicode standard says that in addition to CR, LF, CRLF and the above, vertical tabs and form feeds should also be treated as line separators.
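For anyone who wants to see this in practice, here is a small sketch using Python's `str.splitlines()`, which happens to treat all of the characters mentioned here as line boundaries. The `SEPARATORS` table and the `split_count` helper are just illustrative names, not anything from a library.

```python
# Characters named in this thread that str.splitlines() treats as line boundaries.
SEPARATORS = {
    "LF": "\n",
    "CR": "\r",
    "CRLF": "\r\n",
    "VT": "\x0b",
    "FF": "\x0c",
    "NEL": "\x85",
    "LS": "\u2028",
    "PS": "\u2029",
}


def split_count(sep: str) -> int:
    """Return how many lines splitlines() finds in 'first<sep>second'."""
    return len(f"first{sep}second".splitlines())


results = {name: split_count(sep) for name, sep in SEPARATORS.items()}

# splitlines() also splits on FS, GS and RS (0x1C-0x1E), but not on US (0x1F).
rs_parts = "a\x1eb".splitlines()
us_parts = "a\x1fb".splitlines()
```

One thing worth noticing: because `splitlines()` also breaks on FS, GS and RS, it is a poor fit for data that uses those characters as delimiters. Splitting explicitly on the character you mean is safer.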
> non UTF-8 encoded strings on input/output
UTF-8 on stdin/stdout works perfectly fine (unless you are on Windows of course, which is stuck in the early 90s when it comes to international text encoding).
> Using LF or CR or CRLF as line terminators
This is also an operating system convention, and it would be better if programming languages didn't try to "guess" the correct line endings, since this causes more problems than it solves. But again, this is mostly a Windows-specific problem, and it's Microsoft's job to finally bring Windows into the current century.
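As a rough illustration of "no guessing", here is a Python sketch that pins both the encoding and the line ending explicitly instead of relying on platform defaults. The `utf8_writer` helper is a made-up name for this example; `io.TextIOWrapper` and its `encoding`/`newline` parameters are standard library.

```python
import io


def utf8_writer(binary_stream):
    """Wrap a binary stream so text is written as UTF-8 with LF left untranslated."""
    # newline="\n" disables newline translation on output, so "\n" stays "\n"
    # regardless of the platform convention.
    return io.TextIOWrapper(binary_stream, encoding="utf-8", newline="\n")


buf = io.BytesIO()  # stands in for a real binary stream
out = utf8_writer(buf)
out.write("naïve café\n")
out.flush()
encoded = buf.getvalue()
```

For real output, the same wrapper could presumably be applied to `sys.stdout.buffer`, assuming stdout is an ordinary text stream with an underlying binary buffer.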
LF makes the most sense, but they're all fine for text files. The issue is that CSV isn't text.
Last time I had to handle CSV files in bash, I converted them internally to RS and FS.
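A minimal sketch of that kind of conversion, in Python rather than bash. The assumptions are mine: fields are joined with US (0x1F) and records terminated with RS (0x1E), following the ASCII hierarchy, which may not match the exact characters used in the comment above. `csv_to_separated` is a hypothetical helper name.

```python
import csv
import io

RS = "\x1e"  # ASCII record separator (assumed record terminator)
US = "\x1f"  # ASCII unit separator (assumed field delimiter)


def csv_to_separated(csv_text: str) -> str:
    """Convert CSV text into RS-terminated records of US-separated fields."""
    # newline="" keeps embedded newlines intact so the csv module can handle them.
    reader = csv.reader(io.StringIO(csv_text, newline=""))
    out = []
    for row in reader:
        for field in row:
            if RS in field or US in field:
                raise ValueError("field already contains a separator character")
        out.append(US.join(row) + RS)
    return "".join(out)


sample = 'name,quote\r\n"Smith, J.","said ""hi""\nand left"\r\n'
converted = csv_to_separated(sample)
```

After the conversion the quoting is gone: the embedded comma, the doubled quotes and the newline inside the second record are all just ordinary characters in a field.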
> non UTF-8 encoded strings on input/output
I would just use UTF-8 everywhere.
I did a project to translate data framed in the ASCII field/record separator characters and it was gloriously easy. All the ugly escaping considerations that come with comma-delimited data simply went away.
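To show why the escaping goes away, here is a small round-trip sketch in Python. It assumes US (0x1F) between fields and RS (0x1E) after each record, and that neither character appears in the data itself. `encode` and `decode` are illustrative names, not the code from that project.

```python
RS = "\x1e"  # ASCII record separator
US = "\x1f"  # ASCII unit separator


def encode(rows):
    """Join fields with US and terminate every record with RS."""
    return "".join(US.join(fields) + RS for fields in rows)


def decode(data):
    """Split on RS, drop the empty chunk after the final RS, then split on US."""
    records = data.split(RS)
    if records and records[-1] == "":
        records.pop()
    return [record.split(US) for record in records]


rows = [
    ["id", "note"],
    ["1", 'commas, "quotes" and\nnewlines are plain data'],
]
round_tripped = decode(encode(rows))
```

The whole parser is two `split` calls with no quoting state machine. The trade-off is that the data must never contain the separators, and most editors will not display them nicely.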