Hacker News

Four Column ASCII (2017)

325 points by tempodox last Sunday at 9:15 AM | 76 comments

Comments

HocusLocus today at 3:25 PM

I have lived my whole professional life with this being 'beyond obvious'... It's hard to imagine a generation where it's not. But then again, I did work with EBCDIC for a while and we were reading and translating ASCII log tapes (ITT/Alcatel 1210 switch, phone calls, memory dumps).

I once got drunk with my elderly Unix supernerd friend, and he was talking about TTYs and how his passwords contained embedded ^S and ^Q characters. He had traced the login process and learned that they were just stalling the tty, not actually being used to construct the hash. No one else at the bar got the drift. He patched his system to use 'raw' instead of 'cooked' mode for login passwords. He also used backspaces (^? and ^H) as part of his passwords. He was a real security tiger. I miss him.

dcminter today at 2:31 PM

It doesn't seem to have been mentioned in the comments so far, but as a floppy-disk era developer I remember my mind was blown by the discovery that DEL was all-bits-set because this allowed a character on paper tape and punched card to be deleted by punching any un-punched holes!
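
The overpunching trick can be sketched as a bitwise OR, assuming a 7-bit code where a punched hole is a 1 bit (the `overpunch` helper is a hypothetical name for illustration):

```python
DEL = 0x7F  # all seven bits set: every hole position punched


def overpunch(code: int) -> int:
    """Punch every remaining hole of a 7-bit character (hypothetical helper)."""
    return code | DEL


# Whatever was on the tape before, overpunching turns it into DEL.
assert all(overpunch(c) == DEL for c in range(128))
```

Holes can only be added, never removed, which is why the "erase" code had to be the one with every bit set.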

fix4fun today at 7:21 AM

What I found interesting is that all digits in ASCII start with 0x3, e.g. 0x30 is 0, 0x31 is 1, ..., 0x39 is 9. I thought it was accidental, but it was in fact intended. It made it possible to build simple counting/accounting machines with minimal circuit logic using BCD (binary-coded decimal). That was a wow moment for me ;)
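
A minimal sketch of that digit/BCD relationship: the low nibble of an ASCII digit is already its BCD value, so conversion is a mask one way and an OR the other. The function names are made up for illustration.

```python
def digit_to_bcd(ch: str) -> int:
    """Return the BCD value of one ASCII digit by keeping its low nibble."""
    if len(ch) != 1 or not "0" <= ch <= "9":
        raise ValueError(f"not an ASCII digit: {ch!r}")
    return ord(ch) & 0x0F


def bcd_to_digit(value: int) -> str:
    """Return the ASCII digit for a BCD value by setting the 0x3 high nibble."""
    if not 0 <= value <= 9:
        raise ValueError(f"not a BCD digit: {value!r}")
    return chr(0x30 | value)


assert [digit_to_bcd(c) for c in "0123456789"] == list(range(10))
assert bcd_to_digit(7) == "7"
```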

kazinator today at 7:07 AM

This is by design, so that case conversion and folding is just a bit operation.

The idea that SOH/1 is "Ctrl-A" or ESC/27 is "Ctrl-[" is not part of ASCII; that idea comes from the way terminals provided access to the control characters, by a Ctrl key that just masked out a few bits.
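
As a rough sketch of that masking, assume a terminal whose Ctrl key clears the top two bits of the 7-bit code (the `ctrl` helper is hypothetical, not any real terminal API):

```python
def ctrl(key: str) -> int:
    """Code sent for Ctrl+key, assuming Ctrl clears the top two bits of a 7-bit code."""
    return ord(key) & 0x1F


assert ctrl("A") == 0x01  # SOH
assert ctrl("[") == 0x1B  # ESC (27)
assert ctrl("D") == 0x04  # EOT
assert ctrl("G") == 0x07  # BEL
assert ctrl("M") == 0x0D  # CR, the same code the Return key sends

# Upper and lower case differ only in bit 0x20, which the mask discards.
assert ctrl("x") == ctrl("X")
```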

kazinator today at 7:14 PM

If Unicode had used a full 32 bits from the start, it could have usefully reserved a few bits as flags that would divide it into subspaces, and could be easily tested.

Imagine a Unicode like this:

8:8:16

- 8 bits of flags.
- 8 bit script family code: 0 for BMP.
- 16 bit plane for every script code and flag combination.

The flags could do useful things like indicate character display width, case, and other attributes (specific to a script code).

Unicode peaked too early and applied an economy of encoding which rings false now, in an age in which consumer devices have two-digit-gigabyte memories and multiple terabytes of storage, and high-definition video is streamed over the internet.
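
To make the imagined 8:8:16 layout concrete, here is a purely hypothetical sketch. It assumes the flags sit in the top byte, the script family code in the next byte, and the 16-bit plane index at the bottom; the `FLAG_WIDE` bit is invented for illustration and is not part of real Unicode.

```python
FLAG_WIDE = 0x01  # hypothetical flag bit: character occupies two display cells


def pack(flags: int, script: int, index: int) -> int:
    """Pack the three fields into one 32-bit code (assumed field order)."""
    if not (0 <= flags <= 0xFF and 0 <= script <= 0xFF and 0 <= index <= 0xFFFF):
        raise ValueError("field out of range")
    return (flags << 24) | (script << 16) | index


def unpack(code: int) -> tuple[int, int, int]:
    """Split a 32-bit code back into (flags, script, index)."""
    return (code >> 24) & 0xFF, (code >> 16) & 0xFF, code & 0xFFFF


def is_wide(code: int) -> bool:
    """Test an attribute with a single mask instead of a table lookup."""
    return bool((code >> 24) & FLAG_WIDE)


# Script 0 with no flags leaves the existing 16-bit BMP values unchanged.
assert pack(0, 0, 0x0041) == 0x0041
assert unpack(pack(FLAG_WIDE, 3, 0x1234)) == (FLAG_WIDE, 3, 0x1234)
assert is_wide(pack(FLAG_WIDE, 3, 0x1234))
```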

taejavu today at 7:38 AM

For whatever reason, there are extraordinarily few references that I come back to over and over, across the years and decades. This is one of them.

california-og today at 5:24 PM

I made an interactive viewer some time ago (scroll down a bit):

https://blog.glyphdrawing.club/the-origins-of-del-0x7f-and-i...

It really helps understand the logic of ASCII.

pixelbeat__ today at 8:23 AM

Some of this elegance is discussed from a programmatic point of view here:

https://www.pixelbeat.org/docs/utf8_programming.html

mbreese today at 2:14 PM

I came across this a week ago when I was looking at some LLM-generated code for a ToUpper() function. At some point I “knew” this relationship, but I didn’t really “grok” it until I read a function that converted lowercase ASCII to uppercase by using a bitwise XOR with 0x20.

It makes sense, but it didn’t really hit me until recently. Now, I’m wondering what other hidden cleverness is there that used to be common knowledge, but is now lost in the abstractions.
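
For reference, a function of that kind might look roughly like this. It is a sketch, not the code I actually read; it assumes single-character ASCII input, and the range check matters because XOR with 0x20 would otherwise mangle non-letters.

```python
def to_upper(ch: str) -> str:
    """Uppercase one ASCII character by flipping bit 0x20 on lowercase letters only."""
    if "a" <= ch <= "z":
        return chr(ord(ch) ^ 0x20)
    return ch


assert to_upper("a") == "A"
assert to_upper("z") == "Z"
assert to_upper("A") == "A"  # already uppercase: left alone
assert to_upper("{") == "{"  # 0x7B sits just past 'z' and must not change
assert "".join(to_upper(c) for c in "ascii-4col") == "ASCII-4COL"
```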

jez today at 5:43 PM

I have a command called `ascii-4col.txt` in my personal `bin/` folder that prints this out:

https://github.com/jez/bin/blob/master/ascii-4col.txt

It's neat because it's the only command I have that uses `tail` for the shebang line.

dveeden2 today at 7:10 AM

Also easy to see why Ctrl-D works for exiting sessions.

rbanffy last Sunday at 9:47 AM

This is also why the Teletype layout has parentheses on 8 and 9, unlike modern keyboards that have them on 9 and 0 (a layout popularised by the IBM Selectric). The original Apple IIs had this same layout, with a “bell” on top of the G.
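
The pairing can be sketched like this, assuming a bit-paired keyboard where Shift on a digit key flips bit 0x10 and Ctrl clears the top two bits (helper names are hypothetical):

```python
def shifted_digit(digit: str) -> str:
    """Shifted symbol for a digit key 1-9 on a bit-paired keyboard: flip bit 0x10."""
    if len(digit) != 1 or not "1" <= digit <= "9":
        raise ValueError(f"expected a digit key 1-9, got {digit!r}")
    return chr(ord(digit) ^ 0x10)


assert shifted_digit("8") == "("
assert shifted_digit("9") == ")"
assert "".join(shifted_digit(d) for d in "123456789") == "!\"#$%&'()"

# The bell printed above the G is the control code in G's row: Ctrl-G is BEL.
assert ord("G") & 0x1F == 0x07
```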

gpvos today at 3:45 PM

Back in early times, I used to type ctrl-M in some situations because it could be easier to reach than the return key, depending on what I was typing.

seyz today at 1:07 PM

This is why Ctrl+C is 0x03 and Ctrl+G is the bell. The columns aren't arbitrary. They're the control codes with bit 6 flipped. Once you see it, you can't unsee it. Best ASCII explainer I've read.
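
A small sketch that rebuilds the four-column view: each row shares the low five bits and each column is one value of the top two bits. The layout and helper names are my own, not taken from the article.

```python
CONTROL_NAMES = (
    "NUL SOH STX ETX EOT ENQ ACK BEL BS HT LF VT FF CR SO SI "
    "DLE DC1 DC2 DC3 DC4 NAK SYN ETB CAN EM SUB ESC FS GS RS US"
).split()


def label(code: int) -> str:
    """Printable label for a 7-bit ASCII code."""
    if code < 0x20:
        return CONTROL_NAMES[code]
    if code == 0x20:
        return "SP"
    if code == 0x7F:
        return "DEL"
    return chr(code)


def four_column_rows() -> list[list[str]]:
    """One row per low-5-bit value; columns are the top two bits 00, 01, 10, 11."""
    rows = []
    for low in range(32):
        cells = [label((high << 5) | low) for high in range(4)]
        rows.append([f"{low:05b}"] + cells)
    return rows


if __name__ == "__main__":
    print("  ".join(f"{h:<5}" for h in ["low", "00", "01", "10", "11"]))
    for row in four_column_rows():
        print("  ".join(f"{cell:<5}" for cell in row))

# 'C' (0x43) and ETX (0x03) share a row: they differ only in bit 0x40.
assert ord("C") ^ 0x40 == 0x03
```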

dang today at 7:04 AM

Related. Others?

Four Column ASCII (2017) - https://news.ycombinator.com/item?id=21073463 - Sept 2019 (40 comments)

Four Column ASCII - https://news.ycombinator.com/item?id=13539552 - Feb 2017 (68 comments)

unnah today at 8:06 AM

If Ctrl sets bit 6 to 0, and Shift sets bit 5 to 1, the logical extension is to use Ctrl and Shift together to set the top bits to 01. Surely there must be a system somewhere that maps Ctrl-Shift-A to !, Ctrl-Shift-B to " etc.

ezekiel68 today at 12:53 PM

I love this stuff. It's the kind of lore that keeps getting forgotten and re-discovered by swathes of curious computer scientists over the years. So easy to assume many of the old artifacts (such as the ASCII table) had no rhyme or reason to them.

renox today at 8:38 AM

I still find it weird that they didn't put A, B, ... right after the digits; that would make binary-to-hexadecimal conversion more efficient.
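
To show the cost in question, here is a minimal sketch of nibble-to-hex conversion. Because seven codes (`:` through `@`) sit between '9' and 'A', the conversion needs a branch or an adjustment instead of a single add. The function name is hypothetical.

```python
def nibble_to_hex(n: int) -> str:
    """Convert a value 0-15 to its uppercase ASCII hex digit."""
    if not 0 <= n <= 15:
        raise ValueError(f"not a nibble: {n!r}")
    if n < 10:
        return chr(0x30 + n)  # '0'..'9'
    return chr(0x30 + n + 7)  # skip the 7 codes between '9' (0x39) and 'A' (0x41)


assert "".join(nibble_to_hex(n) for n in range(16)) == "0123456789ABCDEF"

# With actual ASCII, a plain add lands on punctuation rather than 'A'.
assert chr(0x30 + 10) == ":"
```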

mac3n today at 3:38 PM

credit to William Crosby, "Note on an ASCII-Octal Code Table", CACM 8.10, Oct 1965

https://dl.acm.org/doi/epdf/10.1145/365628.365652

also defined 6-bit ASCII subset

mac3n today at 3:45 PM

anyone remember 005 ENQ (also called WRU, "who are you") and its effect on a teletype?

meken today at 2:37 PM

Very cool.

Though the 01 column is a bit unsatisfying because it doesn’t seem to have any connection to its siblings.

y42 today at 1:49 PM

At first I was like "What, but why? You don't save any space, so what's that exercise about?" Then I read it again and it blew my mind. I thought I knew everything about ASCII. What a fool I am. Sokrates was right. Always.

msarnoff today at 8:48 AM

On early bit-paired keyboards with parallel 7-bit outputs, possibly going back to mechanical teletypes, I think holding Control literally tied the upper two bits to zero. (citation needed)

Also explains why there is no difference between Ctrl-x and Ctrl-Shift-x.

joshcorbin today at 6:32 PM

Just wait until someone finally gets why CSI (aka the “other escape” from the 8-bit ANSI realm, now eternalized in the Unicode C1 block) is written ESC [ in 7-bit systems, such as the equally eternal UTF-8 encoding.
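
A hedged sketch of that relationship: each 8-bit C1 control (0x80 to 0x9F) has a 7-bit form consisting of ESC followed by the byte 0x40 lower, so CSI (0x9B) becomes ESC plus 0x5B, which is `[`. The helper name is made up for illustration.

```python
ESC = 0x1B
CSI = 0x9B  # 8-bit Control Sequence Introducer


def c1_to_7bit(c1: int) -> bytes:
    """Return the two-byte ESC form of an 8-bit C1 control code."""
    if not 0x80 <= c1 <= 0x9F:
        raise ValueError(f"not a C1 control: {c1:#x}")
    return bytes([ESC, c1 - 0x40])


assert c1_to_7bit(CSI) == b"\x1b["

# In UTF-8 the code point U+009B is not a single 0x9B byte, so the ESC [ form is the practical one.
assert "\x9b".encode("utf-8") == b"\xc2\x9b"
```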

SUDEEPSD25 today at 4:46 PM

Love this!

timonoko today at 7:15 AM

where does this character set come from? It looks different on xterm.

for x in range(0x0,0x20): print(chr(x),end=" ")

                    

Aardwolf today at 1:40 PM

IMHO ASCII wasted over 20 of its precious 128 values on control characters nobody ever needs (except perhaps in the first few years of its lifetime) and could easily have had a degree symbol, pilcrow sign, paragraph symbol, forward tick and other useful symbols instead :)
