Making your own programming language is easier than you think (but also harder)

103 points • by ibobev • last Thursday at 1:01 PM • 56 comments

Comments

Anyone trying to do this... the first thing you do is avoid lex/yacc/bison/antlr. You do not need all this ceremony. A recursive descent parser that uses Pratt parsing will work for the vast majority of cases.
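To make the Pratt parsing suggestion concrete, here is a minimal sketch in Python. It is not taken from the article or from any particular compiler: the binding-power table, the tuple-based AST, and the names `tokenize`, `Parser`, and `parse` are all assumptions chosen for illustration. It handles integers, parentheses, unary minus, and five binary operators.

```python
import re

# Hypothetical binding powers for this sketch: higher binds tighter.
BINDING_POWER = {"+": 10, "-": 10, "*": 20, "/": 20, "^": 30}
RIGHT_ASSOC = {"^"}
PREFIX_MINUS_BP = 25  # unary minus binds tighter than * but looser than ^

TOKEN_RE = re.compile(r"\s*(\d+|[-+*/^()])")


def tokenize(src):
    """Split the source into integer and operator tokens."""
    src = src.rstrip()
    tokens, pos = [], 0
    while pos < len(src):
        match = TOKEN_RE.match(src, pos)
        if not match:
            raise SyntaxError(f"unexpected character at offset {pos}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens


class Parser:
    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def advance(self):
        tok = self.peek()
        self.pos += 1
        return tok

    def parse_expr(self, min_bp=0):
        # Prefix position: a literal, a parenthesised group, or unary minus.
        tok = self.advance()
        if tok is None:
            raise SyntaxError("unexpected end of input")
        if tok == "(":
            left = self.parse_expr(0)
            if self.advance() != ")":
                raise SyntaxError("expected ')'")
        elif tok == "-":
            left = ("neg", self.parse_expr(PREFIX_MINUS_BP))
        elif tok.isdigit():
            left = int(tok)
        else:
            raise SyntaxError(f"unexpected token {tok!r}")

        # Infix loop: keep consuming operators that bind at least as
        # tightly as the caller requires.
        while True:
            op = self.peek()
            bp = BINDING_POWER.get(op)
            if bp is None or bp < min_bp:
                break
            self.advance()
            # Left-associative operators demand a strictly tighter right side.
            next_bp = bp if op in RIGHT_ASSOC else bp + 1
            left = (op, left, self.parse_expr(next_bp))
        return left


def parse(src):
    """Parse a whole expression and reject trailing tokens."""
    parser = Parser(tokenize(src))
    tree = parser.parse_expr()
    if parser.peek() is not None:
        raise SyntaxError(f"unexpected token {parser.peek()!r}")
    return tree
```

With these assumptions, `parse("1 + 2 * 3")` returns `("+", 1, ("*", 2, 3))` and `parse("2 ^ 3 ^ 2")` returns `("^", 2, ("^", 3, 2))`. Adding an operator is one more entry in the table, which is most of the appeal over a grammar generator.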

The lexer/parser is never the bottleneck. In fact, you can write those two by hand over a single weekend for a largish language. With LLMs, it takes 15 minutes if you have an unambiguous spec.

The biggest time sink, and the reason you will fail for sure, is the inability to restrict the scope of the project. You start with a limited feature set and produce the entire compiler/vm toolchain. Then you get greedy and fiddle with the type system, adding features that you have never used and probably never will. And now you have to change every single phase from start to end.

I mostly give up at this stage.

Mikhail_Edoshin • today at 7:03 AM

To me the most interesting part of a notation is the underlying thing that actually runs the code. The virtual machine, if you will. There are many ways to do that but I don't know a good systematic overview. E.g. what is Forth, if we ignore the notation? What is Lisp? What is Pascal, and how is it different from C?
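For the Forth part of that question, one common answer is "a stack machine": words pop their arguments from a shared data stack and push their results back. The sketch below is only an illustration of that idea in Python, not a description of any real Forth implementation. The function names `run` and `run_source` are hypothetical, and the instruction set is limited to integers, `dup`, `swap`, and three arithmetic words.

```python
def run(program):
    """Execute a list of instructions on a data stack and return the stack."""
    stack = []
    for instr in program:
        if isinstance(instr, int):
            stack.append(instr)            # literals push themselves
        elif instr == "dup":
            stack.append(stack[-1])        # duplicate the top of the stack
        elif instr == "swap":
            stack[-1], stack[-2] = stack[-2], stack[-1]
        elif instr in ("+", "-", "*"):
            b = stack.pop()                # right operand is on top
            a = stack.pop()
            if instr == "+":
                stack.append(a + b)
            elif instr == "-":
                stack.append(a - b)
            else:
                stack.append(a * b)
        else:
            raise ValueError(f"unknown word {instr!r}")
    return stack


def run_source(text):
    """Split whitespace-separated words, turning digit runs into integers."""
    words = [int(w) if w.isdigit() else w for w in text.split()]
    return run(words)
```

Under these assumptions, `run_source("3 dup * 4 +")` leaves `[13]` on the stack: there is no parser to speak of, because the notation is already the execution order.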

tikotus • today at 8:59 AM

I've also made my own language for making games. It's a Scheme with some tricks to make some gamedev-specific aspects much nicer. Making it work was indeed not that hard, but making it good has taken its toll. Really happy with it currently!

crowdhailer • today at 9:19 AM

I think more people having a crack at a language is a good thing. It demystifies a lot. For a long while I wanted the install guide for EYG (my language) to be a tutorial to write an interpreter in the language of your choice. I thought following the guide should take about a weekend and cover every feature in the language. For production you might want someone else's implementation, but for getting started what a great intro.

sheepscreek • today at 1:53 AM

Easier than you think to get started, but harder than you think to turn into something truly usable that isn’t a toy of an experiment.

nithinbekal • today at 12:45 AM

I've been having a lot of fun building my own programming language [1]. Getting to the point where you can write programs in your own language was surprisingly easy.

The language, Sapphire, is Ruby inspired, so the most interesting part is digging into the internals of the latter when I'm trying to figure out how something should work.

[1] https://github.com/sapphire-project/sapphire

gobdovan • today at 12:27 AM

I had a similar surprise about how approachable PL is, but by going from 'the bottom up' instead of starting from a normal language.

I wrote a compiler toolchain and debugger that takes a Turing machine description plus input string and emits an encoded tape runnable by a Universal Turing Machine [0]. I had some prior PL experience, but never did an end-to-end compiler pipeline, at least not this low level.

It started as a joke/experiment, but I couldn't believe how fast it pulled me into designing:

- a small low-level ASM for building the UTM

- an ABI for symbol widths and encoding grammar

- an interpreter used as the behavioral oracle

- raw TM transitions for each ASM instruction, generated by having an LLM iterate on candidate emissions and checked against the interpreter oracle

- a CFG-style IR to fix the LLM mess once direct ASM -> TM emission became too hard to keep sane (LLM did a decent job actually, I don't think I would have done a much better job without the IR either)

- a gdb-style debugger for raw transitions, ASM routines, and blocks

- a trace visualizer

- a bootstrapping experiment where an L1 UTM/input pair was itself run through an L2 UTM

- optimisation experiments

And every step came quite naturally and was easy to tie in with everything else. Each one was just the next local repair needed to make the previous layer tractable.
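The "interpreter used as the behavioral oracle" step in the list above can be sketched as follows. This is not code from the linked repository: the transition-table format, the names `run_tm` and `agrees_with_oracle`, the blank symbol `_`, and the step limit are all assumptions made for illustration. The idea is that a small, obviously correct interpreter defines the expected behavior, and any generated transition table is accepted only if it produces the same results on a set of test tapes.

```python
from collections import defaultdict


def run_tm(transitions, tape, start, accept, blank="_", max_steps=10_000):
    """Run a single-tape Turing machine.

    transitions maps (state, symbol) -> (new_state, written_symbol, move),
    where move is "L" or "R". Returns (accepted, final_tape_contents).
    """
    cells = defaultdict(lambda: blank, enumerate(tape))
    state, head = start, 0
    for _ in range(max_steps):
        if state == accept:
            break
        key = (state, cells[head])
        if key not in transitions:
            break  # no rule applies: halt without accepting
        new_state, written, move = transitions[key]
        cells[head] = written
        head += 1 if move == "R" else -1
        state = new_state
    used = sorted(cells)
    if used:
        out = "".join(cells[i] for i in range(used[0], used[-1] + 1))
    else:
        out = ""
    return state == accept, out.strip(blank)


def agrees_with_oracle(candidate, reference, tapes, start, accept):
    """Check that a candidate table behaves like the reference on every tape."""
    return all(
        run_tm(candidate, t, start, accept) == run_tm(reference, t, start, accept)
        for t in tapes
    )


# Example machine for this sketch: flip every bit, then accept at the first blank.
FLIP = {
    ("flip", "0"): ("flip", "1", "R"),
    ("flip", "1"): ("flip", "0", "R"),
    ("flip", "_"): ("done", "_", "R"),
}
```

Here `run_tm(FLIP, "1011", "flip", "done")` returns `(True, "0100")`. A candidate table that is missing a rule, or writes the wrong symbol, fails `agrees_with_oracle` on any tape that exercises the difference, which is what makes the generate-and-check loop workable.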

[0] Repo: https://github.com/ouatu-ro/mtm

coldcode • today at 1:05 AM

I wrote my own interpreted language 25+ years ago to write online surveys. It made it easy to create complex surveys with many branches. I think I wrote it in Objective-C.

The team implementing the survey system wound up using the same language to implement the runtime portion, something I never expected or designed in.

I no longer recall anything about what it looked like. I do remember it was a lot of fun to write.

chrisaycock • today at 1:35 AM

Yes, it's true that someone can put together a simple language like in a university course. The difficulties, as mentioned at the bottom of the post, are things like metaprogramming features or optimizing compilers.

The tail ends of a language implementation (parsing and code generation) are a fixed cost; the "middle end" can grow unbounded as more production-quality items are added.

My language: https://www.empirical-soft.com

atan2 • today at 1:40 AM

This URL was posted two days ago: https://news.ycombinator.com/item?id=48040422

amelius • today at 12:45 AM

Making a programming language is easy if you just copy ideas already existing in other languages.

Coming up with new ideas is hard. Especially since you have to test them in the real world.

Panzerschrek • today at 5:51 AM

Just making a better C with no real compiler (only JIT) is easy, I agree. It's much harder to make something innovative and mature. It requires years of development.

virexene • yesterday at 10:55 PM

this project is pretty interesting, although i'm wondering how they're planning to address the "easy sandboxing" design goal in a compiled language with raw pointer arithmetic and clib interop... in that regard i think lua would have been a lot easier to sandbox, despite the author's concerns.

(also, they might want to look into lua userdata, since that would address their concern about the overhead of converting between native and lua data structures. the language is designed to be embedded in C programs after all)

wg0 • today at 2:35 AM

Strange to read that C++ can be someone's favorite programming language.

The only thing going for C++ is that it has acceptable (not straightforward) C interop.

I don't like C# and X++ because the language surface is huge, but if you use a limited subset then, needless to say, they are very useful and handy languages too.

ecto • today at 12:32 AM

There are many like it, but this one is mine https://loonlang.com

Decabytes • today at 2:23 AM

Like most things in programming, handling the easy stuff is easy, but it’s all the edge cases that kill you. I’m writing an IDE in Flutter right now, and all of the defensive programming I have to do to handle the unhappy path is where 50% of my code goes.

Tomokisan • today at 1:24 AM

I watched a lot of YouTube videos explaining in detail how to do it, but I admit I never tried it myself.

I'm kind of curious and want to try it for fun as soon as I get some free time ^^

hyper_frog • today at 5:41 AM

Any reason for not using Odin? It seems great for gamedev.

BSTRhino • today at 7:36 AM

Great write up!

Razengan • today at 2:46 AM

For years I've been fantasizing about a language designed specifically for gameplay development that doesn't try to be like C.

Maybe AI is good enough now to help me with that..

The last time I tried, Claude couldn't even help me build a syntax highlighter for a hypothetical language.

smitty1e • yesterday at 10:12 PM

If I were to make my own programming language, it would look an awful lot like Python.

Roughly 100%.

unnouinceput • yesterday at 10:59 PM

Making your own language is easy. Creating the library that will actually solve problems without forcing the developers to reinvent the wheel is the crux. There is a reason why C++ / Java / JavaScript etc. are established: it's the already proven libraries around those languages that allow them to be so successful.

Imustaskforhelp • yesterday at 10:14 PM

I have only read the first part of the article, but I can't help thinking that a project like libriscv[0] would've/could've worked for their game project too. Fun fact: the creator of libriscv, the legendary fwsgonzo, is also making a game. I highly recommend that people check out their Discord server.

But my main point is that libriscv is one of the fastest RISC-V emulators, so something like C/C++/Lua could've been used for the game and sandboxed that way.

Am I missing something? Although making a programming language is its own kind of project, and that's really cool as well :-D

But I would also love to hear the author's opinion on libriscv, as it feels like it ticks off all the boxes, from my understanding.

[0]: https://github.com/libriscv/libriscv

jdw64 • today at 2:21 AM

[dead]
