Hacker News

A Better R Programming Experience Thanks to Tree-sitter

108 points by sebg yesterday at 9:14 PM | 10 comments

Comments

tylermw today at 2:29 AM

I read this article a week or so ago and immediately implemented a VS Code extension that I've always wanted: a static analysis tool for targets pipelines. targets is an R package which provides Make-like pipelines for data science and analysis work. You write your pipeline as a DAG and targets orchestrates the analysis and only re-runs downstream nodes if upstream ones are invalidated and the output changes. Fantastic tool, but at a certain level of complexity the DAG becomes a bit hard to navigate and reason about ("wait, what targets are downstream of this one again?"). This isn't really a targets problem, as this will happen with any analysis of decent complexity, but the structure targets adds to the analysis actually allows for a decent amount of static analysis of the environment/code. Enter tree-sitter.
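The "what targets are downstream of this one?" question can be sketched as a graph traversal. This is a minimal illustration in Python, not how targets or tarborist work internally; the `deps` map and its target names are hypothetical, and it assumes the dependency edges have already been extracted from the pipeline definition.

```python
from collections import deque

# Hypothetical pipeline: each target maps to the targets it depends on.
deps = {
    "raw_data": [],
    "clean_data": ["raw_data"],
    "model": ["clean_data"],
    "plot": ["clean_data"],
    "report": ["model", "plot"],
}


def downstream(deps, target):
    """Return every target that directly or transitively depends on `target`."""
    # Invert the dependency map: parent -> list of direct children.
    children = {}
    for name, parents in deps.items():
        for parent in parents:
            children.setdefault(parent, []).append(name)

    # Breadth-first walk over the inverted edges.
    seen = set()
    queue = deque([target])
    while queue:
        node = queue.popleft()
        for child in children.get(node, []):
            if child not in seen:
                seen.add(child)
                queue.append(child)
    return sorted(seen)


print(downstream(deps, "clean_data"))
```

Invalidating `clean_data` in this toy graph would touch `model`, `plot`, and `report`, which is the kind of answer a hover or jump-to-children feature needs.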

I wrote a VS Code extension that analyzes the pipeline and provides useful hover information (like size, time last invalidated, computation time for that target, and children/parent info) as well as links to quickly jump to different targets and their children/parents. I've dogfooded the hell out of it and it's already vastly improved my targets workflow within a week. Things like providing better error hints in the IDE for targets-specific malformed inputs and showing which targets are emitting errors really take lots of the friction out of an analysis.

All that to say: nice work on extending tree-sitter to R!

tarborist: targets + tree-sitter https://open-vsx.org/extension/tylermorganwall/tarborist

GH: https://github.com/tylermorganwall/tarborist

nomilk today at 12:14 AM

The article makes out like auto completion and help on hover are new things, but RStudio IDE has had them for years and years.

R/RStudio was my first language/IDE. I was horribly shocked when moving into other languages to discover they didn't have things you got out of the box with R/RStudio. "You mean I have to look up documentation for a function/method!?! - that's supposed to be automatic!".

R has a bunch of features that are so convenient it's a rude shock to learn that other ecosystems lack them. One is the REPL with extremely convenient RStudio keyboard shortcuts to run lines of code (to achieve something similar with Ruby, I have an elaborate neovim/slime setup that took hours to configure and still isn't as good as what RStudio gives out of the box).

A sign of a brilliant tool is when an idiot can get more done with it than an expert can with alternatives.

epistasis yesterday at 10:52 PM

Tree-sitter is one of the finer engineering products out there; it enables so much. Thanks to its creator and everyone who has contributed to this project and its many grammars!

fn-mote today at 12:06 AM

Do the tools built on this understand dplyr pipelines and columns in the data frames appearing as bare variables in the code? If so, I’m really impressed. R does some unusual stuff.

TacticalCoder today at 12:09 AM

I moved to tree-sitter inside Emacs a while ago and I'd say tree-sitter is much easier than it looks.

I had a first little use case... For whatever reason, none of the options for aligning let bindings in Clojure code did what I wanted: whether I tried the "semantic" style or Tonsky's semi-standard way of formatting Clojure code (several tools adopted Tonsky's suggestion), and no matter which option/knob I turned on, I couldn't get the alignment I wanted.

I really, really, really hate the pure horrible chaos of this:

    (let [abc (+ a 2)
          d (inc b)
          vwxyz (+ abc d)]
      ...

But I love the perfection of this [1]:

    (let [abc     (+ a 2)
          d       (inc b)
          vwxyz   (+ abc d)]
      ...

And cljfmt is pretty agnostic about it: I can both use cljfmt from Emacs and have a hook forcing cljfmt, and it'll align everything but it won't mess with those nice vertical alignments.

Now, I know, I know: it is supposed to work directly from cljfmt but many options are, still in the latest version, labelled as experimental and I simply couldn't make it work on my setup, no matter which knob I turned on.

So what did I do? Claude Code CLI, tree-sitter, and three elisp functions.

And I added my own vertical alignment to Clojure let bindings. And it's compatible with cljfmt (as in: if I run cljfmt it doesn't remove my vertical alignments).
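The alignment step itself is small once the binding pairs are known. Here is a rough sketch in Python rather than elisp, and without tree-sitter: it assumes the (name, expression) pairs have already been pulled out of the syntax tree, and the function name `align_let_bindings` and its `gap` parameter are made up for illustration.

```python
def align_let_bindings(pairs, gap=3):
    """Render (name, expr) pairs as a `let` vector with expressions in one column."""
    opener = "(let ["
    # Column width: longest binding name plus a fixed gap before the expression.
    width = max(len(name) for name, _ in pairs) + gap
    lines = []
    for i, (name, expr) in enumerate(pairs):
        # First binding sits on the `(let [` line; the rest are indented under it.
        prefix = opener if i == 0 else " " * len(opener)
        lines.append(prefix + name.ljust(width) + expr)
    lines[-1] += "]"  # close the binding vector on the last binding
    return "\n".join(lines)


bindings = [("abc", "(+ a 2)"), ("d", "(inc b)"), ("vwxyz", "(+ abc d)")]
print(align_let_bindings(bindings))
```

The hard part in practice is finding the pairs reliably (nested forms, destructuring, comments), which is where the syntax tree earns its keep; the padding is the easy bit.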

I'd say the tree-sitter syntax tree is incredibly verbose (and has to be) but it's not that hard to use tree-sitter.

P.S.: and I'm not alone in liking this kind of alignment and, no, we're not receptive to the "but then you modify one line and several lines are detected as modified" argument. And we're less receptive by the day now that we're beginning to have diffing tools that are indentation-agnostic and only do AST diffs.
