Hacker News

Syzygies yesterday at 9:06 PM

I recently revisited a language comparison project: a benchmark that tallies, in parallel, the cycle decompositions of the 3,715,891,200 signed permutations on 10 letters. I kept a dozen languages as finalists, with different philosophies but all choices I could imagine making for my research programming. Rather than "ur" languages, I was looking for the best modern realizations of various paradigms. And while I measured performance, I also considered ease of AI help and my willingness to review and think in the code. I worked hard to optimize each language, a form of tourism made possible by AI.

The results surprised me:

             F#  100    19.17s  ±0.04s
            C++   96    19.92s  ±0.13s
           Rust   95    20.20s  ±0.38s
         Kotlin   89    21.51s  ±0.04s
          Scala   88    21.68s  ±0.04s
  Kotlin-native   81    23.69s  ±0.11s
   Scala-native   77    24.72s  ±0.03s
            Nim   69    27.92s  ±0.04s
          Julia   63    30.54s  ±0.08s
          Swift   52    36.86s  ±0.03s
          OCaml   47    41.10s  ±0.10s
        Haskell   40    47.94s  ±0.06s
           Chez   39    49.46s  ±0.04s
           Lean   10   198.63s  ±1.02s
https://github.com/Syzygies/Compare
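
For readers who want a concrete picture of the task described above, here is a minimal single-threaded sketch in Python. It is not taken from the linked repository. It assumes that "cycle decomposition" means the usual signed cycle type (each cycle of the underlying permutation, tagged with the product of the signs along it), and the names `signed_cycle_type` and `tally_cycle_types` are hypothetical. It is only practical for small n.

```python
from collections import Counter
from itertools import permutations, product
from math import factorial


def signed_cycle_type(perm, signs):
    """Return the sorted tuple of (cycle length, cycle sign) pairs.

    perm[i] is the image of letter i and signs[i] is +1 or -1.
    Assumption: a cycle's sign is the product of the signs along it.
    """
    n = len(perm)
    seen = [False] * n
    cycles = []
    for start in range(n):
        if seen[start]:
            continue
        length, sign, i = 0, 1, start
        while not seen[i]:
            seen[i] = True
            length += 1
            sign *= signs[i]
            i = perm[i]
        cycles.append((length, sign))
    return tuple(sorted(cycles))


def tally_cycle_types(n):
    """Count how many signed permutations on n letters have each cycle type."""
    counts = Counter()
    for perm in permutations(range(n)):
        for signs in product((1, -1), repeat=n):
            counts[signed_cycle_type(perm, signs)] += 1
    return counts


def signed_permutation_count(n):
    """There are n! * 2**n signed permutations on n letters."""
    return factorial(n) * 2 ** n
```

Calling `tally_cycle_types(3)` walks all 48 signed permutations on 3 letters, and `signed_permutation_count(10)` reproduces the 3,715,891,200 figure quoted above.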

Replies

LeCompteSftware yesterday at 11:45 PM

Naively this is quite surprising, but the devil is in the details. With the exception of Lean, I'd point out they're all fairly close: Chez being 2.5x slower than C++ is not ignorable, but it's also quite good for a dynamically typed language[1]. And I'm not surprised that F# does so well at this particular task. Without looking into it more closely, this seems to be a story about F# on .NET Core having the most mature and painless out-of-the-box parallelism of these languages. I assume this is elapsed time; it would be interesting to see a breakdown of CPU time.

I don't think these results are quite comparable because of slightly differing parallelism strategies; I'd expect the F# implementation of just spinning off threads to be a little more performant than a Rayon parallel iterator, which presumably has some overhead. But that really just shows how hard it is to do a cross-language comparison; Rust and C++ can certainly be made faster than the F# code by carefully manipulating a ton of low-level OS concurrency primitives. This would arguably also be a little misleading. Likewise Chez and Haskell have good C FFI; does that count? It's a tricky and highly qualitative analysis.
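
To make the "just spin off threads" strategy concrete, here is a rough Python sketch of a static partition: the work is split by the image of the first letter, each chunk is tallied independently, and the partial tallies are merged at the end. This is an assumption about the general shape, not the code from any of the benchmarked implementations, and `cycle_type`, `tally_chunk`, and `tally_parallel` are hypothetical names. A thread pool is used only to keep the sketch self-contained; in CPython the GIL means it shows the structure, not a speedup.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor
from itertools import permutations, product


def cycle_type(perm, signs):
    # Sorted (length, sign) pairs, one per cycle of the signed permutation.
    seen, cycles = set(), []
    for start in range(len(perm)):
        if start in seen:
            continue
        length, sign, i = 0, 1, start
        while i not in seen:
            seen.add(i)
            length += 1
            sign *= signs[i]
            i = perm[i]
        cycles.append((length, sign))
    return tuple(sorted(cycles))


def tally_chunk(n, first):
    # One unit of work: every signed permutation that sends letter 0 to `first`.
    rest = [x for x in range(n) if x != first]
    counts = Counter()
    for tail in permutations(rest):
        perm = (first,) + tail
        for signs in product((1, -1), repeat=n):
            counts[cycle_type(perm, signs)] += 1
    return counts


def tally_parallel(n, workers=4):
    # Static partition into n chunks; partial tallies are merged at the end.
    total = Counter()
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for partial in pool.map(lambda first: tally_chunk(n, first), range(n)):
            total += partial
    return total
```

A work-stealing parallel iterator would instead split the range dynamically and rebalance between workers, which costs some bookkeeping but copes better with uneven chunks. Here every chunk has the same size, so the static split loses nothing.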

[1] FYI, one possible performance improvement with the Chez code is keeping the permutations in fxvectors and replacing math operations with their fixnum-specific equivalents. This tells the compiler/interpreter that the data are guaranteed to be machine integers rather than bigints, so they aren't boxed/unboxed. I am not sure without running it myself, but there seem to be avoidable allocations in the Chez implementation. https://cisco.github.io/ChezScheme/csug/objects.html#./objec...