The Limits of JIT · Smoking JavaScript with Dart 🔥

A dive into benchmarking, language performance, and the trade-offs of JIT-compiled execution.
JavaScript · Dart · JIT
2024-02-22 ~14 minutes

As a systems programming centenarian, I’ve never openly admitted my weak spot for Dart. Languages are so much more than the sum of their syntax, and Dart manages to scratch an itch like no other, even after having used many mainstream and artisanal offerings.

In the meantime, the entire techno Boheme has seemingly moved to running JavaScript (JS) on the server, giving rise to the elusive soydev. If Dart, following popular opinion, is so similar yet offers a meaningfully saner environment, why isn’t there more momentum? Clearly there’s a multitude of reasons, even when glossing over Dart’s rocky history. I can already hear you yell: full-stack meta-frameworks, the ecosystem of database drivers, middleware, … While a great reason on the day-to-day, it’s also more of an effect than a cause. With more interest there would be a larger ecosystem. With seamless interop and transpilation, there isn’t too much of a difference to TypeScript either, when targeting the Web. We could easily have yet another JS-compatible meta-framework in Dart. After all, we have them in C# or Rust, forgoing JS entirely by going straight to WebAssembly.

Anyway, you’re probably here to see JS and Dart fight it out over performance, which is ultimately just a ruse to ramble about JIT compilation trade-offs and benchmarking more generally 😳.

JavaScript vs Dart Performance

One thing that always struck me about the JavaScript crowd is how much they like to argue. If it’s not the latest frontend framework, it’s who has the faster runtime: Node, Deno, or, more recently, Bun. There’s clearly a need for speed. In order to establish dominance, battles are often fought over HelloWorld HTTP servers. With only toy amounts of JS and the HTTP servers implemented in C++, Rust, and Zig, these setups completely ignore actual language performance. In practice, any amount of JS would quickly eclipse the request handling overhead and level the playing field among runtimes :foreshadowing:.

Before the irony is lost on anyone, let’s jump straight into our own HelloWorld benchmarks:

If you came here to see JavaScript mopping the floor with Dart, here you go. Case closed. Dart sucks.

This could easily be the end of the story. After all, Node and Dart are not that far off. However, there’s one key difference: Dart simply binds a network socket and implements HTTP request handling in Dart itself, as opposed to calling out into a more efficient language to do the heavy lifting. In other words, we’re comparing virtually no JS to quite a bit of Dart. If nothing else, it’s impressive that Dart can keep up. This begs the question: how would Dart fare if we set it up similarly? Let’s try it out:

It may seem surprising at first that we’re edging out Bun w/o any optimization effort; however, we’ve really only compared Bun’s HTTP server against Axum+Tokio. Turns out Axum+Tokio is pretty fast. Even fast enough to make my lazy setup shine 😎.

Language comparisons with a HelloWorld-like laser focus are inherently suspicious and beg the question as to what they omit, deliberately or not. How would we fare in a benchmark with a wider scope? A popular next escalation is the JSON echo server, i.e. a server that receives a meaty chunk of JSON, deserializes it, reserializes it, and then sends it back. Clearly, Dart and our new super-charged runtime should win this. Well, hold my beer…

Damn, the JS runtimes collectively left us in the dust, crowning JS the true king of performance after all 👑🙇… Luckily for us, we’re simply experiencing the very asymmetry we observed earlier: all JS runtimes defer parsing to tuned implementations in a more efficient language, while Dart’s is implemented in Dart. It’s also interesting to see that Bun’s famously touted performance melted away. We now know that this has little to do with JS performance. V8’s JSON parser is simply fast enough to make up for the difference in request handling overhead. In other words, if you’re parsing a ton of JSON, you’re likely better off sticking with Node or Deno, at least for now.

Experiencing the same issue means that we can bring our one-trick pony back out and extend our makeshift runtime with a faster JSON parser. Our setup is actually quite nice in that it gives us the freedom to choose whatever parser we fancy. So let’s go full SIMD and give V8 a run for its money.

There you go - we’re able to serve double the queries, with more consistent and roughly halved latency, while paying only ¼ of Bun’s memory cost 🔥.

My crusade is over: we’ve seen that Dart is faster than JS… except we didn’t benchmark the languages at all. We only compared the performance of HTTP server and JSON parser implementations authored in entirely different languages. If there’s only one take-away, it’s that benchmarking is hard, interpretation requires a lot of context, and benchmarks can be deceiving enough to fuel entire marketing departments. JavaScript, and even the JS runtimes among each other, aren’t necessarily faster than the competition, especially when compared apples to apples 😱.

So which language is faster then? I’m not even going to attempt to answer this highly contextual question. Instead, I’ll point you to the Benchmark Game, which IMHO is the only way to compare language performance fairly. Even if I were to solve some computational problem using both languages, who’s to say that my problem is representative or my implementations are any good? Am I using the languages to their full potential? In practice, we would merely be benchmarking my skill issues.

The Benchmark Game compares the fastest solutions submitted by language experts for a set of well-defined problems. In other words, it’s a measure of how fast a language can go when you know what you’re doing. On the flip side, this can lead to some horrific solutions, especially for lower-level languages. Fortunately, we don’t have to concern ourselves with this today. Looking at the results, Dart and JS are generally neck and neck, with Dart AOT consistently needing several times less memory, matching our own results.

AOT vs JIT Compilation

Changing gears a little, we haven’t really talked about the fundamental differences between ahead-of-time (AOT) and just-in-time (JIT) compilation. At its core, AOT is very straightforward: you take the code and compile a bundle that is ready to execute on a specific platform. Most compilers target specific hardware architectures, but the waters can get murky: for example, AOT-compiling code to an intermediate representation (IR) like WebAssembly. This just means that we apply our optimization passes ahead of distribution to make life easier on the virtual machine (VM) running the IR later. Fundamentally, AOT is less load- and time-sensitive. It will always allow you to spend more effort on optimization passes w/o stalling execution or pummeling the executor with costly bookkeeping and compilation.

Talking about JITs, on the other hand, is a slippery topic. It can mean many things. For one, it could simply refer to install-time compilation post distribution. However, it most commonly refers to machine code generation while the program is executing. Orthogonally, JIT compilers may employ multiple, even layered, strategies to generate and dynamically re-generate more optimized versions as the program continues to run. In its simplest form, a JIT compiler will look at code method by method and compile each method exactly once before stitching everything together. Modern JITs do much more than that. They strike a balance between starting up quickly in an interpreted or hastily compiled mode and handing off to increasingly costly levels of optimizing compilers depending on how “hot” a piece of code runs, i.e. how frequently it gets invoked. Many runtimes do this bookkeeping on a method level; others simply trace the execution and recompile specific code paths independent of method boundaries. Modern JIT compilers are beasts. They tackle profiling, invalidation, fallbacks, re-compilation, linking, … and continue to be an active field of research. Reading through the evolution of any of the major JITs, like SpiderMonkey, JavaScriptCore, or V8, is a treat 🍪.
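The method-level variant can be caricatured in a few lines (a toy sketch only; real engines additionally profile argument types, deoptimize when assumptions break, and recompile repeatedly):

```javascript
// Toy model of method-level tiering: count invocations and swap in a
// "compiled" version once a function runs hot.
const HOT_THRESHOLD = 1000;

function tiered(interpret, compile) {
  let calls = 0;
  let optimized = null;
  return (...args) => {
    if (optimized) return optimized(...args); // fast path once compiled
    if (++calls >= HOT_THRESHOLD) optimized = compile();
    return interpret(...args); // slow "interpreted" path while cold
  };
}

// An "interpreted" baseline plus a stand-in for an optimizing recompile.
const square = tiered(
  (x) => x * x,
  () => (x) => x * x
);
```

Callers never notice the hand-off: `square(3)` returns `9` whether the cold or the hot path serves it, which is exactly the transparency real JITs must preserve.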

While in theory JIT compilation should be able to produce more optimized code based on up-to-date profile data, it is always a story of trade-offs. Being able to stop the world for a few minutes to do nothing but optimization passes is a privilege that JIT compilers usually don’t have. Meaning, there’s a practical gap between what JIT compilers could do and what they can afford to do. Once you add profile guidance to your AOT pipeline, any elusive performance benefits JITs may have are virtually gone. When applying guidance to our JSON echo server, throughput further increased by 58% with an average latency reduction of 36%. Even w/o explicit guidance, AOT-compiled lower-level languages, especially those with manual memory management, tend to come out on top while maintaining a much smaller footprint and operational surface.

This hard truth explains why runtimes, even with the latest and greatest JIT compilers, continue to benefit greatly from calling out into highly optimized implementations in lower-level languages. As we’ve seen, it’s one of the key strategies employed by JS runtimes to speed things up. The same strategy enables notoriously slow languages, like Python, to match regular expressions 15x faster than Swift 🔥, a language that itself should be a lot more efficient (arguably, they also just didn’t optimize their regex engine well).

Don’t get me wrong, JIT compilers are an amazing feat of engineering and a demonstration of what’s possible. They’ve been a godsend for the web [1], fast iteration cycles with hot-reload, and wherever upfront compilation would be daunting. They work best when running the same code paths over and over and over again… like in benchmarks, maybe shifting the perceived gains in their favor [2]. However, on servers, mobile, and desktop, we typically do have the privilege of an expensive upfront build step. This makes the increased footprint, added complexity, and inflated operational (attack) surface a life choice one may or may not make [3]. Despite JIT vendors going out of their way to minimize overhead and make them as safe as possible, running AOT-compiled binaries is fundamentally simpler and will often be more economical when looking at the bigger picture. Heck, I’ve seen teams employ genetic algorithms to tune JVM flags 🤯.

I will happily concede that the JIT compilers shipped by popular JS engines seem to be quite a bit more optimized than Dart’s. In our tests we’ve found Dart’s to be reasonably speedy but also immensely hungry for memory. The Benchmark Game has numbers on Dart’s AOT vs JIT modes across a wide range of programs. As we’d expect, they mostly perform quite similarly, with AOT winning more often than not and needing up to 15x less memory for small problems. For larger problems, the workload’s own memory use dominates, amortizing the JIT’s fixed overhead. Interestingly though, the Dart JIT manages to bring in ample gains of 10%–30% over Dart AOT, and even more over Node.js, in the reverse-complement challenge [4]. In any case, Dart’s JIT compilation is a nice-to-have that greatly enhances especially Flutter’s developer experience with hot-reload. In terms of overall cost, there isn’t a clear justification for using it in production, whether that’s on the server, desktop, or mobile [5].

Parting Words

At surface level, we looked at the performance of Dart and several JavaScript runtimes. However, we quickly found that we had really only compared their respective HTTP server and JSON parser implementations. Doing a more apples-to-apples comparison, all of JavaScript’s seeming lead, and even the differences between the JavaScript runtimes, melted away 🫠. In case this comes as a surprise, hopefully it helps to contextualize some of the benchmarks out there.

We also looked at the fundamental differences between AOT and JIT compilation, the inherent trade-offs JIT compilers have to make, and the benefits AOT compilation can bring to performance and your wallet when applicable.

Whenever you’re considering to run JavaScript, or any JIT compiled language for that matter, it’s important to rationalize:

Do you have the same requirements as a web browser? Maybe not. Is it primarily language familiarity you seek, or could there be a simpler, saner option? There’s certainly some calming irony in building in an environment where even the architects have an easier time building core functionality outside 🙃.

Don’t let my trash talk dishearten you; without a doubt, amazing products have been built on JS and JITted runtimes. If you just really like something, you just really like something. But if you still end up trying something new, even if it doesn’t pan out, in the worst case you’ve learned something ✌️. Otherwise, I hope this article, at the very least, gave you something to chew on.

Going forward, should you use Dart? That depends entirely on you, what you like, and what you need. If you’re a JavaScript person looking into servers and have no specific requirements - why not? Dart is similar enough to pick up quickly, adds quality-of-life, and its decently efficient AOT mode makes it cheaper to operate 💸.

Should you be building on my toy runtime? Absolutely not. It was hacked up merely for this experiment, and its only hardening so far has been a torrential summer sprinkle. Though, shout out to flutter_rust_bridge, which made super-charging Dart truly blissful. Ultimately, if it’s raw performance you seek and you’re happy to stray further, you may take a look at Go or OCaml as more practical, zeitgeisty offerings. Rust is certainly faster to run and brings many novel concepts to the table but isn’t for the faint of heart.

Thanks for making it this far. It means a lot. In case you didn’t notice, this is my first attempt at writing an article. So if you’d like to see more, yell at me for being wrong and obnoxious, or are interested in taking things further, simply reach out. For good measure, the setup and benchmarks can be found on GitHub.


Footnotes

  1. This needs to be understood in a historic context: JITs helped to dramatically accelerate an already established ecosystem based on JavaScript and source-level distribution. It’s fun to think about how the Web would look if we were to rebuild it from scratch. For one, we’d probably distribute code in some pre-optimized IR. This would let us get away with simpler, faster VMs and maybe just streaming AOT compilation. Look at Go’s build times, and that’s w/o pre-optimization or an IR. FWIW, we’re already seeing a lot of complexity unravel with WebAssembly 🎉.

  2. All benchmarks above involved a warm-up to minimize latency distributions and generally give the JIT runtimes a leg up.

  3. To be clear, I’m not talking about runtimes in general. I’m specifically talking about JIT compilation. Other runtime quality-of-life features like automatic memory management with garbage collection do incur an overhead but they’re generally much simpler and even let you move complexity out of your application.

  4. Personally, I find the lead over Dart AOT a bit more surprising. The lead over Node.js could be explained by differences in the implementation.

  5. If your instant instinct is over-the-air (OTA) code pushes as advertised by some of the mobile JS frameworks, I’d urge you to stick with meager daily releases and invest in QA instead.

Chicken

I'm a private bird, a chicken who likes grains and sometimes spicy 🌶️
