Serving b8r

In a former life I worked on optimizing delivery of a fairly large website. I won’t pretend I understood a fraction of the detail, but I had a pretty good idea of the big picture and in a couple of places I drilled down to the bottom-most details.

This isn’t new to anyone who pays attention, but scale makes simple things hard.

The basic tricks to getting a web page to load fast are:

Do as little as possible:

  • Make everything as small as possible.
  • Make everything as simple as possible.
  • Be as asynchronous as possible.

Do it as infrequently, fast, and in parallel as possible:

  • Minimize the round-trip time from client to server.
  • Parallelize everything as much as possible.
  • Split stuff up enough to make it parallel, but not so much as to increase the number of round-trips. (To some extent, SPDY/http2 is solving this for us.)
  • Minimize the number of round trips.
  • Maximize cache utilization.

The grandparent of bindinator was bind-o-matic, which was not designed with all of these things in mind. In particular, it made no real attempt to leverage parallelism or asynchonicity. When Craig, Josiah, and I wrote bindinator, the “state of the art” was:

  • Figure out what your dependencies are.
  • Compile them into a big blob.
  • Minify the blob.
  • Give the client the blob.

Bind-o-matic’s approach was: “be so small and light that you don’t need to do clever shit to get good performance” (during development) because you can always do that later. We actually compiled our LESS on the client and it didn’t cause performance problems (once we forked less.js and sped it up a bit, and cached compiled CSS in localStorage).

While almost any javascript web application architecture can served as above, more fine-grained optimization (e.g. trying to get the user to interactivity as fast as possible) is a tougher problem, especially when you have little or no ability to do less (e.g. almost every person in the organization is incentivized to make your application bigger, slower, and more complex.)

And you might be committed to using a big, complicated framework that is virtually guaranteed to make anything it touches, bigger, more complex, and less asynchronous.

Anyway, I designed b8r to be as small, simple, and asynchronous as possible but left delivery optimization “for later”. I assumed I could just point webpack (or webpack2) or some more sophisticated tool (such as the stuff I had worked on) at anything I built later.

I did do one thing, though.

I wrote my own require implementation because I started reading the documentation for existing implementations and my eyes glazed over. In particular, none of them seemed as straightforward to use as the one I’d gotten used to in my former life (and with which I had deep familiarity).

Now, my require is purely client-side, which means that it is a big performance problem. Consider the following code:

const foo = require('foo.js');
foo(17);

The call to require must be synchronous. But what does require do behind the scenes?

  1. Use an XMLHttpRequest to pull “foo.js” from the server.
  2. Wrap a Function instance around the code inside it.
  3. Pass an object to the function.
  4. Return the “exports” property of the object when the function returns.

It really doesn’t matter how asynchronous your code is if every dependency involves halting execution while round-tripping to your server… recursively in the case of a file that, itself, has dependencies.

This behavior actually throws warnings in Chrome…

Chrome no likee
Chrome no likee

Chrome only complains about this the first time it happens, but it happens a lot.

Now the solution for this is to compile your javascript code on the server and deliver some kind of optimized blob — site.min.js say. This is exactly what tools like webpack do — they actually watch your code tree and trigger recompiles on-the-fly. Webpack offers a dev server that actually sets up a backchannel to the client and refreshes the browser automagically when there are code changes.

Sounds pretty good, right? — but it’s about 1/10 as responsive as using b8r and just forcing refresh. I fucking hate having to compile my code all the time, even if all the compiler does is walk a tree and concatenate a shitload of files wrapped in function calls and assignment statements and then call uglifyjs.

But that’s on a local dev server. What happens when you stick this code on a real server and the round-trip goes from ~0ms to ~100ms? It turns out that on the project I’m working on it changes my web application’s spin-up (with nothing cached) from ~600ms to ~1500ms. (Aside: this is a real web application with a shitload of functionality talking to production servers with no back-end optimization. In my past life, loading in ~1500ms from a real server would have caused spontaneous orgasms. When I told people that performance like this was achievable I was assumed to be a naive fool. No, I’m not bitter!)

So, how to do all this stuff on the client?

  1. Make b8r’s use of require asynchronous. E.g. b8r synchronously loads b8r.dom.js before it finishes loading, so load its dependencies asynchronously before loading b8r itself asynchronously.
  2. Get require to warn whenever it loads a module synchronously. Ick.
  3. OK, get require to return a JSON structure of synchronously loaded modules and load them asynchronously before doing anything else.
  4. Repeat step 3 until no warnings. (This took three iterations.)

Loading time from the stage server went from ~1500ms to ~600ms with no server-side optimization whatsoever. Not bad for a late night hack.

Now wouldn’t it be nice if this were all automatic?

I started writing this post on Friday evening, but my first stab at automating this didn’t work and my brain was too fried to fix it.

In order to function, require tracks all the modules it loads, and it already replaces itself recursively to handle nested requires (to allow for relative paths in require statements) so all I needed to do was track one level of dependencies, and then generate a list of preload “waves” where each wave comprises all modules with no unloaded dependencies. (Circular dependencies will be detected and throw errors.)

Oh, and this eliminates the need for b8r to do anything clever internally. The new solution is general and fixes b8r as well as everything else.

const preload_data =/* data from require.preloadData() */;
require.preload(preload_data).then(() => {
  /* ... */
});

So, now the steps are:

  1. Run require.preloadData() in the console, which spits out JSON data.
  2. Now call require.preload(), passing the data from step 1, which will generate a promise of everything loaded asynchronously.

If dependencies change, everything will still work, but dependencies that are force a synchronous request will generate console warnings.

As a nice bonus, this improves the load time of the b8r demo page by over 80%.