Farewell require, you’re no longer needed…

This is a really technical article about front-end javascript programming. You have been warned.

Background

For most of the last two years, I have written my front-end code using my own implementation of require. It was inspired by the mostly delightful implementation of require used at Facebook (in whose bowels I spent over a month of my life). Initially, I just wanted to use something off-the-shelf, but every implementation I found required a build phase during development (the Facebook version certainly did) and simultaneously tried to abstract away paths (because during the build phase, something would find all the libraries and map names to paths).

I wanted a require that did not depend on a build phase, did not abstract out paths (so if you knew a problem was in a required library, you also knew where that library was), was easy to use, and supported third-party libraries written to different variations of the commonjs “standard”.

What is require?

const foo = require('path/to/foo.js');

Require allows you to pull code from other files into a javascript file. Conceptually, it does something like this (the first time a given file is required):

  1. get text from ‘path/to/foo.js’
  2. insert it in the body of a function(module){ /* the code from foo.js */ }
  3. create an object, pass it to the function (as module)
  4. after the function executes, put whatever is in module.exports into a map from library files to their exports.

It follows that foo.js looks something like this:

module.exports = function() { console.log("I am foo"); }

The next time someone wants that file, pass them the same stuff you gave the previous caller.
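
Here’s a minimal sketch of those four steps plus the cache (illustrative only, not my implementation; it uses a synchronous XMLHttpRequest, which is exactly the naive blocking behavior discussed below):

// naive require: fetch the source synchronously, wrap it, cache the exports
const _cache = {};
function naiveRequire (path) {
  if (!_cache[path]) {
    const request = new XMLHttpRequest();
    request.open('GET', path, false); // false = synchronous, i.e. blocking
    request.send();
    const module = { exports: {} };
    // wrap the file's source in a function and pass it the module object
    new Function('module', request.responseText)(module);
    _cache[path] = module.exports; // the next caller gets the same stuff
  }
  return _cache[path];
}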

Now, if that were all it needed to do, require would be pretty simple. Even so, there’s one nasty issue lurking in the preceding when you want to allow for relative paths — if you require the same module from two different places, you don’t want to get back two different instances of the same thing, both for efficiency and for correctness.

But of course there are lots and lots of wrinkles:

  • Every javascript file can, of course, require other javascript files.
  • Circular references are easy to introduce accidentally, and can become unavoidable.
  • Each of these things is a synchronous call, so loading a page could take a very long time (if you put this naive implementation into production).
  • Javascript is a dynamic language. It’s already hard to statically analyze, require makes it harder. It’s particularly nasty because require is “just a function” which means it can be called at any time, and conditionally.
  • And, just to be annoying, the different implementations of commonjs make weird assumptions about the behavior of the wrapper function and the way it’s called (some expect module to be {}, some expect it to be {exports:{}}, some expect it to be something else, and, because require is just javascript, modules can choose to behave differently based on what they see). The sketch after this list shows the kind of guessing game that results.
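
As an illustration of that last point, a loader ends up guessing which convention a module followed after running it (a hypothetical sketch, not any particular loader’s code):

// the wrapper functions here stand in for wrapped module source
const variantA = (module) => { module.exports = { foo: 17 }; }; // assigns module.exports
const variantB = (module) => { module.foo = 17; };              // decorates module itself
function runWrapper (wrapper) {
  const module = { exports: {} };
  wrapper(module);
  // guess which convention the module followed
  return Object.keys(module.exports).length ? module.exports : module;
}
runWrapper(variantA).foo; // 17
runWrapper(variantB).foo; // 17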

Under the hood, require ends up needing to support things like dependency-mapping, circular-reference detection, asynchronous require, inlining pre-built dependency trees, and pretending to be different versions of require for different modules. It’s quite a nasty mess, and it’s vital to the correct functioning of a web app and its performance.

So, good riddance!

ES6 Modules

So, all the major browsers (and nodejs — sort of) finally support ES6 modules and — perhaps more importantly — because this is a change to the language (rather than something implemented in multiple different ways using the language) various build tools support it and can transpile it for less-compatible target platforms.

So now, you can write:

import foo from 'path/to/foo.js';

And the module looks like:

export default function(){ console.log('I am foo'); }

A couple of months ago, I took a few hours and deleted require.js from the b8r source tree, and then kept working on the b8r source files until they passed b8r’s (rather meagre) unit tests. I then ran into a hard wall because I couldn’t get any components to load if they used require.

The underlying problem is that components are loaded asynchronously, and import is synchronous. It turns out there’s an asynchronous version of import that I didn’t know about.

const foo = await import('path/to/foo.js'); // does not work!

To obtain a default export via import() you need to access the module’s default property. So:

const foo = (await import('path/to/foo.js')).default;

In general, I’ve started avoiding export default for small libraries. In effect, default is just another named export as far as dynamic import() is concerned.
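
To make that concrete (the module path and its contents are hypothetical):

// suppose foo.js contains: export default 17; export const bar = 'baz';
const module = await import('path/to/foo.js'); // in an async context
module.default;  // 17, because the default export is just the export named "default"
module.bar;      // 'baz', because named exports are properties of the module object
const { bar } = await import('path/to/foo.js'); // so destructuring works for named exports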

import — again because it’s a language feature and not just some function someone wrote — is doing clever stuff to allow it to be non-blocking, e.g.

import {foo} from 'path/to/foo.js';

will not work if foo is not explicitly exported from the module. E.g. exporting a default object with a property named foo will not work. This isn’t a destructuring assignment. Under the hood this presumably means that a module can be completely “compiled” with placeholders for the imports, allowing its dependencies to be found and imported (etc.). Whereas require blocks compilation, import does not.

This also means that import affords static analysis (unlike require, import must be at the top level of a module). Also, even though import() (dynamic import) looks just like a regular function call, it turns out it’s also a new language feature. It’s not a function — you can’t assign it to a variable or pass it around.
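
The contrast looks like this (someCondition and bar.js are hypothetical):

import { foo } from 'path/to/foo.js'; // fine: static, top-level
// a static import inside a block or function is a SyntaxError, which is
// what makes the dependency graph statically analyzable; dynamic import()
// is the escape hatch for conditional or lazy loading:
async function maybeLoadBar (someCondition) {
  if (someCondition) {
    const { bar } = await import('path/to/bar.js');
    return bar;
  }
}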

An Interesting Wrinkle

One of the thorniest issues I had to deal with while maintaining require was that of third-party libraries which adhered to mystifyingly different commonjs variations. The funny thing is that all of them turn out to support the “bad old way” of handling javascript dependencies, which is <script> tags.

As a result of this, my require provided a solution for the worst-case scenario in the shape of a function — viaTag — that would insert a (memoized) script tag in the document head and return a promise that resolved when the tag loaded. Yes, it’s icky (but heck, require is eval) but it works. I’ve essentially retained a version of this function to deal with third-party libraries that aren’t provided as ES modules.
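
It looks something like this (a from-memory sketch, not the original source):

const _scripts = {};
function viaTag (url) {
  if (!_scripts[url]) {
    _scripts[url] = new Promise((resolve, reject) => {
      const script = document.createElement('script');
      script.src = url;
      script.onload = () => resolve(); // by now the library has set its global
      script.onerror = reject;
      document.head.appendChild(script);
    });
  }
  return _scripts[url]; // memoized: every caller gets the same promise
}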

b8r without require

Last night, I revisited the task of replacing require with import in b8r armed with new knowledge of dynamic import(). I did not go back to the branch which had a partial port (that passed tests but couldn’t handle components) because I have done a lot of work on b8r’s support for web-components in my spare time and didn’t want to handle a complicated merge that touched on some of the subtlest code in b8r.

In a nutshell, I got b8r passing tests in about three hours, and then worked through every page of the b8r demo site and had everything loading by midnight with zero console spam. (Some of the loading is less seamless than it used to be because b8r components are even more deeply asynchronous than they used to be, and I haven’t smoothed all of that out yet.)

This morning I went back through my work, and found an error I’d missed (an inline test was still using require) and just now I realized I hadn’t updated benchmark.html so I fixed that.

An Aside on TDD

My position on TDD is mostly negative (I think it’s tedious and boring) but I’m a big fan of API-driven design and “dogfooding”. Dogfooding is, in essence, continuous integration testing, and the entire b8r project is pure dogfooding. It’s gotten to the point where I prefer developing new libraries I’m writing for other purposes within b8r because it’s such a nice environment to work with. I’ve been thinking about this a lot lately as my team develops front-end testing best practices.

Here’s the point — swapping out require for import across a pretty decent (and technically complex) project is beyond a “refactor”. This change took me two attempts, and I probably did a lot of thinking about it in the period between the attempts — so I’d call it a solid week of work.

At least when you’re waiting for a compile / render / dependency install you can do something fun. Writing tests — at least for UI components or layouts — isn’t fun to do and it’s not much fun to run the tests either.

“5 Story Points”

I’ve done a bunch of refactors of b8r-based code (in my last month at my previous job two of us completely reorganized most of the user interface in about a week, refactoring and repurposing a dozen significant components along the way). Refactoring b8r code is about as pleasant as I’ve ever found refactoring (despite the complete lack of custom tooling). I’d say it’s easier than fairly simple maintenance on typical “enterprisey” projects (e.g. usually React/Redux, these days) festooned with dedicated unit tests.

Anyway, even advocates of TDD agree that TDD works best for pure functions — things that have well-defined pre- and post-conditions. The problem with front-end programming is that if it touches the UI, it’s not a pure function. If you set the disabled attribute of a <button>, the result of that operation is (a) entirely composed of side-effects and (b) quite likely asynchronous. In TDD world you’d end up writing a test expecting the button with its disabled attribute set to be a button with its disabled attribute set. Yay, assignment statements work! We have test coverage. Woohoo!

Anyway, b8r has a small set of “unit tests” (many of which are in fact pretty integrated) and a whole bunch of inline tests (part of the inline documentation) and dogfooding (the b8r project is built out of b8r). I also have a low tolerance for console spam (it can creep in…) but b8r in general says something if it sees something (bad) and shuts up otherwise.

Anyway, I think this is a pretty pragmatic and effective approach to testing. It works for me!

What’s left to do?

Well, I have to scan through the internal documentation and purge references to require in hundreds of places.

Also, there’s probably a bunch of code that still uses require that for some reason isn’t being exercised. This includes code that expects to run in Electron or nwjs. (One of the nice things about removing my own implementation of require is that it had to deal with environments that create their own global require.) This is an opportunity to deal with some files that need to be surfaced or removed.

At that point there should be no major obstacle to using rollup.js or similar to “build” b8r (versus using the bespoke toolchain I built around require, which I can now also throw away). From there it should be straightforward to convert b8r into “just another package”.

Is it a win …yet?

My main goal for doing all this is to make b8r live nicely in the npm (and yarn) package ecosystem and, presumably, benefit from all the tooling that you get from being there. If we set that — very significant — benefit aside:

  • Does it load faster? No. The old require loads faster. But that’s to be expected — require did a lot of cute stuff to optimize loading and being able to use even cleverer tooling (that someone else has written) is one of the anticipated benefits of moving to import.
  • Does it run faster? No. The code runs at the same speed.
  • Is it easier to work with? export { .... } is no easier than module.exports = { .... } and destructuring required stuff is actually easier. It will, however, be much easier to work with b8r in combination with random other stuff. I am looking forward to not having to write /* global module, require */ and 'use strict' all over the place to keep my linters happy.
  • Does it make code more reliable? I’m going to say yes! Even in the simplest case import {foo} from 'path/to/foo.js' is more rigorous than const {foo} = require('path/to/foo.js') because the latter code only fails if foo is called (assuming it’s expected to be a function) or using it causes a failure. With import, an error is thrown as soon as foo.js loads if foo isn’t an actual export. (Another reason not to use export default, by the way.) The sketch below shows the difference.
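
To illustrate (fooo is a deliberately mistyped name):

// with require, a mistyped name fails only when you try to use it:
const { fooo } = require('path/to/foo.js'); // fooo is merely undefined
fooo(); // TypeError, possibly much later, in some rarely exercised path

// with import, the mistake surfaces the moment foo.js loads:
import { fooo } from 'path/to/foo.js'; // throws: foo.js has no export named fooo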

More on web-components and perf

tl;dr web-components are only slow if they have a shadowRoot.

Here’s an updated version of the previous graph, showing two new data points — one where I replace two spans with a simple b8r component, and another where I modified makeWebComponent to allow the creation of custom-elements without a shadowRoot (and hence without styling).

My intention with the upcoming version of b8r is to replace b8r’s components with web-components, and my previous test showed that web-components were kind of expensive. But it occurred to me I hadn’t compared them to b8r components, so I did.

In a nutshell, a b8r component is about as expensive as a web-component with a shadowRoot (the b8r component in question was styled; removing the style didn’t improve performance, which isn’t surprising since b8r deals with component styles in a very efficient way), and a web-component without a shadowRoot is just plain fast. This is great news since it means that switching from b8r components (which do not have a shadow DOM) to web-components is a perf win.
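
Skipping the shadowRoot just means never calling attachShadow. A generic sketch (this is not makeWebComponent itself):

class PlainElement extends HTMLElement {
  connectedCallback () {
    // render straight into the element: no attachShadow() call, so no
    // shadowRoot, no scoped styles, and none of the associated cost
    this.textContent = this.getAttribute('caption') || '';
  }
}
customElements.define('plain-element', PlainElement);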

C#

Screen shot of my galaxy generator in action

I’ve been developing stuff with Unity in my spare time for something like eight years (I started at around v1.5). I was initially drawn in by its alleged Javascript support. Indeed, I liked Unityscript so much that I defended it vocally against charges that C# was better, and wrote this article to help others avoid some of my early stumbles. I also contributed significant improvements to JSONParse — although the fact that you need a JSON module for Unityscript tells you something about just how unlike Javascript it really is.

I’m a pretty hardcore Javascript coder these days. I’ve learned several frameworks, written ad unit code that runs pretty much flawlessly on billions of computers every day (or used to — I don’t know what’s going on with my former employers), created a simple framework from scratch, helped develop a second framework from scratch (sorry, can’t link to it yet), built services using node.js and phantom.js, built workflow automation tools in javascript for Adobe Creative Suite and Cheetah 3D, and even written a desktop application with node-webkit.

The problem with Unityscript is that it’s not really Javascript, and the more I use it, the more the differences irk me.

Anyway, one evening in 2012 I wrote a procedural galaxy generator using barebones Javascript. When I say barebones, I mean that I didn’t use any third-party libraries (I’d had a conversation with my esteemed colleague Josiah Ulfers about the awfulness of jQuery’s iterators some time earlier and so I went off on a tangent and implemented my own iteration library for fun that same evening).

Now, this isn’t about the content or knowledge that goes into the star system generator itself. It’s basic physics and astrophysics, a bit of googling for things like the mathematics of log spirals, and finding Knuth’s algorithm for generating gaussian random distributions. Bear in mind that some of this stuff I know by heart, some of it I had googled in an idle moment some time earlier, and some of it I simply looked up on the spot. I’m talking about the time it takes to turn an algorithm into working code.

So the benchmark is: coding the whole thing, from scratch, in one long evening, using Javascript.

Now, the time it took to port it into Unityscript — NaN. Abandoned after two evenings.

I’m about halfway through porting this stuff to C# (in Unity), and so far I’ve devoted part of an afternoon and part of an evening. Now bear in mind that with C# I am using the Mono project’s buggy auto-completing editor, which is probably a slight productivity win versus using a solid text editor with no autocomplete for Javascript (and Unityscript). Also note that I am far from fluent as a C# programmer.

So far here are my impressions of C# versus Javascript.

C#’s data structures and types are a huge pain. Consider this method in my PNRG class (which wraps a MersenneTwister implementation I found somewhere in a far more convenient API):

// return a value in [min,max]
public float RealRange( double min, double max ){
    return (float)(mt.genrand_real1 () * (max - min) + min);
}

I need to cast the double (that results from mt.genrand_real1 ()). What I’d really like to do is pick a floating point format and just use it everywhere, but it’s impossible. Some things talk floats, others talk double, and of course there are uints and ints, which must also be cast to and fro. Now I’m sure there are bugs caused by, for example, passing signed integers into code that expects unsigned, but seriously. It doesn’t help that the Mono compiler generates cryptic error messages (not even telling you, for example, what it is that’s not the right type).

How about some simple data declarations:

Javascript:

var stellarTypes = {
    "O": {
        luminosity: 5E+5,
        color: 'rgb(192,128,255)',
        planets: [0,3]
    },
    ...
};

C#:

public static Dictionary<string, StellarType> stellarTypes = new Dictionary<string, StellarType> {
    {"O", new StellarType(){
        luminosity = 50000F,
        color = new Color(0.75F,0.5F,1.0F),
        minPlanets = 0, 
        maxPlanets = 3
    }},
    ...
};

Off-topic, here’s a handy mnemonic — Oh Be A Fine Girl Kiss Me (Right Now Smack). Although I think that R and N are now referred to as C-R and C-N and have been joined by C-H and C-J so we probably need a replacement.

Note that the C# version requires the StellarType class to be defined appropriately (I could have simply used a dictionary of dictionaries or something, but the declaration gets uglier fast, and it’s pretty damn ugly as it is). I also need to use the System.Collections.Generic namespace (that took me a while to figure out — I thought that by using System.Collections I would get System.Collections.Generic for free).

Now I don’t want to pile on C#. I actually like it a lot as a language (although I prefer Objective-C so far). It’s a shame it doesn’t have some obvious syntax sugar (e.g. public static auto or something to avoid typing the same damn type twice) and that its literal notation is so damn ugly.

Another especially annoying declaration pattern is public int foo { get; private set; } — note the lack of terminal semicolon, and the fact that it’s public/private. And note that this should probably be the single most common declaration pattern in C#, so it really should be the easiest one to write. Why not public int foo { get; }? (You shouldn’t need set at all — you have direct internal access to the member.)

I’m also a tad puzzled as to why I can’t declare static variables inside methods (I thought I might be doing it wrong, but this explanation argues it’s a design choice) — but I don’t see how a static method variable would or should be different from an instance variable, only scoped to the method. So, instead I’m using private member variables which need to be carefully commented. How is this better?

So in a nutshell, I need to port the following code from Javascript to C#:

  • astrophysics.js — done
  • badwords.js — done; simple code to identify randomly generated names containing bad words and eliminate them
  • iter.js — C# has pretty good built-in iterators (and I don’t need most of the iterators I wrote) so I can likely skip this
  • mersenne_twister — done; replaced this with a different MT implementation in C#; tests written
  • planet.js — I’ve refactored part of this into the Astrophysics module; the rest will be in the Star System generator
  • pnrg.js — done; tests written; actually works better and simpler in C# than Javascript (aside from an hour spent banging my head against weird casting issues)
  • star.js — this is the galaxy generator (it’s actually quite simple) — it basically produces a random collection of stars offset from a log spiral using a gaussian distribution (see the sketch after this list).
  • utils.js — random stuff like a string capitalizer, roman numeral generator, and object-to-HTML renderer; will probably go into Astrophysics or be skipped
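
For the curious, the star.js recipe boils down to something like this (an illustrative reconstruction, not the original code; the polar method is the Knuth algorithm mentioned earlier):

// gaussian random numbers via the polar method
function gaussian () {
  let u, v, s;
  do {
    u = Math.random() * 2 - 1;
    v = Math.random() * 2 - 1;
    s = u * u + v * v;
  } while (s >= 1 || s === 0);
  return u * Math.sqrt(-2 * Math.log(s) / s);
}

// a star scattered around a log spiral, r = a * e^(b * theta)
function spiralStar (theta, a, b, spread) {
  const r = a * Math.exp(b * theta);
  return {
    x: r * Math.cos(theta) + gaussian() * spread,
    y: r * Math.sin(theta) + gaussian() * spread,
  };
}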

Once I’ve gotten the darn thing working, I’ll package up a web demo. (When Unity 5 Pro ships I should be able to put up a pure HTML version, which will bring us full circle.) Eventually it will serve as the content foundation for Project Weasel and possibly a new version of Manta.

The Browser as Word-Processor

Like, I think, most web developers, I assumed that making a decent WYSIWYG editor that works inside a browser is difficult and tedious, and that the best solution was to use something off-the-shelf — especially since there are what look like pretty good free options such as TinyMCE and CKEditor.

Upon close — or even superficial — examination, these editors (and there are a lot of them) seem to pretty much suck. As a simple pons asinorum, try the following test (incidentally, the editor in the latest version of WordPress scores quite well here):

  1. Create a simple document with a heading and a couple of paragraphs of text.
  2. Bold a couple of words in one of the paragraphs (not two consecutive words).
  3. Now, while observing the editor’s toolbar, try a series of selections:
  • Select a bold word
  • Select a plaintext word
  • Select from a bold word to some plaintext
  • Select within the different paragraphs and headings
  • Select across two paragraphs with different styles

If you didn’t encounter several WTF moments then either the state of off-the-shelf JavaScript text editors has improved markedly since I wrote this or you’re not paying close attention.

Google Documents — errr Google Drive — handles all this with aplomb, which is to say it behaves exactly like Microsoft Word (which is kind of dumb, but at least (a) what users probably expect, and (b) reasonably consistent). E.g. if you select a mixture of bold and non-bold text the bold toolbar icon will be “unlit” indicating (my colleague and I assume) that when you press it the selection will turn bold. In most of these editors either the bold button exhibits random behavior or goes bold if the selection starts in bolded text. (The latter at least accurately foreshadows the behavior of the bold button: more on that later.)

Assumed Hard, Left Untried

My experience as a JavaScript coder has been that there are only really two levels of difficulty for doing stuff in JavaScript — not that hard and practically impossible (and every so often someone will surprise me by doing something I assumed was practically impossible).

I knew that there’s a trick for editing any web page, e.g. the following scriptlet will make the current page editable (with spellchecking for bonus points):

javascript: document.body.contentEditable = true; document.body.spellcheck = true;

So, I knew that something like this was easy. What I didn’t realize was that at some point in the browser wars Microsoft implemented pretty much all of Microsoft Word (modulo some UI glitches) into Internet Explorer, and then everyone else simply implemented most of their API.

So, for example, the following scriptlet used in conjunction with the preceding one allows you to bold the current selection in any web page:

javascript: document.execCommand("bold");

If nothing else, you can have a lot of fun defacing web pages and taking screenshots:

CKEditor Home Page (Defaced)

While I knew about the contentEditable “trick”, I didn’t put all this together until — frustrated by the huge and unwieldy codebase of CKEditor (which, at bottom, is really just a custom UI library that calls execCommand) and unimpressed by alternatives (aside from Redactor.js, which looks pretty good but is not free or open source) — I found this thread on Stackoverflow.

editor.js in action on its documentation demo page

I cleaned up the example and had a working text editor in a few dozen lines of code. (I’d post them but they’re written for a client. I believe the whole project will eventually be open-sourced, but right now it hasn’t been. If and when it ever gets open-sourced, I’ll link the code.) Failing that, I may write my own simpler version for personal use (and integration with Foldermark).

document.execCommand implements most of the functionality you need to write a perfectly good word-processor from scratch in a browser. In fact, if you’re willing to put up with some UI quirks, pretty much all you need to do is implement some trivial UI components. Almost all the implementations out there create a complete blank document inside an iframe by default, but it’s perfectly viable to edit inline in a div, especially if you’re planning to use the ambient styles anyway.
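
A minimal sketch of the idea (the selectors are assumed markup, and this is not the client project’s code):

// make an ordinary div editable, inheriting the page's ambient styles
const editor = document.querySelector('.editor');
editor.contentEditable = true;
// a "toolbar" is just buttons that call execCommand on the current selection
document.querySelector('.bold-button').addEventListener('click', () => {
  document.execCommand('bold');
});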

The beauty of writing your own word processor using execCommand is that the browser gives you fine-grained access to all events, allowing you to arbitrarily fine-tune the low-level behavior of your word-processor. Microsoft Word, for example, has always had unfathomable UI quirks.

What don’t you get?

First, you do get pretty solid table support.

You don’t get fine control over styling, although there’s nothing to stop you from implementing a CSS editor of some kind (disguised or not). From my point of view, the default behavior of the browser word-processor is to be very styles-driven, and that’s a good thing. It’s not so easy out-of-the-box to, for example, set a specific point size for text.

Some execCommand commands don’t work very well. E.g. you can implement a “hiliter” using execCommand("backColor", …) but there’s no way to toggle it off (unlike bold), so to properly implement it you need to directly mess with the DOM, which — given the way selections are represented — can be a tad ugly.
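
The asymmetry, concretely:

document.execCommand('bold');                       // toggles bold on and off
document.queryCommandState('bold');                 // true if the selection is bold
document.execCommand('backColor', false, 'yellow'); // applies, but never toggles off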

There’s stuff that is simply difficult because you’re in a browser. E.g. without implementing some kind of service platform (or perhaps leveraging an existing one) you’re not going to be able to drag-and-drop a picture from your desktop into a document. It would be fairly straightforward, I suspect, to integrate DropBox with a word-processor to allow drag-and-drop images and attachments — anything that can be represented by a url is, of course, golden.

Most of the missing features from the free word-processor in your browser are what you’d expect. E.g. anything to do with overall document structure: automatic index and table of contents generation, footnotes, endnotes, headers, footers, and so forth. None of this stuff is hard to do in a browser. The real problem is support for printing — browsers generally suck at targeting printers — so you’re not going to replace Quark or InDesign — but if you’re targeting ePub rather than paper, I don’t see why you’d want to use anything else.

Final Thoughts

The advantages of “owning” your word processor’s codebase are enormous, especially if you’re trying to integrate it into a complex workflow. You can fine-tune exactly what happens when a user hits delete (e.g. we need to track dependencies — was this paragraph based on that paragraph template or not?), and what is editable or not. You can do validation inside input fields while allowing other text to be edited free-form. It’s pretty wonderful. One day, perhaps, there will be a free off-the-shelf editor that solves key UI and workflow integration issues, but we’re a ways from there.

PHP json_encode replacement

I ran into a problem with RiddleMeThis recently — the new online runtime needs to generate JavaScript structures on the server to hand over to the client. To do this I used the json_encode function, which requires PHP 5.2. Until now, RiddleMeThis hasn’t made many assumptions about the PHP runtime, but it turns out assuming PHP 5.2 is not a good idea. There’s a chunk of PHP you can get somewhere or other that will replace json_encode, but it’s annoyingly inconvenient.

Anyway, it turns out I wrote my own jsencode() function in order to deploy an earlier version of the runtime on a Mac OS X 10.5 server (which doesn’t have PHP 5.2, argh). This was a quick and dirty effort which served the purpose but is kind of evil (it wraps quotation marks around numbers, for one thing, and doesn’t quote the symbols — which is fine for JavaScript but not allowed for JSON, especially if you’re using a strict parser as found in jQuery 1.4).
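
To see why the unquoted symbols matter, here’s what a strict client-side parser does with them:

JSON.parse('{"foo":17}'); // fine
JSON.parse('{foo:17}');   // SyntaxError: valid JavaScript, but not valid JSON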

Feel free to use either of these snippets as you please.

function jsencode( $obj ){
	if( is_array( $obj ) ){
		$code = array();
		if( array_keys($obj) !== range(0, count($obj) - 1) ){
			foreach( $obj as $key => $val ){
				$code []= $key . ':' . jsencode( $val );
			}
			$code = '{' . implode( ',', $code ) . '}';
		} else {
			foreach( $obj as $val ){
				$code []= jsencode( $val );
			}
			$code = '[' . implode( ',', $code ) . ']';
		}
		return $code;
	} else {
		return '"' . addslashes( $obj ) . '"';
	}
}

So, here’s a better version. It allows you to encode for JSON or (by default) JavaScript (useful for passing stuff from PHP server-side to JavaScript client-side):

function jsencode( $obj, $json = false ){
	switch( gettype( $obj ) ){
		case 'array':
		case 'object':
			$code = array();
			// cast objects to arrays so array_keys() below works for both
			if( is_object( $obj ) ){ $obj = (array) $obj; }
			// is it anything other than a simple linear array?
			if( array_keys($obj) !== range(0, count($obj) - 1) ){
				foreach( $obj as $key => $val ){
					$code []= $json ?
						'"' . $key . '":' . jsencode( $val ) :
						$key . ':' . jsencode( $val );
				}
				$code = '{' . implode( ',', $code ) . '}';
			} else {
				foreach( $obj as $val ){
					$code []= jsencode( $val );
				}
				$code = '[' . implode( ',', $code ) . ']';
			}
			return $code;
			break;
		case 'boolean':
			return $obj ? 'true' : 'false' ;
			break;
		case 'integer':
		case 'double':
			return floatVal( $obj );
			break;
		case 'NULL':
		case 'resource':
		case 'unknown type':
			return 'null';
			break;
		default:
			return '"' . addslashes( $obj ) . '"';
	}
}

To send the information from PHP to JavaScript, you’d write something like this:

<script type="text/javascript">
      var foo = <?php echo jsencode( $some_variable ); ?>;
</script>

To generate a JSON feed using this code you’d write something like this:

header('Cache-Control: no-cache, must-revalidate');
header('Expires: Mon, 26 Jul 1997 05:00:00 GMT'); // some time in the past
header('Content-type: application/json');
echo jsencode( $some_associative_array, true );