What does a good API look like?

I’m temporarily unemployed — my longest period of not having to work every day since I was laid off by Valuclick in 2010 — so I’m catching up on my rants. This rant is shaped by experiences I had during my recent job interview process (I was interviewed by two Famous Silicon Valley Companies, one of which hired me and the other of which “didn’t think I was a good fit”). Oddly enough, the company that hired me interviewed me in an API-agnostic (indeed almost language-agnostic) way, while the company that didn’t hire me interviewed me in terms of my familiarity with an API created by the other company (which they apparently are using religiously, although maybe not so much following recent layoffs).

Anyway, the API in question is very much in vogue in the Javascript / Web Front End world, in much the same way as CoffeeScript and Angular were a couple of years ago. (Indeed, the same bunch of people who were switching over to Angular a few years back have recently been switching over to this API, which is funny to watch.) Both this API and the API it has replaced in terms of mindshare are in large part concerned with a very common problem in UI programming.

A Very Common Problem in UI Programming

Probably the single most common problem in UI programming is synchronizing data with the user interface, or “binding”. There are three fundamental things almost any user-facing program has to do:

  • Populate the UI with data you have when (or, better, before) the UI first appears on screen
  • Keep the UI updated with data changed elsewhere (e.g. data that changes over time)
  • Update the data with changes made by the user

This stuff needs to be done, it’s boring to do, it often gets churned a lot (there are constant changes made to a UI and the data it needs to sync with in the course of any real project). So it’s nice if this stuff isn’t a total pain to do. Even better if it’s pretty much automatic for simple cases.

The Javascript API in question addresses a conspicuous failing of Javascript given its role as “the language of the web”. In fact almost everything good and bad about it stems from its addressing this issue. What’s this issue? It’s the difficulty of dealing with HTML text. If you look at Perl, PHP, and JSP, the three big old server-side web-programming languages, each handles this particular issue very well. The way I used to look at it was:

  • A perl script tends to look like a bunch of code with snippets of HTML embedded in it.
  • A PHP script tends to look like a web page with snippets of code embedded in it.
  • A JSP script tends to look like a web page with horrible custom tags and/or snippets of code embedded in it.

If you’re trying to solve a simple problem like get data from your database and stick it in your dynamic web page, you end up writing that web page the way you normally would (as bog standard HTML) and just putting a little something where you want your data to be and maybe some supporting code elsewhere. E.g. in PHP you might write “<p>{$myDate}</p>” while in  JSP you’d write something like “<p><%= myDate %></p>”. These all look similar, do similar things, and make sense.

It’s perfectly possible to defy these natural tendencies, e.g. write a page that has little or no HTML in it and just looks like a code file, but this is pretty much how many projects start out.

Javascript, in comparison, is horrible at dealing with HTML. You either end up building strings manually (“<p>” + myDate + “</p>”), which gets old fast for anything non-trivial, or you manipulate the DOM directly through the browser’s APIs, having first added metadata to your existing HTML, e.g. you’d change “<p></p>” to “<p id="myDate"></p>” and then write “document.getElementById('myDate').textContent = myDate;” in a script tag somewhere else.

The common solution to this issue is to use a template language implemented in Javascript (there are approximately 1.7 bazillion of them, as there are of anything involving Javascript), which allows you to write something like “<p>{{myDate}}</p>” and then do something like “Populate(slabOfHtml, {myDate: myDate});” in the simplest case (cue discussion about code injection). The net effect is you’re writing non-standard HTML and using a possibly obscure and flawed library to manipulate markup written in this non-standard HTML (…code injection). You may also be paying a huge performance penalty because, depending on how things work, updating the page may involve regenerating its HTML and getting the browser to parse it again, which can suck — especially with huge tables (or, worse, huge slabs of highly styled DIVs pretending to be tables). OTOH you can use lots of jQuery to populate your DOM-with-metadata fairly efficiently, but this tends to be even worse for large updates.
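To make the simplest case concrete, here is a minimal sketch of such a template function. The names populate and escapeHtml and the exact {{key}} handling are assumptions for illustration, not any particular library’s API. Values are HTML-escaped, which is one way to address the code injection concern.

```javascript
// Hypothetical helper: escape characters that are special in HTML.
function escapeHtml(value) {
  return String(value)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;')
    .replace(/'/g, '&#39;');
}

// Replace each {{key}} in the template with the escaped value from data.
// Keys that are missing from data are replaced with an empty string.
function populate(template, data) {
  return template.replace(/\{\{\s*(\w+)\s*\}\}/g, function (match, key) {
    return Object.prototype.hasOwnProperty.call(data, key) ? escapeHtml(data[key]) : '';
  });
}

var html = populate('<p>{{myDate}}</p>', { myDate: '2016-03-01' });
```

Note that every update still produces a fresh string of HTML that the browser has to parse again, which is exactly the performance worry described above.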

The API in question solves this problem by uniting non-standard HTML and non-standard Javascript in a single new language that’s essentially a mashup of XML and Javascript that compiles into pure Javascript and [re]builds the DOM from HTML efficiently and in a fine-grained manner. So now you kind of need to learn a new language and an unfamiliar API.

My final interview with the company that did not hire me involved doing a “take home exam” where I was asked to solve a fairly open-ended problem using this API, for which I had to actually learn this API. The problem essentially involved: getting data from a server, displaying a table of data, allowing the user to see detail on a row item, and allowing the user to page through the table.

Having written a solution using this unfamiliar API, I found it very verbose and clumsy, so I tried to figure out what I’d done wrong. I tried to work out the idiomatic way to do things using this API and refine my code accordingly. Having spent a lot of spare time on this exercise (and I was more-than-fully-employed at the time), it struck me that the effort I was spending to learn the API, and to hone my implementation, was far greater than the effort required to implement the same solution using an API I had written myself. So, for fun, I did that too.

Obviously, I had much less trouble using my API. Obviously, I had fewer bugs. Obviously I had no issues writing idiomatic code.

But, here’s the thing. Writing idiomatic code wasn’t actually making my original code shorter or more obvious. It was just more idiomatic.

To bind an arbitrary data object to the DOM with my API, the code you write looks like this:

$(<some-selector>).bindomatic(<data-object>);

The complex case looks like this:

$(<some-selector>).bindomatic(<data-object>, <options-object>);

Assuming you’re familiar with the idioms of jQuery, there’s nothing new to learn here. The HTML you bind to needs to be marked up with metadata in a totally standard way (intended to make immediate sense even to people who’ve never seen my code before), e.g. to bind myDate to a particular paragraph you might write: “<p data-source=".myDate"></p>”. If you wanted to make the date editable by the user and synced to the original data object, you would write: “<input data-bind=".myDate">”. The only complaints I’ve had about my API are about the “.” part (and I somewhat regret it). Actually the syntax is data-source="myData.myDate" where “myData” is simply an arbitrary word used to refer to the original bound object. I had some thoughts of actually directly binding to the object by name, somehow, when I wrote the API, but Javascript doesn’t make that easy.

In case you’re wondering, the metadata for binding tabular data looks like this: “<tr data-repeat=".someTable"><td data-source=".someField"></td></tr>”.
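How a path such as “.myDate” might be resolved against the bound object can be sketched roughly as follows. This is only an illustration under stated assumptions: pathKeys, getPath and setPath are hypothetical names, not the library’s actual internals, and the first path segment is treated as an arbitrary label for the bound object, as described above.

```javascript
// Split "myData.myDate" or ".myDate" into keys, ignoring the first segment,
// which is only a label for the bound object.
function pathKeys(path) {
  return path.split('.').slice(1);
}

// Read the value at the path; returns undefined if any step is missing.
function getPath(obj, path) {
  return pathKeys(path).reduce(function (value, key) {
    return value == null ? undefined : value[key];
  }, obj);
}

// Write a value at the path (assumes the intermediate objects exist).
function setPath(obj, path, newValue) {
  var keys = pathKeys(path);
  var last = keys.pop();
  var target = keys.reduce(function (value, key) { return value[key]; }, obj);
  target[last] = newValue;
}

var data = { myDate: '2016-03-01', owner: { name: 'Anne' } };
getPath(data, '.myDate');                  // '2016-03-01'
setPath(data, 'myData.owner.name', 'Bob'); // what a data-bind input would do on change
```

A data-source attribute would only ever need getPath; a data-bind attribute would need both.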

My code was leaner, far simpler, to my mind far more “obvious”, and ran faster than the code using this other, famous and voguish, API. There’s also no question my API is far simpler. Oh, and also, my library solves all three of the stated problems — you do have to tell it if you have changes in your object that need to be synced to the UI — (without polluting the source object with new fields, methods, etc.) while this other library — not-so-much.

So — having concluded that a programming job that entailed working every day with the second API would be very annoying — I submitted both my “correct” solution and the simpler, faster, leaner solution to the second company and there you go. I could have been laid off by now!

Here’s my idea of what a good API looks like

  • It should be focused on doing one thing and one thing well.
  • It should only require that the programmer tell it something it can’t figure out for itself and hasn’t been told before.
  • It should be obvious (or as obvious as possible) how it works.
  • It should have sensible defaults.
  • It should make simple things ridiculously easy, and complex things possible (in other words, its simplicity shouldn’t handcuff a programmer who wants to fine-tune performance, UX, and so on).
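As a small illustration of the “sensible defaults” and “simple things easy, complex things possible” points, here is a hypothetical sketch. The option names are invented for this example and are not bindomatic’s real options. The simple call passes nothing, and an optional options object overrides defaults one at a time.

```javascript
// Invented defaults, for illustration only.
var DEFAULTS = { updateOn: 'change', escapeHtml: true };

function bindOptions(options) {
  // Copy the defaults first so the caller only specifies what differs.
  return Object.assign({}, DEFAULTS, options);
}

var simple = bindOptions();                     // all defaults
var tuned = bindOptions({ updateOn: 'input' }); // one override, the rest defaulted
```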

XCode and Swift

I don’t know if the binding mechanisms in Interface Builder seemed awesome back in 1989, but today — with all the improvements in both Interface Builder and the replacement of Objective-C with the (potentially) far cleaner Swift — they seem positively medieval to me, combining the worst aspects of automatic-code-generation “magic” and telling the left hand what the left hand is doing.

Let’s go back to the example of sticking myDate into the UI somewhere. IB doesn’t really have “paragraphs” (unless you embed HTML) so let’s stick it in a label. Supposing you have a layout created in IB, the way you’re taught — as a newb — to do this is:

  1. In XCode, drag from your label in IB to the view controller source code (oh, step 0 is to make sure both relevant things are visible)
  2. You’ll be asked to name the “outlet”, and then XCode will automagically write this code: @IBOutlet weak var myDate: UILabel!
  3. Now, in — say — the already-written-for-you viewDidLoad method of the controller you can write something like: myDate.text = _myDate (it can’t be myDate because you’ve used myDate to hold the outlet reference).

Congratulations, you have now solved one of the three problems. That’s two lines of code, one generated by magic, the other containing no useful information, that you’ve written to get one piece of data from your controller to your view.

Incidentally, let’s suppose I wanted to change the outlet name from “myDate” to “dateLabel”. How do I do that? Well, you can delete the outlet and create a new outlet from scratch using the above process, and then change the code referencing the outlet. Is there another way? Not that I know of.

And how do we solve the other two problems?

Let’s suppose we’d in fact bound to an input field. So now my outlet looks like this: @IBOutlet weak var myDate: UITextField! (the “!” is semantically significant, not me getting excited).

  1. In XCode, drag from the field in IB to the view controller source code.
  2. Now, instead of creating an outlet, you select Action, and you make sure the type is UITextField, and change the event to Editing Changed (text fields don’t send Value Changed when the user types).
  3. In the automatically-just-written-for-you Action code add the code _myDate = sender.text!

You’ve now solved the last of the three problems. You’ve had a function written for you automagically, and you’ve written one line of brain-dead code. That’s three more lines of code (and one new function) to support your single field. And that’s two different things that require messing with the UI during a refactor or if a property name gets changed.

OK, what about the middle problem? That’s essentially a case of refactoring the original code so that you can call it whenever you like. So, for example, you write a showData method, call it from viewDidLoad, and then call it when you have new data.

Now, this is all pretty ugly in basic Javascript too. (And it was even uglier until browsers added document.querySelector.) The point is that it’s possible to make it very clean. How to do this in Swift / XCode?

Javascript may not have invented the hash as a fundamental data type, but it certainly popularized it. Swift, like most recent languages, provides dictionaries as a basic type. Dictionaries are God’s gift to people writing binding libraries. That said, Swift’s dictionaries are strongly typed which leads to a lot of teeth gnashing.

Our goal is to be able to write something like:

    self.bind(data)

It would be even cooler to be able to round-trip JSON (the way my Javascript binding library can). So if this works we can probably integrate a decent JSON library.

So the things we need are:

  • Key-value-pair data storage, i.e. dictionaries — check!
  • The ability to add some kind of metadata to the UI
  • The ability to find stuff in the UI using this metadata
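Those three ingredients can be sketched in a language-neutral way. The following uses Javascript with plain objects standing in for views; the identifier, children and text field names, and the functions findViews and bind, are assumptions for this example and not Cocoa’s API.

```javascript
// Recursively collect every view whose identifier matches.
function findViews(root, identifier, found) {
  found = found || [];
  if (root.identifier === identifier) {
    found.push(root);
  }
  (root.children || []).forEach(function (child) {
    findViews(child, identifier, found);
  });
  return found;
}

// Push each value in the dictionary into all views tagged with its key.
function bind(root, data) {
  Object.keys(data).forEach(function (key) {
    findViews(root, key).forEach(function (view) {
      view.text = String(data[key]);
    });
  });
}

// A tiny fake view hierarchy: "name" appears in two places.
var ui = {
  children: [
    { identifier: 'name', text: '' },
    { children: [{ identifier: 'name', text: '' }, { identifier: 'sex', text: '' }] }
  ]
};
bind(ui, { name: 'Anne Example', sex: 'female' });
```

Because identifiers are allowed to repeat, one value can end up in more than one place without any extra code.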

This doesn’t seem too intimidating until you consider some of the difficulty involved in binding data to IB.


The way tables are implemented in Cocoa is actually pretty awesome. In essence, Cocoa tables (think lists, for now) are generated minimally and managed efficiently by the following mechanism:

The minimum number of rows is generated to fill the available space.

When the user scrolls the table, new rows are created as necessary, and old rows disposed of. But, to make it even more efficient rather than disposing of unused rows, they are kept in a pool and reallocated as needed — so the row that scrolls off the top as you scroll down is reused to draw the row that just scrolled into view. (It’s more complex and clever than this — e.g. rows can be of different types, and each type is pooled separately — but that’s the gist.) This may seem like overkill when you’re trying to stick ten things in a list, but it’s ridiculously awesome when you’re trying to display a list of 30,000 songs on your first generation iPhone.
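The pooling idea can be sketched in a few lines. This is a simplification in Javascript with invented names (RowPool, dequeue, recycle), not how UIKit is actually implemented, and the five-row window is an arbitrary choice.

```javascript
// A pool of reusable rows, kept separately per row type.
function RowPool(createRow) {
  this.createRow = createRow; // factory: (type) -> new row object
  this.free = {};             // type -> array of recycled rows
  this.created = 0;           // how many rows were ever allocated
}

// Hand out a recycled row if one is available, otherwise create one.
RowPool.prototype.dequeue = function (type) {
  var rows = this.free[type] || [];
  if (rows.length > 0) {
    return rows.pop();
  }
  this.created += 1;
  return this.createRow(type);
};

// Return a row that scrolled out of view to the pool.
RowPool.prototype.recycle = function (type, row) {
  (this.free[type] = this.free[type] || []).push(row);
};

// Simulate scrolling a 5-row window over 30,000 items.
var pool = new RowPool(function (type) { return { type: type, text: '' }; });
var visible = [];
for (var i = 0; i < 30000; i++) {
  if (visible.length === 5) {
    pool.recycle('song', visible.shift()); // top row scrolls off
  }
  var row = pool.dequeue('song');          // bottom row scrolls in
  row.text = 'Song ' + i;
  visible.push(row);
}
```

In this simulation pool.created stays at the number of rows that fit on screen, no matter how long the list is.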

In order for this to work, there’s a delegate-style protocol (UITableViewDataSource, the companion to UITableViewDelegate). The minimal implementation is that you tell the table how many rows of data you have and populate a row when you’re asked to.

So, for each table you’re dealing with you need to provide a delegate that knows what’s supposed to go in that specific table. Ideally, I just want to do something like self.bind(data) in the viewDidLoad method. But how do I create and hook up the necessary delegates? It’s even worse if I want to use something like RootViewController (e.g. for a paged display) which is fiddly to set up even manually. But, given how horrible all this stuff is to deal with in vanilla Swift/Cocoa, that’s just how awesome it will be not to have to do any of it ever again if I can do this. Not only that, but to implement this I’m going to need to understand the ugly stuff really well.


Adding Metadata to IB Objects

The first problem is figuring out some convenient way of attaching metadata to IB elements (e.g. buttons, text fields, and so on). After a lot of spelunking, I concluded that my first thought (to use the accessibilityIdentifier field) was the most practical (even though, as we shall see, it has issues).

There are oodles of different, promising-looking fields associated with elements in IB, e.g. you can set a label (which appears in the view hierarchy, making the control easy to find). This would be perfect, but as far as I could tell it isn’t actually accessible at runtime. There are also User Defined Runtime Attributes, which are a bit fiddly to add and edit, but worse, as far as I’ve been able to tell, safely accessing them is a pain in the ass (i.e. if you simply ask for a property by name and it’s not there — crash!). So, unless I get a clue, no cigar for now.

The nice thing about the accessibilityIdentifier is that it looks like it’s OK for it to be non-unique (so you can bind the same value to more than one place) and it can be directly edited (you don’t need to create a property, and then set its name, set its type as you do for User Defined Runtime Attributes). The downside is that some things — UITableViews in particular — don’t have them. (Also, presumably, they have an impact on accessibility, but it seems to me like that shouldn’t be a problem if you use sensible names.)

So my first cut of automatic binding for Swift/Cocoa took a couple of hours and handled UITextField and UILabel.

class Bindery: NSObject {
    var view: UIView!
    var data: [String: AnyObject?]!

    init(view v:UIView, data dict:[String: AnyObject?]){
        super.init()
        view = v
        data = dict
    }

    // Find the subviews whose accessibilityIdentifier matches name.
    func subviews(name: String) -> [UIView] {
        var found: [UIView] = []
        for v in view!.subviews {
            if v.accessibilityIdentifier == name {
                found.append(v)
            }
        }
        return found
    }

    // Copy an edited value from the control back into the data.
    @IBAction func valueChanged(sender: AnyObject?){
        var key: String? = nil
        if sender is UIView {
            key = sender!.accessibilityIdentifier
            if !data.keys.contains(key!) {
                return
            }
        }
        if sender is UITextField {
            let field = sender as? UITextField
            data[key!] = field!.text
        }
    }

    // Push the value for one key into every view bound to it.
    func updateKey(key: String){
        let views = subviews(key)
        let value = data[key]
        for v in views {
            if v is UILabel {
                let label = v as? UILabel
                label!.text = value! is String ? value as! String : ""
            }
            else if v is UITextField {
                let field = v as? UITextField
                field!.text = value! is String ? value as! String : ""
                field!.addTarget(self, action: "valueChanged:", forControlEvents: .EditingDidEnd)
            }
        }
    }

    // Push every value into the UI; returns self so it can be chained.
    func update() -> Bindery {
        for key in (data?.keys)! {
            updateKey(key)
        }
        return self
    }
}

Usage is pretty close to my ideal with one complication (this code is inside the view controller):

    var binder: Bindery!
    var data:[String: AnyObject?] = [
        "name": "Anne Example",
        "sex": "female"
    ]

    override func viewDidLoad() {
        super.viewDidLoad()
        // Do any additional setup after loading the view, typically from a nib.
        binder = Bindery(view:self.view, data: data).update()
    }

If you look closely, I have to call update() from the new Bindery instance to make things work. This is because Swift doesn’t let me refer to self inside an initializer (I assume this is designed to avoid possible issues with computed properties, or to encourage programmers to not put heavy lifting in the main thread… or something). Anyway it’s not exactly terrible (and I could paper over the issue by adding a class convenience method).

OK, so what about tables?

Well I figure tables will need their own special binding class (which I shockingly call TableBindery) and implement it so that you need to use an outlet (or some other hard reference to the table) and then I use Bindery to populate each cell (this lets you create a cell prototype visually and then bind to it with almost no work). This is how that ends up looking (I won’t bore you with the implementation, which is pretty straightforward once I worked out that a table cell has a child view that contains all of its contents, and how to convert a [String: String] into a [String: AnyObject?]):

    var data:[String: AnyObject?] = [
        "name": "Anne Example",
        "sex": "female"
    ]

    override func viewDidLoad() {
        super.viewDidLoad()
        tableBinder = TableBindery(table:table, array: tableData).update()
    }

In the course of getting this working, I discovered that the prototype cells do have an accessibilityIdentifier, so it might well be possible to spelunk the table at runtime and identify bindings by using the attributes of the table’s children. The fact is, though, that tables — especially the sophisticated multi-section tables that Cocoa allows — probably need to be handled a little more manually than HTML tables usually do, and having to write a line or two of code to populate a table is not too bad.

Now imagine if Bindery supported all the common controls, provided a protocol for allowing custom controls to be bound, and then imagine an analog of TableBindery for handling Root view controllers. This doesn’t actually look like a particularly huge undertaking, and I already feel much more confident dealing with Cocoa’s nasty underbelly than I was this morning.

And, finally, if I really wanted to provide a self.bindData convenience function — Swift and Cocoa make this very easy. I’d simply extend UIView.

Node-Webkit Development

I’m in the process of porting RiddleMeThis from Realbasic (er Xojo) to Node-Webkit (henceforth nw). The latest version of nw allows full menu customization, which means you can produce pretty decently behaved applications with it, and they have the advantage of launching almost instantly and being able to run the same codebase everywhere (including online and, using something like PhoneGap, on mobile). It has the potential to be cross-platform nirvana.

Now, there are IDEs for nw, but I’m not a big fan of IDEs in general, and I doubt that I’m going to be converted to them any time soon. In the meantime, I’ve been able to knock together applications using handwritten everything pretty damn fast (as fast as with Realbasic, for example, albeit with many of the usual UI issues of web applications).

Here’s my magic sauce — a single shell script that I simply put in a project directory and double-click to perform a build:

DIR="$( cd "$( dirname "${BASH_SOURCE[0]}" )" && pwd )"
echo "$DIR"
cd "$DIR"
zip app.nw *.html package.json *.png *.js *.css *.map
mv app.nw ../node-webkit.app/Contents/Resources/
cp nw.icns ../node-webkit.app/Contents/Resources/
cp info.plist ../node-webkit.app/Contents/
open ../node-webkit.app

What this shell script does is set the working directory to the project directory, zip together the source files and name the archive app.nw, and then move that, the icon, and the (modified) info.plist into the right place (assuming the node-webkit app is in the parent directory), and launch it. This basically takes maybe 2s before my app is running in front of me.

info.plist is simply copied from the basic nw app, and modified so that the application name and version are what I want to appear in the about box and menubar.

This is how I make sure the standard menus are present (on Mac builds):

var gui = require('nw.gui'),
    menubar = new gui.Menu({ type: 'menubar' });
if(process.platform === 'darwin'){
    // 'My App' is a placeholder for your application's name
    menubar.createMacBuiltin('My App');
}
gui.Window.get().menu = menubar;

And finally, to enable easy debugging:

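if(dev){
    var debugItem = new gui.MenuItem({ label: 'Dev' }),
        debugMenu = new gui.Menu(),
        menuItem = new gui.MenuItem({ label: "Open Debugger" });
    debugItem.submenu = debugMenu;
    menuItem.click = function(){ gui.Window.get().showDevTools(); };
    debugMenu.append(menuItem);
    menubar.append(debugItem);
}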

So with a few code snippets you get a Mac menubar (where appropriate), a Dev menu (if dev is truthy) and a single double-click to package and build your application in the blink of an eye. You can build your application with whatever Javascript / HTML / CSS combination suits you best.

Obviously, you don’t get a real “native” UI (but the next best thing is a web UI, since people expect to deal with web applications on every platform), and this isn’t going to be a great way to deliver performant applications that do heavy lifting (e.g. image processing), but it’s very easy to orchestrate command-line tools, and of course the chrome engine offers a ridiculous amount of functionality out of the box.

I should mention that Xojo is actually quite capable of producing decent Cocoa applications these days. (It sure took a while.) But its performance is nowhere near nw in practical terms. For example, C3D Buddy is able to load thousands of PNGs in the blink of an eye; the equivalent Xojo application is torpid. Now if I were doing heavy lifting with Javascript, perhaps it would tilt Xojo’s way, but it’s hard to think of just how heavy the lifting would need to get.


Screen shot of my galaxy generator in action

I’ve been developing stuff with Unity in my spare time for something like eight years (I started at around v1.5). I was initially drawn in by its alleged Javascript support. Indeed, I liked Unityscript so much, I defended it vocally against charges that using C# is better, and wrote this article to help others avoid some of my early stumbles. I also contributed significant improvements to JSONParse — although the fact that you need a JSON module for Unityscript tells you something about just how unlike Javascript it really is.

I’m a pretty hardcore Javascript coder these days. I’ve learned several frameworks, written ad unit code that runs pretty much flawlessly on billions of computers every day (or used to — I don’t know what’s going on with my former employers), created a simple framework from scratch, helped develop a second framework from scratch (sorry, can’t link to it yet), built services using node.js and phantom.js, built workflow automation tools in javascript for Adobe Creative Suite and Cheetah 3D, and even written a desktop application with node-webkit.

The problem with Unityscript is that it’s not really Javascript, and the more I use it, the more the differences irk me.

Anyway, one evening in 2012 I wrote a procedural galaxy generator using barebones Javascript. When I say barebones, I mean that I didn’t use any third-party libraries (I’d had a conversation with my esteemed colleague Josiah Ulfers about the awfulness of jQuery’s iterators some time earlier and so I went off on a tangent and implemented my own iteration library for fun that same evening).

Now, this isn’t about the content or knowledge that goes into the star system generator itself. It’s basic physics and astrophysics, a bit of googling for things like the mathematics of log spirals, and finding Knuth’s algorithm for generating gaussian random distributions. Bear in mind that some of this stuff I know by heart, some of it I had googled in an idle moment some time earlier, and some of it I simply looked up on the spot. I’m talking about the time it takes to turn an algorithm into working code.
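For a sense of how little code those two ingredients need, here is a sketch. It uses the polar method for gaussian values (one of the algorithms Knuth presents) and invented parameter values; it is an illustration, not the code from the actual generator, and the function names are hypothetical.

```javascript
// Gaussian random numbers via the polar method.
// `uniform` is any function returning numbers in [0, 1).
function gaussian(uniform, mean, stdDev) {
  var u, v, s;
  do {
    u = 2 * uniform() - 1;
    v = 2 * uniform() - 1;
    s = u * u + v * v;
  } while (s >= 1 || s === 0); // reject points outside the unit circle
  return mean + stdDev * u * Math.sqrt(-2 * Math.log(s) / s);
}

// Point on the log spiral r = a * e^(b * theta).
function spiralPoint(a, b, theta) {
  var r = a * Math.exp(b * theta);
  return { x: r * Math.cos(theta), y: r * Math.sin(theta) };
}

// Scatter `count` stars around one spiral arm (two full turns).
function spiralArm(count, uniform) {
  var stars = [];
  for (var i = 0; i < count; i++) {
    var p = spiralPoint(1, 0.3, uniform() * 4 * Math.PI);
    stars.push({
      x: gaussian(uniform, p.x, 0.5), // gaussian offset from the arm
      y: gaussian(uniform, p.y, 0.5)
    });
  }
  return stars;
}

var stars = spiralArm(100, Math.random);
```

Passing the uniform generator in as a parameter means a seeded generator can be swapped in for Math.random to get repeatable galaxies.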

So the benchmark is: coding the whole thing, from scratch, in one long evening, using Javascript.

Now, the time it took to port it into Unityscript — NaN. Abandoned after two evenings.

I’m about halfway through porting this stuff to C# (in Unity), and so far I’ve devoted part of an afternoon and part of an evening. Now bear in mind that with C# I am using the Mono project’s buggy auto-completing editor, which is probably a slight productivity win versus using a solid text editor with no autocomplete for Javascript (and Unityscript). Also note that I am far from fluent as a C# programmer.

So far here are my impressions of C# versus Javascript.

C#’s data structures and types are a huge pain. Consider this method in my PNRG class (which wraps a MersenneTwister implementation I found somewhere in a far more convenient API):

// return a value in [min,max]
public float RealRange( double min, double max ){
    return (float)(mt.genrand_real1 () * (max - min) + min);
}

I need to cast the double (that results from mt.genrand_real1 ()). What I’d really like to do is pick a floating point format and just use it everywhere, but it’s impossible. Some things talk floats, others talk double, and of course there are uints and ints, which must also be cast to and fro. Now I’m sure there are bugs caused by, for example, passing signed integers into code that expects unsigned, but seriously. It doesn’t help that the Mono compiler generates cryptic error messages (not even telling you, for example, what it is that’s not the right type).

How about some simple data declarations? In Javascript:

var stellarTypes = {
    "O": {
        luminosity: 5E+5,
        color: 'rgb(192,128,255)',
        planets: [0,3]
    },
    // ...
};

And in C#:

public static Dictionary<string, StellarType> stellarTypes = new Dictionary<string, StellarType> {
    {"O", new StellarType(){
        luminosity = 50000F,
        color = new Color(0.75F,0.5F,1.0F),
        minPlanets = 0,
        maxPlanets = 3
    }},
    // ...
};

Off-topic, here’s a handy mnemonic — Oh Be A Fine Girl Kiss Me (Right Now Smack). Although I think that R and N are now referred to as C-R and C-N and have been joined by C-H and C-J so we probably need a replacement.

Note that the C# version requires the StellarType class to be defined appropriately (I could have simply used a dictionary of dictionaries or something, but the declaration gets uglier fast, and it’s pretty damn ugly as it is). I also need to use the System.Collections.Generic namespace (that took me a while to figure out — I thought that by using System.Collections I would get System.Collections.Generic for free).

Now I don’t want to pile on C#. I actually like it a lot as a language (although I prefer Objective-C so far). It’s a shame it doesn’t have some obvious syntax sugar (e.g. public static auto or something to avoid typing the same damn type twice) and that its literal notation is so damn ugly.

Another especially annoying declaration pattern is public int foo { get; private set; } — note the lack of terminal semicolon, and the fact that it’s public/private. And note that this should probably be the single most common declaration pattern in C#, so it really should be the easiest one to write. Why not public int foo { get; }? (You shouldn’t need set at all — you have direct internal access to the member.)

I’m also a tad puzzled as to why I can’t declare static variables inside methods (I thought I might be doing it wrong, but this explanation argues it’s a design choice — but I don’t see how a static method variable would or should be different from an instance variable, only scoped to the method). So, instead I’m using private member variables which need to be carefully commented. How is this better?

So in a nutshell, I need to port the following code from Javascript to C#:

  • astrophysics.js — done
  • badwords.js — done; simple code to identify randomly generated names containing bad words and eliminate them
  • iter.js — C# has pretty good built-in iterators (and I don’t need most of the iterators I wrote) so I can likely skip this
  • mersenne_twister — done; replaced this with a different MT implementation in C#; tests written
  • planet.js — I’ve refactored part of this into the Astrophysics module; the rest will be in the Star System generator
  • pnrg.js — done; tests written; actually works better and simpler in C# than Javascript (aside from an hour spent banging my head against weird casting issues)
  • star.js — this is the galaxy generator (it’s actually quite simple) — it basically produces a random collection of stars offset from a log spiral using a gaussian distribution.
  • utils.js — random stuff like a string capitalizer, roman numeral generator, and object-to-HTML renderer; will probably go into Astrophysics or be skipped
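For reference, the kind of helper that lives in a utils file like this is tiny. The following is a sketch with hypothetical names, not the original utils.js, and using roman numerals to number planets is my assumption about what they are for.

```javascript
// Capitalize the first letter of a string.
function capitalize(s) {
  return s.charAt(0).toUpperCase() + s.slice(1);
}

// Convert a positive integer to a roman numeral.
function toRoman(n) {
  var table = [
    [1000, 'M'], [900, 'CM'], [500, 'D'], [400, 'CD'],
    [100, 'C'], [90, 'XC'], [50, 'L'], [40, 'XL'],
    [10, 'X'], [9, 'IX'], [5, 'V'], [4, 'IV'], [1, 'I']
  ];
  var result = '';
  // Greedily subtract the largest value that still fits.
  table.forEach(function (pair) {
    while (n >= pair[0]) {
      result += pair[1];
      n -= pair[0];
    }
  });
  return result;
}

var name = capitalize('vega') + ' ' + toRoman(4); // 'Vega IV'
```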

Once I’ve gotten the darn thing working, I’ll package up a web demo. (When Unity 5 Pro ships I should be able to put up a pure HTML version, which will bring us full circle.) Eventually it will serve as the content foundation for Project Weasel and possibly a new version of Manta.

The Browser as Word-Processor

Like, I think, most web developers, I assumed that making a decent WYSIWYG editor that works inside a browser is difficult and tedious, and assumed the best solution was to use something off-the-shelf — especially since there are what look like pretty good free options such as TinyMCE and CKEditor.

Upon close — or even superficial — examination, these editors (and there are a lot of them) seem to pretty much suck. As a simple pons asinorum try the following test (incidentally, the editor in the latest version of WordPress scores quite well here):

  1. Create a simple document with a heading and a couple of paragraphs of text.
  2. Bold a couple of words in one of the paragraphs (not two consecutive words).
  3. Now, while observing the editor’s toolbar, try a series of selections:
  • Select a bold word
  • Select a plaintext word
  • Select from a bold word to some plaintext
  • Select within the different paragraphs and headings
  • Select across two paragraphs with different styles

If you didn’t encounter several WTF moments then either the state of off-the-shelf JavaScript text editors has improved markedly since I wrote this or you’re not paying close attention.

Google Documents — errr Google Drive — handles all this with aplomb, which is to say it behaves exactly like Microsoft Word (which is kind of dumb, but at least (a) what users probably expect, and (b) reasonably consistent). E.g. if you select a mixture of bold and non-bold text the bold toolbar icon will be “unlit” indicating (my colleague and I assume) that when you press it the selection will turn bold. In most of these editors either the bold button exhibits random behavior or goes bold if the selection starts in bolded text. (The latter at least accurately foreshadows the behavior of the bold button: more on that later.)
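The two toolbar behaviors described above boil down to a tiny decision rule. Here is a sketch of both, assuming the selection has already been broken into "runs" of uniformly formatted text; the run shape and the policy names are made up for illustration and are not taken from any of these editors.

```javascript
// Each run is a stretch of the selection with uniform formatting,
// e.g. { text: "hello", bold: true }. (Hypothetical shape.)
// policy "all":   button is lit only if every run is bold (the Word / Google behavior)
// policy "first": button is lit if the selection starts in bold text
function boldButtonLit(runs, policy = "all") {
  if (runs.length === 0) return false;
  if (policy === "first") return runs[0].bold === true;
  return runs.every((run) => run.bold === true);
}
```

With a selection running from a bold word into plain text, the "all" policy leaves the button unlit while the "first" policy lights it.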

Assumed Hard, Left Untried

My experience as a JavaScript coder has been that there are only really two levels of difficulty for doing stuff in JavaScript — not that hard and practically impossible (and every so often someone will surprise me by doing something I assumed was practically impossible).

I knew there’s a trick for editing any web page: the following scriptlet makes the whole page editable (with spellchecking for bonus points):

javascript: document.body.contentEditable = "true"; document.body.spellcheck = true;

So, I knew that something like this was easy. What I didn’t realize was that at some point in the browser wars Microsoft implemented pretty much all of Microsoft Word (modulo some UI glitches) into Internet Explorer, and then everyone else simply implemented most of their API.

So, for example, the following scriptlet used in conjunction with the preceding one allows you to bold the current selection in any web page:

javascript: document.execCommand("bold");

If nothing else, you can have a lot of fun defacing web pages and taking screenshots:

CKEditor Home Page (Defaced)

While I knew about the contentEditable “trick”, I didn’t put all this together until, frustrated by the huge and unwieldy codebase of CKEditor (which, at bottom, is really just a custom UI library that calls execCommand) and unimpressed by the alternatives (aside from Redactor.js, which looks pretty good but is not free or open source), I found this thread on Stack Overflow.

editor.js in action on its documentation demo page

I cleaned up the example and had a working text editor in a few dozen lines of code. (I’d post them but they’re written for a client. I believe the whole project will eventually be open-sourced, but right now it hasn’t been. If and when it ever gets open-sourced, I’ll link the code. Failing that, I may write my own simpler version for personal use and integration with Foldermark.)

document.execCommand implements most of the functionality you need to write a perfectly good word-processor from scratch in a browser. In fact, if you’re willing to put up with some UI quirks, pretty much all you need to do is implement some trivial UI components. Almost all the implementations out there create a complete blank document inside an iframe by default, but it’s perfectly viable to edit inline in a div, especially if you’re planning to use the ambient styles anyway.
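To give a flavor of how little UI code is involved, here is a minimal sketch of wiring a toolbar to an inline editable div. This is not the client code mentioned above. The function name, the data-command attribute convention, and the particular set of buttons are my assumptions; the command names themselves ("bold", "italic", "underline", "insertUnorderedList") are standard execCommand commands. The sketch also assumes each button is a plain element, so that the mousedown target is the button itself.

```javascript
// Toolbar button name -> execCommand command name.
const TOOLBAR_COMMANDS = new Map([
  ["bold", "bold"],
  ["italic", "italic"],
  ["underline", "underline"],
  ["bullets", "insertUnorderedList"],
]);

// doc:     the document that owns the editor (normally window.document)
// editor:  the element to edit inline, e.g. a div
// toolbar: an element containing buttons like <button data-command="bold">B</button>
function attachEditor(doc, editor, toolbar) {
  editor.contentEditable = "true";
  // Listen for mousedown rather than click so we can stop the button from
  // stealing focus, which would collapse the selection before the command runs.
  toolbar.addEventListener("mousedown", (event) => {
    const command = TOOLBAR_COMMANDS.get(event.target.dataset.command);
    if (!command) return;
    event.preventDefault();
    doc.execCommand(command, false, null);
  });
}
```

In a page this would be called as attachEditor(document, document.getElementById("editor"), document.getElementById("toolbar")), where both ids are placeholders for whatever your markup uses.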

The beauty of writing your own word processor using execCommand is that the browser gives you fine-grained access to all events, allowing you to arbitrarily fine-tune the low-level behavior of your word-processor. Microsoft Word, for example, has always had unfathomable UI quirks.

What don’t you get?

First, you do get pretty solid table support.

You don’t get fine control over styling, although there’s nothing to stop you from implementing a CSS editor of some kind (disguised or not). From my point of view, the default behavior of the browser word-processor is to be very styles-driven, and that’s a good thing. It’s not so easy out-of-the-box to, for example, set a specific point size for text.

Some execCommand commands don’t work very well. E.g. you can implement a “hiliter” using execCommand(“backColor”…) but there’s no way to toggle it off (unlike bold), so to implement it properly you need to mess with the DOM directly, which, given the way selections are represented, can be a tad ugly.

There’s stuff that is simply difficult because you’re in a browser. E.g. without implementing some kind of service platform (or perhaps leveraging an existing one) you’re not going to be able to drag-and-drop a picture from your desktop into a document. It would be fairly straightforward, I suspect, to integrate Dropbox with a word-processor to allow drag-and-drop images and attachments; anything that can be represented by a URL is, of course, golden.

Most of the missing features from the free word-processor in your browser are what you’d expect. E.g. anything to do with overall document structure: automatic index and table of contents generation, footnotes, endnotes, headers, footers, and so forth. None of this stuff is hard to do in a browser. The real problem is support for printing — browsers generally suck at targeting printers — so you’re not going to replace Quark or InDesign — but if you’re targeting ePub rather than paper, I don’t see why you’d want to use anything else.

Final Thoughts

The advantages of “owning” your word processor’s codebase are enormous, especially if you’re trying to integrate it into a complex workflow. You can fine-tune exactly what happens when a user hits delete (e.g. we need to track dependencies: was this paragraph based on that paragraph template or not?), and what is editable or not. You can do validation inside input fields while allowing other text to be edited free-form. It’s pretty wonderful. One day, perhaps, there will be a free off-the-shelf editor that solves key UI and workflow integration issues, but we’re a ways from there.


Creating UI Atlases in Photoshop Automagically

A little over a year ago I was working on a game engine for a successful toy company. The project never ended up being finished (long, nasty story which I’ll happily tell over beers), but one of the interesting things I did for this project was build a Photoshop-to-Unity automatic UI workflow. The basic idea was:

  1. Create UI layout in Photoshop, with one “root level” layer or layer group corresponding to a control.
  2. Name the groups according to a fairly complicated naming convention (which encapsulated both behavior and functionality, e.g. how a button might change its appearance when hovered or clicked and what clicking it would do).
  3. Press a button.
  4. Select a target folder (which could be inside a Unity project’s “Resources” folder, of course).
  5. And point a script at the folder.

This worked amazingly well and allowed me to adjust to changing client requirements (e.g. random UI redesigns) very rapidly. But along the way I decided there were significant design issues with the concept, one of them being that the images needed to be texture atlases, (a) for performance reasons, but more importantly (b) because with separate images you needed to adjust import settings for each one (you can’t even select multiple images and change their import settings all at once; maybe this is fixed in Unity 4).

Another obvious problem was the embedding of behavior in names — it was convenient if you got it right the first time, but a serious pain in the ass for iterative development (either change the name in Photoshop and re-export everything or change the name in Photoshop and then edit the metadata file, and… yuck).

Anyway, I’ve had the “perfect” successor bouncing around in my head for a while and then the other day it struck me that someone probably has written a Photoshop image atlas tool already, and I might be able to rip that off and integrate it with my script.

Turns out that (a) someone has written an image atlas tool for Photoshop and (b) that the key component of that tool was a rectangle packer someone else (sadly the link is defunct) had written, implementing an algorithm documented here.
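Since the links are gone, here is a sketch of one common way to pack rectangles into an atlas: a binary-tree packer that places each rectangle in the first free region that fits, then splits the leftover space into a region to the right and a region below. I am not claiming this is the algorithm the original links pointed to or the code the tool uses; the function name and the rectangle shape ({ name, w, h }) are assumptions for illustration.

```javascript
// Packs rects ({ name, w, h }) into a width x height atlas.
// Returns placements ({ name, x, y, w, h }) or throws if something does not fit.
function packRects(rects, width, height) {
  const makeNode = (x, y, w, h) => ({ x, y, w, h, used: false, right: null, down: null });
  const root = makeNode(0, 0, width, height);

  // Depth-first search for the first unused region big enough for w x h.
  function findNode(node, w, h) {
    if (!node) return null;
    if (node.used) return findNode(node.right, w, h) || findNode(node.down, w, h);
    return w <= node.w && h <= node.h ? node : null;
  }

  // Claim the top-left corner of the node and split the remainder in two.
  function splitNode(node, w, h) {
    node.used = true;
    node.down = makeNode(node.x, node.y + h, node.w, node.h - h);
    node.right = makeNode(node.x + w, node.y, node.w - w, h);
  }

  // Placing the biggest rectangles first tends to waste less space.
  const sorted = [...rects].sort((a, b) => Math.max(b.w, b.h) - Math.max(a.w, a.h));
  const placements = [];
  for (const rect of sorted) {
    const node = findNode(root, rect.w, rect.h);
    if (!node) throw new Error(`"${rect.name}" does not fit in a ${width}x${height} atlas`);
    splitNode(node, rect.w, rect.h);
    placements.push({ name: rect.name, x: node.x, y: node.y, w: rect.w, h: rect.h });
  }
  return placements;
}
```

A real atlas builder would usually retry with a larger atlas size when packing fails, and might add padding around each image to avoid texture bleeding; both are left out here to keep the sketch short.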

So that’s what I spent New Year’s Eve doing, and the result, Layer Group Atlas, is now available on GitHub.

For the more visually-minded, you start with a UI design in Photoshop. (Stuff can overlap, you can have multiple states set up for controls, etc.) The key thing is each “root level” group/layer corresponds to an image in the final atlas (and yes, transparency/alpha will be supported). If a group/layer’s name starts with a period then it is ignored (as per UNIX “invisible files”), while a group/layer whose name starts with an underscore will only have its metadata exported.
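The naming rule is simple enough to express in a few lines. This is a sketch of the rule as described here, not the tool's actual code; the function name and the returned labels are made up.

```javascript
// Decide what to do with a root-level layer or group based on its name.
function classifyLayer(name) {
  if (name.startsWith(".")) return "ignore";        // like UNIX "invisible files"
  if (name.startsWith("_")) return "metadata-only"; // described in the JSON, not drawn into the atlas
  return "export";                                  // image in the atlas plus metadata
}
```

So ".guides" would be skipped entirely, "_hotspot" would appear only in the metadata, and "ok-button" would get both an atlas image and a metadata entry.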

For every layer (other than those which were ignored) metadata about the layer is stored in a JSON file. (One of the reasons I didn’t take this approach with my original tool was the lack of solid JSON support in Unity. I (cough) addressed that with another little project over the holiday break.) The JSON data is intended to be sufficient to rebuild the original Photoshop layout from the atlas, so it includes both where each element is in the atlas and where it was in the original layout.
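To make "sufficient to rebuild the original layout" concrete, here is a sketch of what such metadata could look like and how a consumer might turn it into draw calls. The field names (atlas, layout, and so on) are hypothetical; the real JSON emitted by the tool may be shaped quite differently.

```javascript
// Hypothetical metadata for two layers; the real tool's field names may differ.
const metadata = {
  atlas: "ui-atlas.png",
  layers: [
    // Where the image sits in the atlas, and where it sat in the Photoshop layout.
    { name: "ok-button", atlas: { x: 0, y: 0, w: 64, h: 32 }, layout: { x: 200, y: 340 } },
    // A metadata-only layer: it has a layout position but no pixels in the atlas.
    { name: "_hotspot", atlas: null, layout: { x: 10, y: 10 } },
  ],
};

// Turn the metadata into "copy this atlas rectangle to that layout position"
// instructions, skipping layers that have no image in the atlas.
function drawCalls(meta) {
  return meta.layers
    .filter((layer) => layer.atlas !== null)
    .map((layer) => ({
      name: layer.name,
      sx: layer.atlas.x, sy: layer.atlas.y, // source corner in the atlas
      sw: layer.atlas.w, sh: layer.atlas.h, // source size
      dx: layer.layout.x, dy: layer.layout.y, // destination corner in the layout
    }));
}
```

Each resulting entry maps directly onto something like a canvas drawImage call, or onto a textured quad in a game engine.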

Finally, the script creates and saves the atlas itself (as a PNG file, of course).

Aside from the CSS sprite support I mention in the comments in a TODO — an obvious thing for this tool to be able to do would be to export a bunch of CSS styles allowing the images in the atlas to be used as CSS sprites — there’s one more missing thing: a Unity UI library that consumes the JSON/atlas combination and produces a UI.
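The CSS sprite export mentioned above could be as simple as the following sketch. It assumes each layer record carries a name and an atlas rectangle ({ x, y, w, h }, or null for metadata-only layers); those field names, the function name, and the "sprite-" class prefix are all assumptions, since the feature is still a TODO.

```javascript
// Emit one CSS rule per atlas image. The negative background offsets slide
// the atlas so that the wanted image lines up with the element's top-left corner.
function spriteCss(layers, atlasUrl) {
  return layers
    .filter((layer) => layer.atlas) // metadata-only layers have no image
    .map((layer) => {
      const { x, y, w, h } = layer.atlas;
      return `.sprite-${layer.name} { background: url("${atlasUrl}") -${x}px -${y}px no-repeat; width: ${w}px; height: ${h}px; }`;
    })
    .join("\n");
}
```

Given a layer named "ok" at (32, 0) with size 16 x 16 and an atlas called "atlas.png", this produces a single ".sprite-ok" rule with a background position of "-32px -0px".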

That’s my project for tonight.