Serving b8r

In a former life I worked on optimizing delivery of a fairly large website. I won’t pretend I understood a fraction of the detail, but I had a pretty good idea of the big picture and in a couple of places I drilled down to the bottom-most details.

This isn’t new to anyone who pays attention, but scale makes simple things hard.

The basic tricks to getting a web page to load fast are:

Do as little as possible:

  • Make everything as small as possible.
  • Make everything as simple as possible.
  • Be as asynchronous as possible.

Do it as infrequently as possible, as fast as possible, and in parallel:

  • Minimize the round-trip time from client to server.
  • Parallelize everything as much as possible.
  • Split stuff up enough to make it parallel, but not so much as to increase the number of round-trips. (To some extent, SPDY/http2 is solving this for us.)
  • Minimize the number of round trips.
  • Maximize cache utilization.

The grandparent of bindinator was bind-o-matic, which was not designed with all of these things in mind. In particular, it made no real attempt to leverage parallelism or asynchrony. When Craig, Josiah, and I wrote bindinator, the “state of the art” was:

  • Figure out what your dependencies are.
  • Compile them into a big blob.
  • Minify the blob.
  • Give the client the blob.

Bind-o-matic’s approach was: “be so small and light that you don’t need to do clever shit to get good performance” (during development) because you can always do that later. We actually compiled our LESS on the client and it didn’t cause performance problems (once we forked less.js and sped it up a bit, and cached compiled CSS in localStorage).

While almost any javascript web application architecture can be served thus, more fine-grained optimization (e.g. trying to become interactive as fast as possible) is a tougher problem, especially when you have little or no ability to do less (e.g. suppose hypothetically that almost every person in your organization is incentivized to make your application bigger, slower, and more complex…)

And you might be committed to using a big, complicated framework that is virtually guaranteed to make anything it touches bigger, more complex, and less asynchronous.

Anyway, I designed b8r to be as small, simple, and asynchronous as possible but left delivery optimization “for later”. I assumed I could just point webpack (or webpack2) or some more sophisticated tool (such as the stuff I had worked on) at anything I built later.

I did do one thing, though.

I wrote my own require implementation because I started reading the documentation for existing implementations and my eyes glazed over. In particular, none of them seemed as straightforward to use as the one I’d gotten used to in my former life (and with which I had deep familiarity).

I used to be able to write:

const foo = require('foo'); // don't even need to know the path to foo!

I think not needing to know the (relative) path to foo is an anti-feature by the way. The point is that this kind of thing normally needs a build-process to work, which adds a bunch of pain to development. I want the good part — require that just works — and no bad part.

Now, my require is purely client-side, which means that it is a big performance problem. Consider the following code:

const foo = require('path/to/foo.js');
foo(17);

The call to require must be synchronous. But what does require do behind the scenes?

  1. Use an XMLHttpRequest to pull “foo.js” from the server.
  2. Wrap a Function instance around the code inside it.
  3. Pass an object to the function.
  4. Return the “exports” property of the object when the function returns.
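
To make that concrete, here’s a minimal sketch of a synchronous client-side require along the lines described above (names and details are illustrative, not b8r’s actual source):

const modules = {};
const requireSync = path => {
  if (!modules[path]) {
    const request = new XMLHttpRequest();
    request.open('GET', path, false); // false = synchronous; blocks the main thread
    request.send();
    const module = { exports: {} };
    // wrap the fetched source in a Function so it sees module and exports
    new Function('module', 'exports', request.responseText)(module, module.exports);
    modules[path] = module;
  }
  return modules[path].exports; // hand back the "exports" property
};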

It really doesn’t matter how asynchronous your code is if every dependency involves halting execution while round-tripping to your server… recursively in the case of a file that, itself, has dependencies.

This behavior actually throws warnings in Chrome…

Chrome no likee

Chrome only complains about this the first time it happens, but it happens a lot.

Now the solution for this is to compile your javascript code on the server and deliver some kind of optimized blob — site.min.js say. This is exactly what tools like webpack do — they actually watch your code tree and trigger recompiles on-the-fly. Webpack offers a dev server that actually sets up a backchannel to the client and refreshes the browser automagically when there are code changes.

Sounds pretty good, right? — but it’s about 1/10 as responsive as using b8r and just forcing refresh. I fucking hate having to compile my code all the time, even if all the compiler does is walk a tree and concatenate a shitload of files wrapped in function calls and assignment statements and then call uglifyjs.

But that’s on a local dev server. What happens when you stick this code on a real server and the round-trip goes from ~0ms to ~100ms? It turns out that on the project I’m working on it changes my web application’s spin-up (with nothing cached) from ~600ms to ~1500ms. (Aside: this is a real web application with a shitload of functionality talking to production servers with no back-end optimization. In my past life, loading in ~1500ms from a real server would have caused spontaneous orgasms. When I told people that performance like this was achievable I was assumed to be a naive fool. No, I’m not bitter!)

So, how to do all this stuff on the client?

  1. Make b8r’s use of require asynchronous. E.g. b8r synchronously loads b8r.dom.js while it is itself being loaded, so load its dependencies asynchronously before loading b8r itself asynchronously.
  2. Get require to warn whenever it loads a module synchronously. Ick.
  3. OK, get require to return a JSON structure of synchronously loaded modules and load them asynchronously before doing anything else.
  4. Repeat step 3 until no warnings. (This took three iterations.)

Loading time from the stage server went from ~1500ms to ~600ms with no server-side optimization whatsoever. Not bad for a late night hack.

Now wouldn’t it be nice if this were all automatic?

I started writing this post on Friday evening, but my first stab at automating this didn’t work and my brain was too fried to fix it.

In order to function, require tracks all the modules it loads, and it already replaces itself recursively to handle nested requires (to allow for relative paths in require statements) so all I needed to do was track one level of dependencies, and then generate a list of preload “waves” where each wave comprises all modules with no unloaded dependencies. (Circular dependencies will be detected and throw errors.)
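
Here’s a sketch of the wave computation (my reconstruction of the idea, not b8r’s actual code), where deps maps every module path to its direct, one-level dependencies (leaves map to an empty array):

const preloadWaves = deps => {
  const waves = [];
  const loaded = new Set();
  let remaining = Object.keys(deps);
  while (remaining.length) {
    // a wave is every module whose dependencies have all been loaded
    const wave = remaining.filter(m => deps[m].every(d => loaded.has(d)));
    if (!wave.length) {
      throw new Error('circular dependency among: ' + remaining.join(', '));
    }
    wave.forEach(m => loaded.add(m));
    remaining = remaining.filter(m => !loaded.has(m));
    waves.push(wave); // each wave can be fetched in parallel
  }
  return waves;
};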

Oh, and this eliminates the need for b8r to do anything clever internally. The new solution is general and fixes b8r as well as everything else.

const preload_data = /* data from require.preloadData() */;
require.preload(preload_data).then(() => {
  /* ... */
});

So, now the steps are:

  1. Run require.preloadData() in the console, which spits out JSON data.
  2. Now call require.preload(), passing the data from step 1, which returns a promise that resolves once everything has loaded asynchronously.

If dependencies change, everything will still work, but dependencies that force a synchronous request will generate console warnings.

As a nice bonus, this improves the load time of the b8r demo page by over 80%.

And One More Thing…

If you don’t mind your web app not loading asynchronously the first time, I’ve added another feature to bindinator (and it’s used in the demo page)…

require.autoPreload(10000).then(() => { /* ... */ });

This automates all of the above, storing preloadData() in localStorage (the number is the delay in milliseconds before preloadData is written to localStorage; the default is 2000). If code changes cause a file to load synchronously or break the page load, the preloadData is updated or destroyed as appropriate.
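
In case it helps, here’s roughly what I’d expect autoPreload to do internally; a hedged sketch based on the description above, with the storage key invented and the error handling (destroying stale data) omitted:

const autoPreload = (delay = 2000) => {
  const stored = localStorage.getItem('require.preloadData'); // key is illustrative
  // once loading has settled down, snapshot the dependency data for next time
  setTimeout(() => {
    localStorage.setItem('require.preloadData', JSON.stringify(require.preloadData()));
  }, delay);
  // nothing stored on first load, so resolve at once and load synchronously as before
  return stored ? require.preload(JSON.parse(stored)) : Promise.resolve();
};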

So, the first time you load the page it looks like this:

Browser network tab showing b8r files loading one-at-a-time

But from then on it looks like this:

network tab shows b8r files loading in parallel "waves"

You may notice that the last few items are sequential. These are chained promises. At some point I’ll add components to preloadData…

Heterogeneous Lists

b8r’s demo site uses a heterogeneous list to display source files with embedded documentation, tests, and examples

One of the things I wanted to implement in bindinator was heterogeneous lists, i.e. lists of things that aren’t all the same. Typically, this is implemented by creating homogeneous lists and then subclassing the list element, even though the individual list elements may have nothing more in common with one another than the fact that, for the moment, they’re in the same list.

This ended up being pretty easy to implement in two different ways.

The first thing I tried was effectively an empty (as in no markup) “router” component. Instead of binding the list to a superclass component, you bind to a content-free component (code, no content) which figures out which component you really want programmatically (so it can be arbitrarily complex) and inserts one of those over itself. This is a satisfactory option because it handles both simple and complex cases quite nicely, and didn’t actually touch the core of bindinator.

Here’s the file-viewer code in its entirety:

<script>
    switch (data.file_type || data.url.split('.').pop()) {
        case 'md':
        case 'markdown':
            b8r.component('components/markdown-viewer').then(viewer => {
                b8r.insertComponent(viewer, component, data);
            });
            break;

        case 'text':
            b8r.component('components/text-viewer').then(viewer => {
                b8r.insertComponent(viewer, component, data);
            });
            break;

        case 'js':
            b8r.component('components/literate-js-viewer').then(viewer => {
                b8r.insertComponent(viewer, component, data);
            });
            break;
    }
</script>

(I should note that this router is not used for a list in the demo site, since the next approach turned out to meet the specific needs for the demo site.)

The example of this approach in the demo code is the file viewer (used to display markdown, source files, and so on). You pass it a file and it figures out what type of file it is from the file type and then picks the appropriate viewer component to display it with. In turn this means that a PNG viewer, say, need have nothing in common with a markdown viewer, or an SVG viewer. Or, to put it another way, we can use a standalone viewer component directly, rather than creating a special list component and mixing-in or subclassing the necessary stuff in.

You’ll note that this case is pretty trivial — we’re making a selection based directly on the file_type property, and I thought it shouldn’t be necessary to use a component or write any code for such a simple case.

The second approach was that I added a toTarget called component_map that let you pick a component based on the value of a property. This maps onto a common JSON pattern where you have a heterogeneous array of elements, each of which has a common property (such as “type”). In essence, it’s a toTarget that acts like a simple switch statement, complete with allowing a default option.

The example of this in the demo app is the source code viewer which breaks up a source file into a list of different kinds of things (source code snippets, markdown fragments, tests, and demos). These in turn are rendered with appropriate components.

This is what a component_map looks like in action:

<div
  data-list="_component_.parts"
  data-bind="component_map(
    js:js-viewer|
    markdown:markdown-viewer|
    component:fiddle|
    test:test
  )=.type"
>
  <div data-component="loading"></div>
</div>

From my perspective, both of these seem like pretty clean and simple implementations of a common requirement. The only strike against component_map is obviousness, and in particular the quasi-magical difference between binding to _component_.parts vs. .type, which makes me think that while the latter is convenient to type, forcing the programmer to explicitly bind to _instance_.type might be clearer in the long run.

P.S.

Anyone know of a nice way to embed code in blog posts in WordPress? All I can find are tools for embedding hacks in wordpress.

Announcing bindinator.js

Having recently set up bindinator.com, I am “officially” announcing my side-project bindinator.js (formerly Bind-O-Matic.js). It’s a small (currently 7kB gzipped and minified) Javascript library that is designed to make developing in vanilla javascript better in every way than using one or more frameworks. It embodies my current ideas about Javascript, Web, and UI development, and programming — for whatever that’s worth.

Also, I’m having a ton of fun hacking on it.

By way of “dogfooding”, I’m simultaneously building a skunkworks version of my main work project (which is an Electron-based desktop application) with it, adapting any code I can over to it, building b8r’s own demo environment, and slowly porting various other components and code snippets to it.

Above is my old galaxy generator, updated with a bunch of SVG goodness, and implemented using b8r (it was originally cobbled together using jQuery).

Why another framework?

I’ve worked with quite a few frameworks over the years, and in the end I like working in “vanilla” js (especially now that modern browsers have made jQuery pretty much unnecessary). Bindinator is intended to provide a minimal set of tools for making vanilla js development more:

  • productive
  • reusable
  • debuggable
  • maintainable
  • scalable

Without ruining the things that make vanilla js development as pleasant as it already is:

  • Leverage debugging tools
  • Leverage browser behavior (e.g. accessibility, semantic HTML)
  • Leverage browser strengths (e.g. let it parse and render HTML)
  • Be mindful of emerging ideas (e.g. semantic DOM, import)
  • Super fast debug cycle (no transpiling etc.) — see “leverage debugging tools”
  • Don’t require the developer to deal with different, conflicting models

The last point is actually key: pretty much every framework tries to abstract away the behavior of the browser (which, these days, is actually pretty reasonable) with some idealized behavior that the designer(s) of the framework come up with. The downside is that, like it or not, the browser is still there, so you (a) end up having to unlearn your existing, generally useful knowledge of how the browser works, (b) learn a new — probably worse — model, and then (c) reconcile the two when the abstraction inevitably leaks.

Being Productive

Bindinator is designed to make programmers and designers more (and separately) productive, decouple their activities, and be very easy to pick up.

To make something appear in a browser you need to create markup or maybe SVG. The easiest way to create markup or SVG that looks exactly like what you want is — surprise — to create what you want directly, not write a whole bunch of code that — if it works properly — will create what you want.

Guess what? Writing Javascript to create styled DOM nodes is slower, more error-prone, less exact, probably involves writing in pseudo-languages, adds compilation/transpilation steps, doesn’t leverage something the browser is really good at (parsing and rendering markup), and probably involves adding a mountain of dependencies to your code.

Bindinator lets you take HTML and bind it or turn it into reusable components without translating it into Javascript, some pseudo-language, a templating language, or transpilation. It also follows that a designer can style your markup.

Here’s a button:

<button class="awesome">Click Me</button>

Now it’s bound — asynchronously and by name.

<button class="awesome" data-event="click:reactor.selfDestruct">
  Click Me
</button>

When someone clicks on it, an object registered as “reactor” will have its “selfDestruct” property (presumably a function) called. If the controller object hasn’t been loaded, b8r’s event handler will store the event and replay it when the controller is registered.
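
For completeness, the controller side might look something like this. (Hedging: b8r.register here is the global analogue of the component-script register sugar described below, and the (evt, element) handler signature is my assumption.)

const b8r = require('b8r');
b8r.register('reactor', {
  selfDestruct: (evt, element) => {
    element.disabled = true; // no second clicks while we melt down
    // ...initiate the actual self-destruct sequence here
  }
});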

Here’s an input:

<input type="range">

And now its value is bound to the fuel_rod_position of an object registered as “reactor”:

<input type="range" data-bind="value=reactor.fuel_rod_position">

And maybe we want to allow the user to edit the setting manually as well, so something like this:

<input type="range" data-bind="value=reactor.fuel_rod_position">
<input type="number" data-bind="value=reactor.fuel_rod_position">

…just works.

Suppose later we find ourselves wanting lots of sliders like this, so we want to turn it into a reusable component. We take that markup, and modify it slightly and add some setup to make it behave nicely:

<input type="range" data-bind="value=_component_.value">
<input type="number" data-bind="value=_component_.value">
<script>
 const slider = findOne('[type="range"]');
 slider.setAttribute('min', component.getAttribute('min') || 0);
 slider.setAttribute('max', component.getAttribute('max') || 10);
 register(data || {value: 0});
</script>

This is probably the least self-explanatory step. The script tag of a component executes in a private context where there are some useful local variables:

component is the element into which the component is loaded; findOne and find are syntax sugar for component.querySelector and component.querySelectorAll (converted to a proper array) respectively; and register is syntax sugar for registering the specified object under the component’s unique id.

And save it as “slider-numeric.component.html”. We can invoke it thus:

<span 
  data-component="slider-numeric"
  data-bind="component(value)=reactor.fuel_rod_position"
></span>

And load it asynchronously thus:

const {component} = require('b8r');
component('slider-numeric');

To understand exactly what goes on under the hood, we can look at the resulting markup in (for example) the Chrome debugger:

Chrome debugger view of a simple b8r component

Some things to note: data-component-id is human-readable and tells you what kind of component it is. The binding mechanism (change and input event handlers) is explicit and self-documented in the DOM, and the binding has become concrete (_component_ has been replaced with the id of that component’s instance). No special debugging tools required.

Code Reuse

Bindinator makes it easy to separate presentation (and presentation logic) from business logic, making each individually reusable with little effort. Components are easily constructed from pieces of markup, making “componentization” much like ordinary refactoring.

A bindinator component looks like this:

<style>
  /* style rules go here */
</style>
<div>
  <!-- markup goes here -->
</div>
<script>
  /* component logic goes here */
</script>

All the parts are optional. E.g. a component need not have any actual markup at all.

When a component is loaded, the HTML is rendered into DOM nodes, the script is converted into the body of a function, and the style sheet is inserted into the document head. When a component is instanced, the DOM elements are cloned and the factory function is executed in a private context.
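
Here’s a rough sketch of that lifecycle; all the names are illustrative and the real implementation surely differs:

const makeComponent = source => {
  const parsed = document.createElement('div');
  parsed.innerHTML = source;
  const style = parsed.querySelector('style');
  if (style) document.head.appendChild(style); // style sheet goes into the document head
  const script = parsed.querySelector('script');
  // the script becomes the body of a factory function
  const factory = new Function('component', 'data', script ? script.textContent : '');
  if (script) script.remove(); // the script is code, not markup to clone
  return { view: parsed, factory };
};

const insertInstance = (component, target, data) => {
  // clone the DOM elements, then run the factory in its "private context"
  [...component.view.children].forEach(node => target.appendChild(node.cloneNode(true)));
  component.factory(target, data);
};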

Debugging

Bindinator is designed to have an incredibly short debug cycle, to add as little cognitive overhead as possible, and work well with debugging tools.

To put it another way, it’s designed not to slow down the debug cycle you’d have if you weren’t using it. Bindinator requires no transpilation, templating languages, or parallel DOM implementations; it’s designed to leverage your existing knowledge of the browser’s behavior rather than subvert and complicate it; and if you inspect or debug code written with bindinator you’ll discover the markup and code you wrote where you expect to find them. You’ll be able to see what’s going on by looking at the DOM.

Maintenance

If you’re productive, write reusable (and hence DRY) code, and your code is easier to debug, your codebase is likely to be maintainable.

Scale

Bindinator is designed to make code scalable:

Code reuse is easy because views are cleanly separated from business logic.

Code is smaller because bindinator is small, bindinator code is small, and code reuse leads to less code being written, served, and executed.

Bindinator is designed for asynchrony, making optimization processes (like finessing when things are served, when they are loaded, and so forth) easy to employ without worrying about breaking stuff.

Core Concepts

Bindinator’s core concepts are event and data binding (built on the observation that data-binding is really just event-binding, assuming that changes to bound objects generate events) and object registration (named objects with properties accessed by path).
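
That observation is small enough to sketch. This is the idea in miniature, a toy rather than b8r’s internals:

const registry = {};                       // named objects, properties accessed by path
const listeners = [];
const register = (name, obj) => { registry[name] = obj; };
const set = (path, value) => {             // a data change is just an event...
  const [name, key] = path.split('.');
  registry[name][key] = value;
  listeners.forEach(fn => fn(path, value));
};
const bind = (element, prop, path) =>      // ...and a binding is just a listener
  listeners.push((changed, value) => {
    if (changed === path) element[prop] = value;
  });

So set('userPrefs.uiFont', 'Avenir') notifies every binding on that path, which is all a data-bind attribute is really asking for.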

Bindinator provides a bunch of convenient toTargets — DOM properties to which you might want to write a value, in particular value, text, attr, style, class, and so forth. In most cases bindings are self-explanatory, e.g.

data-bind="style(fontFamily)=userPrefs.uiFont"

There are fewer fromTargets (value, text, and checked) which update bound properties based on user changes — for more complex cases you can always bind to methods by name and path.

Components are simply snippets of web content that get inserted the way you’d want and expect them to, with some syntax sugar for allowing each snippet to be bound to a uniquely named instance object.

And, finally, b8r provides a small number of convenience methods (which it needs to do what it does) to make it easier to work with ajax (json, jsonp), the DOM, and events.

The Future

I’m still working on implementing literate programming (allowing the programmer to mix documentation, examples, and tests into source code), providing b8r-specific lint tools, building out the standard control library (although mostly vanilla HTML elements work just fine), and adding more tests in general. I’m tracking progress publicly using Trello.

Usability on the Underside

A minimalist cocoa app

I’ve always thought it ironic that Apple makes the most usable computers and the least usable APIs. I’m referring, of course, to AppleScript — just kidding. At first I thought it was a huge failing (and belies Steve Jobs’s claimed obsession with making even the stuff the user never sees beautiful; but then he likely never looked at APIs); then I excused it as Apple wanting to make dealing with its APIs a pons asinorum that less capable programmers wouldn’t be able to cross.

(Note: I’m not kidding that AppleScript is horrible. Just that it’s twenty years old and that horse has been beaten to death.)

I think I was right the first time.

But, it has gotten a lot better.

Perhaps the single most impressive piece of software Apple ever shipped was HyperCard. There was, basically, nothing wrong with HyperCard that couldn’t have been easily fixed by version 3. The chief problems with HyperCard were that:

  • Images were not a first class entity. (In VB3, for example, you could load an image into an image variable much like you could load a text file into a string variable in almost any language with decent string manipulation, i.e. not C/C++ or Pascal, but every other popular language.)
  • Binary blobs were not a first class entity (something that you’d probably implement for free while implementing the above).
  • The UI gadgets weren’t native UI gadgets.
  • There was no bridge to the native Toolbox short of writing a plugin in C/C++ or using the crazy-but-genius third party HyperTalk compiler.

That may seem like a lot of key flaws, but there’s nothing there that can’t be solved with a bunch of rudimentary programming. Consider what HyperCard got right:

  • First of all, it was shockingly stable — you could often work on HyperTalk projects for days with no serious crashes (let alone System reboots). This was in an era when computers (both Mac and PC) crashed like crazy.
  • You could quickly build user interfaces with drag-and-drop, including bitmap drawing tools
  • State was persistent by default
  • Every application was its own database
  • By default you ran inside the development environment — you lived in the development environment
  • Shockingly fast — once HyperTalk 2 came out with its JIT compiler, performance was amazingly good (see note below).
  • Its language was amazingly accessible — both readable and writable — more readable than AppleScript and far, far more writable. Most people, including non-coders, could figure out how to do simple stuff within a few hours.
  • Awesome levels of introspection (which allowed metaprogramming)
  • First class debugging tools
  • Always-available REPL

Many of these virtues have yet to be matched by any programming language. VB3 essentially took a few of these virtues, replaced the incredibly easy to work with but hard-to-implement HyperTalk with the very easy to use and already-implemented BASIC, fixed all the obvious faults, and became the foundation of Microsoft’s entire approach to software development. And, say what you will about Microsoft, they do make good development tools.

Note on performance: we had an ambitious multimedia project written in VB3 that barely ran on maximally configured Windows PCs. (We were using then-bleeding-edge 486 DX2/66 desktops for development, and our target platform was the then just-shipped IBM Thinkpad 750, which cost $11,000 in Australia and had a 486DX4/75 CPU by my recollection; it was also the first name-brand MS-DOS compatible laptop with both color graphics and sound.) I cloned the project on my aging Quadra 700 (68040 25MHz) using HyperCard and it ran circles around the PCs.

Usable Languages

It’s not easy to create a really approachable programming language.

Obviously.

It’s been done a few times — notably with BASIC. Javascript is a major achievement, but compared to HyperTalk it’s still very difficult to learn. Javascript essentially thrived by being ubiquitous, indispensable, and based on a simplified subset of the widely understood C-ish syntax:

if (x == y && z != q){ p = foo ? bar : baz; }

could be in any one of a dozen C-ish languages, including Javascript. HyperCard was based on a simplified subset of the even more widely understood English syntax:

get the width of button ok
get it * 2
set the width of button ok to it

Javascript was chiefly indispensable because browsers lacked obvious functionality and simply wouldn’t address it — so instead of simple ways to center stuff or make image-buttons change when the user clicked on them, we got a weird new programming language. (E.g. it took the addition of CSS for us to make it possible for something to change its appearance based on mouse events without programming, and having to code CSS is hardly an improvement over having to code in Javascript.) I’m not sure if this was a conspiracy by Netscape to make Javascript popular — never ascribe to any other cause that which is adequately explained by incompetence — but it really took a huge amount of community effort for programming in Javascript to become even vaguely tolerable.

When I finally was forced to get good at programming Javascript (around 2005), just figuring out how to get some kind of debugger working was a major pain in the neck that many developers never bothered to figure out, and writing browser-portable code was so difficult that most developers threw up their hands and just targeted IE6.

My point is that Javascript isn’t a usable language, it’s just more usable than C or Java (not saying much) and supported by a ubiquitous and indispensable environment (the browser) which has grown to have many of the virtues of HyperCard over time (e.g. browsers are pretty stable now, you get a REPL and a decent debugger) while retaining many of the faults (native UI elements are conspicuously missing, and would be amazingly useful given how much effort goes into making half-assed faux elements).

(It’s worth noting that the web was at least partially inspired by HyperCard. The syntax for comments:

<!-- comment -->

looks to me like a little ode to HyperCard, whose comment syntax was

-- comment

although I’m told they both got it from some precursor; still the influence of HyperCard on the web is pervasive.)

I’m not saying that we should have to write

set the borderLeft of the style of button "OK" to "2px solid black" 

instead of the far more economical (but hardly intuitive):

$("#ok").css({borderLeft: "2px solid black"})

but the fact all web developers find the latter programming style tolerable is something of a miracle, and relies on the rise of jQuery — the currently preferred library for papering over the manifold stupidities of web browsers.

I think it’s safe to say that something like

find("button.ok").style.borderLeft = "2px solid black"

would be reasonably intuitive, C-ish, and decently economical, but despite 25 years of progress and a huge amount of effort by some of the best programmers working at some of the richest companies in the world, the best we can come up with right now out-of-the-box is something like:

document.querySelector("button.ok").style.borderLeft = "2px solid black"

which is both far less readable and less intuitive than the HyperTalk version and yet less concise.

A tangential rant on idiotic naming and the misuse of namespaces

How is it that despite everything being namespaced to hell and back these days, “querySelector” isn’t called — I don’t know — “find”? Why not have a global function named “find” for fuck’s sake, or a global find object that supported find.byURL(), find.byCSSSelector(), and so on? What’s the point of sticking everything in its own private namespace if you can’t give things people use constantly a short, meaningful, intuitive name?
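
Here’s what the rant is asking for, sketched in three lines; the names come from the paragraph above, the semantics are hypothetical:

const find = selector => document.querySelector(selector); // one element, as in the example above
find.byCSSSelector = find;
find.byURL = url => fetch(url).then(response => response.text());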

End rant.

Which gets us back to Apple

The first thing in the original Inside Macintosh (after introductory stuff) was a minimalist Macintosh application (I can’t remember whether it was in Pascal or C) that launched, opened up a Window with a text editor inside it, and had a menu and a main event loop. It was something like four pages of code. Note that this application did not handle undo, did not save or load files (maybe it did), and didn’t support multiple documents, and it didn’t behave nicely with other applications (e.g. redraw its Window after being sent to the background) because MacOS was single-tasking at the time. It was barely more than “hello world” in a Window.

This was because, in the original toolbox, when you created a window you had to do things like track mouse behavior in the window’s close box yourself. The fact you didn’t need to handle every keystroke inside the text editor was actually one of the miraculously cool things about the Mac toolbox in 1984 (but the Inside Mac documentation and natively-hosted compilers were still a couple of years away in 1984).

Thanks to over thirty years of progress, and the efforts of some of the greatest programmers and the most design-and-usability-focused (and now richest) company the world has ever known we can now do the equivalent of the above in — I have no idea how much code, so let’s find out. I tried googling for an example somewhere and the best I could come up with was this pure-code Obj-C “Hello World” for iOS.

This approach isn’t necessarily the first thing you should learn when learning to develop applications for a platform, but it should probably be the second or third. You should have a pretty good idea how absolutely everything works at some level.

#import <Foundation/Foundation.h>
#import <UIKit/UIKit.h>

@interface MyDelegate : UIResponder< UIApplicationDelegate >
@end

@implementation MyDelegate
- ( BOOL ) application: ( UIApplication * ) application
           didFinishLaunchingWithOptions: ( NSDictionary * ) launchOptions {
  UIWindow *window = [ [ [ UIWindow alloc ] initWithFrame: 
    [ [ UIScreen mainScreen ] bounds ] ] autorelease ];
  window.backgroundColor = [ UIColor whiteColor ];
  UILabel *label = [ [ UILabel alloc ] init ];
  label.text = @"Hello, World!";
  label.center = CGPointMake( 100, 100 );
  [ label sizeToFit ];
  [ window addSubview: label ];
  [ window makeKeyAndVisible ];
  [ label release ];

  return YES;
}
@end

int main( int argc, char *argv[ ] )
{
  UIApplicationMain( argc, argv, nil, NSStringFromClass( [ MyDelegate class ] ) );
}

That’s actually a lot better than I would have guessed, and a gigantic improvement over writing a minimal Mac application in 1986 (but about on par with writing a MacApp 2.x application in 1993). I can’t find a similar example for Swift, and I’d prefer to implement a desktop application (as its minimal functionality is less minimal than an iPhone app, which, for example, is killed by the OS rather than being quit). So I used this longer but similarly minimal Obj-C desktop example as a starting point. Note that this example doesn’t even display “Hello World”, so I added that.

One thing I really like about this second example is that it doesn’t require defining a new class or subclass — like the original Inside Mac example it’s a function, so it doesn’t seem as much like magic. (The magic version is: create an instance of the Application class, call its run() method, and you’re done, but now WTF does that class do?) Sure it’s dealing with objects, but it’s really clear what’s going on and where you need to drill down to understand a particular thing better. Similarly, it gives you a very good idea of (for example) what loading a XIB (or NIB) does for you and where it fits into the application’s life cycle.

import Cocoa

func main() -> Int {
    // give me an application instance
    let app = NSApplication.sharedApplication()
    app.setActivationPolicy(.Regular) // no clue if this is necessary or default
                                      // behavior would be fine
    
    // build out our menu (iOS apps do not need to do any of this)
    let menubar = NSMenu()
    let appMenuItem = NSMenuItem()
    menubar.addItem(appMenuItem)
    app.mainMenu = menubar
    let appMenu = NSMenu()
    let appName = NSProcessInfo.processInfo().processName
    let quitTitle = "Quit " + appName
    
    // the only thing we're making that actually DOES something is implementing 
    // the Quit menu item
    let quitMenuItem = NSMenuItem(title: quitTitle, action: "stop:", keyEquivalent: "q")
    quitMenuItem.target = app
    appMenu.addItem(quitMenuItem)
    appMenuItem.submenu = appMenu
    
    // this is where we create our window and completely define how it looks -- it 
    //could be entirely replaced by loading a XIB
    let window = NSWindow(
        contentRect: NSRect(x: 0, y: 0, width: 400, height: 400),
        styleMask: NSTitledWindowMask,
        backing: .Buffered,
        `defer`: false // I assume defer is a keyword
    )
    window.cascadeTopLeftFromPoint(NSPoint(x: 20, y: 20)) // playing nice with 
                                 // other apps -- iOS apps don't do this either
    window.title = appName // iOS apps don't have a title, and we don't really 
                           //need to set this
    
    // now we're putting "Hello, world" in nice big letters in the Window
    let label = NSText(frame: NSRect(x: 0, y: 250, width: 400, height: 40))
    label.editable = false
    label.selectable = false
    label.string = "Hello, world"
    label.font = NSFont(name: "Helvetica Neue", size: 30.0)
    label.alignment = .Center
    label.backgroundColor = NSColor.windowBackgroundColor()
    window.contentView?.addSubview(label)
    
    // Having defined the window we're good to go
    window.makeKeyAndOrderFront(nil)
    app.activateIgnoringOtherApps(true)
    app.run() // Magic happens here!
    return 0;
}

I think this is really neat. You can copy and paste this code into an XCode Swift Playground and invoke main() and voila!

This is longer than the iPhone example, but a lot of the lines could be skipped if I went for brevity. If I didn’t prettify the text or define lots of constants (per the example I copied) then it would be just as short as the iOS example while, I think, using less magic and being clearer in what it does. I’m particularly proud of figuring out that I could get the Quit item to send a message (“stop:”) to NSApp without creating a selector referring to the local context (which would have forced me down the “define a subclass and yada yada magic happens” route… I think — I tried just defining a local function and passing its name as a selector but that did not work). That said, the keyboard shortcut for the quit menu item doesn’t work (as I note in the comments).

I actually don’t think the current Apple APIs are all that bad. They’re about on par with MacApp. There is a pretty big impedance mismatch between what the base initializers for the different UI elements do and what you would want them to do if you wanted to write applications this way, which I suspect is because almost no-one does, but while defaults may not look good, they’re not dysfunctional. I could just create the label and set its string property and it would work fine — but why is the default background color not the window background color or — better — transparent? why is the default font not the standard font and size for a label or a text field but something else entirely (the default text settings for a 1989 NeXT machine perhaps)?

I think the problem is that this isn’t where learning app development starts (after, perhaps, a simpler, more engaging introduction — look how I can create a “Hello, world” app in one minute using XCode). OK, fine, but how the hell would I make an application from scratch if I needed to? How is all this being hooked together? How does the Window hook up to its view controller? How can I use the same view controller for different views? (By the way, this is not addressed by my little example.) Understanding how the basic wiring works also makes XCode’s ridiculously complicated UI more comprehensible — since you now know certain things exist, you know to look for them, and you also know how to simply make stuff work without guessing where it’s buried in the XCode UI.

That said, this is far better than I expected. It would be great if there were initializers that allowed a typical task to be completed in one line (e.g. create a label, position it, and set its content) and if there were better defaults (and the label’s default font settings corresponded to reasonable expectations). If you omit the call to NSWindow.cascadeTopLeftFromPoint then the window appears in a really stupid place. Similarly, why isn’t there a NSMenuItem initializer that lets you stick the new item in a menu? I’m clearly missing something with respect to event handling, and that’s because the relevant initializers aren’t making it easy to do the common, correct thing. I don’t think you can argue that the badness of Apple’s Cocoa APIs is acting as a pons asinorum. In fact it’s more like it’s creating extra work for mediocre programmers.

(Correction: ignore the stuff I say about the menu item not working and the correct event wiring not being implemented by default — by putting a capital “Q” as the parameter for the keyboard equivalent I inadvertently made the shortcut command+shift+q which was why it appeared not to work when I didn’t bother to look at my menu. So my estimation of the APIs improves by a small notch — better default behavior but more magic going on: how does the window know it belongs to app?)

Learning to write Unity Shaders

Dynamically generated infinite scrolling terrain with a custom shader (inside the Unity editor)

Unity has replaced Photoshop in terms of adding features faster than I can assimilate them. It’s been possible to write custom shaders for years, but every time I tried to do something non-trivial I would give up in frustration. Recently, however, I finally wrote some pretty nice dynamic terrain and dynamic planet code and was very frustrated with shading options.

What I wanted to do, in essence, was embed multiple tiling textures inside a single texture, and then have a shader continuously interpolate between a pair of those textures based on altitude, biome, or whatever. This doesn’t appear to be something other people are doing, so I was not going to be able to take an existing shader and tweak a couple of things to make it work. I’d actually need to understand what I was doing.

If you look at the picture, you’ll see seamless transitions (based on altitude) between a sand dune texture (at sea level) and a forest texture (at middle levels) and further up the beginning of a transition to bare rock. I’ve got a darker blue-tinged rock below the sand, so the material I’m using looks like this:

A single texture that contains multiple tiling textures. Most of the texture is blank, but you get the idea.

Obviously there’s room for expansion. I could do some really interesting things with this technique (even more so if I can interpolate between three or four samples without choking the GPU). I haven’t figured out how to benchmark this stuff — I’m not seeing any hit on the GPU using Unity’s profiler, but I haven’t tried running this on a mobile device — so far, this shader seems to run just as fast as (say) a standard diffuse shader.

How to Start

Writing shaders is actually pretty simple. The big problems are finding useful documentation (I couldn’t find any useful documentation on ShaderLab, but it turns out that nVidia’s cg documentation appears to do the trick) and tutorials (I couldn’t find any). It doesn’t help that Unity has made radical changes to its Shader language over time (not really their fault, the underlying GPU architectures have been in flux) which makes a lot of the tutorials you do find worse than useless.

For the record, I’m still using Unity 4.6.x so this may all be obsolete in Unity 5.x. That said, I was working off the latest Unity 5.x online documentation.

The closest thing to useful tutorials I could find is in the Unity documentation — specifically Surface Shaders Examples. Sadly, you’re going to need to infer a great deal from these examples, because I couldn’t find explanations of the simplest things (e.g. how data gets from the shader’s UI to the actual pixel shader code — there’s a lot of automagical linking going on).

Shader "Custom/DynamicTerrainShader" {
	Properties {
		_MainTex ("Base (RGB)", 2D) = "white" {}
		_Color ("Main Color", Color) = (0,0,0,1)
		_WorldScale("World Scale", Float) = 0.25
		_AltitudeScale("Altitude Scale", float) = 0.25
		_TerrainBands("Terrain Bands", Int) = 4
	}
	SubShader {
		Tags { "RenderType"="Opaque" }
		LOD 200
		
		CGPROGRAM
		#pragma surface surf Lambert

		sampler2D _MainTex;
		float _WorldScale;
		float _AltitudeScale;
		float4 _Color;
		int _TerrainBands;

		struct Input {
			float3 worldPos;
			float2 uv_MainTex;
            float2 uv_BumpMap;
		};

		void surf (Input IN, inout SurfaceOutput o) {
			float y = IN.worldPos.y / _AltitudeScale + 0.5;
			float bandWidth = 1.0 / _TerrainBands;
			float s = clamp(y * (_TerrainBands + 2) - 2, 0, _TerrainBands);
			float t = frac(s);
			t = t < 0.25 ? t * 0.5 : ( t > 0.75 ? (t - 0.75) * 0.5 + 0.875 : t * 1.5 - 0.25);
			float band = floor(s)  * bandWidth;
			float2 uv = frac(IN.uv_MainTex * _WorldScale) * float2(bandWidth - 0.006, bandWidth - 0.006) + float2(0.003, 0.003); // explicit float2(...) constructors; a bare (a, b) is the comma operator
			uv.y = uv.y + band;
			float2 uv2 = uv;
			uv2.y = uv2.y + bandWidth;
			half4 c = tex2D(_MainTex, uv) * (1 - t) + tex2D(_MainTex, uv2) * t;
			o.Albedo = c.rgb * 0.5;
			o.Alpha = c.a;
		}
		ENDCG
	} 
	FallBack "Diffuse"
}

So here’s my explanation for what it’s worth.

To refer to properties in your code, you need to declare them (again) inside the shader body. Look at occurrences of _MainTex as an instructional example. As far as I can tell you have to figure out which parameters are available where, and which type declarations in the shader body correspond with the (different) types in the properties declaration by osmosis.

The Input block is where you declare which bits of “ambient” information your shader uses from the rendering engine. Again, what you can declare in here, and what it means, you simply have to figure out from examples. I figured out how worldPos worked from the example which turns the soldier into strips.

Note that the Input block declaration determines what is passed as Input (referred to as IN in the body). The way the declaration works (you declare the type rather than the variable) is a bit puzzling but it kind of makes sense. The SurfaceOutput object is essentially a set of parameters for the pixel that is about to be rendered. So the simplest shader body would simply be something like o.Albedo = float3(1,0,0), which would be the constant color red (depending on the basic shader type, lighting would or wouldn’t be applied, etc.).

Variables and calculations are all about vectors. Basically everything is a vector. A 3D point is a float3, a 2D point is a float2. You can add and multiply vectors (component by component) so (1,0.5) + (-0.5, 0.25) -> (0.5,0.75). You can mix scalars and vectors in some obvious and not-so-obvious ways (hint: it usually pays to be explicit about components).

The naming conventions are interesting. For vectors, you can use x, y, and z as shorthand for accessing specific components. I’m not sure if the fourth coordinate is w or a or something else. I’m also pretty sure that spatial coordinates are not in the order I think they are, so I do ok with foo.x, but get into trouble if I try to handle specific components via (,,) expressions. Hence lines like uv.y = uv.y + band instead of uv = uv + (0,band,0) (which doesn’t work).

You may have noticed some handy functions such as floor and frac being used and wonder what else there is. I couldn’t find any list or references on the Unity website, but eventually found this cg standard library documentation on nVidia’s website (for its shader language). Everything I tried from this list seemed to work (on my nVidia-powered Macbook Pro).

If you’re looking for control structures and the like, I haven’t found any aside from the ternary operator — condition ? value-if-condition-true : value-if-condition-false — which is fully supported, can be nested, and so on. This alone would probably have driven me away just five years ago, before I learned to stop worrying and love the ternary operator.

Why no switch statements, loops and such? I’m writing a pixel shader here and I suspect it relies on every program executing the same instruction at the same time, so conditional loops are out. (Actually I may be wrong about this — see the cg language documentation. I don’t know how closely ShaderLab corresponds to cg though.)

Once you see that list of functions you’ll understand why I am using piecewise linear interpolation between the materials (it looks just fine to me).
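
For what it’s worth, here’s that piecewise interpolation transcribed into JavaScript, where t is the fractional position between adjacent texture bands:

const blend = t => t < 0.25
  ? t * 0.5                    // shallow slope: dwell on the lower texture
  : t > 0.75
    ? (t - 0.75) * 0.5 + 0.875 // shallow slope: dwell on the upper texture
    : t * 1.5 - 0.25;          // steeper linear ramp through the middle

It’s continuous (both pieces give 0.125 at t = 0.25 and 0.875 at t = 0.75) and flattens near 0 and 1, so each band reads as a pure texture for longer before blending into its neighbor.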

Some final points — the shader had terrible problems with the edges of the sub-textures until I changed the bitmap sampling to point (versus linear or trilinear interpolation). I suspect this may be a wrapping issue, but (as you may find) debugging these suckers is not easy.

One final comment — even though you’re essentially writing assembler for your GPU, shader programming is pretty forgiving — I haven’t crashed my laptop once (although Unity itself seems to periodically die during compiles).