A better RAW workflow for $35

Processed and cropped in RAWPower

As my frequent reader knows, I have been grappling with my RAW workflow for as long as I have had a RAW workflow. I’m hardly a pro or even much of an enthusiast, and I find dealing with all these files exhausting (it also consumes a stupid amount of disk space, etc.).

I’ve been ambivalent about Adobe software ever since they started renting their software; this was actually before they switched to monthly fees — the Creative Suites essentially forced you to upgrade on a constant basis simply to not have your software mysteriously stop booting up when Adobe’s authentication servers were down.

Today, despite paying Adobe’s tax (albeit the lesser “Photographer’s” tax of $100/year), I remain unhappy with their products. Lightroom is slow, constantly wants patching, requires me to sign in (often more than once) to Adobe’s stupid services just to launch, and on and on. But, until recently, I had no credible alternative that was fast and produced even vaguely decent results.

But now there are two inexpensive, lightweight products that together may mean I don’t need Adobe’s crap any more (I’ll get back to you!):

A quick press of the P key and I can see that I nailed focus on the grass

FastRawViewer — endorsed by no less than Thom Hogan and Nasim Mansurov — is a terrific program that does exactly what it says on the can. It’s simple and lets you browse and rate photos really, really fast. You can customize its keyboard shortcuts to your pleasure (e.g. I have ratings mapped to the 0-5 keys, and P toggles high pass filtering so you can see exactly what, if anything, is in focus without pixel-peeping). Rather than having its own proprietary catalog system, it leverages your file system and XMP metadata (“sidecar” files that are compatible with Lightroom if that still floats your boat). It costs $20; you can get it here.

RAWPower lets me make RAW adjustments, straighten, and crop faster than I can launch Photoshop

RAWPower — developed by former Aperture engineers (or a former Aperture engineer; I’m not sure) — gives you most of Aperture’s non-destructive RAW processing in a fast, lightweight app that also provides the same functionality via Apple’s Photos app. I like the Photos app except for the whole slower-than-treacle-in-a-walled-garden thing, so there’s that too. It costs $15 in the App Store. My only issue with RAWPower is that its crop-and-rotate tool is clumsy if you want to both crop AND rotate, which I usually do (and I’ve been told that addressing this issue is a priority).

(If you’re a Windows user, FastRawViewer is still great, but RAWPower is Mac only.)

FastRawViewer lets me view a folder with thousands of RAW files with no waiting (just dragging the folder into Lightroom, Photos, or Aperture would be agony), and RAWPower lets me adjust exposure, shadow recovery, straightening, and so forth faster and just as competently as Lightroom. (Photoshop still wins for any major surgery, obviously — RAWPower has no dodge, burn, layers, healing brush, perspective correction, stitching, etc.)

Getting an Nvidia 1070 (or similar) GPU working on a Mac Pro 5,1

Victory!

I’ve been using a chipped Radeon 7950 in my 2012 Mac Pro for several years (it was a serious upgrade to my original 5770 or whatever it was that came with it) but eventually my Dell 2715Q (a 4K display) stopped working reliably with it at full resolution and I had to drop down to 1080p. Then it stopped working in 1080p.

I was pretty sure the problem was with the display (which also wasn’t working properly with my Macbook Pros), but the GPU had always been twitchy (sometimes not working on boot, and not driving all its display ports) so when Nvidia announced drivers for its latest GPUs, I figured what the heck?

Anyway, here’s the correct process along with gotchas from not doing it this way, since I found zero reliable guides online to help me.

Warning: if anything goes wrong you’ll need to screenshare into your Mac Pro from another Mac to see what’s going on, so make sure your Mac’s network connection is robust and you can screenshare into it before you do anything you’re going to regret. Luckily for me (since I fucked all this stuff up multiple times) our Macs can all “see” each other (mainly so I can get at parental controls on other Macs easily).

  1. Update your Mac to 10.12.4 (or whatever is current).
  2. Go to Nvidia’s website and download their out-of-date Mac OS X drivers, install them, and then update them in the control panel. I don’t know when you’re reading this but you want your drivers as up-to-date as possible.
  3. You may also want to install CUDA drivers, but that’s not critical.
  4. Shut down, unplug, power off, remove the Mac’s cover.
  5. I got a 1070 bundled with Mass Effect Andromeda. (Don’t care about the bundle, since I’ve got it for PS4 and hate Windows, but it was $50 cheaper than the same card without Mass Effect Andromeda. I don’t think much of Mass Effect Andromeda, but it’s definitely worth more than -$50.)
  6. The 1070 is physically a total pain to get into the Mac Pro (the 7950 seems to have been just as bad, but I have cheerfully lost all memory of it). Be careful to remove all the rubbery covers so that they don’t fall off on top of the PCI slot and cause you enormous consternation.
  7. The Mac Pro comes with two 6-pin power cables for graphics cards. The Nvidia 1070 takes one 8-pin cable, but there should be a 2x 6-pin to 8-pin adapter cable in the box. You’ll need that. Sadly it creates a lot of slack in your cables that will be snaked inside your otherwise tidy (if horribly dusty) Mac Pro.
  8. Make sure everything is securely hooked up. Close the box, plug it in, plug in displays, and boot. (If you’re using a wired network, make sure that’s plugged in.)
  9. Power on, wait for the chime, and hopefully you will be golden.

Troubleshooting

Here are the ways I fucked this up.

First, I didn’t realize the current version of Mac OS was 10.12.4, so I had 10.12.3 and installed the (January) version of the Nvidia drivers which then claimed to be up-to-date.

After I installed the card my Mac wouldn’t display jack shit from any port at any resolution. After trying two different displays and four different ports, I screen-shared into it and verified (a) it was working properly, (b) it could see the video card and recognize the vendor but couldn’t do anything with it, and (c) that the Nvidia panel could see the video card but not do anything with it.

I then found a post showing someone had successfully installed a 1080 on their Hackintosh with 10.12.4. Whoops! I installed 10.12.4 and rebooted. No dice. I went into the Nvidia panel and found it no longer claimed to be up-to-date, so I installed a new version, rebooted, and my Dell monitor came to life at a resolution I’d never seen it in before. (Easily fixed. I am now looking at my Mac Pro’s desktop in glorious 1440p, as God Steve intended.)

Serving b8r

In a former life I worked on optimizing delivery of a fairly large website. I won’t pretend I understood a fraction of the detail, but I had a pretty good idea of the big picture and in a couple of places I drilled down to the bottom-most details.

This isn’t new to anyone who pays attention, but scale makes simple things hard.

The basic tricks to getting a web page to load fast are:

Do as little as possible:

  • Make everything as small as possible.
  • Make everything as simple as possible.
  • Be as asynchronous as possible.

Do it as infrequently, fast, and in parallel as possible:

  • Minimize the round-trip time from client to server.
  • Parallelize everything as much as possible.
  • Split stuff up enough to make it parallel, but not so much as to increase the number of round-trips. (To some extent, SPDY/http2 is solving this for us.)
  • Minimize the number of round trips.
  • Maximize cache utilization.

The grandparent of bindinator was bind-o-matic, which was not designed with all of these things in mind. In particular, it made no real attempt to leverage parallelism or asynchrony. When Craig, Josiah, and I wrote bindinator, the “state of the art” was:

  • Figure out what your dependencies are.
  • Compile them into a big blob.
  • Minify the blob.
  • Give the client the blob.

Bind-o-matic’s approach was: “be so small and light that you don’t need to do clever shit to get good performance” (during development) because you can always do that later. We actually compiled our LESS on the client and it didn’t cause performance problems (once we forked less.js and sped it up a bit, and cached compiled CSS in localStorage).
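If you’re curious what that looks like, here’s a minimal sketch of the localStorage trick. It isn’t bind-o-matic’s actual code; addCompiledCss, compileLess, and the cache key scheme are all my own inventions:

// a sketch of the idea, not bind-o-matic's code;
// compileLess stands in for whatever less.js entry point you use
const addCompiledCss = (url, lessSource, compileLess) => {
  const key = `compiled-css:${url}`;          // hypothetical key scheme
  let css = localStorage.getItem(key);
  if (!css) {
    css = compileLess(lessSource);            // the expensive step, done once
    localStorage.setItem(key, css);           // later visits skip compilation
  }
  const style = document.createElement('style');
  style.textContent = css;
  document.head.appendChild(style);
};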

While almost any javascript web application architecture can be served thus, more fine-grained optimization (e.g. trying to become interactive as fast as possible) is a tougher problem, especially when you have little or no ability to do less (e.g. suppose hypothetically that almost every person in your organization is incentivized to make your application bigger, slower, and more complex…)

And you might be committed to using a big, complicated framework that is virtually guaranteed to make anything it touches bigger, more complex, and less asynchronous.

Anyway, I designed b8r to be as small, simple, and asynchronous as possible but left delivery optimization “for later”. I assumed I could just point webpack (or webpack2) or some more sophisticated tool (such as the stuff I had worked on) at anything I built later.

I did do one thing, though.

I wrote my own require implementation because I started reading the documentation for existing implementations and my eyes glazed over. In particular, none of them seemed as straightforward to use as the one I’d gotten used to in my former life (and with which I had deep familiarity).

I used to be able to write:

const foo = require('foo'); // don't even need to know the path to foo!

I think not needing to know the (relative) path to foo is an anti-feature by the way. The point is that this kind of thing normally needs a build-process to work, which adds a bunch of pain to development. I want the good part — require that just works — and no bad part.

Now, my require is purely client-side, which means that it is a big performance problem. Consider the following code:

const foo = require('path/to/foo.js');
foo(17);

The call to require must be synchronous. But what does require do behind the scenes?

  1. Use an XMLHttpRequest to pull “foo.js” from the server.
  2. Wrap a Function instance around the code inside it.
  3. Pass an object to the function.
  4. Return the “exports” property of the object when the function returns.

It really doesn’t matter how asynchronous your code is if every dependency involves halting execution while round-tripping to your server… recursively in the case of a file that, itself, has dependencies.
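Concretely, a minimal synchronous require looks something like this. It’s a sketch (no relative-path resolution, no error handling), not b8r’s actual source, but the four steps above are all there:

const _modules = {};

const require = path => {
  if (!_modules[path]) {
    const request = new XMLHttpRequest();
    request.open('GET', path, false);         // false === synchronous: the page stops here
    request.send();
    const module = { exports: {} };
    new Function('module', request.responseText)(module);  // wrap the source and run it
    _modules[path] = module.exports;          // hand back whatever the module exported
  }
  return _modules[path];
};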

This behavior actually throws warnings in Chrome…

Chrome no likee

Chrome only complains about this the first time it happens, but it happens a lot.

Now the solution for this is to compile your javascript code on the server and deliver some kind of optimized blob — site.min.js say. This is exactly what tools like webpack do — they actually watch your code tree and trigger recompiles on-the-fly. Webpack offers a dev server that actually sets up a backchannel to the client and refreshes the browser automagically when there are code changes.

Sounds pretty good, right? — but it’s about 1/10 as responsive as using b8r and just forcing refresh. I fucking hate having to compile my code all the time, even if all the compiler does is walk a tree and concatenate a shitload of files wrapped in function calls and assignment statements and then call uglifyjs.

But that’s on a local dev server. What happens when you stick this code on a real server and the round-trip goes from ~0ms to ~100ms? It turns out that on the project I’m working on it changes my web application’s spin-up (with nothing cached) from ~600ms to ~1500ms. (Aside: this is a real web application with a shitload of functionality talking to production servers with no back-end optimization. In my past life, loading in ~1500ms from a real server would have caused spontaneous orgasms. When I told people that performance like this was achievable I was assumed to be a naive fool. No, I’m not bitter!)

So, how to do all this stuff on the client?

  1. Make b8r’s use of require asynchronous. E.g. b8r synchronously loads b8r.dom.js before it finishes loading, so load its dependencies asynchronously before loading b8r itself asynchronously.
  2. Get require to warn whenever it loads a module synchronously. Ick.
  3. OK, get require to return a JSON structure of synchronously loaded modules and load them asynchronously before doing anything else.
  4. Repeat step 3 until no warnings. (This took three iterations.)
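For what it’s worth, the asynchronous path is just the same machinery wrapped in a promise. This is a sketch under my own assumptions; requireAsync is a made-up name, not part of b8r:

// hypothetical asynchronous counterpart to require: fetch the source,
// evaluate it the same way, and resolve with its exports
const requireAsync = path =>
  fetch(path)
    .then(response => response.text())
    .then(source => {
      const module = { exports: {} };
      new Function('module', source)(module);
      return module.exports;
    });

// no blocking round-trip, but now you wait on a promise
requireAsync('path/to/foo.js').then(foo => foo(17));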

Loading time from the stage server went from ~1500ms to ~600ms with no server-side optimization whatsoever. Not bad for a late night hack.

Now wouldn’t it be nice if this were all automatic?

I started writing this post on Friday evening, but my first stab at automating this didn’t work and my brain was too fried to fix it.

In order to function, require tracks all the modules it loads, and it already replaces itself recursively to handle nested requires (to allow for relative paths in require statements) so all I needed to do was track one level of dependencies, and then generate a list of preload “waves” where each wave comprises all modules with no unloaded dependencies. (Circular dependencies will be detected and throw errors.)
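The wave computation itself is only a few lines. This is my own sketch of the idea just described, not the code in b8r’s require; deps is assumed to map each module path to the paths it directly requires:

const computeWaves = deps => {
  const waves = [];
  const loaded = new Set();
  let remaining = Object.keys(deps);
  while (remaining.length) {
    // a wave is every module whose dependencies have all been loaded already
    const wave = remaining.filter(m => deps[m].every(d => loaded.has(d)));
    if (!wave.length) {
      throw new Error(`circular dependency among ${remaining.join(', ')}`);
    }
    wave.forEach(m => loaded.add(m));
    remaining = remaining.filter(m => !loaded.has(m));
    waves.push(wave);
  }
  return waves;
};

// e.g. computeWaves({'b8r.dom.js': [], 'b8r.js': ['b8r.dom.js']})
// yields [['b8r.dom.js'], ['b8r.js']]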

Oh, and this eliminates the need for b8r to do anything clever internally. The new solution is general and fixes b8r as well as everything else.

const preload_data = /* data from require.preloadData() */;
require.preload(preload_data).then(() => {
  /* ... */
});

So, now the steps are:

  1. Run require.preloadData() in the console, which spits out JSON data.
  2. Now call require.preload(), passing the data from step 1, which will generate a promise of everything loaded asynchronously.

If dependencies change, everything will still work, but dependencies that force a synchronous request will generate console warnings.

As a nice bonus, this improves the load time of the b8r demo page by over 80%.

And One More Thing…

If you don’t mind your web app not loading asynchronously the first time, I’ve added another feature to bindinator (and it’s used in the demo page)…

require.autoPreload(10000).then(() => { /* ... */ });

This automates all of the above, storing preloadData() in localStorage (the number is the delay in milliseconds before preloadData is written to localStorage; the default is 2000). If code changes cause a file to load synchronously or break the page load, the preloadData is updated or destroyed as appropriate.
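Under the hood it’s roughly the following. Again, this is a sketch of the idea rather than the actual implementation; the localStorage key and the details of invalidation are invented:

const autoPreload = (delay = 2000) => {
  const key = 'require-preload-data';        // hypothetical key
  const cached = localStorage.getItem(key);
  // once the page has settled, snapshot the dependency waves for next time
  setTimeout(() => {
    localStorage.setItem(key, JSON.stringify(require.preloadData()));
  }, delay);
  // first visit: nothing cached, so modules load one at a time as required
  if (!cached) return Promise.resolve();
  // later visits: load the recorded waves in parallel before doing anything else
  return require.preload(JSON.parse(cached)).catch(() => {
    localStorage.removeItem(key);            // stale or broken data: discard and start over
  });
};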

So, the first time you load the page it looks like this:

Browser network tab showing b8r files loading one-at-a-time

But from then on it looks like this:

network tab shows b8r files loading in parallel "waves"

You may notice that the last few items are sequential. These are chained promises. At some point I’ll add components to preloadData…

Tempted to switch to Windows

I love this laptop. I love the trackpad on the right. Where do I sign?

Before you decide my blog is suddenly interesting because I’m a hemi-demi-semi-prominent pro-Apple guy who is switching to Windows, hold your horses. It’s not. I’m not.

The correct, non-linkbaity headline should be:

“WTF Apple?! Macs suck compared to Windows PCs that are more expensive and which you can’t actually buy.”

All users of high-end Macs suffer from PC envy because there’s always a PC out there that scratches a particular hardware itch. E.g. nVidia has just released some insanely nice GPUs, and on the Mac if you’re lucky you have a fairly recent mobile version of a mid-range nVidia GPU from last generation. Not even close to the same league. Similarly, the current generation Apple notebooks use Intel’s chipset from last year and are thus limited to 16GB of RAM. 16GB of RAM is so 2012 for fuck’s sake. (Heck, my 2012 Mac Pro has 36GB of RAM; it’s really just a 2010 Mac Pro, and it’s not even trying.)

So this morning I read another “fuck you and your lame-ass hardware and USB-C ports, I’m switching to Windows” post on Hacker News. I don’t remember if it was on the comments thread of the post or HN itself (since I can’t find either anymore) but there was a discussion of what Windows laptop (“other than a Dell XPS”) to get if you want to switch and don’t want a piece of shit (i.e. pretty much any Windows laptop). The replies were illuminating (including quite a few saying pretty much, ‘what do you mean “other than a Dell XPS”, that’s a piece of shit’ — a sentiment with which I can agree based on first person experience), and pointed at the Razer Blade and Razer Blade Pro.

So I took a look.

These are sold as high end laptops — CNC milled chassis, backlit keys with programmable colors (for which they provide an API, so you can have keys light up to indicate, say, the health of your team-mates in a multi-player game), 1080P or 4K displays, and fantastic specs (e.g. nVidia 1080 in the Pro). In fact the specs were so good I simply wanted to know two things:

  • What is the battery life like?
  • How much?

The answers were:

  • We aren’t going to tell you.
  • More than a Macbook Pro (and, really, fair enough!), and by the way:
  • We don’t have any to sell and can’t tell you when we will.

Aha! Checkmate Apple. I guess there’s a reason why the laptops out there you can actually buy are either worse than Apple’s or cost about the same.