Apple's Vision Pro actually looks like it will be comfortable to wear…

I don’t think anyone will seriously dispute Apple’s claim that their newly unveiled (but unreleased) Vision Pro headset is “the most technologically advanced consumer product ever released”. (I’m quoting from memory so this is probably not exactly right.) This thing is simply a monster. 5000 patents sounds like a lot but… no, it’s a lot. The tech moat that Apple has with this thing is breathtaking.

In broad strokes, it’s pretty much what I expected in terms of UX thinking, and pretty much what the wildest rumors said in terms of hardware (even the rumor a lot of folks dismiss, that they render your face and eyes on a curved outward-facing OLED display, because of course they do).

Here’s what Apple just announced:

  • The best display technology, ever. Sold just on that basis.
  • The best immersive VR optics ever, by far (>4K per eye, 3 element design).
  • A truly breathtaking sensor package. (Dual LiDAR!)
  • Modular fit customized to your head, leveraging their LiDAR tech (you’ll probably be able to scan your head with your phone in the Apple Store app when ordering the headset).
  • A new OS that has deeply thought out interaction in XR
    • Dialing in your level of immersion
    • Showing people near you your level of immersion
    • Dynamically punching through immersion in response to nearby events (like people walking up to you)
    • Eliminating the need for controllers by using gesture and voice recognition and seamless integration with other hardware (…and watch this space)
    • Deep and subtle feedback on eye-tracking
  • Across the board they’re doing seamless AI processing on virtual camera inputs, so you are composited in 3D at “camera” level—you give a thumbs-up gesture and the camera feed provided to Zoom includes the virtually composited fireworks…
  • They’re capturing facial expressions and modeling your face so that they can render a fake you without the goggles on in real time and provide that as a “camera” feed to FaceTime et al—and this, while imperfect, apparently looks “better than a typical zoom call”. (Consider the dodgy green screen and background blur effects people often use.)
  • Privacy: they make it clear when the device is being used to capture video.

I’ve probably forgotten stuff, but that’s off the top of my head.

My hot takes

  • It actually looks like it might be comfortable to wear (edit: according to The Verge’s hands-on review, the headset weighs under a pound).
  • I think the product that ships may well look a bit different, and the apps certainly will. This was more smoke and mirrors than the iPhone launch (and the launch date is vague—is “early next year” sooner or later than Q1?).
  • In one demo, what looked like a virtual keyboard was “sitting on” a table, but there was no real discussion of text input beyond “it works with Mac keyboards and mice”. My guess is there’s more to come here.
  • There was some remark about the dev tools being discussed in the State of the Union keynote later today (I’ll be asleep). The big question for me is, aside from Swift, what will our options be? I’m hoping for first-class JavaScript support (of course), since all the heavy lifting is going to be on GPUs anyway… Apple is practically first-classing web apps on macOS at this point… If I have to get deep into Swift, fine. It’s a nice language.
  • The third-party demos were super lame. I’m guessing these folks had maybe 1–2 weeks to throw something together. We’ll see much more compelling stuff closer to launch.
  • Hilarious to see Office support still being mentioned as if that matters anymore. Really, you can create and show PowerPoint presentations in XR? Awesome.

Obviously, I wish they were shipping next week, like the new Macs, but this is still an incredibly exciting unveiling. They’ll sell a ton just because idiots who want to be cool buy every new Apple product these days, and the porn industry will buy a ton of them (both for content producers and consumers). What will their margin be? There are two MacBook Airs’ worth of silicon and displays (only miniaturized), along with four more cameras and two LiDARs, not to mention optics, gyros, and whatever they use for 3D sound mapping… in these things. I assume there’s at very minimum 8GB of RAM and 256GB of storage.

Steve Jobs said they had a five year lead on multitouch UX when they released the iPhone. I think time has shown that this was true and Apple has maintained some of that lead despite everything, and actually gained a lead in silicon. How hard will it be for anyone to compete in the XR space if it takes off?

Warm Takes

So, I finally found some hands-on reviews, but they were reviews of the headset in isolation, not hooked up dynamically to Macs etc. which I think is the killer app on day one.

Sure, $3500 is expensive for a toy you use to play games and watch porn, but it’s dirt cheap for being able to summon up infinite Pro Display XDRs on demand while flying economy, or to watch 3D movies on a screen that fills your peripheral vision. It’s even cheap just for summoning up three decent-quality 4K displays, or for watching 2D movies on a 100″ screen while you’re in Starbucks.

To put it another way, reviewing the Vision Pro in isolation is somewhere between reviewing the Watch in isolation and reviewing the AirPods Pro in isolation. The thing I love about the Watch is that it’s a second, glanceable extension of my iPhone. Hmm, someone messaged me? Oh no, it’s a calendar thing. I need to take a turn? Oh right, that’s the intersection.

As a standalone device the Watch is, um, a mediocre watch with a nice timer and the occasional stopwatch. I didn’t find those use cases compelling enough to wear any kind of watch for about 15 years. AirPods Pro are completely useless without something to connect to, and the Vision Pro is a capable computer, but it probably doesn’t have terabytes of storage and I don’t really want to run servers on it.

I think Vision Pro is basically kind of similar to AirPods Pro, except it handles both video and audio instead of just audio. Unlike AirPods Pro, it does have its own video and audio sources built in (in fact it’s very capable as a standalone device), but that’s really superficial.

In my opinion, the most compelling use case for the Quest 2 is basically as a really big monitor you can use on a plane, or wherever, except that the way it connects is super clunky and it’s a terrible AR device so you can’t really see your keyboard, and… you get the idea.

Now, I suspect a LOT of Quest devices are primarily used to consume porn, but they aren’t really great for that either. I think it’s interesting that the Vision Pro clearly looks like a superior porn creation and consumption tool compared to anything else out there.

The thing is, a Quest acts as a computer display by brute force. The video signal has to be rendered on the host device, then compressed, piped to your headset, and rendered again. For this to work at all, you either need the headset plugged directly into the host device, or the host plugged directly into your router, and the network gods must smile on you. Even so, this is still pretty damn compelling.

Apple’s OS stack is such that they can do the handoff anywhere. The desktop app can literally render to the device directly, using the device’s GPU. It won’t be as efficient as rendering on the machine itself, because of the shared memory architecture on Apple Silicon; architecturally it would be like rendering on a gaming PC whose PCI bus between CPU and GPU is replaced with WiFi, vs. a gaming PC whose display cable is Ethernet + WiFi + two frames of rendering latency.

This probably means gaming will be noticeably better if the game runs on the headset, but using desktop apps will be lightning fast over WiFi (consider that on macOS et al., most UI operations, including rendering scaled glyphs, are performed on the GPU).
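To put a rough number on “two frames of rendering latency”: assuming a 90 Hz refresh rate (typical for VR headsets; Apple hadn’t confirmed the Vision Pro’s actual rate at announcement, so this is an assumption), each frame is about 11.1 ms, so two frames add about 22 ms. That’s noticeable in a fast game, but invisible when you’re scrolling a document.

```swift
import Foundation

// Back-of-envelope for "two frames of rendering latency".
// Assumption: 90 Hz refresh, a typical VR headset rate (the Vision
// Pro's actual rate was unconfirmed at announcement).
let refreshHz = 90.0
let framePeriodMs = 1000.0 / refreshHz   // one frame ≈ 11.1 ms
let twoFramesMs = 2 * framePeriodMs      // two frames ≈ 22.2 ms
print(String(format: "%.1f ms", twoFramesMs))  // prints "22.2 ms"
```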

One of the “other shoes” I’ve been waiting to hear drop is networked GCD (“Grand Central Dispatch”). Over ten years ago, Apple literally added blocks—closures—to C in part to support this. Any atomic, self-contained task on macOS (et al.) can be handed off via GCD as a self-contained bundle of code and data. Originally this allowed tasks to be performed by either the CPU or GPU, whichever was cheapest, but the target could just as easily be a networked server. macOS could allow transparent use of server farms this way. Xcode distributes builds across networks this way, although Apple doesn’t provide the APIs for third-party use, and other people have written network libraries for GCD. Theoretically a task could be dispatched to IA64, ARM64, Nvidia, AMD, or Metal GPUs, whichever makes the most sense (remember that macOS was built on top of OpenStep, which at one point supported “quad fat binaries”). But again, it also means that Vision Pro could hand off heavy-lifting parallelizable tasks to whatever happens to be convenient. Again, a lot more AirPods Pro than AirPods Pro.
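The shape of that handoff is easy to see in a few lines of Swift. This is a minimal sketch using the real libdispatch APIs (`DispatchQueue`, `DispatchGroup`); the networked version is speculation, but the point is that each block is already a self-contained bundle of code plus captured data, and nothing inside it cares where it runs.

```swift
import Dispatch

// A minimal sketch of GCD block handoff. Today these blocks run on a
// local queue; a hypothetical networked dispatcher could ship the same
// self-contained code + data bundles to another machine or accelerator.
let queue = DispatchQueue(label: "com.example.work")  // serial queue
let group = DispatchGroup()
var squares = [Int](repeating: 0, count: 4)

for i in 0..<4 {
    queue.async(group: group) {
        // The block captures `i`; nothing here depends on where it executes.
        squares[i] = i * i
    }
}

// Block until every dispatched task has finished.
group.wait()
print(squares)  // prints "[0, 1, 4, 9]"
```

The `DispatchGroup` is what makes the fan-out/fan-in explicit: you dispatch as many independent blocks as you like, then wait for the whole batch, exactly the pattern a distributed scheduler would need.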

Again, I think this is the next big thing, but without Steve Jobs to explain it to them, the tech press hasn’t figured it out. (Not that they’d believe Steve Jobs—they’d refer to his “reality distortion field” and then, six months later when it all turned out to be perfectly accurate, they’d praise themselves for having been skeptical.)

One more meta observation…

If the engineering folks have been using these constantly for years, they’ve probably got very good tools for accessing their desktops and dev tools while wearing the headsets. They aren’t taking them off mid-debug-loop. My guess is this stuff would be super compelling if demoed. So why wasn’t it demoed? Judging from the reaction Apple got—which I’m pretty sure is the reaction they expected; this is, after all, the machine Katie Cotton built—the purpose of the demos was to prove none of this is vaporware.

Vision Pro, the thing they announced:

  • all works, there’s no smoke and no mirrors
  • the displays are incredibly good
  • head tracking works with no perceived latency
  • crazy impressive stuff like rendering your avatar with your expressions mapped onto it works and actually looks good
  • they are light and comfortable to wear

Everything else is irrelevant at this point from a marketing point of view. This isn’t a Sales pitch. It’s an Awareness pitch.

Hands On Impressions

AAPL stock is down, last I checked.