Typing in AR/VR

In Apple’s new animated movie Luck, the cat is seen typing on a projected keyboard…

The first objection I typically get to my prediction that AR is the next big technology wave that will make existing personal computers obsolete is “what about text entry?” (Now, for hardcore VR people, the question is usually “what about walking in virtual spaces?” which is a story for another day.)

Right now, on the Oculus, you type by pointing at keys on a keyboard with virtual laser pointers. It’s the worst typing experience I’ve encountered on any platform, and it’s totally unacceptable even for the purposes glass keyboards excel at (like text messaging).

A projected keyboard combined with hand-tracking would at first blush provide a similar user experience to the glass keyboards we all know—and have mixed feelings about—from iOS and Android. But there are many crucial differences, all of them to the advantage of the projected VR keyboard.

  1. Screen space is highly constrained on tablets and, especially, phones. So the glass keyboard experience is in large part dictated by the keyboard’s impact on the app’s layout (often reducing the usable screen area to a minimum or, worse, covering vital information).
  2. Text editing on a glass keyboard tends to be fraught with selection issues. Even with the recent improvements in iOS (where you can treat the keyboard as a virtual trackpad) it’s less than ideal, especially combined with the first issue.
  3. Glass keyboards haven’t been implemented terribly well. Apple lets you press and hold a key to get a menu of related characters (press and hold “.” and you get “…”, press and hold “e” and you get all the various accented e characters, and so forth), but it hasn’t really gone to town with the feature. It could have put common non-alphabetic characters in the “.” popup instead of just the ellipsis, or provided lots of special overloaded keys to avoid mode-switching. As it stands, it’s basically shortcuts for the common option-chords and nothing more. (In fact, if you have memorized a lot of option-chords, the behavior of iOS’s glass keyboard makes perfect sense; but it should make sense to people who haven’t memorized all those crazy combinations.)
  4. Glass keyboards also haven’t gotten very ambitious, largely because of issue 1.
  5. Neither real nor glass keyboards have any idea what your hands and fingers are doing when you aren’t pushing keys. An AR keyboard will know where your fingers are hovering, so a HUD could subtly tell you where your fingers are before they hit a virtual key (see the sketch after this list).
  6. Projected keyboards can’t project through your fingers, and no glass or physical keyboard can be seen through your fingers. AR/VR keyboards can: they can be rendered as translucent over your hands so you can still see the keys.
  7. Real keyboards take up desk space.
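
To make point 5 a little more concrete, here’s a minimal sketch, in Swift, of how hover hints might be computed from tracked fingertip positions. Everything here is made up for illustration; it isn’t any real ARKit or visionOS API.

```swift
import Foundation

// Hypothetical sketch: map tracked fingertips to the virtual key each one
// is hovering over, so a HUD can highlight the key before the finger
// actually "presses" anything.

struct Point3 { var x, y, z: Double }

struct VirtualKey {
    let label: String
    let center: Point3       // key position, with z = 0 on the keyboard plane
    let radius: Double       // half the key width, used as a hit radius (metres)
}

struct HoverHint {
    let finger: String       // e.g. "leftIndex"
    let key: String          // label of the nearest key
    let distance: Double     // planar distance from fingertip to key centre
}

/// Planar (x/y) distance, ignoring height above the keyboard.
func planarDistance(_ a: Point3, _ b: Point3) -> Double {
    let dx = a.x - b.x, dy = a.y - b.y
    return (dx * dx + dy * dy).squareRoot()
}

/// For every tracked fingertip close enough to the keyboard plane,
/// report which key it is hovering over (if any).
func hoverHints(fingertips: [String: Point3],
                keys: [VirtualKey],
                maxHoverHeight: Double = 0.03) -> [HoverHint] {
    var hints: [HoverHint] = []
    for (finger, tip) in fingertips {
        // Ignore fingers that are well above the projected keyboard.
        guard abs(tip.z) <= maxHoverHeight else { continue }
        // Find the closest key in the keyboard plane.
        guard let key = keys.min(by: {
            planarDistance(tip, $0.center) < planarDistance(tip, $1.center)
        }) else { continue }
        let d = planarDistance(tip, key.center)
        // Only report a hover if the fingertip is plausibly over that key.
        if d <= key.radius * 1.5 {
            hints.append(HoverHint(finger: finger, key: key.label, distance: d))
        }
    }
    return hints
}
```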

It’s also worth noting that all the “superior” keyboard options you’ve ever considered and likely not tried can be viable alternatives in AR/VR. Always wanted to try a stenographers’ “chording” keypad? How about Dvorak? How about a keyboard HUD that lets you know via peripheral vision which keys your fingers are hovering over? How about custom graphic overlays for every program you use instead of burying shortcuts in menus?
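A chording keypad, for example, is almost trivial to model in software: a chord is just the set of keys held down together, and releasing the chord emits a character or word. Here’s a toy sketch; the chord table is invented and bears no relation to any real stenographic layout.

```swift
import Foundation

// Illustrative sketch of a chording keypad: each "chord" is the set of
// keys held down together, and releasing the chord emits one output.
// The table below is made up for illustration; real steno layouts differ.

struct Chord: Hashable {
    let keys: Set<Int>   // indices of the keys held down, e.g. fingers 0-4
}

let chordTable: [Chord: String] = [
    Chord(keys: [0]): "e",
    Chord(keys: [1]): "t",
    Chord(keys: [0, 1]): "th",
    Chord(keys: [0, 2]): "the",
    Chord(keys: [1, 2, 3]): "ing",
]

/// Look up what a released chord should type (empty string if unmapped).
func emit(_ heldKeys: Set<Int>) -> String {
    chordTable[Chord(keys: heldKeys)] ?? ""
}

// Example: holding keys 0 and 2 together, then releasing, types "the".
print(emit([0, 2]))   // "the"
```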

What about writing code?!

Glass keyboards are notoriously horrible for writing code, even for those of us who do not mind typing messages or prose on them, because autocorrect is a disaster when programming. So, a virtual glass keyboard isn’t going to cut it for coding without serious rethinking.

Keyboard for ZX-81, the most productive programming environment I’ve ever experienced.

The most productive programming environment I ever experienced was the Sinclair Cambridge ZX-80, which I later upgraded to a ZX-81 (the latter supported floating point). Both had the worst keyboards I’ve ever used (the ZX-81’s was slightly better because it was made of higher-quality textured plastic), but when coding you could enter any keyword with a single keystroke, and the editor would not let you enter a line of code containing a syntax error.

We have never had a UX environment where the keyboard was a highly programmable surface, let alone a highly programmable volume. Imagine being able to press-and-hold the function key and get a popup list of all the functions in your current source file. The possibilities are truly stupendous, and none of them have to fight with your display space (even if they cause all kinds of stuff to appear floating in the air, the AR headset knows when you’re not looking at the keyboard and can hide or fade them instantly).
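
As a hedged sketch of how that press-and-hold popup might be wired up (the key names and the crude regex below are purely illustrative; a real editor would ask a language server for the symbol list):

```swift
import Foundation

// Hypothetical sketch of "press-and-hold the function key to get a popup
// of every function in the current file".

enum EditorKey {
    case fn
    case period
    case letter(Character)
}

/// Pull function names out of a Swift source string with a simple regex.
func functionNames(in source: String) -> [String] {
    let pattern = #"func\s+([A-Za-z_][A-Za-z0-9_]*)"#
    guard let regex = try? NSRegularExpression(pattern: pattern) else { return [] }
    let range = NSRange(source.startIndex..., in: source)
    return regex.matches(in: source, range: range).compactMap { match in
        Range(match.range(at: 1), in: source).map { String(source[$0]) }
    }
}

/// Decide what popup (if any) a long press on a virtual key should show.
func popupItems(for key: EditorKey, currentSource: String) -> [String] {
    switch key {
    case .fn:
        return functionNames(in: currentSource)   // jump-to-function list
    case .period:
        return ["…", "->", "::", "?.", "!="]      // common non-alphabetic symbols
    case .letter(let c):
        return [String(c)]                        // accented variants would go here
    }
}

// Example:
let source = "func render() {}\nfunc handleTap(at point: Int) {}"
print(popupItems(for: .fn, currentSource: source))   // ["render", "handleTap"]
```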

Similarly, consider the abject failure of Apple’s Touch Bar. Part of the reason it failed was that it was restricted to a tiny, hard-to-reach strip of the keyboard, versus (say) completely changing the keyboard and trackpad layout and appearance based on context: people still pay for keyboard overlays for complex programs like Final Cut. It’s even possible the Touch Bar was an idea that made sense in prototype AR environments (where the entire control surface was software-controlled) that Apple then tried to translate to laptops.

What about typing on the go?

A common use case for glass keyboards is quickly typing messages while walking around or waiting in a supermarket checkout line. A projected keyboard won’t work in those situations.

Here’s a simple option: when you make a “thumb typing” gesture with your preferred thumb-typing hand (my left hand; I keep my phone in my left pocket), a small keyboard appears in your hand under your thumb. It’s probably a bit bigger than a phone’s glass keyboard, because it can be, and it might be shaped to accommodate the movement range of your thumb (e.g. arc-shaped key rows). You can use it without looking because a HUD gives you visual feedback.
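
Laying out those arc-shaped key rows around the thumb’s pivot is just a bit of trigonometry. A sketch follows; the radii, sweep angles, and key assignments are all invented for illustration.

```swift
import Foundation

// Sketch of arc-shaped key rows for a palm-anchored thumb keyboard.
// The pivot sits roughly at the base of the thumb; each row is an arc at
// a comfortable radius of thumb reach. All dimensions are invented.

struct KeySlot {
    let label: String
    let x: Double   // metres, relative to the thumb pivot
    let y: Double
}

/// Spread the given labels evenly along an arc of the given radius,
/// sweeping from startAngle to endAngle (radians).
func arcRow(labels: [String], radius: Double,
            startAngle: Double, endAngle: Double) -> [KeySlot] {
    guard labels.count > 1 else {
        return labels.map { KeySlot(label: $0,
                                    x: radius * cos(startAngle),
                                    y: radius * sin(startAngle)) }
    }
    let step = (endAngle - startAngle) / Double(labels.count - 1)
    return labels.enumerated().map { i, label in
        let angle = startAngle + Double(i) * step
        return KeySlot(label: label, x: radius * cos(angle), y: radius * sin(angle))
    }
}

// Two rows at different thumb-reach radii, each sweeping a quarter circle.
let nearRow = arcRow(labels: ["a", "e", "i", "o", "u"],
                     radius: 0.045, startAngle: Double.pi / 2, endAngle: Double.pi / 4)
let farRow  = arcRow(labels: ["t", "n", "s", "r", "h", "d"],
                     radius: 0.065, startAngle: Double.pi / 2, endAngle: Double.pi / 4)
```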

Or perhaps when you turn your arm in a “check the time” gesture, the headset shows not only the time on your wrist but also a keyboard on the back of your forearm. Of course, that would still be hunt-and-peck typing.

Another option is that the keyboard is just a floating overlay that can be summoned or dismissed by gestures and uses relative hand movements to control virtual hands. Experienced users could probably shrink the required movements to the point where they could touch-type with their hands barely moving at all.
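
A minimal sketch of that relative mapping, assuming a simple gain factor and a re-anchoring step when the keyboard is summoned (all names and numbers are invented):

```swift
import Foundation

// Drive a virtual hand from relative physical movement: re-anchor when the
// floating keyboard is summoned, then scale every subsequent physical delta
// by a gain factor so tiny real movements cover the whole virtual keyboard.

struct Vec3 { var x, y, z: Double }

struct RelativeHandMapper {
    var gain: Double                      // virtual metres per physical metre
    private var physicalAnchor: Vec3?     // physical pose when the keyboard appeared
    private var virtualAnchor = Vec3(x: 0, y: 0, z: 0)

    init(gain: Double = 2.5) {
        self.gain = gain
    }

    /// Call when the keyboard is summoned: the current physical pose becomes the origin.
    mutating func anchor(physical: Vec3, virtualHome: Vec3) {
        physicalAnchor = physical
        virtualAnchor = virtualHome
    }

    /// Map a new physical hand position to a virtual hand position.
    mutating func virtualPosition(for physical: Vec3) -> Vec3 {
        guard let a = physicalAnchor else {
            // Not anchored yet: anchor here and start at the virtual home pose.
            anchor(physical: physical, virtualHome: virtualAnchor)
            return virtualAnchor
        }
        return Vec3(x: virtualAnchor.x + gain * (physical.x - a.x),
                    y: virtualAnchor.y + gain * (physical.y - a.y),
                    z: virtualAnchor.z + gain * (physical.z - a.z))
    }
}

// Example: 2 cm of real movement becomes 5 cm of virtual movement at gain 2.5.
var mapper = RelativeHandMapper()
mapper.anchor(physical: Vec3(x: 0.10, y: 0.95, z: 0.30),
              virtualHome: Vec3(x: 0, y: 1.1, z: -0.4))
let virtualPos = mapper.virtualPosition(for: Vec3(x: 0.12, y: 0.95, z: 0.30))
```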

I could probably come up with another dozen ideas without too much effort.

One alternative to virtual keyboards would be for the AR camera system to parse sign languages.

Here I have a modest proposal: the system that recognizes hand gestures and virtual keystrokes should be designed to understand sign languages (ASL for starters, but I imagine support for other sign languages could be crowd-sourced fairly easily). Not only will this allow people to type without a keyboard at all, it will vastly increase the number of people who can sign and read sign language. And your AR glasses could turn other people’s signing into subtitles on the fly.
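
To be clear, sign languages are full languages, with grammar, movement, and facial expression, so real recognition is a serious machine-learning problem. Purely as a toy illustration of the fingerspelling corner of it, a nearest-template matcher over hand-landmark features might look like the following; the feature vectors are fake.

```swift
import Foundation

// Toy sketch only: a nearest-template matcher over hand-pose features.
// Real sign recognition involves motion, grammar, and context.

typealias HandPose = [Double]   // e.g. flattened joint angles from hand tracking

let fingerspellingTemplates: [String: HandPose] = [
    "A": [0.9, 0.9, 0.9, 0.9, 0.1],   // made-up per-finger curl values
    "B": [0.1, 0.1, 0.1, 0.1, 0.8],
    "L": [0.1, 0.9, 0.9, 0.9, 0.1],
]

/// Euclidean distance between two pose feature vectors.
func distance(_ a: HandPose, _ b: HandPose) -> Double {
    zip(a, b).map { (x, y) in (x - y) * (x - y) }.reduce(0, +).squareRoot()
}

/// Return the best-matching letter, or nil if nothing is close enough.
func recognize(pose: HandPose, threshold: Double = 0.5) -> String? {
    guard let best = fingerspellingTemplates.min(by: {
        distance(pose, $0.value) < distance(pose, $1.value)
    }), distance(pose, best.value) <= threshold else { return nil }
    return best.key
}

print(recognize(pose: [0.85, 0.9, 0.95, 0.9, 0.15]) ?? "no match")   // "A"
```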

In the early days of PDAs, Apple’s handwriting recognition (which was abysmal in the original Newton, but very good in the second-generation devices and almost flawless in the third generation) ended up losing out to Graffiti, a simplified set of alphabetic gestures. The end result was that no one ended up writing much on those devices (while the Newton had a visual IDE…).

It seems to me that a simplified sign-language variant will end up being the basis for gesture-driven text input in AR. It’s also worth considering that any of the alternatives discussed here could be far easier to learn in AR than they ever were on earlier platforms, because contextual assistance can be abundant: a HUD can coach you through your new virtual chording keyboard without taking screen real estate away from your word processor or IDE.

Actually, one can easily envisage AR software that is able to read lips, which could also be used for text entry if you had a camera pointing at your mouth.

All of which is to say that I expect these ideas and many more have all been prototyped and tested by folks working on AR at Apple.

Summing Up

All of this is speculative. I don’t have any inside information on what’s happening within Apple on the AR front. I just know that the noise leaking from Apple about AR products is getting pretty loud, and that Apple, as a rule, doesn’t release half-assed version 1 products. Apple has been working on this for a long time (it’s not common knowledge that the iPhone was a spin-off from the iPad project, which had itself been going for nearly a decade when the iPhone launched).

My guess is that these issues have been identified, solutions have been brainstormed and extensively tested, and text entry will be a solved problem when Apple’s AR/VR product ships, giving Apple a huge lead in both implementation and usability out of the starting gate.

And yes, I am long on AAPL.