The Browser as Word-Processor

Like, I think, most web developers, I assumed that making a decent WYSIWYG editor that works inside a browser is difficult and tedious, and assumed the best solution was to use something off-the-shelf — especially since there are what look like pretty good free options such as TinyMCE and CKEditor.

Upon close — or even superficial — examination, these editors (and there are a lot of them) seem to pretty much suck. As a simple pons asinorum try the following test (incidentally, the editor in the latest version of WordPress scores quite well here):

  1. Create a simple document with a heading and a couple of paragraphs of text.
  2. Bold a couple of words in one of the paragraph (not two consecutive words).
  3. Now, while observing the editor’s toolbar, try a series of selections:
  • Select a bold word
  • Select a plaintext word
  • Select from a bold word to some plaintext
  • Select within the different paragraphs and headings
  • Select across two paragraphs with different styles

If you didn’t encounter several WTF moments then either the state of off-the-shelf JavaScript text editors has improved markedly since I wrote this or you’re not paying close attention.

Google Documents — errr Google Drive — handles all this with aplomb, which is to say it behaves exactly like Microsoft Word (which is kind of dumb, but at least (a) what users probably expect, and (b) reasonably consistent). E.g. if you select a mixture of bold and non-bold text the bold toolbar icon will be “unlit” indicating (my colleague and I assume) that when you press it the selection will turn bold. In most of these editors either the bold button exhibits random behavior or goes bold if the selection starts in bolded text. (The latter at least accurately foreshadows the behavior of the bold button: more on that later.)

Assumed Hard, Left Untried

My experience as a JavaScript coder has been that there are only really two levels of difficulty for doing stuff in JavaScript — not that hard and practically impossible (and every so often someone will surprise me by doing something I assumed was practically impossible).

I knew that there’s a trick for editing any web page, e.g. the following scriptlet will allow you to edit any web page (with spellchecking for bonus points):

javascript: document.body.contentEditable = true; document.body.attributes[“spellcheck”] = true;

So, I knew that something like this was easy. What I didn’t realize was that at some point in the browser wars Microsoft implemented pretty much all of Microsoft Word (modulo some UI glitches) into Internet Explorer, and then everyone else simply implemented most of their API.

So, for example, the following scriptlet used in conjunction with the preceding one allows you to bold the current selection in any web page:

javascript: document.execCommand(“bold”);

If  nothing else, you can have a lot of fun defacing web pages and taking screenshots:

CKEditor Home Page (Defaced)While I knew about the contentEditable “trick”, I didn’t put all this together until — frustrated by the huge and unwieldy codebase of CKEditor (which, at bottom, is really just a custom UI library that calls execCommand) and unimpressed by alternatives (aside from Redactor.js, which looks pretty good but is not free or open source) I found this thread on Stackoverflow.

editor.js in action on its documentation demo page

I cleaned up the example and had a working text editor in a few dozen lines of code. (I’d post them but they’re written for a client. I believe the whole project will eventually be open-sourced, but right now it hasn’t been. If and when it ever gets open-sourced, I’ll link the code. Failing that I may write my own simpler version for personal use (and integration with Foldermark).

document.execCommand implements most of the functionality you need to write a perfectly good word-processor from scratch in a browser. In fact, if you’re willing to put up with some UI quirks, pretty much all you need to do is implement some trivial UI components. Almost all the implementations out there create a complete blank document inside an iframe by default, but it’s perfectly viable to edit inline in a div, especially if you’re planning to use the ambient styles anyway.

The beauty of writing your own word processor using execCommand is that the browser gives you fine-grained access to all events, allowing you to arbitrarily fine-tune the low-level behavior of your word-processor. Microsoft Word, for example, has always had unfathomable UI quirks.

What don’t you get?

First, you do get pretty solid table support.

You don’t get fine control over styling, although there’s nothing to stop you from implementing a CSS editor of some kind (disguised or not). From my point of view, the default behavior of the browser word-processor is to be very styles-driven, and that’s a good thing. It’s not so easy out-of-the-box to, for example, set a specific point size for text.

Some execCommand commands don’t work very well. E.g. you can implement a “hiliter” using execCommand(“backColor”…) but there’s no way to toggle it off (unlike bold) so to properly implement it you need to directly mess with the DOM, which — given the way selections are represented, can be a tad ugly.

There’s stuff that is simply difficult because you’re in a browser. E.g. without implementing some kind of service platform (or perhaps leveraging an existing one) you’re not going to be able to drag-and-drop a picture from your desktop into a document. It would be fairly straightforward, I suspect, to integrate DropBox with a word-processor to allow drag-and-drop images and attachments — anything that can be represented by a url is, of course, golden.

Most of the missing features from the free word-processor in your browser are what you’d expect. E.g. anything to do with overall document structure: automatic index and table of contents generation, footnotes, endnotes, headers, footers, and so forth. None of this stuff is hard to do in a browser. The real problem is support for printing — browsers generally suck at targeting printers — so you’re not going to replace Quark or InDesign — but if you’re targeting ePub rather than paper, I don’t see why you’d want to use anything else.

Final Thoughts

The advantages of “owning” your word processor’s codebase are enormous, especially if you’re trying to integrate it into a complex workflow. You can fine-tune exactly what happens when a user hits delete (e.g. we need to track dependencies — was this paragraph based on that paragraph template or not?), and what is editable or not. You can do validation inside input fields while allowing other text to be edited free-form. It’s pretty wonderful. One day, perhaps, there will be free off-the-shelf editor that solves key UI and workflow integration issues, but we’re a ways from there.

 

Cruft

So, which of the following do you think affords you more screen real estate for word-processing “out of the box”: Word 2011, Pages, or Pages for iPad? (Note: the Mac screenshots are from a 1440 x 900 Macbook Pro 15″ display.)

Word 2011 (thanks to Boy Genius Report for original screen shot)
Pages 08 (09 is identical)
Pages 08 (09 is identical)
Pages on the iPad
Pages for iPad

I think these pictures pretty much speak for themselves (the bottom figures on the iPad screenshot are for using the iPad with an external keyboard, but since it’s already ahead that’s just rubbing salt in wounds). Here’s the real Hallelujah moment, though:

Pages on the iPad isn’t even hurting for room on its toolbar!

What’s more, it’s easy to see how even this very minimalist UI could be simplified. All you really need is to move the “tab” button to the keyboard (where it belongs) and eliminate everything except paragraph and character style selectors. (The ruler can be eliminated too — except when editing tables.)