Author Archives: Francois Jordaan

Polite user interfaces know when to wait a little

Web page elements that appear or disappear on hover should almost always do so with a slight delay. Why?

  • To prevent distracting elements leaping out at you while your mouse is simply traversing the page.
  • To prevent you from accidentally clicking something that popped into view just as you were moving your cursor towards the target.
  • To prevent elements such as menus from unexpectedly disappearing when you just stray a pixel off, forcing you to re-invoke them.

Building in a small delay (say, 100ms) before elements appear or disappear is a hallmark of polite user interfaces, but is woefully rare. If you do a Google search for JavaScript plugins for menus, dropdowns, etc., you’ll find almost none that do this. This is also the biggest problem I have with using CSS :hover to show or hide elements (and why I think pure CSS dropdown menus are useless.)

On pretty much all projects with interactive JavaScript elements I’ve worked on in the past, I’ve specified this behaviour, which added considerable complexity for the developer. In most cases, they developed their solution from scratch.

So I was very happy to discover Brian Cherne’s hoverIntent jQuery plugin, a lightweight (4KB unminified) script which makes this effortless to do:

HoverIntent is similar to jQuery’s hover. However, instead of calling onMouseOver and onMouseOut functions immediately, this plugin tracks the user’s mouse onMouseOver and waits until it slows down before calling the onMouseOver function… and it will only call the onMouseOut function after an onMouseOver is called.

Please consider using it on your next project!
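
For projects that can’t take on a jQuery plugin, the basic idea (a short wait before anything appears or disappears) can be approximated with plain timers. The following is only a minimal sketch, not hoverIntent’s actual implementation, which also tracks mouse speed. The `createDelayedHover` name, its option names and the injectable `timers` object are my own assumptions for illustration; the 100ms default is the figure suggested above.

```javascript
// Minimal sketch of delayed show/hide. NOT hoverIntent's implementation:
// it waits a fixed time and does not track how fast the mouse is moving.
// `createDelayedHover` and its option names are hypothetical.
function createDelayedHover({ show, hide, delay = 100, timers = globalThis }) {
  let pending = null;  // id of the timer currently waiting, if any
  let visible = false; // true once `show` has fired without a matching `hide`

  function schedule(action) {
    // A new mouse event cancels whatever the previous one had scheduled.
    if (pending !== null) timers.clearTimeout(pending);
    pending = timers.setTimeout(() => {
      pending = null;
      action();
    }, delay);
  }

  return {
    // Call from a mouseenter handler.
    enter() {
      schedule(() => {
        if (!visible) { visible = true; show(); }
      });
    },
    // Call from a mouseleave handler.
    leave() {
      schedule(() => {
        if (visible) { visible = false; hide(); }
      });
    },
  };
}

// Possible browser usage (element and callbacks are placeholders):
// const hover = createDelayedHover({ show: openMenu, hide: closeMenu });
// menu.addEventListener("mouseenter", hover.enter);
// menu.addEventListener("mouseleave", hover.leave);
```

Because each new event cancels the pending one, a cursor that merely crosses the element never triggers `show`, and straying a pixel off and back within the delay never triggers `hide`.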

Photoshop guidelines for web designers

We often work with external design agencies. Sometimes they work with wireframes we produce, and sometimes not. These are my standard guidelines (or wishlist, if you will) that I try to send to designers before they start. I don’t think I’ve ever received any that meets all these criteria, but it’s always good to aim for. Perhaps it’ll be a useful checklist for other designers out there.

  1. Final deliverables are layered Photoshop files (.psd), with flat file snapshots of each.
  2. When providing flat files (incl. work-in-progress snapshots), use 24-bit PNG format (not JPG).
  3. Use separate files for each distinct layout template.
  4. Photoshop Layers should be given meaningful names. Multiple identically-named layers are unacceptable. As far as possible, remove obsolete layers. Use layer groups to organise layers.
  5. Photoshop Guides should be used, matching the page layout grid. Remove obsolete guides. Guides should be snapped to pixel edges: Photoshop lets you position guides at sub-pixel level, which causes confusion when working with pixel-level artwork for screen-based media. To snap guides to pixel edges, first create a marquee selection in the right place, and then snap the guide against the marquee edge.
  6. Consistency between Photoshop files is required, regarding measurements, fonts and colours.
  7. Use a meaningful and consistent file naming policy. Related files should be named in such a way that they are alphabetically adjacent (e.g. content-normal.psd, content-wide.psd).
  8. Versioning: for updated versions of files, use a consistent and sensible versioning system in the file names. E.g. homepage-v01.psd. Files with version numbers should stay in order when sorted A-Z. Avoid file name suffixes like -final.psd, -updated.psd, -new.psd etc.
  9. Bear in mind CSS capabilities and work with them wherever possible. For example, horizontal and vertical lines should follow pixel edges exactly (not anti-aliased) in order to be implemented as CSS borders.
  10. When drawing vector shapes, remember to check the “Snap to Pixels” box in the shape options.
  11. Remember to specify hover (mouseover) and active (on click) states for links and buttons.
  12. Accessibility: Minimum font size is 11px. Body text should ideally be 12px or 13px.
  13. Accessibility: Test for adequate colour contrast and colour blindness.
  14. Text in Photoshop should be sized in px units (not pt), and in integer sizes (not fractions).
  15. Specify the actual leading in the Character palette and, for multi-paragraph text areas, the space above and below paragraphs in the Paragraph palette; these values will be implemented in CSS. Do not use double linebreaks or font sizing to set paragraph spacing.
  16. Use web fonts as far as possible (core web fonts or from a commercial web font service). Only use non-web fonts (e.g. corporate fonts) where absolutely necessary (they’ll be implemented using images). This is extra important for text that will be content-managed or translated.
  17. Embedded images such as logos should be production quality, as the Photoshop file will be used to cut out final images. (Tip: embed logos as ‘smart objects’, or avoid resizing them multiple times.)
  18. If working from wireframes, try to avoid a “wireframe aesthetic” (monochrome, everything in boxes). If a wireframe puts something in a box, it just means that information should visually stick together. Use gestalt principles to group information visually.
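
The colour contrast part of guideline 13 can be checked numerically. As a rough sketch, the following computes the contrast ratio as defined in WCAG 2.x from two hex colours. The function names are mine, and the 4.5:1 threshold mentioned afterwards is the WCAG AA figure for normal-size text, which the guidelines above don’t themselves specify. It does nothing for colour blindness, which still needs a simulator or real testing.

```javascript
// Sketch of the WCAG 2.x contrast-ratio formula. Function names are illustrative.

// Convert one 0-255 sRGB channel to its linear-light value.
function linearise(channel) {
  const c = channel / 255;
  return c <= 0.03928 ? c / 12.92 : Math.pow((c + 0.055) / 1.055, 2.4);
}

// Relative luminance of a "#rrggbb" colour (0 = black, 1 = white).
function relativeLuminance(hex) {
  const value = parseInt(hex.replace("#", ""), 16);
  const r = (value >> 16) & 0xff;
  const g = (value >> 8) & 0xff;
  const b = value & 0xff;
  return 0.2126 * linearise(r) + 0.7152 * linearise(g) + 0.0722 * linearise(b);
}

// Contrast ratio between two colours, from 1 (identical) to 21 (black on white).
function contrastRatio(hexA, hexB) {
  const a = relativeLuminance(hexA);
  const b = relativeLuminance(hexB);
  const lighter = Math.max(a, b);
  const darker = Math.min(a, b);
  return (lighter + 0.05) / (darker + 0.05);
}
```

For example, mid-grey #777777 text on a white background comes out just under 4.5:1, while the slightly darker #767676 just clears it.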

Content Strategy Forum writeup

On 5-6 September my colleague Stephen and I attended the second CS Forum in London.

“Content strategy” is the latest buzz-phrase gaining traction in our industry, joining usability, information architecture, interaction design, user experience, accessibility and the like. As with the others, there is a great deal of overlap, but it’s a term I think will endure, as it gets to the nub of the biggest unsolved problem in web projects.

Put simply, it’s that content remains the weak link in most projects. Online content sucks much of the time. There’s too much uninteresting content, not enough useful content, and responsibility for creating and improving it is diffuse. Poor content often exposes the most convincing information architecture and wireframes as a fantasy of wishful thinking.

Content strategy brings out the pessimist in me. There are many reasons why it’s such a hard problem:

  • Getting good quality content out of organisations that do not see themselves as publishers is like getting blood out of a stone.
  • Most content strategy problems are human – political, organisational – on the client side; those you (as external agency/consultant) are least equipped to solve, often going beyond your remit.
  • It’s always easier to make content strategy the client’s problem – you don’t want to accept responsibility for something that is so likely to go wrong.
  • Well-structured content is a Good Thing, but the better structured your data (both information architecture and content types), the worse problems you face when the structure is no longer fit for purpose, and editors try to get around the structure you’ve imposed.

Conference themes

The following themes occurred repeatedly during the conference:

  • We forget to consider the user experience of the content creators (usually in a CMS), in focusing on only the site users
  • Fragmented organisational structures, e.g. lack of communication between developers, designers, copywriters and client
  • Not knowing the CMS well enough, or choosing a CMS without consulting content creators
  • Not learning the lessons of separating content from presentation – content should be produced from a “create once, publish anywhere” approach
  • Lorem ipsum is a problem – it’s always better to design from the content out

The following are not so much summaries of all the talks I attended, as the most interesting things I took away from them. Some writeups are therefore much longer than others.

Gerry McGovern – Manage the tasks, not the content

Back in 2003, his books “Content Critical” and “The Web Content Style Guide” had a huge influence on me and helped guide the content strategy on Unilever.com, which I was working on at the time.

He reminded us that people go to a website to do specific things; any content that is not related to those essential tasks should be cut. He particularly dislikes marketing content. “Support is the new marketing”, he says, meaning that the “customer service” content should be the main focus of a site, rather than marketing content.

Avoid unhelpful navigational labels like “Articles” or “FAQs”, and link text should be written to make the keywords clear. (All principles familiar from information architecture best practices.)

Finally, Gerry made a plea for measuring content success, with the emphasis on useful metrics. Figures like visitor numbers, time on page, eye-tracking “fixation time” are all relatively useless – why do we assume that more = better? A/B testing yields more credible insights.

Melissa Rach – Content strategy methodology: a DIY project

Melissa talked about methodology – what strategists do – such as

  • Create clarity
  • Facilitate smart decisions
  • Align stakeholders
  • Help operationalize change

These steps have both content components and people components. But like many talks, it mainly dealt with the what (none of it surprising), but not the how, which is where the problems lie in my experience.

Margo Bloomstein – First things first: message matters

This talk recommended establishing a “message architecture” as a first step. Via card-sorting, help the client establish

  • Who we are
  • Who we’re not
  • Who we’d like to be

…and tell the story of those aspirations. The message architecture drives the copywriting, the user experience and design, and becomes the benchmark by which to evaluate success later.

This is valid advice, but not always within my remit, and also not really where the worst problems originate.

Karen McGrane – The way forward: what’s next for content strategy

This gets my vote for the best talk of the conference, as it homes in on all the biggest problem areas. You can watch it on Vimeo.

Problems usually don’t lie in technology, but in company structure. She describes her role as “corporate therapy”. Companies suffer from fragmentation:

  • fragmented content management
  • fragmented organisation structures
  • fragmented devices and platforms

Content strategy – part of UX – should function as the bridge between marketing and technology. Content strategy is often about organisational change management.

Karen acknowledged to me in conversation afterwards that solutions to these problems are often out of your grasp and if she could do certain projects over again, she wouldn’t necessarily be able to prevent them.

The CMS – “the enterprise software that UX forgot” – is another major problem area. Why do we care more about the conversion funnel than the CMS workflow? Why do we care about the user experience of site visitors, but not editors? We need to work with the developers implementing the CMS from the beginning. And even before that, CMS procurement should be on the basis of usability – workflow that matches editors’ mental model – rather than feature checklists. A better CMS fosters better content.

As for developing content for a fragmented landscape of devices and platforms, Karen compared the approaches of Condé Nast and NPR. Condé Nast’s strategy of creating iPad apps for their magazines, like Wired, is unsustainable. Ethan Marcotte: “Fragmenting content across ‘device-optimised’ experiences is a losing proposition”. By contrast, NPR pursued a “Create Once, Publish Anywhere” approach, building an API that enables their content to be deployed in a multitude of contexts. This is why we need flexible, structured content, and why Ethan Resnick claimed “Metadata is the new art direction”.

These are all familiar principles to information architects, but as Karen said, Mobile is a great wedge to bring the argument up again.

Lisa Welchman – On All the Different “Web Governances” in the Universe

Governance refers to policy and standards. Some key points:

  • A policy that no-one knows exists, or reads, doesn’t exist
  • Policy that you don’t have the authority to implement doesn’t make sense
  • Don’t be a barrier to organisational change that needs to take place (I suspect this is aimed more at insiders than consultants)

Lisa then introduced the recently-formed Web Governance Journal.

Eric Reiss – Content strategists: the men and women of a new renaissance

Eric summed up the big problem I have with content strategy: “Strategy is easy. The rest is tactics.” We often know what needs to be done, but making it happen is where the problems lie.

Erin Kissane – Making sense of the (new) new content landscape

Erin invoked the old IA favourite, Christopher Alexander, who in his treatises on architecture described enduring patterns such as balance, interconnection and stability, and fundamental user needs that products are useful, accessible, findable, searchable, portable and usable in many ways.

We need to preserve the life of our strategic decisions to enhance the life of our visitors.

Des Traynor – The language of software: the role of content strategy in software development

Des talked about websites (or web applications) with strong social elements or user-generated content. In such cases, there may not be much content to begin with, but what we do control is

  • The user interface
  • The blank slate (what users start with)
  • The content definition

Thus “you get the content you deserve”.

  1. User interface: “Language influences behaviour”. Consider the ramifications of labeling decisions like “tweet” (not share, update, set, publish, post…) or “like” (not love, or appreciate…). E.g. you can “+1” something where “like” isn’t appropriate. The Zune’s coolest feature, instant sharing via wifi, was arguably killed at birth by the term “squirt”.
  2. Blank slate: Consider the different quality of user-generated content on YouTube, Yahoo Answers, Quora, Get Satisfaction. By subtle elements in how they let you compose, the results can be idiotic or serious, disrespectful or civil. Google Wave undermined itself with unconvincing sample/seed content.
  3. Content definition: Compare the quality of TripAdvisor reviews with those on the Apple app store (where ratings don’t differentiate between e.g. technical problems or product suitability.)

Des showed an example of a “microcopy framework” for an application. This is a spreadsheet containing all messages in the application, and their aims and expected tone of voice.

Content is “always an opportunity to delight your user”. +1

Richard Ingram – How did we all get here?

Richard revealed the results of several surveys on professionals who consider themselves content strategists, in the form of gratuitous infographics :) which you can view here.

His blog’s title is a veiled entreaty to abandon Lorem Ipsum.

Martin Belam (guardian.co.uk) @currybet – Content strategy for people who think they already have one

As expected, another stand-out talk, but probably one of the least applicable to me. Journalistic organisations tend not to have the CS problems I find most intractable (they actually have qualified writers and editors), although they have different problems relating to the business model – extracting the greatest value from content – and publishing across an ever-fragmenting landscape.

The Guardian is also an ideal organisation in many respects – doing their development in-house with good collaboration between editors (a fascinating story of “domain driven design” in itself), IAs and developers, and using a custom-built CMS that dispenses with a lot of complex workflow in favour of trust. Having a controlled taxonomy of keywords with a single manager is also something that on most projects I can but dream of.

In keeping with one of the main themes of the conference, Martin stressed the benefits of an API approach to deal with platform fragmentation – create once, publish anywhere.

I liked Martin’s advice to teach using a “portfolio of errors” – showing the mistakes others have made before, rather than criticising your colleagues’ efforts.

Martin ended with a healthy reminder for all “content strategists” not to forget their Information Architecture, and the decade-plus of prior art. (The vast majority of the conference would have sat comfortably under an IA banner.)

Sophie Dennis @sophiedennis – How web designers can stop worrying and learn to love content strategy

I enjoyed Sophie’s talk, which grappled with the all-too-familiar problem of their own agency website, which no-one likes but never gets fixed.

Sophie pointed out that “lorem ipsum” dummy text is specifically intended not to be read, hence it results in designs that are simply not reader-friendly, where the text is the least important component of the design.

When we start designing with real content, we become better designers. However, this is usually not realistic. The pressure to design first typically comes from clients – “I’ll know what to write when I see it”.

In response, Sophie proposed an agile approach, where neither comes first. For their own agency site redesign, she turned off CSS on the existing site and, working with the HTML rather than wireframes, started rewriting – this made it easy for everyone to focus on the content. Then, it became easy to start adding style and “grow the brand out from there”.

Sophie is also a fan of spreadsheets instead of tree-diagram sitemaps, as they are much better at communicating scope. Sitemaps can give a misleading mental model of the amount of content on a site. Spreadsheets are also more accessible and collaborative.

Kate Kenyon @kate_kenyon – Content strategy and CMSs

Valuable reminders never to ignore the CMS. As the content strategist, you need to own it, and understand the publishing process. Find out where it fits in the technical ecosystem. Ideally, the CMS should be procured from the user requirements of the editors.

Good reminders for me. For a long time I intentionally ignored the CMS, believing that it was my duty to focus on the user experience and not be “captured” by the tech constituent.

Cleve Gibbons @cleveg – Strategist and executioner

We’re used to doing “customer journeys”, Cleve said, but we hardly ever do “author journeys”. Optimise the author journey, focus on making authors proficient and productive.

More exhortations to bring technologists to the table when content management is discussed. Developers shouldn’t just “take the ticket and implement” something; they should ask “Why?” more often – what’s the aim of the requirement?

(I agree, from painful experience – many well-intentioned features I designed turned out, after implementation, to result in such editor overhead in the CMS that the features went unused.)

Lisa Moore @writebyteuk – Agile and Content Strategy

Lisa espoused Agile principles such as “putting deliverables on a diet” – keeping documentation to a minimum (I agree), developing early, and testing often.

The team structure she described had separate content strategists and copywriters – she recommends that these people should be involved in all meetings with IAs and developers. However, in most of my projects these don’t exist as separate roles.

Lisa also recommended using real content as far as possible in wireframes – besides benefiting design, another advantage is that it’s likely to be included in user tests.

Noz Urbina @nozurbina – B2B content strategy

Noz made a compelling case for replacing the traditional marketing content with customer service content – i.e. the stuff you usually got after the sale is then driving the sale (echoing Gerry McGovern from earlier.)

B2B (as opposed to B2C) compels you to focus on customer retention rather than acquisition.

Noz saw our role as “consultant within the enterprise” – if everyone just sticks to the org chart, eventually it’ll be the customer who suffers.

Two accessibility gotchas

A few weeks ago I watched a live web conference in which the open source Plone content management system was put through its paces by a blind person using a screen reader (JAWS). Two important issues stuck with me, since they affect most websites, are quite serious, but do not seem to be very widely known.

Popups are usually inaccessible

(Ticket 12123) Popups (by which I mean dynamic overlays rather than popup browser windows) are usually inaccessible. For example, in Plone the login button invoked a login form inside an overlay. Popups are also commonly used for modal dialogs. It is common practice to put popup DIVs at the end of the HTML body, but this means screen reader users are usually not aware of their appearance, nor have any way to get to their contents. (Especially if they do not contain forms; forms in popups are still accessible via JAWS “forms mode”.)

There are two alternative solutions:

  • The trigger element should have an anchor link to the popup contents.
  • The popup should be inserted in the markup immediately after the trigger element.

Both have some associated problems, but I prefer the first solution and will try to implement it in future.

Form field help should be part of the label

(Ticket 7212) In the commonly-used “forms mode” in JAWS, the screen reader reads only labels, legends, input fields, select fields and textarea fields, but no other elements in the form, such as paragraphs, DIVs, etc. That means any help text (usually positioned below or alongside the field) is ignored. This also applies to validation error messages.

The recommendation is to include the help text inside the LABEL element, and use CSS positioning to put it in the right place in the layout. (I can see how this can cause problems where the text should appear above or below fields.)

Update: Here’s an example (see also the ARIA enhancements, not quite reliable cross-platform according to the associated Alistapart article)

Update 19/10/2011: This article recommends the use of aria-describedby for both help text and validation message. It advocates telling users of non-ARIA supporting screen readers to upgrade (but claims that JAWS has supported it since version 10 in 2008)
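
To make the two approaches concrete, here is a small sketch that builds the markup for each as a string: help text inside the LABEL, and help text linked with aria-describedby. The function names, the `help` class and the `-help` id suffix are my own assumptions, not anything taken from Plone or the articles mentioned; treat it as an illustration of the markup shape rather than a tested accessibility fix.

```javascript
// Sketch only: builds markup strings for the two approaches discussed above.
// Function names, the "help" class and the "-help" id suffix are hypothetical.

// Escape text for safe inclusion in HTML content and attribute values.
function escapeHtml(text) {
  return String(text)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Approach 1: the help text lives inside the LABEL, so it is read with the label.
function fieldWithHelpInLabel(id, label, help) {
  return (
    `<label for="${escapeHtml(id)}">${escapeHtml(label)} ` +
    `<span class="help">${escapeHtml(help)}</span></label>\n` +
    `<input type="text" id="${escapeHtml(id)}">`
  );
}

// Approach 2: the help text stays outside the label and is referenced
// from the input via aria-describedby.
function fieldWithDescribedBy(id, label, help) {
  const helpId = `${id}-help`;
  return (
    `<label for="${escapeHtml(id)}">${escapeHtml(label)}</label>\n` +
    `<input type="text" id="${escapeHtml(id)}" aria-describedby="${escapeHtml(helpId)}">\n` +
    `<p id="${escapeHtml(helpId)}">${escapeHtml(help)}</p>`
  );
}
```

With the first approach the `span.help` element still has to be moved into place with CSS; with the second, the layout is untouched but the result depends on screen reader support for ARIA.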

It was good to see a content management system take accessibility so seriously. Here are some more tickets.

UX Lx: Day 1

Thoroughly enjoyed UX Lisbon, organised by ideas e imagens. I don’t think I’ve ever been at a conference where I didn’t regret a single session.

These notes are not necessarily complete summaries of the sessions; they focus on the aspects I found most noteworthy.

1. Prototyping with HTML5 – Todd Zaki Warfel

@zakiwarfel | Slides

Probably the workshop in which I learned least, as Zaki Warfel effectively described exactly the way I currently prototype, and for the same reasons. But it was good to have this reinforced.

With all prototypes, including wireframes, set expectations with the client. Choose the level of visual and functional fidelity the project requires. I tend to choose HTML prototypes when I require a high level of functional fidelity, when I don’t know whether an interface works until I can use it.

HTML prototypes have many advantages:

  • closest to final delivery environment
  • browsers & text editors are ubiquitous – fewer things that can go wrong
  • my favourite: producing production-quality HTML for the prototype can shave 30-40% off production time. (This divided attendees as some considered it too high a hurdle. But aligns with my belief that well-structured semantic HTML is akin to IA and the IA is best placed to write it)

HTML + JavaScript is also more capable than any GUI prototyping tool like Axure. But remember, it is still more constrained than a sketch or a wireframe. When you are sketching you can invent things; you’re not constrained by the familiar toolset.

Zaki Warfel stressed the advantages of HTML5, but I came away still unconvinced of any real benefits for prototypes. Only new HTML5 form attributes (date, email, tel, url, placeholder, etc) are a no-brainer if you want to improve usability in iPhones.

In the tricky choice between HTML5’s <section> and <article> elements, he suggested thinking of using <article> for things that you can imagine going into an RSS feed. But just “choose one road and don’t look back”. (I also find this very confusing, and agree with Jeremy Keith that article and section should be merged into a single element.)

To make HTML5 compatible with older browsers, Zaki Warfel recommended html5shiv instead of modernizr, to keep your HTML cleaner.

CSS3 he didn’t have to sell to me. I already know it saves major time not doing sliding doors or image-based corners, shadows or gradients. But note you have to go back later to fix for IE (or convince the client to accept visual compromises in IE, as Zaki Warfel, echoing Zeldman, advised.)

CSS3 selectors are also a huge time saver in prototyping, but (in my opinion) they will usually require extensive refactoring for IE later.

Zaki Warfel recommended using include files for repeated elements in your HTML, using PHP (already installed in OS X). At first I thought this was unnecessary, but then remembered that in later stages of prototyping I tend to do a lot of difficult find/replace operations across files. So I definitely want to try this in future.

Finally, Zaki Warfel demonstrated the magic of jQuery. I’ve only recently started learning it, and I think every interaction designer should. I use it for all show/hide, open/close, expand/contract, highlight, popup, etc. behaviours in HTML prototypes, and it is a pleasure being able to do this as easily as CSS.

2. Skeuomorphs: The Good, The Bad, and the Silly – Andrew Watterson

@andrewwatterson | Also summarised on Johnny Holland

Skeuomorphs are physical metaphors used to ease the transition to new technologies (mostly touch interfaces in this talk). They are explicitly recommended in Apple’s HIG, but criticised by Adam Greenfield (“patronizing crutches”) and, implicitly, by Jakob Nielsen (“Users don’t know where they can click.”)

Skeuomorphs can be good:

  • They deemphasise technology in favour of utility (cf. Apple’s ads which suggest familiar usage contexts for new technology, vs. Android’s science-fiction approach.)
  • They can reassure and please, bridge gaps, ease transition. He paraphrases Don Norman’s adage that users have 2 needs, (1) to get something done, and (2) to smile.

But skeuomorphs can cause lots of problems (interestingly, most of the worst culprits are Apple apps)

  • They can mislead with inappropriate metaphors (Apple Calendar and Contacts on iPad have scrollable areas, concealed by the fact that they look like paper; Contacts has a “Groups” icon that looks like a bookmark but is actually a button; the Apple Sound Recorder shows a realistic microphone that conceals the location of the device’s actual microphone)
  • They can cause you to skip opportunities to innovate, preventing you from improving on existing tech (e.g. Apple’s Compass app, which has no more utility than a mechanical one, compared with AR Compass which uses augmented reality to give a far more useful view.)

Watterson didn’t even talk about Apple’s egregious Game Center, which boasts the worst skeuomorphic interface I’ve ever seen.

3. Serendipity: Beyond Recommendation – Pedro Fernandes

@betasolo

In Fernandes’ talk, serendipity refers to fortunate discoveries while looking for something unrelated.

The essence of his argument was that existing recommendation engines frequently resemble echo chambers, limited by their algorithms or the metadata they depend on. If item A always recommends items B and C, and items B and C recommend A/C and A/B respectively, then you’ll never discover items D or Z.
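
The echo-chamber effect is easy to demonstrate with a toy graph. In the sketch below, the entries for A, B and C follow the example above; the entries for D and Z are invented purely so that those items exist somewhere in the catalogue. The function simply follows recommendations outward from a starting item and lists everything a user could ever reach that way.

```javascript
// Illustrative only: a toy "recommended items" graph.
// A, B and C mirror the example in the text; D and Z are made-up entries.
const recommendations = {
  A: ["B", "C"],
  B: ["A", "C"],
  C: ["A", "B"],
  D: ["A", "Z"],
  Z: ["D"],
};

// Return every item reachable by following recommendations from `start`
// (a breadth-first walk over the graph), sorted alphabetically.
function reachableFrom(start, graph) {
  const seen = new Set([start]);
  const queue = [start];
  while (queue.length > 0) {
    const item = queue.shift();
    for (const next of graph[item] || []) {
      if (!seen.has(next)) {
        seen.add(next);
        queue.push(next);
      }
    }
  }
  return [...seen].sort();
}
```

Starting from A, `reachableFrom("A", recommendations)` only ever yields A, B and C: nothing in that closed loop points at D or Z, however many recommendations the user follows.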

In e-commerce this can undermine “long tail” sales, but in social networking it can result in “cultural tribalism”, online spaces that just reinforce your preferred world view.

Fernandes showed two apps that try to inject more serendipity into the user journey: mowid.com for browsing films, and Serendipicity, a mobile app for tourists. Mowid relies heavily on tag-based browsing, drawn from a rich vocabulary of tags, and Serendipicity lets you explore places primarily via photos (inherently more open to interpretation) taken in the vicinity by other users.

While I agree about the central issue, I was a bit skeptical about both apps.

Designing for Touch – Josh Clark

@globalmoxie

Definitely one of the conference highlights (all the way from the Johnny Cash opening music). Clark proved himself a consummate educator, fitting in so much detailed, actionable information and examples that the fact it was a presentation rather than a workshop didn’t matter.

When designing for mobile devices, forget pixels: think of it as designing a physical device. It’s more like industrial design.

Many design rules follow directly from physical limitations. On small touchscreen devices, follow the rule of controls at bottom, content at top – so that content is not obscured by “meat sticks”. For the same reason, two rows of buttons at the bottom is also not ideal, something which Android is unfortunately stuck with.

[Image: Josh Clark demonstrating optimal thumb range on an iPhone]

Bottom-left is the top spot – for right-handed users at least, which is what you should optimise for. But consider offering a setting for the 10-15% of users who are left-handed.

Because it’s difficult to fix buttons at the bottom of the screen using JavaScript, he suggested a design pattern for web apps where the main menu is always presented at the bottom of the content, but with a “menu” anchor link at the top of the page.

For the iPad / tablets, rules are different. There are many ways to use them, and many use contexts, with no clear preference for portrait or landscape. It is therefore harder to predict hand positioning. Top-of-screen controls are better, to avoid controls at the bottom sinking into your belly :)

The Instapaper app, with its controls in the top two corners, was praised, and The Daily, where using the page scrubber at the top obscures the thumbnail images just below it, was criticised.

What is the optimal size for a button? The answer turns out to be 44 pixels high (29 wide). This happens to be (no coincidence) the height of the iPhone menu bars, buttons, and keys on the virtual keyboard.

But what is a pixel? Due to differing pixel densities, we should stop thinking of device pixels, and think rather of a physical measure on screen. This is called

  • iOS: points
  • CSS: pixel (which is device-independent)
  • Android: density-independent pixel (dp)

So if you continue to use the px unit in CSS, it automatically does the right thing. However, you need to start producing both a normal and high-density version of all images, e.g. image.png and image@2x.png.
This is easy enough to apply using the IMG element, but for CSS background images you have to specify the higher-density image thus:

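@media (-webkit-min-device-pixel-ratio: 2) {
	/* High-density screens: swap in the 2x image */
	.class {
		background-image: url("image@2x.png");
		/* A percentage is relative to the element's box, not the image;
		   an explicit size matching the normal image is often safer */
		background-size: 50%;
	}
}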

It’s not just about size, though, but also spacing: the closer together, the bigger buttons need to be. A useful tip is to invisibly increase the hit area for small elements (Remember the Milk does this for checkboxes.)
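
The arithmetic behind the last few paragraphs is simple enough to sketch. The helpers below are hypothetical (the names are mine): one converts a device-independent size to physical pixels, one picks the image file using the @2x naming mentioned above, and one works out how much invisible padding a small control needs to reach the 44-unit target. In a browser the ratio would typically come from `window.devicePixelRatio`.

```javascript
// Sketch with hypothetical helper names. The "@2x" suffix and the 44-unit
// minimum come from the text above; everything else is an assumption.

// Convert a size in CSS pixels / points to physical device pixels.
function toDevicePixels(cssPixels, devicePixelRatio) {
  return Math.round(cssPixels * devicePixelRatio);
}

// Pick the image file for a given pixel ratio: "image.png" or "image@2x.png".
function imageForDensity(fileName, devicePixelRatio) {
  if (devicePixelRatio < 2) return fileName;
  const dot = fileName.lastIndexOf(".");
  if (dot <= 0) return `${fileName}@2x`; // no extension to preserve
  return `${fileName.slice(0, dot)}@2x${fileName.slice(dot)}`;
}

// Padding to add on each side so a small control reaches the minimum target.
function hitAreaPadding(visibleSize, minTarget = 44) {
  const shortfall = Math.max(0, minTarget - visibleSize);
  return Math.ceil(shortfall / 2);
}
```

So a 44-unit button occupies 88 device pixels on a screen with a ratio of 2, and a 20-unit checkbox needs 12 units of invisible padding on each side to be comfortably tappable.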

You can also use animation to draw attention to or explain buttons or other elements that might otherwise be missed or misunderstood. For example, slide in a menu that can be swiped, or pop up the primary action button (Gowalla does this.)

Single-screen interfaces (utility apps) strengthen the illusion that it is a physical device. They should pass the “glance test” – right information hierarchy at arm’s length. Two examples: Tea Round and Umbrella Today. Strip out everything not needed when rushed and distracted. Clarity trumps Density (of features).

But mobile devices are not always used when rushed and distracted! People don’t want “dumbed down”. People want uncomplicated. Full-featured, just lighter interface. (Story of initial Facebook app which was billed as a companion app, which users rejected because they expected to be able to do everything they could do on the website.) Some mobile apps need to do more than the desktop versions.

Don’t fear extra taps! Web has made us squeamish about number of clicks. Latency not an issue in a native/cached app. Tap quality* trumps tap quantity (*unconfusing)

Similarly, don’t fear scrolling. E.g. USA Today used an accordion interface to avoid a long list of headlines scrolling, but not all users understood it. For long lists of content, scrolling is still better.

Changing orientation: think of it not just as a change in layout, but also a change in mindset. Landscape mode can also be seen as “focused mode”. But beware of depending on it, as it is hard to discover. (Personally, I believe orientation in iOS is flawed, invoked overwhelmingly by accident. I think Ben Summers has the right idea for how it should work.)

Affordances: gestures are the keyboard shortcuts of touch interfaces, and they can be hard for users to discover. Shortcuts need backup plans – never rely on all users discovering them.

When you see your app being used, look for unsuccessful gesture attempts and repetitive interactions, and pave the cowpaths. (Have Apple never seen users try to swipe the Calendar app to change day?)

Multi-touch (on phones, not iPad) and the shake gesture are generally considered bad.

Don’t get carried away with an impressive, glitzy interface. Showcase the content, not the form. Don’t underestimate the power of the humdrum and familiar – see NY Times and Flipboard apps. Similarly, skeuomorphs, while still window-dressing, can enhance a design; familiarity and intimacy invite touch. But if you’re aping a physical object, choose the right metaphor.

But try to avoid buttons altogether. Buttons are a hack. Look at how a toddler uses an iPhone/iPad – they try to interact directly with the content. Wherever possible, make the content the interface. (Example: see how Twitter for iPad eliminated the Back button.)

Clark signed off on an optimistic note, and an encouragement to experiment: new platforms don’t appear very often. This is the coolest job in the world.

Improving Harvest using seductive interactions

One of my favourite talks at UX London 2010 was Stephen Anderson’s Seductive Interactions – using basic psychological principles to bridge the gap between business goals and users’ behavioural goals. Usability alone, he argues, often merely decreases friction. Using psychology can also increase users’ motivation.

At UXLx, the recent UX conference in Lisbon, I attended Anderson’s workshop on the same topic. By luck, our exercises focused on improving one of my great bugbears, time tracking. And not just any time tracker: Harvest, which we use at Isotoma.

Don’t get me wrong: Harvest is by a long stretch the best time tracker I’ve used. But nevertheless, it remains an activity I hate and avoid, and I routinely fill in my time weeks late.

Applying typical utilitarian thinking would result in improvements like –

  • Increasing the font sizes in Day view
  • Week view should highlight active rows and columns, and keep column headers and footers visible when scrolling

– helpful, but unlikely to change my attitude or behaviour.

In the workshop we used Anderson’s “Mental Notes”, each of which describes a psychological principle designers can take advantage of, and used them to generate ideas to improve Harvest. Here are the relevant notes, and what we came up with:

Trigger / Recognition Over Recall / Feedback Loops

The application should listen and learn from your actions. Just like a good online supermarket will make it easy to re-order the things you frequently buy, Harvest should always automatically show the jobs you worked on previously.

It should also trigger you to take action. At the end of each day, it should pop up a form: “Did you work on these jobs today?” containing only the jobs you had put time on the day before.

A frequent problem is that a job might not exist in the system yet, because a project manager hasn’t created it, or hasn’t added you to it.

Firstly, in the Day and Week views, you should see all the jobs you’ve been added to, not only the ones you’ve chosen to show. (The dropdown menu to add jobs is a terrible interface.)

Secondly, you should be able to create “placeholder” jobs and assign times to them, which you can later reconcile with an actual job. Placeholder times will be used in your own totals, but not in project managers’ reports.
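The placeholder idea can be sketched as a small data model. This is purely illustrative Python: the TimeEntry class and the three functions are hypothetical names, and nothing here reflects how Harvest actually stores time.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class TimeEntry:
    hours: float
    job: Optional[str]                 # None while the real job does not exist yet
    placeholder: Optional[str] = None  # the user's own temporary label


def personal_total(entries: list[TimeEntry]) -> float:
    """Everything the user logged, placeholders included."""
    return sum(e.hours for e in entries)


def manager_report(entries: list[TimeEntry]) -> dict[str, float]:
    """Hours per real job; unreconciled placeholder time is left out."""
    totals: dict[str, float] = {}
    for e in entries:
        if e.job is not None:
            totals[e.job] = totals.get(e.job, 0) + e.hours
    return totals


def reconcile(entries: list[TimeEntry], placeholder: str, job: str) -> None:
    """Attach every entry logged under `placeholder` to the real job."""
    for e in entries:
        if e.job is None and e.placeholder == placeholder:
            e.job = job
            e.placeholder = None


entries = [
    TimeEntry(3.0, "Website redesign"),
    TimeEntry(4.5, None, placeholder="New client kickoff"),
]
print(personal_total(entries))   # 7.5
print(manager_report(entries))   # {'Website redesign': 3.0}
reconcile(entries, "New client kickoff", "Client onboarding")
print(manager_report(entries))   # {'Website redesign': 3.0, 'Client onboarding': 4.5}
```

Your own day adds up straight away, while the project manager's report only changes once the placeholder has been matched to a real job.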

Harvest popup window

Status / Achievements / Competition / Appropriate Challenges

Anderson showed the example of Target’s supermarket checkout interface, which uses a game-like system to encourage cashiers to work more quickly (photo).

Similar principles can be used to encourage workers to fill their timesheets in sooner. Harvest should track how long, on average, it takes you to fill in your timesheet (same day, 1 day late, 7 days late, etc.) This should be shown to you as a chart over time, so you can see your average and your trend.

Then it should also show you the same timeliness charts for your colleagues, and where you rank (without names, as no-one would appreciate the feeling of being singled out.) This will act as a subtle but powerful spur to improve your timeliness. The company can also offer rewards for the most timely, or the greatest improvement.
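As a rough sketch of the timeliness measure, assuming each time entry records both the day worked and the day it was logged (an assumption on my part, as are the function names):

```python
from datetime import date
from statistics import mean


def average_delay(entries: list[tuple[date, date]]) -> float:
    """Mean number of days between the day worked and the day it was logged."""
    return mean((logged - worked).days for worked, logged in entries)


def rank(my_average: float, all_averages: list[float]) -> int:
    """1-based position among colleagues; a lower delay ranks higher."""
    return 1 + sum(1 for other in all_averages if other < my_average)


mine = [
    (date(2010, 5, 3), date(2010, 5, 3)),   # same day
    (date(2010, 5, 4), date(2010, 5, 6)),   # 2 days late
    (date(2010, 5, 5), date(2010, 5, 12)),  # 7 days late
]
my_average = average_delay(mine)
print(my_average)                                 # 3
print(rank(my_average, [0.5, 3, 6.2, 1.0]))       # 3
```

Only the averages are compared, so the ranking can be shown without attaching names to anyone else's figures.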

Set Completion

Highlight it when a day totals up to 7.5h or more (or whatever your daily goal is). Completed days should stand out clearly on the Week view and reports. And how about a “Well done!” whenever you complete a day?
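The check behind such a highlight is tiny. A minimal Python sketch, assuming a week is held as a mapping from day name to the hours logged that day (the data layout and function name are hypothetical):

```python
DAILY_GOAL_HOURS = 7.5  # the figure used above; in practice a per-user setting


def completed_days(week: dict[str, list[float]],
                   goal: float = DAILY_GOAL_HOURS) -> list[str]:
    """Return the days whose logged hours add up to the goal or more."""
    return [day for day, hours in week.items() if sum(hours) >= goal]


week = {"Mon": [3.0, 4.5], "Tue": [2.0, 1.5], "Wed": [8.0]}
print(completed_days(week))  # ['Mon', 'Wed']
```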

Another example: 750 Words encourages writers to stick to a writing schedule by writing only 750 words a day, checking off a box when they’ve done so. (See screenshot here.)

Delighters / Surprise / Humor Effect / Visual Imagery / Self Expression

In a different UXLx talk, Andrew Watterson paraphrased usability pope Don Norman’s adage that users have 2 basic needs: (1) to get something done, and (2) to smile. Adding a touch of humour to an application can go a long way towards making it more pleasurable to use and memorable.

This is why our own Forkd.com uses tomatoes instead of asterisks on forms, amongst many other touches, and I’ve lost count of the number of people who have commented on that.

Forkd.com registration form (excerpt)

Some ideas that cropped up here were for the application to offer occasional comments and quips, e.g. advising users to mind their posture, or go for a walk now and then, or pep talk suggesting how hard-working they are. Ex-colleague Jonathan Baker-Bates designed a custom-built time tracker that allowed employees to add their own quips, which rapidly became an impromptu means of humorous communication and teasing within the company (like the #isotoma IRC channel’s notorious topics, I imagine.)

Another ex-colleague, Karl Sabino, and I came up with the idea of adding “themes” to the time entry interface. Instead of the current businesslike grey-and-orange, you could fill in your times on something resembling a ticking bomb, or a Superbowl scoreboard, or an Indian restaurant menu. To our surprise, it turned out that Domino’s Pizza already had this idea on their patented Pizza Tracker:

Screenshot of Domino’s Pizza Tracker

Many customers found this amusing enough to post videos of it on YouTube.

So there you have it: a fresh approach to a design problem that yielded many results I doubt I’d have come up with otherwise. When I tweeted about the workshop, Harvest immediately responded with interest. I hope you like the results! (And my employers will thank you if you can somehow improve my terrible timesheet habits.)

I’d like to acknowledge the input of Karl Sabino and Jonathan Baker-Bates in several of the ideas above.

Twitter redesign: first impressions

Wow, my last post became obsolete fast. Good to know that the talented designers at Twitter have indeed been working on a root-and-branch revamp of Twitter.com all this time. Sorry if I cast aspersions that you weren’t! Let’s see how they fared against my wishlist:

New home page

1. How is “Profile” a suitable label for your Twitter stream?
Well, you still have to click Profile to see your tweets. But they’re clearly titled “Timeline” now, and this is not the only way to get to it.
The tab is also highlighted when selected.
Score: 7/10

2. Why is there not a decent Profile page?
We still don’t have detailed Profile pages, with more than 140 characters and 1 link, and room for more personal information, account statistics and analysis. Curious to know the rationale for this.
Score: 0/10

3. Why on earth is there only a Tweet box at the top of the Home page?
and
4. As soon as you scroll down, you lose the Tweet box.
These problems have been fixed 100%. Not only is the box above both Home and Profile, but at any point you can invoke a floating, repositionable tweet box from the button in the header.
Score: 10/10, 10/10

New Profile page

5. Inconsistency in the right-hand navigation between the Home page and the Profile page.
Well, the right-hand navigation has been completely changed. These links are now tabs on the Home and Profile pages respectively, and differ between these two pages. At least some thought appears to have gone into which tabs are shown. Would have to see if I find them intuitive. I may still end up wondering how to get to @Mentions while on the Profile page, or where to find things I’ve retweeted. Messages are in a more appropriate location, well separated, in the header.
Score: will have to see.

6. The Home right-hand navigation area is also polluted by, essentially, promotions
They have now been clearly separated from your content, and no longer push some of it down. They still occupy a lot of prime real estate, though.
Score: 8/10

7. The right-hand navigation could generally be used a lot better.
OK, still can’t browse my hashtags. And don’t have totals for @Mentions or Retweets, but that’s not really important. The search box is now in a sensible place, in the header. And they’ve done plenty of other useful things with the right-hand column.
Score: 8/10

8. Only when you’re on the Home page (twitter.com) can you easily see who you’re signed in as
Fixed. This is now in a sensible place, top right.
Score: 10/10

9. Twitter.com does a terrible job of showing you recent activity.
Still terrible. Still no alert for an @Mention. Not even a new DM is highlighted in any way.
Score: 0/10

10. Inability to Search inside your own Twitter stream, or inside someone else’s.
No change here, by the looks of it.
Score: 0/10

11. When you look at your, or someone else’s Profile, Twitter could display much, much more useful information.
Besides room for more personal information, I was hoping for statistics and analysis, such as tweet frequency and tweet type ratios, frequently-used hashtags, time on Twitter, etc. No change here either.
Score: 0/10

12. The “More” page-down experience is atrocious.
OK, AJAX (sort of) to the rescue. The new “infinite scroll” is very impressive and (in my tests) super-fast. However, it still does not allow me to easily make big leaps into the past, the way paging links would’ve. When you navigate away, and click Back, you’re also not (instantly) back to where you were, although it tries to do so and catches up after a few seconds.
Score: 6/10

13. The display of user lists is bad, both in the mini form (avatars in the right-hand column) and the full page listings.
This is also improved by the AJAX infinite scroll, and the layout is improved, but still no sorting, or searching for a name within a list of contacts. Also, the experience of mousing over mini-avatars to see names hasn’t been improved.
Score: 3/10

14. Automatically hyperlink a Tweet ID.
Unsurprisingly, no. (I’d be curious to know whether this is feasible, incidentally.)
Score: 0/10

So, in total, of my 14 criticisms, 7 were completely or mostly corrected, and 6 only imperfectly or not at all. One still needs further testing.

My analysis above looks only at the points I raised in my previous post, and does not take into account the smart new layout and many new features, some of them quite wonderful. I especially like the photos, videos and conversations in the new right-hand pane. (Although the pane often seems to be filled up with not-really-relevant stuff just because there’s room.) I want to experience the beta a bit longer to get a feel for everything that’s new.

Why is Twitter.com so badly designed?

Since I’m always complaining about Twitter.com, it’s just fair for me to list what I think is wrong with it and how it should be improved. Please note: I’m only criticising the Twitter.com website, not the service itself or any other Twitter client.

1. How is “Profile” a suitable label for your Twitter stream? Can you imagine someone talking about their “profile” on Twitter, when they mean their updates? Making it even more confusing, this menu item is not even highlighted after it’s selected.

Twitter Profile page highlighting top navigation

2. For that matter, why is there not a decent Profile page? (In the normal social networking sense of the word.) Tweets are restricted to 140 characters for a good reason; there is no reason to have only a 140 character bio and a single link.

3. Why on earth is there only a Tweet box at the top of the Home page? At the very least, there should be one on the Profile page. Really, you should be able to tweet from any page on Twitter.com.

4. As soon as you scroll down, you lose the Tweet box. So if an update lower down on the page spurred you to write, or if you click Reply, bad luck: you have to lose it from sight in order to write.

5. Look at the inconsistency in the right-hand navigation between the Home page and the Profile page. Home has @replies, DMs, Favorites and Retweets (itself a disaster). Profile has Tweets and Favorites only. Is there any reason for them to be different on your own Profile? (On someone else’s Profile it will be Tweets and Favorites only, of course.) I’ve often been confused at the disappearance of, say, @replies, only to realise it’s because I’m on the Profile, rather than the Home page.

Comparison of right-hand navigation on Twitter Home and Profile pages respectively

6. The Home right-hand navigation area is also polluted by, essentially, promotions, such as “Who to follow” and “Trending”. They are not visually distinguished from real navigation.

7. The right-hand navigation could generally be a lot better. Why not have the hashtags you’ve used, since they’re effectively a way of categorising your tweets? Why not display the total numbers of @replies, Favorites and Retweets (as for DMs)? The Search box is in a silly place, and being directly below your navigation links, implies it’s searching your tweets, not all tweets.

8. Only when you’re on the Home page (twitter.com) can you easily see who you’re signed in as (badge at top right). On the Profile page, only the “That’s you!” under the photo at the top tells you. On other pages, nothing tells you. In a household with multiple Twitter accounts, this is rather frustrating.

9. Twitter.com does a terrible job of showing you recent activity. When you log in to Twitter, you expect bright highlights alerting you of new @replies, if one of your tweets was favorited or retweeted, or if you have new DMs. When done right, this really boosts addictiveness: Facebook and Flickr are just two examples.

10. The inability to Search inside your own Twitter stream, or inside someone else’s, is simply crippling. I assume that this is a technical limitation due to Twitter’s scale, but it should be a top priority. If Google had bought Twitter, I imagine this would’ve been the first thing they’d fix.

11. When you look at someone else’s Profile, Twitter could display much, much more useful information. (Having a real Profile page would help.) Basically, the sort of information that 3rd-party services like MrTweet offer: their tweet frequency, how long they’ve been on Twitter, the nature of their Twitter usage (ratio of posts with links, ratio of posts that are @replies, recurrent hashtags, etc.)

On your own Profile, you’d want to see statistics. Number of views of your stream, number of favourites and retweets, ideally graphed over time (like Flickr does it).

12. The “More” page-down experience is atrocious. It’s slow, you can’t skip by more than a page at a time, so if you want to go back a few days it takes forever. And if you navigate away, and click Back, the entire stream is gone again. Presumably this is another technical limitation.

Ideally, I want actual paging links like on Flickr or Vimeo. (Only for a single person’s stream, not the firehose.)

13. The display of user lists is bad, both in the mini form (avatars in the right-hand column) and the full page listings.

On the mini version, relying on tooltips or the browser status bar to read a person’s name is very user-unfriendly. You want to scrub your mouse over the avatars and easily read the names as you do so.

On the full-page versions, where is the paging? Like the Twitter stream, there’s only a “Next” link. How are you expected to navigate through more than 100 people? They are in no discernible order, and there’s no ability to order them by username, first or last name, or search within the list (like Facebook).

14. An obvious, if geeky, enhancement: automatically hyperlink a Tweet ID. Currently if you want to link to another tweet, you have to use a URL shortener. (A tweet ID is 9 characters shorter than a bit.ly link.)

And here are just some shortcomings that were fixed scandalously late:

  • Native URL shortening
  • The Follow button used to be a gear for ages
  • Emails alerting you to new followers contained no useful info (the follower’s bio was just added a few weeks ago)

I understand Twitter is focusing on growth, and many shortcomings are unavoidable results of its scale (such as Search and proper paging), but they cannot afford to lose sight of the user experience. But mostly I can’t understand how the remarkably talented UX folks they’ve been hiring — people like Kevin Cheng and Doug Bowman — can allow these problems to persist for so long. (While simultaneously rolling out unimpressive features like the infernal hovercards and “Who to follow”.)

Perhaps Twitter’s attitude is to focus on the API, and leave the user experience to third-party services and Twitter clients. That would be a pity: it’s a jungle out there if you’re looking for decent Twitter services. For every decent 3rd-party service or client there’s a plethora of ones that are ramshackle, spammy or downright malware. I also find this explanation implausible, given their investment in UX design talent.

Solving the real Alt-Tab problem

In his latest blog post, Aza Raskin – interface design guru, creative lead on Firefox, and son of one of my heroes, Jef Raskin – tackles one of my oldest bugbears, Alt-Tab. Aza is a clever guy, but I was disappointed that his post addressed an issue I don’t perceive as important, while failing to address what I see as the very real problems with Alt-Tab. But let me start with some background.

(I use both Windows and Mac every day, and in this post tend to use Alt-Tab and Cmd-Tab interchangeably.)

(EDIT: I originally gave the window-switching shortcut as Cmd-\ since I use an external keyboard, but I probably confused Mac users who know it as Cmd-~ (tilde). Updated.)

History (something Windows got right)

Windows has had Alt-Tab since Windows 1.0, although it was only implemented in its familiar visual form in Windows 3.1 (1992). To this day, I know many people who never use the shortcut, but I personally cannot imagine using a multi-tasking operating system without it.

I found its initial absence on Apple Macs unacceptable. In Mac-based studios during the 90s, I always relied on a third-party extension, Task Switcher, to provide the missing functionality.

When Apple finally introduced native Cmd-Tab (in OS 8.5, I think), they at first got it wrong. It cycled alphabetically through running apps, rather than switching between apps on a most-recently-used (MRU) basis like Windows[1], so I had to continue using the extension. They changed it to MRU order in a later OS update. (Thank goodness Microsoft didn’t think to patent it.)

Unfortunately, Macs still suffer from another difference which I’ll come to after the following interlude.

Interlude: Why is most-recently-used (MRU) order better than cycling (and Exposé)?

The first reason is obvious: If Alt-Tab simply cycled through open windows in a fixed sequence (say, alphabetically), it would just require far too much tabbing, on average, to reach the item you wanted.

But Raskin alludes to the more powerful reason: spatial memory. If the shortcut always switches to the most-recently-used (MRU) item, this quickly teaches you to make the switch without thinking or looking at the screen. Spatial memory is awesome because it’s a background faculty, not foreground. Like reaching for your mouse, it does not interrupt your concentration, or where you’re looking on the screen.

This “toggle” behaviour, using a single shortcut key to switch back and forth between two windows only, is worth mentioning as an important feature in itself. It allows the shortcut to be used to compare the contents of two windows, not simply to switch from one to another. (The Undo/Redo shortcut in Photoshop – Cmd-Z for both – also brilliantly uses this principle.)

So, MRU order allows you to switch between two tasks pretty much subconsciously. Personally, I have learned to switch between up to 3 apps without relying on the visual aid (i.e. using spatial memory alone). For switching between more than that I need to look at the interface, but MRU ordering still reduces the number of times I need to Tab.
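For anyone who wants to see the ordering spelled out, here is a minimal Python sketch of MRU switching. The MruSwitcher class is a made-up illustration, not how any operating system implements its switcher.

```python
class MruSwitcher:
    """Keeps items ordered from most to least recently used."""

    def __init__(self, items):
        self._order = list(items)  # index 0 is the item currently in front

    @property
    def current(self):
        return self._order[0]

    def switch(self, tabs: int = 1):
        """Hold the modifier, press Tab `tabs` times, then release."""
        index = tabs % len(self._order)
        chosen = self._order.pop(index)
        self._order.insert(0, chosen)  # the chosen item becomes most recent
        return chosen


switcher = MruSwitcher(["editor", "browser", "mail"])
print(switcher.switch())   # browser
print(switcher.switch())   # editor  (a single Tab toggles between the last two)
print(switcher.switch(2))  # mail    (two Tabs reach the last-but-one item)
```

With a fixed cycling order, the second single press would have moved on to "mail" instead of returning to "editor", which is exactly why the toggle habit cannot form.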

The trouble with Macs

Cmd-Tab on the Mac works less well than on Windows, due to the Mac’s application-centric model, as opposed to the document-centric model of Windows. In Windows, for example, you can Alt-Tab between two emails, or two browser windows; on the Mac you can’t. To switch between application windows on the Mac you have to use a different shortcut, Cmd-~, which again uses cycling rather than MRU order. On top of that, Cmd-Tab on the Mac has the annoying habit of bringing all of an application’s windows to the foreground, often covering up the window you were trying to compare against.

I believe window-switching is closer to the mental model of what this shortcut accomplishes. The clue is in the original feature name: “Task Switcher”. It switches between “things I am doing” – I do not care about which applications they happen to be in. So to see it as an “application switcher” is to miss the point.

It’s also my theory that the deficiencies of task-switching on the Mac spurred the development of Exposé, which I consider a sticking-plaster solution. Exposé is nice, but it does not utilise spatial memory – it forces you to look at the interface.

I admit this is debatable: some people (to my amazement) find application switching on the Mac more natural than window-switching on Windows.

The real problem

As more and more applications adopt a tabbed workspace, Alt-Tab is becoming less useful regardless of which operating system you use. And this is especially serious with browsers, because more and more of our daily tasks happen in browsers nowadays. You’re no longer just “browsing”. Just as often you’re composing documents, managing your calendar, filing bug reports, etc. I often find myself automatically attempting to Alt-Tab between two things I’m doing, but failing because they happen to be in two separate Firefox tabs. And then I have to use the mouse.

In Windows I can improve the situation slightly by opening more browser windows. That way I can use a single window, say, for Google Calendar, one for GMail, one for the blog post I’m writing, a multi-tab one with lots of things I’m reading, etc. On OS X I can’t, since application windows can only be cycled through using Cmd-(Shift)-~.

The keyboard shortcut for switching tabs is usually Ctrl-Tab (on both Mac and Windows), but again, this cycles rather than using MRU. Interestingly, there are a few applications that opted to use MRU with Ctrl-Tab, e.g. oXygen (my favourite HTML editor.) I appreciate it enormously when using the application. Weirdly, Firefox occasionally seems to do this (uses MRU rather than cycling), but this is unreliable and I cannot get it to do so now.

Aza Raskin’s proposal

I’m disappointed that Raskin – evidently a lifelong Mac user – implicitly accepts “application-switching” as the point of the shortcut. As I have tried to explain above, this is fundamentally less useful than task-switching.

In his article he attempts to come up with an improvement to the shortcomings of MRU. (He doesn’t even mention cycling so I assume he is not in favour of it.) MRU’s shortcoming, Raskin says, is that it is only useful for toggling between two things (he says apps, I’d say windows), and frustrates your tendency to form spatial memory habits for more than that. Personally I don’t experience the problem he describes with juggling 3 apps. It’s hard-wired in my spatial memory that Cmd-Tab switches to the last thing, and Cmd-Tab-Tab switches to the last-but-one. For more than this I need to shift my attention to the switcher interface, and spatial memory is no longer of help. But this is infrequent enough not to matter.

He then proposes a “habit-respecting MRU” (HRMRU) to solve this problem I don’t perceive. He ponders using heuristics or even a Markov model to detect users’ habits. Personally I see this failing for the very reasons he himself described – it would just result in a seemingly capricious interface.

But the bigger problem I have with Raskin’s article is that he doesn’t address the real erosion in the usefulness of this shortcut: the loss of MRU due to tabs, and the co-existence of both MRU-ordered switching and cycling. (And the greater problem on Mac OS of having 3 switching modes: apps, windows and tabs, two of which don’t use MRU.)

My proposal

Tabbed interfaces are not going away. They’re a necessary way of managing the ever-increasing number of windows we have to juggle. If I had to Alt-Tab between all of them, it could number over 100. So I think two switching modes are inevitable.

So I propose that only two shortcuts are necessary: Alt-Tab / Cmd-Tab for window-switching, and Ctrl-Tab for tab-switching inside a window. (Or document-switching in applications that don’t use tabs.) Both should work exactly the same way: MRU order, with the addition of Shift to reverse the order. There is no reason for application-switching to exist.
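Here is a rough Python sketch of what I mean by two switching levels that behave identically. All class and method names are invented for illustration; Shift is modelled as stepping through the MRU list from the other end.

```python
class MruList:
    """Items ordered from most to least recently used."""

    def __init__(self, items):
        self.order = list(items)

    def switch(self, presses: int = 1, shift: bool = False):
        step = -presses if shift else presses  # Shift walks the list backwards
        chosen = self.order.pop(step % len(self.order))
        self.order.insert(0, chosen)
        return chosen


class Desktop:
    """Windows in MRU order; each window keeps its own MRU list of tabs."""

    def __init__(self, windows: dict[str, list[str]]):
        self.tabs = {name: MruList(tabs) for name, tabs in windows.items()}
        self.windows = MruList(windows)  # iterating a dict yields window names

    def alt_tab(self, presses: int = 1, shift: bool = False):
        return self.windows.switch(presses, shift)

    def ctrl_tab(self, presses: int = 1, shift: bool = False):
        front = self.windows.order[0]
        return self.tabs[front].switch(presses, shift)


desktop = Desktop({
    "Firefox": ["Calendar", "GMail", "Blog post"],
    "Editor": ["index.html"],
})
print(desktop.ctrl_tab())            # GMail
print(desktop.ctrl_tab())            # Calendar (toggles back, as Alt-Tab would)
print(desktop.alt_tab())             # Editor
print(desktop.alt_tab())             # Firefox
print(desktop.ctrl_tab(shift=True))  # Blog post (least recently used tab)
```

Both shortcuts share one rule, so the habit learned for windows carries over to tabs unchanged.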

This would be a minor change on Windows, but a fundamental one on the Mac. Perhaps, as OS X insists on having 3 switching modes with 3 different shortcut keys, they could at least redefine the second one – Cmd-~ – to be window-switching across all apps (i.e. like Windows) rather than within the current app only. And all should use MRU.

Some people may find MRU order in a tabbed interface confusing, and crave a keyboard shortcut that cycles instead, but then a different application-specific shortcut could always be provided. E.g. Firefox already has Cmd-Alt-Left/Right arrow (Mac) or Ctrl-PgUp/PgDn (Windows). These are more appropriate shortcuts for cycling as their names imply directionality.

  1. Windows actually uses Z-Order, but in practice this generally works like MRU. The behaviour was slightly changed in Vista, but the 6 most recent windows still use MRU order.

How Google will kill Internet Explorer and save the web

Update: This open letter from the EFF to Google makes some of the same points, particularly how Google is probably the one company able to establish an open video standard for the web.

Update 2 (19/5/2010): Steps 1 and 2 in my prediction appear to have happened. Google is open-sourcing VP8 in the hope of making it (in the form of WebM) the standard for internet video. They will transcode all YouTube video to WebM. And Firefox and IE9 (in a half-assed way) have already committed to supporting it.

Update 3 (8/6/2010): Some of the best writing on this topic can be found on Diary Of An x264 Developer by Jason Garrett-Glaser (aka Dark Shikari), especially this and this. In short: he also anticipates the YouTube gambit (“Blitzkrieg”), and would welcome a truly open, patent-free video format for the web to oust Flash, but points out many existing and potential problems with WebM/VP8 that may be its undoing, and does not see H.264 going away. Wait and see, basically.

Update 4 (30/6/2010): YouTube speaks: “While HTML5’s video support enables us to bring most of the content and features of YouTube to computers and other devices that don’t support Flash Player, it does not yet meet all of our needs. Today, Adobe Flash provides the best platform for YouTube’s video distribution requirements, which is why our primary video player is built with it.” So, not soon, anyway.

I’d like to make a prediction. I’m probably not the first to make it, and I may be utterly wrong, but just in case I prove to be right, I’d like to have it on record.[1]

I believe Google is planning to kill off Internet Explorer, within the next two years, and I think they can succeed. By “kill off” I mean turn it from the majority browser into a niche browser (<20% for all versions combined.) I believe the strategy relies on Chrome Frame, YouTube, and HTML5 video using the VP8 format.

The game plan

Step 1. It is rumoured Google will soon open-source the VP8 video compression format by On2 Technologies, whom they bought earlier this year. They’ll do so in the hope that it will become the default video format on the web, over Theora (open but technically inferior) and H.264 (superior but patent-encumbered). If they do so, Mozilla, Webkit and Opera browsers, with their fierce competition and fast update cycles, will likely hedge their bets and quickly add support for VP8, in addition to the formats they already support.

Step 2. Google will transcode all videos on YouTube to VP8 format, and serve this as the default to capable browsers. Converting such a vast amount of video is a monumental task, but Google has the resources to do it.

Step 3. Once the release versions of all the major non-IE browsers are capable of displaying VP8 HTML5 video without a hitch[2], Google will make its final move. Notices will appear on YouTube that they will soon turn off support for Flash, and serve all video as VP8 only. If you use Firefox, Safari or Chrome, you won’t notice a difference. But if you’re using Internet Explorer, not to worry: all you need to do is install a simple plugin: Chrome Frame.

Chrome Frame effectively turns Internet Explorer into Chrome. It still looks like you’re running IE, but the rendering engine has been replaced by Google’s. (Only on request, though: web authors have to explicitly ask for Chrome Frame to be used if available. The rest of the time IE remains unchanged.)

YouTube is Special

YouTube is unique on the web in that pretty much everyone uses it: it is the third most visited site after Google and Yahoo. I would wager that, within a month, some 80% of web users will have visited YouTube, and the vast majority of Internet Explorer users will have installed the plugin they need to continue getting their funny cat fix. Where else would they go? Sure, there are other video sites out there, but none truly compete with YouTube, in terms of volume of content, or audience size.

At the same time, very few people would be able to lambast Google for breaking something that harms their business or access to vital information. Very few people need YouTube, and very few of those will be unable to install the plugin or switch to a different browser.

In a matter of months, the vast majority of IE users will either have switched to a different browser, or installed Chrome Frame, effectively turning IE (on demand) into Chrome. IE’s market share (if you look at the actual rendering engine) will collapse from 55% today[3] to under 20% (and hopefully much lower).

This will reveal Google’s acquisitions of YouTube and On2, and its development of the Chrome browser, as mere components in a masterpiece of long-game strategy. Without every one of these components, each monumental and expensive in itself, the strategy couldn’t succeed. Nobody but Google could’ve done it.

The result

And what a future this will win for the web. Look at this demonstration of the capabilities of HTML5 and CSS3, and imagine a world in which every new website can use every part of it. This could be seen as a massive upgrade for the internet. Imagine not needing to support outdated versions of IE anymore. Only if you had a significant customer base locked in by IT policy, a rapidly dwindling segment, would you still need to support IE.

How will this affect other players? It will be a mortal blow against Adobe, with Flash rapidly losing its hold over internet video over the ensuing months. This would suit Apple just fine, as it is already doing its best to keep Flash off Apple hardware. (Apple is currently putting its weight behind H.264, but that’s simply the best option at the moment.) The Flash plugin will likely remain ubiquitous for a while still, but will be increasingly marginalised, and find its place usurped by JavaScript, Canvas and SVG as support for these open technologies becomes near-universal.

Microsoft can only respond by getting IE users to upgrade to the latest versions as quickly as possible, and by adding support for VP8 video. This will suit Google and the web just fine, since IE9 promises to be on par with the competition in its support for modern web technologies. But Microsoft will no longer be able to act with the hubris of majority market share, and will be forced into a position of playing catch-up to faster-evolving browsers.

For Google, of course, this is essential for its vision of the browser as operating system[4].

  1. I deliberately did not do a web search to check for other articles making the same prediction, as I wanted to think it through for myself.
  2. Here’s a possible weak point in my argument: unlike H.264, VP8 currently does not benefit from hardware acceleration (especially important on mobile platforms). If this proves to be a major factor, the timeframe may need to be longer to allow for the natural cycle of hardware upgrades. (Fortunately this cycle is more rapid for mobile devices.)
  3. http://marketshare.hitslink.com/report.aspx?qprid=3
  4. The front-end of the Internet operating system, that is.