The Truth About HTML5 - a review

One of the best things about working with the web is the pace of change, which affords regular opportunities to use elegant new technologies.

But learning new techniques takes time. I'd say that on average I spend a day a week just keeping up with the knowledge I need to do my job. And as web technologies rise and fall it's all too easy to invest time learning new languages, frameworks and apps that end up not going anywhere. Many are rapidly superseded, or, despite everyone's best intentions, turn out to be deeply flawed.

I've been down a few dead ends over the years. If you are of a certain age you'll remember VRML, Virtual Reality Modeling Language, which back in the late 90s promised a brave new world of 3D shopping malls and cityscapes. By 2000 it was all over. I then puzzled over Java applets for a while before adopting Flash, which was in turn superseded by web standards.

Aside from a few other false starts - Dreamweaver Contribute? VBScript? - my radar has gradually improved, and following the occasionally rocky path of web standards and open source content management has proved - more or less - the right way to go.

But lately I've begun to worry that a couple of technologies in which I've invested heavily for some time aren't actually going anywhere.

Two or three years ago I started making extensive use of microformats to help search engines to pick out chunks of content such as events, news entries, reviews and addresses. It's an elegant methodology. But I'm concerned that the launch of the Schema.org microdata initiative, with the support of Google, Microsoft and Yahoo, has effectively killed microformats.

And early last year I switched from HTML4 to HTML5, using as many of the new HTML elements as seemed prudent, including the major new sectioning elements <section>, <article>, <aside> and <nav>. The new tags promised to take some of the guesswork out of web page architecture, and there was a perception that they would render pages more accessible to screen readers and search engines.

I've certainly given it a go but the truth is I haven't found the new tags terribly easy to use, and it's worrying that screen reader support for them is still so dreadful.

Web designers I respect have highlighted all sorts of difficulties - see Roger Johansson's HTML5 sectioning elements, headings, and document outlines - but till now I've followed what seems to be the consensus advice and soldiered on towards the sunlit uplands, with the help of excellent resources such as HTML5 Doctor.

The Truth About HTML5

Last week I finally had an opportunity to read a book that's been on my list for some time, The Truth About HTML5 by Luke Stevens, which has helped put my thoughts in order.

This isn't another HTML5 book that takes the usual line of encouraging us to stick with all aspects of the spec, despite teething problems. It's a no-nonsense polemic, voicing strong opinions on the good and the bad of HTML5, and quite similar in tone to Jeffrey Zeldman's Designing with Web Standards. And crucially, Luke's conclusions are exceptionally well researched. Having read it a couple of times now I'm persuaded his essential arguments are sound. To summarise:

  1. Many of the new HTML5 elements, particularly the sectioning elements, are confusing to implement. They were designed not so much to 'pave the cowpaths' of long-standing markup patterns as to require designers to craft complex document outlines that, because screen readers don't actually support them, are effectively redundant.
  2. The philosophy driving HTML5's expansion of the basic HTML tag set is fundamentally wrong. HTML5 extends the range of tags with the intention of helping designers craft more semantic documents. It's actually better to keep HTML streamlined and instead layer micro-semantics on top of a few simple, easy-to-understand tags.
  3. The Schema.org initiative, supported by Google, Microsoft and Yahoo, has, whatever the rights and wrongs of it, effectively killed well-established micro-semantic methodologies, notably microformats and RDFa.

Luke must be rather weary of reviewers focusing on just the parts of his book concerned with sectioning elements and semantics, because it's actually much wider in scope. There are, for example, extremely informative chapters on the state of HTML audio, video and forms, and the SVG and Canvas frameworks that have been rolled into the HTML5 brand. I learned a lot from these chapters, but the pros and cons of these new technologies have been well covered in other articles and books, and, inevitably, there's not much new Luke can say that hasn't already been said. But much of what he says about HTML5 tags and semantics is new, or at least it hasn't been expressed with anywhere near so much power. So that's what I'm going to concentrate on here.

The HTML5 sectioning elements are not what they seem

Marking up an HTML page has always been a subjective business. Till HTML5 we've had just a few rather open-ended tags to work with, workhorses like <div>, <p> and <li>. Using them to mark up the infinite variety of content web pages can hold involves a fair amount of guesswork.

By expanding the toolbox of elements HTML5 seeks to help designers refine markup by providing new elements matching common markup patterns. But sites such as HTML5 Doctor testify to the difficulty designers have had in working out how to use the new tags appropriately. As Luke argues:

[T]hese new structural tags have created a strange, quasi-religious experience where you have to consult the high priests (the HTML5 gurus) for their interpretation of vague religious texts (the HTML5 spec) just to mark up a darn web page.

Using HTML4 we might mark up a simple page like this:


<body>
 <div class="banner">
  <div>Logo</div>
  <ul>
   <li><a href="">Navigation link 1</a></li>
   <li><a href="">Navigation link 2</a></li>
   <li><a href="">Navigation link 3</a></li>
  </ul>
 </div>
 <div class="content">
  <div class="main">
   <h1>Main heading</h1>
   <p>Page content.</p>
   <h2>Main subheading 1</h2>
   <p>Page content.</p>
   <h2>Main subheading 2</h2>
   <p>Page content.</p>
  </div>
  <div class="sidebar">
   <div class="local-nav">
    <ul>
     <li><a href="">Link 1</a></li>
     <li><a href="">Link 2</a></li>
     <li><a href="">Link 3</a></li>
    </ul>
   </div>
   <div class="feature">
    <h2>Feature heading</h2>
    <p>Feature text.</p>
   </div>
  </div>
 </div>
 <div class="footer">
  <p>Footer text</p>
 </div>
</body>

The markup here indicates the broad sense of the document, but it's rather blunt: the tags don't really communicate much about the content they wrap.

If we take the new HTML5 tags at face value we might think that it would be appropriate to recast along these lines:


<body>
 <header>
  <div>Logo</div>
  <nav>
   <ul>
    <li><a href="">Navigation link 1</a></li>
    <li><a href="">Navigation link 2</a></li>
    <li><a href="">Navigation link 3</a></li>
   </ul>
  </nav>
 </header>
 <div class="content">
  <section class="main">
   <h1>Main heading</h1>
   <p>Page content.</p>
   <h2>Main subheading 1</h2>
   <p>Page content.</p>
   <h2>Main subheading 2</h2>
   <p>Page content.</p>
  </section>
  <aside class="sidebar">
   <nav class="local-nav">
    <ul>
     <li><a href="">Link 1</a></li>
     <li><a href="">Link 2</a></li>
     <li><a href="">Link 3</a></li>
    </ul>
   </nav>
   <section class="feature">
    <h1>Feature heading</h1>
    <p>Feature text.</p>
   </section>
  </aside>
 </div>
 <footer>
  <small>Footer text</small>
 </footer>
</body>

If you haven't studied the HTML5 spec closely you'd be forgiven for thinking that's how the new tags work. Their names suggest uses that correspond closely to long-standing markup patterns.

But the example above is actually completely wrong. If we're going to use HTML5 as intended, the markup would have to be considerably more complicated, a bit like this:


<body>
 <header>
  <h1>Header heading</h1>
  <div>Logo</div>
  <nav>
   <h1>Nav heading</h1>
   <ul>
    <li><a href="">Navigation link 1</a></li>
    <li><a href="">Navigation link 2</a></li>
    <li><a href="">Navigation link 3</a></li>
   </ul>
  </nav>
 </header>
 <div class="content">
  <div class="main">
   <h1>Main heading</h1>
   <p>Page content.</p>
   <h2>Main subheading 1</h2>
   <p>Page content.</p>
   <h2>Main subheading 2</h2>
   <p>Page content.</p>
  </div>
  <aside class="sidebar">
   <h1>Aside heading</h1>
   <nav class="local-nav">
    <h1>Nav heading</h1>
    <ul>
     <li><a href="">Link 1</a></li>
     <li><a href="">Link 2</a></li>
     <li><a href="">Link 3</a></li>
    </ul>
   </nav>
   <section class="feature">
    <h1>Feature heading</h1>
    <p>Feature text.</p>
   </section>
  </aside>
 </div>
 <footer>
  <h1>Footer nav</h1>
  <small>Footer text</small>
 </footer>
</body>

Why? Because the new HTML5 tags should only be used if you understand how to use them to create a coherent document outline.

A document outline describes the deep structure of an HTML document. The outline is important because it's used by screen readers to allow users to move seamlessly between the document sections, for example from primary navigation to main content, main content to sidebar, sidebar to footer, and so on.

It's relatively simple to structure an HTML4 document so as to produce a reasonably coherent outline, but rather harder to structure an HTML5 document appropriately. As Luke notes, nearly every HTML5 document on the web is structured incorrectly and has a broken outline. To structure an HTML5 page correctly it's necessary to get complex rules right, such as:

  • Every sectioning element (<section>, <article>, <nav> and <aside>) must have a heading. That's annoying for designers because we won't want many of those headings to appear. For example, we'll rarely want headings appearing in navigation menus. So it's necessary to hide quite a few headings on every page using CSS.
  • Contrary to almost universal practice <header> and <footer> are not standalone sectioning elements like the other four: they should only be used to define areas within one of those standalone elements. (So my revised HTML5 markup is wrong after all, because I've got the <header> and <footer> elements on a par with the others.)
  • <section> should not be used to indicate the main content of the page, because, according to the spec, anything that isn't in <header>, <footer>, <aside>, or <nav> is by definition primary content. So we don't need an explicit element. Not very intuitive.
  • Use of <section> and <article> is highly ambiguous. Articles can be nested within articles; articles can be broken up by section; a section can be broken up into articles, which can in turn have individual sections.
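
The hidden headings mentioned in the first rule are usually handled with a "visually hidden" utility class. What follows is only a sketch of that common pattern, not something taken from Luke's book, and the class name visually-hidden and the heading text are hypothetical choices made for illustration:

<style>
 /* Hypothetical class name. Keeps the heading available to screen
    readers while removing it from the visual layout. */
 .visually-hidden {
  position: absolute;
  width: 1px;
  height: 1px;
  margin: -1px;
  padding: 0;
  border: 0;
  overflow: hidden;
  clip: rect(0 0 0 0);
 }
</style>
<nav>
 <h1 class="visually-hidden">Site navigation</h1>
 <ul>
  <li><a href="">Navigation link 1</a></li>
  <li><a href="">Navigation link 2</a></li>
 </ul>
</nav>

Note that display: none would not do the job here, because it generally hides the heading from screen readers as well as from sighted users.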

So HTML5 pages are hard to structure. And even if you do get it right it's something of an academic exercise. At present no screen reader supports HTML5 document outlining properly and, given the glacial pace of accessibility software development, none will be doing so any time soon. The latest version of JAWS is alone in offering partial support. One might think that a well marked-up HTML5 document would be easier for search engines to index but, again, as Luke points out, HTML markup doesn't make any difference to real-world search rankings: quality of in-bound links is infinitely more important.

The hard truth that the HTML5 sectioning elements may well actually damage a page's accessibility is, for me, a quite devastating and decisive argument against their use.

Luke notes one other accessibility concern: the new tags are not recognised by Internet Explorer versions 6 to 8. There's a popular polyfill, the HTML5 Shiv, but that still leaves IE 6, 7 and 8 users with JavaScript disabled high and dry. Luke writes:

In 2010, Yahoo published the results of research they did into this very question—how many visitors do have JavaScript disabled? It turns out that 2.06% of visitors hitting Yahoo’s US websites (which includes significant non-US traffic) had JavaScript disabled, as did 1.29% in the UK, 1.46% in France, and 1.28% in Spain. (Brazil was an outlier with just 0.26%.)

Now I personally don't think this is necessarily a showstopper - if the sectioning elements were indeed useful then I think in the interests of progress this is an issue we would just have to cope with: the fact is that JavaScript is becoming more or less a condition of accessing today's web, despite the legitimate concerns of accessibility experts. But it's another problem.
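
For context, the shiv is normally pulled in through an Internet Explorer conditional comment, so that only the affected browsers download it. A minimal sketch, assuming a local copy of the script saved at the hypothetical path js/html5shiv.js:

<head>
 <!-- Only IE 8 and below act on the conditional comment that follows; other browsers treat it as an ordinary comment. -->
 <!--[if lt IE 9]>
  <script src="js/html5shiv.js"></script>
 <![endif]-->
 <style>
  /* Older browsers render unknown elements inline, so declare the new ones block-level. */
  article, aside, footer, header, nav, section { display: block; }
 </style>
</head>

With JavaScript switched off the script never runs, which is exactly the gap described above: IE 6 to 8 then won't apply styles to the new elements at all.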

Keep markup simple, layer semantics on top

I'm persuaded by Luke's argument that the very concept of expanding HTML to improve accessibility and refine semantics is the wrong way to go. If we had a perfect markup language that we could reference for every possible type of web page content, and all designers used it correctly, and all screen readers supported those tags, then a truly semantic, fully accessible web might be conceivable. But that's never going to happen: the HTML spec would balloon, designers would have to be relied upon to implement it correctly, and screen reading software would need to keep pace.

Instead we should persist with a minimal HTML spec consisting of basic tags - much like HTML4 - and add accessibility and semantic value on top of that foundation by using WAI-ARIA landmark roles and micro-semantics.

WAI-ARIA landmark roles

The WAI-ARIA spec offers some extremely useful keywords for indicating document landmarks, including 'banner', 'contentinfo' (equivalent to footer), 'main', 'navigation' and 'complementary' (sidebar), which can be added to basic HTML tags like <div> simply by using the 'role' attribute. As Luke notes:

They work right now in screen readers that support ARIA landmarks, such as JAWS version 10 screen reader, NVDA 2010.1 and VoiceOver on iPhone IOS4+.

So if we take out the HTML5 tags and simply use WAI-ARIA roles instead, we have a much simpler page:


<body>
 <div role="banner">
  <div>Logo</div>
  <ul role="navigation">
   <li><a href="">Navigation link 1</a></li>
   <li><a href="">Navigation link 2</a></li>
   <li><a href="">Navigation link 3</a></li>
  </ul>
 </div>
 <div class="content">
  <div role="main">
   <h1>Main heading</h1>
   <p>Page content.</p>
   <h2>Main subheading 1</h2>
   <p>Page content.</p>
   <h2>Main subheading 2</h2>
   <p>Page content.</p>
  </div>
  <div role="complementary">
   <div role="navigation">
    <ul>
     <li><a href="">Link 1</a></li>
     <li><a href="">Link 2</a></li>
     <li><a href="">Link 3</a></li>
    </ul>
   </div>
   <div class="feature">
    <h2>Feature heading</h2>
    <p>Feature text.</p>
   </div>
  </div>
 </div>
 <div role="contentinfo">
  <p>Footer text</p>
 </div>
</body>

It's much easier to mark up, and, crucially, it's actually accessible.

Micro-semantics

Micro-semantics allow us to refine markup by annotating basic HTML tags with attributes and values that machines can read. I agree with Luke that this is a much more flexible methodology for enhancing semantics than extending HTML.

Over the past few years a couple of lively communities have formed around two approaches to micro-semantics: microformats and RDFa. I've been using microformats myself for quite a while with the intention of making it easier for search engines to pick out reviews, news entries, events and addresses.
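
To make that concrete, here is roughly what a microformats event looks like. This is only a sketch using the hCalendar class names (vevent, summary, dtstart, location); the event itself is invented for illustration:

<div class="vevent">
 <h2 class="summary">Web standards meetup</h2>
 <p>
  <!-- The title attribute carries the machine-readable date. -->
  <abbr class="dtstart" title="2012-06-14">14 June 2012</abbr>,
  <span class="location">Town Hall, Exampleville</span>
 </p>
</div>

The markup is ordinary HTML; the semantics ride entirely on agreed class attribute values and the machine-readable date.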

But, and this is another very important point emphasised in the book, the launch of Schema.org last year rather blows microformats (and RDFa) out of the water:

In mid-2011 a handful of engineers across Google, Microsoft and Yahoo! decided they didn’t like the current, community-driven approaches, and announced they were picking HTML5’s microdata as the winning infrastructure (ie the HTML attributes we should use to add micro-semantic data). And so they released Schema.org — a list of vocabularies, or 'schemas', that the major search engines would use to display richer search results.

It wasn't pretty:

And how did they launch it? With a blog post and a website that had all the pizzazz of a 'My First HTML Page' template knocked up during a hurried lunch break. And they also managed to single-handedly piss off everyone already invested in the process who’ve been evangelizing micro-semantics for years. Not a good start.

The big three have effectively imposed their preferred, in-house solution on the micro-semantics community. But Schema.org microdata looks like the future:

It’s being used right now: Companies like eBay, IMDB, Rotten Tomatoes, and others have implemented Schema.org’s semantics and are benefiting from improved display of their search engine results right now.

Whether we like it or not we need to learn it and start using it. There's now a de facto standard for micro-semantics supported by Google, Microsoft and Yahoo!, and effectively that's the only thing that matters. One of the most valuable characteristics of Luke's analysis is its lack of sentimentality. At the end of the day systems for marking up web pages only matter if the big players support them. In the case of accessibility it's crucial that our markup actually works with the main screen readers. And micro-semantics is only meaningful so far as it's supported by the major search engines.
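
For anyone who hasn't yet looked at it, a Schema.org microdata sketch of a made-up event might look like this. The itemscope, itemtype and itemprop attributes come from HTML5 microdata, and the Event and Place types and their properties from Schema.org; the event details are invented for illustration:

<div itemscope itemtype="http://schema.org/Event">
 <h2 itemprop="name">Web standards meetup</h2>
 <p>
  <!-- The datetime attribute carries the machine-readable date. -->
  <time itemprop="startDate" datetime="2012-06-14">14 June 2012</time>,
  <span itemprop="location" itemscope itemtype="http://schema.org/Place">
   <span itemprop="name">Town Hall, Exampleville</span>
  </span>
 </p>
</div>

Unlike the class-based approaches, microdata uses dedicated attributes, so the annotations can't be confused with styling hooks.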

In conclusion

It's not often you read a web design book quite as opinionated as this. But importantly those opinions are very well researched, and I found most of them convincing. (I recommend reading Bruce Lawson's review on HTML5 Doctor for a respectful but opposing perspective.)

I don't think Luke's book could have been better timed. With the near-simultaneous emergence of HTML5 and the mobile web, designers and developers have had a lot to assimilate and make sense of in the past couple of years. The Truth About HTML5 cuts through much of the confusion and, so far as markup, accessibility and semantics are concerned, suggests a clear path forward:

  • Keep markup simple.
  • Use WAI-ARIA roles.
  • Embrace micro-semantics (which in practice means getting to grips with Schema.org).

Thanks to Luke Stevens for taking the considerable time to research, write and self-publish such a useful book.
