Theresa O’Connor

HTML5: normativity & authoring guides

I think if the [HTML WG] ends up delivering an authoring guide, it might likely be of more direct value to the wider community of authors and Web developers than anything else we produce.

Michael(tm) Smith, 19 November 2008

The HTML5 spec is big. Really big. When formatted as a PDF for US letter paper, revision 3768 of the spec (the latest when I wrote this) weighs in at 662 pages. Not only is it big, but it’s full of complicated algorithms and other browser implementation details that even standards-aware web authors probably don’t care about. Because of this, there are several efforts—both within the working groups and without—to provide more easily digestible reference material on HTML5 for web authors. For instance:

If you’re a web author with questions about HTML5, I highly recommend taking full advantage of these resources.

Two of the HTML5 Super Friends (Jeffrey Zeldman in HTML5 For Smarties and Eric Meyer in HTML5 And You) have recently called web authors’ attention to what might appear to be another authoring guide effort, Michael(tm) Smith’s draft titled HTML 5: The Markup Language (AKA H:TML). With the extra attention the HTML5 effort is getting due to the Super Friends, I think it’s important to clarify the ways in which H:TML might not be the authoring guide you’re looking for. In fact, it’s not an authoring guide at all.

Both Zeldman and Meyer appear to believe Mike’s draft to be a stripped-down version of the HTML5 spec. Zeldman describes it as an edited presentation of the spec, while Meyer says that it’s a version of the HTML5 draft with[…] implementor sections stripped out. While it’s true that parts of Mike’s document are programmatically derived from the spec, it’s not a stripped-down profile like the authors’ view at all, but a separate spec which sets out to normatively define the elements and attributes of HTML5. Quoting Mike (emphasis mine):

The intent of it is for it to serve as a normative definition of the syntax and structure and semantics of HTML, without attempting to be an detailed authoring guide.

Aside: Whether or not the HTML WG [should] produce a separate document that is a normative language reference is being tracked by the WG as ISSUE-59 (normative-language-reference).

Many object to publishing Mike’s document with normative language in it, including participants from major browser vendors like Mozilla and Apple. Several of these objections are like Henri Sivonen’s: “[The HTML5 spec] already normatively defines the HTML5 markup language. It doesn’t make sense for the working group to compete with itself by publishing two normative documents about the same thing.” Opera’s Lachlan Hunt explained it like so:

This is a problem because it duplicates and restructures a lot of information from the spec itself, but not always by copying it verbatim[…] Since both would be normative, what would happen in the event of a conflict? Although, ideally, there shouldn’t be any [conflicts] by the time they’re finalised, it’s still possible. In fact, it’s happened in one obvious case already. c.f. the repetition template attributes[…]

Håkon Wium Lie of CSS fame thinks this synchronization problem is unsolvable: “I don’t think any amount of QA effort can keep two different documents describing the same matter from being in conflict with each other.”

Don’t get me wrong: there’s a lot to be said for H:TML as a document about HTML5. Many of the folks working on HTML5 support publishing it as an informative syntax guide, including Lachlan Hunt and James Graham from Opera, Henri Sivonen from Mozilla, and Maciej Stachowiak from Apple. Mike himself said that “it’s not outside of the realm of possibility that we could eventually end up deciding that it should only be informative and not normative[…] I suppose that even if we were not to make it normative, this document could also have some value as an informative source.” I should note, however, that it’s very unlikely that Mike would continue to work on the doc were it merely informative.

Personally, I really like Mike’s document; I’ve found it to be useful as a quick reference to HTML5’s elements and attributes (although Simon Pieters’s HTML5 Elements and Attributes might do a better job of that). I can imagine H:TML being a more accessible guide to the language for people (AKA assholes) who are used to reading documents written in the style of previous W3C specs. In fact, I think that might explain the appeal of H:TML to standardistas: they’re accustomed to using normative text (the HTML 4.01 spec) as quick-and-easy reference material, because contemporaneous authoring guides, such as the ubiquitous-in-search-results w3schools, are of such low quality. Unlike in those dark days, we now have several high-quality documents (listed above, and others) that standardistas can use as references while authoring HTML5.

I think it would be a mistake for the WG to publish H:TML with normative text in it. I’d prefer it to be reworked into auxiliary authoring guide material. Specifically, I’d love it if H:TML and Lachy’s authoring guide could be merged somehow. I imagine Mike’s build system could be used to help keep the authoring guide in better sync with Hixie’s frequent spec edits.

In summary, while I applaud the HTML5 Super Friends’ efforts to bring about greater awareness of HTML5 in the web designer and developer communities, I’d prefer it if they recommended Lachy’s web developer’s guide to HTML 5 or the authors’ view of the spec as reference material for standards-aware web authors.