DOM Consistency

by Theresa O’Connor on 8 April 2009

This is in reply to Sam Ruby’s HTML Reunification—go read that first. (When I tried to post this as a comment, his comment system didn’t seem to be working.)

As a web developer, one of the most appealing parts of HTML5 as currently defined (if not the most appealing) is that HTML is a DOM language with two serializations. I don’t have to care whether the browser (or other tool) used an XML parser or an HTML parser, because I can write code that works on the DOM that comes out the other end. This applies both to browser-based scripting and to the other tools I write that process web content (thank you, html5lib).
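To make that concrete, here is a minimal sketch of parser-agnostic DOM code. It uses Python's standard library rather than html5lib so that it runs without extra installs. The `HtmlToDom` builder and the `link_targets` function are hypothetical names chosen for illustration, and `html.parser` is only a tokenizer, not a conforming HTML5 parser, so treat this as an approximation of the idea and not of any real tool:

```python
from html.parser import HTMLParser
from xml.dom import minidom


class HtmlToDom(HTMLParser):
    """Hypothetical helper: build a minidom tree from HTML-syntax input."""

    def __init__(self):
        super().__init__()
        self.doc = minidom.Document()
        self.current = self.doc  # node that new children get appended to

    def handle_starttag(self, tag, attrs):
        el = self.doc.createElement(tag)
        for name, value in attrs:
            el.setAttribute(name, value or "")
        self.current.appendChild(el)
        self.current = el

    def handle_endtag(self, tag):
        # Naive: assumes well-nested input. A real HTML parser recovers from errors.
        if self.current.parentNode is not None:
            self.current = self.current.parentNode

    def handle_data(self, data):
        # Text directly under the document node is not allowed in minidom.
        if self.current is not self.doc:
            self.current.appendChild(self.doc.createTextNode(data))


def parse_html(text):
    builder = HtmlToDom()
    builder.feed(text)
    builder.close()
    return builder.doc


def parse_xml(text):
    return minidom.parseString(text)


def link_targets(doc):
    """Parser-agnostic code: it only ever sees the DOM."""
    return [a.getAttribute("href") for a in doc.getElementsByTagName("a")]


MARKUP = '<p>See <a href="/one">one</a> and <a href="/two">two</a>.</p>'

html_links = link_targets(parse_html(MARKUP))
xml_links = link_targets(parse_xml(MARKUP))
# Both lists are ['/one', '/two'], whichever parser built the tree.
```

`link_targets` never asks which parser produced the tree it was handed, and that is the whole point: as long as the two serializations yield the same DOM, one code path is enough.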

Whenever the DOM differs between the two serializations, web developers suffer: their code grows a separate code path for each parser. From the perspective of someone doing serious DOM scripting in modern web applications, DOM Consistency is probably the most important of the design principles. The extent to which other technologies intended to be part of the web platform strain this principle is the extent to which they are unsuited to be part of the web platform.

Don’t relax the DOM Consistency principle. That would be bad.


  1. [ I believe the comment problem is fixed. Meanwhile, your post has been excerpted so that others can follow the conversation back to this point. ]

    Is there any chance I can talk you into comparing notes with the RDFa and ubiquity-xforms folks?

    As I understand it, they each have infosets which can't be serialized into HTML5 as it is defined today, but can be serialized into XHTML5. Furthermore, that serialization will round trip, in that if it is parsed by either an XHTML5 or an HTML5 parser and then reserialized, you get back the original. Finally, they have libraries which deal with the differences.

    Sam Ruby, 8 April 2009

  2. I've not looked at RDFa libraries, but ubiquity-xforms doesn't seem to deal with the differences beyond a pretty basic level. Compare xmlns:xf vs xmlns:xf2, which should be equivalent in a proper XML-like namespace-aware system. In Opera 9.6, there's no styling in the second example; in Firefox 3 there aren't even any form controls. In IE7 there's no styling in the second example, and in both examples I get a script error and no data in the form fields. So this doesn't look like an adequate demonstration that it's feasible to deal with the differences via scripting. (And this is before thinking about author scripts dynamically modifying the page.)

    Philip Taylor, 8 April 2009

  3. "As I understand it, they each have infosets which can't be serialized into HTML5 as it is defined today, but can be serialized into XHTML5."

    Nope, the opposite.

    ubiquity-xforms has infosets containing elements whose local name is, e.g., "xf:input" and whose namespace is the namespace of HTML elements. Such an infoset is not serializable as XHTML5, because XML doesn't allow the colon in local names.

    Henri Sivonen, 8 April 2009

  4. Henri, you've interpreted my statement in a quite different manner than I intended it. Permit me to clarify: XForms defines elements, such as input in the "" namespace, that cannot be serialized into HTML5 as it is defined today.

    Sam Ruby, 8 April 2009

  5. Sam, your proposal doesn't make it possible to serialize elements from the "" namespace as text/html.

    Henri Sivonen, 10 April 2009

  6. Henri: Agreed. Instead I said "serialization will round trip". Clearly a much lower bar.

    Sam Ruby, 10 April 2009