Theresa O’Connor

Distributed extensibility

Generativity is the ability of a self-contained system to provide an independent ability to create, generate or produce content without any input from the originators of the system. The Web is perhaps the most wildly successful generative system made by humans. Seriously, think about it: there are at least 25 billion web pages, authored by millions of people, only a vanishingly small fraction of whom personally consulted with Tim Berners-Lee et al. while authoring markup.

But why has the Web been so wildly successful? What provides the Web with this generative character?

Authors can easily see how other authors accomplish things (⌘U), easily borrow or steal such techniques (⌘X & ⌘V), and immediately get feedback about how well their edits work in browsers (⌘R). This loop is the engine of Web generativity. This is why the Web works. View Source has a posse.

And, of course, HTML, the language of the Web, is what people cut and paste in the above steps. On disk, HTML seems to be a relatively simple text format made up of angle brackets that represent a tree of elements, some with attributes and content. <a href="http://zombo.com/">zombo com</a>, that sort of thing.

A quick note about XML. XML is also a relatively simple text format made up of angle brackets that represent a tree of elements, some with attributes and content. XML looks so much like HTML, it’s easy to forget that they’re actually entirely different things. Your eyes deceive you; don’t trust them.

HTML is a fairly small language. It has a vocabulary of about 100 elements and about 175 attributes. But what if that’s not enough? How can authors go about expressing novel semantics in HTML?

Adding meaning via class="" and rel=""

One way of extending a language like HTML would be to have a mechanism for altering the meaning of existing elements. With such a mechanism, authors could layer additional semantics on top of the existing language. This can already be done with several existing features of HTML: the class="", id="", and rel="" attributes, <meta>, etc. In fact, the practice of using semantic class names to imbue markup with additional meaning has been well-established among web authors for many years. For instance, you might mark up a blogroll within <ul class="blogroll">, and you could mark up normative statements like so: <em class="RFC2119">MUST</em>.
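
To make the class="" idea concrete, here is a minimal sketch of how a consumer might pick out elements by semantic class name, using Python's standard html.parser module. The ClassNameFinder class, the find_by_class helper, and the sample markup are all made up for illustration; real consumers of such conventions may work quite differently.

```python
from html.parser import HTMLParser


class ClassNameFinder(HTMLParser):
    """Collect the tag names of elements that carry a given class name."""

    def __init__(self, wanted):
        super().__init__()
        self.wanted = wanted
        self.matches = []

    def handle_starttag(self, tag, attrs):
        # class="" holds a space-separated set of tokens, so split it
        # before testing for membership.
        classes = (dict(attrs).get("class") or "").split()
        if self.wanted in classes:
            self.matches.append(tag)


def find_by_class(markup, wanted):
    """Return the tag names of all elements whose class list includes `wanted`."""
    finder = ClassNameFinder(wanted)
    finder.feed(markup)
    finder.close()
    return finder.matches


# Hypothetical sample document, built from the two examples in the text.
SAMPLE = (
    '<ul class="blogroll links"><li>'
    '<a href="http://zombo.com/">zombo com</a></li></ul>'
    '<p>You <em class="RFC2119">MUST</em> do this.</p>'
)

print(find_by_class(SAMPLE, "blogroll"))
print(find_by_class(SAMPLE, "RFC2119"))
```

Nothing in the markup had to be declared or registered anywhere; the extra meaning rides along on attributes HTML already has.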

reference disambiguation post

Adding new elements and attributes

OK, enough about fiddling with the meaning of elements and attributes we already have. Using custom elements and custom attributes seems to be what some people really mean when they say “distributed extensibility.”

Of course, such extensibility suffers from at least as many problems as does extensibility via attribute values.

centralized v. distributed

centralized: HTML WG

distributed: ?

distributed extensibility (ISSUE-41)

the promise of XHTML, circa 1998

Distributed extensibility ends up reducing to a power-struggle between browser vendors who want to dictate what the vocabulary of the Web is—and content creators who do not want to cede this right entirely to the browser.

TV Raman, W3C TAG

There is no vast, browser-wing conspiracy.

Really.



Honest.

<apple>: computer or fruit?

Coping mechanisms

1. punt

<blink> & <marquee>

<canvas>

We’re stuck with them.

Support Existing Content

2. Sufficiently Interesting(TM) prefixes

<moz-blink>, <ms-marquee>, <apple-canvas>

(Think Python package names)

-moz-border-radius

3. Controlled prefix minting

DNS

(Java package names)

<com.netscape.tags.blink>
<com.microsoft.tags.marquee>
<com.apple.tags.canvas>

somewhat ugly
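
As a rough sketch of how such names could be minted mechanically, the following Python reverses a DNS name the way Java package names do. The dns_qualified_name function and its "tags" segment are assumptions taken from the three example names above, not part of any real scheme.

```python
def dns_qualified_name(domain, term, category="tags"):
    """Build a Java-package-style name such as com.netscape.tags.blink.

    `domain` is a DNS name the minting party controls; `term` is the
    local name being qualified. The `category` segment mirrors the
    ".tags." part of the examples and is purely illustrative.
    """
    labels = domain.lower().split(".")
    return ".".join(reversed(labels)) + "." + category + "." + term


for domain, term in [
    ("netscape.com", "blink"),
    ("microsoft.com", "marquee"),
    ("apple.com", "canvas"),
]:
    print("<" + dns_qualified_name(domain, term) + ">")
```

Uniqueness is delegated to whoever controls the domain, which is why no central registry of prefixes is needed.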

4. URI-based qualification

foo

becomes

{http://example.com/ns#, foo}

Names become {uri, term} pairs

Gets ugly FAST

<{http://example.com/ns1#, foo}>
  <{http://example.com/ns2#, bar}
   {http://example.com/ns3#, baz}="quux"/>
</{http://example.com/ns1#, foo}>

Namespaces in XML

Prefix-based indirection

<foo xmlns="http://example.com/ns1#"
     xmlns:alpha="http://example.com/ns2#"
     xmlns:beta="http://example.com/ns3#">
  <alpha:bar beta:baz="quux"/>
</foo>

Common case

<foo>
  <alpha:bar beta:baz="quux"/>
</foo>
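
Assuming the point of this snippet is that the markup usually travels without its xmlns declarations, here is a small sketch of what a namespace-aware XML parser makes of it. It uses Python's standard xml.etree.ElementTree; the parses_as_xml helper is hypothetical.

```python
import xml.etree.ElementTree as ET

# The complete document, with all three namespace declarations.
COMPLETE = """<foo xmlns="http://example.com/ns1#"
     xmlns:alpha="http://example.com/ns2#"
     xmlns:beta="http://example.com/ns3#">
  <alpha:bar beta:baz="quux"/>
</foo>"""

# The same markup as it tends to get copied: declarations left behind.
FRAGMENT = """<foo>
  <alpha:bar beta:baz="quux"/>
</foo>"""


def parses_as_xml(text):
    """Return True if a namespace-aware XML parser accepts `text`."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        # Undeclared prefixes are a fatal error for a namespace-aware parser.
        return False


print(parses_as_xml(COMPLETE))
print(parses_as_xml(FRAGMENT))
```

The fragment is rejected outright because its prefixes are unbound, so the names only mean something in the company of declarations that live somewhere else.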

Prefix is ephemeral

<foo xmlns="http://example.com/ns1#"
     xmlns:alpha="http://example.com/ns1#">
  <bar/>
  <alpha:bar/>
</foo>
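
One way to see this for yourself is to parse the document above and look at the names the parser reports. This sketch uses Python's standard xml.etree.ElementTree, which writes expanded names in a {uri}term form much like the pairs shown earlier; the variable names are mine.

```python
import xml.etree.ElementTree as ET

DOC = """<foo xmlns="http://example.com/ns1#"
     xmlns:alpha="http://example.com/ns1#">
  <bar/>
  <alpha:bar/>
</foo>"""

root = ET.fromstring(DOC)

# ElementTree discards the prefix and keeps only the {uri}local name.
child_tags = [child.tag for child in root]
print(root.tag)
for tag in child_tags:
    print(tag)
```

Both children come back with the same expanded name, {http://example.com/ns1#}bar, even though one was written with a prefix and the other was not.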

Lifts part of element name into metadata

Data and metadata

The accuracy of metadata is inversely proportional to the square of the distance between the metadata and the data which it purports to describe.

Ruby’s Postulate

HTTP headers

v.

<meta http-equiv="">

authoritative metadata

document fragments

  1. ⌘U
  2. ⌘X & ⌘V
  3. ⌘R

engine of metacrap

view source has a posse

There is no such thing as risk-free copy and paste.

Shelley Powers

Disambiguation redux

The only risk is copying and pasting them into a document that doesn’t provide namespace definitions for the prefixes[…]

[… or are] you thinking that someone will be using different namespaces but the same prefix? Come on — do you really think that will happen?

Shelley Powers

Yes. Yes, it will.

morons and assholes

SearchMonkey v1: hard-coded prefixes

Google’s handling of RDF blocks for license declarations is all done with regular expressions instead of actually parsing the namespaces[…]

Ian Hickson
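
To illustrate the failure mode in general terms (this is not a reconstruction of what SearchMonkey or Google actually ran), here is a sketch comparing a consumer that hard-codes a prefix with one that resolves namespaces. The example.com namespace URIs, the license element, and both finder functions are invented for the example.

```python
import re
import xml.etree.ElementTree as ET

LICENSE_NS = "http://example.com/license#"  # hypothetical namespace

# Same namespace, conventional prefix.
DOC_A = '<r xmlns:cc="http://example.com/license#"><cc:license>by</cc:license></r>'
# Same namespace, different prefix.
DOC_B = '<r xmlns:lic="http://example.com/license#"><lic:license>by</lic:license></r>'
# Same prefix, different namespace.
DOC_C = '<r xmlns:cc="http://example.com/other#"><cc:license>nope</cc:license></r>'


def find_with_hardcoded_prefix(text):
    """Match on the literal prefix and ignore what it is bound to."""
    return re.findall(r"<cc:license>(.*?)</cc:license>", text)


def find_with_namespaces(text):
    """Match on the expanded {uri}local name, whatever the prefix."""
    root = ET.fromstring(text)
    return [el.text for el in root.iter("{" + LICENSE_NS + "}license")]


for doc in (DOC_A, DOC_B, DOC_C):
    print(find_with_hardcoded_prefix(doc), find_with_namespaces(doc))
```

The prefix-matching consumer misses the second document and wrongly accepts the third. Once enough consumers behave that way, the prefix is no longer ephemeral in practice, whatever the spec says.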

Feed parsing

And frankly, few people will be doing copying and pasting. Most metadata will probably be added either as part of an underlying tool[…]

Shelley Powers

argumentum ad adminiculum

the tools won’t save us

Hand authoring

[E]arly adopters having difficulties is indicative of a feature that will be significantly more problematic when used by the (less well informed/competent) population of authors as a whole.

James Graham

1. Adding elements and attributes:

require wide (WG) review

Prefix-based indirection within a URI-based extensibility scheme: benefits not worth the cost

2. Augmenting existing elements and attributes

Profiling is overkill; head@profile unused in practice by the very communities it was for

Sufficiently-Interesting, Google-able, semantic class names appear to get us all the disambiguation we need

Java-style DNS-derived names for the extra-paranoid

Upcoming microdata attributes @item, @itemprop, @subject avoid colliding with class names in the wild

Look ma, no namespaces

This post is a cleaned-up version of a talk I gave at BarCamp San Diego this past spring.

Whether and how HTML5 should allow decentralized parties to create their own languages[…] and exchange them in[…] text/html serializations is being tracked by the HTML WG as ISSUE-41 (decentralized-extensibility).

http://lists.w3.org/Archives/Public/public-html/2008Aug/0134.html