Distributed extensibility
Generativity is the ability of a self-contained system to provide an independent ability to create, generate or produce content without any input from the originators of the system.
The Web is perhaps the most wildly successful generative system
made by humans.
Seriously, think about it: there are at
least 25 billion web pages, authored by millions of people,
only a vanishingly small fraction of whom personally consulted
with Tim Berners-Lee et al. while authoring markup.
But why has the Web been so wildly successful? What provides the Web with this generative character?
- ⌘U (View Source)
- ⌘X (Cut)
- ⌘V (Paste)
- ⌘R (Refresh)
The ability for authors to easily see how other authors accomplish things (⌘U), to easily borrow or steal such techniques (⌘X & ⌘V), and to immediately get feedback about how well their edits work in browsers (⌘R): this is the engine of Web generativity. This is why the Web works. View Source has a posse.
And, of course, HTML—the language of the Web—is
what people cut and paste in the above steps. On disk, HTML
seems to be a relatively simple text format made up of angle
brackets that represent a tree of elements, some with
attributes and content. <a href="http://zombo.com/">zombo com</a>, that sort of thing.
A quick note about XML. XML is also a relatively simple text
format made up of angle brackets that represent a tree of
elements, some with attributes and content.
XML looks so
much like HTML, it’s easy to forget that they’re actually
entirely different things. Your eyes deceive you; don’t
trust them.
HTML is a fairly small language. It has a vocabulary of about 100 elements and about 175 attributes. But what if that’s not enough? How can authors go about expressing novel semantics in HTML?
Adding meaning via class="" and rel=""
One way of extending a language like HTML would be to have a
mechanism for altering the meaning of existing elements. With
such a mechanism, authors could layer additional
semantics on top of the existing language. This can already
be done with several existing features of HTML: the class="", id="", and rel="" attributes, <meta>, etc.
In fact, the practice of using semantic
class names to imbue markup with additional meaning has been
well-established among web authors for many years.
For instance, you might mark up a blogroll within <ul class="blogroll">, and you could mark up normative statements like so: <em class="RFC2119">MUST</em>.
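To make the consumer side of this concrete, here is a minimal sketch of how a tool might pick out elements by semantic class name. It uses Python's standard html.parser module; the names ClassFinder and find_by_class and the sample markup are assumptions made up for illustration, not part of any real tool.

```python
from html.parser import HTMLParser


class ClassFinder(HTMLParser):
    """Record the tag name of every element that carries a given class."""

    def __init__(self, wanted):
        super().__init__()
        self.wanted = wanted
        self.matches = []

    def handle_starttag(self, tag, attrs):
        # class="" holds a space-separated list of names; it may be absent.
        classes = (dict(attrs).get("class") or "").split()
        if self.wanted in classes:
            self.matches.append(tag)


def find_by_class(markup, wanted):
    """Return the tag names of elements whose class list contains `wanted`."""
    finder = ClassFinder(wanted)
    finder.feed(markup)
    finder.close()
    return finder.matches


# Hypothetical sample document, modeled on the two examples in the text.
SAMPLE = (
    '<ul class="blogroll"><li>a blog</li></ul>'
    '<p>You <em class="RFC2119">MUST</em> comply.</p>'
)

print(find_by_class(SAMPLE, "blogroll"))
print(find_by_class(SAMPLE, "RFC2119"))
```

Nothing in the markup has to be declared up front: the consumer only needs to agree with the author on the class name.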
reference disambiguation post
Adding new elements and attributes
OK, enough about fiddling with the meaning of elements and attributes we already have. Using custom elements and custom attributes seems to be what some people really mean when they say “distributed extensibility.”
Of course, such extensibility suffers from at least as many problems as does extensibility via attribute values.
centralized v. distributed
centralized: HTML WG
distributed: ?
distributed extensibility (ISSUE-41)
the promise of XHTML, circa 1998
Distributed extensibility ends up reducing to a power-struggle between browser vendors who want to dictate what the vocabulary of the Web is—and content creators who do not want to cede this right entirely to the browser.
TV Raman, W3C TAG
There is no vast,
browser-wing conspiracy.
Really.
Honest.
<apple>: computer or fruit?
Coping mechanisms
1. punt
<blink> & <marquee>
<canvas>
We’re stuck with them.
2. Sufficiently Interesting(TM) prefixes
<moz-blink>, <ms-marquee>, <apple-canvas>
(Think Python package names)
-moz-border-radius
3. Controlled prefix minting
DNS
(Java package names)
<com.netscape.tags.blink>
<com.microsoft.tags.marquee>
<com.apple.tags.canvas>
somewhat ugly
4. URI-based qualification
foo becomes {http://example.com/ns#, foo}
Names become {uri, term} pairs
Gets ugly FAST
<{http://example.com/ns1#, foo}>
<{http://example.com/ns2#, bar}
{http://example.com/ns3#, baz}="quux"/>
</{http://example.com/ns1#, foo}>
Namespaces in XML
Prefix-based indirection
<foo xmlns="http://example.com/ns1#"
xmlns:alpha="http://example.com/ns2#"
xmlns:beta="http://example.com/ns3#">
<alpha:bar beta:baz="quux"/>
</foo>
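A namespace-aware parser undoes this indirection and hands back the {uri, term} pairs. As a rough sketch, Python's standard xml.etree.ElementTree reports each name as "{uri}term"; the helper name expanded_names is an assumption for illustration.

```python
import xml.etree.ElementTree as ET

DOC = """<foo xmlns="http://example.com/ns1#"
     xmlns:alpha="http://example.com/ns2#"
     xmlns:beta="http://example.com/ns3#">
  <alpha:bar beta:baz="quux"/>
</foo>"""


def expanded_names(xml_text):
    """Return (root tag, child tag, child attributes) with prefixes resolved."""
    root = ET.fromstring(xml_text)
    child = root[0]
    return root.tag, child.tag, dict(child.attrib)


root_tag, child_tag, child_attrs = expanded_names(DOC)
print(root_tag)     # {http://example.com/ns1#}foo
print(child_tag)    # {http://example.com/ns2#}bar
print(child_attrs)  # {'{http://example.com/ns3#}baz': 'quux'}
```

The prefixes alpha and beta do not survive parsing; only the URIs they were bound to do.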
Common case
<foo>
<alpha:bar beta:baz="quux"/>
</foo>
Prefix is ephemeral
<foo xmlns="http://example.com/ns1#"
xmlns:alpha="http://example.com/ns1#">
<bar/>
<alpha:bar/>
</foo>
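A quick way to check that the prefix really is ephemeral: parse the document above and compare the two child elements. This is a sketch using Python's standard xml.etree.ElementTree; the helper name child_names is hypothetical.

```python
import xml.etree.ElementTree as ET

DOC = """<foo xmlns="http://example.com/ns1#"
     xmlns:alpha="http://example.com/ns1#">
  <bar/>
  <alpha:bar/>
</foo>"""


def child_names(xml_text):
    """Return the namespace-resolved names of the root's children."""
    return [child.tag for child in ET.fromstring(xml_text)]


names = child_names(DOC)
print(names)
# Both children resolve to the same {uri, term} pair,
# even though they are spelled differently in the source.
print(names[0] == names[1])
```

To a namespace-aware consumer, <bar/> and <alpha:bar/> are the same element here. To an author reading the source with ⌘U, they look like two different things.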
Lifts part of element name into metadata
Data and metadata
The accuracy of metadata is inversely proportional to the square of the distance between the metadata and the data which it purports to describe.
HTTP headers v. <meta http-equiv="">
document fragments
- ⌘U
- ⌘X & ⌘V
- ⌘R
engine of metacrap
There is no such thing as risk-free copy and paste.
Disambiguation redux
The only risk is copying and pasting them into a document that doesn’t provide namespace definitions for the prefixes[…]
[… or are] you thinking that someone will be using different namespaces but the same prefix? Come on — do you really think that will happen?
Yes. Yes, it will.
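Both risks are easy to reproduce. The sketch below pastes the earlier fragment into three destination documents using Python's standard xml.etree.ElementTree. The <doc> wrapper, the helper name pasted_name, and the http://example.org/other# URI are assumptions invented for the demonstration.

```python
import xml.etree.ElementTree as ET

# The fragment an author copies, minus the declarations it depended on.
FRAGMENT = '<alpha:bar beta:baz="quux"/>'


def pasted_name(doc_open):
    """Paste FRAGMENT into a document and return the pasted element's
    resolved name, or None if the result is no longer well-formed."""
    try:
        root = ET.fromstring(doc_open + FRAGMENT + "</doc>")
    except ET.ParseError:
        return None
    return root[0].tag


# Destination with the same bindings as the source document.
same = ('<doc xmlns:alpha="http://example.com/ns2#" '
        'xmlns:beta="http://example.com/ns3#">')
# Destination with no bindings at all: the prefixes are unbound.
none = "<doc>"
# Destination that already uses the prefix "alpha" for something else.
other = ('<doc xmlns:alpha="http://example.org/other#" '
         'xmlns:beta="http://example.com/ns3#">')

print(pasted_name(same))
print(pasted_name(none))
print(pasted_name(other))
```

The unbound case at least fails loudly. The rebound case parses without complaint and silently changes what the pasted element means.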
SearchMonkey v1: hard-coded prefixes
Google’s handling of RDF blocks for license declarations is all done with regular expressions instead of actually parsing the namespaces[…]
Feed parsing
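To illustrate why hard-coded prefixes go wrong, here is a sketch comparing a regex-based consumer with a namespace-aware one. It is not the code of any of the tools mentioned above; the function names, the two sample documents, and the http://example.org/other# URI are all hypothetical.

```python
import re
import xml.etree.ElementTree as ET

WANTED = "{http://example.com/ns2#}bar"


def naive_match(xml_text):
    """Hard-coded prefix: look for the literal text '<alpha:bar'."""
    return re.search(r"<alpha:bar\b", xml_text) is not None


def namespace_aware_match(xml_text):
    """Resolve prefixes first, then compare {uri, term} pairs."""
    root = ET.fromstring(xml_text)
    return any(el.tag == WANTED for el in root.iter())


# Right namespace, different prefix: the regex misses it.
renamed = '<foo xmlns:a="http://example.com/ns2#"><a:bar/></foo>'
# Familiar prefix, different namespace: the regex is fooled.
rebound = '<foo xmlns:alpha="http://example.org/other#"><alpha:bar/></foo>'

for label, doc in (("renamed", renamed), ("rebound", rebound)):
    print(label, naive_match(doc), namespace_aware_match(doc))
```

The two consumers disagree on both documents, and only the namespace-aware one matches what the markup formally says.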
And frankly, few people will be doing copying and pasting. Most metadata will probably be added either as part of an underlying tool[...]
argumentum ad adminiculum
the tools won’t save us
[E]arly adopters having difficulties is indicative of a feature that will be significantly more problematic when used by the (less well informed/ competent) population of authors as a whole.
1. Adding elements and attributes:
Prefix-based indirection within a URI-based extensibility scheme: benefits not worth the cost
2. Augmenting existing elements and attributes
Profiling is overkill; head@profile is unused in practice by the very communities it was meant for
Sufficiently-Interesting, Google-able, semantic class names appear to get us all the disambiguation we need
Java-style DNS-derived names for the extra-paranoid
Upcoming microdata attributes @item, @itemprop, @subject avoid colliding with class names in the wild
Look ma, no namespaces
This post is a cleaned-up version of a talk I gave at BarCamp San Diego this past spring.
Whether and how HTML5 should allow decentralized parties to create their own languages[…] and exchange them in[…] text/html serializations is being tracked by the HTML WG as ISSUE-41 (decentralized-extensibility).