Marking up RFC 2119 text in HTML
I’ve updated my proposal; see “Revisiting RFC 2119 markup” after reading the below.
Specifications from the IETF and other organizations often make use of RFC 2119’s language for expressing requirements. To use RFC 2119, authors… should incorporate this phrase near the beginning of their document:
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC 2119.
(This is called RFC 2119 boilerplate.)
After including the boilerplate, specifications make a bunch of statements—normative, informative, and definitional—utilizing RFC 2119 vocabulary for the normative parts.
Recently, on the microformats-new mailing list, Dr. Orlovsky asked about creating a microformat for marking up RFC 2119 terms. Scott Reynen thinks that creating a microformat for this would probably be overkill, and that simply authoring your spec as POSHly as possible should suffice. I agree.
So let’s try to figure out what nice, semantic markup for RFC 2119 text should look like. Our goal is two-fold:
1. to mark up the boilerplate text quoted above;
2. to mark up each instance where we use an RFC 2119 word.
Starting with 1, here’s the bare minimum: I’ve wrapped the boilerplate inside a <p> element, and linked to the RFC:
<p>
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL
NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and
"OPTIONAL" in this document are to be interpreted as described in
<a href="http://www.ietf.org/rfc/rfc2119.txt">RFC 2119</a>.
</p>
That link is begging to be souped up with some link relations. HTML 4 defines several link relations, two of which seem relevant in this case:
- Glossary: Refers to a document providing a glossary of terms that pertain to the current document.
- Help: Refers to a document offering help (more information, links to other sources information, etc.)
Of the two, glossary most closely captures the semantics we want, so let’s use it. Note, though, that HTML 5 keeps the help relation but has dropped glossary. (That is, the current version of the draft lacks it; it may include it before publication.) Thus, I’ve placed both glossary and help on the link.
<p>
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL
NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and
"OPTIONAL" in this document are to be interpreted as described in
<a href="http://www.ietf.org/rfc/rfc2119.txt"
rel="help glossary">RFC 2119</a>.
</p>
Next, we should mark up each RFC 2119 term in the boilerplate. We want to say, “look to this paragraph for the meaning of this term.” Brian Suda suggested the use of the <dfn> element, which HTML 4.01 provides for precisely this purpose:
- DFN: Indicates that this is the defining instance of the enclosed term.
HTML 4 gives us very little guidance with regard to <dfn>; the above quote is actually all it has to say on the matter! So how are we to know what the definition for the term is? In Brian’s post, he copied the definitions from RFC 2119 and placed them into title attributes on their <dfn> elements. This is a perfectly reasonable thing to do, given HTML 4, but I think it’s worrisome for two reasons:
- I’d like the definitions themselves to exist only in RFC 2119. When we invoke RFC 2119, we turn over the meanings of these terms to it; semantically, the definitions should be external. DRY and all that. Also, by duplicating its definitions, we run the (admittedly very small) risk of bit-rot.
- While HTML 4 didn’t provide much in the way of useful material on <dfn>, HTML 5 certainly does. HTML 5 defines an algorithm for determining what the term is and what the definition is. Brian’s use of dfn/@title breaks under the HTML 5 algorithm: “If the title attribute of the dfn element is present, then it must only contain the term being defined.” I think it’s worth striving to be both forwards- and backwards-compatible, so putting the definition in dfn/@title seems problematic.
Here’s where we are now:
<p>
The key words "<dfn>MUST</dfn>", "<dfn>MUST NOT</dfn>",
"<dfn>REQUIRED</dfn>", "<dfn>SHALL</dfn>", "<dfn>SHALL NOT</dfn>",
"<dfn>SHOULD</dfn>", "<dfn>SHOULD NOT</dfn>",
"<dfn>RECOMMENDED</dfn>", "<dfn>MAY</dfn>", and "<dfn>OPTIONAL</dfn>"
in this document are to be interpreted as described in <a
href="http://www.ietf.org/rfc/rfc2119.txt"
rel="help glossary">RFC 2119</a>.
</p>
Let’s move on to goal 2: how should we mark up the individual instances of RFC 2119 terms that appear elsewhere in the document?
Here’s how HTML 5 associates terms with <dfn> elements:
Any span, abbr, code, var, samp, or i element that has a non-empty title attribute whose value exactly equals the term of a dfn element in the same document, or which has no title attribute but whose textContent exactly equals the term of a dfn element in the document, and that has no interactive elements or dfn elements either as ancestors or descendants, and has no other elements as ancestors that are themselves matching these conditions, should be presented in such a way that the user can jump from the element to the first dfn element giving the defining instance of that term.
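To make that rule a little more concrete, here is a rough Python sketch of the matching step. It is only one reading of the quoted text, not anything taken from the HTML 5 draft: the names TermCollector and linked_uses are made up for illustration, and the sketch deliberately ignores the conditions about interactive elements, nested dfn elements, and matching ancestors. It also assumes that the term of a dfn is its title attribute when present and its text content otherwise.

```python
from html.parser import HTMLParser

# Elements the quoted draft text lists as candidates for linking to a <dfn>.
CANDIDATES = {"span", "abbr", "code", "var", "samp", "i"}


class TermCollector(HTMLParser):
    """Hypothetical helper: gathers <dfn> terms and candidate elements."""

    def __init__(self):
        super().__init__()
        self.terms = []   # terms defined by <dfn> elements, in document order
        self.uses = []    # (tag, term) for each candidate element
        self._open = []   # tracked open elements: (tag, attrs dict, text parts)

    def handle_starttag(self, tag, attrs):
        if tag == "dfn" or tag in CANDIDATES:
            self._open.append((tag, dict(attrs), []))

    def handle_data(self, data):
        # Text counts toward every tracked element that is still open.
        for _, _, parts in self._open:
            parts.append(data)

    def handle_endtag(self, tag):
        if not self._open or self._open[-1][0] != tag:
            return  # simplification: assumes tracked elements are well nested
        _, attrs, parts = self._open.pop()
        text = "".join(parts)
        if tag == "dfn":
            # Assumption: title, when present, names the term; else the text does.
            self.terms.append(attrs["title"] if "title" in attrs else text)
        elif "title" not in attrs:
            self.uses.append((tag, text))
        elif attrs["title"]:
            self.uses.append((tag, attrs["title"]))
        # A candidate with an empty title matches nothing, per the quoted rule.


def linked_uses(html):
    """Return (tag, term) for candidates whose term exactly equals a <dfn> term."""
    collector = TermCollector()
    collector.feed(html)
    collector.close()
    return [use for use in collector.uses if use[1] in collector.terms]
```

Feeding it a fragment containing a dfn for MUST plus a few candidate elements returns only the candidates whose title or text is exactly MUST. A strong element never matches in this sketch, because it is not in the quoted list.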
Out of those possibilities (the <span>, <abbr>, <code>, <var>, <samp>, or <i> elements), only <span> and <i> are semantically compatible with what we’re trying to do.
I think <strong> would be best for RFC 2119 terms, and have suggested it be added to this list in an email to the WHATWG list. I’ve updated my proposal to use <em>; see “Revisiting RFC 2119 markup.”
If browsers supported HTML 5’s term/definition association algorithm, we’d be done now. However, it sounds like it’d be pretty hard to support the algorithm as specced, so this is likely to change before publication. Let’s help things out a bit with a touch of @class. A nice semantic class name that fits our needs is “defined”. I’ve written up an XMDP for it.
So, summing up, here’s what POSH RFC 2119 use looks like:
<p>
The key words "<dfn>MUST</dfn>", "<dfn>MUST NOT</dfn>",
"<dfn>REQUIRED</dfn>", "<dfn>SHALL</dfn>", "<dfn>SHALL NOT</dfn>",
"<dfn>SHOULD</dfn>", "<dfn>SHOULD NOT</dfn>",
"<dfn>RECOMMENDED</dfn>", "<dfn>MAY</dfn>", and "<dfn>OPTIONAL</dfn>"
in this document are to be interpreted as described in <a
href="http://www.ietf.org/rfc/rfc2119.txt"
rel="help glossary">RFC 2119</a>.
</p>
…
<p>
…
The frob <strong class="defined">MUST</strong> be frobnicated vigorously until done.
…
</p>
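One thing the class name buys us right away is easy tooling. As a sketch only (it is not part of the XMDP or of any spec), the following Python checks that every element carrying class "defined" uses a term that some dfn in the same document defines. The names DefinedChecker and undefined_uses are invented for this example, and it assumes the term of a dfn is simply its text content.

```python
from html.parser import HTMLParser


class DefinedChecker(HTMLParser):
    """Hypothetical helper: records <dfn> terms and class="defined" uses."""

    def __init__(self):
        super().__init__()
        self.terms = set()  # text of every <dfn> element
        self.uses = []      # text of every element carrying class "defined"
        self._open = []     # tracked open elements: (tag, is_dfn, text parts)

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        if tag == "dfn" or "defined" in classes:
            self._open.append((tag, tag == "dfn", []))

    def handle_data(self, data):
        # Text counts toward every tracked element that is still open.
        for _, _, parts in self._open:
            parts.append(data)

    def handle_endtag(self, tag):
        if not self._open or self._open[-1][0] != tag:
            return  # simplification: assumes tracked elements are well nested
        _, is_dfn, parts = self._open.pop()
        text = "".join(parts).strip()
        if is_dfn:
            self.terms.add(text)
        else:
            self.uses.append(text)


def undefined_uses(html):
    """Return the class="defined" terms that have no matching <dfn>."""
    checker = DefinedChecker()
    checker.feed(html)
    checker.close()
    return [term for term in checker.uses if term not in checker.terms]
```

Run against a spec marked up as above, an empty result means every marked-up keyword points back at something the boilerplate defines.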