
Wednesday, August 22, 2012

Is Nofollow Standard?

For some reason, I recently wondered if I should be annoyed that Web site owners are generally expected to include the non-standard rel="nofollow" attribute in links in their HTML. The problem with this thought is that calling the attribute “non-standard” is not particularly accurate. It depends on exactly what you mean by “non-standard.”

First, to be absolutely clear, adding rel="nofollow" to links is perfectly acceptable according to the HTML 4.01 standard. It passes the W3C’s validator. Frankly, though, I doubt that many people would lose sleep if it didn’t validate.

With that said, let me explain what I mean by “It depends.” When I first turned to the spec to fact-check myself, I came across the list of link types (which are the predefined values for the rel attribute in a link), and “nofollow” isn’t on it. The “nofollow” value does appear in the spec (scroll to “Robots and the Meta Element”), but in the context of <meta name="robots" content="nofollow">, not the rel attribute. Besides, that section is “informative, not normative.”
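For the curious, here is a minimal sketch of that lookup in Python. The set holds the link types that the HTML 4.01 section on link types predefines; the function name unlisted_link_types is my own invention, not anything from the spec.

```python
# Link types predefined in the HTML 4.01 section on link types.
HTML401_LINK_TYPES = {
    "alternate", "stylesheet", "start", "next", "prev", "contents",
    "index", "glossary", "copyright", "chapter", "section",
    "subsection", "appendix", "help", "bookmark",
}


def unlisted_link_types(rel_value):
    """Return the tokens of a rel value that HTML 4.01 does not predefine.

    Hypothetical helper for illustration only.
    """
    # rel holds a space-separated list of link types; they are
    # compared case-insensitively, so normalize to lowercase first.
    tokens = rel_value.lower().split()
    return [token for token in tokens if token not in HTML401_LINK_TYPES]


print(unlisted_link_types("nofollow"))
print(unlisted_link_types("Next Prev"))
```

The first call reports nofollow as unlisted, and the second reports nothing, since both of its values are in the spec's list.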

On the other hand, there is nothing in the spec that prohibits Web developers from creating their own link types. (Whether it’s actually a good idea is another matter, which isn’t relevant here due to the widespread adoption of nofollow.) So anyone can use rel="nofollow". But there’s a catch. Here's what the section on link types has to say:
Authors may wish to define additional link types not described in this specification. If they do so, they should use a profile to cite the conventions used to define the link types. Please see the profile attribute of the HEAD element for more details.
I have to admit that this is the first I’ve even heard of the profile attribute. Is this just a case of my own ignorance? Well, yes, but then this isn’t the most common attribute. I didn’t see it on Wikipedia, even though Wikipedia uses nofollow. My Blogger blog doesn’t use it, either. (I also didn’t see it on a handful of other high-profile sites I checked, but to my surprise they didn’t use nofollow, either, so I won’t bother listing them.) I did notice it on my WordPress blog, though. But not so fast: In that last case, the profile attribute points to http://gmpg.org/xfn/11, which does not list nofollow.
This isn’t necessarily a problem. The above quotation from the spec says that authors “should use a profile,” and in this context, “should” has a specific meaning. The spec says:
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in [RFC2119]. However, for readability, these words do not appear in all uppercase letters in this specification.
RFC 2119, in turn, has this to say:
3. SHOULD   This word, or the adjective "RECOMMENDED", mean that there
   may exist valid reasons in particular circumstances to ignore a
   particular item, but the full implications must be understood and
   carefully weighed before choosing a different course.
While I’m skeptical as to whether most site owners have actually “carefully weighed” the “full implications” of using nofollow without an accompanying profile, it’s not really that important in the scheme of things. The point is that “should” is a recommendation, not a requirement: there can be “valid reasons” to use a custom link type without a profile, and that means the spec allows nofollow to be implemented in exactly the way I have observed.
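The by-hand check I described earlier (does the page’s head element carry a profile attribute, and do any of its links use nofollow?) can be roughly sketched with Python’s standard html.parser module. The NofollowAudit class, the audit function, and the SAMPLE markup are all made up for illustration; the sample only imitates the WordPress case, where the profile points to http://gmpg.org/xfn/11.

```python
from html.parser import HTMLParser


class NofollowAudit(HTMLParser):
    """Record the head element's profile and any links marked nofollow."""

    def __init__(self):
        super().__init__()
        self.profile = None        # value of <head profile="...">, if any
        self.nofollow_links = []   # href of each link whose rel has nofollow

    def handle_starttag(self, tag, attrs):
        # HTMLParser passes tag and attribute names already lowercased.
        attrs = dict(attrs)
        if tag == "head":
            self.profile = attrs.get("profile")
        elif tag == "a":
            # rel is a space-separated list; a bare attribute yields None.
            rel_tokens = (attrs.get("rel") or "").lower().split()
            if "nofollow" in rel_tokens:
                self.nofollow_links.append(attrs.get("href"))


def audit(html_text):
    """Return (profile, nofollow_links) for a string of HTML."""
    parser = NofollowAudit()
    parser.feed(html_text)
    parser.close()
    return parser.profile, parser.nofollow_links


# Hypothetical page, not copied from any real site.
SAMPLE = """<html><head profile="http://gmpg.org/xfn/11"><title>t</title></head>
<body><a href="http://example.com/" rel="nofollow">a comment link</a>
<a href="/about">About</a></body></html>"""

profile, links = audit(SAMPLE)
print("profile:", profile)
print("nofollow links:", links)
```

A page with nofollow links and no profile at all would come back with None in the first slot, which is the situation I found on most of the sites I looked at.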

All this is just a long-winded way of saying that nofollow is perfectly acceptable according to the standard, and only “non-standard” in the sense of not being explicitly included in the standard itself. The standard absolutely allows for it.

(It’s also in the W3C’s draft HTML5 spec and in WHATWG’s Living Standard, so if you’re the HTML5 type, rel="nofollow" is definitely standard. For now.)

The moral of the story, if there is one, is “Always check your facts before you complain about something.”
