[Bug 6609] New: negative keywords-not meta tags

http://www.w3.org/Bugs/Public/show_bug.cgi?id=6609

           Summary: negative keywords-not meta tags
           Product: HTML WG
           Version: unspecified
          Platform: PC
        OS/Version: All
            Status: NEW
          Severity: enhancement
          Priority: P2
         Component: HTML 5: The Markup Language
        AssignedTo: mike@w3.org
        ReportedBy: Nick_Levinson@yahoo.com
         QAContact: public-html-bugzilla@w3.org
                CC: public-html@w3.org


Rumor to the contrary notwithstanding, keyword meta elements do work, albeit
within limits. I did a test and also found confirmatory recent discussion
online about major search engines.

Insofar as they work, what's needed is a way to clarify relevance to one theme
by distinguishing it from another. Negative keywords would thus be helpful. For
example, a page about "virus" could be about computer viruses or biological viruses
but usually won't be about both. While major search engines may be intelligent
enough to distinguish in that well-known case, new subjects may not be well
known to search engine managers, and thus an author may prefer to control how
their theme is understood from the date of going live. A negative keyword could
quickly clarify the theme of the page.

Using body text may not be adequate. Consider a doctor writing a carefully
exhaustive article about aspirin's less-well-known uses and thus without
discussing headaches, since almost everyone already knows about that use. Being
careful, the doctor writes in the introduction that "the article will not
discuss headaches." Someone does a search for "aspirin NOT headache". They
should get that paper but they do not, because the introduction's own
disclaimer mentions headaches. A negative metatag may aid a search
engine in understanding the doctor's thematic intention and thus in supplying
what a searcher is seeking. Search engine designers would have to do some
careful work to handle the aspirin case as intended but they could do that far
more easily if we page authors have an HTML facility that would give search
engines something to work with.
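
As a rough illustration of the aspirin case, here is a minimal Python sketch of how a search engine might let a keywords-not entry outrank a passing mention in the body text. Everything in it is an assumption made for illustration: the page dictionary, the function names, and the crude prefix matching are hypothetical and do not describe how any real search engine works.

```python
import re


def tokens(text):
    """Lowercase word tokens; a crude stand-in for a real indexer."""
    return re.findall(r"[a-z]+", text.lower())


def mentions(term, words):
    """True if any word starts with term, so 'headache' matches 'headaches'."""
    term = term.lower()
    return any(w.startswith(term) for w in words)


def matches_not_query(page, wanted, unwanted):
    """Evaluate the query `wanted NOT unwanted` against one page.

    `page` is a hypothetical dict with 'body', 'keywords' and 'keywords_not'.
    """
    body_words = tokens(page["body"])
    keywords = [k.lower() for k in page.get("keywords", [])]
    keywords_not = [k.lower() for k in page.get("keywords_not", [])]

    # The page must be about the wanted term at all.
    if not (mentions(wanted, body_words) or mentions(wanted, keywords)):
        return False
    # Assumption: an explicit keywords-not entry outranks a body mention,
    # because the author has declared the page is not about that theme.
    if mentions(unwanted, keywords_not):
        return True
    # Otherwise fall back to plain NOT matching on the body text.
    return not mentions(unwanted, body_words)


doctor_page = {
    "body": "Aspirin has less-well-known uses. "
            "The article will not discuss headaches.",
    "keywords": ["aspirin", "heart", "blood"],
    "keywords_not": ["headache"],
}
same_page_without_hint = dict(doctor_page, keywords_not=[])

print(matches_not_query(doctor_page, "aspirin", "headache"))             # True
print(matches_not_query(same_page_without_hint, "aspirin", "headache"))  # False
```

Without the negative keyword, the disclaimer in the introduction is enough to exclude the page; with it, the page is returned.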

Keyword metatags long ago lost favor after their widespread abuse. However,
they are used by search engines; and I don't see how negative keywords are any
more susceptible to abuse than positive ones. Further, a page author could use
either positive or negative keywords without having to offer both so there'd be
no unwanted increase in the designer's workload. Optimizers could use
essentially the same tools to generate either kind of keyword. The only risk, I
think, is putting a word in both, but that would only be an author's error.
Each search engine could prepare for that eventuality in any way it sees fit,
and editing software and validators could choose to alert an author to the
apparent conflict without requiring the author to change an element. Thus, if
a page author uses the same word in both but with differing case, because one
represents a common product and the other a brand name, the page author would
take the risk of being misunderstood by a search engine, while a search engine
might observe the case distinction and consider how to handle it. The page
author could also use longer phrases, either positively or negatively, and thus
ease distinguishing themes.
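
A validator or editing tool could flag such conflicts with very little code. The following Python sketch is one possible approach, not a proposed requirement: the function names are hypothetical, the sample words are made up, and it assumes the content values are comma-separated as in the existing keywords convention. It reports exact duplicates separately from words that differ only in case, so a tool could warn about the first and merely note the second.

```python
def split_keywords(content):
    """Split a comma-separated content value into trimmed, non-empty keywords."""
    return [k.strip() for k in content.split(",") if k.strip()]


def find_conflicts(keywords, keywords_not):
    """Return (exact, case_only) conflicts between the two content values.

    exact     -- words appearing identically in both lists
    case_only -- (positive, negative) pairs that differ only in letter case
    """
    positive = split_keywords(keywords)
    negative = split_keywords(keywords_not)

    exact = sorted(set(positive) & set(negative))

    # Index the negative keywords by their case-folded form.
    negative_by_fold = {}
    for n in negative:
        negative_by_fold.setdefault(n.casefold(), set()).add(n)

    case_only = sorted(
        (p, n)
        for p in set(positive)
        for n in negative_by_fold.get(p.casefold(), ())
        if p != n
    )
    return exact, case_only


# Made-up sample values: "Apple" as a brand name, "apple" as a common product.
exact, case_only = find_conflicts("aspirin, Apple, heart",
                                  "apple, headache, heart")
print(exact)      # ['heart']
print(case_only)  # [('Apple', 'apple')]
```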

Because of the relevance of Boolean NOT searches, for relative brevity, and to
avoid an abbreviation that may not be familiar to speakers of other languages,
I propose calling it "keywords-not". I will shortly propose it in the
Wiki at http://wiki.whatwg.org/wiki/MetaExtensions. The synonyms I'll list
there do not relate to legacy content, of which I know none, but are what
people would likely think of. I'm preparing to include keywords-not in a
website I'm designing, but I don't know when the site will go live. My method
will probably be to use a separate meta tag following the metatag for keywords
used positively, since they can't be combined into one element, but I see no
reason to require any position other than that both go into the head, as one
tag already must. E.g.,

<head>
. . . . .
<meta name="keywords" content="aspirin,heart,blood" />
<meta name="keywords-not" content="headache" />
. . . . .
</head>
<body>
<h1>Aspirin Except For Headaches</h1>
<p>. . . .</p>
</body>
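
For anyone who wants to experiment with consuming such markup, the following Python sketch reads both meta elements out of a page using only the standard library's html.parser module. The class name KeywordMetaParser is hypothetical, and treating the content value as a comma-separated list is an assumption carried over from the existing keywords convention.

```python
from html.parser import HTMLParser


class KeywordMetaParser(HTMLParser):
    """Collect the values of the keywords and keywords-not meta elements."""

    def __init__(self):
        super().__init__()
        self.found = {"keywords": [], "keywords-not": []}

    def handle_starttag(self, tag, attrs):
        # Also called for self-closing tags such as <meta ... />.
        if tag != "meta":
            return
        attributes = dict(attrs)
        name = (attributes.get("name") or "").lower()
        if name in self.found:
            content = attributes.get("content") or ""
            self.found[name].extend(
                k.strip() for k in content.split(",") if k.strip()
            )


page = """<head>
<meta name="keywords" content="aspirin,heart,blood" />
<meta name="keywords-not" content="headache" />
</head>
<body>
<h1>Aspirin Except For Headaches</h1>
</body>"""

parser = KeywordMetaParser()
parser.feed(page)
parser.close()
print(parser.found)
# {'keywords': ['aspirin', 'heart', 'blood'], 'keywords-not': ['headache']}
```

Because the two elements are read independently, their order within the head does not matter to this sketch.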

This responds to <http://www.w3.org/TR/html5/single-page/>, Working Draft, 12
February 2009. For Bugzilla, I selected all OSes; I develop on Win95a, 98SE,
and Linux and want pages to work on whatever users use.

Thank you.

-- 
Nick



Received on Sunday, 22 February 2009 08:14:02 UTC