Tim Berners-Lee
Date: 2012-10-15, last change: 2013-03-06 22:21:06
Status: personal view only. Based on a message, "Working without being ambushed by Ambiguity", sent to the TAG list on 15 Oct 2012. Editing status: probably has typos.

Up to Design Issues


Working despite Ambiguity

(I guess this is one of these things which is perennial. I have not studied much of the history of philosophy but I do find one needs to be prepared to jump in in order to keep the course of what I otherwise regard as engineering still on track... as I have said before, this is philosophical engineering we are doing...)

The point which David Booth has brought up, not for the first time, and which Pat has expounded very well, that no symbol can ever have completely unambiguous meaning, is, yes, quite valid. There are several such points which we have to go over every now and again (preferably out of the critical path of working group work) and agree we all understand it and agree that we can all continue in practice without it. And indeed continue in theory without it as well. And Pat, you have led us through that journey from philosophical foundationlessness to logical foundations before, and maybe you can help us again or just point to where you did before. And Graham, you make an important distinction.

There are lots of models, I am sure, one can make of ambiguity and language and communication which will allow us to do this, and they may differ in how they work and it probably is best that we agree they exist but not get hung up arguing about which one is "right". They will all be imperfect, but good enough.

Physics Analogy

I have before and will now compare this with classical and quantum physics. We go through our young lives with classical physics, and are taught that a billiard ball has a given diameter, a given mass, and a given position and a velocity, all of which we can measure. We learn how to build houses and drive cars all based on this physics. And then we get older and people tell us that actually a billiard ball does not have a well defined diameter. Not only, if you look closely at its edge, is it a mass of atoms, but also those atoms in fact have only a probability of being in any one place at any one time. And even for the billiard ball itself, if we measure its position too accurately, in principle we can only do it by losing knowledge of its momentum. Now the naively pedantic response may be to insist that everything we learned in Classical Physics be thrown away. This is the response which says that it is no use talking about the position of a ball anyway, as its atoms could in fact just randomly move 3 inches east at the same time. So it is that those who see that in a deep enough analysis almost any given term admits of ambiguity might say that the "Architecture of the WWW" is useless, as it says URIs should only be used to denote one thing.

But in fact we really need to use the physics we have learned. We need to keep all we know about the way billiard balls interact at human scale. Even though we have to be aware of quantum effects every now and again, when we find light being diffracted through a grating instead of being scattered, or electrons tunneling through a thin layer, we have ways of going into the details of the quantum effects where appropriate, and interfacing that thinking with the classical thinking. So it is with denotation by names. We need to keep the models of ambiguity in our back pocket and bring them out when we need them, but not use them to ambush any discussion in the classical form. We should not use them to suggest that any use of the idea of a name having something it denotes is to be thrown away.

OK, so in physics there is maths which allows you to show that in the large scale, the quantum model of the world in fact gives rise, to a very high degree of approximation, to the classical model.

Various Ways of Dealing with Ambiguity

So now how do we construct a practical ability to use terms like "the thing that a string denotes" from the morass of ambiguity which is communication? There are a number of models, none of which is perfect. What have we, then, as examples?

  1. The Authoritative Dictionary model. The guy who puts together the Oxford English Dictionary just knows more than anyone else about how people use words, and we all make sure we use words just as they are described there. If we don't find a use in it we want, we send him a note.

    (This is perhaps the model we have in kindergarten)

  2. The naive "meaning as use" model, sometimes blamed on Wittgenstein. You use terms however you like, as meaning is use, and so you can never be using them inconsistently with their meaning.

    (Sometimes this may be -- who knows -- a response to realizing that model 1 is not perfect)

  3. The Expertise model. The OED applies as above, but also we send lawyers to school for several years to agree on a set of terms which are more closely defined so we can use them in cases where we need unambiguity, like in contracts. To know what something means, ask a lawyer and if necessary go to court to add enough extra definition to be able to continue.

    (Pat describes some of the great lengths to which lawyers sometimes have to go; see his message quoted at the end of this note.)

  4. The Areas of Expertise model. As above, but add in groups of people with expertise in given areas. Ask them to write anything you need in that area, and in court bring them in as expert witnesses.

  5. The Standards Committee model. A committee writes a standard for use in a particular area, using a mixture of words which it feels are well enough defined in models 1, 2 or 3, and terms which it defines specifically locally for its own use within the standard specification. It discusses and ruminates until it feels it has found a set of terms which are all mutually well defined and tight enough to make a standard which people will use without undesirable consequence through misunderstanding. (Not a standard which everyone will understand unambiguously in exactly the same way, note.)

    (From time to time, the group may share its work with others and be horrified to find that, with the now larger community involved, it has to go through much longer discussion and rumination.)

    There is recourse in that others can, while the group is extant in some form, challenge it to resolve perceived ambiguities in the terms it uses or the things it writes.

These are imperfect models in the sense of being very simple approximations to what happens in any case. Books can be (and have been) written about each model, and there are many other models. But the point is that lots of models exist.

A Framework which covers all these approaches

A common facet of all these models is that they do not give complete unambiguity at all, just a good enough definition. "Good enough for government work", as the saying goes, where "government work" is defined within some community of some size (see http://www.w3.org/DesignIssues/Fractal).

We can continue listing these sorts of models. More importantly, we can engineer them. The initial philosophers seemed to treat language as a natural or god-made thing to be investigated, not as an engineered, invented thing, but in fact dictionaries and court procedure and standards bodies are all engineered systems. So we can design the ones we need.

We therefore can improve on these systems, and, given that there is so much violence and counter-productivity in the world and that much of it, one might imagine, stems from misunderstandings of some sort, it may behoove us to improve on them. That said, let's talk about this for URIs and specifically the Semantic Web design.

The Semantic Web meta-model

In a way the semantic web out-metas the model question. By focussing on the interchange of data in a restricted normal form, it can treat mathematically the systems above -- and other systems -- in a logical way impossible with natural language terms.

The semantic web itself is a design, not a philosophical observation about how language works anywhere else.

It decrees that there should be terms defined in the http: URI space, and decrees that the DNS be part of a system of delegation of ownership of each term. (I'm not going to quibble here about whether ownership of terms is delegated within domains.) By realizing that there are many communities of people using all sorts of combinations, and allowing people to create new terms very easily and being able to avoid re-use of the same string, it allows us to set up a system where the participating parties agree:

- The DNS, and further systems within many domains' HTTP spaces, allow a social entity to allocate a name in HTTP space. That social entity is deemed the "Owner" of the name. Ownership is defined
- The network and HTTP allow a machine to look up the name and get information back
- The information you get back provides elucidation in two forms: in natural language (with various models of ambiguity relief) and in logic (where the core terms, such as the syntax of Turtle and rdf:type, are defined in model 5 by the W3C working groups etc).

Everyone who uses the semantic web has to then sign up to this meta-model, though they can pick and choose among the models above.

Importantly, implicit or explicit in the information which is returned is information about which model is used to relieve ambiguity.
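The agreement described above can be sketched in a few lines of Python. This is only an illustration under loud assumptions: the term URI under example.org, the TermDescription record and the in-memory FAKE_WEB table (standing in for a real HTTP lookup) are all invented here, and taking the DNS host name as the owner ignores any further delegation within a domain.

```python
from dataclasses import dataclass
from urllib.parse import urldefrag, urlsplit


@dataclass
class TermDescription:
    """What a lookup of a term hands back (a hypothetical shape)."""
    comment: str  # elucidation in natural language
    logic: list   # elucidation in logic, as (subject, predicate, object) triples
    model: str    # which model is used to relieve ambiguity


# Stand-in for the Web: document URI -> {term URI: description}.
# A real agent would do an HTTP GET instead; everything here is invented.
FAKE_WEB = {
    "http://example.org/vocab/events": {
        "http://example.org/vocab/events#Seminar": TermDescription(
            comment="A meeting at which a small group discusses a topic.",
            logic=[(
                "http://example.org/vocab/events#Seminar",
                "http://www.w3.org/2000/01/rdf-schema#subClassOf",
                "http://example.org/vocab/events#Event",
            )],
            model="standards committee",
        ),
    },
}


def owner_of(term_uri: str) -> str:
    """Approximate the owner of a term by the DNS host name in its URI."""
    return urlsplit(term_uri).hostname


def look_up(term_uri: str) -> TermDescription:
    """Strip the fragment, 'fetch' the document, and pick out the term."""
    document_uri, _fragment = urldefrag(term_uri)
    return FAKE_WEB[document_uri][term_uri]


term = "http://example.org/vocab/events#Seminar"
description = look_up(term)
print(owner_of(term), "|", description.model, "|", description.comment)
```

The point of the sketch is the shape of what comes back: prose for people, logic for programs, and a statement of which disambiguation model stands behind the term.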

So the crucial design, then, is that when one agent sends another a message, that agent will pick a set of terms which have different owners, who operate or curate different vocabularies using different models from those above, or indeed combinations of models and new models.

The vocabularies are picked so that the disambiguation is good enough. Good enough for the situation, for the sending and receiving agent.
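To make the idea of one message drawing on several vocabularies concrete, here is a rough Python sketch that groups the terms of a message by the namespace they come from. The message, the example.org URIs and the rule for splitting a URI into namespace and local name are assumptions made for illustration; the RDF and Dublin Core URIs are the real ones. Whether each vocabulary is good enough for the situation remains a judgment for the agents, not something this code decides.

```python
from collections import defaultdict

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def vocabulary_of(term_uri):
    """Namespace part of a term URI: up to '#', else up to the last '/'."""
    if "#" in term_uri:
        return term_uri.split("#", 1)[0] + "#"
    return term_uri.rsplit("/", 1)[0] + "/"


def terms_used(message):
    """Collect the vocabulary terms of a message: predicates and classes."""
    terms = set()
    for _subject, predicate, obj in message:
        terms.add(predicate)
        if predicate == RDF_TYPE:
            terms.add(obj)  # the object of rdf:type is a class, hence a term
    return terms


def terms_by_vocabulary(message):
    """Group the terms of a message by the vocabulary they belong to."""
    groups = defaultdict(set)
    for term in terms_used(message):
        groups[vocabulary_of(term)].add(term)
    return dict(groups)


# A hypothetical message about a seminar, as (subject, predicate, object).
message = [
    ("http://example.org/talks/42", RDF_TYPE,
     "http://example.org/vocab/events#Seminar"),
    ("http://example.org/talks/42", "http://purl.org/dc/terms/title",
     "Working despite Ambiguity"),
]

for vocabulary, terms in sorted(terms_by_vocabulary(message).items()):
    print(vocabulary, sorted(terms))
```

Even this two-statement message relies on three vocabularies, each with its own owner and its own way of relieving ambiguity.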

(We tend to call the information which we get back over HTTP the definition of the term. Well, we would except that we would be ambushed by people who want to use the word "definition" specifically for a definition using one or other particular model).

Of course in parallel with the actual looking up of stuff on the web, also people share understandings over beers in bars as they always have done, but the semantic web linked data system is cool in two ways: Firstly, it instantiates the models of disambiguation providing a way to "look up the meaning" of something without having to have a notion that meaning is unambiguous. Secondly, it gives us the ability to write programs to help us, because of the logic interchanged. That's really handy.

Now we have to, mainly, get on with the business of building systems, but we have to be aware of when the ambiguity case arises. We need, in our discussions, to have things to point people to so that naive pedantic arguments don't derail perfectly good discussion and logic based on the idea that names denote things. But we need to be aware of when the pedantry is appropriate, and have avenues ready to go down.

Example 1

In our semantic web based world, when you are using a form, you may fill in details about, say, a seminar you are organizing, and generally the prompts on the form allow you to fill in things like "Date", "Start time" and "End time" without likely damage due to misunderstanding. If you have to choose in a pull-down menu whether to categorize it as a talk or a class or a seminar or a concert, you might be more puzzled, but a good app will pull in comments from the ontology when you hover over it uncertainly, giving you enough additional detail to make your decision. You can maybe even click off and follow a link to bring up the detailed information from the ontology, and also you can search for members of each subclass, to see what existing things have been categorized each way, and so on. So a user can well use the meaning lookup system and resolve the meaning well enough.
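The hover behaviour might look roughly like the following Python sketch. The ontology under example.org and its comments are invented, and a real app would load the triples from the Web rather than from a list; rdfs:label and rdfs:comment are the real RDF Schema properties.

```python
RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"
RDFS_COMMENT = "http://www.w3.org/2000/01/rdf-schema#comment"
EX = "http://example.org/vocab/events#"  # hypothetical ontology namespace

# Invented ontology content, as (subject, predicate, object) triples.
ONTOLOGY = [
    (EX + "Talk", RDFS_LABEL, "Talk"),
    (EX + "Talk", RDFS_COMMENT, "One speaker presents to an audience."),
    (EX + "Seminar", RDFS_LABEL, "Seminar"),
    (EX + "Seminar", RDFS_COMMENT, "A small group discusses a topic."),
    (EX + "Concert", RDFS_LABEL, "Concert"),
]


def value_of(subject, predicate, triples):
    """Return the first object stated for (subject, predicate), or None."""
    for s, p, o in triples:
        if s == subject and p == predicate:
            return o
    return None


def hover_text(class_uri, triples=ONTOLOGY):
    """Text to show when the user hovers uncertainly over a menu entry."""
    label = value_of(class_uri, RDFS_LABEL, triples) or class_uri
    comment = value_of(class_uri, RDFS_COMMENT, triples)
    if comment is None:
        # Nothing published: fall back to pointing at the term itself.
        return f"{label} (no description published; see {class_uri})"
    return f"{label}: {comment}"


for name in ("Talk", "Seminar", "Concert"):
    print(hover_text(EX + name))
```

When the ontology says nothing, the fallback still gives the user the URI to follow, which is the "click off and follow a link" path.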

Example 2

Consider now the person who is creating the form. Each time they add a field, they will hopefully pick an associated property for it. And hopefully they will pick a property from an existing ontology which will give it wide interoperability. You want the events defined by users of the form to appear on people's calendars, for example, and feeds of upcoming talks. So at this point the user as form creator is more aware of the different organizations, and the different disambiguation models, which apply to each. The user will at this point quite likely pick a number of very standard terms, a few from other ontologies, and then be stuck and have to make up a few properties. This is when the system needs ideally to be able to give the user a feel for the cost of getting others to agree on the ontology, and of keeping it up.
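A form-building tool could separate the fields that reuse existing properties from those that need newly made-up ones, roughly as in this Python sketch. The lookup table, the local namespace under example.org and the naming rule for new properties are assumptions for illustration; the three Dublin Core URIs are real.

```python
# Field labels the tool already knows how to map (a hypothetical table).
KNOWN_PROPERTIES = {
    "title": "http://purl.org/dc/terms/title",
    "date": "http://purl.org/dc/terms/date",
    "description": "http://purl.org/dc/terms/description",
}

# Hypothetical namespace owned by the form creator, for made-up properties.
LOCAL_NAMESPACE = "http://example.org/vocab/myform#"


def property_for(field_label):
    """Return (property URI, is_new) for a form field label."""
    key = field_label.strip().lower()
    if key in KNOWN_PROPERTIES:
        return KNOWN_PROPERTIES[key], False
    # No existing term found: make one up in the creator's own namespace.
    return LOCAL_NAMESPACE + "_".join(key.split()), True


def plan_form(field_labels):
    """Map every field to a property and list the ones needing agreement."""
    mapping = {}
    needs_agreement = []
    for label in field_labels:
        uri, is_new = property_for(label)
        mapping[label] = uri
        if is_new:
            needs_agreement.append(label)
    return mapping, needs_agreement


mapping, needs_agreement = plan_form(["Title", "Date", "Room booking code"])
print(mapping)
print("New terms to discuss with others:", needs_agreement)
```

The needs_agreement list is where the social cost shows up: each entry is a term that someone will have to document, maintain and get others to adopt.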

This is where there should be buttons to invite comments and buttons to form a group, and buttons to allow one to ask another group to collaborate, and so on. And depending on the sort of group formed and the sorts of groups to be collaborated with, the social processes will be of all kinds.

End of examples.

So we can build systems which instantiate and enhance the social processes which we use to resolve ambiguity.

So yes, there are many times when the way the semantic web resolves ambiguity is enough for us to be able to talk about names having a single thing they denote, and even having a definition.

And we understand the extent to which that breaks and where it affects us and we have a task of creating systems (technical and social) which behave appropriately and allow us to agree enough on the meaning of old terms and new ones to be able to collaborate better and better.

But right now these social systems are in place in various forms so we need not be ambushed by the many rat-holes around this, some of which need to be charted and left rarely visited.


Pat Hayes, message: "This is not the forum to argue cases in philosophy of language, but for a practical everyday argument, just look at what happens to English prose when there is a real need to pin down meanings and referents with enough precision to survive many subsequent readings by a variety of readers: legal contracts, diplomatic communications, statements of regulations and laws and patent applications. Hardly any word is simply used: almost all of them are given explicit definitions, often with special 'guard' language, for example explicitly denying what might have been normal or intuitive understandings of the words. Often, special words are used which have exactly defined meanings in special contexts, like all the latin stuff on legal prose. This is not normal language: to even use it requires specialized training, and it is dangerous for non-experts to attempt, exactly because it has hard unambiguous meanings which are independent of context."


Tim BL