Defining Meaning and Behavior in a spec

Defining the meaning of a language amounts to asserting that a given token of the language, used in a given context, denotes a given signification, much as a dictionary relates words to definitions. For instance, the HTML 4.01 specification binds the em element to emphasis on the enclosed content.

A specification defines the behavior of agents on a language by describing how an agent is supposed to react to a given token of the language. For instance, the HTML 4.01 specification suggests rendering the content of an em element in italics.
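The distinction between the two can be sketched in code. The following Python sketch is only an illustration, assuming an element such as em: the names MEANING, visual_agent and speech_agent are hypothetical and come from no specification. The italic rendering (shown with asterisks) follows the HTML 4.01 suggestion mentioned above; the speech behavior is an assumption added for contrast.

```python
# Meaning: what a token denotes, independent of any agent (the "dictionary").
MEANING = {"em": "emphasis"}


# Behavior: how a particular kind of agent reacts to that meaning.
# Both agents below are hypothetical illustrations.
def visual_agent(element: str, content: str) -> str:
    """Render emphasized content in italics (represented here by asterisks)."""
    if MEANING.get(element) == "emphasis":
        return f"*{content}*"
    return content


def speech_agent(element: str, content: str) -> str:
    """Mark emphasized content with a stress cue (an assumed behavior)."""
    if MEANING.get(element) == "emphasis":
        return f"[stress] {content}"
    return content


print(visual_agent("em", "important"))
print(speech_agent("em", "important"))
```

Both agents rely on the same meaning and differ only in behavior, which is the re-use that a meaning-first definition is meant to allow.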

Should a spec only define meaning (and why)? When defining behavior, how should this be done?

In short, this page argues that a language specification should define its semantics first, but insists that defining behavior based on these semantics is crucial to interoperability.

A few examples illustrate this opposition between meaning and behavior specifications:

the RDF specifications only define the semantics carried by an RDF document (with an exception when talking about throwing an error in some conditions)

the HTML 4.01 specification defines both semantics and behaviors (not always consistently); e.g. the title attribute has an associated meaning (advisory information about the element for which it is set) and a suggested behavior (tell users about the nature of the linked resource)

the SVG 1.1 specification first defines the meaning attached to the various elements, and then defines requirements for implementations to ensure some common behaviors
(interestingly enough, this is precisely an issue that the WSDL Working Group is working on these days; see a proposal from David Booth on this topic)
Meaning allows wider re-use
When defining a language, some people argue it's better to define the meaning rather than the behavior of the agents "using" an instance of this language. E.g.:
TimBL in the essentials of a specification

the RDF WG as reported in a thread on www-qa-wg
The benefit of defining the meaning without constraining the behavior is that it allows very diverse agents to use the same language.
See TimBL's views on meaning on the Web.
@@@ Integrate ideas from repurposing
Defining behavior increases interoperability
Pitfalls of not defining behaviors at all include reduced interoperability, e.g. in error handling. The general idea is that if a specification does not define behavior, each implementation has to define its own mapping between the meaning and its behavior, and the variation between any two such mappings may imply a lack of interoperability (in the sense of a lack of "substitutability", as in dbooth's doc).
One example of this pitfall was given at the QA Workshop by ThierryKormann, about the danger of implementation-dependent interpretations. Implementation-dependent interpretations occur when something is not defined, for example a default value that has not been set in the specification and for which different user agents choose different, incompatible values [@@@ but is this really meaning vs behavior? In Thierry's example, one can argue that the meaning of passing a NULL parameter to the said function is not defined; the behavior would then be implied by the mapping that the DOM spec has for error handling].
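A minimal sketch of how such a divergence arises, assuming a hypothetical specification that defines a width parameter but says nothing about what happens when it is omitted. The function names and the default values are invented for illustration; they are not taken from the DOM or from any other specification.

```python
from typing import Optional


def agent_a_width(width: Optional[int] = None) -> int:
    # Agent A's implementers chose 0 when the value is missing.
    return 0 if width is None else width


def agent_b_width(width: Optional[int] = None) -> int:
    # Agent B's implementers chose 100 for the same unspecified case.
    return 100 if width is None else width


# Both agents agree wherever the (hypothetical) specification speaks...
assert agent_a_width(40) == agent_b_width(40)

# ...and silently disagree where it does not.
print(agent_a_width(), agent_b_width())
```

Neither agent violates the specification, yet one cannot be substituted for the other without a visible change in behavior.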
Conversely, as the previous section shows, the limitation of defining a specification only in terms of behavior is that it makes it hard to re-use the concepts and conformance terms defined by the spec in contexts that the specification had not foreseen.
So, there is a tension between allowing as wide a range of applications of a language as possible and allowing as good interoperation as possible between these applications. What are the keys to gauge where to draw the line between the two? This is probably a function of use cases, profiling for various needs, etc., but a better refinement, or at least a rule of thumb, would be good to have.
Semantics for documents, behavior for agents?
It seems possible to allow this same diversity by defining the behaviors of agents based on the type of operations they perform (parse, display, create, ...). For instance, XML has a very clear conformance model for two types of parsers (validating and non-validating), which serves as a basis for other conformance models (display through xml-stylesheet, mixed vocabularies through XML namespaces, hypertext through XLink).
So, a possible way to resolve the tension between semantics and behavior in a language could be:
always define the semantics of a language (semantics can also be referred to as the abstract model or abstract data model)

define at least the behavior of the agents for some well-known use cases, so that one can use one implementation in place of another without losing interoperability

try to define behaviors at the highest level possible, that is, applying to the widest set of agents (based on the operations they are supposed to accomplish); then subset this into more refined conformance clauses if needed (see e.g. the SVG 1.1 Conformance clause)

define the behavior of the agents based on the semantics of the language, rather than on the syntax, since that allows more flexibility

anchor the behavior of the agents in a well-defined conformance clause
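The recommendation to base behavior on semantics rather than syntax can be sketched as follows. This is an illustration under stated assumptions, not a real conformance model: the abstract model Emphasis, the two toy syntaxes and the render function are all hypothetical. The behavior is written once against the abstract model, so it applies unchanged to any syntax that maps onto that model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Emphasis:
    """Abstract model (semantics): some text carrying emphasis."""
    text: str


def parse_markup(source: str) -> Emphasis:
    # Toy syntax 1 (hypothetical): <em>text</em>
    prefix, suffix = "<em>", "</em>"
    if not (source.startswith(prefix) and source.endswith(suffix)):
        raise ValueError("not an em element")
    return Emphasis(source[len(prefix):-len(suffix)])


def parse_plain(source: str) -> Emphasis:
    # Toy syntax 2 (hypothetical): _text_
    if len(source) < 2 or source[0] != "_" or source[-1] != "_":
        raise ValueError("not an emphasized span")
    return Emphasis(source[1:-1])


def render(node: Emphasis) -> str:
    # Behavior defined on the semantics, not on either syntax.
    return node.text.upper()


print(render(parse_markup("<em>note</em>")))
print(render(parse_plain("_note_")))
```

Because both parsers produce the same abstract value, a conformance clause written against render would cover documents in either syntax.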
Still an open question: for a semantics-only specification, how do you define conformance? How do you test its implementation?
(this page used to have a long series of examples and discussions developed at the same time as the theory above; now that the theory has stabilized, I've removed them. They may belong to a different topic, but it wasn't obvious which one. --DomHazael)