
Talks by W3C Speakers (2007)

Many in the W3C community — including staff, chairs, and Member representatives — present W3C work at conferences and other events. Below you will find a list of some of these talks. All material is copyright of the author, except where otherwise noted.

January 2007

February 2007

March 2007

April 2007

  • 2007-04-03 (3 APR)
  • 2007-04-11 (11 APR)

    Markup Languages and Schema Languages for Linguistic, Textual, Documentary Resources

    by Michael Sperberg-McQueen

    Datenstrukturen für linguistische Ressourcen und ihre Anwendungen (GLDV Frühjahrstagung)
    (Data structures for linguistic resources and their applications)

    Tübingen, Germany

    Relevant technology area: XML Core Technology.

    Abstract:

    Markup Languages and Schema Languages for Linguistic, Textual, Documentary Resources

    C. M. Sperberg-McQueen

    This paper will consider design issues in the construction of schemas and schema languages for textual resources intended for linguistic computing, computational linguistics, and computer philology. The emphasis will be on SGML and XML vocabularies and schema languages for specifying them, with occasional reference to other systems.

    Like any good metalanguage, a good schema language must support good design at the language level. Good language design practices should be encouraged, bad practices should be discouraged or (if the metalanguage designer is ambitious) made impossible. (As Orwell writes, "The Revolution will be complete when the language is perfect.") And to be useful, the metalanguage must allow the language designer to express their design decisions, preferably clearly, preferably concisely.

    Some design issues of importance for markup languages will be outlined.

    Over- and under-generation

In the ideal case, the schema for a language provides a formal recognition criterion which recognizes every sequence which we wish to accept as a sentence in our language, and does not recognize any other sequence. In less ideal cases, it may be necessary to live with some discrepancy between the language as we imagine it and the formal definition we work with. Is it better to under-generate? Then we can be sure that every sequence recognized by the schema is truly acceptable, at the cost of having some intuitively plausible utterances fail to be recognized by the schema. Or is it better to over-generate? Then every acceptable sequence will be recognized, as will some number of nonsensical, unacceptable sequences. Which is preferable depends on the purpose of the schema: schemas serving as a contract between data producers and data exchange partners have one role; schemas used primarily to provide automatic annotation of the data have another; schemas which express our understanding of a corpus, in the form of a document grammar, have yet another. The notions of descriptive and prescriptive grammar also play a role.
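The trade-off can be made concrete by treating a content model as a regular expression over child-element names, which is essentially what DTD-style grammars do. A minimal sketch (the element names and the toy "dictionary entry" structure are invented for illustration, not taken from any real schema):

```python
import re

# Content models in DTD-like grammars are regular expressions over
# child-element names. Child sequences are encoded here as strings of
# names; the 'dictionary entry' structure is invented for illustration.
strict = re.compile(r"headword (pron )?(sense )+")  # under-generates
loose = re.compile(r"((headword|pron|sense) )+")    # over-generates

plausible = "headword pron pron sense "  # an entry with two pronunciations
nonsense = "sense headword "             # a sense before the headword

assert strict.fullmatch(nonsense) is None      # good: rejects nonsense
assert strict.fullmatch(plausible) is None     # bad: rejects a real entry
assert loose.fullmatch(plausible) is not None  # good: accepts the entry
assert loose.fullmatch(nonsense) is not None   # bad: accepts nonsense
```

The strict grammar captures the dominant regularity but rejects a plausible entry with two pronunciations; the loose grammar accepts everything in the corpus, including orderings no lexicographer would sanction.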

    Concrete vs. abstract structures

The feel of a markup language depends, more than anything else, on the designer's choice of element types. Will there be chapter, section, and subsection elements, or a single generic 'div' element with an attribute to distinguish the kind of textual division involved? Some aspects of this choice are obvious. Will element types be chosen to reflect typographic distinctions? Rhetorical and compositional distinctions? Linguistic phenomena? Equally important - and far more difficult to resolve satisfactorily - is the desire to capture both concrete details of the document (leading often to fine-grained distinctions among element types) and regularities visible only at a more abstract level. If the markup language provides a wide variety of phrase-level element types (as conventional document-oriented languages often do), how can we capture generalizations true for all phrase-level types (e.g., in a stylesheet, or in a scholarly annotation)? If the markup language were to provide only a single phrase-level element (with an attribute, perhaps, to allow us to distinguish different kinds of phrases), then such generalizations would be easier to capture. But the details of the text would be somewhat more cumbersome to capture. The choice of concrete or abstract structures has serious implications for validation of the data, at least with current validation technologies. Microformats, as currently used in some HTML, provide a useful concrete illustration both of the design issues involved and of the validation issues.

    Ontological commitments

    One of the issues most keenly felt by some designers and users of markup languages is that of ontological commitment. Providing names for things can be, and usually is, interpreted as entailing a claim that the things named actually exist, or can exist. It is not always easy to reach agreement, within a design team, about the nature of the ontological commitment involved in defining a particular element type, or a particular attribute value. And vocabularies intended for wide use must reckon with the possibility that different members of the target user community will have different and conflicting ontological leanings; sometimes the ontological commitments of a vocabulary are left intentionally vague.

    Variability in the material

    When existing material is digitized, an interesting pattern of variability in the material is sometimes found. In a given dictionary, for example, or in a collection of dictionaries, most articles may follow a fairly simple pattern; some will be more complex; a few will be simply anomalous. What should the schema author do? We can write a document grammar that captures the regularities in the vast majority of cases, at the cost of declaring some small portion of the material invalid. We can write a more forgiving document grammar that accepts everything in the corpus, at the expense of failing to capture the regularities which dominate the material in practice; the problems of over- and under-generation recur here in different guise.

    Data Structures

    SGML and XML are readily interpreted as describing trees; other markup systems are most conveniently understood as serializations of other data structures. What is to be done when the 'natural' data structure for our material doesn't seem to match the data structure of the markup system? Also - can we perform schema validation without trees? Is it possible for a schema to be incorrect? Is it desirable for it to be falsifiable in principle? Some errors of schema design are worth noting and warning against:

    1. the Waterloo error (extreme over-generation; may take the form of deciding not to define a schema at all and relying instead only on the material being well-formed)
    2. the tag-everything error (systematic overkill in the vocabulary, often proceeding from a desire to "tag everything that might be important", lest useful information be lost through not being marked up; reflects a failure to engage with the practical impossibility of marking everything)
    3. the insignificant-order error (a technical issue involving the interleave operator of some schema languages)

Design issues at the language level are only half the problem, though. There are also design issues at the metalanguage level. Metalanguage designers continually trade off expressive power against tractability of validation and other processes. Convenience features for schema authors compete for attention with the simplicity and regularity that make a schema language easier to implement. Should the schema language (and by extension most schema-informed processes) be monolithic or modular? If modular, do the modules form a sequence of layers, or are the interactions more complex? How does one best serve the maintainability of the schema? What operations on schemas would it be useful to support? How should the schema language go about supporting openness and extensibility in schema-defined vocabularies? How do we support extensibility in the schema vocabulary itself? Examples will be drawn largely from the experience of the last decade in the design, implementation, and use of XML Schema 1.0 and 1.1.

  • 2007-04-12 (12 APR)

    Web Accessibility – It’s not Magic, it’s Art

    by Shadi Abou-Zahra

    Relevant technology area: Web Design and Applications.

  • 2007-04-14 (14 APR)
  • 2007-04-19 (19 APR)

    Semantic Web: Anspruch und Wirklichkeit (Semantic Web: Claims and reality)

    by Klaus Birkenbihl and Ivan Herman, in cooperation with the Germany and Austria Office

    Handlungsschemata als Grundlage visueller und begrifflicher Strukturierung in der Wissensrepräsentation
    (Action schemata as a basis for visual and conceptual structuring in knowledge representation)

    Paderborn, Germany

    Relevant technology area: Semantic Web.

  • 2007-04-20 (20 APR)

    Ubiquitous Web Applications

    by Dave Raggett

    W3C Japan Members Meeting

    Tokyo, Japan

  • 2007-04-20 (20 APR)

    How the W3C Process got its Stripes

    by Dan Connolly

  • 2007-04-23 (23 APR)

    Introduction to the Semantic Web (tutorial)

    by Ivan Herman

    Semantic Days 2007

    Stavanger, Norway

    Relevant technology area: Semantic Web.

  • 2007-04-24 (24 APR)

    Informationsstandarder - mervärde eller förutsättning (Information standards - added value or prerequisite)

    by Olle Olsson

    Workshop Informationsstandarder och rättslig informationsförsörjning
    (Workshop on information standards and the provision of legal information)

    Stockholm, Sweden

    Abstract:
    The presentation provides an overview of the concept of standards for information representation and access. The objective is to highlight the value of using standards, what kinds of standards one might encounter, costs and benefits of standards, and how standards can contribute to strategic business objectives.
  • 2007-04-24 (24 APR)

    State of the Semantic Web

    by Ivan Herman

    Semantic Days 2007

    Stavanger, Norway

    Relevant technology area: Semantic Web.

  • 2007-04-29 (29 APR)

    W3C and A2K (panel)

    by Daniel Dardailler

    Yale A2K Conference

    New Haven, USA

    Abstract:
    A general presentation of W3C with a focus on our efforts toward access to knowledge.

May 2007

June 2007

July 2007

  • 2007-07-04 (4 JUL)
    Abstract:
    W3C has rules and tools in place to standardize in an open and transparent way.
  • 2007-07-11 (11 JUL)

    Making Sense of Language Identification: How changes in ISO 639 and IETF BCP 47 affect language tagging and selection

    by Addison Phillips

    Relevant technology area: Web Design and Applications.

    Abstract:
    The recent addition of ISO 639-3, the reorganization of ISO 639 in general, and the adoption of RFC 4646 and RFC 4647 as the latest version of IETF BCP 47 (language tagging and language tag matching) have resulted in a proliferation of new choices for identifying the language of content and for selecting content according to language. This presentation, by one of the editors of BCP 47, details what the changes are, what options are available, how to work with the new standards, and how to choose among them for your application.
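The matching side of BCP 47 can be illustrated with RFC 4647's "basic filtering" rule: a language range matches a tag when the range is "*", equal to the tag, or a prefix of the tag ending at a subtag boundary. A small sketch, with an invented content inventory (note that "de" does not match "gsw", a known limitation of prefix-based filtering):

```python
def basic_filter(language_range: str, tag: str) -> bool:
    """RFC 4647 'basic filtering': a language range matches a tag when
    the range is '*', equals the tag, or is a prefix of the tag ending
    at a subtag ('-') boundary. Comparison is case-insensitive."""
    r, t = language_range.lower(), tag.lower()
    return r == "*" or t == r or t.startswith(r + "-")

# Illustrative tag inventory, not taken from the talk itself.
tags = ["de", "de-CH", "de-CH-1996", "gsw", "en-GB"]

assert [t for t in tags if basic_filter("de", t)] == ["de", "de-CH", "de-CH-1996"]
assert [t for t in tags if basic_filter("de-CH", t)] == ["de-CH", "de-CH-1996"]
assert [t for t in tags if basic_filter("*", t)] == tags
```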
  • 2007-07-15 (15 JUL)

    (lunchtime talk)

    by Tatsuya Hagino

    Web標準の日々
    (The Days of Web Standards 2007)

    Akihabara, Tokyo, Japan

    Abstract:
    Have you been talking about Web standards without really knowing much about the World Wide Web Consortium (W3C)? You can use Web standards without knowing W3C. But the international standardization of Web technology is not something handled only by a handful of standardization experts; rather, it is something that Web engineers and Web designers around the world should know about. In this session, a member of the W3C Team answers the question: what is the W3C? By learning how W3C came to be founded, the ways to participate, and W3C's standardization process, you should naturally come to see the real meaning of the Web standards called "W3C Recommendations" and the significance of taking part in the international standardization of Web technology.
  • 2007-07-15 (15 JUL) to 2007-07-16 (16 JUL)

    (booth)

    by Yasuyuki Hirakawa

    Web標準の日々
    (The Days of Web Standards 2007)

    Akihabara, Tokyo, Japan

  • 2007-07-15 (15 JUL)

    従うのは面倒だがWeb標準はイケてる (通訳付) (Conformance is boring, but Web standards are cool! With interpretation)

    by Karl Dubost and Olivier Thereaux

    Web標準の日々
    (The Days of Web Standards 2007)

    Akihabara, Tokyo, Japan

    Relevant technology area: Web Design and Applications.

    Abstract:

    When developing a Web site, it is common to rely on checking conformance to standards only at the end of the creation and development process. If errors have piled up by then, bringing the site into conformance can require a daunting amount of rework, and Web standards come to be perceived as a burden. During this session, W3C Team members will focus on how to introduce quality into your Web projects from the start, presenting practical techniques and available tools that will also help you build cool Web sites.

  • 2007-07-17 (17 JUL)

    W3C Richtlinien für das Mobile Web (Webcast)

    by Philipp Hoschka

    Relevant technology area: Web of Devices.

    Abstract:
    W3C organizes a free webinar in German entitled "W3C Richtlinien für das Mobile Web" on Tuesday 17 July 2007. Philipp Hoschka will show how you can benefit from the expertise collected through the documents and tools provided by the W3C Mobile Web Best Practices Working Group. Registration is now open.
  • 2007-07-20 (20 JUL)

    Offene Standards im Internet (Open Standards in the Internet)

    by Felix Sasaki, in cooperation with the Germany and Austria Office

    Zentrum für Medien und Interaktivität (Center for Media and Interactivity), Justus-Liebig-Universität Gießen

    Gießen, Germany

    Relevant technology areas: Web Design and Applications and Web of Services.

  • 2007-07-23 (23 JUL)

    Das Neueste vom W3C Web Accessibility Initiative (WAI) (The Latest from the W3C Web Accessibility Initiative (WAI))

    by Shadi Abou-Zahra

    2. Accessibility-Stammtisch
    (2nd Accessibility Get-Together)

    Vienna, Austria

    Relevant technology area: Web Design and Applications.

  • 2007-07-25 (25 JUL)

    W3C and MWI

    by Daniel Dardailler, in cooperation with the Israel Office

    Relevant technology area: Web of Devices.

  • 2007-07-25 (25 JUL)

    Microformats: what are they, and why should we use them?

    by Dan Connolly

    XML Summer School

    Oxford, United Kingdom

    Abstract:

    Copying yet another soccer schedule or flight itinerary into a computer's calendar by hand, one field at a time, will eventually drive anyone insane. The Web made exchanging documents easier, but there's been little progress for data.

    There is hope - with the emerging hCard and hCalendar microformats, data can flow seamlessly from web pages into my calendar and contact tools. The trick is to encode the data in HTML, using the class attribute to say what it is. But wait... why encode this in HTML? Why not use an XML vocabulary for contacts and calendar information? Or Semantic Web technologies like RDF and the Web Ontology Language (OWL)?

    In fact, all of these have been tried. In this session, we'll explore what works and why, looking at both the social and the technical factors that will determine what we use in the future and how we use it.
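As a rough sketch of the hCard idea (plain HTML plus class attributes carrying the vocabulary), the following uses only Python's standard html.parser to pull a name (fn) and email out of a snippet. The markup and the address are invented for illustration, and a real consumer would handle nested elements and multiple cards:

```python
from html.parser import HTMLParser

# An invented hCard snippet: ordinary HTML, with the microformat
# vocabulary carried entirely by class attribute values.
HCARD = """
<div class="vcard">
  <span class="fn">Dan Connolly</span>
  <a class="email" href="mailto:connolly@example.org">mail</a>
</div>
"""

class HCardExtractor(HTMLParser):
    """Minimal hCard reader: collects 'fn' text and 'email' mailto hrefs."""
    def __init__(self):
        super().__init__()
        self.card = {}
        self._in_fn = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        classes = attrs.get("class", "").split()
        if "fn" in classes:
            self._in_fn = True
        if "email" in classes and attrs.get("href", "").startswith("mailto:"):
            self.card["email"] = attrs["href"][len("mailto:"):]

    def handle_data(self, data):
        if self._in_fn:          # the text content of the 'fn' element
            self.card["fn"] = data
            self._in_fn = False

parser = HCardExtractor()
parser.feed(HCARD)
assert parser.card == {"fn": "Dan Connolly", "email": "connolly@example.org"}
```

The point of the exercise is that the page remains ordinary, renderable HTML; the calendar or contact tool simply reads the agreed class names.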

August 2007

  • 2007-08-07 (7 AUG)

    Advanced approaches to XML document validation

    by Jirka Kosek and Petr Nalevka

    Extreme Markup Languages

    Montréal, Canada

    Relevant technology area: XML Core Technology.

    Abstract:
    Relaxed is an open-source automated tool that provides support for validation of XML documents using predefined or custom compound languages. For example, it can validate documents that use combinations such as XHTML 1.0 + MathML 2.0 + SVG 1.1, testing for conformance with constraints expressed in Relax NG, Schematron, and other schema languages, under the control of NVDL (Namespace-based Validation Dispatching Language). In addition to the maintenance of these schemas, the Relaxed project also includes an extensible validation engine written in Java. Examples that demonstrate the usefulness and practicality of combining multiple kinds of validation and constraint expressions in a convenient automated framework are discussed.
  • 2007-08-07 (7 AUG)

    Representation of overlapping structures

    by Michael Sperberg-McQueen

    Extreme Markup Languages

    Montréal, Canada

    Relevant technology area: XML Core Technology.

    Abstract:
    Markup of overlapping structures is, depending on your point of view, either a perpetual hot topic or a trivial edge case in the study of markup. Starting from the belief that overlapping structures are not just common but important, not only in the analysis of literary works but also in the management of changing content, we explore ways to represent overlapping structures in tractable ways. The requirements for representing overlapping structures differ from those for storing simple tree-structured information. An exploration of these requirements is followed by a detailed description of one way to represent GODDAG structures in relational form.
  • 2007-08-07 (7 AUG)

    Writing an XSLT optimizer in XSLT

    by Michael Kay

    Extreme Markup Languages

    Montréal, Canada

    Relevant technology areas: XML Core Technology and Web Design and Applications.

    Abstract:
    In principle, XSLT is ideally suited to the task of writing an XSLT or XQuery optimizer. After all, optimizers consist of a set of rules for rewriting a tree representation of the query or stylesheet, and XSLT is specifically designed as a language for rule-based tree rewriting. The paper illustrates how the abstract syntax tree representing a query or stylesheet can be expressed as an XML data structure making it amenable to XSLT processing, and shows how a selection of rewrites can be programmed in XSLT. The key question determining whether the approach is viable in practice is performance. Some simple measurements suffice to demonstrate that there is a significant performance penalty, but not an insurmountable one: further work is needed to see whether it can be reduced to an acceptable level.
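The core idea, rule-based rewriting of an expression tree serialized as XML, can be sketched outside XSLT as well. The following Python fragment applies one classic rewrite, count(E) > 0 becomes exists(E), to a toy AST; the element names are invented and do not reflect any real processor's internal format:

```python
import xml.etree.ElementTree as ET

# Toy abstract syntax tree for the expression count(item) > 0,
# serialized as XML; element names are invented for illustration.
ast = ET.fromstring(
    '<gt><count><path step="item"/></count><literal value="0"/></gt>'
)

def rewrite(node):
    """Apply one rewrite rule bottom-up:
    count(E) > 0  =>  exists(E), which avoids counting a whole sequence."""
    for i, child in enumerate(list(node)):
        node[i] = rewrite(child)          # rewrite subtrees first
    if (node.tag == "gt" and len(node) == 2
            and node[0].tag == "count"
            and node[1].tag == "literal" and node[1].get("value") == "0"):
        replacement = ET.Element("exists")
        replacement.extend(list(node[0]))  # keep count()'s argument
        return replacement
    return node

optimized = rewrite(ast)
assert optimized.tag == "exists"
assert [child.tag for child in optimized] == ["path"]
```

In XSLT itself, each such rule would be a template matching the pattern to rewrite, with an identity template copying everything else through unchanged.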
  • 2007-08-08 (8 AUG)

    Streaming validation of schemata: The lazy typing discipline

    by Paolo Marinelli, Fabio Vitali, and Stefano Zacchiroli

    Extreme Markup Languages

    Montréal, Canada

    Relevant technology area: XML Core Technology.

    Abstract:
    Assertions, identity constraints, and conditional type assignments are (planned) features of XML Schema which rely on XPath evaluation. The XPath subset exploitable in those features is limited, for several reasons, including (apparently) a desire to avoid buffering in the evaluation of an expression. We divide XPath into subsets with varying streamability characteristics. We also identify the larger XPath subset which is compatible with the typing discipline we believe underlies some of the choices currently present in the XML Schema specification. Such a discipline requires that the type of an element be decided when its start tag is encountered and its validity when its end tag is encountered. An alternative “lazy typing” discipline is proposed in which both type assignment and validity assessment are fired as soon as they are available. Our approach is more flexible, giving schema authors control over the trade-off between using larger XPath subsets (and thus increasing buffering requirements) and expeditiousness.
  • 2007-08-08 (8 AUG)

    Characterizing XQuery implementations: Categories and key features

    by Liam Quin

    Extreme Markup Languages

    Montréal, Canada

    Relevant technology area: XML Core Technology.

    Abstract:
    XQuery 1.0 was published as a W3C Recommendation in January 2007, and there are fifty or more XQuery implementations. The XQuery Public Web page at W3C lists them but gives little or no guidance about choosing among them. The author proposes a simple ontology (taxonomy) to characterize XQuery implementations based on emergent patterns of the features appearing in implementations and suggests some ways to choose among those implementations. The result is a clearer view of how XQuery is being used and also provides insights that will help in designing system architectures that incorporate XQuery engines. Although specific products are not endorsed in this paper, actual examples are given. With XML in use in places as diverse as automobile engines and encyclopedias, the most important part of investigating an XML tool’s suitability to task is often the tool’s intended usage environment. It is not unreasonable to suppose that most XQuery implementations are useful for something. Let's see!
  • 2007-08-08 (8 AUG)

    Localization of schema languages

    by Felix Sasaki

    Extreme Markup Languages

    Montréal, Canada

    Relevant technology area: Web Design and Applications.

    Abstract:
    Internationalization is the process of making a product ready for global use. Localization is the adaptation of a product to a specific locale (e.g., country, region, or market). Localization of XML schemas (XSD, DTD, Relax NG) can include translation of element and attribute names, modification of data types, and content or locale-specific modifications such as currency and dates. Combining the TEI ODD (One Document Does it all) approach for renaming and adaptation of documentation, the Common Locale Data Repository (CLDR) for the modification of data types, and the new Internationalization Tag Set (W3C 2007), the authors have produced an implementation that will take as input a schema without any localization and some external localization parameters (such as the locale, the schema language, any localization annotations, and the CLDR data) and produce a localized schema for XSD and Relax NG. For a DTD, the implementation produces a Schematron document for validation of the modified data types that can be used with a separate renaming stylesheet to generate a localized DTD.
  • 2007-08-09 (9 AUG)

    Mind the Gap: Seeking holes in the markup-related standards suite (panel)

    by Chris Lilley, James David Mason, and Mary McRae

    Extreme Markup Languages

    Montréal, Canada

    Relevant technology area: XML Core Technology.

    Abstract:
    The XML 1.0 specification was admired for many things, among them its simplicity and brevity. The XML spec has since been joined by many other associated specifications and standards, not all of them simple, few of them short. There are specifications for vocabularies, constraint languages, transformation and formatting languages, data models and functions, packaging and encrypting specifications, and for a wide variety of other things one can do with, to, or about XML. Many of us think there are too many XML-related specifications. However, we continue to identify holes in the markup-related suite of standards: areas in which new specifications would be useful. In this session the audience will suggest such areas to representatives or members of several organizations that develop and/or promulgate XML-related specifications. This will be an information gathering activity, not an evaluative process; all suggestions and heresies welcome.
  • 2007-08-09 (9 AUG)

    Converting into pattern-based schemas: A formal approach

    by Fabio Vitali, Antonina Dattolo, Angelo Di Iorio, Silvia Duca, and Antonio Angelo Feliziani

    Extreme Markup Languages

    Montréal, Canada

    Relevant technology area: XML Core Technology.

    Abstract:
    A traditional distinction among markup languages is how descriptive or prescriptive they are. We identify six levels along the descriptive/prescriptive spectrum. Schemas at a specific level of descriptiveness that we call "Descriptive No Order" (DNO) specify a list of allowable elements, their number and requiredness, but do not impose any order upon them. We have defined a pattern-based model based on a set of named patterns, each of which is an object and its composition rule (content model); we show that any schema can be converted into a pattern-based schema without loss of information at the DNO level. We present a formal analysis of lossless conversions of arbitrary schemas as a demonstration of the correctness and completeness of our pattern model. Although all examples are given in DTD syntax, the results should apply equally to XSD, Relax NG, or other schema languages.
  • 2007-08-10 (10 AUG)

    Declarative specification of XML document fixup (panel)

    by Henry Thompson

    Extreme Markup Languages

    Montréal, Canada

    Relevant technology areas: XML Core Technology and Web Design and Applications.

    Abstract:
    The historical and social complications of the development of the HTML family of languages defy easy analysis. In the recent discussion of the future of the family, one question has stood out: should ‘the next HTML’ have a schema or indeed any form of formal definition? One major constituency has vocally rejected the use of any form of schema, maintaining that the current behavior of deployed HTML browsers cannot usefully be described in any declarative notation. But a declarative approach, based on the Tag Soup work of John Cowan, proves capable of specifying the repair of ill-formed HTML and XHTML in a way that approximates the behavior of existing HTML browsers. A prototype implementation named PYXup demonstrates the capability; it operates on the PYX output produced by the Tag Soup scanner and fixes up well-formedness errors and some structural problems commonly found in HTML in the wild based on an easily understood declarative specification.
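For readers unfamiliar with PYX, it is a line-oriented rendering of the parse-event stream: '(' marks a start tag, 'A' an attribute, '-' character data, and ')' an end tag. A minimal sketch that emits PYX from well-formed XML using only the standard library (PYXup itself consumes PYX produced by the Tag Soup scanner, which repairs ill-formed input first; this sketch assumes well-formed input):

```python
import xml.parsers.expat

def to_pyx(xml_text: str) -> str:
    """Emit PYX, the line-oriented event format PYXup consumes:
    '(' start tag, 'A' attribute, '-' character data, ')' end tag."""
    out = []
    parser = xml.parsers.expat.ParserCreate()

    def start(name, attrs):
        out.append("(" + name)
        for key, value in attrs.items():
            out.append("A" + key + " " + value)

    parser.StartElementHandler = start
    parser.EndElementHandler = lambda name: out.append(")" + name)
    parser.CharacterDataHandler = (
        lambda data: out.append("-" + data.replace("\n", "\\n")))
    parser.Parse(xml_text, True)
    return "\n".join(out)

assert to_pyx('<p class="x">hi</p>') == '(p\nAclass x\n-hi\n)p'
```

Because PYX is just lines of text, fixup rules over it (closing unclosed elements, discarding stray end tags) can be stated declaratively and applied with simple stream processing.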
  • 2007-08-31 (31 AUG)

September 2007

October 2007

November 2007

December 2007
