StructuralSemantics

From Digital Publishing Interest Group

Introduction

Structural semantics give authors and publishers a way to convey intent and specific meaning through HTML markup. A digital publishing structural semantics vocabulary defines a set of properties or behaviors relating to specific elements of a publication. This can improve the general user experience as well as accessibility.

This document provides use cases for a structural semantics vocabulary for digital publications, along with recommendations for next steps toward incorporating the vocabulary into the Open Web Platform (OWP), as envisioned by the W3C Digital Publishing Interest Group and the International Digital Publishing Forum. The use cases are meant to drive forward the conversation about standards in this area.

Use Cases

Enabling Enhanced User Agent Behaviors

Intrinsic glossaries

A user is reading a publication which includes large amounts of specialized terminology. The built-in glossary term database of the User Agent (e.g. reading system or browser) does not include these terms. As the Publisher has included explicitly marked-up glossaries for the specialized terms in the actual content, the User Agent is able to harvest these additional term definitions, and expose them to the user via the common term lookup UI.
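
As a rough illustration of the harvesting step, the sketch below collects term/definition pairs from marked-up content. It assumes that terms carry epub:type="glossterm" and definitions carry epub:type="glossdef"; both the attribute and the values are assumptions made for this example, since the document does not settle on a syntax. The class and function names are hypothetical.

```python
from html.parser import HTMLParser


class GlossaryHarvester(HTMLParser):
    """Collect term/definition pairs flagged by a semantic attribute.

    Assumption: terms carry epub:type="glossterm" and definitions carry
    epub:type="glossdef"; neither name is fixed by this document.
    """

    def __init__(self, attr="epub:type"):
        super().__init__()
        self.attr = attr
        self.entries = {}          # harvested term -> definition
        self._kind = None          # kind currently being captured, if any
        self._tag = None           # tag name of the element being captured
        self._depth = 0            # nesting depth of same-named tags
        self._buffer = []          # text collected for the current element
        self._pending_term = None  # last term still waiting for a definition

    def handle_starttag(self, tag, attrs):
        if self._kind is not None:
            # Already inside a captured element: only track nesting.
            if tag == self._tag:
                self._depth += 1
            return
        values = (dict(attrs).get(self.attr) or "").split()
        for kind in ("glossterm", "glossdef"):
            if kind in values:
                self._kind, self._tag = kind, tag
                self._depth, self._buffer = 1, []
                return

    def handle_data(self, data):
        if self._kind is not None:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if self._kind is None or tag != self._tag:
            return
        self._depth -= 1
        if self._depth:
            return
        text = " ".join("".join(self._buffer).split())  # collapse whitespace
        if self._kind == "glossterm":
            self._pending_term = text
        elif self._pending_term is not None:
            self.entries[self._pending_term] = text
            self._pending_term = None
        self._kind = None


def harvest_glossary(markup):
    """Return a dict of term -> definition found in the markup."""
    harvester = GlossaryHarvester()
    harvester.feed(markup)
    harvester.close()
    return harvester.entries


sample = (
    '<dl>'
    '<dt><dfn epub:type="glossterm">apocope</dfn></dt>'
    '<dd epub:type="glossdef">Loss of a sound at the <em>end</em> of a word.</dd>'
    '</dl>'
)
print(harvest_glossary(sample))
```

A User Agent could merge the returned dictionary into its built-in term database before serving lookups; how conflicts between the two sources are resolved is left open here.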

Publications as domain-specific dictionaries

A student is writing a dissertation on Old French phonetics, and while reading a medieval French novel, she needs access to etymological and phonological data to help interpret the text. The student locates an e-book dictionary about Old French phonetics and imports it into the User Agent. Now she is able to click on a verb in the novel, and have the etymological and/or phonological information displayed dynamically in a pop-up window.

Intelligent indexes

A user is reviewing the index of a lengthy book about World War II. She can tap on the entries listed under the term “Normandy” to preview the content to which the index refers. This enables her to assess whether she wishes to access that link. Likewise, while reading a section about Normandy, the user can access the index by viewing a pop-up display of the relevant index entries and sub-entries. The pop-up might display related index terms or a snippet of the index.

Conditional exposure of optional content

A user is reading a book with a large number of footnotes on a device with limited screen real estate. The user accesses the User Agent’s preferences, and activates a mode where footnotes are hidden from view by default, and instead exposed in a pop-up window only when the user activates a footnote reference link.

Domain-specific content fragments

A user is reading a STEM textbook. The content is structured into domain-specific fragments (e.g. physics: experiment, law; mathematics: theorem, proof) which frequently reference other fragments ("as Experiment 2 showed, this approximation is not yet accurate", "by Theorem 4 the function must be continuous"). The user agent can integrate (activated) references into the reading flow, allowing the user to adapt the level of detail of the content.

Exposing MathML annotation-xml content

A user is reading a book with content encoded as MathML. The author has provided alternative semantic representations as annotation-xml elements (see also: Repurposing). The user agent enables the user to access the embedded content, or uses it to replace the original content with the richer representation.

Assistive Technologies

Provisioning of AT context cues

A blind student who is using a screen reader is reading a book with a heavily nested sectioning structure. Having just followed a link to a new position within the content, she has lost her orientation. The student invokes the screen reader function to report the position within the document hierarchy. As the publisher has provided structural semantics for relevant sections within the content, the screen reader is able to report the position as “volume 1, part 2, chapter 3, subsection 4” although the actual current nesting level is ten levels deep. The student then invokes the screen reader function to read the current chapter and subchapter titles. The screen reader traverses the sectioning structure upwards until it has located the nearest subchapter and chapter, and renders their titles to the user. Similarly, a blind student is exploring the content of an infographic in SVG format. Though the SVG elements can have textual values, titles and descriptions that a screen reader can voice, the student may need additional context to understand which SVG elements are the legend, data elements or labels. More detailed information on use cases can be found on the WAI wiki.
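
The position report described above can be sketched as an upward traversal that ignores every ancestor lacking structural semantics. This is a minimal sketch, not a description of any real screen reader: it assumes well-formed XHTML without a default namespace, the epub:type attribute, and the values volume, part, chapter and subchapter, all chosen for illustration. The function names are hypothetical.

```python
import xml.etree.ElementTree as ET

# Assumption: semantics are carried by epub:type in the EPUB OPS namespace.
EPUB_TYPE = "{http://www.idpf.org/2007/ops}type"
LEVELS = ("volume", "part", "chapter", "subchapter")


def kind_of(element):
    """Return the first recognized structural kind on an element, or None."""
    for value in element.get(EPUB_TYPE, "").split():
        if value in LEVELS:
            return value
    return None


def describe_position(root, target_id):
    """Describe where the element with the given id sits in the hierarchy."""
    parents = {child: parent for parent in root.iter() for child in parent}
    target = next((e for e in root.iter() if e.get("id") == target_id), None)
    if target is None:
        return None

    # Walk upward, keeping only ancestors that carry a recognized kind.
    chain = []
    node = target
    while node is not None:
        if kind_of(node) is not None:
            chain.append(node)
        node = parents.get(node)
    chain.reverse()

    # Number each section among same-kind sections inside its semantic parent.
    parts = []
    scope = root
    for section in chain:
        kind = kind_of(section)
        same_kind = [
            e for e in scope.iter()
            if kind_of(e) == kind and (e is not scope or e is section)
        ]
        parts.append(f"{kind} {same_kind.index(section) + 1}")
        scope = section
    return ", ".join(parts)


document = ET.fromstring(
    '<body xmlns:epub="http://www.idpf.org/2007/ops">'
    '<section epub:type="volume">'
    '<div><section epub:type="part"><p>First part</p></section></div>'
    '<div><div><section epub:type="part">'
    '<section epub:type="chapter"><p>First chapter</p></section>'
    '<div><section epub:type="chapter"><p id="here">Target</p></section></div>'
    '</section></div></div>'
    '</section>'
    '</body>'
)
print(describe_position(document, "here"))  # volume 1, part 2, chapter 2
```

The wrapper div elements in the sample add nesting depth but do not appear in the report, which is the point of the use case.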

Enhanced Navigation

A blind student needs not only to move efficiently between major sections of the publication, but also to quickly locate print-equivalent page boundaries, tables, figures, audio and video clips, and other landmark-type content features that she might have chosen to skip during the initial read. The inclusion and identification of specialized nav elements allows a user agent to offer custom navigation options to the user on request.

Facilitating efficacy of reading in alternate modalities

A student who is blind is reading the pre-recorded audio version of a synchronized hybrid text-audio ebook. Although the User Agent provides a time scale modification feature that allows the audio to be played at faster than 1:1 speed, the student struggles to read as efficiently as her peers, because she has no means of having the audio rendition skip over optional or less important parts of the content, such as notes and case studies. The student accesses the preferences of the User Agent and activates an automatic “skip” feature for footnotes, and now the audio rendition automatically skips over these; she chooses, however, to have case studies still be read where they occur. In the case of a graph, a visually impaired student may use an alternative auditory technique called sonification, which produces a tone that varies with the values along the data path. The Assistive Technology that provides sonification needs to be able to identify which elements in an SVG graphic correspond to the data line, so that it can generate the corresponding tones. More details about this concept can be found at DPUB Math Sonification Use Cases.
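
The automatic skip feature described above amounts to filtering the reading order by semantic kind. The following sketch shows one way this might look, assuming the epub:type attribute and the values footnote and case-study (both assumptions for this example); spoken_text is a hypothetical helper, and synchronization with the audio track is out of scope.

```python
import xml.etree.ElementTree as ET

# Assumption: semantics are carried by epub:type in the EPUB OPS namespace.
EPUB_TYPE = "{http://www.idpf.org/2007/ops}type"


def spoken_text(element, skip):
    """Yield text in reading order, omitting subtrees whose kind is in skip."""
    kinds = set(element.get(EPUB_TYPE, "").split())
    if kinds & skip:
        return  # the whole subtree is skipped
    if element.text and element.text.strip():
        yield element.text.strip()
    for child in element:
        yield from spoken_text(child, skip)
        # Tail text follows the child but belongs to this element,
        # so it is read even when the child itself was skipped.
        if child.tail and child.tail.strip():
            yield child.tail.strip()


content = ET.fromstring(
    '<section xmlns:epub="http://www.idpf.org/2007/ops">'
    '<p>Main text.</p>'
    '<aside epub:type="footnote"><p>A note.</p></aside>'
    '<aside epub:type="case-study"><p>A case study.</p></aside>'
    '<p>More text.</p>'
    '</section>'
)

# User preference: skip footnotes, keep case studies.
print(list(spoken_text(content, {"footnote"})))
```

Because the preference is just a set of kinds, the same function serves a user who skips nothing (an empty set) or one who also skips case studies.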

Facilitation of Content Repurposing

Remixing: chapter label and ordinal

A teacher is creating a customized publication for her students by selecting a set of components from five different source publications. The teacher has neither the time nor the knowledge to perform the task manually, so she relies entirely on an automated process. As the tool that performs the automated process compiles the resulting publication, it automatically normalizes chapter title labels and ordinals, such that the source components (which originally used labels such as "Chapter", "Unit" and "Lesson") are now all called "Unit", and the given ordinals ("1", "2", "3") now appear in a natural order.
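
The normalization step becomes straightforward once labels and ordinals are individually identifiable in the markup. The sketch below assumes that each component heading marks its label and ordinal with epub:type="label" and epub:type="ordinal"; the attribute, the values, and the helper names are assumptions for this example, not something the document prescribes.

```python
import xml.etree.ElementTree as ET

# Assumption: semantics are carried by epub:type in the EPUB OPS namespace.
EPUB_TYPE = "{http://www.idpf.org/2007/ops}type"


def first_with_kind(root, kind):
    """Return the first element in document order carrying the given kind."""
    for element in root.iter():
        if kind in element.get(EPUB_TYPE, "").split():
            return element
    return None


def normalize_headings(components, label="Unit"):
    """Rewrite each component's label and ordinal in place, numbering from 1."""
    for number, component in enumerate(components, start=1):
        label_element = first_with_kind(component, "label")
        ordinal_element = first_with_kind(component, "ordinal")
        if label_element is not None:
            label_element.text = label
        if ordinal_element is not None:
            ordinal_element.text = str(number)
    return components


NS = 'xmlns:epub="http://www.idpf.org/2007/ops"'
sources = [
    f'<section {NS}><h1><span epub:type="label">Chapter</span> '
    f'<span epub:type="ordinal">7</span>: Tides</h1></section>',
    f'<section {NS}><h1><span epub:type="label">Unit</span> '
    f'<span epub:type="ordinal">2</span>: Waves</h1></section>',
    f'<section {NS}><h1><span epub:type="label">Lesson</span> '
    f'<span epub:type="ordinal">12</span>: Currents</h1></section>',
]
components = normalize_headings([ET.fromstring(s) for s in sources])
for component in components:
    print("".join(component.find("h1").itertext()))
```

Without the markup, a tool would have to guess labels and ordinals from heading text, which is fragile across publishers and languages.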

Remixing: include linked and rearmatter content?

A publisher is creating a customized publication for an institutional customer by selecting a set of components from five different source publications. The publisher uses an automated tool to process requests for dozens of customers a month. As the tool that performs the automated process traverses the selected chapters to determine what content to include in the result, it analyzes the included hyperlinks and includes the content referenced from links that are rearnote references, but excludes content referenced from links that are generic hyperlinks.
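
The link analysis described above could be sketched as follows. The example assumes that note references are marked with epub:type="noteref" and point to a fragment identifier, and that the XHTML carries no default namespace; these are assumptions for illustration, and note_ids_to_include is a hypothetical helper.

```python
import xml.etree.ElementTree as ET

# Assumption: semantics are carried by epub:type in the EPUB OPS namespace.
EPUB_TYPE = "{http://www.idpf.org/2007/ops}type"


def note_ids_to_include(chapter):
    """Return the fragment ids targeted by links marked as note references."""
    ids = []
    for link in chapter.iter("a"):
        kinds = link.get(EPUB_TYPE, "").split()
        href = link.get("href", "")
        # Generic hyperlinks carry no noteref marking and are left out.
        if "noteref" in kinds and "#" in href:
            ids.append(href.split("#", 1)[1])
    return ids


chapter = ET.fromstring(
    '<section xmlns:epub="http://www.idpf.org/2007/ops">'
    '<p>A claim<a epub:type="noteref" href="notes.xhtml#n1">1</a> and a '
    '<a href="http://example.org/">generic link</a>.</p>'
    '</section>'
)
print(note_ids_to_include(chapter))  # ['n1']
```

The remixing tool would then copy only the notes with the returned ids into the result, leaving externally linked content out.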

Repurposing: encoding STEM content in MathML

A publisher is producing a chemistry textbook. To avoid the use of static images, chemical formulas are realized using Presentation MathML. To provide the chemical meaning, the publisher adds annotation-xml with an alternative representation (such as ChemML).

Reusing: using Content MathML for computational reuse

An author is writing a book with a focus on scientific computing. To enable the reader to reuse computational content in scientific computing environments, the author adds Content MathML via annotation-xml.

Approaches: Solution Criteria and Options

For addressing the ebook structural semantics use cases in the OWP, a number of options are available. The following criteria have been identified as having an impact on the choice made:

  1. Native OWP. The solution should use a native OWP construct, such that the resulting content remains valid vanilla HTML5.
  2. Serialization agnostic. The solution must be directly usable in both serializations of HTML5, without any variations in syntax.
  3. Decentralized vocabulary support to allow expression of domain-specific concerns. The solution should allow domain-specific values to be used without an impact on validity, while also including a contract for how User Agents must deal with unrecognized values.
  4. Clear AT contract. The solution must come with a native OWP contract that describes how Assistive Technologies would make use of the provided semantics when rendering the content. The contract must also describe AT behaviors for when the provided semantics are unrecognized, as well as AT behaviors for the case where the inflected semantics conflict with the given host language semantic.
  5. Static provisioning. The solution must not require support for script execution within the User Agent, such that the provided information remains available in circumstances where scripting is disabled, or not supported at all.
  6. Simplicity and terseness. The solution should be easy to learn, and difficult to get wrong.
  7. Appropriateness. The solution's specification makes it appropriate for this specific usage.

The following solution options have been identified:

  1. Use the HTML5-defined namespace extension slot, and define an attribute in a separate XML namespace (as already done in e.g. epub:type).
  2. Define a new attribute using the hyphen-space extension method in a separate specification (as done by e.g. ITS 2.0), and work to ensure that the W3C validator is updated to recognize this attribute (as it did for ITS 2.0).
  3. Use RDFa or Microdata.
  4. Use Custom Elements (to be precise, the new Web Components framework).
  5. Use the ARIA role attribute. This would require working with the PF Working Group to agree on a non-AT-exclusive usage of this attribute, to settle on a centralized or decentralized vocabulary approach, and to clarify the contract for falling back to native host language semantics when values are unrecognized. For usage within HTML5, it would also require using the "applicable specifications" gloss to enable usage of values beyond those allowed by HTML5.
  6. Use the class attribute.
  7. Use the HTML5 data-* attribute construct.

The following comparison lists how the identified options meet the identified criteria. These values are only a proposal by Ivan; they have not yet been discussed or approved by the IG.

Option: @xmlns
  Native OWP: yes (for XHTML)
  Serialization agnostic: no
  Decentralized vocabulary support: yes
  Clear AT contract: no
  Static provisioning: yes
  Simplicity: yes for end users/authors; no for script developers
  Appropriateness: yes (in XHTML)

Option: @hyphen
  Native OWP: yes
  Serialization agnostic: yes
  Decentralized vocabulary support: moderately (any value should be approved by the HTML5 WG to get the validators running)
  Clear AT contract: no
  Static provisioning: yes
  Simplicity: yes
  Appropriateness: yes (modulo the registration mechanism which gives the values the official blessing)

Option: RDFa/MD
  Native OWP: yes
  Serialization agnostic: yes
  Decentralized vocabulary support: yes
  Clear AT contract: no
  Static provisioning: yes
  Simplicity: unclear; many claim it is complicated, others disagree. MD is probably (marginally) simpler than RDFa Lite. RDFa Full is definitely more complex, but that may not be needed.
  Appropriateness: not really. The problem is what the subject of a specific statement is. If, say, an HTML page talks about a concert, RDFa/MD is typically used to add information about the _concert_, not about the div element holding the concert's description. Put another way, RDFa/MD is appropriate for adding information on a chapter as a whole or on the book as a whole, i.e., as a possible alternative to what DC terms are used for today. But it may not be appropriate for, say, the 'intrinsic glossaries' use case.

Option: Custom Elements
  Native OWP: yes (eventually, when the specification is published in final form)
  Serialization agnostic: yes
  Decentralized vocabulary support: yes
  Clear AT contract: At this moment probably not. ??
  Static provisioning: Not really. A new element should be 'registered' through a simple script to be used properly (see, e.g., a separate tutorial). That being said, the script may be very simple if the goal for the element is to be used by 'declarative' processes.
  Simplicity: yes for end users; moderately for script developers
  Appropriateness: yes, but this seems like a sledgehammer approach

Option: @role
  Native OWP: yes
  Serialization agnostic: yes
  Decentralized vocabulary support: Unclear; new values should probably be registered through a mechanism that is still to be defined by the PF Working Group, or the HTML Working Group, or both
  Clear AT contract: yes
  Static provisioning: yes
  Simplicity: yes
  Appropriateness: yes, modulo the registration issue

Option: @class
  Native OWP: yes
  Serialization agnostic: yes
  Decentralized vocabulary support: yes
  Clear AT contract: probably
  Static provisioning: yes
  Simplicity: yes
  Appropriateness: Although the @class attribute is defined in very generic terms, i.e., not only for styling, the practice is that it is mostly used for styling. As a consequence there is a major danger of value clash between styling and other purposes; as such, it is inherently error prone with very-difficult-to-find bugs.

Option: @data-*
  Native OWP: yes
  Serialization agnostic: yes
  Decentralized vocabulary support: yes
  Clear AT contract: ??
  Static provisioning: yes
  Simplicity: yes, both for end users/authors and for JavaScript programmers, for whom even specific interface methods exist to handle them smoothly
  Appropriateness: per HTML5 specification, no.

Collaboration with PF: status