DPUB IG Telco, 2015-08-17: aria describedat recap, personalization

See minutes online for a more detailed record of the discussions. (The headers below link into the relevant sections of the minutes.)

ARIA, describedat attribute meeting recap

The issue of @describedat, and the role of this interest group in the relevant telco, was already documented at the last meeting. The meeting itself took place last Thursday, and its minutes are also public. Deborah Kaplan gave a recap of the meeting.

The group had produced a set of requirements that any solution should fulfill, and these were presented at the meeting last week. The DPUB community will accept any solution that meets them, although it must be consistent and work without polyfills. Lots of publishers and related institutions were present, and that by itself was important: it is the first time that the publishing community has raised its voice in such a meeting in great numbers.

One thing that came out of this meeting is an agreement that it was too complicated to talk about backwards compatibility. Worrying about backwards compatibility was one of the things dragging us down. If you have newer tech, then you will be able to have access to this feature—but probably not in older browsers. (Unfortunately IE is now being considered an older browser, in spite of its wide presence, but Windows 10 and Edge may change the picture.)

It looks probable that the future lies in the use of the details element. But that is not a conclusion yet: there are some technical issues with that element, and some features are still missing, mainly due to its generality (e.g., it is not clear which elements in the content are meant for accessibility and which are present for other purposes). The next step will be to produce a requirement “grid” with issues, requirements, and also deployment plans (that grid will be created by the PF Working Group with the active participation of this group).

CSS Requirements Publishing

There was an agreement to publish the CSS Requirements Doc as a first public draft. This should happen later this week.

User style sheets, personalization

Chris Lilley gave a short intro to user style sheets. There are three sources of CSS for a page. The first is the user-agent style sheet, which applies to all content. It was a polite fiction, but it actually works. Then there are the document style sheets and lastly, with the highest priority, the style sheets provided by the user. User style sheets have been part of CSS since CSS1, and let a user change font-size, etc. However, there are some problems. Firstly, the feature is not consistently implemented (if at all). Secondly, users do not know they can do this; as with alternate style sheets, the user interface aspects are essentially unsolved. Thirdly, modern web content is heavily tweaked, so changing styles can easily break things, meaning that a user style sheet may not be a good place to do restyling. Another solution is browser preferences. These have the advantage that they are not in the cascade but override things. Basically, the whole mechanism is fragile, and probably something else is needed.
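
To make the interplay of the three origins more concrete, here is a minimal Python sketch of how a declaration's origin and its `!important` flag decide which value wins. The ranking follows the cascade as specified in CSS (user-agent, then user, then author for normal declarations, with the order reversed for `!important` ones); transitions, animations, and specificity are deliberately left out. The `Declaration` class and the `winning_value` function are hypothetical names used only for illustration. Note that, under this ranking, a user declaration outranks an author declaration only when it is marked `!important`.

```python
from dataclasses import dataclass

# Cascade precedence from lowest to highest, as (origin, important).
# Transitions and animations are omitted to keep the sketch short.
PRECEDENCE = [
    ("user-agent", False),
    ("user", False),
    ("author", False),
    ("author", True),
    ("user", True),
    ("user-agent", True),
]


@dataclass(frozen=True)
class Declaration:
    origin: str          # "user-agent", "user", or "author"
    prop: str            # CSS property name
    value: str           # CSS value, kept as an opaque string
    important: bool = False


def winning_value(declarations, prop):
    """Return the value of `prop` from the highest-precedence declaration.

    Ties are resolved in favour of the later declaration (source order).
    Selector specificity is ignored in this sketch.
    """
    best = None
    best_rank = -1
    for decl in declarations:
        if decl.prop != prop:
            continue
        rank = PRECEDENCE.index((decl.origin, decl.important))
        if rank >= best_rank:
            best, best_rank = decl, rank
    return best.value if best else None


sheet = [
    Declaration("user-agent", "font-size", "16px"),
    Declaration("user", "font-size", "24px"),
    Declaration("author", "font-size", "12px"),
]

# With normal declarations only, the author (document) style wins.
normal = winning_value(sheet, "font-size")

# A user declaration marked !important overrides the author style.
forced = winning_value(
    sheet + [Declaration("user", "font-size", "24px", important=True)],
    "font-size",
)
```

In this sketch the user's larger font size only takes effect once it is flagged as important, which hints at why restyling heavily tweaked content through user style sheets is fragile.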

(There is some work around the @document rule in CSS 3, by the way, that may be helpful: it may allow per-document styling at the browser level.)

There was some discussion of the fact that reading systems may represent a different constituency, where the user interface issues may be solved even if general browsers do not solve them; but even if this were an acceptable route, the other problems remain. Indeed, even if the user interface allows a change (as many reading systems do for character size or font families, for example), its practical implementation may become extremely complex. (As an example, books cannot be submitted to Amazon with a text-align property used, because that would make such changes too difficult for the Kindle…)

The way forward is to have a clear set of requirements on what the publishing industry needs for personalization; that document would help in finding the right solution in the future. That has now become one of the new future deliverables for this group.

DPUB IG Telco, 2015-08-10: education and outreach, aria describedat, prioritization in EPUB+WEB

See minutes online for a more detailed record of the discussions. (The headers below link into the relevant sections of the minutes.)

Education and Outreach Update

Nick Ruffilo gave an update on the education and outreach plans. Nick and Karen have established relations with media outlets (like Publishers’ Weekly, DBW), and there are some agreements to get regular updates in the form of small articles, publishing perspectives, webinars, etc., to show the group’s work. As a start, a presentation by Jeff Jaffe will be turned into prose.

The questions arising are general ones, like (1) what is W3C and how is it related to IDPF, (2) the importance of W3C membership for business users, and (3) the importance of W3C membership for technical users. We hope to run a webinar by September 1st.

ARIA, describedat attribute, meeting on the 13th

The IG has been asked by the ARIA WG and the joint task force to comment on the usage of the @describedat attribute, which is the subject of discussions right now in that group. There are currently some controversies, reminiscent of the discussions surrounding the @longdesc attribute in HTML5 which, back then, led to a formal objection and a detailed W3C response. (The @describedat attribute is very similar in many respects; it can be considered the same as @longdesc except that it is usable on all elements.) The issue is that the same functionality can be achieved through other means in HTML5.1 (namely the details element), and the objection against @describedat is, essentially, against the duplication of functionality.

The DPUB IG is not in a position to decide which functionality should be used. Instead, the group can describe the requirements the industry has and make sure that the requirements can be met by all the approaches put forward. Such a requirement document has already been put together by the DPUB IG A11y task force, but it should possibly be revised to make it technology neutral. That will be the main topic of a one-off meeting to be held on the 13th.

Prioritization in the EPUB+WEB work

A discussion has been started on the prioritization within the EPUB+WEB work, more exactly on the role of EPUB as a technology vis-à-vis the work to be pursued. The key issue (raised by Leonard Rosenthal) is what the role of EPUB is in the work: would it be a direct continuation of EPUB3, with 100% backward compatibility, or would backward compatibility potentially be broken? If the latter, then all bets are off, and the document should not rely on EPUB at all and should, possibly, look at radically different solutions (or at least be open to them).

However, the issue may not be so clear-cut (as raised by Ivan Herman); the goal of any standardization work is to ensure acceptance by the industry, i.e., a radical departure from EPUB would be detrimental to the future of the work. While 100% backward compatibility may not be ensured, every effort should be made to keep the incompatibilities minimal and, possibly, to restrict them to the “administrative” part of EPUB (e.g., package file, manifests, the packaging format itself, etc.), whereas the real “content” (based on OWP) should remain unchanged, i.e., really backward compatible.

(The discussion had to be stopped by the end of the call, to be continued.)

CSS Grid Layout Module Level 1 Draft Published

The Cascading Style Sheets (CSS) Working Group has published a Working Draft of CSS Grid Layout Module Level 1. This CSS module defines a two-dimensional grid-based layout system, optimized for user interface design. In the grid layout model, the children of a grid container can be positioned into arbitrary slots in a flexible or fixed predefined layout grid. CSS is a language for describing the rendering of structured documents (such as HTML and XML) on screen, on paper, in speech, etc.


DPUB IG Telco, 2015-08-03: Pagination, portable documents

See minutes online for a more detailed record of the discussions. (The headers below link into the relevant sections of the minutes.)

Pagination and Prioritization

A first draft of the “Priorities for CSS from the Digital Publishing Interest Group” is now available publicly. Some other issues are still to be edited into the document. There is a face-to-face meeting of the CSS Houdini project at the end of the month in Paris; the goal is to have a more stable version of the document available by then. The plan the group accepted is to publish this also as an Interest Group Draft sometime in mid-August.

Portable Package Requirements

Tzviya Siegman has edited an initial document on the wiki. The document makes a distinction between three ‘forms’ of documents:

  1. Online
  2. Offline (cached)
  3. Portable (network independent)

The document focuses mainly on the issues related to the third alternative because that is where packaging may come in. The group agreed that this terminology may be misleading, though, mainly for “offline” (which, for most people, seems to indicate network independence). Instead, the terminology to be used would rather be

  1. Online
  2. Cached
  3. Portable (offline)

The main question arising during the meeting was to understand what the main goal of this document is, and to filter the various issues through that goal. One major goal is to serve as a basis for discussion with other groups at W3C on whether work on the Web packaging format (currently in Working Draft) should be pursued in the first place, i.e., whether publishing has specific requirements that should be taken into account.

(It is worth noting that, though the IETF started some work on a top-level media type for archives, that initiative has been abandoned due to a lack of manpower.)


The group also spent time discussing administrative issues (steps to be taken for the charter renewal, preparation for the face-to-face meeting in October).

DPUB IG Telco, 2015-07-27: Portable Documents, STEM Update, Math Role

See minutes online for a more detailed record of the discussions.

Portable Documents

The IG has been talking about the abstract concept of a package. Tzviya Siegman presented a draft document outlining requirements for a portable document. The group discussed the distinction between the package as an offline state, functioning as an extension of the browser cache, and a truly portable publication that exists without a network and is persistent. The group will clarify the document to distinguish these issues and add comments about maintaining identifiers.

STEM Survey

Peter Krautzberger reports that the STEM task force is cleaning up the data from their survey and slicing it in interesting ways. They have created a spreadsheet that will contribute to the TF’s note.

Math and the role attribute

Peter Krautzberger, MathJax manager, discussed concerns about the ARIA role “math” that he encountered in conversations with AT vendors. The role is primarily useful for content that is MathML (uses the <math> tag). However, most browsers do not support MathML. role="math" is more valuable for polyfills and converters, but the role conveys very little information. It would be helpful if ARIA exposed some of the underlying MathML to AT. The IG will pass Peter’s discoveries on to PF.

DPUB IG Charter Renewal

If you have not already voted to renew the DPUB Charter, please do!


DPUB IG Telco, 2015-07-13: dpub-aria, fragment identifiers

See minutes online for a more detailed record of the discussions.

Digital Publishing WAI-ARIA Module

The first public working draft of the Digital Publishing WAI-ARIA Module is out, and the IG is now focusing on receiving input from the community about its viability and future. A separate blog post identifies a set of questions that needs discussion.

Meanwhile, the IG taskforce will continue working on the next public draft of the module. In particular, we will focus on

  • investigating additional terms to be added to the document
  • discussing the move of certain terms into ARIA 1.1 core
  • coming to terms with the handling of link types (the role and rel attributes)
  • starting work on the separate AT API mappings companion document

Fragment Identifiers Status Update

Through the notion of using Service Workers as the vehicle for handling offline/caching in EPUB+WEB, we reach a point where the discussion of specialized fragment identifiers for digital publications becomes moot. The identifiers taskforce will instead be able to focus on working with the relevant media type authorities to make sure that fragment identification needs of digital publishing are met in the generic fragment identifier schemes for these media types (HTML, SVG, etc). One example of such currently unmet needs is the ability to specify a range of text in an HTML document using a fragment identifier in a URL.

From the work on the Range Finder API within the Web Annotations WG, it has become increasingly clear that the ephemeral nature of web content sometimes clashes with the need within (certain) publishing domains to have completely persistent and reliable identifiers. For example, the range of text returned by the Range Finder may change over time as the document changes; this would not be workable in, for example, scholarly and legal publishing.

There is not yet a URL syntax for Range Finder, but we have it on good authority that the Web Annotations WG is working on this.


DPUB IG Telco, 2015-07-06: CSS, Houdini, & Pagination

See minutes online for a more detailed record of the discussions. (The header below links into the relevant section of the minutes.)

CSS, Houdini, and Pagination

After some introduction (by Dave Cramer) to the current work at the DPUB IG on these issues, Chris Lilley, technical director of the Interaction Domain at W3C, gave an overview of Houdini.

A feature of the web is the use of polyfills, so that people do not have to wait for features to be added (the term is also used for non-CSS features, but at the moment we concentrate on CSS only). This sort of works, but tends not to work if you use a bunch of them together. Polyfills end up doing lots of re-implementation, which is pointless as the browser already knows how to do it. Also, some things are really hard to extend because they happen under the hood. The idea of Houdini (named after the magician) is to remove some of the hand-waving. In contrast to the more declarative nature of CSS, this is more API-based work (done together with the TAG).

To make it less abstract, the plan is to expose the box tree through APIs. Pages appear there as well; they belong to the box tree rather than to the DOM tree.

The discussion that followed concentrated on how this project influences the work of the publishing community, and how the requirements of that community can be better formulated for the CSS WG. The situation is that

  • There should be a “traditional” declarative layer in CSS to describe what is needed for pagination; this is used by authors who would not and should not “see” Houdini at all
  • Reading system implementation, as well as experimentation with the new CSS features, would have to happen through the Houdini API. The reading systems should implement polyfills

This interest group should then concentrate on the first: do we have all the features in CSS that publishing authors need in terms of pagination? The answer is (obviously:-) ‘no’, and a document is in the making that concentrates on this issue (as an outcome of the work that has been happening for the last few weeks). What would be needed is

  • an analysis between what is specified and what browsers actually do—CSS has suffered from multiple inconsistent implementations
  • an analysis of the features in CSS around pagination to decide which features are actually useful and usable and which should be left behind because they are not really good design.

An important question is how pagination as a whole is seen by many CSS implementers. There is a misconception that pagination is only important for print; as a consequence, it is usually pushed aside by browser vendors as not really important. We have to make it clear that pagination is a much more general and important concept: it includes slide shows, flashcards, cards, and tiles, and it may also be an important UI feature when very long texts are read. All the documents should make this fact much more visible to raise interest in pagination overall.

DPUB IG Telco, 2015-06-29: Annotations, CSS, minor white paper changes

See minutes online for a more detailed record of the discussions. (The headers below link into the relevant sections of the minutes.)

Update on the Annotation WG

Ivan Herman gave an update on the progress in the Annotation WG.

The group started with an input document provided by the Open Annotation Community Group. They had a specification for a data model, i.e., how to store annotations, what their structure is, etc. Their specification has been essentially re-published as a Working Draft, and we are continuing work on it. The biggest difference is that the CG document had all its examples in the Turtle language, whereas the WG’s document includes both Turtle and JSON-LD. This is because, while the CG was strongly Linked Data oriented, i.e., Turtle was fine, the WG’s target audience is Web developers, who feel more comfortable with JSON.

The next step is to transfer annotations through the network; for that, the Web Annotation Protocol has been developed. This document is in fairly good shape and will be published as a WD soon. It is actually a specialization of an existing W3C Recommendation, called Linked Data Platform (LDP), which is good because, for example, there are already implementations that can be reused.

The final piece, soon to be an official Draft, is the RangeFinder API. This is a (JavaScript) API specification to find ranges of text or DOM nodes in a document, i.e., to be able to “anchor” to a piece of text that may not have its own @id attribute, whose context may change, etc. This document is an API, i.e., aimed at developers; however, as part of the group’s goals we will also discuss a ‘serialization’ of that, i.e., a possibility to define a URI (more exactly, a fragment identifier) using the RangeFinder concepts.
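
As a rough illustration of what “anchoring” without an @id can mean, the following Python sketch locates a quoted piece of text by the quote itself plus a little surrounding context. It is not the RangeFinder API (which is a JavaScript API still being drafted); the function name `find_quote` and the `exact`, `prefix`, and `suffix` parameters are assumptions made for this example only.

```python
def find_quote(text, exact, prefix="", suffix=""):
    """Return (start, end) of the occurrence of `exact` in `text` whose
    surrounding context best matches `prefix` and `suffix`, or None if
    `exact` does not occur at all.

    Each matching piece of context adds one point; the first occurrence
    with the highest score wins.
    """
    best = None
    best_score = -1
    start = text.find(exact)
    while start != -1:
        end = start + len(exact)
        score = 0
        if prefix and text[:start].endswith(prefix):
            score += 1
        if suffix and text[end:].startswith(suffix):
            score += 1
        if score > best_score:
            best, best_score = (start, end), score
        start = text.find(exact, start + 1)
    return best


original = "The cat sat on the mat. The dog sat on the rug."
anchor = {"exact": "sat on the", "prefix": "dog ", "suffix": " rug"}

# The context picks the second occurrence of the quoted text.
span = find_quote(original, **anchor)

# After an edit earlier in the document, the same anchor still resolves,
# but to different character offsets.
edited = "Preface. " + original
moved = find_quote(edited, **anchor)
```

The anchor survives the edit, but the offsets it resolves to do not stay the same, and if the quoted text itself were rewritten the lookup would return nothing at all.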

There are also some other issues that may be discussed in the group, though the work has not really started yet. These include a possible client-side API (i.e., an API whereby JavaScript developers could handle annotations on a higher level, hiding the details of the data model), or an HTML-based serialization. The latter could be used in a client to add such elements into the DOM tree; since it is in terms of the DOM, it can be styled easily in general, which is probably something very useful. Whether that would use existing HTML elements, or whether it would require an extension to HTML, is still to be discussed.

There are some overlaps between the DPUB IG’s and the Annotation WG’s membership, which is a good thing.

It was noted that the annotation WG’s further use cases could also be very useful for this group, and more regular contacts would be good.

CSS Prioritization

Dave Cramer has begun a spreadsheet listing some of the CSS features that are important for the Publishing Community and that are not fully covered by the CSS work. (There is also a textual version which will, eventually, possibly merge with the latinreq document.) Eventually, this document should be communicated to the CSS Working Group to synchronize the needs and priorities. The group (and everybody) is encouraged to provide comments, add their wish lists, etc., to this document.

A question arose around footnotes and why they are not of a higher priority; the problem is that there is no real consensus within the digital publishing community on what the optimal approach to handling them would be, i.e., how that should be reflected in CSS. There was also discussion on how to include MathML-related features in the document; there are clearly missing features (like aligning equations vertically on a specific character).

There was a longer discussion on the discrepancy between browsers and reading systems in the level of control they provide to end users in terms of styling (fonts, character size, etc.). CSS had the notion of a user style sheet, which has pretty much disappeared from browsers, and it is also not certain that this is the right level of control; further discussion is needed on how that would translate into CSS.

It was also agreed that the document should strictly separate those features that do exist in a CSS spec, but are poorly implemented, from those features that do not exist in any specification (and should). It was agreed that the table would be extended accordingly (e.g., by looking at what XSL-FO has, or what systems like Antenna House implement for publishers).

Finally, the issue of efficiency was also addressed (see, e.g., the blog on the subject); it is indeed a problem, although it is difficult to see what this Interest Group or indeed the CSS WG could do about it.

Small changes on the white paper

Ivan Herman also reported that the draft version of the EPUB+WEB White Paper has been updated to reflect a recent discussion on caching and the resulting architectural principles; it would be good to get the relevant section reviewed as soon as possible, because the changes may have effects on the new charter, too.

Planning the future of the Digital Publishing Interest Group

(Reproduced from the “central” W3C blog.)

Time flies… it has been almost two years since the Digital Publishing Interest Group started its work. A lot has happened in those two years; the group

  • has published a report on the Annotation Use Cases (which contributed to the establishment of a separate Web Annotation Working Group);
  • has conducted a series of interviews (and published a report) with some of the main movers and shakers of metadata in the Publishing Industry;
  • is working with the WAI Protocols and Format Working Group to create a separate vocabulary describing document structures using the ARIA 1.1 technology (and thereby making an extra step towards a better accessibility of Digital Publishing);
  • maintains a document on Requirements for Latin Text Layout and Pagination, which is also used in discussion with other W3C groups on setting the priorities on specific technologies;
  • made an assessment of the various Web Accessibility Guidelines (especially the Web Content Accessibility Guidelines) from the point of view of the Publishing Industry, and plans to document which guidelines are relevant (or not) for that community and which use cases are not yet adequately covered;
  • established a reference wiki page listing the important W3C specifications for the Publishing Industry (by the way, that list is not only public, but can also be edited by anybody with a valid W3C account);
  • has conducted a series of interviews with representatives of STEM Publishing and is currently busy analyzing the results;
  • commented on a number of W3C drafts and ongoing works (in CSS, Internationalization, etc.) to get the voice of the Publishing Industry adequately heard.

However, the most important result of these two years is the fact that the Interest Group contributed to setting up, at last, stable and long-term contacts between the Web and the Publishing Industries. Collaboration now exists with IDPF (on, e.g., the development of EPUB 3.1 or in the EDUPUB Initiative), with BISG (on, e.g., accessibility issues), and contacts with other organizations (e.g., Readium, IDAlliance, or EDItEUR) have also been established.

The group has also contributed significantly to a vision on the future of Digital Publishing, formalized by experts in IDPF and W3C and currently called “EPUB+WEB”. The vision has been described in a White Paper; it can be summarized as:

[…]portable documents become fully native citizens of the Open Web Platform. In this vision, the current format- and workflow-level separation between offline/portable (EPUB) and online (Web) document publishing is diminished to zero. These are merely two dynamic manifestations of the same publication: content authored with online use as the primary mode can easily be saved by the user for offline reading in portable document form. Content authored primarily for use as a portable document can be put online, without any need for refactoring the content. […] Essential features flow seamlessly between online and offline modes; examples include cross-references, user annotations, access to online databases, as well as licensing and rights management.

But, as I said, time flies: this also means that the Interest Group has to be re-chartered. This is always a time when the group can reflect on what has gone well and what should be changed. The group has therefore also contributed to its new, draft charter. Of course, according to this draft, most of the current activities (e.g., on document structures or accessibility) will continue. However, the work will also be greatly influenced by the vision expressed in the EPUB+WEB White Paper. This vision should serve as a framework for the group’s activities. In particular, the specific technical challenges in realizing this vision are to be identified, and relevant use cases should be worked out. Although the Interest Group is not chartered to define W3C Recommendations, it also plans to draft technical solutions, proof-of-concept code, etc., testing the feasibility of a particular approach. If the result of the discussions is that a specific W3C Recommendation should be established on a particular subject, the Interest Group will contribute to formalizing the relevant charter and to the process toward the creation of the group.

The charter is, at this point, a public draft, not yet submitted to the W3C Management or the Advisory Committee for approval. Any comment on the charter (and, actually, on the White Paper, too!) is very welcome: the goal is to submit a final charter for approval reflecting the largest possible constituency. Issues, comments, and feedback can be submitted through the issues list of the charter repository (and, respectively, through the issues list of the White Paper repository) or, alternatively, sent to me by email.

Two years have passed; looking forward to another two years (or more)!


DPUB IG Telco, 2015-06-22: ARIA, STEM survey, CSS, Web Publications

See minutes online for a more detailed record of the discussions. (The headers below link into the relevant sections of the minutes.)

ARIA described-at attribute

There are some discussions around the aria-describedat attribute defined for ARIA 1.1, and the group was asked to formulate an opinion on whether this attribute would be used by the publishing community. The discussion led to the conclusion that

  • the attribute is important for the publishing community and would be good for digital publishing
  • the publishing industry moves slowly, so it cannot be expected to be implemented right away; i.e., its acceptance for ARIA 1.1 should not depend on that
  • the fact (and objection) that aria-describedat may lead to an “outside” document (e.g., can require an external link when reading an offline document) should not be considered as major because of the general trend trying to make the differences between offline and online fade away

It has been agreed that Deborah Kaplan will create a more formal answer to the Protocols and Formats Working Group (the guardians of ARIA).

STEM Survey

Peter Krautzberger gave a status overview of the STEM survey evaluation. The data, extracted from the survey, has been put into an SQL database, and the task force is busy formulating “questions” by cross referencing the various tables. The results will be compiled into a W3C Note. The deficiencies of the survey were also discussed; many questions were around workflow rather than tech issues, and there were probably too many of them.

One possible goal would be to see if there are formats (akin to MathML) that could or should be standardized at or around W3C and that the STEM Publishing community would need. 3D and chemical markup formats came up but, on a different level, standardization of the iPython (now Jupyter) format may also come to the fore (although this is still very early, and it is not clear whether it is appropriate for W3C).


Tzviya Siegman reported on the progress of the DPUB-ARIA document; the latest draft is now ready to go forward as a formal First Public Working Draft. The major changes are the adoption of the dpub-* style for all the attributes, to avoid clashes with other vocabularies (e.g., dpub-abstract), and an explicit callout to the role of IDPF in the creation of, and commenting on, the spec.

CSS Priority

Shinyu Murakami has added some CJK-specific items to the CSS priority list. During the discussion the issue of an explicit mention of Bopomofo came up, and it was agreed that this would be added.

On a more general level the need of adding comments and priorities to each CSS entry came up, and Dave Cramer agreed to start working on this.

Web Publication (packaging, etc)

Markus Gylling and Ivan Herman reported on some discussions they had as a follow-up on packaging and related subjects. The important point that came up is that we may need an abstract concept of a “Web Publication”, which refers to a group of resources that together can be considered to be a publication. Such a Web Publication should have a unique ID, regardless of whether the publication is offline or online. When online, an HTTP GET may return a Web manifest (which would then list the constituents that clients may cache and store); when offline, the ID may refer to a real physical package that can be downloaded and unpacked (and may also contain a manifest). The core issue is that the primary identifier should be transparent to online/offline status. The online version may be the “canonical” one; when it “goes” offline, it needs to carry that original identifier with it to handle incoming references.
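
One way to picture an identifier that is transparent to online/offline status is the small Python sketch below. Everything in it is hypothetical: the `PublicationResolver` class, its method names, and the example identifier are made up for illustration, and the in-memory dictionaries merely stand in for an HTTP GET returning a manifest and for an unpacked package on disk. Preferring the local package over the online manifest is an arbitrary choice made for this sketch.

```python
class PublicationResolver:
    """Hypothetical lookup of a Web Publication by its canonical identifier."""

    def __init__(self):
        # identifier -> list of resource URLs (stand-in for a Web manifest
        # that an HTTP GET on the identifier might return)
        self._manifests = {}
        # identifier -> {file name: bytes} (stand-in for an unpacked package)
        self._packages = {}

    def publish_online(self, identifier, resource_urls):
        self._manifests[identifier] = list(resource_urls)

    def store_package(self, identifier, files):
        # The package is stored under the same identifier as the online
        # version, so incoming references keep working offline.
        self._packages[identifier] = dict(files)

    def resolve(self, identifier, network_available=True):
        """Return (state, resources) for the identifier, or None if unknown."""
        if identifier in self._packages:
            return ("offline", sorted(self._packages[identifier]))
        if network_available and identifier in self._manifests:
            return ("online", list(self._manifests[identifier]))
        return None


PUB_ID = "https://example.org/publications/sample-book"  # made-up identifier

resolver = PublicationResolver()
resolver.publish_online(PUB_ID, [PUB_ID + "/chapter1.html", PUB_ID + "/style.css"])
online_view = resolver.resolve(PUB_ID, network_available=True)

# Once a package has been stored, the same identifier resolves without a network.
resolver.store_package(
    PUB_ID,
    {"chapter1.html": b"<html>...</html>", "style.css": b"body {}"},
)
offline_view = resolver.resolve(PUB_ID, network_available=False)
```

The caller only ever deals with the one identifier; whether the resources come from the network or from a package is decided inside the resolver.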

What should be done is to work out some scenarios using HTTP protocol work and some elements of a client’s functionalities.