DPUB IG Telco, 2015-04-27: STEM Survey, Fragment ID-s, footnotes

See minutes online for a more detailed record of the discussions.

STEM Survey

The STEM Task Force has conducted a survey among experts on their experience in publishing STEM content. The Survey is now closed; there was a first glance at the results during the call (all this is preliminary, a more systematic evaluation is still to be done).

There were 34 responses (out of 93 asked). Overall, the results are fine, although (at first glance) nothing overly exciting. Most of the responders were “end users”, i.e., researchers who publish in the area. The answers also highlighted some issues with the survey itself, e.g., the questions may not have been as clear as necessary. The bias of the responders was clearly towards CS and Mathematics.

There was a clear tendency towards making the content reusable and using the Web as a primary platform. Beyond MathML, no additional STEM format came to the fore as a major trend (CML was mentioned several times). As for delivery format, HTML was ahead of PDF as a primary format, but publishing in PDF is almost always present as a secondary format (without enthusiasm, just out of necessity).

The next goal is a more systematic evaluation of the results, to be summarized in a note. The raw results of the survey will also be made public, although they have to be strictly anonymized first.

Fragment ID-s

There was already a discussion on identifiers a few weeks ago, which referred to the selectors of the Open Annotation Model as a possible approach for defining fragment identifiers in EPUB-WEB. That meeting was followed by an email discussion with the Web Annotation Working Group (which works on the model), to see if the selectors could be transformed into bona fide fragment identifiers.

The problem that arose during the discussion is the way fragment identifiers are defined (in general). Indeed, fragment identifiers are never defined in isolation; they are defined for a specific media type and registered as such by IANA. In this sense, serializing the selector model in general is not a real option. However, it is possible to do so for specific media types; in the case of EPUB-WEB, HTML is an obvious target.

It has been emphasized that if such a serialization is done, it should be done together with the Web Annotation Working Group to avoid discrepancies. That Working Group has already touched upon this issue (in the context of rangefinder) in their recent F2F meeting.

In the context of EPUB-WEB, CFI has to be evaluated first, though; after all, CFI essentially defines a fragment ID for EPUB3 already. Finding out whether CFI works for EPUB-WEB (if yes, how; if not, why) is important before engaging in anything else. This is clearly a topic for the upcoming F2F meeting of the Interest Group.

HTML5 and footnotes

There was a recent email discussion on the possibility of defining a footnote element in HTML5. This was followed by some separate discussion with the experts of the HTML WG. As of now, the situation is that HTML will not have a formal proposal for such an element, so the DPUB IG should pursue the ARIA role approach for defining footnotes. The element may be taken up again in the future, though.

DPUB IG Telco, 2015-04-20: Consensus call for DPUB ARIA Module, F2F Agenda

See the minutes online for a more detailed record of the discussions.

Consensus call for DPUB ARIA Module

After the DPUB-ARIA task force met last week, the DPUB IG put out a call for consensus to publish the Digital Publishing Module of ARIA as a FPWD. The task force did not resolve all outstanding issues but will attempt to resolve them in a call with PF next week and move forward with the FPWD. All present voted for the publication. An email will be sent formally requesting consensus.

Finalize Agenda for May F2F

The agenda for the May F2F will be closed at the end of day today.

DPUB IG Telco, 2015-04-13: Aria Module, Packaging, Identifiers

See the minutes online for a more detailed record of the discussions.

Discussion on the ARIA module

The DPUB ARIA module should get a publication approval (to publish the document as a first public draft) from both the DPUB IG and the Protocols and Formats (PF) Working Group. However, it seems that, during their last teleconference, the PF Working Group raised some issues that may have to be solved before this first publication. The issues they have are:

  • some participants would prefer to have prefixes for the terms for modules, such as the publishing one; essentially they end up in different domains
  • there were also some concerns about specific terms that may clash with similar terms elsewhere in ARIA

The IG discussed these issues; it seems that the Digital Publishing community at large would be very much against the usage of extra prefixes to the role attribute terms; some publishers may decide to completely ignore the terms altogether if that was the case.

It was agreed that an email discussion should follow to flesh out the issues before a telco planned with the PF Working Group in about two weeks.

Packaging examples

A new Wiki page has been created to list the functionality of the current packaging used in EPUB: what additional information, files, etc., are defined and used. In the longer term, the use cases on packaging should be used to identify possible differences between the current packaging format and the Web Packaging format as worked on elsewhere at W3C. This is ongoing work.

Identifiers

There was some discussion on the mailing list and this led to a refresh of the corresponding wiki page of the task force. An interesting approach is provided by the so-called “selectors” of the Open Annotation Model (which is currently a Working Draft): this provides a general structure to describe ranges, exact positions, etc., in a very flexible manner.

The problem with that approach, however, is that the selectors are not expressed in form of a URI. Indeed, the example in the document:

 "selector": {
    "@id": "http://example.org/selector1",
    "@type": "oa:DataPositionSelector",
     "start": 4096,
     "end": 4104
}

is a structure describing an anchor point in a document, but it is not a fragment identifier that can be part of a URI. It may, however, be possible to translate that into a fragment, i.e., something like:

#selector(type=DataPositionSelector,start=4096,end=4104)

Ideally, there would be some sort of standard for making this mapping, if it is possible at all. It was agreed that the question should be put to the editors of the annotation document to find out whether there is, or has been, work on this, or whether there are fundamental issues that make this type of mapping impossible or undesirable.
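
As a rough illustration of what such a mapping could look like, the Python sketch below serializes the selector structure from the example into the ad hoc `#selector(...)` syntax shown above. The function name `selector_to_fragment` and the fragment syntax itself are assumptions made purely for illustration; no such serialization is defined by the Open Annotation Model or registered for any media type.

```python
from urllib.parse import quote


def selector_to_fragment(selector: dict) -> str:
    """Serialize a selector dict into a hypothetical '#selector(...)' fragment."""
    # Drop the "oa:" prefix from the type, as in the example fragment.
    sel_type = selector["@type"].split(":", 1)[-1]
    pairs = [("type", sel_type)]
    # Keep every non-keyword property (keys not starting with "@"), in order.
    for key, value in selector.items():
        if not key.startswith("@"):
            pairs.append((key, str(value)))
    # Percent-encode values so they stay legal inside a URI fragment.
    body = ",".join(f"{k}={quote(v, safe='')}" for k, v in pairs)
    return f"#selector({body})"


selector = {
    "@id": "http://example.org/selector1",
    "@type": "oa:DataPositionSelector",
    "start": 4096,
    "end": 4104,
}
print(selector_to_fragment(selector))
```

Note that this naive mapping is lossy (the `@id` of the selector is dropped) and only handles flat structures, which hints at why the question was referred to the editors of the annotation document.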

DPUB IG Telco, 2015-03-30: Structural Semantics, Packaging Use Cases

See the minutes online for a more detailed record of the discussions.

DPUB-ARIA a.k.a. Structural Semantics update

For background of this work (quoting from the abstract of the document):

Accessibility of web content requires semantic information about widgets, structures, and behaviors, in order to allow assistive technologies to convey appropriate information to persons with disabilities. This specification defines a WAI-ARIA module encompassing an ontology of roles, states, and properties specific to the digital publishing industry. These semantics are designed to allow an author to convey digital book user interface behaviors and structural information to assistive technologies and to enable semantic navigation, styling, and interactive features used by digital book readers. It is expected this will complement HTML5.

The ARIA DPUB Module has undergone significant changes, and the first public Working Draft (published by the W3C PF WG) is planned for mid-April. Comments and issues are very welcome, e.g., through emails or GitHub issues.

The big change, compared to previous versions, is that the definitions of terms have been tightened up significantly. They used to be very book-centric, but they are now more general, meaning that the same terms can be used more broadly on the Web. Lots of details had to be handled (and there is still work to do) to align with terms used elsewhere in ARIA. The superclass roles, related aria attributes, examples for all terms, etc., have all been added.

Subsequent discussions concentrated mostly on how to make this document more understandable to Digital Publishing experts who are not familiar with ARIA. It has been agreed that more examples using the aria attributes should be added, and that a few words should be added on how this work relates to the current practice of epub:type, the work happening within the EDUPUB initiative, etc.

Packaging use cases

A first batch of packaging use cases has been published on the group’s Wiki pages.

These cases don’t address the basics, but they are packaging requirements needed for a publishing workflow. Some examples:

Some other issues and possible directions were also discussed at the call (whether or not DRM-related features need to be added, finding metadata, etc.). The goal is to develop these and other use cases and provide them as input for the ongoing Web Packaging work at W3C (which will have an influence on EPUB-WEB).

Administrivia

Due to Easter Monday, which is a holiday in most of Europe, there will be no meeting on the 6th of April.

DPUB IG Telco, 2015-03-23: CSS fragment draft review, Identifiers, latinreq update

See the minutes online for a more detailed record of the discussions.

Review on CSS Fragments

The Interest Group was asked to review the CSS Fragmentation Module Level 3 Draft, published in January. The overall view is that the specification, when finalized, will take care of a lot of problems around page breaks, column breaks, fragment breaks, etc., combined with various situations like floats.

A number of areas were also identified as practical problems that may not (yet) have been addressed, or addressed adequately, by the draft. One notable area is placed elements like images, video, or movable blocks: how do you handle those (e.g., by possibly reducing their size a bit) while still keeping the page breaks acceptable? In general, dynamic reflow when handling pagination is not (yet) addressed.

The reviewers in the DPUB IG have collected their detailed comments in one or several mails that have been posted to the discussion mailing list of the CSS WG; the detailed discussions will be continued there.

Identifiers

The IG has set up a new task force on Identifiers, whose goal is to consider the technical challenges of defining identifiers in relation to EPUB-WEB. Some introductory material, including background and a rough proposed strategy, has been prepared and added to the Wiki page.

The discussion concentrated on what the detailed goals of the Task Force will be. The feeling was that the group should, primarily, formulate what kinds of requirements the Publishing Community in general, and EPUB-WEB in particular, would have vis-à-vis identifiers.

When considering packages, there are two different aspects: how to get to a specific content file within a package, and then how to get to a specific target within that content file. The latter should reuse, whenever possible, existing media fragment definitions, e.g., as registered by an XPointer scheme and/or by IETF; the former requires further work (and is exemplified today by EPUB’s CFI or the Fragment specification of the Web Packaging Draft). However, it should be emphasized that, in the long term, a URI should look the same no matter what form the publication takes (archive or online); this is important to remember moving forward.

The discussion will continue on the mailing list…

Latinreq

The status of Latinreq (a document describing how page layout should be done in Western languages): it is considered to be a living document, with new issues and aspects being added to it. Contributions on additional features are very welcome.

During the meeting, additional feature requests were mentioned: e.g., how to fit monolithic content into a fixed-size page, placing captions relative to images, and handling tables (e.g., diagonal headers for tables).

Miscellaneous

The group also handled administrative issues like open action items and plans for upcoming face-to-face meetings.

DPUB IG Telco, 2015-03-09: Some Task Force Updates, discussion on Web Packaging

See the minutes online for a more detailed record of the discussions.

Task Force Updates

STEM Task force

The Questionnaire has been sent out to a number of people in several rounds. With the last round (last week), the number of people who have been contacted is around 90, with a deadline for responses set at the end of the month. At the moment, there are 15 respondents, plus some others who, though not replying themselves, have forwarded the questionnaire to colleagues.

Accessibility Task Force

The task force has surveyed the W3C Accessibility Guidelines to see which techniques are relevant for Digital Publishing. Some of them (a dozen or so) are not really required, and some others are not clear (e.g., PDF-related techniques). However, most of what is in the current guidelines is very much relevant for the Digital Publishing Industry.

The more complicated question, which has not been addressed yet, is whether there are issues in Digital Publishing that are not addressed by the guidelines. Such issues may be related to page numbers, drop caps, etc., although some of these things could be addressed via other specs (like CSS).

The issue is that the task force is a little bit low on resources at the moment…

Content & Markup Task Force (Update to the role module)

The Task Force has been working with the W3C PF Group on a draft for a role module. This is now an early editors’ draft; some terms that could be addressed elsewhere have been cut. There is also a need to remove some ambiguity from the term definitions and to make sure the terms make sense outside of Digital Publishing as well.

The challenge is to determine the scope of publishing and the definition of the terms. There are a large number of potential terms (almost a thousand) and a good balance must be found for a module that is neither too large nor too small.

Discussion on Web Packaging

The discussion included Yves Lafon, the W3C staff contact of the Web Applications Working Group. Yves gave an update:

The Web Packaging format started as a way of identifying a package with a URL. The work then moved on to use cases explaining why a package format or document is needed. One of the main drivers was the need to package JavaScript libraries. There is also a strong relationship to service workers: the package is a kind of portable cache format that needs no configuration. We wanted to know whether the work we have done will actually be useful to people. We started to gather input from others and received some security-related input, e.g., on signatures and on parts of a document inside the package. As we know this IG is interested in this type of packaging, it would be good for us to learn whether our approach works for you and what we would need to improve. The current task is to figure out who would be the ideal customer of the specification.

Subsequent discussions concentrated on technical as well as organizational issues. In general, it was agreed that better Digital Publishing use cases, and the resulting requirements, should be collected and forwarded to the Web Applications Working Group to represent Digital Publishing and what it requires from a packaging format. E.g., publishers are looking at ZIP alternatives that work well on mobile, that could also efficiently include large data sets, etc.

One of the main technical issues for the Digital Publishing community is the pros and cons of abandoning ZIP in favor of a new packaging format. One of the arguments against ZIP is that it is not properly streamable. However, it may be possible to add some restrictions to ZIP so that the result is actually streamable. If so, there is a legitimate question whether abandoning ZIP, which is widely deployed through EPUB3 publications, is an acceptable alternative. On the other hand, it is in the interest of the Publishing Community to use a packaging format that can be, and is, natively implemented by browsers.
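
The streamability concern stems from the layout of a ZIP archive: the central directory, which is the authoritative index of the entries, sits at the end of the file, so a reader normally needs the whole archive before it can reliably list the contents. The Python sketch below, using only the standard library, builds a tiny archive in memory and locates these structures. The file names and content are made up for illustration; the sketch only shows the layout and is not a proposal for a streamable ZIP profile.

```python
import io
import struct
import zipfile

# Build a small ZIP archive in memory (entries are made up for the demo).
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("mimetype", "application/epub+zip")
    archive.writestr("chapter1.xhtml", "<html><body>Call me Ishmael.</body></html>")
raw = buffer.getvalue()

LOCAL_FILE_HEADER = b"PK\x03\x04"   # precedes each stored entry
END_OF_CENTRAL_DIR = b"PK\x05\x06"  # end-of-central-directory record

# Entry data starts right at the beginning of the archive...
starts_with_entry = raw.startswith(LOCAL_FILE_HEADER)

# ...but the index of entries is found by searching backwards from the end.
eocd_offset = raw.rfind(END_OF_CENTRAL_DIR)
# Bytes 16..20 of that record hold the offset of the central directory.
(central_dir_offset,) = struct.unpack("<I", raw[eocd_offset + 16 : eocd_offset + 20])

print("archive size:", len(raw))
print("starts with a local file header:", starts_with_entry)
print("central directory starts at byte:", central_dir_offset)
print("end-of-central-directory record at byte:", eocd_offset)
```

Because each entry is also preceded by its own local header, a restricted profile could in principle be read front to back; whether that is sufficient in practice is exactly the open question raised above.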

It was noted that IETF also plans to look at packaging issues (see the IETF WG charter) and is currently considering the W3C Web Packaging Work, too.

The plan for this Interest Group is to (1) find a definite answer on whether ZIP files can be made streamable and (2) collect use cases to be submitted to the Web Packaging work.

DPUB IG Telco, 2015-03-02: Houdini project, EPUB 3.1 workplans

See the minutes online for a more detailed record of the discussions.

CSS WG’s Houdini project

The Houdini Project of the CSS Working Group had its first meeting a few weeks ago in Sydney, and some participants gave a short overview for the IG. The goal of the Houdini Project is to extend CSS. At present, CSS is a big black box where stuff goes in and formatted display comes out; if the magic isn’t what you want, it is difficult to make changes. The goal of the Houdini project is to “open up” that so that scripts might get additional information and control over the layout process and possibly modify how browsers lay things out.

The sense of the Houdini meeting was that there was great enthusiasm, but also some skepticism about how all this can be done. But, behind the big picture, there are a number of smaller things (plumbing, etc.) that will be done and that are useful. E.g., for the Digital Publishing community the possible control over pagination may be the biggest win: putting pagination on top of existing browsers using scripting requires some low-level facilities in order to provide a good reading experience in the browser. Such work will be accelerated if the lower-level work gets going.

It was agreed that it is important that the use cases and possible implementation experiences of e-readers, i.e., of the Digital Publishing Community, should be communicated to the Houdini project.

There is a good introduction and report by Simon Sapin, as well as a summary of Vivliostyle, and a report of the project by Peter Linss to the TAG.

IDPF’s workplans on EPUB 3.1

EPUB 3.0.1 was released several months ago; it included bug fixes, ISO-related wording, and backwards compatibility. IDPF is now thinking of the next version, and this was presented by Markus Gylling at a recent EDUPUB Symposium in Phoenix. Various features to be added were mentioned, like a 3D format and migration of epub:type to the role attribute or to HTML5. Some features, like switch, may also be deprecated. However, at present, all those are just discussion items, with no formal decisions or timeline yet; a discussion among IDPF members should take place first.

HTML Image Description Extension (longdesc) is a W3C Recommendation

The HTML5 Image Description Extension (longdesc) was published today as a Recommendation by the HTML Working Group, with the approval of the Protocols and Formats Working Group. This extension for HTML5 adds a longdesc attribute that is used to provide links to detailed descriptions of images, and is part of W3C’s work to ensure that the Open Web Platform is accessible to people with disabilities.

W3C Pointer Events is a Recommendation

The W3C Pointer Events Working Group has published a W3C Recommendation of Pointer Events. The Pointer Events specification defines a unified set of events and interfaces for device-neutral pointer input, such as a mouse, touchscreen, and pen-tablet, including capabilities for handling pointer pressure, contact geometry, and tilt; it also defines a mapping to traditional mouse events. This specification provides additional functionality not available in the related Touch Events specification; for more information on the relationship between these two specifications, see the Touch Events Community Group.

DPUB IG Telco, 2015-02-23: Identifiers, packaging, & manifests

(Meta comment: the W3C Digital Publishing IG has weekly teleconferences. The minutes of the meetings, as well as short summaries, are available online. However, to give them greater visibility, from now on these summaries will be published on this blog rather than just on the wiki.)

The meeting mostly concentrated on some technical issues around the EPUB-WEB vision. See the minutes online for a more detailed record of the discussions.

Metadata Task force and identifiers

Some of the crucial issues related to EPUB-WEB are around identifiers, fragments, etc. It was suggested that the former Metadata Task Force would concentrate on these, identifying use cases and requirements primarily in the area of fragment identifiers. While the problem area around fragments is relatively clear, the issues on identifiers, and how that would affect EPUB-WEB are more complex. Indeed, many identifiers used out there are based on registries and are only loosely coupled with HTTP URI-s; also, many discussions in that space are happening outside this group. The way forward is probably to “reset” the Metadata Task Force, essentially by creating a new task force to make the intentions clear.

(There are some very initial thoughts on identifiers and EPUB-WEB on the epubweb wiki.)

Overview of the Web Packaging draft

The W3C Web Packaging draft was discussed to see how it would fit in the EPUB-WEB vision (as a possible alternative to ZIP). Ivan Herman has prepared some notes on the document on a wiki page.

Three main areas of attention in the draft are:

  1. Packaging itself, based on (essentially) a multipart MIME approach. The important point is that, conceptually, a package is a concatenation of HTTP responses, including HTTP Headers, for specific resources into one package resource; the package itself may also have its own HTTP Header. This approach brings the package very close to current Web technologies, and provides a rich possibility of metadata on each resource as defined in the HTTP standard. (E.g., an EPUB “spine” can be implemented through these headers.)
  2. Fragment identifier, as defined in the document, is based on the idea of:
    1. define a set of “candidate” parts within the package (listing a set of possible URL-s, for example)
    2. choose among the candidates using some filters (essentially content negotiations based on type or lang).
    3. use a fragment as defined for that specific media type; i.e., EPUB-WEB can rely on existing and evolving fragment identifications for different media without having to reinvent its own.
  3. “Link relations”, either in form of an HTTP Link header or an HTML <link> element. These provide a suitable entry point to an EPUB-WEB document: e.g., a landing page refers to the package (i.e., the possibly offline document).
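
To make the “concatenation of HTTP responses” idea from the first point more concrete, here is a minimal Python sketch that uses the standard `email` package to assemble a generic multipart MIME document in which every part carries its own headers. The `mixed` subtype, the use of `Content-Location` as the per-part header, and the resource paths are assumptions for illustration; the sketch does not reproduce the exact syntax of the Web Packaging draft.

```python
import email
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText

# Hypothetical resources: (URL path, text subtype, content).
resources = [
    ("/moby-dick/index.html", "html", "<h1>Moby Dick</h1>"),
    ("/moby-dick/style.css", "css", "h1 { font-variant: small-caps; }"),
]

package = MIMEMultipart("mixed")  # placeholder subtype, not the draft's
for location, subtype, body in resources:
    part = MIMEText(body, subtype, "utf-8")
    # Per-part headers carry the metadata, much like HTTP response headers.
    part["Content-Location"] = location
    package.attach(part)

serialized = package.as_string()

# Reading the package back: each part keeps its own headers and body.
parsed = email.message_from_string(serialized)
for part in parsed.get_payload():
    text = part.get_payload(decode=True).decode("utf-8")
    print(part["Content-Location"], part.get_content_type(), len(text))
```

The serialized result is plain text, so it can be inspected in a text editor, and the per-part headers are where the richer metadata discussed on the call would live.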

Subsequent discussions looked at the question where such a packaging would be advantageous compared to ZIP. The document mentions facilities of streaming, tooling support, and richer per-part metadata; the feeling on the call was that the last argument is the strongest in favor of Web Packaging (although the availability of HTTP related tooling when handling the content of a package was also deemed to be important).

It is worth mentioning that Dave Cramer made a test of what the (ubiquitous) Moby Dick could look like in a package. The package can be downloaded from the Web (note that the fact that it is a “ZIP” file is just a means to make the file smaller in an email; the package itself can be looked at in a text editor).

It was emphasized that the Digital Publishing community is in a unique position to strongly influence the evolution of Web Packaging, because the work is at its starting phase; there is a window of opportunity right now for joining the relevant Working Group, possibly acting as editor.

Overview of the Manifest draft

The W3C Manifest draft was also discussed to see its relations to EPUB-WEB. Tzviya Siegman has prepared some notes on the document on a wiki page.

The question, from the EPUB-WEB point of view, is whether that manifest format can be used as a manifest for EPUB-WEB documents.

The manifest is a JSON file that can be associated with a resource via a specific <link> element. It has a number of metadata terms that are currently aimed at web applications (icons with their sizes, display formats, etc.). Three specific issues were brought forward:

  1. The manifest has a notion of “scope”: a URL that represents the scope of URLs that can be navigated within context (note that web packaging also has the notion of a “scope”). It is not clear whether that functionality is enough for EPUB-WEB to help in identification.
  2. Display mode: this is one of the terms defined by the manifest and may be very important for personalization.
  3. Openness (or closedness) of the manifest terms: is it possible to add/define additional terms that are more important to the publishing community? It was felt that some sort of an extension structure, whereby various communities could add their own terms, would be a way forward, rather than casting a specific set of terms in concrete.
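
The three issues can be illustrated with a small Python sketch that loads a made-up manifest, performs a rough “is this URL in scope?” check, reads the display mode, and separates out terms it does not know. The manifest content, the `x-publication` extension member, and the simplified scope rule (same origin plus path prefix) are assumptions for illustration, not the normative behavior of the Manifest draft.

```python
import json
from urllib.parse import urljoin, urlsplit

# A made-up manifest; "x-publication" is a hypothetical extension member.
manifest_text = """
{
  "name": "Moby Dick",
  "scope": "/books/moby-dick/",
  "display": "fullscreen",
  "x-publication": {"language": "en"}
}
"""

KNOWN_MEMBERS = {"name", "scope", "display"}


def in_scope(manifest: dict, manifest_url: str, target_url: str) -> bool:
    """Rough check: same origin, and the path starts with the scope path."""
    scope = urlsplit(urljoin(manifest_url, manifest["scope"]))
    target = urlsplit(target_url)
    same_origin = (scope.scheme, scope.netloc) == (target.scheme, target.netloc)
    return same_origin and target.path.startswith(scope.path)


manifest = json.loads(manifest_text)
manifest_url = "https://example.org/books/moby-dick/manifest.json"
# Terms outside the known set are kept aside rather than rejected.
extensions = {k: v for k, v in manifest.items() if k not in KNOWN_MEMBERS}

print(in_scope(manifest, manifest_url, "https://example.org/books/moby-dick/ch1.html"))
print(in_scope(manifest, manifest_url, "https://example.org/other/page.html"))
print(manifest["display"], extensions)
```

Keeping unknown members aside instead of rejecting them is one simple way a consumer could tolerate community-specific extensions of the kind suggested in the third point.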