Web Annotations Workshop Report

W3C convened its first workshop on Web Annotations on 2 April 2014 in San Francisco, California, sponsored by Hypothes.is, colocated with the second annual I Annotate summit, a community event around annotation of everything from ebooks to Web pages to data. The next two days of the I Annotate summit also fed into the topic, as did the hack day that followed. Subsequent to these events, standardization discussion continued with key stakeholders on next steps, culminating in a presentation and discussion at the 2014 Advisory Committee meeting in Boston, Massachusetts in June.

Main workshop discussions

This is a summary of the main presentations and discussions that took place. For a briefer summary from an attendee's point of view, see Kevin Marks' Web Annotation workshop notes.

Introduction

Doug Schepers (W3C, USA) set the stage for the workshop with a presentation that covered the who, where, when, why, what, and how of Web Annotations. He emphasized the goal of a decentralized architecture. He reminded the attendees that annotations are a broad topic, with many different viewpoints and use cases. Schepers concluded with a description of the workshop format and expected outcomes.

Session 1: Existing Annotation Systems & Implementation Issues

The first talk by Randall Leeds (Hypothes.is, USA) gave a technical overview of Web annotations, citing linking and pingbacks as related existing mechanisms. He described the core nature of Web annotations not as the content (or body) of the annotation, but as the association between the annotation body and the primary (i.e., target) content. Leeds also introduced two different models for web annotation structure: inline, in which the annotations and the selection markers are stored in the original target document; and out-of-band, in which the annotations are stored separately from the original target document, along with a set of selectors that describe the selections, which are dynamically mapped to the target content when the annotations are loaded. He also stressed the importance of decentralization for Web annotations, and the concomitant requirements for interoperability and exchange, comprising a common data model and component model, and for distribution, comprising syndication and discovery. Leeds touched on where annotations are stored, including local storage for both ebook readers and for browsers. He identified several technical challenges and opportunities for standardization, including selection, anchoring, styling, discovery, and notification, and offered possible solutions, but also urged a minimalist approach to give developers the basic building blocks, rather than a more comprehensive standards approach.
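The difference between the two models Leeds described can be sketched in a few lines of Python. This is only an illustration of the idea, not code from the talk: the class name, the field names, the `<mark>` markup, and the example URL are all assumptions made for the sketch.

```python
from dataclasses import dataclass


@dataclass
class OutOfBandAnnotation:
    """An annotation stored separately from the document it targets."""
    target_url: str  # where the annotated document lives (hypothetical URL)
    start: int       # character offset where the selection starts
    end: int         # character offset where the selection ends
    body: str        # the annotator's comment


def inline_annotate(text: str, start: int, end: int, note_id: str) -> str:
    """Inline model: write selection markers into the target document itself."""
    return (text[:start] + f'<mark data-note="{note_id}">'
            + text[start:end] + "</mark>" + text[end:])


def resolve(annotation: OutOfBandAnnotation, text: str) -> str:
    """Out-of-band model: map the stored selector onto the loaded content."""
    return text[annotation.start:annotation.end]


document = "Annotations associate a body with a target."
start = document.index("body")
note = OutOfBandAnnotation("http://example.org/page", start, start + 4,
                           "Leeds calls this the content of the annotation.")

marked_up = inline_annotate(document, start, start + 4, "n1")  # changes the document
selection = resolve(note, document)                            # leaves it untouched
```

The inline variant requires write access to the target document, while the out-of-band variant leaves it untouched but depends on the selector still matching when the annotation is loaded.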

The second talk by Chris Gallello (Microsoft, USA) described the existing annotation and commenting functionality of the Microsoft products Office Online (which uses annotations for review comments) and Visual Studio Online (where annotations are used for code errors and break points); he proposed accessibility extensions based on experience with these online products. Gallello explained the user experience (UX) of a reader discovering annotations and the decision tree on when to read or interact with the annotations rather than the annotated content; for sighted users, this revolves around largely visual cues, while for users of screen readers, this involves more auditory cues and keyboard navigation. He proposed new ARIA roles (aria-annotationtype, aria-annotatedby, aria-annotationfor) to facilitate this discovery and navigation, for the inline annotations use case where the annotation authoring environment has the ability to explicitly add markup to the annotated selections of the target content (i.e., write control), which is applicable for some Web annotations.

Anna Gerber (University of Queensland, Australia), the third speaker, described their project's experience with supporting scholarly annotation over the last decade, for a wide variety of primary resource types, and derived several requirements for Web annotations. Among these requirements are the ability to cite annotations, requiring a unique identifier such as a URI; to indicate precise selections of multiple media types (e.g., text, HTML, PDF, raster images, vector images, video, audio, 3D objects, research data such as protein crystallography models), repository types, and dynamic web applications such as mapping services, all of which require the selector model to be extensible; to apply an annotation over multiple versions and locations of the same document, or to reference the concept a document represents rather than the document itself; to preserve the integrity of the original document (i.e., not only inline annotations); and to have multiple targets, selections, and even annotation bodies. But she emphasized that the flexibility of the data model must be balanced with the ease of implementation, and indicated that for interchange, a shared data model is necessary but not sufficient, and that APIs and protocols also need to be defined. She also raised the issue of copyright, in the context of storing selections of the target content as part of the anchoring selector, which may raise concerns for content owners.

The final speaker of this session was Sean Boisen (Logos Bible Software, USA), who described Logos' software and services, which include desktop clients for Windows and MacOS, a social network, and a large library of documents related to biblical study. Many of these texts are ancient and in the public domain, so their differentiating service beyond simply publishing ebooks and documents is their own in-house research and annotation of these documents, exploring word roots across different languages and translations, providing cultural context for concepts, and indexing and cross-referencing all proper nouns, including pronouns and their antecedents. Thus, their customers are paying for these annotations, and the tools that present and utilize the annotations. Most of their business is not currently Web-based, but from their experience, Boisen identified several requirements for open annotation interchange and Web Annotations, including standardized selection reference schemes and bibliographic references, word-level mappings between versions, editions, and translations, and the ability to add topic-specific vocabulary tags.

Session 2: General Requirements on Annotation Models and Systems

Nick Stenning (Open Knowledge Foundation, UK) began the session on General Requirements by indicating that we are at the very beginnings of a long process of enabling Web Annotations, which could transform the Web in unforeseen ways. He started with a recap of some of the many different types of possible annotation targets, noting that users interact with each of these types in different ways. He then set the stage for expectations of the standardization activity by indicating three topics of conversation: user interaction, including creating, reviewing, and moderating annotations; protocols for annotation transport; and the underlying data model, which describes the semantics and structure of an annotation. Stenning urged all implementers to support the data model as the basis to achieve interoperability; he also suggested that there might be some variation in protocols, based on different use cases, and even more variation in user interaction, suggesting that the priority should be to standardize the most basic building blocks and leave room for innovation at the higher levels of the hierarchy. He described the scenario of out-of-band annotations of arbitrary web pages, and enumerated the two approaches in which this could be done today, via either a bookmarklet or a browser extension, each of which has problems, especially in the development process. He described a bookmarklet as a user-initiated cross-site scripting attack, and how the Content Security Policy specification will render bookmarklets inoperable because of security considerations. He complained that browser extension mechanisms are not standardized, and that each is specific to a particular browser. Because of these restrictions, Stenning called for a standardized way to allow user-trusted code to run in the DOM, in order to allow a heterogeneity of user interfaces, so that different groups of people can experiment with different ways to create and edit annotations on the web.

James Williamson (John Wiley & Sons, USA), the second speaker of the General Requirements session, gave detailed background on the use of annotations in the publishing field, from the history to the next generation. He explained that within the publishing industry, there are many annotation activities beyond bibliographies that publishers don't think of as annotation, including aspects of the authoring process like copy editing, proofreading, reprint corrections, errata, footnotes, and reference citations. Williamson expressed frustration within Wiley (and the publishing industry in general) at the reliance on third-party providers for annotation services, including limitations and lack of annotation portability on different devices; he spoke about how users have little control over their reading and note-taking experience, often with no more feedback than the “contact” button, and with limited visibility about an annotation other than a count of how many other users had highlighted a passage, suggesting that a standard way of opting in or out of annotations would improve the user experience and decrease publisher customer-service requests. Williamson then described the Wiley Online Library Journals, specializing in science, technical, and medical journals, which does not allow reader annotations, but which does use annotations for footnotes, publication history, citing literature, and errata; he noted that more fine-grained selection and linking would improve these use cases, and that allowing reader annotations would foster more scholarly discussion. 
Continuing on the subject of third-party annotation services with their academic journals, he listed features such as tracking and evaluation of citations in articles, blog posts, and social media (e.g., Facebook, Twitter), including how it was mentioned and which sections of the article were cited; he indicated that these data are valuable for research, but because such services are done outside their own journal system, Wiley has little access to them. Williamson then touched on existing ebook “contact” functions that often don't allow anchored selections; feedback sent this way is curated by the publisher and sent to the journal organization, who may incorporate these comments into errata and revisions; he noted that the process is often too slow, with an implicit suggestion that annotation could surface these comments to readers more efficiently. Williamson explained traditional publishing revision workflows, including “tear sheets”, where the author drew desired revisions on physical pages of the book that were then sent via postal service, which was replaced by a multi-stage digital process using Microsoft Word and PDF, with color-coding and revision tracking, and inline annotations, conversations, callouts, and “sticky notes” (including graphical stamps for simulating traditional copy-editing notation marks), with revisions sent via email with versions managed manually. Then he described their nascent higher-education platform which aims to improve scholarly communication, including highlights and note-taking; bookmarking; annotations on a granular level for text, tables, images, or maps; categorizing, tracking, and searching user annotations; sharing and permissioning for teachers, students, and groups; testing with confidence evaluations; and performance assessment. 
He concluded with several points for consideration in fostering an annotation ecosystem: curation and quality control; storage and retention policies; and best practices and policies on handling annotations that point to edited or deleted content, or content that is no longer available (e.g., out of print). Wiley's current technology does not use any standards, but they expressed interest in seeing standards develop.

The final speaker in the General Requirements session, Frederick Hirsch (Nokia, Finland/USA), began by relating Nokia's experience in working with technology in education, and noted that seemingly simple, intuitive effects are complicated to achieve; their target environment is a distributed classroom, with teachers and students interacting over the network. From this experience, he explained several education use cases: allowing students to make notes, including easily annotating videos or audio at specific timepoints, and allowing the student to share these notes with groups or teachers; allowing students to respond to assignments via annotation, rather than traditional forms, and enabling threaded conversations by annotating annotations, including teacher review and student revision. These use cases led to several requirements. Many of these, characterized as generalized annotations, such as the ability to annotate a variety of content types and ranges, to tag annotations, to make comparisons, and to reveal the provenance of an annotation (e.g., the author and the time of creation) are already supported by the data model developed by the Open Annotation Community Group, which Hirsch praised. He also expressed a need to iterate through the annotations in an orderly way, when assessing student assignments. Additionally, he spoke of the need for fine-grained access control for students, groups, and teachers, and the related issues of identity, but indicated that this would likely be out of scope for the proposed Web Annotations Working Group. Finally, he indicated that the EPUB committee was also working on this topic, and urged collaboration with EPUB and the W3C Digital Publishing Interest Group on issues such as packaging and delivering annotations.
Hirsch concluded with points about the chartering of the proposed Web Annotation Working Group: he stressed that the Open Annotation Data Model specification should be adopted, and published with minimal changes; he suggested the addition of search to the HTTP API; he proposed that a JSON-LD serialization of the data model should be a required deliverable; and he suggested modifications to the draft charter about specifying deliverables, establishing expectations on testing and adoption, and including liaisons with privacy and security groups.
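To give a sense of what a JSON-LD serialization of the data model looks like, here is a rough Python sketch that builds one annotation and serializes it. The property names and the context URL follow the Open Annotation Community Group draft as best recalled and should be checked against that specification; every value (URLs, names, comment text) is invented for illustration.

```python
import json

# Illustrative values only: the example.org URLs, the person, and the
# comment are made up. Term names are assumed to match the Open Annotation
# Community Group draft's JSON-LD context and should be verified there.
annotation = {
    "@context": "http://www.w3.org/ns/oa-context-20130208.json",
    "@id": "http://example.org/annotations/1",
    "@type": "oa:Annotation",
    "motivatedBy": "oa:commenting",
    # Provenance: who made the annotation and when.
    "annotatedBy": {"@type": "foaf:Person", "name": "A. Student"},
    "annotatedAt": "2014-04-02T10:00:00Z",
    # The body is the comment itself.
    "hasBody": {
        "@type": ["cnt:ContentAsText", "dctypes:Text"],
        "chars": "This passage needs a citation.",
    },
    # The target is a selection within a source document.
    "hasTarget": {
        "@type": "oa:SpecificResource",
        "hasSource": "http://example.org/essay.html",
        "hasSelector": {
            "@type": "oa:TextQuoteSelector",
            "prefix": "annotations are ",
            "exact": "a broad topic",
            "suffix": ", with many",
        },
    },
}

serialized = json.dumps(annotation, indent=2)
print(serialized)
```

Because the serialization is plain JSON, a client that knows nothing about Linked Data can still read the body and selector, while a JSON-LD processor can expand the same document into RDF.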

In the General Requirements open discussion, several more requirements were surfaced. Philip Desenne (edX, Harvard, USA) raised the need for establishing provenance for multiple instantiations of the same annotation, and tying them back to the original source. Alex Garcia (American Psychological Association, USA) raised the distinction between an annotation and a conversation consisting of a chain of annotations on another annotation; Anna Gerber (University of Queensland, Australia) responded that such use cases and requirements were useful, since in the development of the Open Annotation specification, they canvassed the community for data model requirements, but didn't cover the protocols or user application space; Doug Schepers (W3C, USA) opined that from a standardization point of view, there did not seem to be a meaningful distinction between these different types of annotations, but invited further discussion on it. Rick Johnson (VitalSource Technologies, USA) brought up the issue of levels of authority and trust, both implicit and explicit, such as annotations from the author, or publisher, or a teacher or fellow student, and saw a need to be able to codify such authority; Ivan Herman (W3C, The Netherlands) linked this with the notion of provenance; Paolo Ciccarese (Mass General Hospital / Harvard Medical School, USA) differentiated provenance for the scientific realm, breaking it down into provenance of annotations, of collections of annotations, of what is being annotated, and of what its proximate source was, as well as other levels of provenance, and noted that while provenance in the Open Annotation data model was kept deliberately simple, it could be extended to express whatever “levels” were needed, and that similarly, differentiating discussion and grouped or chained annotations could be kept simple, but more complex cases, like splitting topics, could explode in terms of complexity, and praised the Open Annotation data model for splitting out the requirements. Philippe Aigrain (Communs, France) cautioned against codifying any single reputation model within standards, noting that though such use cases and models are important (such as in education, where students are rated and scored), it should not be the province of an application or a standard, but rather that any system should allow the flexibility to use whatever reputation model is relevant, and that the already ambitious scope of Web Annotation standardization should stick to its core requirements.

Session 3: Robust Anchoring

Robust anchoring is the term for reattaching markers and styles to the selection associated with an annotation, even in the face of changes to the target document, including changes to the selection itself. This session was aimed at establishing background and requirements for the standardization of robust anchoring.

Tim Cole (University of Illinois at Urbana-Champaign, USA) spoke on requirements for robust text anchors in the context of scholarly curation, and more specifically in the case where content has been converted into a digital form (via human transcription or scanning and OCR). He reemphasized earlier comments that there should be fine granularity of selection, down to the phrase and word level, with persistence even as the target document changes. Since digitized versions of a document might exist in other formats (e.g., XML, TEI, SGML, PDF), Cole suggested that it would be nice if a single annotation could apply to multiple formats and serializations. To overcome flaws in the digitization process, where all instances of a specific word or phrase might have been incorrectly transcribed, Cole remarked that the annotation anchoring model should allow for “search-and-replace” of corrections across the document, not just a single instance. Cole drew a distinction between types of annotation, such as corrections of a misspelled word versus commentary or intellectual examination of that word. He described the HathiTrust and Text Creation Partnership projects, which include more than 11 million works digitized through OCR and transcription, which need crowd-sourced correction and part-of-speech tagging, which is made easier by annotation; using open annotations would allow discussion on the annotations, sharing with other repositories, and maintaining the portability of provenance. Cole mentioned existing tools to enable this, including Veridian, which they use in their newspaper digitization project. Cole concluded by reiterating the other annotation use case, commenting on the substance of a text passage, rather than correcting it.

Éric Aubourg (Éditions Soleb, France) described Éditions Soleb, which is a small publisher focused on history, and specifically on Egyptology; their publications frequently feature not only modern European languages, but also ancient Greek, Arabic, and Egyptian hieroglyphics. They are moving to a digital-first policy, with basic EPUB, “enriched” EPUB (with JavaScript, for iPad), Kindle, and PDF; they use no DRM, with one purchase for all formats, at 30% of the print price, and only print books when there is enough demand. They are trying to take full advantage of the EPUB possibilities, not just duplicating the print experience, but adding interactivity and non-linear reading, with zoomable high-resolution pictures and interactive diagrams, and are reworking their older publications. Aubourg pointed out some possibly conflicting challenges in satisfying scholarly referencing requirements with interactive content; he indicated the need for highlighting a specific part of a zoomed, tiled image, but also the need to indicate a selection in a human-readable way (e.g., “page 23”), for use in a PhD thesis; at the same time, he indicated that a reference might need to be robust across changes or different editions, with recovery, graceful failure, and fuzzy references, but also be short enough for citations. He suggested several possible solutions, including “fake page numbers”, a paragraph-numbering scheme, or a “short URI” service for making a meaningful “citation word” that can be dereferenced on demand. In response to follow-up questions, Aubourg acknowledged that a full-text quotation would be robust, but also long, and also indicated that he wanted the reference to be independent of source format. Ultimately, his goal is to remove impediments for scholars to quote his publications.

Fred Chasen (UC Berkeley & EPUB.js, USA) focused on the user experience (UX) of creating and anchoring annotations. He critiqued the UX of current reading systems: Amazon Kindle opens up a large input dialog that obscures the text being annotated; Apple iBooks avoids this by opening the input dialog in the page margin, but this results in a small input area; Google Play's bookstore has anchoring issues when text is duplicated, and the annotation dialog also obscures the text; finally, he showed the Annotator home page, which has so many annotations that all of the page text shows selection highlighting, creating an indistinct jumble. Chasen then suggested several best practices to overcome these limitations. He noted that all of these systems used the same model of anchor-selection first and note-writing afterward. He suggested that highlighting and note-making are very different annotation actions, and that notes can refer not only to a specific passage, but also to its surrounding context. In addition, he indicated that the text-selection gesture is overloaded, including copying, sharing, and highlighting, which he suggested complicates the affordance of selection. Chasen proposed a note-first approach for reading systems (like EPUB.js), in which the user first creates the annotation body, then anchors it afterward; in order to facilitate this note authoring, he suggested allowing rich-text features, including markdown, links, images, tables, or more composition space, and allowing the commenter to navigate around the page content during composition, then after composition, allowing the commenter to anchor and re-anchor the annotation as needed. He also suggested that text selection should only be used for highlighting, and that a separate affordance should be used for anchoring, such as a pointer event, a button or icon in the margin. Chasen also suggested displaying the annotation in the margin, to leave ample room for lengthy multimedia annotations.
Finally, he suggested that such notes should have CSS print styles so that they could be printed in the context of their referents (e.g., as footnotes). In conclusion, he indicated that by putting the authoring of the annotation content first and the anchoring afterward, users could spend more time crafting annotations worth reading. His point was reinforced by Anna Gerber (University of Queensland, Australia), who indicated that some literary scholars were uncomfortable with the transient pop-up dialog, and Chasen responded that this research arose from working with a professor who had written a textbook with extensive endnotes, and who wanted students to also be able to write extensive annotations.

Kristof Csillag (Hypothes.is, USA), the final speaker of the Robust Anchoring session, described the anchoring solution used by Hypothes.is for their Web annotation tool, based on Annotator. In order to accomplish their goals, they forked and extended Annotator, and are in the process of reintegrating their changes back into the main codebase. One of the changes that they made was to extend the supported target formats to include EPUB and PDF (using script libraries), and they plan to support other formats in the future. In order to make the anchoring more robust to changes made to the target document, they employed new text anchoring mechanisms, including collecting more context information and more matching strategies: in addition to the XPath range used by Annotator, they created a target object with multiple targets per annotation and multiple selectors per target, and increased the number and types of selectors to include a range selector, a text-position selector for the normalized page content, and a text quote selector that includes not only the target selection but also the 32 characters surrounding it on either end, for context. Because the XPath selector will not work if the document or structure has changed, if it fails, they attempt to find it by applying the other selectors to the normalized page content string, first by character offset position and range, then fuzzy matching to the quote in the context of its surrounding characters, then simple fuzzy matching, and, if found, map the results back to the relevant HTML elements. Some systems, such as PDF.js, only render a few pages at a time, so the whole document is not in memory; for these cases, the anchoring process is “lazy”, done on demand, which is also a strategy applied to dynamic pages that are updated live, such as online editors and “infinite scroll” pages. In these latter cases, they observe DOM changes and hide or show annotations based on matching state.
Another extension to Annotator is adding other types of anchors, such as image anchors, not just text anchors. A more fundamental problem is when the same content is at a different URL, or paginated, or in a different format, or a different language; they have a partial solution to resolve this, but more work needs to be done. Csillag concluded with a flow-chart diagram of their implementation. Kevin Marks raised the question of whether it is prudent to remove annotations if no match is found, since that gives the target author the ability to remove unfavorable comments, to which Csillag replied that a possible solution is to show “orphaned” annotations in a different view, and also suggested that a snapshot view could be taken of the original document, which they plan to do. Anna Gerber (University of Queensland, Australia) asked how they got around copyright issues in storing text quotes; Csillag indicated that at this stage, they are only working on the technical solution, and also indicated that they can still attempt fuzzy anchoring without text quotes by using other selectors.
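The fallback order Csillag described (position first, then the quote in context, then the quote alone) can be sketched roughly as follows. This is not the Hypothes.is implementation: the function names and the 0.8 similarity threshold are assumptions, and Python's standard difflib stands in for whatever fuzzy matcher a real tool would use. The brute-force window scan is only suitable for short texts.

```python
from difflib import SequenceMatcher
from typing import Optional, Tuple

CONTEXT = 32  # characters of context stored on each side of the quote


def describe(text: str, start: int, end: int) -> dict:
    """Build position and quote selectors for text[start:end]."""
    return {
        "start": start,
        "end": end,
        "exact": text[start:end],
        "prefix": text[max(0, start - CONTEXT):start],
        "suffix": text[end:end + CONTEXT],
    }


def fuzzy_find(text: str, pattern: str,
               threshold: float = 0.8) -> Optional[Tuple[int, int]]:
    """Return the best-matching window for pattern, or None if below threshold."""
    if not pattern:
        return None
    n = len(pattern)
    best, best_score = None, 0.0
    for i in range(max(1, len(text) - n + 1)):
        window = text[i:i + n]
        score = SequenceMatcher(None, window, pattern, autojunk=False).ratio()
        if score > best_score:
            best, best_score = (i, i + len(window)), score
    return best if best_score >= threshold else None


def anchor(text: str, sel: dict) -> Optional[Tuple[int, int]]:
    """Re-attach a selection to text; None means the annotation is orphaned."""
    # 1. Position selector: the offsets still point at the same quote.
    if text[sel["start"]:sel["end"]] == sel["exact"]:
        return sel["start"], sel["end"]
    # 2. Quote in context: find prefix + quote + suffix, then the quote inside it.
    context = sel["prefix"] + sel["exact"] + sel["suffix"]
    window = fuzzy_find(text, context)
    if window is not None:
        inner = fuzzy_find(text[window[0]:window[1]], sel["exact"])
        if inner is not None:
            return window[0] + inner[0], window[0] + inner[1]
    # 3. Quote alone, anywhere in the document.
    return fuzzy_find(text, sel["exact"])


original = "The quick brown fox jumps over the lazy dog near the river bank."
start = original.index("lazy dog")
selector = describe(original, start, start + len("lazy dog"))

changed = "Breaking news: " + original   # offsets shift, text survives
found = anchor(changed, selector)
orphan = anchor("completely unrelated content here", selector)
```

Returning None rather than discarding the annotation leaves room for the "orphaned annotations" view raised in the discussion.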

Session 4: Data Annotation

In addition to selecting and annotating HTML or other text formats, there is a need to apply annotations to various data sources, which was the aim of this session.

Robert Casties (Max Planck Institute, Germany) began the Data Annotation session with an overview of what the Max Planck Institute for the History of Science does with annotations, working with digital formats including scans, images, and data (e.g., Linked Data, databases, bibliographies). He described their vision as “weaving a web of knowledge”, enabling a close reading of various sources (preferably open-access sources), and creating and sharing comments, relations, and narratives to create a “semantic network” where more information about a source can be inferred from its annotations. He outlined their near-term goals as being able to annotate resolution-independent images, including selection of points, rectangles, and arbitrary polygonal areas, and longer-term, to be able to relate things between different sources and to have a rich, reliable provenance model. He critiqued the available options for image selectors in the Open Annotation Data Model: regarding Media Fragment selectors, he suggested that they are too simple, only allowing rectangles along axes, with units expressed only as pixels or integer percent units, which is not sufficient for giga-pixel resolutions; regarding SVG selectors, he suggested that they are too complicated, requiring an XML parser, with many ways to describe the same geometry and coordinate system, and many other features not related to defining areas, while he felt it would be easier to have a well-defined set of specific features. As an alternative, he proposed a fractional relative coordinate syntax, in the range of 0 to 1 with the decimal precision determining resolution, which could be used in media fragments; while this would not conserve the calculation of areas and angles, it would allow different resolutions and some client-side reconstruction of the area location, saving server round-tripping, and would also allow the server to upgrade the resolution of the image without breaking the annotation selection.
For allowing more complex shapes besides rectangles, Casties suggested using Well-Known Text (WKT) or GeoJSON, which already have geographical coordinate systems, which could perhaps be extended for giga-pixel images. Casties concluded by asking how to move forward in standards, suggesting adding WKT or GeoJSON as Open Annotation Data Model selectors, or perhaps a more structured selector, and speculated about integrating GeoJSON into Annotator.js. Kevin Marks asked why we shouldn't reuse the HTML image-map shapes; Casties had thought about it, but clarified that it wasn't as easy to work with since it doesn't have identifiers, and indicated that WKT was the closest equivalent. Paul Anderson (Benetech, USA) reinforced the need for fractional image selectors, but also indicated that there are some instances where pixels are needed, especially for small images.
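A small Python sketch may help show why fractional coordinates survive a change of image resolution, and what a polygon in WKT looks like. This is an illustration under stated assumptions, not the syntax Casties proposed: the function names and the tuple layout are hypothetical, and only the general `POLYGON ((x y, ...))` form is taken from WKT itself.

```python
from typing import List, Tuple

Rect = Tuple[float, float, float, float]  # x, y, width, height, each in 0..1
Point = Tuple[float, float]               # x, y, each in 0..1


def to_pixels(region: Rect, image_width: int,
              image_height: int) -> Tuple[int, int, int, int]:
    """Map a fractional rectangle onto an image of the given pixel size."""
    x, y, w, h = region
    if not all(0.0 <= v <= 1.0 for v in region) or x + w > 1.0 or y + h > 1.0:
        raise ValueError("region must lie inside the unit square")
    return (round(x * image_width), round(y * image_height),
            round(w * image_width), round(h * image_height))


def to_wkt(points: List[Point]) -> str:
    """Serialize a polygon in fractional coordinates as a WKT POLYGON."""
    ring = list(points)
    if ring[0] != ring[-1]:
        ring.append(ring[0])  # WKT rings are closed: last point repeats the first
    return "POLYGON ((" + ", ".join(f"{x} {y}" for x, y in ring) + "))"


region = (0.25, 0.5, 0.5, 0.25)
small = to_pixels(region, 1000, 800)    # a preview rendering
large = to_pixels(region, 4000, 3200)   # a later high-resolution scan
triangle = to_wkt([(0.1, 0.1), (0.4, 0.1), (0.25, 0.3)])
```

The same stored region resolves correctly against both image sizes, which is the property that lets a server upgrade an image without breaking existing selections.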

Raquel Alegre (University of Reading, UK), the final speaker in the Data Annotation session, gave a summary of the CHARMe project's use of annotations with a climate dataset. She described the problem in how climate datasets frequently have metadata generated and discovered by users, including external events that may have affected the climate data such as a volcanic eruption or a failure in a satellite sensor, and that metadata may be on the Web, but it's not properly linked to the dataset, making it hard for researchers to discover relevant information or to analyze and select appropriate datasets. She enumerated several types of climate datasets: a data table; data along a period of time (perhaps represented by a data visualization); a map; or a more complex format like a 3D model; or an animation. She asserted that if the dataset has a URL, it's straightforward to link the metadata to the dataset using the Open Annotation Data Model, allowing research, sharing, and discussion on details of event timing and specific locations, and intercomparison of datasets. She described a challenge, however, in selecting specific data selections using the SVG selector recommended by the Open Annotation Data Model: a given area may have a variable resolution of data, depending on the number, type, and sensitivity of sensors in that area, and any given area may also have layers of data at different depths; Alegre asserted that SVG could not adequately represent a selection in these circumstances. Alegre suggested adding a geographical selector, like Well-Known Text (WKT), and also a temporal selector to select specific points in time. She concluded by emphasizing that this goes beyond climate data into many other fields, especially geospatial and mapping topics, and noted that while W3C and the Open Geospatial Consortium have started collaborating, there is still no standardized way of doing this.

Session 5: Storage and APIs

Gregg Kellogg (USA) began the Storage and APIs session by describing Hydra and JSON-LD, and how structured vocabularies can enable the better use of APIs for applications. He explained that many services provide RESTful APIs using JSON, but that each application must adapt to each other application's API; he characterized the use-case for JSON-LD as partially solving this problem by unambiguously describing all the properties with well-known URIs and data models for the entities and values used in the API; he suggested Hydra as a way to express this vocabulary, calling it the intersection between Linked Data and REST. Kellogg described a problem of using an API without hardcoding specifically for that API, and offered the solution of a vocabulary that defines the set of operations on classes and properties, and proposed that annotations are the results of these operations or the relationship between entities, which may themselves be operated upon. He illustrated his proposal with the example of fan activities around sports figures, teams, and matches, with the aim of interoperable social operations (e.g., like, dislike, follow, share question, suggest relation); this work is an extension of vocabularies from Schema.org. Kellogg described Hydra as a level of abstraction that allows generic operations on entities, types of entities, or their properties. He concluded by listing a few outstanding issues, such as identifying when a property links to an entity rather than to a page describing that entity, how to maintain the data model with subsets of the larger dataset (“pagination”), identifying which operations can be performed on which entities or properties, and how to manage authentication and authorization.

In the second presentation of the Storage and APIs session, Jason Haag (IEEE Learning Technology Standards Committee, USA), on behalf of the Advanced Distributed Learning Initiative, a US Government learning technology research activity, described the Experience API (xAPI), also known as the “Tin Can API”, which is a way of storing representations of social actions; xAPI is based on the ActivityStreams API, which was a collaboration between Google, Facebook, Microsoft, and others. Haag gave the background on xAPI, which came from the open source community rather than a government activity, based on a learning and training technology need that goes beyond SCORM. He described xAPI as a RESTful API that describes social actions in the triple form [Actor] [Verb] [Object], permitting data storage and retrieval not only on formal courses, but on experiences and real-world learning, as well as sensor data. He indicated use cases including mobile apps, simulators, and virtual worlds, both for individuals and for groups. He emphasized the readability of the format by both humans and machines. He also described the “Learning Record Store” as another component in the architecture, being a triple store that allows for integration with other services and analytics. He provided links to more data about xAPI, and indicated healthy vendor adoption and activity, including e-learning authoring tools. Haag then detailed the background and timeline of the related IEEE Actionable Data Book (ADB) R&D activity, proposed by Tyde Richards, the chair of the IEEE Learning Technology Standards Committee; the goal was to use EPUB3 not only to enable access to digital books, but also as a means of recording and tracking reading and learning activity, including annotations, in a distributed way; the first phase was a feasibility study, followed by a prototype and implementation phase. Haag validated the effort by the positive reaction of the IDPF, and signaled IEEE's intent to standardize xAPI.
Haag then showed screencaps from demos of prototypes of xAPI combined with EPUB3 and Annotator in several readers, such as iBooks, Readium, EPUB.js, and Calibre, including a demo with embedded video; he followed this with code examples of the JavaScript, the Learning Record Store, and the Open Annotation data model JSON-LD serialization. Haag expressed interest in further experimentation and collaboration with the annotation community, and future directions such as widgets and bookmark synchronization across platforms and readers, and concluded with a quick comparison of data models between xAPI and Open Annotation, including entities such as id, actor, object, verb, result, context, timestamp, and attachments. Audience follow-ups included comments around collaboration, consensus, and statistical data; Frederick Hirsch asked about long-term persistence and sustainability of EPUB3, which was fielded by Tyde Richards, who contrasted EPUB3 with older digital formats used by the government, indicating that with HTML5 as the baseline, even though formats may change, there would be improved sustainability.
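
The [Actor] [Verb] [Object] form can be sketched as a small Python helper that builds a statement-shaped record. This is an illustrative sketch rather than the code Haag presented: the helper name, the verb URI, the e-mail address, and the book URL are hypothetical, and a real Learning Record Store would apply further validation.

```python
import json
from datetime import datetime, timezone


def make_statement(actor_email, actor_name, verb_id, verb_label, object_id):
    """Build an xAPI-style statement in the [Actor] [Verb] [Object] form."""
    return {
        "actor": {
            "objectType": "Agent",
            "name": actor_name,
            "mbox": f"mailto:{actor_email}",
        },
        "verb": {"id": verb_id, "display": {"en-US": verb_label}},
        "object": {"objectType": "Activity", "id": object_id},
        "timestamp": datetime.now(timezone.utc).isoformat(),
    }


statement = make_statement(
    "reader@example.org",
    "Example Reader",
    "http://example.org/verbs/annotated",  # hypothetical verb identifier
    "annotated",
    "http://example.org/books/sample.epub#chapter-2",
)
print(json.dumps(statement, indent=2))
```

Read aloud, the record says "Example Reader annotated chapter 2", which reflects the human and machine readability that Haag emphasized.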

Jake Hartnell (UC Berkeley & EPUB.js, USA), the final presenter in the Storage and APIs session, introduced himself as a science-fiction writer of the book 23rd Century Romance, and thus interested in a future with a truly read-write web. He hypothesized a future browser in which annotation and fine-grained selection-based addressability is possible; he described an annotation workflow in which a selection could be commented upon by authoring a new HTML document, perhaps in a specialized annotation web component, and raised the issue that the annotation needs to be stored somewhere; his proposed solution was the concept of annotation “channels”, or annotation document stores, each with custom settings, groups, and editing interfaces, where users subscribe to the channels of their choice, through discoverability and filtering. He indicated that each channel can serve its own use cases, such as scientific research or classroom learning, and that for each use case, the UI or “kit” you would want would be different; he showed a mockup of a browser settings menu offering multiple different annotation services. Hartnell then described a possible annotation reading workflow, in which a browser loads a page, then queries all the channels the user is listening on for relevant annotations (including “meta-channels” with smaller groups or annotation aggregator services, like RSS streams), then loads the annotations into a sidebar, which can be searched, filtered by channel or other criteria to find other relevant information. He reemphasized the need for customizable sidebar content and controls, for different contexts (e.g. multimedia annotations for music discovery and discussion versus largely text-based annotations), including different visualization, discovery, and interaction modes. 
Hartnell concluded with three requirements for standardization: the ability to link to any selection on a page, with the browser handling the linking, selection, and reattachment of annotations; a high degree of user control over what they see and how they interact with it, such as an annotation service requesting user permission to load its annotations, to combat “noise”; and a “space” for this annotation content to reside in the browser, including consideration for storage of personal documents and notes. In the follow-up questions, Tim Cole picked up on the point about user control, and suggested that there should be a compromise between total user control and more traditional publisher control, wherein the publisher might suggest different channels where a document is being annotated. Barbara Tien wondered about policies on annotation channels that have a particular bias (e.g. "Fox News annotation channel on the White House website"), but Hartnell responded that each user and community should have access to the tools that suit them best, and that the role of the annotation standardization and services is to provide the basic infrastructure for discussion to thrive. Philippe Aigrain praised the notion of a browser space for annotations, citing a historical analog in “commonplace books”, and criticized Zotero for making notes subsidiary to the document; Aigrain emphasized that this space must belong to the user. A. Karriem Khan suggested a standardized way for a user to identify their identities and interests, to allow people to share their annotations with like-minded others, and to moderate discussions, with the aim of improving the public dialog while letting individuals control their own data. Hartnell expanded on the notion of a “user space” in browsers, further speculating on how third-party services could interact with documents (such as highlighting), and alluded to earlier conversations about bookmarklets and browser extensions.
Genesis Kim asked how we could encourage participation by younger people such as himself and lower barriers to entry for those who have things to say but may not reach outside their comfort zone of familiar services like Facebook and games; Hartnell replied that if developers are given tools, they can build compelling experiences and services, citing Rap Genius as a popular example; Doug Schepers suggested that if browsers were to support annotations natively, the barrier to entry would be very low.
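
The reading workflow Hartnell described (load a page, query each subscribed channel, then fill and filter a sidebar) can be sketched in Python. The `Channel` and `Annotation` classes and the `load_sidebar` function are hypothetical names for illustration; the channels here are in-memory lists, whereas the proposal envisaged remote annotation stores queried by the browser.

```python
from dataclasses import dataclass, field


@dataclass
class Annotation:
    target_url: str
    quote: str
    comment: str
    channel: str


@dataclass
class Channel:
    """A hypothetical annotation store that a user can subscribe to."""
    name: str
    annotations: list = field(default_factory=list)

    def query(self, page_url):
        """Return the annotations this channel holds for the given page."""
        return [a for a in self.annotations if a.target_url == page_url]


def load_sidebar(page_url, subscribed_channels, only_channel=None):
    """Collect annotations for a page, optionally filtered to one channel."""
    results = []
    for channel in subscribed_channels:
        if only_channel is not None and channel.name != only_channel:
            continue
        results.extend(channel.query(page_url))
    return results


page = "http://example.org/article"
research = Channel("research", [Annotation(page, "the result", "See follow-up study.", "research")])
classroom = Channel("classroom", [
    Annotation(page, "the method", "Discuss in class.", "classroom"),
    Annotation("http://example.org/other", "intro", "Unrelated page.", "classroom"),
])

everything = load_sidebar(page, [research, classroom])
classroom_only = load_sidebar(page, [research, classroom], only_channel="classroom")
print(len(everything), len(classroom_only))
```

Because the user chooses which channels appear in `subscribed_channels`, the sketch also reflects the requirement that users control which annotation sources they see.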

Session 6: Accessibility & legal issues

Gerardo Capiel (Benetech, USA) opened the Accessibility & legal issues session by giving an overview of how annotation can help address accessibility needs, especially non-textual content on the web and in ebooks. He showed examples of blind students reading, using screen readers; he then explained how visual resources, such as images, videos, and math rendered as an image, tend to be poorly described, how video technology has poor implementation for descriptions and captioning, and how MathML is poorly supported by browsers. He related that blind users usually have to rely on other people to help them access knowledge, such as friends, parents, teachers and aides, and Disability Support Services (DSS) offices in higher education. He then demonstrated existing crowdsourcing annotation tools and techniques for accessibility: he showed the DIAGRAM Center's Poet tool, which allows users to annotate images and math; he showed YouDescribe, which allows users to provide inline audio descriptions of YouTube videos; and he showed WebVisum, a Firefox extension which allows users to annotate images on any third-party site. From these concrete use cases, he derived several annotation requirements, including: accessible user interfaces; support for granular image annotation; annotations in HTML and MathML markup; a mechanism to request annotations from sighted users; metadata to identify the annotations as alternatives; and a mechanism for original publishers to query, analyze, and pull in "crowdsourced" descriptions and transcriptions. He concluded by noting that this crowdsourced annotation is happening today with disparate tools, and that to empower and amplify the impact of these “Good Samaritans”, there need to be standardized annotation mechanisms.
Mitar Milutinovic noted that in most Web Annotation systems, the way to indicate a selection is through a visual highlight, and asked what the best practices for selection indicators for blind users are; Capiel responded that screenreaders do allow for the selection of ranges. A. Karriem Khan asked why MathML was considered a best practice for accessible math representation; Capiel responded that MathML is machine-readable, so screenreaders can speak it, and that a user can navigate around different parts of a MathML equation; Tim Cole noted that there are different “flavors” of MathML, and that LaTeX also has good voicing support.

Puneet Kishor (Creative Commons, USA) closed the Accessibility & legal issues session with an appeal to avoid legal rats-nests by providing a means in the annotation data model to express the legal status of the annotation. He offered the caveat that he is not a lawyer. He noted that short annotations, or small collections of annotations, would likely not have enough creative content in them to be copyrightable, but over time, longer annotations or larger collections of annotations (similar to “Cliff's Notes”), may emerge, and that a third party may wish to reuse or republish those annotations. He outlined three suggestions for legal requirements: that each annotation should carry the information in it to determine its legal status; that because annotations can build upon other annotations, you should be able to track the provenance of each annotation in a dependency chain, so that people can exert their legal rights for commercial or derivative use, and that an annotation service should establish a clear policy on annotations published there, suggesting that a single blanket license per service simplifies the legal implications; and that this mechanism should work across platforms so that when annotations are downloaded or shared, the license information travels with them. He noted that Creative Commons already has the RDF that could be embedded in an annotation, but that there could be something even simpler. Ivan Herman agreed with Kishor, but asked if the Creative Commons licenses were sufficient, especially in the case of data; Kishor indicated that the latest revision of Creative Commons licenses, CC-4, is appropriate for use with data, and that Creative Commons does not anticipate that the need for revision will arise for another decade.
Gregg Kellogg raised the issue of the “right to link”, and asked if there should be a “right to annotate”, or if CC licenses might restrict the right to annotate; Kishor clarified that CC licenses do not restrict, but rather enable, though they do sometimes impose conditions such as noncommercial-use only, and indicated he wasn't aware of any “right to link or annotate” restrictions, though noting that some entities do try to impose terms of service in a way he characterized as “bad Web citizens”. Doug Schepers, as moderator, clarified that Kishor was talking about establishing a license for the annotation itself, not for the material being annotated, and that questions directly to Kishor should stay relevant to that context, though he left open the floor for discussions of other legal issues; Kishor agreed, and noted that an entity can only provide a license for materials that it owns, not those owned by other entities, (e.g. “You can only license your own rights”). Tim Cole asked if there was a legal mechanism to establish the rights and license of an annotation in the case of impersonation (and by extension, anonymity and pseudonymity), noting that most annotation tools accept claims of identity; Kishor disclaimed knowledge on how to answer the question, noting that Creative Commons licenses don't address issues of authentication; Cole accepted that the responsibility lies with the application. 
Anna Gerber noted that one must be careful when establishing a license for the different parts of an annotation, indicating that an annotation may have multiple bodies, which the creator of the annotation may not have rights to, and that selectors may contain quotes from the target document, which the annotation creator also may not have rights to; Kishor reiterated that you cannot license what you don't create, noted that a snippet of content from another document would likely count as “Fair Use”, and established that Creative Commons licenses don't make assertions on the legal rights of a content creator, but only establish that creator's intent and indicate social norms and expectations for that content; Schepers and Gerber clarified the technical requirement that any license on an annotation should apply to the specific part (one or more bodies or selectors) being licensed, not to the annotation as a whole. Frederick Hirsch asked why we need licenses for annotations, noting that he hasn't established a license for his Twitter tweets; Kishor responded that the legal status of short-form comments, and other content, is not necessarily established, and that indicating a clear license helps avoid potential legal confusion; Hirsch followed up with a point about the legal status of a collection of annotations with different authors, noting that the legal exploration hasn't happened. 
Phil Desenne noted that one use case for having license information about the source document is that if enough annotations are made on a document, with each annotation including selectors that quote that document, a copy of that document has been made, and someone could recreate the original document from the annotations; Kishor responded that he was not advocating for attaching license information to an annotation, but rather that the legal status of an annotation should be made clear, to prevent future confusion, and also noted that if the target document were CC-licensed, it would help address the issue; Rob Sanderson related that this case of “theft by annotation” had already happened to the Steve Jobs biography, of which someone had annotated every hundred characters with the sequence on how to recreate the book, underscoring that there need to be multiple ways to describe the selectors and anchors, some of which can be used for open text and some for closed. Philippe Aigrain expressed sympathy for the notion of establishing behavioral norms through licensing, but also concern that W3C standardization might carry with it not only expression of license, but also enforcement; Kishor responded that a license is not enforcement, that only taking someone to court is enforcement, but also that copyright is a legal fact, and CC licenses help work around that; Aigrain reiterated that he was concerned about DRM. A. Karriem Khan noted that the standard should enable expression of license to empower collaboration.
Tantek Çelik noted that APA and MLA citation styles for tweets include the entire text of the tweet, and mention nothing about copyright, and opined that if multiple well-respected organizations encourage full-text copying, it establishes an expectation and a cultural norm around copying small pieces of content; Kishor noted that tweets are by necessity limited in length, but that other annotations could be very long, and gave the example of a thousand-word annotation of a portrait which would implicate copyright. Ivan Herman reiterated that a W3C standard should provide license “hooks” for those who wish to express a license, but that the standard should do no more than that; Kishor agreed, responding that Creative Commons had already created the licenses, and that the annotation model only needed to provide a way to apply them.
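
One way to picture the license “hooks” discussed in this session, with the license attached to a specific part of an annotation rather than to the annotation as a whole, is the sketch below. It is only an assumption about how such a hook might look: the workshop did not settle on a property, the `cc:license` key is borrowed from the Creative Commons RDF vocabulary for illustration, and the identifiers and the `licensed_parts` helper are hypothetical.

```python
CC_BY = "https://creativecommons.org/licenses/by/4.0/"

annotation = {
    "@type": "oa:Annotation",
    "hasBody": [
        # A body written by the annotator, who can therefore license it.
        {"@id": "urn:example:body-1", "chars": "My own commentary.", "cc:license": CC_BY},
        # A third-party body: the annotator asserts no license for it.
        {"@id": "http://example.org/someone-elses-essay"},
    ],
    "hasTarget": "http://example.org/page",
}


def licensed_parts(record):
    """Map each body that carries a license to that license URL."""
    return {
        body["@id"]: body["cc:license"]
        for body in record["hasBody"]
        if "cc:license" in body
    }


print(licensed_parts(annotation))
```

Keeping the license on the individual body means the record travels with its license information when shared, without claiming rights over parts the annotator did not create.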

Session 7: Charter discussion

Because many of the participants were unfamiliar with W3C charters, some preparatory explanation was made, and efforts continued throughout the discussion to keep on topic and in scope for establishing the charter requirements.

The charter discussion stepped through each deliverable in the charter, and refined the wording to reach consensus.

There were several changes of note:

The robust anchoring deliverable was changed from an “algorithm” to a “mechanism”, and a point was made about the need for extensibility.
JSON-LD was made an explicit required serialization.
The HTTP API had REST removed as a requirement, and the definition was expanded to include search, manage, and other operations.

There were several suggestions that “horizontal” requirements, such as accessibility, security, privacy, and internationalization, be made more explicit and strengthened.

One notable point of contention was around the robust anchoring deliverable. Throughout the day, many people had indicated the critical need for some robust anchoring mechanism. Edward O'Connor expressed the opinion that because of the complexity of robust anchoring, the potential Web Annotations WG should either drop robust anchoring as a deliverable, or should do only robust anchoring; he noted that because it would affect browser implementation, it should be done in a working group where browser implementors were already well-represented, such as the WebApps WG. Subsequent discussion established that robust anchoring should be a joint deliverable of the Web Annotations and WebApps Working Groups.

No suggestions for additional deliverables to the charter were proposed. No deliverables were dropped from the charter.

All consensus-based suggested changes to the charter were incorporated into the most recent version of the charter.

Workshop Schedule

8:00–8:30	Registration
8:30–9:00	“Opening remarks & Keynote.” Ivan Herman and Doug Schepers, W3C. (Slides, Video)
9:00–9:45	Existing Annotation Systems & Implementation Issues
	“Hypermedia Notebooks and User Centered Publishing.” Randall Leeds, Hypothes.is, USA. (Submission, Slides, Video)
	“Microsoft Position Paper.” Chris Gallello, Microsoft, USA. (Submission, Slides, Video)
	“Supporting Web-based scholarly annotation.” Anna Gerber, University of Queensland, Australia. (Submission, Slides, Video)
	“Sharing and Contributing Annotations.” Sean Boisen, Logos Bible Software, USA. (Submission, Slides, Video)
9:45–10:00	Break
10:00–10:45	General Requirements on Annotation Models and Systems
	“Annotation on the Web.” Nick Stenning, Open Knowledge Foundation, UK. (Submission, Slides, Video)
	“Wiley Position Paper.” James Williamson, John Wiley & Sons, USA. (Submission, Slides, Video)
	“Position Paper for Annotation Workshop.” Frederick Hirsch and Vlad Stirbu, Nokia, Finland/USA. (Submission, Slides, Video)
	Requirement Discussion (Video)
10:45–11:30	Robust Anchoring
	“Position Statement.” Timothy Cole and Thomas Habing, University of Illinois at Urbana-Champaign, USA. (Submission, Slides, Video)
	“Soleb Position Paper.” Éric Aubourg, Éditions Soleb, France. (Submission, Slides, Video)
	“Point of View on Annotations in Reading Systems.” Fred Chasen, UC Berkeley & EPUB.js, USA. (Submission, Slides, Video)
	“Robust Anchoring.” Kristof Csillag, Hypothes.is, USA. (Submission, Slides, Video)
	Robust Anchoring Discussion (Video)
11:30–13:00	Break, Lunch; Birds of a Feather topic tables
13:00–13:45	Data Annotation
	“Sharing Knowledge about climate data using Open Annotation: the CHARMe project.” Raquel Alegre, University of Reading, UK. (Submission, Slides, Video)
	“Better Image Area Annotations.” Robert Casties, Max Planck Institute, Germany. (Submission, Slides, Video)
	Data Annotation Discussion (Video)
13:45–14:30	Storage and APIs
	“Hydra for Web Annotations.” Gregg Kellogg, USA. (Submission, Slides, Video)
	“Evaluating the Experience API (xAPI) for Annotation.” Jason Haag and Tyde Richards, IEEE Learning Technology Standards Committee, USA. (Submission, Slides, Video)
	“Open Annotation Architecture and Scope.” Jake Hartnell, UC Berkeley & EPUB.js, USA. (Submission, Slides, Video)
14:30–15:00	Break
15:00–15:45	Accessibility & legal issues
	“Annotation as a Tool for Accessibility for Blind and Vision Impaired Students.” Gerardo Capiel, Benetech, USA. (Submission, Slides, Video)
	“Could semantic web and accessibility be BFF (best friends for ever) in image annotation?” Mireia Ribera and Bruno Splendiani, University of Barcelona, Spain. (Submission)
	“Licensing Annotations.” Puneet Kishor, Creative Commons, USA. (Submission, Video)
15:45–16:30	Charter discussion (Video)
16:30–17:00	Break
17:00–17:30	Closing statements (Video)
18:30–21:00	Evening event (hosted by Hypothes.is)
