- 1 Task force on "Metadata"
- 1.1 Members (Please add your name, organization, and preferred contact email)
- 1.2 Scope, Issues, and Potential Work Items
- 1.2.1 SCOPE
- 1.2.2 BASIC ISSUES
- 1.2.3 ISSUES REGARDING THE USE OF METADATA IN THE OPEN WEB PLATFORM
- 1.2.4 POTENTIAL OTHER WORK ITEMS
Task force on "Metadata"
- Leader(s): Bill Kasdorf, Apex, firstname.lastname@example.org, Madi Solomon, Pearson, email@example.com
Members (Please add your name, organization, and preferred contact email)
- Ivan Herman, W3C, firstname.lastname@example.org
- Jean Kaplansky, Aptara, email@example.com
- Tzviya Siegman, Wiley, firstname.lastname@example.org
- Tim Clark, Mass General Hospital, email@example.com
- Tom De Nies, Ghent University - iMinds - MMLab, firstname.lastname@example.org
- Phil Madans, Hachette, email@example.com
- Luc Audrain, Hachette Livre, firstname.lastname@example.org
- Hajar Ghaem Sigarchian, Ghent University - iMinds - MMLab, email@example.com
- Madi Solomon, Pearson, firstname.lastname@example.org
- Julie Morris, BISG, email@example.com
- Dave Cramer, Hachette, firstname.lastname@example.org
Scope, Issues, and Potential Work Items
1. Identify problems regarding the use of metadata by publishers on the Open Web Platform
[IH:] To emphasize: some of the problems may lead to a request to W3C to start up a new (Interest or Working) Group to solve the issues, because there may not be a target group currently running. That is perfectly fine, but a proposal to start up such a group should clearly identify the use cases, the major potential beneficiaries, and the participants in such an endeavor.
2. Collect Use Cases
1. Is the DPIG the right group to address these issues?
Is this issue within the scope and charter of the DPIG, and is this Task Force the proper group to address it, or does the metadata issue require a separate IG or other activity within the W3C to be addressed properly?
[TF: Add your comments below, prefixed by your name in brackets. Note that comments added to the next question will affect our ultimate answer to this question.]
[RS] From email: the distinction between cataloguing requirements for metadata and recommendations on how to convey the metadata is important to keep clear. Is there a way to fulfill these needs, rather than picking one way, often from many existing methods? Given that understanding, I think it is in scope for DPIG, and valuable to the wider community.
BK: Does the scope of the DPIG include libraries? The focus so far has been on what publishers need from the OWP. Publishers do a lot with metadata but they don't think a whole lot about cataloguing explicitly. That, instead, is mainly a concern of librarians. We need clarification whether the needs of librarians are part of the scope of DPIG. Librarians are obviously very active in other W3C activities, particularly Semantic Web activities.
[PM:] After thinking about yesterday's meeting, I don't know that we should discount libraries here. I certainly agree with Luc's point that for publishers metadata is very much about discovery--connecting with potential readers. This goes for the digital and physical worlds, starting with book jackets and advertisements, which are nothing if not containers of metadata, as much as an ONIX feed is. Libraries also use metadata for discovery purposes, to connect with their patrons. Isn't this what cataloging is about: providing enough information for patrons to find exactly what they need? There are obviously big differences between the publishing and library worlds, but there are enough similarities in terms of metadata that can be addressed.
One of the issues publishers face is that a number of organizations create and refine metadata: publishers, distributors like Baker & Taylor, Bowker, and library sources like OCLC and, here in the U.S., the Library of Congress. But there is very little interoperability among the participants. We provide metadata to B&T. They augment our categories, but don't tell us about it. The libraries add some very rich metadata in terms of keywords and character profiles, but access is not easy or inexpensive.
OCLC piloted a program a few years ago where they took our ONIX feed, enhanced it with more of their metadata, and sent it back to us. I think they finally offered this as a commercial product. We dropped out in the early pilot stage for other reasons. It was a good idea, but very hard to implement.
[Luc] I think this is the right place, and a chance for us publishers to bring forward what we need from the OWP to enable better use of our ebook content.
1. The global ebook discovery phase is important not only for ebookstores, which must display our ebook metadata correctly before selling, but also for the reader's ebook library, which should be well categorized.
Just think about discipline in textbooks: today nobody in the B2C supply chain is able to say which discipline a school book is about, unless the publisher adds it to the title. We need this information to be a specific field in ebookstores for search and display.
We already have ONIX for that global purpose and, as members of EDItEUR, we are working on its evolution.
2. But my major concern is about content metadata. Converting our XML files to HTML5 brings us from a semantic world to a dumb world. In our publishing companies, we have been working hard for years to move content creation toward structure and meaning, which we have achieved for a large number of subjects with XML vocabularies. We therefore need this WG to help us bring that structure and meaning to the OWP so that we can propagate it inside the text of EPUB files.
Determining what the OWP recommendation should be to enable this is, IMO, a proper goal of this WG.
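To make the concern in point 2 concrete, here is a minimal sketch of one possible way to keep XML semantics during an XML-to-HTML5 conversion: each source element name is copied into the HTML `class` attribute and each source attribute into a `data-*` attribute. The element names, the `discipline` attribute, and the tag mapping are all hypothetical, and this convention is only an illustration, not a recommendation of the TF or of the OWP.

```python
import xml.etree.ElementTree as ET

# Hypothetical mapping from publisher XML element names to HTML5 elements.
# Anything not listed falls back to a generic <div>.
HTML_TAG_FOR = {"chapter": "section", "title": "h1", "para": "p", "term": "span"}


def to_html(src: ET.Element) -> ET.Element:
    """Convert an XML element and its subtree to HTML, keeping its name as a class."""
    out = ET.Element(HTML_TAG_FOR.get(src.tag, "div"))
    out.set("class", src.tag)  # preserve the original semantic element name
    for name, value in src.attrib.items():
        out.set(f"data-{name}", value)  # preserve source attributes as data-*
    out.text = src.text
    for child in src:
        converted = to_html(child)
        converted.tail = child.tail  # keep the text that follows the child
        out.append(converted)
    return out


# Hypothetical source fragment in a publisher XML vocabulary.
SAMPLE = (
    '<chapter discipline="mathematics"><title>Fractions</title>'
    "<para>A <term>numerator</term> sits on top.</para></chapter>"
)

html = ET.tostring(to_html(ET.fromstring(SAMPLE)), encoding="unicode", method="html")
print(html)
```

With this approach the discipline and the semantic role of each element survive in the HTML, but only as a private convention. Agreeing on a shared, interoperable way to express them is exactly the open question raised above.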
2. Are there deficiencies in the current OWP that need to be remedied?
If the answer to (1.) is yes, then are the problems and use cases identified by this TF due mainly to inadequate understanding and use of already existing capabilities of the OWP, or are there specific improvements to the OWP standards required to enable publishers to use metadata appropriately?
[TF: Add your comments below, prefixed by your name in brackets.]
[BK] One approach: For each issue raised below (and thus any resulting use cases) we should ask: "Can this be addressed without a change to the OWP?" If the answer is yes, that would imply that we don't need any of the existing components of the OWP to be modified; however, that does not necessarily rule out the need for a new initiative within the W3C. Case in point: when my colleagues and I create XHTML-based models for publishers (as foundational models for workflow, repository, archive, etc.) we typically need to devote a