Task Forces/Metadata/Kevin Hawkins Interview

From Digital Publishing Interest Group

Kevin Hawkins, Univ. Michigan Libraries and University Press

Until a move to the University of North Texas just after our conversation, Kevin had a very interesting and relevant dual role at the University of Michigan. He has for many years been a key person in the U-M Library’s innovative and extensive Scholarly Publishing Office (now absorbed into Michigan Publishing), one of the pioneers in the trend of academic libraries moving into publishing (online journals, print-on-demand books, the Text Creation Partnership, and more). Kevin is also a true XML expert: he is my go-to guy on anything involving the Text Encoding Initiative (TEI), the XML model dominant in libraries, archives, and humanities scholarship. A couple of years ago, the U-M Library took over responsibility for the University of Michigan Press, and Kevin took major responsibility for production for both print and digital university press monographs.

In an interesting confirmation of one of my other calls (also to a scholarly publishing luminary), his first comment was that he was not really aware of any fundamental critical problems regarding metadata. Unlike almost all other segments of publishing, scholarly publishing seems to consider metadata mainly a solved problem (and it largely is).

The main problem he pointed to was the inconsistent adoption of metadata schemes throughout the publishing supply chain and the consequent work involved in customizing metadata feeds for each vendor. The Press pays Firebrand (the company run by Fran Toolan, one of my other interviewees) to disseminate their metadata, customizing the ONIX feed for various vendors.

The U-M Library is a member of CrossRef and deposits DOIs for much of its online content. This requires mapping the metadata they have to the metadata CrossRef requires. While this is annoying, it is not a huge problem because the CrossRef metadata requirements are not extensive. A bigger issue, in Kevin’s view, involves digital workflows. Their publishing platform and its workflows for collections of content were designed for digitized library collections, with infrequent additions to a collection or updates to the content. Revisions are cumulative, with new and revised content not distinguished. So when a new issue of a journal is published, they send metadata for all of that journal’s issues to CrossRef, relying on CrossRef to screen duplicate records. This of course is an internal U-M issue, but it highlights the fact that metadata is not just about marketing for a publisher like this; it is central to how they manage their workflow.
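To make the workflow issue concrete, here is a minimal Python sketch of the alternative to a cumulative deposit: filtering a journal's full record list down to the records that have not been sent before. It is an illustration only, not U-M's platform or CrossRef's API. The function name `select_new_records`, the record layout, and the example DOIs are all hypothetical.

```python
def select_new_records(all_records, deposited_dois):
    """Return only the records whose DOI has not been deposited yet.

    all_records: list of dicts with at least a "doi" key (hypothetical layout).
    deposited_dois: iterable of DOI strings already sent in earlier deposits.
    """
    # DOIs are case-insensitive, so compare them in lower case.
    seen = {doi.lower() for doi in deposited_dois}
    new_records = []
    for record in all_records:
        doi = record["doi"].lower()
        if doi not in seen:
            seen.add(doi)  # also guards against duplicates within this batch
            new_records.append(record)
    return new_records


# Hypothetical journal with one previously deposited article and one new one.
journal = [
    {"doi": "10.5555/example.1", "title": "Article from an earlier issue"},
    {"doi": "10.5555/example.2", "title": "Article from the new issue"},
]
to_send = select_new_records(journal, ["10.5555/EXAMPLE.1"])
print([r["doi"] for r in to_send])
```

A platform that does not track which content is new or revised cannot apply a filter like this, which is why the full set is resent each time.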

He observed that the Press has not in fact invested much to enhance metadata for discoverability (keywords in HTML, ONIX, BISAC, etc.). This needs to be more of a priority. They have done some SEO work to make their online-only content more discoverable in Google search results. But he pointed out that they don’t have “real keywords in microdata,” and they “probably should be doing that.”
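For readers unfamiliar with the term, keywords in microdata could look roughly like the output of the following Python sketch. It uses the schema.org `Book` type and its `name` and `keywords` properties with the standard HTML microdata attributes (`itemscope`, `itemtype`, `itemprop`). The function name `book_microdata` and the sample title and keywords are made up for illustration and do not describe anything the Press actually publishes or runs.

```python
from html import escape


def book_microdata(title, keywords):
    """Build an HTML fragment that exposes a title and keywords as microdata."""
    keyword_text = ", ".join(keywords)
    return (
        '<div itemscope itemtype="https://schema.org/Book">\n'
        f'  <h1 itemprop="name">{escape(title)}</h1>\n'
        f'  <meta itemprop="keywords" content="{escape(keyword_text)}">\n'
        '</div>'
    )


# Hypothetical monograph, for illustration only.
fragment = book_microdata(
    "Libraries & Publishing",
    ["scholarly publishing", "metadata", "TEI"],
)
print(fragment)
```

The title and keywords are escaped before being placed in the markup, so characters such as `&` or quotation marks do not break the HTML.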

Another interesting twist from this conversation: promoting discoverability through metadata access actually has no financial return for online-only content, so it gets put off in favor of work for which there is a clearer financial implication. (!!)

And finally another important observation that is true of virtually all book publishers but hardly anybody ever brings up: for most books (most of which are not online), there is no HTML to put microdata IN!

Most of our conversation focused on the Library’s publishing activities, but he did have a few additional comments from the perspective of a library acquiring content and making it available to users:

  • Because of advances in search and discovery capabilities in library catalogs, vendor databases, and discovery systems, libraries are actually making LESS investment in detailed cataloguing than they used to.
  • Institutional repositories are very widely and heavily used in academia, and most allow self-deposit of content by authors. The author is asked to supply some metadata, and some institutions have a review/validation process. In his opinion, the thorough crawling of IRs by Google Scholar and other search engines “argues against laborious metadata creation.”