16:55:38 RRSAgent has joined #dpub-arch
16:55:38 logging to http://www.w3.org/2016/02/04-dpub-arch-irc
16:55:57 Zakim has joined #dpub-arch
16:56:11 rrsagent, set log public
16:56:27 Meeting: DPUB Archival TF
16:57:01 TC2 has joined #dpub-arch
16:57:14 tzviya has joined #dpub-arch
16:59:28 present +Heather_Flanagan
17:00:09 present+ Tim_Cole
17:01:09 present+ Tzviya
17:01:39 dkaplan3 has joined #dpub-arch
17:01:42 dauwhe has joined #dpub-arch
17:01:53 present+ Deborah_Kaplan
17:02:01 mgylling has joined #dpub-arch
17:02:34 scribenick: HeatherF
17:03:26 HeatherF has joined #dpub-arch
17:03:39 scribenick: HeatherF
17:03:44 Bill_Kasdorf has joined #dpub-arch
17:03:53 present+ Bill_Kasdorf
17:04:03 present+
17:04:06 Present+ Markus
17:04:20 Wiki page: https://www.w3.org/dpub/IG/wiki/Task_Forces/archival
17:04:32 agenda: https://lists.w3.org/Archives/Public/public-digipub-ig/2016Feb/0013.html
17:05:19 Leonard's email: https://lists.w3.org/Archives/Public/public-digipub-ig/2016Feb/0021.html
17:05:45 TimCole: Reviewing task force goals (see wiki page for initial draft)
17:06:15 ...: any changes, either in structure or in content?
17:07:37 q+
17:07:49 ack me
17:07:55 ...: Should we keep the potential for expanded scope of material going beyond the PWP?
17:08:05 q+
17:08:34 mgylling: the problem statement/goal is spot on; any output from this TF should feed the use cases more than the PWP directly.
17:08:40 q+
17:08:54 ...: so, produce use cases and functional requirements
17:09:30 dkaplan: agree. Would add that in the long run, the product of this TF would be an archival profile for the PWP.
17:09:39 q+
17:09:47 ...: right now, however, we're creating functional requirements and use cases.
17:09:56 ack dkaplan
17:10:13 ack tzviya
17:10:20 tzviya: +1. If other cases come up that fall outside of this remit, we can always record them on the wiki to save for later.
17:10:50 ack Bill
17:11:05 Bill_Kasdorf: as a point of clarification, it seems clear that, with the existing goals statement, we are focusing on formal archives.
17:11:30 q+
17:11:30 ...: we are not talking about a publisher who wants to archive a version of a publication for future use. Is that correct? Should we explicitly state this?
17:11:58 ack dka
17:12:32 dkaplan: rather than saying that's out of scope, we should consider it a subset. This is about preservation, not just archiving.
17:13:59 ...: we are talking about the formal archivist definition of preservation. We are talking about long-term, persistent ability to access content.
17:14:00 +1
17:14:31 TimCole: Suggests that we need to have some mods to the goals, including that we are going to create use cases, and that we need to confine scope to formal archiving.
17:14:48 ...: hopefully we won't have to define "formal archiving" from scratch; want to use someone else's.
17:14:57 http://www2.archivists.org/glossary/terms/p/preservation
17:15:22 ACTION: dkaplan3 to pull together the formal definition of archiving/preservation
17:15:44 ACTION: TimCole to add the creation of use cases to the goals on the wiki
17:16:36 TimCole: Next topic - experts we should consult. We have some pointers to documents on the wiki; do we need specific people brought in as well?
17:16:57 q
17:17:05 q+
17:17:07 q+
17:18:04 tzviya: Our goal is to define use cases. To get a broader set of use cases, it would be useful to either interview or invite others from that community.
17:18:30 ...: Deborah has formal archival training. (So does Heather)
17:18:54 :-)
17:19:51 TimCole: have been in touch with people at Portico for ideas about their workflows; they ingest data and normalize it on a regular basis. Want to know how the format of what they get impacts their workflow.
17:20:41 ...: To avoid duplication, suggests that we keep track on the wiki re: who we are reaching out to.
17:20:56 ack Bill
17:22:20 Bill_Kasdorf: Portico and LOCKSS/CLOCKSS are interesting contrasting organizations in this space.
17:23:04 ack dka
17:23:05 ...: Portico normalizes the content, whereas LOCKSS/CLOCKSS harvests documents and so has a lot of web documents.
17:24:17 dkaplan3: Outreach - yes, we should do that, and not just to organizations, and we should keep a list. Another problem to be aware of: this TF is currently mostly US participants.
17:24:48 ...: if we can get someone not anglophone, at least as consulting expert, that would be helpful.
17:25:37 TimCole: What about resources or documents? Anything to add there? Please add if you think of anything.
17:25:44 Also British Library, KB (Nat. Lib. of Netherlands), Bibliotheque Francaise
17:26:56 Important issue with BL, KB, etc. is that they are mandated to archive content, "legal depository"
17:27:12 TimCole: Regarding logistics, this TF has enough work to keep us busy for a while. Should we get on a regular call schedule? What would be a good timeline for this work?
17:27:23 +1
17:27:26 +1
17:27:31 +1
17:28:15 We will aim for twice a month, though perhaps not at this time.
17:30:53 TimCole: Will search in the range of 10am-12pm Eastern, M-Th. This will narrow down the doodle poll.
17:31:37 ...: Emails should go to the main dpub list, but email authors should remember to put in [dpub-arch] in the subject for easier sorting.
17:31:46 +1
17:32:14 q+
17:32:18 ...: What's our timeline? How long should this TF expect to run?
17:32:25 q+ to discuss goals and timeline
17:32:39 ack dk
17:32:42 ack dka
17:33:24 dkaplan3: As long as we keep the scope narrow, we start with what is not already defined (don't reinvent definitions where we don't have to)
17:33:32 ack tzv
17:33:32 tzviya, you wanted to discuss goals and timeline
17:34:00 q+
17:34:36 ack mgyl
17:34:46 tzviya: Let's not let this be something that just happens at the meetings; do work between meetings.
We can target writing use cases and seeing how much we can do in three months.
17:35:26 mgylling: +1 to tzviya. In terms of timeline, we haven't set a final delivery date to the larger use case effort, but we will soon. Having a note by TPAC this year (end of September) would be a reasonable target.
17:35:41 and music ensues
17:36:15 mgylling: NISO also has work going on in this space; make sure we don't duplicate effort.
17:36:26 no more classical music. sadness.
17:36:51 ACTION: TimCole to reach out to Todd Carpenter at NISO re their work in this space
17:37:49 TimCole: so, three to four month slot. Target end of May.
17:38:08 ...: Is there a deadline on the PWP?
17:38:30 tzviya: The IG chairs need to talk about that.
17:38:49 mgylling: if this group comes up with a new paragraph, that will be enough to refresh the PWP regardless of its state.
17:39:22 ...: it is a lightweight process to update that when needed.
17:40:09 q+
17:40:31 TimCole: does anyone have comments on Leonard's presentation re: PDF/A? Might schedule time on a future call for Leonard to talk about this directly.
17:40:38 q+
17:40:49 q?
17:40:54 ack tzv
17:40:55 ...: PDF/A is a recognized standard, but probably not sensible to turn everything into PDF/A
17:41:36 tzviya: An interesting presentation, but don't put the cart before the horse. We are not recommending one particular solution here.
17:42:17 dkaplan: There are very good reasons that PDF/A is not the appropriate recommendation. We are (probably) not headed towards ISO standardization. In generating the PDF/A standard, many contacts were made and use cases developed.
17:42:30 ...: to the extent that the archival community participated, we should find that input and use it
17:42:38 q?
17:42:43 ack dka
17:43:46 TimCole: Do people want to start commenting on what use cases?
What libraries have done historically is collect content from publishers at time of publication, so there are use cases of library services telling publishers what they need
17:43:59 q+
17:44:09 ...: but often libraries are coming to content well after publication. That's another category of use cases.
17:44:15 ack dka
17:45:18 dkaplan: Archivists can ingest just about anything. Anything you come to after-the-fact, anything that hasn't been made as an archival document to begin with, is just like anything else (games, etc) that they might have to archive.
17:46:03 TimCole: so should we make clear some of the potential trade-offs about what happens if you don't consider archival requirements up front?
17:46:16 q+
17:46:31 ...: Print materials were simpler. Digital material introduces problems of versioning.
17:46:51 ack Bill
17:47:49 q+
17:47:53 Bill_Kasdorf: Would like to see a basic definition that a PWP is natively amenable to archiving, similar to how EPUB is natively amenable to accessibility
17:47:57 ack dka
17:48:27 dkaplan: Agree with limitations. Accessibility should be the default, and the same thing with preservation. This is, however, a huge limitation.
17:49:00 ...: A preservable document can't be preservable unless it is entirely offline with all its essential elements.
17:49:23 Bill_Kasdorf: That is a fundamental principle of PWP.
17:50:08 TimCole: Even when you can take everything offline, if you try and open it in 5 years, it likely won't look the same as it did at time of publication.
17:50:57 Bill_Kasdorf: What is it that's being preserved? Is it the appearance or the essential content?
17:51:18 dkaplan: That is a question that even in the preservation community must be decided on a document-by-document basis.
17:53:05 q?
17:53:34 rrsagent, make logs public
17:53:50 rrsagent, make minutes
17:53:50 I have made the request to generate http://www.w3.org/2016/02/04-dpub-arch-minutes.html tzviya
17:54:00 rrsagent, bye
17:54:00 I see 3 open action items saved in http://www.w3.org/2016/02/04-dpub-arch-actions.rdf :
17:54:00 ACTION: dkaplan3 to pull together the formal definition of archiving/preservation [1]
17:54:00 recorded in http://www.w3.org/2016/02/04-dpub-arch-irc#T17-15-22
17:54:00 ACTION: TimCole to add the creation of use cases to the goals on the wiki [2]
17:54:00 recorded in http://www.w3.org/2016/02/04-dpub-arch-irc#T17-15-44
17:54:00 ACTION: TimCole to reach out to Todd Carpenter at NISO re their work in this space [3]
17:54:00 recorded in http://www.w3.org/2016/02/04-dpub-arch-irc#T17-36-51
17:54:05 zakim, bye
17:54:05 leaving. As of this point the attendees have been Tim_Cole, Tzviya, Deborah_Kaplan, Bill_Kasdorf, dauwhe, Markus
17:54:05 Zakim has left #dpub-arch