16:53:45 RRSAgent has joined #dpub-arch
16:53:45 logging to http://www.w3.org/2016/04/07-dpub-arch-irc
16:55:18 Zakim has joined #dpub-arch
16:55:54 TimC has joined #dpub-arch
16:56:00 zakim, this will be #dpub-arch
16:56:00 ok, ayla_stein
16:56:12 Hi Tim!
16:56:43 Hi
16:56:50 Are you up on WebEx yet?
16:57:14 not yet
16:57:18 still setting up the irc
16:57:24 minutes, etc
16:57:24 okay
16:57:54 is the URL in webex correct?
16:58:14 this is what I used: https://mit.webex.com/mit/j.php?MTID=me1b7841c4e79a3f1a7ff2728300f22e3
16:58:22 ScribeNick: TimC
16:58:35 you will need the meeting number: 646 236 425
16:59:17 Meeting: DPub Archival Task Force
16:59:42 Chair: ayla_stein
17:00:42 nullhandle has joined #dpub-arch
17:01:36 Who is nullhandle?
17:02:27 Present+ Tim_Cole
17:04:30 Present+ ayla_stein
17:04:38 Present+
17:05:02 Present+ nicholas_taylor
17:05:02 https://lists.w3.org/Archives/Public/public-digipub-ig/2016Apr/
17:05:14 Minutes: https://lists.w3.org/Archives/Public/public-digipub-ig/2016Apr/
17:05:47 Agenda: https://lists.w3.org/Archives/Public/public-digipub-ig/2016Apr/
17:06:18 minutes: https://www.w3.org/2016/03/24-dpub-arch-minutes.html
17:07:22 Agenda: https://lists.w3.org/Archives/Public/public-digipub-ig/2016Apr/0020.html
17:07:51 Topic: Approve Minutes https://www.w3.org/2016/03/24-dpub-arch-minutes.html
17:08:06 ayla_stein: objections?
17:08:18 ... hearing no objections, approved.
17:08:24 Topic: Identify Archival use cases relevant to PWP and assign write-up of each to TF member(s)
17:08:48 tzviya has joined #dpub-arch
17:09:08 ayla_stein: a use case for LOCKSS
17:09:50 q+
17:10:35 ntay: format migration may not be in scope for use case
17:10:51 ayla_stein: could you expand on why not in scope?
17:10:59 ack timc
17:11:07 present+
17:11:45 TimC: It seems to me that PWP may make migration on access
17:12:09 ... as I understood, LOCKSS format migration is driven by use
17:12:35 ...
and when people use a PWP that is unpacked on the server, do they necessarily get an archivable package?
17:13:14 ntay: format migration on access may not be current
17:13:31 ... LOCKSS access system has not needed this yet.
17:13:44 ... risk model of LOCKSS is primarily concerned with the bits.
17:14:12 ... don't see PDF, GIF, ASCII as being of concern for client-side rendering obsolescence.
17:14:30 ... if GIF could no longer be rendered by mainstream Web browsers
17:15:22 ... then LOCKSS might put a mechanism in place so that when a client requests an image but doesn't accept GIF, LOCKSS could migrate to PNG
17:15:38 ... But LOCKSS doesn't see this as a use case.
17:16:17 ... my understanding is that PWP is not a file in the same sense
17:16:32 ... it's more a framework and manifest
17:18:35 ntay: Could still put together a CLOCKSS / LOCKSS
17:18:48 TimC: Difference between CLOCKSS and LOCKSS?
17:19:05 ntay: same underlying technology
17:19:54 TimC: ntay, can you walk us through the acquisition process?
17:20:17 ntay: 2 mechanisms by which we retrieve content
17:20:33 ... 1. Web harvest: collect content as it would be presented to the user
17:21:18 ... plug-ins provide extra intelligence to the crawler so it can parse various units that contain multiple publications (e.g., an issue)
17:21:36 ... helps it figure out what units it needs to make archival package(s)
17:22:04 ... decision about what to package per publisher largely being figured out by LOCKSS
17:22:37 TimC: so the manifest of PWP might simplify this.
17:22:58 ntay: yes
17:23:26 TimC: assumes that what is needed for archiving is the same as what is needed for portability
17:23:36 ntay: 2nd method is more of a back-end process
17:24:05 ... publishers are making content available on the back end for archiving services
17:24:37 ... more typically the source files, e.g., includes PDF but also XML, etc., but may not have all of the presentation (CSS) files
17:24:45 ... things are neatly organized in a tree.
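The migration-on-access idea ntay describes (a client requests an image but doesn't accept GIF, so the archive serves PNG instead) can be sketched roughly as below. This is only an illustrative sketch, not how LOCKSS is implemented: the function names `accepts` and `choose_format` are hypothetical, and the Accept-header parsing is deliberately simplified.

```python
def accepts(accept_header: str, media_type: str) -> bool:
    """Return True if a (simplified) HTTP Accept header admits media_type.

    Hypothetical helper: the most specific matching range wins, and q=0
    means "not acceptable". An empty header is treated as "accept anything".
    """
    if not accept_header.strip():
        return True
    main_type = media_type.split("/")[0]
    best = None  # (specificity, q) of the most specific matching range
    for item in accept_header.split(","):
        parts = [p.strip() for p in item.split(";")]
        pattern = parts[0].lower()
        q = 1.0
        for param in parts[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0
        if pattern == media_type:
            specificity = 2
        elif pattern == main_type + "/*":
            specificity = 1
        elif pattern == "*/*":
            specificity = 0
        else:
            continue
        if best is None or specificity > best[0]:
            best = (specificity, q)
    return best is not None and best[1] > 0


def choose_format(accept_header: str,
                  stored: str = "image/gif",
                  fallback: str = "image/png"):
    """Serve the stored format if acceptable, else the migrated fallback."""
    if accepts(accept_header, stored):
        return stored
    if accepts(accept_header, fallback):
        return fallback  # this is where an on-the-fly GIF-to-PNG migration would happen
    return None


# A client that explicitly refuses GIF gets the migrated PNG.
print(choose_format("image/png, image/gif;q=0"))
```

The stored bits are never changed in this sketch; only the representation chosen at access time differs, which matches the point that the LOCKSS risk model is primarily concerned with the bits.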
17:25:35 TimC: How would you phrase some of these as user stories...
17:25:57 tzviya: the back end approach is not necessarily relevant for PWP
17:26:43 ntay: in CLOCKSS model, all content archived is dark until there is a trigger event (publisher goes out of business, natural disaster knocks out servers, etc.)
17:27:03 tzviya has joined #dpub-arch
17:27:08 ... if we harvested, the user will see what they are used to seeing.
17:27:53 ... backend acquisition makes it quality of access experience
17:29:22 ntay: to the extent that we can make use cases generic, not necessarily tied to an archiving service
17:31:05 ntay: there may be specific examples from David's blog post that talk about problems in the absence of a manifest
17:31:21 ... so this may help us
17:32:26 TimC: use case: an archival service wants to harvest (spider) a PWP, and expects to find in the manifest what it needs to make sure it gets all the right pieces.
17:32:41 ntay: yes
17:32:54 yes!
17:33:19 ntay: another use case is versioning: if one part of a PWP gets updated, how is that update handled by the archiving service?
17:33:36 tzviya: also a revisioning use case
17:33:54 ... e.g., an update for a misspelled word in chapter 3
17:34:25 ... vs. a new version of chapter 2.
17:34:55 ayla_stein: keeping track of errata and retractions
17:35:17 tzviya: we can start these as issues in GitHub
17:35:32 ... re errata, these might be done as annotations
17:35:59 ... we do have to consider what to do with errata, revisions, versions, etc.
17:36:42 ayla_stein: removal retractions (publisher just removes the item)
17:36:55 ... what does the archive service do?
17:38:09 ntay: would be surprised if retraction resulted in deletion from an archive
17:38:57 tzviya: we give retractions their own DOI, separate from the original article's DOI
17:40:09 ... shows how people use DOIs for different purposes
17:41:20 ayla_stein: Medusa digital repository at Illinois does include digital monographs
17:41:43 ...
so what does that archive need to facilitate archiving
17:42:11 ... archivist needs more of a sense of what makes a document valid, i.e., a health check
17:42:32 ... sounds like he needs some sort of archivist validation
17:43:25 TimC: what does validation mean? how does validity change over time?
17:44:03 tzviya: there is an e-pub check system for validating
17:44:37 ayla_stein: it does sound like he wants some way to read the publication and know how to validate it
17:44:50 ... might not have to be an external tool
17:45:20 tzviya: ePub has a validator, but this has not come up yet for PWP
17:45:35 ntay: so what does ePub check do?
17:45:53 tzviya: checks HTML, structure, etc.
17:45:58 epubcheck https://github.com/IDPF/epubcheck
17:46:23 ntay: been focused on how PWP will help verify completeness
17:46:59 ... not clear whether you could easily check appearance and/or browser compatibility
17:47:51 ayla_stein: not clear that responsive design is of concern yet to the Library / Archive space
17:48:18 ntay: Responsive Design is a best practice for Web Archiving
17:48:20 https://library.stanford.edu/projects/web-archiving/archivability
17:49:22 TimC: the basic use case: an archiving service wants to validate a PWP as being adequate for archiving.
17:50:40 ... Ayla will write something up.
17:50:51 q?
17:51:39 ayla_stein: archivist will be worried about the range of content that can be included in a PWP, since these technologies change over time
17:51:48 q+
17:52:09 ... my understanding that PWP
17:52:13 ack Ti
17:52:32 TimC: if PWP is wide open about what it includes
17:52:51 ... does that mean that some PWPs may not be archivable?
17:53:01 q+
17:53:10 q+
17:53:17 ack tzv
17:53:41 tzviya: is this the same issue as comes up when we talk about Archiving the Web?
17:54:06 ayla_stein: Leonard's discussion about PDF/A experience may help also
17:54:36 ntay: my understanding of PDF/A: the ability to embed arbitrary content means you end up with binary blobs
17:54:50 ...
as archivists we deal with not having control all the time
17:55:25 ... so while some formats are easier to archive than others, it isn't that there's a non-archivable format
17:56:26 TimC: use case for making assessment of risk (from archiving perspective)
18:02:11 rrsagent, make logs member
18:02:22 rrsagent, bookmark
18:02:22 See http://www.w3.org/2016/04/07-dpub-arch-irc#T18-02-22
18:03:20 oh right
18:03:33 that command was listed next on the webpage
18:04:15 rrsagent, set logs member|visible-world
18:04:46 rrsagent, set logs member|visible-public
18:05:24 rrsagent, set logs public
18:05:37 thanks Tim!
18:06:00 rrsagent, bookmark
18:06:00 See http://www.w3.org/2016/04/07-dpub-arch-irc#T18-06-00
18:06:06 oh ok
18:06:22 rrsagent, draft minutes
18:06:22 I have made the request to generate http://www.w3.org/2016/04/07-dpub-arch-minutes.html ayla_stein
18:06:46 rrsagent, set logs public
18:06:54 rrsagent, draft minutes
18:06:54 I have made the request to generate http://www.w3.org/2016/04/07-dpub-arch-minutes.html ayla_stein
18:07:29 * excellent. okay, so should I download them and then edit?
18:07:35 * yep
18:07:38 tzviya has joined #dpub-arch
18:08:38 * okay great. but I should leave this irc chat open just in case, right?
18:10:11 * okay, will do. Thanks Tim!
18:10:25 bye!
19:32:57 zakim, bye
19:32:57 leaving. As of this point the attendees have been Tim_Cole, ayla_stein, ntay, nicholas_taylor, tzviya
19:32:57 Zakim has left #dpub-arch
21:56:01 dauwhe has joined #dpub-arch
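
The harvest use case recorded at 17:32:26 (an archival service spiders a PWP and relies on the manifest to confirm it got all the right pieces) could look something like the sketch below. The manifest shape is an assumption made purely for illustration: the log does not define a PWP manifest format, and the `resources` key and the `missing_resources` helper are hypothetical.

```python
def missing_resources(manifest: dict, harvested_urls) -> list:
    """List resources named in the manifest that the crawl did not capture.

    Assumes a hypothetical manifest of the form {"resources": [url, ...]}.
    """
    harvested = set(harvested_urls)
    listed = manifest.get("resources", [])
    return sorted(url for url in listed if url not in harvested)


# Hypothetical publication with three listed resources; the crawl missed one.
manifest = {
    "resources": [
        "https://example.org/pub/index.html",
        "https://example.org/pub/style.css",
        "https://example.org/pub/fig1.png",
    ]
}
harvested = [
    "https://example.org/pub/index.html",
    "https://example.org/pub/fig1.png",
]

print(missing_resources(manifest, harvested))
```

An empty result would suggest the harvest is complete with respect to the manifest; it says nothing about appearance or browser compatibility, which ntay notes may not be easy to check.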