See also: IRC log
<scribe> scribenick: HeatherF
<TimCole> Wiki page: https://www.w3.org/dpub/IG/wiki/Task_Forces/archival
<TimCole> Leonard's email: https://lists.w3.org/Archives/Public/public-digipub-ig/2016Feb/0021.html
TimCole: Reviewing task force goals (see
wiki page for initial draft)
...: any changes, either in structure or in content?
... Should we keep the potential for expanded scope of material going
beyond the PWP?
mgylling: the problem statement/goal is
spot on; any output from this TF should feed the use cases more than the
PWP directly.
...: so, produce use cases and functional requirements
dkaplan: agree. Would add that in the long
run, the product of this TF would be an archival profile for the PWP.
...: right now, however, we're creating functional requirements and use
cases.
tzviya: +1. If other cases come up that fall outside of this remit, we can always record them on the wiki to save for later.
Bill_Kasdorf: as a point of clarification,
it seems clear that, with the existing goals statement, we are
focusing on formal archives.
...: we are not talking about a publisher who wants to archive a version
of a publication for future use. Is that correct? Should we explicitly
state this?
dkaplan: rather than saying that's out of
scope, we should consider it a subset. This is about preservation, not
just archiving.
...: we are talking about the formal archivist definition of
preservation. We are talking about long-term, persistent ability to
access content.
<mgylling> +1
TimCole: Suggests that we need to have some
mods to the goals, including that we are going to create use cases, and
that we need to confine scope to formal archiving.
...: hopefully we won't have to define "formal archiving" from scratch;
want to use someone else's.
<dkaplan3> http://www2.archivists.org/glossary/terms/p/preservation
<scribe> ACTION: dkaplan3 to pull together the formal definition of archiving/preservation [recorded in http://www.w3.org/2016/02/04-dpub-arch-minutes.html#action01]
<scribe> ACTION: TimCole to add the creation of use cases to the goals on the wiki [recorded in http://www.w3.org/2016/02/04-dpub-arch-minutes.html#action02]
TimCole: Next topic - experts we should consult. We have some pointers to documents on the wiki; do we need specific people brought in as well?
tzviya: Our goal is to define use cases. To
get a broader set of use cases, it would be useful to either interview
or invite others from that community.
...: Deborah has formal archival training. (So does Heather)
:-)
TimCole: have been in touch with people at
Portico for ideas about their workflows; they ingest data and normalize
it on a regular basis. Want to know how the format of what they get
impacts their workflow.
...: To avoid duplication, suggests that we keep track on the wiki re:
who we are reaching out to.
Bill_Kasdorf: Portico and LOCKSS/CLOCKSS are
interesting contrasting organizations in this space.
...: Portico normalizes the content, whereas LOCKSS/CLOCKSS harvests
documents and so has a lot of web documents.
dkaplan3: Outreach - yes, we should do
that, and not just to organizations, and we should keep a list. Another
problem to be aware of: this TF is currently mostly US participants.
...: if we can get someone who is not anglophone, at least as a consulting
expert, that would be helpful.
TimCole: What about resources or documents? Anything to add there? Please add if you think of anything.
<Bill_Kasdorf> Also British Library, KB (Nat. Lib. of Netherlands), BNF (Bibliothèque Nationale de France)
<Bill_Kasdorf> Important issue with BL, KB, etc. is that they are mandated to archive content, "legal depository"
TimCole: Regarding logistics, this TF has enough work to keep us busy for a while. Should we get on a regular call schedule? What would be a good timeline for this work?
+1
<mgylling> +1
<tzviya> +1
We will aim for twice a month, though perhaps not at this time.
TimCole: Will search in the range of
10am-12pm Eastern, M-Th. This will narrow down the doodle poll.
...: Emails should go to the main dpub list, but email authors should
remember to put in [dpub-arch] in the subject for easier sorting.
<dkaplan3> +1
...: What's our timeline? How long should this TF expect to run?
dkaplan3: As long as we keep the scope narrow, we start with what is not already defined (don't reinvent definitions where we don't have to)
<Zakim> tzviya, you wanted to discuss goals and timeline
tzviya: Let's not let this be something that just happens at the meetings; do work between meetings. We can target writing use cases and seeing how much we can do in three months.
mgylling: +1 to tzviya. In terms of timeline, we haven't set a final delivery date to the larger use case effort, but we will soon. Having a note by TPAC this year (end of September) would be a reasonable target.
mgylling: NISO also has work going on in this space; make sure we don't duplicate effort.
<scribe> ACTION: TimCole to reach out to Todd Carpenter at NISO re their work in this space [recorded in http://www.w3.org/2016/02/04-dpub-arch-minutes.html#action03]
TimCole: so, three to four month slot.
Target end of May.
...: Is there a deadline on the PWP?
tzviya: The IG chairs need to talk about that.
mgylling: if this group comes up with a new
paragraph, that will be enough to refresh the PWP regardless of its
state.
...: it is a lightweight process to update that when needed.
TimCole: does anyone have comments on
Leonard's presentation re: PDF/A? Might schedule time on a future call
for Leonard to talk about this directly.
...: PDF/A is a recognized standard, but probably not sensible to turn
everything into PDF/A
tzviya: An interesting presentation, but don't put the cart before the horse. We are not recommending one particular solution here.
dkaplan: There are very good reasons that
PDF/A is not the appropriate recommendation. We are (probably) not
headed towards ISO standardization. In generating the PDF/A standard,
many contacts were made and use cases developed.
...: to the extent that the archival community participated, we should
find that input and use it
TimCole: Do people want to start commenting
on what the use cases are? What libraries have done historically is collect
content from publishers at time of publication, so there are use cases
of library services telling publishers what they need
...: but often libraries are coming to content well after publication.
That's another category of use cases.
dkaplan: Archivists can ingest just about anything. Anything you come to after the fact, anything that hasn't been made as an archival document to begin with, is just like anything else (games, etc.) that they might have to archive.
TimCole: so should we make clear some of
the potential trade-offs about what happens if you don't consider
archival requirements up front?
...: Print materials were simpler. Digital material introduces problems
of versioning.
Bill_Kasdorf: Would like to see a basic definition that a PWP is natively amenable to archiving, similar to how EPUB is natively amenable to accessibility
dkaplan: Agree with limitations.
Accessibility should be the default, and the same thing with
preservation. This is, however, a huge limitation.
...: A document can't be preservable unless it is entirely
offline with all its essential elements.
Bill_Kasdorf: That is a fundamental principle of PWP.
TimCole: Even when you can take everything offline, if you try to open it in 5 years, it likely won't look the same as it did at time of publication.
Bill_Kasdorf: What is it that's being preserved? Is it the appearance or the essential content?
dkaplan: That is a question that even in the preservation community must be decided on a document-by-document basis.