W3C

Digital Publishing Interest Group Teleconference

27 Apr 2015

Agenda

See also: IRC log

Attendees

Present
Charles LaPierre (clapierre), Tzviya Siegman (Tzviya), Rob Sanderson (azaroth), Ivan Herman (Ivan), Phil Madans (philm), Markus Gylling (Markus), Bill Kasdorf (Bill_Kasdorf), Laura Dawson (LDawson), Dave Cramer (dauwhe), Deborah Kaplan (deborahGU), Laura Fowler (lfowler), Patrick Keating (pkeating), Mike Miller (MikeMiller), Vladimir Levantovsky (Vlad), Alan Stearns (astearns), Nick Ruffilo (NickRuffilo), Karen Myers (Karen_Myers), Tim Cole (TimCole), Peter Krautzberger (pkra), Paul Belfanti (pbelfanti), Ben De Meester (bjdmeest), Bert Bos (Bert), Jeff Xu (zhengxu)
Regrets
Brady Duga, Ayla Stein, David Stroup, Julie Morris, Heather Flanagan
Chair
Tzviya Siegman
Scribe
Dave Cramer

<trackbot> Date: 27 April 2015

<tzviya> agenda https://lists.w3.org/Archives/Public/public-digipub-ig/2015Apr/0098.html

<tzviya> http://www.w3.org/2015/04/20-dpub-minutes.html

tzviya: let's look at last week's minutes
... motion to accept?
... minutes approved.
... today's agenda

<tzviya> https://lists.w3.org/Archives/Public/public-digipub-ig/2015Apr/0098.html

STEM Survey

pkra: first look at survey
... 34 responses
... sent to 93 people
... ok result, but not too exciting
... people don't feel qualified to answer
... good coverage on all questions
... 37 questions total
... there's an unsurprising bias towards math
... partly due to me leading the survey
... lots of people talked about mathml
... it wasn't a random sample
... run through a few of the questions
... first section was about background
... bias towards CS and math
... 2nd was professional background
... didn't have a lot of aggregators
... most people were researchers
... researchers were primary audience
... we forgot to ask about students as audience
... which platforms people serve
... a few comments pointed to "the web is our platform"
... most people were focusing on desktop
... on the low end was print and ebooks
... next section was about content
... subject domains
... the question about reusing content was unclear
... "were people actively reusing content"
... prev. q was about making content reusable
... so people just repeated answer
... there was q about standardization
... there will be fun quotes from that

<pkra> https://www.w3.org/2002/09/wbs/64149/DPUB-STEM-2014-12/results

pkra: next section about authoring
... there were a couple of problems
... first question didn't get responses we wanted
... should have asked explicitly about STEM fragments
... want to know why people lose information when converting
... but questions were too vague
... people talked about transformation of text formats
... surprised to hear people do version control; some did mention things like git
... "we see more and more JSON in scholarly publishing"
... that was unexpected answer to STEM fragment storage question
... and very positive

ivan: are there some formats other than MathML that are widely used?

pkra: one question asked that
... not at this point
... alas, I don't have all the data in my head :)

tzviya: question 21, maybe?

pkra: yes, 21
... CML
... there isn't that much
... someone from Wolfram had CDF
... not as much as I'd hoped
... in the authoring section
... Q15, how do people provide access to content fragments
... that was a good result
... people are used to using XML
... next section on delivery
... how is content delivered
... HTML was ahead of PDF
... 32% to 26%
... most produce both
... there was an open question about desired methods
... with ten longer comments
... Q about migrating away from PDF
... people said Yes
... not enthusiastic about PDF but said they have to
... Q about exposing data on web
... answers were all over the place
... Q about embedding scientific data via attributes, microdata, etc.
... quite a bit of stuff there

tzviya: more summary would be good

pkra: we didn't mention stem fragments again when asking about bridging reading and authoring
... lots of fun quotes about reading
... the workflow section was tricky to write
... similarly solid in results, no surprises but good data
... PDF comes up because we have to
... a11y section I haven't looked at much
... that was a challenging section
... more answers about people not having expertise or answers

tzviya: I have a few questions
... first
... everyone should read through this even if STEM isn't your bag
... thanks PKRA!
... so what do we do next? There's a lot of information here, and it's all over the place
... might be helpful to focus on a few points and go from there

pkra: one challenge is to extract data in efficient way to spreadsheet
... yes, it's a lot, and it's not clear what to focus on
... a meeting of task force will help
... to get an in-depth summary
... original idea was to publish a note by the IG
... but the input in some sections is not viable
... for example, a11y is not balanced enough to provide good feedback

tzviya: if we try to write a note summarizing everything, it would be huge

pkra: would be good to make data available in anonymized fashion

ivan: I've already produced a spreadsheet
... so I can just remove a column

pkra: yes, but maybe that's a different conversation
... I had trouble with the spreadsheet
... Ivan, is it legally possible to publish data?

ivan: if it's anonymous, then it's not a problem

<astearns> not just names, but any identifying details in free-form responses

pkra: then the note can be much more focused
... it's all anecdotal

Bill_Kasdorf: while the anon. data are useful, what's most important are the messages
... sometimes they're clear and sometimes they're contradictory
... don't need to cover all issues equally
... what did we hear that was notable?

pkra: I agree

tzviya: OK. Any other comments?

ivan: You also need people around you?

pkra: I will rely on the task force, but more people will be great.

ivan: if anyone in the group has some experience in managing survey results, volunteer!

pkra: I'm also not experienced

NickRuffilo: I've done this
... if you have any questions about creating non-leading questions or analyzing results, please let me know
... I'll pull up articles I've written about this

tzviya: NickRuffilo is our new favorite

Bill_Kasdorf: too much reliance on statistical analysis would be suspect due to the small and biased sample

Bill_Kasdorf: more an editorial task

<NickRuffilo> Excel TIPS: http://publishingperspectives.com/2013/01/tips-for-technologists-7-excel-with-excel/

Karen: do I have confidence this is not public, just members-only? I don't think raw results should be public

ivan: we can check

pkra: it's always asked me to log in

ivan: it's not public

Karen: OK. If there's a quote, it will not be attributed?
... more a qualitative report

ivan: it should never be quoted or attributed

pkra: most people would be OK with that

tzviya: we've never asked permission to quote people
... we have our existing STEM task force, and Nick has been drafted.

<pkra> or otherwise shanghai you ;-)

Karen: we have a replacement AC rep for Copyright Clearance Center, who may be interested in STEM

tzviya: thanks Karen.
... moving on...

Fragment ID-s

tzviya: a few weeks ago there was a discussion about fragment identifiers

https://lists.w3.org/Archives/Public/public-annotation/2015Apr/0051.html

tzviya: which ended up on the annotations list
... Ivan will lead us through this

tzviya: lots of discussion on what makes a legal identifier

ivan: is rob around?
... we discussed three weeks ago that the model for selectors in the Open Annotation document
... is in fact a very rich and powerful collection of terms

<tzviya> ivan: perhaps starting with position of the selection model in open annotation is a good

<tzviya> ...not bound to one media type

<tzviya> ...the model is an abstract model, not described in terms of URI

<tzviya> ...perhaps it's possible to turn it into fragment ID

<tzviya> ...then there was further discussion of when a frag ID is legal, etc

<tzviya> Rob's position: https://lists.w3.org/Archives/Public/public-annotation/2015Apr/0054.html

ivan: the fundamental issue is the following
... a fragment identifier is bound to a specific media type
... you must register for each and every media type
... so you can't just declare the OA model for the entire world
... if we go down that route and use the selector model
... the correct way is we define them as part of a URI
... and then we register them for some of the media types for which they are useful
... html, svg, etc.
... we can do that
... then it can be combined with other mechanisms as it's done with web packaging
... it's not clear to me who should work on this
... I have the impression that there's some sort of agreement that if we restrict by media types we can do this
... azaroth, is this a fair summary?

azaroth: yes, that's it

Bill_Kasdorf: keeping in mind the mission of the various groups
... DPUB is expressing needs, not writing standards
... I have a strong interest in what Anno WG comes up with
... as far as DPUB, our job is to surface the issue and work with the appropriate WG
... the OA model you recommended came out of a CG
... I'm not trying to wash my hands of it

ivan: I understand and agree
... two comments
... we already do this approach with structural semantics where we're involved with PF
... the other thing is that we need to recharter this group
... maybe we can then go beyond what we have here
... we'll see where it goes after September

<TimCole> http://www.w3.org/2015/04/22-annotation-minutes.html

TimCole: it was discussed at Anno F2F a bit
... if this is important to DPUB
... you need to push Anno WG
... talked about in context of rangefinder
... but it's not high on our priority list
... might be good to bring this up as collaborative
... the anno model allows lots of selectors
... not all will work as fragment identifiers
... also, EPUB already has something that kind of works

ivan: I agree

tzviya: we are all talking about EPUB CFI as if it solves the issues, but few people use it
... there are too many options
... it's still a multiple choice question
... I can use XPointer, I can use CFI, but what happens with packaging and building systems and epubweb
... those URLs look like multi-part MIME
... so maybe we should get used to those semicolons in URIs

ivan: for me, the packaged uris don't look that funny
... if you combine with packaging, what is done in the packaging spec is what should be done

<clapierre1> looks like Readium supports the EPUB CFI https://github.com/readium/readium-cfi-js

ivan: we are getting into technical discussions
... good topic for f2f
... we should have clear and clean idea of pros and cons of CFI
... it's in the same space as the selectors
... CFI provides you with a fragment ID
... if CFI were completely useful, then we'd take the package spec approach combined with CFI and we'd be done
... so we need to have a clear idea in NY whether CFI works, or it does not work
... if it does not work, we need to look at alternatives

Bill_Kasdorf: don't want to conflict with anno

HTML5 + Footnote

tzviya: we followed up offline with berjon and Michael Smith
... HTML will not have a formal proposal for an element, so we should pursue role with aria
... in the past, some roles have been promoted to element in HTML
... so we're moving forward with ARIA role
... maybe HTML will take it up in the future

TimCole: since most annos are third party
... there might be use cases where people mine footnotes as they mine annotations
... we should see how footnotes might be transformed into annotations
... we should keep that in mind

tzviya: we agree
... footnotes are somewhere in the middle of content and annotations

<azaroth> +1 :)

tzviya: let us know if you're coming to the f2f

<tzviya> http://www.w3.org/dpub/IG/wiki/May_2015_F2F_Logistics_and_Details

tzviya: see you next week

<ivan> trackbot, end telcon

Summary of Action Items

[End of minutes]

Minutes formatted by David Booth's scribe.perl version 1.140 (CVS log)
$Date: 2015/04/28 04:59:43 $