Publishing Working Group Audio Task Force

28 Aug 2018


Present: zheng_xu, gpellegrino, dauwhe, George, Avneesh, dkaplan3, romain, duga, Hadrien, marisa, DanielWeck



  1. #308, Alternate formats or bitrate for an audiobook
  2. #307, Duration and Size
  3. #229, contributors
  4. #320, supplemental content

wendyreid: let's get started
... any comments on the minutes?

<zheng_xu> https://www.w3.org/2018/06/21-pwg-audio-minutes.html

wendyreid: minutes approved
... thanks everyone for coming, sorry about the delay between meetings
... the stuff Hadrien posted is a good start
... today I want to review github issues tagged "audio"
... first issue #308

#308, Alternate formats or bitrate for an audiobook

<wendyreid> https://github.com/w3c/wpub/issues/308

Hadrien: first, is this in scope?
... on the web, when we include an audio or video source it's through the audio or video element, which includes fallbacks
... with what we've done so far, that won't work
... second, looking at the existing audio formats, it's common to be able to switch resolutions/bitrates
... so that might be useful

Avneesh: what is the status of media in WP in general?

Hadrien: I think this is specific to task force
... in the general group we're talking about html resources
... in theory it shouldn't be specific to us, but in practice it is

Avneesh: from my experience with EPUB3, restricting to an audio format may be limiting
... we've had long discussions about core media types in EPUB 3.2
... we could provide recommended formats

Hadrien: this is separate but related, do we have recommended media types
... and a way of identifying different formats as representing the same content

wendyreid: can you open an issue

Avneesh: I can do that

George: looking forward to EPUB 4, the bitrate will be really important
... MP3 at 32 kbps is the lowest you can go
... but higher bitrates are better for audio books

duga: cycling back to whether this is in scope
... audio formats on the web are a mess
... but if we start speccing how to choose between bitrates we will add to the mess
... I don't think we should spend lots of time on it
... there's stuff like dash
... I don't think we can solve the problem in this group

Hadrien: this is a potential answer
... there are formats with multiple bit-rates
... then we couldn't use formats that don't have that native feature

wendyreid: the main Q is, is this in scope for us to talk about media-types?
... I think we could make recommendations about media types
... spotify/audible all have this high quality/low quality download, they make you decide
... user agents/apps will decide that for us

Hadrien: I don't agree; they've solved that in a closed ecosystem
... the fact that they have it proves it's a common requirement

wendyreid: we probably shouldn't require that you have multiple bitrates, but we should make it possible
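
One way to picture "possible but not required" is a resource that carries an optional list of alternate renditions of the same content, from which a user agent picks one. The sketch below is only an illustration under that assumption: the `Rendition` and `AudioResource` shapes, the field names, and the selection rule are hypothetical and are not defined in any WP draft.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set


@dataclass
class Rendition:
    url: str
    media_type: str      # e.g. "audio/mpeg"
    bitrate_kbps: int


@dataclass
class AudioResource:
    # Hypothetical shape: one piece of content, with optional alternates
    # that represent the same audio in another format or bitrate.
    primary: Rendition
    alternates: List[Rendition] = field(default_factory=list)


def pick_rendition(resource: AudioResource,
                   supported_types: Set[str],
                   max_kbps: Optional[int] = None) -> Optional[Rendition]:
    """Pick the best playable rendition, or None if no format is supported."""
    candidates = [r for r in [resource.primary, *resource.alternates]
                  if r.media_type in supported_types]
    if not candidates:
        return None
    if max_kbps is not None:
        within = [r for r in candidates if r.bitrate_kbps <= max_kbps]
        if not within:
            # Nothing fits the cap: fall back to the smallest rendition.
            return min(candidates, key=lambda r: r.bitrate_kbps)
        candidates = within
    # Otherwise prefer the highest bitrate that is allowed.
    return max(candidates, key=lambda r: r.bitrate_kbps)
```

A publication with a single rendition per resource simply leaves `alternates` empty, so nothing here forces an author to supply multiple bitrates.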

Avneesh: one thing to add
... we should think about compulsory metadata, which includes bitrate

wendyreid: for sure
... that leads into the discussion of the next issue, duration and size

#307, Duration and Size

Hadrien: it's common in UIs for audio books to present this information
... both for whole book and current track

<wendyreid> https://github.com/w3c/wpub/issues/307

Hadrien: we can't express it in plain WP
... it's not in the draft
... there is stuff in schema.org
... should this stuff be in a track or at the publication level?

wendyreid: good question
... we have total duration and individual file lengths
... and the math sometimes doesn't work out

gpellegrino: afaik, in audio files you have duration metadata, such as in MP3
... I don't know if we need to duplicate it outside the MP3 files

Hadrien: but you would need to download the tracks to read that info

gpellegrino: if we allow the author to write the metadata, it can introduce errors
... can be misleading to the user

Hadrien: you can treat it as a hint
... inconsistency wouldn't break the experience
... distributor could verify such information
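
The verification a distributor might run can be sketched as a comparison between the declared total and the sum of the per-resource durations. This is a minimal sketch, assuming durations are already available in seconds; the function name and the one-second tolerance are illustrative choices, not anything agreed in the meeting.

```python
from typing import Iterable, Tuple


def check_total_duration(declared_total_s: float,
                         track_durations_s: Iterable[float],
                         tolerance_s: float = 1.0) -> Tuple[bool, float]:
    """Return (is_consistent, drift), where drift = declared - summed."""
    summed = sum(track_durations_s)
    drift = declared_total_s - summed
    return abs(drift) <= tolerance_s, drift
```

Because the values are treated as hints, a failed check would more naturally produce a warning at ingestion time than a refusal to play the book.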

zheng_xu: I was going to say we don't need to download the track to get the information
... we are thinking about duration of each resource in reading order
... it's just for TOC... are there other use cases for duration of each resource?

duga: that's the Q I was going to ask
... we're talking about total book duration, which is useful
... but knowing the duration of each track? I don't know what the use case is

Avneesh: total duration is important
... but for each TOC entry it would be helpful; maybe don't make it compulsory

Hadrien: there are multiple use cases, I've seen audiobook players that show this info
... and you can decide which track you want to cache
... if I'm on a 2-hour train ride, I can choose to cache enough content
... and some UAs have a timeline
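
The train-ride use case can be sketched as follows, assuming the manifest exposes a duration in seconds for each resource in the reading order. The function and parameter names are hypothetical; the point is only that the choice can be made without downloading any audio first.

```python
from typing import List, Sequence


def tracks_to_cache(durations_s: Sequence[float],
                    start_index: int,
                    trip_s: float) -> List[int]:
    """Indexes of the resources needed to cover trip_s seconds of listening."""
    chosen: List[int] = []
    covered = 0.0
    for i in range(start_index, len(durations_s)):
        if covered >= trip_s:
            break
        chosen.append(i)
        covered += durations_s[i]
    return chosen
```

The same per-resource durations would also be enough to draw a timeline across the whole book.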

duga: the second example is a good one
... but I'm wary of the first example; there may not be space to cache the next track
... it's a lot to demand of the user
... better for the UA to make those decisions
... I'm worried about duplicating information. the more we do that the more often it's wrong.
... and more work for authors

George: we're using terms like tracks and TOC
... and people are implying there's a relationship between TOC and tracks

<duga> +1 to George - absolutely!

George: but a TOC might point to the midpoint of an audio file
... we should be more precise in our language

<dkaplan3> +1 george

George: unless we think there's a relationship between start of file and start of chapter, which would be an authoring restriction

wendyreid: absolutely
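
A TOC entry that points into the middle of an audio file could be written with a temporal media fragment such as `part1.mp3#t=754.5`. The sketch below assumes that convention, which the minutes do not specify, and handles only plain seconds; the fuller Media Fragments syntax (`npt:` prefixes, `hh:mm:ss` values) is deliberately left out.

```python
from typing import Tuple
from urllib.parse import urlsplit


def parse_toc_target(href: str) -> Tuple[str, float]:
    """Split a TOC href into (resource path, start offset in seconds)."""
    parts = urlsplit(href)
    start = 0.0
    for item in parts.fragment.split("&"):
        if item.startswith("t="):
            # "t=start" or "t=start,end"; only the start is needed here.
            first = item[2:].split(",")[0]
            start = float(first) if first else 0.0
    return parts.path, start
```

With this shape, several TOC entries can share one audio file, and a file boundary carries no meaning of its own.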

Hadrien: for George's comment, if you look at my example these two are different
... we should talk about resource and reading order
... going back to Brady's point about displaying the list of tracks, it's very common in audiobook UAs
... they're different from EPUB UAs
... and they provide lots of choices to users re downloading
... some UAs default to stream, and then give user control over downloading

duga: yes, I've used these listeners, I've implemented one, but usually this is done at storage level or amount of time
... like the cache is 20 or 50 MB. But it doesn't expose the individual tracks. Is the user going to do bitrate multiplication?
... usually you say, here's how much space I'll give you. Do the right thing.
... we don't expose track durations and sizes.
... that's up to the implementor.
... download 50 MB... that might be 3 tracks, or part of one.
... and I don't think your metadata helps. You have to download the resource.
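
The "bitrate multiplication" in question is size = bitrate × duration. A rough sketch of how a storage budget translates into listening time is given below, assuming a constant bitrate, no container overhead, and 1 MB = 1,000,000 bytes; all names are illustrative.

```python
from typing import Sequence, Tuple


def seconds_in_budget(budget_bytes: float, bitrate_kbps: float) -> float:
    """Seconds of audio that fit in budget_bytes at a constant bitrate."""
    bytes_per_second = bitrate_kbps * 1000 / 8
    return budget_bytes / bytes_per_second


def whole_tracks_in_budget(budget_bytes: float,
                           bitrate_kbps: float,
                           durations_s: Sequence[float]) -> Tuple[int, float]:
    """Return (number of whole tracks that fit, leftover seconds of budget)."""
    remaining = seconds_in_budget(budget_bytes, bitrate_kbps)
    count = 0
    for duration in durations_s:
        if duration > remaining:
            break
        remaining -= duration
        count += 1
    return count, remaining
```

At 64 kbps a 50 MB budget holds 6250 seconds: three 30-minute tracks with room to spare, or only part of a single two-hour track.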

zheng_xu: If we have both a track list and a TOC, it looks confusing
... for audio book I don't think you need to expose each track

zheng_xu: if publisher doesn't provide toc information, we might need this

Hadrien: I've replied to that in github
... I'm not aware of publishers producing a toc
... but they publish a number of audio resources
... I've had to manually enhance this example
... to reply to brady about UI
... this can be a long discussion or a preference
... I don't think telling the user to cache 20 MB is useful, it should go by time
... but seeing a list of tracks, I've seen that multiple times
... might be smarter to have autocaching, but it's in the market

wendyreid: we can talk in circles about what UAs do with audiobooks
... I've seen both in the market
... I'm leaning towards providing the data, and then they can choose how to use it
... we should find a way to provide the data

zheng_xu: for pure audio book, we are the user agent
... so it's up to us

wendyreid: I don't think we have a final decision on duration. if the schema stuff works we should use that

Hadrien: in WP if we say the type of publication is audiobook, then we can use schema:duration
... it is defined on audio book
... but it's using ISO 8601 (?) duration
... for audio resources, in current WP draft we're defining our own type, publicationLink.
... it doesn't say anything about duration, but could be extended
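
Since schema.org durations are written as ISO 8601 duration strings (for example `PT1H30M15S`), a user agent would need to convert them before doing any arithmetic. The following is a minimal sketch that assumes only day, hour, minute, and second components appear; years, months, and weeks are not handled, and the function name is illustrative.

```python
import re

# Accepts forms such as "PT45M", "PT1H30M15S", "P1DT2H", "PT12.5S".
_DURATION = re.compile(
    r"^P(?:(\d+)D)?(?:T(?:(\d+)H)?(?:(\d+)M)?(?:(\d+(?:\.\d+)?)S)?)?$"
)


def iso8601_duration_to_seconds(value: str) -> float:
    """Convert a simple ISO 8601 duration string to seconds."""
    match = _DURATION.match(value)
    if not match or not any(match.groups()):
        raise ValueError(f"unsupported duration: {value!r}")
    days, hours, minutes, seconds = (
        float(group) if group else 0.0 for group in match.groups()
    )
    return days * 86400 + hours * 3600 + minutes * 60 + seconds
```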

wendyreid: do we need to bring this up in the main group?

Hadrien: yes

#229, contributors

<wendyreid> https://github.com/w3c/wpub/issues/229

wendyreid: should we identify narrators at track level?

zheng_xu: I changed the title today
... if we have different types of audio for one resource, how do we express it in metadata

Hadrien: this is similar to the discussion about duration for tracks
... there is a risk of duplicating information
... is this useful for UAs?
... then it's worth considering expressing it in the manifest
... given that there are lots of different metadata formats for audiobooks, it might be hard to get from other sources

wendyreid: it depends on the resource.
... I've seen lots of books with different chapters with different narrators etc
... but this could get noisy quickly

Hadrien: right now you can list all those narrators with read-by
... but you won't know which narrator reads which resource
... but you could include that info in the description

duga: I agree
... but I want to go back to what George said.
... We talk about tracks as if they have meaning. But they don't.

<dkaplan3> duga++

duga: they're often random sections of audio.

<Hadrien> agree that how the content is divided has no particular meaning

duga: trying to impart meaning on arbitrary media resources is the wrong direction
... if we want to really do this, we need to point to time segments, not resources
... resource files don't have inherent meaning

wendyreid: there are two options
... high-level metadata, everyone gets listed at the top level
... for other stuff you can use description
... or we can go to track/time level

Hadrien: I want a vote
... I think description is enough
... brady is right that how the content is divided is meaningless
... but timestamps are tricky, if it crosses boundary of multiple resources
... for now I think description is enough
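
The difficulty with timestamps can be seen by mapping a position on the whole-book timeline back to a resource and a local offset. The sketch below assumes per-resource durations in seconds are known; the function name is hypothetical.

```python
from bisect import bisect_right
from itertools import accumulate
from typing import Sequence, Tuple


def locate(durations_s: Sequence[float], t: float) -> Tuple[int, float]:
    """Map a book-level time t to (resource index, offset within it)."""
    ends = list(accumulate(durations_s))  # cumulative end time of each resource
    if not ends or t < 0 or t >= ends[-1]:
        raise ValueError("time is outside the publication")
    index = bisect_right(ends, t)
    start = ends[index - 1] if index else 0
    return index, t - start
```

A narrator segment running from 500 s to 700 s over resources of 600 s, 900 s, and 300 s begins in the first resource and ends in the second, so a per-resource annotation cannot describe it cleanly.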

<George> It seems that the boundary of specific speakers or narrators is more in the domain of synced media, the Community Group Marisa leads.

dkaplan3: I'm good with a vote, and with the less-granular version
... there will be a long tail of things we wish audio books could do that they can't do
... a lot of publishers will not implement anyway
... the perfect is the enemy of the good, as Ivan said the other day

zheng_xu: I think this depends on how we define duration
... then how we define duration for each narrator
... not possible to generate automatically, so that's why we need info in manifest

wendyreid: let us have the vote
... do we want to say that narrator/creator info be in publication-level metadata, or resource-level metadata?

<dkaplan3> 1

<DanielWeck> 1

<duga> 1

<marisa> 1

wendyreid: vote 1 for publication level, 2 for resource level

<Hadrien> 1

<romain> 1

<zheng_xu> 1

<George> George 1

wendyreid: we seem to choose publication level
... zheng_xu, could you update your issue?

<zheng_xu> sure
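
The outcome of the vote could look roughly like the manifest fragment below: every narrator is listed once at publication level, and any chapter-by-chapter detail goes into the free-text description. This is a hypothetical sketch. `readBy` and `description` follow schema.org naming, while the overall manifest shape, the `readingOrder` key, and the sample values are assumptions made for illustration.

```python
from typing import Any, Dict, List

# Hypothetical manifest: narrators appear only at publication level.
manifest: Dict[str, Any] = {
    "type": "Audiobook",
    "name": "Example Title",
    "readBy": [
        {"type": "Person", "name": "Narrator One"},
        {"type": "Person", "name": "Narrator Two"},
    ],
    "description": "Part 1 read by Narrator One; part 2 read by Narrator Two.",
    "readingOrder": [{"url": "part1.mp3"}, {"url": "part2.mp3"}],
}


def narrator_names(publication: Dict[str, Any]) -> List[str]:
    """List narrator names from publication-level metadata."""
    read_by = publication.get("readBy", [])
    if isinstance(read_by, dict):  # tolerate a single object instead of a list
        read_by = [read_by]
    return [person["name"] for person in read_by]
```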

#320, supplemental content

<wendyreid> https://github.com/w3c/wpub/issues/320

wendyreid: lots of non-fiction audiobooks contain supplemental content, like PDFs of charts or collections of images

zheng_xu: before we get into details of supplemental content, we should define what it is

dkaplan3: don't want to get too long-tail
... this should come up in general WPUB
... anything from maps to datasets
... in children's picture books with audio
... and YA and middle grade fiction with emoji
... there are lots of cases where supplemental content is traditional non-fiction mode,
... do we want a place to add those in? do you need to link to a point in the audio?
... how do you address a11y of this?
... maybe we don't need to break it down
... we want access to supplemental material, with a useable and accessible UI

<George> If we have text, images, etc. we are no longer in an audio book.

dkaplan3: and we want something that's sustainable and creatable by authors

Hadrien: what we have in WP is a good starting point
... we have resources in reading order, resources that are not in the reading order but are in publication, and then resources related to the publication that are not part of it
... this is where the links element of WP is useful
... I don't think we need anything new for this

duga: I agree, we have functionality already
... another worry:
... here's an html that's related content, with some rel value
... what stops me from putting an html doc in the middle of my audio listening order?

Hadrien: I think that's a separate issue
... please open in the tracker
... do we have a profile for audio book, or just best practices
... like only having audio files in reading order, requiring a duration etc
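
If such a profile existed, its two example constraints could be checked mechanically. The sketch below is an assumption-laden illustration: the item keys `encodingFormat` and `duration` are borrowed from schema.org naming, and the rules themselves are only the examples mentioned here, not an agreed profile.

```python
from typing import Any, Dict, List, Sequence


def audiobook_profile_problems(reading_order: Sequence[Dict[str, Any]]) -> List[str]:
    """Report reading-order items that break the hypothetical audio profile."""
    problems: List[str] = []
    for i, item in enumerate(reading_order):
        media_type = item.get("encodingFormat", "")
        if not media_type.startswith("audio/"):
            problems.append(
                f"item {i}: not an audio resource ({media_type or 'unknown type'})"
            )
        if "duration" not in item:
            problems.append(f"item {i}: missing duration")
    return problems
```

An HTML document placed in the middle of the listening order would be flagged by the first rule, while supplemental content kept outside the reading order would be unaffected.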

dkaplan3: we raised this in the first meeting
... with the base definition of what an audio book is
... within the framework of a WP, the idea of whether something is an audio book or a textbook stops making sense
... they're just resources
... but the reality is that we are a long way from audible or someone supporting mixed content

(unminuted horror)

scribe: but we still want to get access to supplemental content with traditional audio books

wendyreid: publishers seem well aware that there is content important to understanding the book, but UAs don't have a good way of presenting that

Hadrien: it's true that with WP we can create these publications
... but that doesn't mean we can't make a profile
... there's no standard way of representing an audio book

<zheng_xu> Wendy: we will schedule the next meetings, probably in September and at TPAC

[End of minutes]

Minutes formatted by David Booth's scribe.perl version 1.152 (CVS log)
$Date: 2018/08/30 08:00:11 $