08:41:45 RRSAgent has joined #ad
08:41:45 logging to https://www.w3.org/2018/10/25-ad-irc
08:41:50 Zakim has joined #ad
08:41:54 rrsagent, make logs public
08:42:18 onishi has joined #ad
08:42:52 ericc has joined #ad
08:43:50 scribe: nigel
08:44:23 Chair: Nigel
08:45:14 Present+ Nigel_Megitt, Marisa_Demeglio, Eric_Carlson, Andreas_Tai, Masayoshi_Onishi, Matt_Simpson
08:45:18 Topic: Introductions
08:46:12 Nigel: Welcome everyone to the first face to face meeting of the AD CG.
08:47:41 .. Run-through of agenda
08:47:59 -> https://www.w3.org/community/audio-description/files/2018/10/AD-CG-F2F-2018-10-25.pdf Slides
08:48:35 Nigel: In the room we have:
08:48:42 .. Nigel Megitt (BBC)
08:49:10 marisa: Marisa Demeglio (DAISY consortium), in the Publishing WG and interested in accessibility
08:50:21 ericc: Eric Carlson (Apple), on the WebKit team, mostly working on media in the web, and
08:50:28 .. of course very interested in accessibility solutions.
08:50:55 Andreas: Andreas Tai (IRT), mainly work on subtitles and captions and also look at other
08:51:10 .. accessibility. Unfortunately no resources yet for dedicating time to this, but interested
08:51:13 .. in the status.
08:51:50 onishi: Onishi (NHK). NHK runs a 4K and 8K broadcast service and this uses TTML. I'd like
08:51:56 .. to research use cases for TTML.
08:52:17 Matt: Matt Simpson (Red Bee), Head of Portfolio for Access Services, probably one of the
08:52:32 .. biggest producers of audio description by volume for a number of clients around the world.
08:52:37 Nigel: Thank you all
08:53:54 Regrets: John_Birch
08:54:03 Topic: Current and future status
08:54:21 Nigel: AD CG set up earlier in the year; we have a repo, an Editor, and participants.
08:55:29 Nigel: Goal: Get to good enough for Rec Track, add to TTWG Charter 1st half 2019
08:55:59 marisa: Timeline for TTML2?
08:56:19 Nigel: TTML2 is in Proposed Rec status; the TTWG is targeting Rec publication on 13th November. 
08:56:48 .. The AC poll is open until 1st November. Please vote if you haven't already!
08:56:58 Topic: Requirements
08:57:10 Nigel: Goal: To create an open standard exchange format to support audio description all the way from scripting to mixing.
08:58:06 ericc: You should look at what 3PlayMedia has.
08:58:12 Nigel: Thanks, I will.
08:58:23 .. Are they delivering accessible text versions of AD?
08:58:39 ericc: Yes, both AD and extended, both pre-recorded and synthetic text, and they have
08:58:49 .. a javascript based plug-in that works in modern browsers.
08:58:55 Nigel: That sounds great, I didn't know about that, thank you.
08:58:57 atai has joined #ad
08:59:06 ericc: I haven't played with it much but it seems to work quite well.
08:59:35 marisa: When you talk about an accessible text, what makes it accessible?
08:59:52 Nigel: It's delivered as text and the player can present it in an aria live region so that
08:59:57 .. accessibility tools can pick it up.
09:00:02 marisa: And TTML makes that happen?
09:00:11 Nigel: It needs the player to make it happen.
09:00:45 Present+ Mark_Watson
09:01:17 Nigel: Existing Requirements - I published a wiki page of requirements a while back.
09:01:34 -> https://github.com/w3c/ttml2/wiki/Audio-Description-Requirements AD requirements
09:01:49 Nigel: Those requirements got some feedback which led to changes.
09:02:05 .. In particular to relate them to the W3C MAUR requirements, which they align with.
09:02:53 .. Those requirements describe the process that the document needs to support
09:03:28 .. but not the specifics of what the document itself needs to support.
09:05:43 .. I've done a first-pass review; the main body of the spec work would be to validate that
09:05:55 .. those TTML2 feature designators are the correct set. 
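[Editor's note] The aria live region approach Nigel describes can be sketched in a few lines of player code. This is a hypothetical illustration, not taken from any player discussed here; the cue shape (`begin`/`end`/`text`, loosely modelled on TTML2 timing attributes) and the function names are assumptions:

```javascript
// Return the description cue active at a given media time, or null.
// Cues are assumed to be objects of the form { begin, end, text },
// with times in seconds.
function activeCue(cues, time) {
  return cues.find(c => time >= c.begin && time < c.end) || null;
}

// In a browser, mirror the active cue into an ARIA live region so that
// assistive technology announces the text without moving focus.
function attachDescriptionRegion(video, cues, region) {
  region.setAttribute('aria-live', 'polite');
  let current = null;
  video.addEventListener('timeupdate', () => {
    const cue = activeCue(cues, video.currentTime);
    if (cue !== current) {
      region.textContent = cue ? cue.text : '';
      current = cue;
    }
  });
}
```

As Nigel notes, the format carries the text and timing; it is player wiring along these lines that makes the result accessible.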
09:06:31 markw has joined #ad
09:06:54 https://www.w3.org/community/audio-description/files/2018/10/AD-CG-F2F-2018-10-25.pdf
09:07:43 Nigel: In looking at those requirements I thought there were some constraints to consider.
09:07:53 .. Two questions from me:
09:08:02 .. 1. Do we ever need to be able to have more than one “description” active at the same time?
09:08:44 Matt: I can't see a reason for needing this - it would have to be a variation of the primary language.
09:08:53 .. Multiple localised versions might be needed.
09:09:02 .. I imagine that would be a single track per file.
09:09:07 Matt: Yes, interesting thought.
09:09:19 marisa: A variation on a use case: if you have a deaf-blind user who is following the
09:09:34 .. captions, they also need the information from the description and the captions.
09:09:44 markw: They would have both description and captions available at the same time.
09:11:11 Nigel: Assumptions on my part:
09:11:16 .. Separate AD and captions files
09:11:29 .. No AD over dialogue, so no significant issue of overlap
09:11:49 marisa: If the viewer needs to pause AD to read it on a braille display...
09:11:56 Nigel: My assumption: that would also pause media.
09:12:01 ericc: [nods]
09:12:11 marisa: That's the trickiest use case I can think of
09:12:15 Nigel: Me too
09:12:30 atai: I'm not sure if immersive environments are in scope.
09:12:47 .. A European project that IRT is involved with is exploring requirements for AD in 360º videos.
09:12:57 .. I'm not sure if they implemented it, but one idea is to have some parts of the AD only
09:13:10 .. activated if the user looks in a certain direction, so if this is happening in one document
09:13:21 .. then there would be certain AD parts with the same timing but maybe not active at
09:13:24 .. the same time.
09:13:32 marisa: Great use case!
09:13:50 .. A deaf-blind user in a 360º video is now the trickiest use case in the world I can think of! 
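[Editor's note] Andreas's 360º idea (parts of the AD activated only when the user looks in a certain direction) could reduce, in the simplest horizontal-only case, to an angular test like the following sketch. Everything here is hypothetical; no such selector exists in TTML2 today:

```javascript
// Is the viewer's yaw within the cue's horizontal activation window?
// Angles are in degrees; yaw wraps at 360. windowWidth is the full
// width of the window centred on cueYaw. All names are assumptions.
function inViewport(viewYaw, cueYaw, windowWidth) {
  let delta = Math.abs(((viewYaw - cueYaw) % 360 + 360) % 360);
  if (delta > 180) delta = 360 - delta;
  return delta <= windowWidth / 2;
}
```

A cue would then be active only while both its time range and this angular test hold, which is exactly the "same timing but maybe not active at the same time" case Andreas describes.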
09:14:04 ericc: That means in addition to a time range, in the case of a 360º video you may also
09:14:17 .. want to have an additional selector for the viewport in which it is active.
09:14:30 markw: Or the location of the object it is associated with.
09:14:40 atai: This is very similar to the subtitle use case we showed before where you stick
09:14:51 .. subtitles to a location. You need the same location information for AD.
09:15:08 markw: The user could have selections about the width of the viewport they want.
09:15:25 Nigel: That's a great use case - can I suggest it's a v2 thing based on the solution for
09:15:30 .. subtitles, which we also don't know yet?
09:15:53 atai: I agree the solution for subtitles should apply here. That makes sense, but it would be
09:16:04 .. good to discuss it and understand the dependencies.
09:16:51 atai: I will check with the people working on this. I don't know any technical group working
09:17:03 .. on audio description, so it would be a good forum for working on requirements.
09:17:12 .. If they want to contribute something they can post it on the CG reflector.
09:17:14 Nigel: Good plan.
09:18:47 .. Summarising, I don't think I've heard any requirement for multiple descriptions to be
09:18:54 .. active at the same time, within a single language.
09:19:24 .. My next constraint question is:
09:19:32 .. 2. Do we need to set media time ranges (clipBegin and clipEnd) on embedded audio?
09:19:46 .. TTML2 allows audio to be embedded, but in our implementation work we hit a snag:
09:23:17 .. applying media fragment URIs to a data URL is tricky.
09:23:33 ericc: Embedding audio as text is a terrible idea.
09:23:46 markw: Any reason other than the amount of data?
09:24:36 ericc: You have to keep the text and the decoded audio in memory at the same time,
09:24:40 .. which is additional overhead.
09:25:08 ericc: Technically it should be straightforward to seek to a point.
09:25:31 marisa: I don't want to implement it! 
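[Editor's note] The snag Nigel mentions is that a Media Fragments URI temporal clip such as `#t=3,7.5` is straightforward to parse, but when the audio is embedded as a data URL the whole resource is inline, so the player itself must honour the clipping after decoding (for example via `AudioBufferSourceNode.start(when, offset, duration)` in Web Audio). A hypothetical parser for the plain-seconds form of the fragment, for illustration only:

```javascript
// Parse the temporal part of a Media Fragments URI, e.g. "#t=3,7.5".
// Only the plain seconds (NPT) form is handled. Returns { begin, end }
// in seconds, with begin defaulting to 0 and end to Infinity, or null
// if there is no temporal fragment.
function parseTemporalFragment(fragment) {
  const m = /(?:^|[#&?])t=([\d.]*)(?:,([\d.]+))?/.exec(fragment);
  if (!m || (!m[1] && m[2] === undefined)) return null;
  return {
    begin: m[1] ? parseFloat(m[1]) : 0,
    end: m[2] !== undefined ? parseFloat(m[2]) : Infinity,
  };
}
```

The parsing is the easy part; the memory overhead ericc describes comes from holding both the base64 text and the decoded samples while realising the clip.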
09:25:37 ericc: It's terrible.
09:27:14 atai: Is it then debatable to leave out this feature of embedded audio?
09:27:32 Nigel: I think so, yes. The result would be that distribution of recorded audio would have
09:27:33 marisa has joined #ad
09:27:49 .. to be additional files alongside the TTML2 file. That has an asset management impact,
09:27:57 .. but it also seems like good practice.
09:28:19 ericc: High level question: I talked last week with Ken Harenstein, who does YouTube captions,
09:28:33 .. and he told me about 3PlayMedia. He said that from their research and from talking to
09:28:47 .. users of audio descriptions and from talking to 3PlayMedia, it was his understanding that
09:28:59 .. many users of audio descriptions prefer speech synthesis to pre-recorded, partly because
09:29:13 .. it allows them to set the speed like they're used to doing with screen readers,
09:29:29 .. and it made extended audio descriptions less disruptive because it reduces the likelihood
09:29:42 .. of interrupting playback of the main resource. I wonder if you have heard that too, and if
09:30:01 .. it is true it seems that there should be information in a spec helping people who make
09:30:06 .. these make the right kind.
09:31:19 Nigel: TTML2 supports text to speech, and also players can switch off the audio
09:31:32 .. and expose the text to screen readers instead to allow the user's screen reader to take
09:31:33 .. over.
09:31:51 marisa: I've heard that most screen readers speed up the speech.
09:32:08 markw: I've heard it works better speeding up synthesised speech.
09:32:20 marisa: Of course if there's no language support for text to speech then you may still
09:32:32 .. need pre-recorded audio.
09:34:18 atai: You may need to know how long the text to speech will take to author the rate correctly.
09:34:39 Nigel: There's a whole other world of pain in terms of distributability of web voices for text to speech. 
09:34:56 ericc: I think the requirement is that the player pauses to allow for completion of the
09:35:05 .. audio description, so it doesn't matter how long it takes.
09:35:30 marisa: What if you're switching language of AD and some are more verbose than others?
09:35:43 ericc: Yes, as long as the description accurately identifies the section of the media file
09:35:54 .. that it describes then it is easy enough for the player to take care of, or at least it is the
09:35:58 .. player's responsibility.
09:36:10 markw: The player could do other things like tweaking the playback speed to fit.
09:36:29 ericc: The Web Speech API doesn't allow access to predicting the duration of the speech.
09:36:46 atai: Is player behaviour in scope for this document?
09:36:49 ericc: Absolutely.
09:37:01 .. It seems to me that it is, because if you don't describe the behaviour of the player you
09:37:12 .. are going to get different incompatible or non-interoperable implementations, and that
09:37:14 .. is an anti-goal.
09:37:28 markw: You want to describe the space of possible player behaviours; we just need to
09:37:31 .. provide the information.
09:37:47 ericc: Yes, give guidelines to help implementers do the right thing, and people who create the descriptions.
09:39:32 Nigel: I agree, this is somewhat informative relative to the document format, but for example
09:39:47 .. our UX people suggested that users would want to direct AD text to a screen reader
09:39:59 .. and switch off audio presentation sometimes, or at least be able to select that.
09:40:17 marisa: Maybe have both audio and braille display to check spellings or do some other text-related processing.
09:40:19 Nigel: Yes
09:41:27 Nigel: In terms of user preference for synthesised or pre-recorded speech, one data point
09:41:44 .. I learned recently is that the intelligibility of synthesised speech degrades more quickly
09:41:59 .. in the presence of ambient sounds than human speech. The reasons are not clear. 
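[Editor's note] Eric's point (the player pauses, so it doesn't matter how long synthesis takes) follows from the Web Speech API being event-driven rather than predictive: a player cannot ask for a duration up front, but it can resume on the utterance's `end` event. A sketch under that assumption, plus a purely illustrative authoring-time heuristic for Andreas's rate question:

```javascript
// Pause the video, speak an extended description, and resume playback
// when synthesis finishes. The Web Speech API offers no duration
// estimate in advance, so resumption is driven by the 'end' event.
function speakExtendedDescription(video, text, rate = 1.0) {
  return new Promise(resolve => {
    const utterance = new SpeechSynthesisUtterance(text);
    utterance.rate = rate; // user-adjustable, as with a screen reader
    utterance.onend = () => { video.play(); resolve(); };
    video.pause();
    speechSynthesis.speak(utterance);
  });
}

// Rough authoring-time check (an assumed heuristic, not from any spec):
// does a description of wordCount words fit a gap of gapSeconds at the
// given speaking rate in words per minute?
function fitsInGap(wordCount, wordsPerMinute, gapSeconds) {
  return (wordCount / wordsPerMinute) * 60 <= gapSeconds;
}
```

This is also why markw's suggestion of tweaking playback speed to fit has to be reactive rather than planned in advance when synthesis is used.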
09:42:12 markw: That suggests some users would want to receive the AD in a separate earpiece
09:42:19 .. from other audience members watching the same programme.
09:42:48 Matt: I think this is like dubbing vs subtitling; there may be cultural reasons for preferences.
09:43:02 .. Our experience is it is harder to automate variable reading rate descriptions, and we find
09:43:18 .. that invaluable to squeeze a description into a short period or let it "breathe".
09:43:24 .. It's probably down to historical experience.
09:43:50 Present+ Francois_Beaufort
09:44:04 fbeaufort: I work at Google on the developer relations team.
09:44:36 fbeaufort has joined #ad
09:44:41 nigel has changed the topic to: Channel for the Audio Description Community group. Slides: https://www.w3.org/community/audio-description/files/2018/10/AD-CG-F2F-2018-10-25.pdf
09:44:55 nigel has changed the topic to: Channel for the Audio Description Community group. Slides: https://www.w3.org/community/audio-description/files/2018/10/AD-CG-F2F-2018-10-25.pdf Webex: https://ebu.webex.com/ebu/e.php?MTID=m8453309dd2136cedb50a126ec3aeff98
09:45:30 Nigel: Any other constraints or requirements?
09:45:41 group: [silence]
09:45:56 Topic: TTML2 in more detail
09:47:11 notabene has joined #ad
09:50:44 Nigel: [slide on Audio Model]
09:51:04 .. I just added this to try to explain, because I've found it can be tricky to get across to developers
09:51:19 .. that there is an analogy between HTML/CSS and the audio model in TTML.
09:53:16 markw: Players may or may not do this based on user preference; if for example someone
09:53:30 .. is listening on a headset and there's main programme audio in the room, the mixing
09:53:34 .. preferences might change.
09:53:37 Nigel: [slide on the Web Audio Graph]
09:53:51 .. This allows the audio mixing to happen with all the options that are needed in general
09:54:02 .. in TTML2 - it may be that we only exercise a part of that solution space. 
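[Editor's note] The Web Audio graph on the slide is not reproduced in the log, but its general shape (separate gain control for programme and description audio, so a player can duck the programme according to user preference, as markw describes) can be sketched. The node and function names here are assumptions, not the TTML2 model itself:

```javascript
// Build a minimal mixing graph: programme and description audio each
// route through their own GainNode before the destination, so the
// player can duck the programme while a description plays.
function buildMixGraph(ctx, programmeSource, descriptionSource) {
  const programmeGain = ctx.createGain();
  const descriptionGain = ctx.createGain();
  programmeSource.connect(programmeGain).connect(ctx.destination);
  descriptionSource.connect(descriptionGain).connect(ctx.destination);
  return { programmeGain, descriptionGain };
}

// Convert a gain expressed in decibels to the linear value a GainNode
// expects (whether an exchange format carries dB or linear gain is an
// authoring choice, not assumed here).
function dbToLinear(db) {
  return Math.pow(10, db / 20);
}
```

Setting `programmeGain.gain.value = dbToLinear(-12)` while a description plays is one simple ducking policy; the point of exposing the graph client side is that this policy can follow the listener's situation rather than being baked into a premix.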
09:54:33 Topic: Proposed Solution
09:55:48 Nigel: The solution that I'm proposing is a profile of TTML2.
09:55:56 .. [slide for Profile of TTML2]
09:56:11 ericc: Also add that a UI should be provided for controlling the speed of audio descriptions.
09:56:18 Nigel: Yes
09:56:35 Nigel: The other things on this slide we already discussed.
09:57:14 Nigel: Is anyone thinking this is a great problem to solve but it should look completely different?
09:57:46 ericc: Is it a goal to define a guide for how this should work in a web browser?
09:59:26 Nigel: The TTML2 features are defined in terms of Web Audio, Web Speech etc., so yes.
09:59:42 .. The mixing might happen server side, but the client side mixing options allow for a better
09:59:49 .. range of accessible experiences.
10:00:21 ericc: It seems to me that a really detailed guide to implementation would be the most useful thing.
10:00:47 .. An explicit goal should be to help producers to create content in the right way, but also
10:01:05 .. to help people that want to deliver that to know how to make it available to the people that need it.
10:01:16 .. Not distribution, the playback experience.
10:01:27 .. Nicely constructed audio descriptions are not useful unless the people that need them are
10:01:29 .. able to consume them.
10:01:35 Nigel: [nods]
10:01:47 atai: It might be interesting to identify what is missing to get a good implementation in a browser
10:01:51 .. environment.
10:02:03 .. It might be interesting to hear how much browser communities are interested in that
10:02:15 .. case. A possible way to do this would be to implement a javascript polyfill or something;
10:02:22 .. I'm not sure how much interest there is in native support.
10:02:37 ericc: Both are extremely useful. I don't know anything about 3PlayMedia but they have
10:02:50 .. a javascript based player that uses the text to speech API, so we know that it is possible.
10:03:02 .. Theirs is a commercial solution. 
We should have a description of ...
10:03:14 .. and as a data point I was at a conference last week about media in the web and this was
10:03:24 .. one of the breakouts: audio descriptions and extended audio descriptions.
10:03:39 .. It was well attended and people in the room were very interested in coming up with a
10:03:47 .. solution that browsers could implement natively.
10:03:53 Nigel: I'd love to be in touch with those people.
10:04:14 Topic: Implementation Experience
10:06:13 Nigel: BBC implemented a prototype to support TTML2 Rec track work
10:06:17 -> https://bbc.github.io/Adhere/ BBC implementation
10:12:28 Nigel: The point here is that it is possible to do this with current browser technologies,
10:12:47 .. even if there are some minor issues that I should raise as issues, like on Web Speech.
10:13:16 .. Question: Any other implementation work, or people who would like to do that at this time?
10:13:44 marisa: I would say no, we don't have the bandwidth, but I'm keeping my eye on this for
10:13:54 .. the long term. The use cases come up all the time from the APA group. I think it is
10:14:03 .. on the horizon, but I can't commit to anything on the same timeline as this spec.
10:14:19 atai: Does BBC plan to publish this software as a reference implementation?
10:14:35 Nigel: I would say first we should publish as open source, and then allow for some
10:14:50 .. scrutiny, and if people agree it's at that level then great. I don't think it is now.
10:14:55 .. It would need more work.
10:15:18 atai: The question is if the BBC could be motivated to provide it as a reference
10:15:26 .. implementation. It would help if you have a complete reference implementation.
10:15:35 Nigel: I would like to, but I don't think the code is good enough yet.
10:17:22 .. I'm interested in other implementations too; for example it is possible that some
10:18:30 .. participants in AD CG might make authoring tools.
10:18:37 ericc: You should talk to 3Play also. 
10:18:49 Nigel: Yes, I will. It'd be great if they would join us here.
10:19:07 Topic: Roles, Tools, Timelines, Next Steps
10:19:32 Nigel: In terms of tools, we have a GitHub repo, w3c/adpt.
10:20:42 .. We have the reflector, and EBU has kindly offered to facilitate web meetings with their WebEx.
10:22:47 .. [Next steps slide]
10:24:20 atai: Regarding the next steps, to move over to a WG and the Rec track, does it necessarily have
10:24:27 .. to end up in the TTWG? Could it be another group?
10:24:39 .. Could it be somewhere else?
10:24:48 .. To make sure the right set of people are involved.
10:25:33 Nigel: I'm not dogmatic about this - it seems like the home of TTML is a good place for
10:25:46 .. profiles of TTML, but if there's a better chance of getting to Rec doing it somewhere else
10:25:52 .. then I don't mind where it happens.
10:26:19 atai: One other idea: when the TTML2 feature set is there it may be useful to have a
10:26:30 .. gap listing relative to IMSC 1.1, so that if people want to reuse implementations and
10:26:40 .. start from IMSC 1.1 rather than TTML2 then they can see what they already have.
10:26:51 ericc: Or which features they prefer not to use.
10:26:56 Nigel: Because they had implementation difficulty?
10:27:10 ericc: Yes, for example someone targeting IMSC 1.1 support: if you list the features that
10:28:06 .. are only supported in one and not the other, it could inform.
10:28:20 Nigel: Of course the significant features in IMSC are about visual presentation, and here
10:28:40 .. we are interested in audio features, so the common core of timing is all that's really left.
10:30:09 Topic: Discussion and close
10:30:18 Nigel: We've had good discussion all the way through, so thank you everyone.
10:30:46 ericc: Defining this using those TTML2 features is interesting and it's good.
10:31:08 .. It sets a fairly high bar to implement.
10:31:11 atai has left #ad
10:31:16 Nigel: It took a couple of weeks to implement. 
10:31:34 ericc: It makes me wonder if it would be possible to have something that is more like a
10:31:52 .. minor variation on a caption format.
10:31:56 Nigel: I think that's what this is.
10:32:04 ericc: Except for the ability to embed audio.
10:32:52 Nigel: That maybe took about half a day to implement. We could remove it from scope.
10:33:12 atai: It would be good to know what problems there are bringing this to a browser environment.
10:33:38 ericc: That's true. At the most basic it seems that what we have is some text and a range
10:33:51 .. of time that it applies to in another file.
10:35:43 Nigel: I'm thinking of high production values where detailed audio mixing is needed.
10:35:51 ericc: Is that something we need for the web?
10:36:13 Nigel: I am aiming for a single open standard file format that content producers can use
10:36:23 .. all the way through from content creation to broadcast and web use.
10:36:28 Matt: I would agree.
10:36:47 markw: Thinking about our chain, we create premixed versions and they seem quite high
10:37:02 .. quality, so this might be worth considering.
10:37:19 atai: Thinking about the history of TTML, it started out as an authoring format and then
10:37:32 .. began to be used for distribution and playback, which led to IMSC. I understand the
10:37:45 .. purpose of one file for the whole chain - that's perfect, it's ideal - we should just avoid the
10:37:47 .. pitfalls.
10:38:05 ericc: If the goal is to have native implementation in a browser it may be worth looking
10:38:13 .. at the complexity with that goal in mind.
10:38:34 .. If it is not a goal then that's fine, but if it is then keep that goal in mind.
10:39:43 Nigel: I am not sure. It can be done with a polyfill, but would browser makers like to support
10:42:09 .. the primitives to allow that, or to implement it natively?
10:42:16 atai: The playback experience would be better natively. 
10:42:36 fbeaufort: If the playback was the same would you still want native implementation?
10:42:54 Nigel: It would be great to avoid sending polyfill js to every page in that case, and it would
10:43:06 .. make adoption easier if the page author just had to include a track in the video element
10:43:10 .. and then it would play.
10:43:48 ericc: Your polyfill is about 50KB of unminified, uncompressed js, so it's not very big.
10:43:58 Nigel: Thank you everyone! [adjourns meeting]
10:44:16 rrsagent, make minutes
10:44:16 I have made the request to generate https://www.w3.org/2018/10/25-ad-minutes.html nigel
11:55:28 nigel has joined #ad
11:57:20 Meeting: Audio Description Community Group
12:02:00 ericc has joined #ad
12:03:11 Zakim has left #ad
12:09:36 ericc has joined #ad
12:10:05 rrsagent, make minutes
12:10:05 I have made the request to generate https://www.w3.org/2018/10/25-ad-minutes.html nigel
12:10:15 nigel has changed the topic to: Channel for the Audio Description Community group. 
Slides: https://www.w3.org/community/audio-description/files/2018/10/AD-CG-F2F-2018-10-25.pdf
12:13:43 Log: https://www.w3.org/2018/10/25-ad-irc
12:14:48 rrsagent, make minutes
12:14:48 I have made the request to generate https://www.w3.org/2018/10/25-ad-minutes.html nigel
12:16:24 scribeOptions: -final -noEmbedDiagnostics
12:16:25 rrsagent, make logs public
12:16:39 rrsagent, make minutes
12:16:39 I have made the request to generate https://www.w3.org/2018/10/25-ad-minutes.html nigel
12:18:53 rrsagent, make minutes
12:18:53 I have made the request to generate https://www.w3.org/2018/10/25-ad-minutes.html nigel
12:44:08 ericc has joined #ad
14:11:30 ericc has joined #ad
14:11:40 nigel has joined #ad
15:07:39 notabene has left #ad
16:25:11 ericc has joined #ad