Meeting minutes
<PaulG> present+
https://
Agenda Review, Membership & Announcements
<PaulG> matatk: we are trying to check attendance at CSUN
<PaulG> ...is anyone attending?
Review Use Case
<PaulG> Alan: is there a reason we're using both commas and breaks?
<PaulG> Dee: sometimes the speech engine doesn't pronounce "A." but pronounces a short vowel 'a'
<PaulG> S_Wood: for clarity we should remove the comma
<PaulG> matatk: these being from real world examples is very valuable and I think we need to be careful to separate the broader context from the specific asks that we have from implementors
<PaulG> matatk: we don't want them to pinpoint what could be attributable to a bug in the TTS engine
<PaulG> matatk: the break in example 4 is not trying to be the comma
<PaulG> matatk: so we don't need to drop it except that it was being used as a workaround for a bug
<PaulG> Dee: totally makes sense to me
PaulG: We use @time in all of these examples.
PaulG: but users' AT/TTS have different rates of speech.
… There are encoded strengths for breaks of different lengths, e.g. after a comma would be a 'weak' break; after a sentence would be a 'strong' break.
… Devs may be more amenable to coding along the lines of those types of breaks, rather than times.
… This is an opportunity to use SSML as a starting point, but make the names for the breaks more intuitive.
… We could come up with our own terms, or use the ones they have.
… What are your thoughts about using strength over time?
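Note: the two ways of marking a break discussed above can be sketched as follows. This is a minimal illustration, not one of the group's examples: the helper names and the sample sentence are hypothetical, and the strength tokens are the ones listed for the break element in SSML 1.1.

```python
from xml.sax.saxutils import quoteattr

# Strength tokens listed for <break> in SSML 1.1.
SSML_STRENGTHS = ("none", "x-weak", "weak", "medium", "strong", "x-strong")


def break_by_time(milliseconds: int) -> str:
    """Return an SSML break with a fixed, author-chosen duration."""
    if milliseconds < 0:
        raise ValueError("duration must be non-negative")
    return f"<break time={quoteattr(f'{milliseconds}ms')}/>"


def break_by_strength(strength: str) -> str:
    """Return an SSML break whose actual length is left to the speech engine."""
    if strength not in SSML_STRENGTHS:
        raise ValueError(f"unknown strength: {strength!r}")
    return f"<break strength={quoteattr(strength)}/>"


# Hypothetical sample sentence, written both ways.
timed = f"<speak>Option A{break_by_time(500)} is correct.</speak>"
relative = f"<speak>Option A{break_by_strength('weak')} is correct.</speak>"
print(timed)
print(relative)
```

The timed form fixes the pause regardless of the listener's speech rate; the strength form only names the kind of boundary and leaves the duration to the engine.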
Alan: Is strength relative - you could adjust browser settings (baseline for them).
PaulG: My understanding is it's flexible, so that if you adjust the TTS base speed, breaks are scaled in proportion to the speech-rate factor (e.g. 1.5x)
PaulG: A half-second break could sound like the end of the content to someone who's listening to 450wpm vs 100wpm
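Note: a rough arithmetic sketch of the point above. It assumes, per PaulG's stated understanding only, that a relative break shrinks in proportion to the rate factor; no engine behavior has been confirmed, and both function names are hypothetical.

```python
def scaled_break_ms(authored_ms: float, rate_factor: float) -> float:
    """Assumed behavior: a break shortens in proportion to the speech-rate factor."""
    if rate_factor <= 0:
        raise ValueError("rate_factor must be positive")
    return authored_ms / rate_factor


def pause_in_words(pause_ms: float, words_per_minute: float) -> float:
    """Express a fixed pause as the number of words that would fit in it."""
    return pause_ms * words_per_minute / 60000


# A 500 ms break at 1.5x speed, if breaks scale with rate.
print(round(scaled_break_ms(500, 1.5)))

# The same unscaled 500 ms pause, measured in words, for a fast and a slow listener.
print(pause_in_words(500, 450))
print(pause_in_words(500, 100))
```

At 450 wpm an unscaled half-second pause spans nearly four words of silence, against less than one word at 100 wpm, which is why it can sound like the end of the content.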
Alan: Is 'weak' in Chrome going to be the same as 'weak' in Firefox?
PaulG: Undetermined.
PaulG: I think it's based on the platform
… Even the spec isn't clear about what those tokens are.
PaulG: If we only put @time in our examples, that's all we'll get.
PaulG: When someone's using their own AT, we want their AT to be able to honor that.
… Time may be OK with a built-in AT (maybe).
… Amazon implements it, and Google has strength.
Alan: I've not used strength, nor rate, in our SSML. Most of our SSML is hand-rolled. When you use the Web Speech API, you send the rate separately from the SSML.
Alan: Are we saying change some of the examples, because we want both time and strength?
PaulG: Yes
Alan: Great
PaulG: If someone's concerned because these times aren't well specified, or for any reason, let us know
Alan: I don't think Web Speech was designed to work with SSML
PaulG: It may be that strength maps to something inside the TTS engine, and could be well received
Dee: These examples were real-world examples relating to Amazon Polly use in ETS.
<PaulG> S_Wood: I should talk to Mark about all this
<PaulG> ... for this document, we need a varied example base
PaulG: Maybe strength is left unspecified to avoid the spec being too English-centric.
… This could be a way to pivot away from time being a make-or-break feature for us.
… If we could get by with pauses being based on existing breaks, and you don't have to add a timer to your software, can we work with that?
<PaulG> https://
PaulG: We'll have a meeting next week (unless nobody is available) and then not for the following 2 weeks.
PaulG: When we come back, we should take some time to collate these examples.
PaulG: Then we can try to restart conversations with the vendors.
Dee: Would be helpful to know how the browsers are implementing things like break strength in relation to prosody.
PaulG: Fantastic idea. Maybe we can do some outreach.
… We need devs who have knowledge of the inner workings.
Dee: Important to understand how that works, and how time plays into that.
… If you look up the docs, both time and strength are mentioned.
… Time being optional attribute, strength not.
Dee: That was from 2010's SSML REC
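Note: when collating the examples, it may help to count which attribute each break uses. The following is a minimal sketch using Python's standard XML parser; the function name and the sample markup are hypothetical and are not taken from the group's use cases.

```python
import xml.etree.ElementTree as ET
from collections import Counter


def classify_breaks(ssml: str) -> Counter:
    """Count <break> elements by which of time/strength they carry."""
    root = ET.fromstring(ssml)
    counts = Counter()
    for element in root.iter():
        # Strip any XML namespace so "{ns}break" and "break" both match.
        local_name = element.tag.rsplit("}", 1)[-1]
        if local_name != "break":
            continue
        has_time = "time" in element.attrib
        has_strength = "strength" in element.attrib
        if has_time and has_strength:
            counts["both"] += 1
        elif has_time:
            counts["time"] += 1
        elif has_strength:
            counts["strength"] += 1
        else:
            counts["neither"] += 1
    return counts


# Hypothetical sample with one break of each kind and one bare break.
sample = '<speak>A<break time="500ms"/> B<break strength="weak"/> C<break/></speak>'
print(classify_breaks(sample))
```

Running this over the collected examples would show how heavily they lean on time compared with strength before conversations with vendors restart.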
Dee: Could ask Mark?
PaulG: Is everyone here able to attend next week?
Alan: +1
Dee: +1
matatk: +1
PaulG: For next week, could you talk to Mark? Maybe he knows someone who knows?
… We need to be prepared to find that different AT does it in different ways.
<PaulG> matatk: we need to preserve the "real world" examples but filter them for the people we want to focus on specific implementation details
<PaulG> matatk: the two audiences, our internal folks and the external implementors we need to influence.
<PaulG> ...Alan mentioned how the Web Speech API doesn't fully implement SSML
<PaulG> ...is that a path to solve our problems?
<PaulG> Alan: I just wrote a higher-level tool to help us with some of the issues
<PaulG> ...in some ways Web Speech gives more control
<PaulG> ...over rate and timing
<PaulG> matatk: Can you share the code or examples for that higher-level tool?
<PaulG> matatk: I was just thinking, whenever we show people we should use the TAG explainer format
<PaulG> ...they focus on the user problem being solved
<PaulG> ...and compare how the various solutions (ssml, web speech, etc) deliver on those solutions
Action Items
Github Issues and examples
Other Business
<PaulG> Alan: clarification: Web Speech is not a solution for these problems
<PaulG> https://
PaulG: We mentioned Web Speech in the gap analysis, but we didn't include it in the table (probably because it's very separate from the content)
PaulG: Maybe we just call that out, and make it more clear.
PaulG: We can explain why AT users are left behind if things are not moved into the AX tree
S_Wood: I may have some slides that explain this.