22 April 2024


Alan, janina, matatk, PaulG

Meeting minutes


matatk: we should explore this path on CSS speech. Implementors aren't doing it. There's no specific blocker (philosophical or technical). It just seems it's not a priority.
… one avenue to try: there might be a subset of CSS speech (e.g., pause/break) that implementors would work toward
… if that could be implemented, we could probably get the rest of what we want.
… we need an update from Mark about their spec using the data-* approach which can also influence implementors

Agenda Review, Membership & Announcements

matatk: after TPAC would be a good time to decide direction

janina: we can go with a minimal set of spec requirements and expand it later

- Drop requirements vendors resist (maybe pick up in phase 2 after gathering support)

- Push for requirements again with only the few examples we found (pause/break mostly)

- Merge with CSS speech and resume/restart (both)

- Drop it entirely

<Zakim> janina, you wanted to propose that Break/Pause is CSS-Speech 1.0. End of story.

have we reviewed this? https://www.w3.org/TR/epub-tts-10/

1EdTech's spec could have influence on implementors

(when published)

EPUB seems to use both ssml (attributes) and css-speech. Similar to one of our options.

proposal: invite Matt Garrish and Mark to a future meeting (or at TPAC), where we can unify on an approach.

<matatk> Here's a search for open issues on the EPUB TTS doc: https://github.com/w3c/epub-specs/issues?q=is%3Aissue+epub+tts+is%3Aclosed+label%3ASpec-TTS

<Alan> https://idpf.org/epub/301/spec/epub-contentdocs.html

matatk: who can update the gap analysis document?

I can do that, especially if we're not having meetings

we need ETS folks (Mark) to join with someone from EPUB and CSS-Speech (either before Mark leaves for the summer or at TPAC)

and hammer out whether we're working together or not

EPUB "voice, the volume level, and pauses and cues" as the minimum spec from CSS-Speech (adding emphasis and expanding voice to be Voice Family/Gender and Rate/Pitch/Volume)

if we agree lang is handled by HTML...

leaves Phonetic Pronunciation, Substitution, Say As

maybe this is simple enough to not warrant JSON

matatk: basically, what's in the axTree and what's not; what's formatting and what's content.
… next steps: I will communicate with interested parties and we can reconvene after that's done

