Meeting minutes
gerard: interested in embedding conversational AI in mobile devices
dirk: interested in standardizing voice interaction
… curious to learn from Gerard about security
github issues
dirk: close issue #5
… in the architecture document
dirk: description of Russian doll principle (#40)
noreen: looking at Russian doll in Wikipedia
https://
noreen: it would be interesting to find a stable reference to that metaphor
noreen: will look for a reference
debbie: we agree to include a reference if noreen can find something appropriate
dirk: will add noreen to github (nwhysel)
dirk: roles and responsibilities (issue #36)
dirk: (reviews roles and responsibilities)
debbie: what about the provider of the IPA?
noreen: could be integrator
jon: this participant has multiple roles, e.g. designer and integrator
noreen: should disambiguate owner and user
dirk: user owns speaker, but someone in the house might be using it
noreen: two potential owners, bank and user
dirk: replace owner by platform provider?
… should not mix up the hardware device with what the provider provides
jon: platform, enterprise owner, user
… Amazon has multiple roles in this scheme
jon: if we envision this architecture as a guide for independent IPAs we have three roles
… if it's a consumer-facing IPA (like an app) there would be two
debbie: should we add examples?
dirk: that would help
dirk: will revise list with examples
jon: will add examples from enterprise provider (3 roles)
debbie: revisit this next time
compare OVON and voice interaction work
debbie: looks at OVON clusters and focus items https://
debbie: the most mature OVON specs are dialog events and interagent protocols
… let's compare dialog events and interfaces
… there is a spec for dialog events but examples would be better to look at https://
… for OVON, vs. interfaces document https://
… OVON has speaker id for either user or system
dirk: should add that to VI
sending audio data
dirk: two cases: either an endpointed utterance (user started speaking, then finished the utterance) or streaming; in both, the audio itself is sent by some other means
… either sender or receiver could endpoint
… message says "user has started speaking, look here for the audio"
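A minimal sketch of the kind of message dirk describes, assuming a JSON notification that carries a reference to the audio rather than the audio itself. All field names and values here (`event`, `speaker_id`, `audio_ref`, `endpointed_by`, `audio-channel-7`) are hypothetical; they are not taken from the OVON dialog event spec or from the interfaces document.

```python
import json
from dataclasses import dataclass, asdict
from typing import Optional


@dataclass
class SpeechEvent:
    # Hypothetical fields, for illustration only.
    event: str                           # "speech-started" or "speech-finished"
    speaker_id: str                      # identifies the user or the system
    audio_ref: str                       # where the receiver can find the audio
    endpointed_by: Optional[str] = None  # "sender" or "receiver", once known


def to_message(evt: SpeechEvent) -> str:
    """Serialize the notification; the audio bytes are never embedded."""
    return json.dumps(asdict(evt), sort_keys=True)


# "User has started speaking, look here for the audio."
started = SpeechEvent("speech-started", "user-1", "audio-channel-7")

# Later, the sender (in this example) decides the utterance has ended.
finished = SpeechEvent("speech-finished", "user-1", "audio-channel-7",
                       endpointed_by="sender")

print(to_message(started))
print(to_message(finished))
```

Keeping the audio out of the message means the same notification shape could serve both the endpointed and the streaming case; only the transport behind `audio_ref` would differ.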
debbie: will compare and contrast dialog events and interfaces
dirk: will review
dirk: suggest putting use case task force on the agenda
debbie: agrees