14:17:55 RRSAgent has joined #voiceinteraction
14:17:59 logging to https://www.w3.org/2024/04/24-voiceinteraction-irc
14:18:21 meeting: Voice Interaction
14:18:28 scribe: ddahl
14:18:35 chair: debbie
14:18:45 topic: reference implementation
14:24:22 rrsagent, make logs public
14:24:32 rrsagent, format minutes
14:24:33 I have made the request to generate https://www.w3.org/2024/04/24-voiceinteraction-minutes.html ddahl
14:26:18 https://github.com/w3c/voiceinteraction/tree/master/source/w3cipa
14:27:13 agenda: https://lists.w3.org/Archives/Public/public-voiceinteraction/2024Apr/0010.html
14:37:14 present: dirk, gerard, debbie
14:39:17 dirk: review reference implementation
14:39:29 ...ChatGPT and Mistral
14:40:38 ...framework is mainly headers
14:41:16 ...component that accesses GPT, also demo program
14:42:21 dirk reviews SOURCE.md
14:43:32 ...input listener for modality inputs
14:43:54 ...in this case just selects the first one
14:44:09 ...goes to ModalityManager
14:44:28 ...can add modality components as you like
14:45:30 ...startInput and handleOutput
14:46:22 ...this is part of the framework, so Royalty Free
14:48:24 ...modality type is a free string so it can be extensible
14:48:46 ...only text is implemented in the reference implementation
14:50:39 ...some modality components could be both input and output
14:54:08 ...one instance that knows all listeners and that all modality components would know
14:55:15 ...looking at one example of a modality component, textModality
14:59:16 debbie: can there be more than one InputModalityComponent?
14:59:21 dirk: in theory, yes
15:02:34 ...we might have scaling issues with multiple text inputs, for example
15:03:36 rrsagent, format minutes
15:03:37 I have made the request to generate https://www.w3.org/2024/04/24-voiceinteraction-minutes.html ddahl
15:07:42 debbie: take "first" out of the name "TakeFirstInputModalityComponent" to make it more general
15:08:24 dirk: moving on to DialogLayer, IPA Service
15:09:18 ...IPA for both local and anything else we have
15:10:17 ...ReferenceIPAService consumes data from Client
15:11:31 ...could serve multiple clients, or if we have local and other IPA services
15:11:45 ...no DialogManager in place
15:13:49 ...if there were one, the IPA service would send the input to it, and after that the IPA service would forward the output back to the client
15:14:24 ...the ExternalIPA/Provider Selection Service
15:14:58 ...the Provider Selection Service for now only knows about ChatGPT
15:19:37 ...IPA provider supports input from different modalities
15:20:48 debbie: should we standardize on defined modality types, e.g. "voice" vs. "speech"?
15:21:55 dirk: would like to talk about ProviderSelectionStrategy and how components are glued together
15:22:15 debbie: we can talk more in the next call
15:25:09 ...could we list the parts of the architecture that aren't implemented yet?
15:25:09 dirk: that might make sense
15:26:20 debbie: could there be a UML diagram?
15:26:20 dirk: there could be more diagrams
15:26:46 ...could link from code to specification
15:27:42 dirk: next time talk about the provider selection strategy and how to chain everything together
15:27:47 debbie: will try running
15:28:36 dirk: demo running with ChatGPT
15:30:28 gerard: which version of Mixtral do you use?
15:30:58 ...open source version
15:31:20 hugues: the next version will not be open source
15:31:41 gerard: the approach is a mixture of experts
15:32:27 dirk: what happens if we ask both at the same time?
15:32:50 ...would receive them both
15:33:05 gerard: could use an LLM to summarize
15:33:25 ...that's what Mixtral is using with the Mixture of Experts
15:34:34 rrsagent, format minutes
15:34:35 I have made the request to generate https://www.w3.org/2024/04/24-voiceinteraction-minutes.html ddahl
17:34:20 ddahl has left #voiceinteraction
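
A minimal C++ sketch of the modality pattern dirk walked through: modality components register with a ModalityManager, startInput/handleOutput cover the input and output sides, the modality type is a free string for extensibility, and a "take first" input listener forwards only the first input received. All class and method names here are simplified assumptions for illustration, not the actual w3cipa API.

    #include <iostream>
    #include <memory>
    #include <string>
    #include <utility>
    #include <vector>

    // Modality type is a free string so new modalities stay extensible.
    using ModalityType = std::string;

    class InputListener {
    public:
        virtual ~InputListener() = default;
        virtual void onInput(const ModalityType& type, const std::string& data) = 0;
    };

    // A component may serve input, output, or both.
    class ModalityComponent {
    public:
        explicit ModalityComponent(ModalityType type) : type_(std::move(type)) {}
        virtual ~ModalityComponent() = default;
        virtual void startInput(InputListener& listener) = 0;   // input side
        virtual void handleOutput(const std::string& data) = 0; // output side
        const ModalityType& type() const { return type_; }
    private:
        ModalityType type_;
    };

    // Forwards only the first input it receives; later inputs are ignored.
    class TakeFirstInputListener : public InputListener {
    public:
        void onInput(const ModalityType& type, const std::string& data) override {
            if (taken_) return;
            taken_ = true;
            std::cout << "forwarding first input (" << type << "): " << data << "\n";
        }
    private:
        bool taken_ = false;
    };

    // One instance that knows all registered modality components;
    // components can be added as you like.
    class ModalityManager {
    public:
        void addComponent(std::unique_ptr<ModalityComponent> component) {
            components_.push_back(std::move(component));
        }
        void startInput(InputListener& listener) {
            for (auto& c : components_) c->startInput(listener);
        }
    private:
        std::vector<std::unique_ptr<ModalityComponent>> components_;
    };

    // Text is the only modality in the reference implementation.
    class TextModality : public ModalityComponent {
    public:
        TextModality() : ModalityComponent("text") {}
        void startInput(InputListener& listener) override {
            listener.onInput(type(), "hello");
        }
        void handleOutput(const std::string& data) override {
            std::cout << "text output: " << data << "\n";
        }
    };

    int main() {
        ModalityManager manager;
        manager.addComponent(std::make_unique<TextModality>());
        TakeFirstInputListener listener;
        manager.startInput(listener);
    }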
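
A second hedged sketch, with assumed names rather than the actual w3cipa classes, of the provider selection idea: the selection service knows a set of IPA providers (currently only ChatGPT), a strategy decides which providers receive the input, and asking several at once yields multiple answers that a later stage, perhaps an LLM as gerard suggested, could summarize into one reply.

    #include <functional>
    #include <iostream>
    #include <string>
    #include <vector>

    struct IPAProvider {
        std::string name;
        std::function<std::string(const std::string&)> ask;
    };

    // For now the selection is trivial because only one provider is
    // registered; with several, every selected provider answers and the
    // caller receives them all.
    class ProviderSelectionService {
    public:
        void addProvider(IPAProvider provider) {
            providers_.push_back(std::move(provider));
        }
        std::vector<std::string> processInput(const std::string& input) {
            std::vector<std::string> answers;
            for (auto& p : providers_)  // ask every selected provider
                answers.push_back(p.name + ": " + p.ask(input));
            return answers;
        }
    private:
        std::vector<IPAProvider> providers_;
    };

    int main() {
        ProviderSelectionService service;
        service.addProvider({"ChatGPT", [](const std::string& question) {
            return "stub answer to \"" + question + "\"";
        }});
        for (const auto& answer : service.processInput("What is the weather?"))
            std::cout << answer << "\n";
    }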