Meeting minutes
<martin> Slides for the breakout session: https://
martin: my name is Martin
… my colleague from Huawei, Chunhui Mo, will present this breakout session for this new API
<kenji_baheux> can someone repost the link?
martin: I already shared the slides with you in the Zoom chat
https://
<kenji_baheux> appreciated!
chunhui: I'm Chunhui Mo
… the main topic is the new web API proposal to support hybrid AI development
… enable web developers to directly access on-device and cloud-based models
… and safely share user data across multiple apps when using these models
… an example is @@
… since the AI topic covers a broad range of areas
… I would like to clarify what is not in the scope
<martin> Repository with the proposal: kevinmoch/
chunhui: 1. Designing a uniform JavaScript API for accessing browser-provided language models, which is currently being explored by Chrome's AI team
… known as the Prompt API
… 2. Issues faced by hybrid AI, already covered by other breakout sessions
… let's dive into the Connection API
… This is an extension of the Prompt API
… both on-device and cloud-based
… now let's talk about the switch model
… switch between different apps
… similar to connect
… this method is also an extension of the Prompt API
chunhui: now let's look at the sample code
[Chunhui shows the sample code]
chunhui: I will give you a brief review of the methods
chunhui: now I'll cover the Storage API
[Chunhui shows Storage API Sample Usage]
chunhui: 'entry' is any kind of text including the user's private data
… we store the info under the user's profile, in the travel category
… if we have a web app connecting to the on-device model
[Chunhui shows Storage API Implementation References]
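[Illustrative sketch, not shown in the session: the minutes do not record the Storage API's actual shape, so the class and method names below (`ProfileStore`, `insert`, `query`) are hypothetical. It only shows the idea described above, free-form text entries filed under a category of the user's profile, using a plain in-memory list with an exact category match rather than any real database.]

```typescript
// Hypothetical sketch: none of these names come from the proposal.
interface StoredEntry {
  category: string; // e.g. "travel"
  text: string; // free-form text, possibly including private user data
}

class ProfileStore {
  private entries: StoredEntry[] = [];

  // File a piece of text under a category of the user's profile.
  insert(category: string, text: string): void {
    this.entries.push({ category, text });
  }

  // Return every text filed under the given category, in insertion order.
  query(category: string): string[] {
    return this.entries
      .filter((e) => e.category === category)
      .map((e) => e.text);
  }
}

const store = new ProfileStore();
store.insert("travel", "Flight to Tokyo on 2 May");
store.insert("travel", "Prefers window seats");
store.insert("health", "Allergic to peanuts");

// Only the two "travel" entries come back; the "health" entry is not included.
const travelEntries = store.query("travel");
```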
Chunhui: let's see how we can combine the Connection and the Storage API
… imagine we have planned a trip for the upcoming holidays
… first we open a flight booking app
… and search for our destination
… this app uses a cloud-based AI model
… to recommend the best flight based on our needs
… next maybe open a hotel booking app to find a place to stay
… as you can see in this diagram
… the architecture for the flight and hotel apps
… can be broken down into the following steps
… assistants gather the user information/preference
… and send a request to the cloud-based model
… and retrieve the flight options
… once the flight information is returned
… the system uses the Storage API to convert the data
… stored into the database
… @@
… finally @@ will be sent to the cloud-based model
[Chunhui shows a demo]
chunhui: Considerations for Connection Strategy
… when we're talking about building apps for the hybrid AI
… there are several crucial elements that need to be considered
… 1. the connection time
… we need to think about how the system connects to different AI models, for instance
… if one model takes too long to respond
… we can skip it
… Custom Connection Strategy
… give developers flexibility
… we should offer a more dynamic and efficient solution
… Model Status Information
… knowing the status of each model is key
… is a model available?
… Load Balancing Across Models
… helps us automatically distribute requests
… version control
… Considerations for Storage Strategy
… @@
… now let's go a step further by looking at Considerations for Native OS APIs
… need to ensure that the AI model works consistently no matter which device is being used
… by leveraging the OS
… reduce overhead that comes with browser-based storage
… allowing the system to store more data and handle complex operations, which is fairly useful
… when working with large AI models
… exploring the native OS integrations
… now we're moving into the discussion phase
… I'd love to hear your thoughts and feedback
… how we move forward
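[Illustrative sketch, not shown in the session: one possible reading of the connection-strategy considerations above (skip a model that takes too long, consult model status, distribute requests across models). All names here (`ModelStatus`, `usableModels`, `RoundRobin`) and the status fields are assumptions made for illustration, not part of the proposal. The sketch selects from a status snapshot rather than timing live requests.]

```typescript
// Hypothetical types: the proposal's real API shape is not recorded in these minutes.
interface ModelStatus {
  name: string;
  location: "on-device" | "cloud";
  available: boolean; // "is a model available?"
  typicalLatencyMs: number; // assumed: a recently observed response time
}

// Keep only models that are available and within the latency budget,
// preserving the caller's preference order.
function usableModels(models: ModelStatus[], maxLatencyMs: number): ModelStatus[] {
  return models.filter((m) => m.available && m.typicalLatencyMs <= maxLatencyMs);
}

// Round-robin load balancing over whichever models are currently usable.
class RoundRobin {
  private next = 0;
  private models: ModelStatus[];
  private maxLatencyMs: number;

  constructor(models: ModelStatus[], maxLatencyMs: number) {
    this.models = models;
    this.maxLatencyMs = maxLatencyMs;
  }

  // Returns undefined when no model is usable.
  pick(): ModelStatus | undefined {
    const usable = usableModels(this.models, this.maxLatencyMs);
    if (usable.length === 0) return undefined;
    const chosen = usable[this.next % usable.length];
    this.next += 1;
    return chosen;
  }
}

const balancer = new RoundRobin(
  [
    { name: "local-small", location: "on-device", available: true, typicalLatencyMs: 120 },
    { name: "cloud-large", location: "cloud", available: true, typicalLatencyMs: 900 },
    { name: "cloud-slow", location: "cloud", available: true, typicalLatencyMs: 5000 },
    { name: "local-offline", location: "on-device", available: false, typicalLatencyMs: 80 },
  ],
  1000, // latency budget in milliseconds
);

// "cloud-slow" exceeds the budget and "local-offline" is unavailable,
// so requests alternate between the two remaining models.
const picks = [balancer.pick(), balancer.pick(), balancer.pick()].map((m) =>
  m ? m.name : "none",
);
```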
McCool: you spent a lot of time discussing how to enable sharing data between apps
… however, a big concern on the web is controlling the sharing of data between apps
… you don't want to give all apps access to all personal data
chunhui: we store the data in the @@
McCool: in one of our enterprise applications you can do access controls for accessing certain subsets of data
… having a mechanism for access control, or one where you can explicitly grant access to some data
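[Illustrative sketch, not shown in the session: a minimal form of the access-control idea McCool describes, where an app can read a subset of data only after an explicit grant. `GrantTable` and the example origins are hypothetical. The sketch does not address the cross-site tracking concerns raised later in the discussion.]

```typescript
// Hypothetical sketch: per-origin grants over data categories.
class GrantTable {
  // origin -> categories that have been explicitly granted to it
  private grants = new Map<string, Set<string>>();

  // Record an explicit grant of one category to one origin.
  grant(origin: string, category: string): void {
    let granted = this.grants.get(origin);
    if (!granted) {
      granted = new Set<string>();
      this.grants.set(origin, granted);
    }
    granted.add(category);
  }

  // Default deny: an origin can read a category only if it was granted.
  canRead(origin: string, category: string): boolean {
    const granted = this.grants.get(origin);
    return granted !== undefined && granted.has(category);
  }
}

const table = new GrantTable();
table.grant("https://flights.example", "travel");
table.grant("https://hotels.example", "travel");

const flightsCanReadTravel = table.canRead("https://flights.example", "travel");
const flightsCanReadHealth = table.canRead("https://flights.example", "health");
const unknownCanReadTravel = table.canRead("https://tracker.example", "travel");
```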
kenji_baheux: i was wondering if you could share some of the details of the trigger of this exploration
… like the use cases
<kenji_baheux> concrete examples of use cases (internal) or from actual partners?
chunhui: the use case is 2 different apps that do not directly share the same data, but write into the same local database
reillyg: to understand the context here, this is a proposal that would be exposed by the browsers?
chunhui: yes
reillyg: the API will allow multiple sites to insert data into this database
… and they can prompt the AI to use the data @@
… the results from the AI are visible to the apps making the query
… if the AI uses information about travel, the app can see the response from the AI
chunhui: if the apps have permission, yes, they can
… they can benefit from the whole database
reillyg: granting the safe access to their own device model to this database?
chunhui: another impl to implement the insert/update entries
… for your reference
zolkis: I understand that you want to build this solution
… is it a mandatory requirement that you have booking and flight in separate apps?
… if you can do it in the same app, then you @@
… basically you want a web app
chunhui: let's say the Chrome prompt API
… web devs use window.ai to connect to an LLM
… it's just an extension of Chrome's Prompt API
… the create method is the one in the Chrome API
… we just make it access multiple models
[chunhui shows sample usage]
chunhui: this API uses Chroma DB
kenji_baheux: you showed gemini-1.5-flash
… but you also have like an API key
… people would use this app
… I'm really curious about how it's gonna work
… gemini_api_key
… @@ connects to an on-device model
… how is it going to work?
kenji_baheux: if the code runs in the browser, then you expose the API key
chunhui: yes, the API key will be exposed
… if it's a cloud-based model
… by the browser
<tomayac> Also, would UAs be expected to keep track of API key usage, so the user could find an app that has used up all the tokens…
Domenic: Google is not interested in creating a cross site database to store the info since it breaks the web security model
… if you want to share data between sites you should explore the traditional ways of doing that
… for example using things like authentication
… should not use some sort of database since it enables cross site tracking
McCool: it clearly violates cross sharing restrictions
… we want each app to have its own storage
… regarding the API key, we could generate an API key per session
… limit to the particular browser instance
… where does the model run not enough to identify @@
McCool: how do you deal with the web storage model?
chunhui: I think authentication is a very traditional topic
McCool: different sites have completely different developers
chunhui: let's say one vendor develops 2 different apps
… they have the same user information
… so they can trust each other
<martin> ask kenj
kenji_baheux: afaiu Huawei has a miniapp ecosystem
… is it very relevant to that system?
chunhui: maybe miniapp is one scenario
… that can benefit from this API
… but we're trying to say that if the API is provided by the native OS, whether it is a native app, web app, or miniapp
… they can share the local device
… the same one user
Domenic: i think the issue is that the security model for the web is very different from the security model for native apps or mini apps
… we cannot have the same security model
… i think the cross site storage thing doesn't work
… but the use case is important
zolkis: I wanted to ask if you considered @@ then you don't violate the browser security model
… on the edge side
<kenji_baheux> Domenic: on the other hand, the other part of the proposal (e.g. model switching) is worth exploring further (paraphrasing).
chunhui: yes you can implement this on the edge
zolkis: OK
reillyg: mechanism for loading multiple models, local and remote
… the local model concept is targeting OS built in models
… Gemini running on the device for example
… what about loading models provided by the app
… like webnn and webgpu
chunhui: yes
… you can use webgpu or webnn to integrate a lot of models
… Gemini or others
reillyg: i'd be interested in understanding how that works
martin: the security model is a challenge
… this is something we need to solve
… is there any interest beyond?
… if we solve the problem of security
McCool: let's follow up in the web machine learning CG
<kenji_baheux> McCool: model switching and RAG aspects (iirc).