Web API for Hybrid AI – 25 September 2024

Meeting minutes

<martin> Slides for the breakout session: https://kevinmoch.github.io/web-hybrid-ai/Templates/Overview.html

martin: my name is Martin
… my colleague from Huawei, Chunhui Mo, will present this breakout session for this new API

<kenji_baheux> can someone repost the link?

martin: I already shared the slides with you in the Zoom chat

https://kevinmoch.github.io/web-hybrid-ai/Templates/Overview.html

<kenji_baheux> appreciated!

chunhui: I'm Chunhui Mo
… the main topic is the new web API proposal to support hybrid AI development
… enable web developers to directly access on-device and cloud-based models
… and safely share user data across multiple apps when using these models
… an example is @@
… since the AI topic covers a broad range of areas
… I would like to clarify what is not in the scope

<martin> Repository with the proposal: kevinmoch/web-hybrid-ai

chunhui: 1. Designing a uniform JavaScript API for accessing browser-provided language models, which is currently being explored by Chrome's AI team
… known as the Prompt API
… 2. Issues faced by hybrid AI, already covered by other breakout sessions
… let's dive into Connection API
… This is an extension of the Prompt API
… both on-device and cloud-based
… now let's talk about the switch model
… switch between different apps
… similar to connect
… this method is also an extension of the prompt API

chunhui: now let's look at the sample code

[Chunhui shows the sample code]

chunhui: I will give you a brief review of the methods
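
The sample code shown in the session was not captured in these minutes. As a rough sketch only, connecting to an on-device and a cloud-based model and then switching between them might look like the following. `HybridSession`, `connect`, `switchModel`, the model names, and the endpoint are all hypothetical; none of them are taken from the proposal or from the Prompt API.

```typescript
// Hypothetical shapes; none of these names come from the proposal or the Prompt API.
type ModelLocation = "on-device" | "cloud";

interface ModelDescriptor {
  name: string;            // identifier chosen by the app
  location: ModelLocation;
  endpoint?: string;       // only meaningful for cloud models
}

class HybridSession {
  private models: ModelDescriptor[] = [];
  private activeIndex = -1;

  // Register a model; the first connected model becomes the active one.
  connect(model: ModelDescriptor): void {
    this.models.push(model);
    if (this.activeIndex === -1) this.activeIndex = 0;
  }

  // Switch the active model by name; returns false if it was never connected.
  switchModel(name: string): boolean {
    const index = this.models.map((m) => m.name).indexOf(name);
    if (index === -1) return false;
    this.activeIndex = index;
    return true;
  }

  get active(): ModelDescriptor | undefined {
    return this.activeIndex === -1 ? undefined : this.models[this.activeIndex];
  }
}

const session = new HybridSession();
session.connect({ name: "local-model", location: "on-device" });
session.connect({ name: "cloud-model", location: "cloud", endpoint: "https://example.invalid/v1" });
session.switchModel("cloud-model");
console.log(session.active?.location); // location of the currently active model
```

The sketch only tracks which model is active; prompting, streaming, and credentials are deliberately left out.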

chunhui: now I'll cover the Storage API

[Chunhui shows Storage API Sample Usage]

chunhui: 'entry' is any kind of text including the user's private data
… we store the info under the user's profile, in the travel category
… if we have a web app connecting to the on-device model
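
The Storage API sample usage was likewise not recorded. A minimal in-memory sketch of the idea described here (free-text entries filed under a category such as "travel") could look like this. `ProfileStore`, `insert`, `update`, and `byCategory` are assumed names for illustration, and the sample data is made up; the actual proposal may differ.

```typescript
// Hypothetical in-memory stand-in for the Storage API described in the session.
interface Entry {
  id: number;
  category: string; // e.g. "travel"
  text: string;     // free text, possibly containing private user data
}

class ProfileStore {
  private entries: Entry[] = [];
  private nextId = 1;

  // Add a new entry under a category and return its id.
  insert(category: string, text: string): number {
    const id = this.nextId++;
    this.entries.push({ id, category, text });
    return id;
  }

  // Replace the text of an existing entry; returns false if the id is unknown.
  update(id: number, text: string): boolean {
    for (const entry of this.entries) {
      if (entry.id === id) {
        entry.text = text;
        return true;
      }
    }
    return false;
  }

  // Return the text of every entry filed under the given category.
  byCategory(category: string): string[] {
    return this.entries.filter((e) => e.category === category).map((e) => e.text);
  }
}

const store = new ProfileStore();
const flightId = store.insert("travel", "Prefers morning flights");
store.insert("music", "Likes jazz");
store.update(flightId, "Prefers morning flights, aisle seat");
console.log(store.byCategory("travel"));
```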

[Chunhui shows Storage API Implementation References]

chunhui: let's see how we can combine the Connection and the Storage API
… imagine we have planned a trip for the upcoming holidays
… first we open a flight booking app
… and search for our destination
… this app uses a cloud-based AI model
… to recommend the best flight based on our needs
… next maybe open a hotel booking app to find a place to stay
… as you can see in this diagram
… the architecture for the flight and hotel apps
… can be broken down into the following steps
… assistants gather the user information/preference
… and send a request to the cloud-based model
… and retrieve the flight options
… once the flight information is returned
… the system uses the Storage API to convert the data
… stored into the database
… @@
… finally @@ will be sent to the cloud-based model
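
Parts of this walkthrough were not captured, so the following is only one possible reading of the flow: the first app stores what it learned, and the second app prepends that stored context to its own request before sending it to a model. `buildPrompt`, the prompt wording, and the sample data are assumptions made for illustration.

```typescript
// Hypothetical helper: combine previously stored context with a new request.
function buildPrompt(storedContext: string[], request: string): string {
  const context = storedContext.length > 0
    ? storedContext.map((line) => "- " + line).join("\n")
    : "(no stored context)";
  return "Known about the user:\n" + context + "\n\nRequest: " + request;
}

// Step 1: the flight app records what it learned (made-up data).
const sharedTravelContext: string[] = [];
sharedTravelContext.push("Flight booked, arriving in the evening");

// Step 2: the hotel app reuses that context when prompting its own model.
const hotelPrompt = buildPrompt(sharedTravelContext, "Suggest a hotel near the airport");
console.log(hotelPrompt);
```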

[Chunhui shows a demo]

chunhui: Considerations for Connection Strategy
… when we're talking about building apps for the hybrid AI
… there are several crucial elements that need to be considered
… 1. the connection time
… we need to think about how the system connects to different AI models, for instance
… if one model takes too long to respond
… we can skip it
… Custom Connection Strategy
… give developers flexibility
… we should offer a more dynamic and efficient solution
… Model Status Information
… knowing the status of each model is key
… is a model available?
… Load Balancing Across Models
… helps us automatically distribute requests
… version control
… Considerations for Storage Strategy
… @@
… now let's go a step further and look at Considerations for Native OS APIs
… need to ensure that the AI model works consistently no matter which device is being used
… by leveraging the OS
… reduce overhead that comes with browser-based storage
… allowing the system to store more data and handle complex operations, which is fairly useful
… when working with large AI models
… exploring the native OS integrations
… now we're moving into the discussion phase
… I'd love to hear your thoughts and feedback
… how we move forward
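
To make the connection-strategy considerations above more concrete (skipping models that respond too slowly, checking model status, and balancing load across models), here is a small sketch. The `ModelStatus` fields, the latency budget, and the round-robin policy are illustrative assumptions, not something presented in the session.

```typescript
// Hypothetical status record; field names are illustrative only.
interface ModelStatus {
  name: string;
  available: boolean;
  lastLatencyMs: number; // most recently observed response time
}

// Keep only models that are available and responded within the latency budget.
function eligibleModels(models: ModelStatus[], maxLatencyMs: number): ModelStatus[] {
  return models.filter((m) => m.available && m.lastLatencyMs <= maxLatencyMs);
}

// Simple round-robin balancer over whatever is eligible at call time.
class RoundRobin {
  private counter = 0;

  next(models: ModelStatus[], maxLatencyMs: number): ModelStatus | undefined {
    const eligible = eligibleModels(models, maxLatencyMs);
    if (eligible.length === 0) return undefined;
    const choice = eligible[this.counter % eligible.length];
    this.counter++;
    return choice;
  }
}

const statuses: ModelStatus[] = [
  { name: "on-device", available: true, lastLatencyMs: 120 },
  { name: "cloud-a", available: true, lastLatencyMs: 900 },  // too slow, skipped
  { name: "cloud-b", available: false, lastLatencyMs: 80 },  // unavailable, skipped
  { name: "cloud-c", available: true, lastLatencyMs: 300 },
];

const balancer = new RoundRobin();
const picks = [0, 1, 2].map(() => balancer.next(statuses, 500)?.name);
console.log(picks); // ["on-device", "cloud-c", "on-device"]
```

A custom strategy could replace `eligibleModels` with any other filter, for example one that prefers on-device models when the request contains private data.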

McCool: you spent a lot of time discussing how to enable sharing data between apps
… however, a big concern on the web is controlling the sharing of data between apps
… you don't want to give all apps access to all personal data

chunhui: we store the data in the @@

McCool: in one of our enterprise applications you can do access controls for accessing certain subsets of data
… having a mechanism for access control, where you can explicitly grant access to some data
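
One way to picture the access-control idea raised here is a grant table that maps each origin to the data categories the user has explicitly shared with it. The sketch below is hypothetical: `Grants`, `canRead`, and the example origins and categories are not part of the proposal.

```typescript
// Hypothetical grant table: origin -> categories the user has explicitly shared.
type Grants = Record<string, string[]>;

// An origin may read a category only if it was explicitly granted.
function canRead(grants: Grants, origin: string, category: string): boolean {
  const allowed = grants[origin];
  return allowed !== undefined && allowed.indexOf(category) !== -1;
}

const grants: Grants = {
  "https://flights.example": ["travel"],
  "https://hotels.example": ["travel", "loyalty"],
};

console.log(canRead(grants, "https://flights.example", "travel"));  // true
console.log(canRead(grants, "https://flights.example", "loyalty")); // false
console.log(canRead(grants, "https://unknown.example", "travel"));  // false
```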

kenji_baheux: i was wondering if you could share some of the details of the trigger of this exploration
… like the use cases

<kenji_baheux> concrete examples of use cases (internal) or from actual partners?

chunhui: the use case is 2 different apps that do not share data with each other directly, but store it in the same local database

reillyg: to understand the context here, this is a proposal that would be exposed by the browsers?

chunhui: yes

reillyg: the API will allow multiple sites to insert data into this database
… and they can prompt the AI to use the data @@
… the results from the AI are visible to the apps making the query
… if the AI uses information about travel, the app can see the response from the AI

chunhui: if the apps have permission, yes, they can
… they can benefit from the whole database

reillyg: granting the safe access to their own device model to this database?

chunhui: another impl to implement the insert/update entries
… for your reference

zolkis: I understand that you want to build this solution
… is it a mandatory requirement that you have booking and flight in separate apps?
… if you can do it in the same app, then you @@
… basically you want a web app

chunhui: let's say the Chrome prompt API
… web devs use window.ai to connect to an LLM
… it's just an extension of Chrome's prompt API
… the create method is the one in the Chrome API
… we just make it access multiple models

[chunhui shows sample usage]

chunhui: this API uses Chroma DB

kenji_baheux: you showed gemini-1.5-flash
… but you also have like an API key
… people would use this app
… I'm really curious about how it's gonna work
… gemini_api_key
… @@ connects to an on-device model
… how is it going to work?

kenji_baheux: if the code runs on the browser, then you expose the API key

chunhui: yes, the API key will be exposed
… if it's a cloud-based model
… by the browser

<tomayac> Also, would UAs be expected to keep track of API key usage, so the user could find an app that has used up all the tokens…

Domenic: Google is not interested in creating a cross site database to store the info since it breaks the web security model
… if you want to share data between sites you should explore the traditional ways of doing that
… for example using things like authentication
… should not use some sort of database since it enables cross site tracking

McCool: it clearly violates cross sharing restrictions
… we want each app to have its own storage
… regarding the API key, we could generate an API key per session
… limit to the particular browser instance
… where does the model run not enough to identify @@

McCool: how do you deal with the web storage model?

chunhui: I think authentication is a very traditional topic

McCool: different sites have completely different developers

chunhui: let's say one vendor develops 2 different apps
… they have the same user information
… so they can trust each other

<martin> ask kenj

kenji_baheux: afaiu Huawei has a miniapp ecosystem
… is it very relevant to that system?

chunhui: maybe miniapp is one scenario
… that can benefit from this API
… but we're trying to say that if the API is provided by the native OS, then whether it is a native app, web app, or miniapp
… they can share the local device
… for the same user

Domenic: i think the issue is that the security model for the web is very different from the security model for native apps or mini apps
… we cannot have the same security model
… i think the cross site storage thing doesn't work
… but the use case is important

zolkis: I wanted to ask if you considered @@ then you don't violate the browser security model
… on the edge side

<kenji_baheux> Domenic: on the other hand, the other part of the proposal (e.g. model switching) is worth exploring further (paraphrasing).

chunhui: yes you can implement this on the edge

zolkis: OK

reillyg: mechanism for loading multiple models, local and remote
… the local model concept is targeting OS built in models
… Gemini running on the device for example
… what about loading a model provided by the app
… like webnn and webgpu

chunhui: yes
… you can use webgpu or webnn to integrate a lot of models
… Gemini or others

reillyg: i'd be interested in understanding how that works

martin: the security model is a challenge
… this is something we need to solve
… is there any interest beyond?
… if we solve the problem of security

McCool: let's follow up in the web machine learning CG

<kenji_baheux> McCool: model switching and RAG aspects (iirc).
