Meeting minutes
Scribe: Sebastian
<McCool showing agenda>
Problem Statement ... Issues and Constraints ... Discussions ... Next steps
Problem statements
McCool: AI models ...
… can be very large
… will often be shared
… are often updated
McCool: There is also the current same-origin storage partitioning policy
… this avoids cross-site privacy issues
… this works for images (not shared) and for software libraries (often shared)
Use Cases for Large Models
McCool: Language translation, ...
… meeting captions
… background removal
… video creation and editing
… written language recognition
… personal assistant
<kenji_baheux> who is speaking?
XXX: Shall we specify a shared repository which only has my personal information?
@@@
Why run AI?
McCool: There are pros such as lower latency and offline use.
… cons are size limitations, download time, and storage costs
Model Size vs Download Time
McCool: Average home network speeds are between 45 and 216 Mbps
… downloading Phi-3-mini would take, e.g., 22 minutes at the baseline speed
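The figure above follows from simple arithmetic. A minimal sketch (not from the minutes; the 7.5 GB model size is an assumption chosen to match the quoted 22-minute figure at the 45 Mbps baseline):

```python
def download_minutes(size_gb: float, mbps: float) -> float:
    """Minutes to download size_gb gigabytes at mbps megabits per second."""
    bits = size_gb * 8 * 1000**3      # decimal GB -> bits
    return bits / (mbps * 1_000_000) / 60

# A hypothetical 7.5 GB model at the 45 Mbps baseline speed:
print(round(download_minutes(7.5, 45), 1))   # prints 22.2
```

At the 216 Mbps upper end of the quoted range, the same download would take under five minutes, which is why the minutes frame download time as a baseline-network problem.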
Existing APIs and Experiments
McCool: For the same-origin there exists HTTP Cache, ...
… Cache API
… IndexedDB API,
… Origin Private File System API
McCool: For the cross-origin there exists the File System Access API
Caching Desired Properties
McCool: we need to reduce latency, ...
… Bandwidth
… Storage
… and preserve privacy
Security and Privacy Considerations
McCool: browsers implement only per-origin local caches
… there is a cross-site privacy risk based on cache timing analysis
… per-origin caches are tolerable for “typical” (non-AI) web resources, but AI models are large and potentially shared
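The timing-analysis risk mentioned above can be illustrated with a toy simulation (an assumption for illustration; the URL and timings are hypothetical, not from the minutes): with a cache shared across origins, an attacker site can time a fetch and infer from a fast response that some other site already downloaded the resource.

```python
# Illustrative timings: a cache hit returns quickly, a miss requires
# a full download. The gap is what a timing attack observes.
HIT_MS, MISS_MS = 5, 800

def observed_ms(shared_cache: set[str], url: str) -> int:
    """Response time an attacker page would measure for a fetch of url."""
    return HIT_MS if url in shared_cache else MISS_MS

# Hypothetical model URL, warmed into a shared cache by site A:
shared_cache = {"https://models.example/phi-3"}

# Site B measures a fast load and infers the user visited site A:
assert observed_ms(shared_cache, "https://models.example/phi-3") == HIT_MS
assert observed_ms(set(), "https://models.example/phi-3") == MISS_MS
```

Per-origin partitioning removes this channel at the cost of storing one copy of the model per origin, which is the tension the minutes describe.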
Issue Starter Pack
McCool: Here are first issues for discussion:
… background model download and compilation,
… model naming and versioning
… allowing for model substitution when useful
… common interface for downloadable and “platform” models
… storage deduplication
… model representation independence
… API independence
… browser independence
… offline usage, including interaction with PWAs
… cache transparency
David: Seems like a big issue. Are you interested in working on specific APIs?
<Zakim> dsinger, you wanted to mention similar problems
Alternatives
McCool: One option is to define model-aware caches
… use 'fake misses' to avoid redundant downloads
… identify cache items by content-dependent hashes
McCool: the idea is that model caches would behave as if they were per-origin caches
McCool: there is another alternative: auto-expediting common models
… the more common a model is, the less of a tracking risk it is
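The "content-dependent hashes" idea above can be sketched as content addressing (an assumption about the mechanism, not a spec): a cache entry is keyed by a hash of the model bytes, so the same model fetched by two different origins maps to a single stored copy.

```python
import hashlib

def cache_key(model_bytes: bytes) -> str:
    """Content-addressed key: byte-identical models share one cache entry."""
    return hashlib.sha256(model_bytes).hexdigest()

# The same weights fetched from two different origins deduplicate:
model_from_site_a = b"model weights..."
model_from_site_b = b"model weights..."
assert cache_key(model_from_site_a) == cache_key(model_from_site_b)
```

Combined with the "fake misses" above (behaving as if the model still had to be downloaded), such a cache could deduplicate storage while presenting per-origin timing behavior.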
<kenji_baheux> echoing Erik's point on finding out if there is an actual problem; want to share what I've heard so far from partners. Enterprise / Edu customers would like to use a custom LLM that speaks their users' language / lingo across different origins (typically internal websites and/or popular 3p solutions to Enterprise/Edu needs). At the other extreme, some
<kenji_baheux> partners want a custom LLM that speaks to their community, but they wouldn't share it with other sites. Likely it would be running on the server, as most of their users wouldn't necessarily want the big download.
<ErikAnderson> It's a proposal supported only by Chrome and Edge, but Related Website Sets is an example of a technology that may be applicable here _if_ we hear from customers that it's a common use case to share a company-specific model across multiple top-level eTLD+1s.
adjourn