W3C Workshop on Web and Machine Learning

🌱 A virtual event with pre-recorded talks and interactive sessions

The Program Committee has identified the following speakers and presentations as input to the live discussions to be scheduled in September 2020.

We expect to publish the presentations recorded by the speakers in August 2020.

Opportunities and Challenges of Browser-Based Machine Learning

Goal: Determine the unique opportunities of browser-based ML and the obstacles hindering its adoption

Privacy-first approach to machine learning
Speaker
Philip Laszkowicz
Abstract
The presentation will discuss how developers should be building modern web apps and what is missing in the existing ecosystem to make privacy-first ML possible, including the challenges with WASI, modular web architecture, and localized analytics.
Real-time ML processing of media in-browser
Speaker
Bernard Aboba (Microsoft)
Abstract
The presentation will discuss efficient processing of raw video in machine learning, highlighting the need to minimize memory copies and enable integration with WebGPU.
ML hardware advancements: leveraging for performance while retaining portability
Speaker
Cormac Brick (Intel)
Cormac leads Edge Inference IP architecture at Intel with a focus on both silicon and software. Cormac joined Intel as part of the Movidius acquisition in 2016, where he led Machine Intelligence.
Abstract
There are many interesting innovations coming in hardware technology, including hardware acceleration, weight compression, weight sparsity, and low precision.
Opportunities and Challenges for TensorFlow.js and beyond
Speaker
Jason Mayes (Google)
Developer Advocate for TensorFlow.js
Abstract
This talk will give a brief overview of TensorFlow.js and how it helps developers build ML-powered applications, along with examples of work that pushes the boundaries of the web, and will discuss future directions for the web tech stack to overcome the barriers to ML on the web that the TF.js community has encountered.
Moving Deep Learning into Web Browser: How Far Can We Go?
Speaker
Yun Ma
Yun Ma is a postdoc researcher in the School of Software, Tsinghua University, China. He received his Ph.D. in June 2017 from Peking University. His research interests lie in mobile computing, Web systems, and services computing. He has published several papers in WWW and ACM Transactions on the Web. Recently he has focused on how to enable browsers to better support deep learning tasks.
Abstract
Recently, several JavaScript-based deep learning frameworks have emerged, making it possible to perform deep learning tasks directly in browsers. However, little is known about what and how well we can do with these frameworks for deep learning in browsers. In this talk, I’ll present our recent empirical study of deep learning in browsers. We survey the 7 most popular JavaScript-based deep learning frameworks, investigating to what extent deep learning tasks have been supported in browsers so far. Then we measure the performance of different frameworks when running different deep learning tasks. Finally, we examine the performance gap between deep learning in browsers and on native platforms by comparing the performance of TensorFlow.js and TensorFlow in Python. Our findings could help application developers, deep learning framework vendors and browser vendors to improve the efficiency of deep learning in browsers. The content of this talk was published at WWW 2019.

Web Platform Foundations for Machine Learning

Goal: Understand how machine learning fits into the Web technology stack

Web Platform: a 30,000-foot view / Web Platform and JS environment constraints
Speaker
Dominique Hazael-Massieux (W3C)
Dominique is part of the full-time technical staff employed by W3C to animate the Web standardization work. He is in particular responsible for the work on WebRTC, WebXR and Web & Networks, led the effort to start a WebTransport Working Group and is one of the organizers of the Web and Machine Learning workshop.
Abstract
Background talk on the specificities of the Web browser as a development platform.
Media processing hooks for the Web
Speaker
François Daoust (W3C)
François is part of the full-time technical staff employed by W3C, where he supervises the work related to media technologies.
Abstract
This talk will provide an overview of existing, planned or possible hooks for processing muxed and demuxed media (audio and video) in real time in Web applications, and rendering the results. It will also present high-level requirements for efficient media processing.
Access purpose-built ML hardware with Web Neural Network API
Speaker
Ningxin Hu (Intel)
Ningxin is a principal software engineer at Intel. Ningxin is co-editing the Web Neural Network (WebNN) API spec within W3C Machine Learning for the Web Community Group.
Abstract
The WebNN API is a new web standard proposal that allows web apps and frameworks to accelerate deep neural networks with dedicated on-device hardware such as GPUs, CPUs with deep learning extensions, or purpose-built AI accelerators. A prototype of the WebNN API will be used to demonstrate the near-native speed of deep neural network execution for object detection by accessing AI accelerators on phone and PC.
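As a rough illustration of the style of API described here, the sketch below builds and runs a tiny graph WebNN-style. This is a hedged assumption, not a stable interface: the names (`navigator.ml`, `createContext`, `MLGraphBuilder`, `compute`, the descriptor fields) follow the draft spec and have changed between revisions. The function falls back to plain JavaScript when no WebNN implementation is present.

```javascript
// Hedged sketch of WebNN-style graph building; API names follow the
// draft spec and may differ from any given prototype.
async function addTwoTensors(aData, bData) {
  // Outside a WebNN-capable browser there is nothing to accelerate
  // with, so fall back to plain JavaScript element-wise addition.
  if (typeof navigator === 'undefined' || !navigator.ml) {
    return Array.from(aData, (v, i) => v + bData[i]);
  }
  const context = await navigator.ml.createContext();
  const builder = new MLGraphBuilder(context);
  const desc = { dataType: 'float32', dimensions: [4] };
  const a = builder.input('a', desc);       // named graph inputs
  const b = builder.input('b', desc);
  const graph = await builder.build({ c: builder.add(a, b) });
  const outputs = { c: new Float32Array(4) };
  await context.compute(graph, { a: aData, b: bData }, outputs);
  return Array.from(outputs.c);
}
```

The point of the design is that the graph is declared once up front, so the implementation can hand the whole network to an accelerator instead of dispatching op by op.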
A proposed web standard to load and run ML models on the web
Speaker
Jonathan Bingham (Google)
Jonathan is a web product manager at Google.
Abstract
The Model Loader API is a new proposal for a web standard to make it easy to load and run ML models from JavaScript, taking advantage of available hardware acceleration. The API surface is similar to existing model serving APIs (like TensorFlow Serving, TensorRT, and MXNet Model Server), and it is complementary to the Web NN graph API proposal as well as lower level WebGL and WebGPU APIs.
Accelerated graphics and compute API for Machine Learning - DirectML
Speaker
Chai Chaoweeraprasit (Microsoft)
Chai leads development of the machine learning platform at Microsoft.
Abstract
DirectML is Microsoft's hardware-accelerated machine learning platform that powers popular frameworks such as TensorFlow and ONNX Runtime. It expands these frameworks' hardware footprint by enabling high-performance training and inference on any device with a DirectX-capable GPU.
Accelerate ML inference on mobile devices
Speaker
Miao Wang (Google)
Software Engineer for Android Neural Networks API
Abstract
The Android Neural Networks API (NNAPI) is an Android C API designed for running computationally intensive operations for machine learning on Android devices. NNAPI is designed to provide a base layer of functionality for higher-level machine learning frameworks, such as TensorFlow Lite and Caffe2, that build and train neural networks. The API is available on all Android devices running Android 8.1 (API level 27) or higher. Based on an app’s requirements and the hardware capabilities on an Android device, NNAPI can efficiently distribute the computation workload across available on-device processors, including dedicated neural network hardware (NPUs and TPUs), graphics processing units (GPUs), and digital signal processors (DSPs).
Heterogeneous parallel programming with open standards using oneAPI and Data Parallel C++
Speaker
Jeff Hammond (Intel)
Jeff Hammond is a Principal Engineer at Intel where he works on a wide range of high-performance computing topics, including parallel programming models, system architecture and open-source software. He has published more than 60 journal and conference papers on parallel computing, computational chemistry, and linear algebra software. Jeff received his PhD in Physical Chemistry from the University of Chicago.
Abstract
Diversity in computer architecture and the unceasing demand for application performance in data-intensive workloads are never-ending challenges for programmers. This talk will describe Intel’s oneAPI initiative, which is an open ecosystem for heterogeneous computing that supports high-performance data analytics, machine learning and other workloads. A key component of this is Data Parallel C++, which is based on C++17 and Khronos SYCL and supports direct programming of CPU, GPU and FPGA platforms. We will describe how oneAPI and Data Parallel C++ can be used to build high-performance applications for a range of devices.
Enabling Distributed DNNs for the Mobile Web Over Cloud, Edge and End Devices
Speakers
Yakun Huang & Xiuquan Qiao (Beijing University of Posts & Telecommunications)
Abstract
This talk introduces two deep learning technologies for the mobile web over cloud, edge and end devices. One is an adaptive DNN execution scheme, which partitions the model and performs within the mobile web the computation that can be done there, reducing the computing pressure on the edge cloud. The other is a lightweight collaborative DNN over cloud, edge and end devices, which provides a collaborative mechanism with the edge cloud for accuracy compensation.
Integrating cloud-based ML with WebTransport
Speaker
Bernard Aboba (Microsoft)
Abstract
This talk will discuss the role of WebTransport in transporting audio/video for machine learning in the cloud. WebTransport seems like it could be useful here, as an advance beyond what can be done with WebSockets or REST APIs.
Collaborative Learning
Speaker
Wolfgang Maaß (DFKI, German Research Center for Artificial Intelligence)
Professor at Saarland University and scientific director at DFKI
Abstract
The execution of data analysis services in a browser on devices has recently gained momentum, but the lack of computing resources on devices and data protection regulations impose strong constraints. In our talk we will present a browser-based collaborative learning approach for running data analysis services on peer-to-peer networks of devices. Our platform is developed in JavaScript and supports modularization of services, model training and usage on devices (TensorFlow.js), sensor communication (MQTT), and peer-to-peer communication (WebRTC) with role-based access control (OAuth 2.0).
WASI-NN: Neural Network for WebAssembly
Speakers
Mingqiu Sun & Andrew Brown (Intel)
Senior PE at Intel & software engineer at Intel
Abstract
Trained machine learning models are typically deployed on a variety of devices with different architectures and operating systems. WebAssembly provides an ideal portable form of deployment for those models. In this talk, we will introduce the WASI-NN initiative we have started in the WebAssembly System Interface (WASI) community, which would standardize the neural network system interface for WebAssembly programs.

Machine Learning Experiences on the Web: A Developer's Perspective

Goal: Authoring ML experiences on the Web; challenges and opportunities of reusing existing ML models on the Web; on-device training, known technical solutions, gaps

Fast client-side ML with TensorFlow.js
Speaker
Ann Yuan (Google)
Software Engineer for TensorFlow.js
Abstract
This talk will present how TensorFlow.js enables ML execution in the browser using web technologies such as WebGL for GPU acceleration and WebAssembly, and will cover technical design considerations.
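As a small hedged sketch of the backend story, the snippet below tries TensorFlow.js backends in order of preference using the public `tf.setBackend`/`tf.getBackend` API; it returns null when the `tf` global is absent (e.g. outside a page that has loaded the library), so it makes no claim about which backend any particular browser will pick.

```javascript
// Try TensorFlow.js backends from fastest to most portable:
// WebGL (GPU), then WASM, then the plain-JS CPU backend.
// Returns the name of the backend that initialized, or null if
// the tf global is unavailable or none could be set.
async function pickBackend() {
  if (typeof tf === 'undefined') return null; // tfjs not loaded
  for (const name of ['webgl', 'wasm', 'cpu']) {
    // tf.setBackend resolves to true when the backend initializes.
    if (await tf.setBackend(name)) return tf.getBackend();
  }
  return null;
}
```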
ONNX.js
Speaker
Emma Ning (Microsoft)
Emma Ning is a senior product manager in the AI Framework team under the Microsoft Cloud + AI Group, focusing on AI model operationalization and acceleration with ONNX/ONNX Runtime for open and interoperable AI. She has more than five years of product experience in search engines taking advantage of machine learning techniques, and spent more than three years exploring AI adoption among various businesses. She is passionate about bringing AI solutions to solve business problems as well as enhance product experiences.
Abstract
ONNX.js is a JavaScript library for running ONNX models in browsers and on Node.js, on both CPU and GPU. Thanks to ONNX interoperability, it is also compatible with models from TensorFlow and PyTorch. For running on CPU, ONNX.js adopts WebAssembly to execute the model at near-native speed and utilizes Web Workers to provide a "multi-threaded" environment, achieving very promising performance gains. For running on GPU, ONNX.js takes advantage of WebGL, a popular standard for accessing GPU capabilities. By reducing data transfer between CPU and GPU as well as GPU processing cycles, ONNX.js pushes performance further.
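A minimal sketch of that inference flow, following the ONNX.js README; `./model.onnx` is a placeholder path, and the page is assumed to have loaded `onnx.min.js` so that the `onnx` global exists (the function returns null otherwise).

```javascript
// Run an ONNX model with the WASM backend using the documented
// ONNX.js API: InferenceSession, loadModel, Tensor, run.
async function classify(inputData, dims) {
  if (typeof onnx === 'undefined') return null; // library not loaded
  const session = new onnx.InferenceSession({ backendHint: 'wasm' });
  await session.loadModel('./model.onnx'); // placeholder model path
  const input = new onnx.Tensor(inputData, 'float32', dims);
  const outputMap = await session.run([input]);
  // Return the data of the first (often only) output tensor.
  return outputMap.values().next().value.data;
}
```

Swapping `backendHint` to `'webgl'` selects the GPU path described above without changing the rest of the calling code.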
Paddle.js - Machine Learning for the Web
Speaker
Ping Wu (Baidu)
Architect at Baidu, lead of Paddle.js
Abstract
Paddle.js is a high-performance JavaScript DL framework for diverse web runtimes that helps build a PaddlePaddle ecosystem with the web community. This talk will introduce Paddle.js's design principles, implementation, usage scenarios, and future work the project would like to explore.
Machine Learning in Web Architecture
Speaker
Sangwhan Moon
Machine Learning on the Web for content filtering applications
Speaker
Oleksandr Paraska (eyeo GmbH)
eyeo GmbH is the company behind Adblock Plus
Abstract
eyeo GmbH has recently deployed TensorFlow.js into their product for better ad-blocking functionality and has identified gaps in what the WebNN draft covers, e.g. using the DOM as input data, or primitives needed for Graph Convolutional Networks. The talk will present the relevant use case and give indications on how it can best be supported by the new standard.
Exploring unsupervised image segmentation results
Speaker
Piotr Migdal
Abstract
This talk will present the usage of web-based tools to interactively explore machine learning models, with the example of an interactive D3.js-based visualization to see the results of unsupervised image segmentation.
Mobile-first web-based Machine Learning
Speaker
Josh Meyer (Artie, Inc.)
Lead Scientist at Artie, Inc. and Machine Learning Fellow at Mozilla
Abstract
This talk is an overview of some of Artie's machine learning tech stack, which is web-based and mobile-first. It will discuss the peculiarities of working with voice, text, and images originating from a user's phone while running an application in the browser, and will include discussions about balancing user preferences with privacy, latency, and performance.

Machine Learning Experiences on the Web: A User's Perspective

Goal: Web & ML for all: education, learning, accessibility, cross-industry experiences, cross-disciplinary ML: music, art, and media meet ML; Share learnings and best practices across industries

We Count: Fair Treatment, Disability and Machine Learning
Speaker
Dr. Jutta Treviranus, Director & Professor, Inclusive Design Research Centre, OCAD University
Abstract

The risks of AI Bias have recently received attention in public discourse. Numerous stories of the automation and amplification of existing discrimination and inequity are emerging, as more and more critical decisions and functions are handed over to machine learning systems. There is a growing movement to tackle non-representative data and to prevent the introduction of human biases into machine learning algorithms.

However, these efforts are not addressing a fundamental characteristic of data driven decisions that presents significant risk if you have a disability. Even if there is full proportional representation and even if all human bias is removed from AI systems, the systems will favour the majority and dominant patterns. This has implications for individuals and groups that are outliers, small minorities or highly heterogeneous. The only common characteristic of disability is sufficient difference from the average such that most systems are a misfit and present a barrier. Machine learning requires large data sets. Many people with disabilities represent a data set of one. Decisions based on population data will decide against small minorities and for the majority. The further you are from average the harder it will be to train machine learning systems to serve your needs. To add insult to injury, if you are an outlier and highly unique, privacy protections won’t work for you and you will be most vulnerable to data abuse and misuse.

This presentation will:

  • outline the risks and opportunities presented by machine learning systems;
  • address strategies to mitigate the risks; and
  • discuss steps needed to support decisions that do not discriminate against outliers and small minorities.

The benefits for innovation and the well-being of society as a whole will also be discussed.

Bias-Free Approach to Machine Learning
Speaker
John Rochford (Director, INDEX Program, Eunice Kennedy Shriver Center, University of Massachusetts Medical School)
Abstract
Biased training data produces untrustworthy, unfair, useless results. Such results include:
  • predicting that Black prisoners are the most likely recidivists; and
  • an autonomous car's ML models killing a wheelchair user in a street crosswalk.

Training data must include representation of people with disabilities, all races, all ethnicities, all genders, etc. Creation of training data must include those populations. There are open-source and commercial toolkits and APIs to facilitate bias mitigation.

John is an expert in this area focused on AI fairness and empowerment for people with disabilities and is a member of the Machine Learning for the Web Community Group.

Interactive ML - Powered Music Applications on the Web
Speaker
Tero Parviainen (Counterpoint)
Tero Parviainen is a software developer in music, media, and the arts. As a co-founder of creative technology studio Counterpoint, he's recently built installations for The Barbican Centre, Somerset House, The Helsinki Festival, The Dallas Museum of Art, and various corners of the web. He also contributes at Wavepaths, building generative music systems for psychedelic therapy.
Abstract
This talk will present a few projects Counterpoint has built with TensorFlow.js and Magenta.js over the past couple of years. Ranging from experimental musical instruments to interactive artworks, they've really stretched what can be done in the browser context. It will focus on the special considerations needed in music & audio applications, the relationship between ML models and Web Audio, and the limitations encountered while combining the two.
Wreck a Nice Beach in the Browser: Getting the Browser to Recognize Speech
Speaker
Kelly Davis (Mozilla)
Manager of the machine learning group at Mozilla. Kelly's work at Mozilla includes Deep Speech (an open speech recognition system), Common Voice (a crowdsourced tool for creating open speech corpora), Mozilla's TTS (an open source speech synthesis system), Snakepit (an open source ML job scheduler), as well as ML research and many other projects.
Abstract
In narrow domains, ML-based speech recognition systems have recently begun to approach human-like levels of performance. However, browser-based support for such systems is spotty or lacking. This talk will be concerned with the industry's ongoing work to bring ML-based speech recognition into the browser. It will cover a wide spectrum of topics: the struggle to standardize a browser-based API, the privacy questions such an API exposes, and the technical details of embedding (or not) an ML-based system into the browser, along with many other topics.
Challenges in creating a great user experience with natural language processing models in a B2B SaaS product on the web
Speaker
Ryuichi Tanimoto (Stockmark)
web developer/system architect at Stockmark Inc.
Abstract
Stockmark develops a B2B SaaS product based on machine learning and natural language processing technology to provide a search and analysis UX for business news on the web, supporting market research and competitive analysis in business settings. This talk will share Stockmark's experience developing in-browser machine learning processes with fastText or BERT models that suggest certain news, phrases, or quantitative information from the text of searched news in response to the user's actions on the web page.
Privacy-focussed machine translation in Firefox
Speaker
Nikolay Bogoychev (University of Edinburgh)
Postdoc researcher at the University of Edinburgh
Abstract
In recent years, machine translation has been widely adopted by end users, making online content in foreign languages more accessible than ever. However, machine translation has always been treated as a computationally heavy problem and as such is usually delivered to the end user via online services such as Google Translate, which may not be appropriate for sensitive content. We present a privacy-focussed machine translation system that runs locally on the user's machine and is accessible through a Firefox browser extension. The translation models used are just 16MB and translation speed is high enough for a seamless user experience even on laptops from 2012.
A virtual character web meeting with expression enhancement powered by machine learning
Speaker
Zelun Chen (Netease)
Front-end and Client Development Engineer of Netease
Abstract
This talk will cover the use of machine learning to enhance participants' expressions in a virtual character web meeting, and highlight the problems of using WebAssembly to run AI models in the browser.
RNNoise, Neural Speech Enhancement, and the Browser
Speaker
Jean-Marc Valin
Jean-Marc Valin has previously contributed to the Opus and AV1 codecs. He is employed by Amazon, but is giving this talk as an individual.
Abstract
This talk presents RNNoise, a small and fast real-time noise suppression algorithm that combines classical signal processing with deep learning. We will discuss the algorithm and how the browser can be improved to make RNNoise and other neural speech enhancement algorithms more efficient.
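The core idea, a neural network predicting per-band gains that classical DSP then applies to the spectrum, can be illustrated with a toy gain-application step. This is a simplified sketch: real RNNoise uses 22 Bark-scale bands and a recurrent network to produce the gains, neither of which is modeled here.

```javascript
// Toy sketch of the RNNoise gain-application step: multiply each
// frequency band of a magnitude spectrum by a gain in [0, 1],
// attenuating bands the (here omitted) network judged to be noise.
// bandEdges[b]..bandEdges[b+1] are the spectrum bins of band b.
function applyBandGains(spectrum, bandEdges, gains) {
  const out = spectrum.slice();
  for (let b = 0; b < gains.length; b++) {
    for (let i = bandEdges[b]; i < bandEdges[b + 1]; i++) {
      out[i] = spectrum[i] * gains[b];
    }
  }
  return out;
}
```

Because only a handful of per-band gains are predicted per frame rather than a full spectrum, the network stays small and fast enough for real-time use, which is what makes the approach attractive for the browser.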
Empowering Musicians and Artists using Machine Learning to Build Their Own Tools in the Browser
Speaker
Louis McCallum (Post Doc at the Embodied AudioVisual Interaction Group, Goldsmiths, University of London)
Louis is an experienced software developer, researcher, artist and musician. Currently, he holds a postdoctoral position at the Embodied AudioVisual Interaction Group, Goldsmiths, University of London, where he is also an Associate Lecturer. He is also lead developer on the MIMIC platform and the accompanying Learner.js and MaxiInstrument.js libraries.
Abstract
Over the past 2 years, as part of the RCUK AHRC-funded MIMIC project, we have provided platforms and libraries for musicians and artists to use, perform and collaborate online using machine learning. Although machine learning has a lot to offer these communities, their skill sets and requirements often diverge from more conventional machine learning use cases. For example, requirements for dynamic provision of data and on-the-fly training in the browser raise challenges with performance, connectivity and storage. We seek to address the non-trivial challenges of connecting inputs from a variety of sources, running potentially computationally expensive feature extractors alongside lightweight machine learning models, and generating audio and visual output, in real time, without interference. Whilst technologies like AudioWorklets address this to some extent, there remain issues with implementation, documentation and adoption (currently limited to Chrome). For example, issues with garbage collection (created by the worker-thread messaging system) caused wide-scale disruption to many developers using AudioWorklets and were only addressed by a ring-buffer solution that developers must integrate outside of the core API. We are also keen to ensure the WebGPU API takes real-time media into consideration as it is introduced. Our talk will cover both the users' perspectives as uncovered by our user-centred research and a developer's perspective from the technical challenges we have faced developing tools to meet the needs of these users in both creative and educational settings.