W3C Workshop on Web and Machine Learning (17 Aug 2020)

Hi!

A talk for the W3C Workshop on Web and Machine Learning.

Projects

  • We work at ECC Games: AI for content design and physics-engine optimization for a racing game.
  • We develop livelossplot - a Python package for visualizing the training process in Jupyter Notebooks.
  • Piotr develops Quantum Game and quantum-tensors, a TypeScript quantum numerics engine.

Model training inspection: automation

Example per-epoch console log (the kind of summary livelossplot prints under its charts; reproduced by the sketch after the list below):

Epoch 23/100
log-loss
    loss                 (min:    0.091, max:    0.500, cur:    0.091)
    val_loss             (min:    0.105, max:    2.000, cur:    0.105)
  • BAD: Checking metrics in your console
  • OK-ISH: Saving metrics with TensorBoard (better, but no code reproducibility)
  • GOOD: Saving everything with Neptune.ai, Weights & Biases, MLflow or something similar (metrics, code, parameters)
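A minimal sketch producing that kind of log with livelossplot in a Jupyter Notebook; the metric values here are synthetic stand-ins, not real training results:

import math
from livelossplot import PlotLosses

plot_losses = PlotLosses()

for epoch in range(100):
    # synthetic curves standing in for real train/validation metrics
    plot_losses.update({
        'loss': 0.5 * math.exp(-0.1 * epoch) + 0.09,
        'val_loss': 2.0 * math.exp(-0.2 * epoch) + 0.10,
    })
    plot_losses.send()  # redraws the charts and prints min/max/cur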

Model training inspection: depth

  • BAD: Only metrics (log-loss, accuracy, IoU, etc.)
  • GOOD: Also examples (concrete class predictions, see below)
  • PERFECT: Also interactive examples!
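As one way to log such examples, a hedged sketch that saves a small grid of inputs and predicted class maps each epoch (function name, shapes and file names are illustrative, not our pipeline's actual code):

import numpy as np
import matplotlib.pyplot as plt

def save_prediction_examples(images, preds, epoch, n=4):
    # images: (N, H, W) inputs; preds: (N, H, W) integer class maps
    fig, axes = plt.subplots(2, n, figsize=(3 * n, 6))
    for i in range(n):
        axes[0, i].imshow(images[i], cmap='gray')
        axes[0, i].set_title('input')
        axes[1, i].imshow(preds[i])
        axes[1, i].set_title('prediction')
    fig.savefig(f'examples_epoch_{epoch:03d}.png')
    plt.close(fig)

# usage with random stand-in data
save_prediction_examples(np.random.rand(4, 64, 64),
                         np.random.randint(0, 5, size=(4, 64, 64)),
                         epoch=23)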

Our project

  • Turn high-res scans of race tracks into semantic maps
  • Too little data to use supervised image segmentation
  • We use a word2vec-like algorithm to extract similarities
  • We need to check whether the output contains relevant features

Race track (a drone scan) -> t-SNE of extracted features
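A sketch of the reduction step behind a picture like that, assuming scikit-learn's t-SNE and random stand-in features (the real features come from the word2vec-like model):

import numpy as np
from sklearn.manifold import TSNE

# stand-in embeddings; in the real pipeline, one row per map tile
features = np.random.randn(500, 64).astype(np.float32)

# project to 2D for visual inspection of the learned similarities
xy = TSNE(n_components=2, perplexity=30, init='pca').fit_transform(features)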

Our pipeline

  • Model training (PyTorch + Ignite)
  • Saving logs, models and predictions for each run (livelossplot + Neptune.ai)
  • Investigating promising models in D3.js (feature maps exported to JSON; see the sketch after this list)
  • Presenting in RMarkdown (with Distill for RMarkdown, or ioslides as here)
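A hedged sketch of that Python-to-D3.js hand-off; the file name and JSON layout are illustrative, not the project's actual schema:

import json
import numpy as np

# stand-in (rows, cols, feature_dim) map of per-tile feature vectors
features = np.random.randn(32, 32, 8)

# JSON has no array type beyond nested lists, so shape and values
# are stored explicitly (see the W3C notes at the end of this talk)
payload = {
    'shape': list(features.shape),
    'values': features.ravel().tolist(),
}
with open('featmap_example.json', 'w') as f:
    json.dump(payload, f)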

Satellite image example

  • GREEN: similar
  • RED: different (similarity computation sketched below)
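The coloring is driven by feature similarity; a minimal sketch of that computation, assuming cosine similarity as is typical for word2vec-style embeddings (the red-to-green mapping itself lives in the D3 script):

import numpy as np

def similarity_to(features, idx):
    # features: (n_tiles, dim) embeddings; returns each tile's cosine
    # similarity to tile idx, in [-1, 1] (rendered red..green in the UI)
    normed = features / np.linalg.norm(features, axis=1, keepdims=True)
    return normed @ normed[idx]

# usage with stand-in embeddings: similarity of every tile to tile 0
print(similarity_to(np.random.randn(100, 64), idx=0)[:5])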

SimCity example

Minesweeper example 1

Minesweeper example 2

Code

library(r2d3)  # render D3.js visualizations from R

# feature-similarity map exported from the Python pipeline
data <- jsonlite::read_json("featmap_GEAR_542_minesweeper.json")

r2d3(data,
     script = "../feature_sim.js",  # custom D3 visualization
     options = list(tileSize = 10, defaultOpacity = 0.5))

W3C

  • JSON is a wonderful standard for small data, but is awkward for arrays (see the sketch below)
  • Needed: framework-agnostic standards for sharing and processing numeric data
    • Crucial for deep learning (predictions, feature maps, models)
    • Crucial for quantum computing (states, operations), which also needs complex numbers
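To make the pain point concrete, a sketch of one common workaround (not a standard): packing the raw buffer as base64 with explicit shape and dtype, which plain nested-list JSON loses:

import base64
import json
import numpy as np

arr = np.random.randn(4, 4).astype(np.float32)

# plain JSON: nested lists are verbose and the dtype is lost
as_lists = json.dumps(arr.tolist())

# workaround: raw bytes plus metadata; complex dtypes (e.g. complex64
# for quantum states) pack the same way
packed = json.dumps({
    'dtype': str(arr.dtype),
    'shape': arr.shape,
    'data': base64.b64encode(arr.tobytes()).decode('ascii'),
})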

Thanks!