W3C Workshop on Web and Machine Learning

Pipcook, a front-end oriented DL framework - by Wenhe Eric Li (Alibaba)





Hey guys, this is Wenhe, and you can also call me Eric.

I'm from the Alibaba F2E team, and today I'm going to present Pipcook, a JavaScript library for artificial intelligence, deep learning and machine learning.

So, let's first look at the agenda.

So, we will basically introduce the background, what Pipcook is, the design philosophy of Pipcook, and some of its use cases and capabilities.

And finally we will look into the future to see some future plans of Pipcook and what we can combine with web standards.

Let's look at the background first.

We know that, when we talk about software engineering, we are talking about maybe graphics, interaction, compilers, virtual machines, and so on, and machine learning.

And we know that a software engineer can move from one area to another.

Let's take front-end engineering as a specific example.

With the help of Node.js, a front-end engineer can also act as a back-end engineer.

So, we have this kind of question.

Is there a library or framework that helps a front-end engineer make such a transformation and become a machine learning engineer?

And with this consideration, we made Pipcook, a framework that helps a front-end engineer, like us, become a machine learning engineer.

Through this kind of transformation process, a front-end engineer will learn how to define a model, how to train and evaluate a model, and finally how to deploy a model.

So, let's look closely at what Pipcook is.

So, Pipcook is a front-end orientated machine learning framework.

We designed a concept called pipeline, and it helps developers define a deep learning model quickly.

And Pipcook provides two kinds of user interface.

One is Pipcook tools, which are CLI-based interactions.

And another is Pipboard.

It's a built-in GUI management tool that allows users to perform pipeline operations conveniently.

So, let's think about the design philosophy in terms of the architecture and how we design it.

So, first of all, we define a pipeline as a function, and a pipeline can take a set of plugins as its parameters.

And the output of pipeline is essentially a model.
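The idea of a pipeline as a function from plugins to a model can be sketched roughly as follows. This is only an illustration under stated assumptions: the names `Plugins`, `dataCollect`, `modelTrain` and `pipeline` are hypothetical and are not taken from the Pipcook API, and the stages are synchronous to keep the sketch short.

```typescript
// Hypothetical names throughout; this is not the Pipcook API.
type Sample = { x: number; y: number };

interface Model {
  predict(x: number): number;
}

// Each plugin is a plain function that handles one stage.
interface Plugins {
  dataCollect: () => Sample[];
  modelTrain: (data: Sample[]) => Model;
}

// The pipeline is itself a function: plugins in, model out.
function pipeline(plugins: Plugins): Model {
  const data = plugins.dataCollect();
  return plugins.modelTrain(data);
}

// Toy plugins: collect points on y = 2x, then fit a slope through the origin.
const model = pipeline({
  dataCollect: () => [1, 2, 3].map((x) => ({ x, y: 2 * x })),
  modelTrain: (data) => {
    const sxy = data.reduce((s, d) => s + d.x * d.y, 0);
    const sxx = data.reduce((s, d) => s + d.x * d.x, 0);
    const slope = sxy / sxx;
    return { predict: (x) => slope * x };
  },
});

console.log(model.predict(4)); // 8
```

Swapping one plugin for another changes the resulting model without touching the `pipeline` function itself, which is the point of treating plugins as parameters.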

And let's look at the big picture of how we process the data and get the model.

So, at the very beginning, we have the data collect node, and right now we support two kinds of data source.

One is visual data in the Pascal VOC format, the other is data in the CSV format.

And we pass the collected data to our data access node, and basically data access tries to convert the different data into a unified instance.

And then we put the data access output into the dataset process.

And dataset process runs processing over the whole dataset.

And after that, for each specific data sample, we will run the data process.

And after this fairly long data processing stage, we will get a UniDataset.

And after we get the UniDataset, we can define a model in the model define plugin, and with the UniDataset and the model defined, we can train the model and evaluate the model.

And if we get a really satisfying result, we can deploy the model.
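The order of the stages described above might look roughly like this in code. It is a minimal sketch, not the real Pipcook plugin interface: the function names mirror the node names from the talk, the `Row` and `UniDataset` shapes are assumptions, the data is a made-up inline CSV, and a fixed threshold stands in for a defined and trained model.

```typescript
// Hypothetical stage functions named after the talk's nodes; not the real Pipcook plugin API.
type Row = { features: number[]; label: number };
type UniDataset = { rows: Row[] };

// Data collect: fetch raw data (here an inline CSV: two pixel values, then a label).
const dataCollect = (): string => "0,0,0\n255,255,1\n255,255,1\n51,0,0";

// Data access: convert the source format into one common in-memory shape.
const dataAccess = (csv: string): UniDataset => ({
  rows: csv.split("\n").map((line) => {
    const cells = line.split(",").map(Number);
    return { features: cells.slice(0, -1), label: cells[cells.length - 1] };
  }),
});

// Dataset process: an operation over the whole dataset (here: drop exact duplicates).
const datasetProcess = (ds: UniDataset): UniDataset => {
  const seen = new Set<string>();
  return {
    rows: ds.rows.filter((r) => {
      const key = JSON.stringify(r);
      if (seen.has(key)) return false;
      seen.add(key);
      return true;
    }),
  };
};

// Data process: an operation applied to each sample independently (here: scale to [0, 1]).
const dataProcess = (row: Row): Row => ({
  ...row,
  features: row.features.map((v) => v / 255),
});

const uni: UniDataset = {
  rows: datasetProcess(dataAccess(dataCollect())).rows.map(dataProcess),
};

// A fixed threshold on the mean feature stands in for the defined and trained model;
// evaluation measures its accuracy on the UniDataset.
const mean = (xs: number[]): number => xs.reduce((a, b) => a + b, 0) / xs.length;
const predict = (row: Row): number => (mean(row.features) > 0.5 ? 1 : 0);
const accuracy = uni.rows.filter((r) => predict(r) === r.label).length / uni.rows.length;

console.log(uni.rows.length, accuracy); // 3 1
```

The split between `datasetProcess` and `dataProcess` follows the distinction made in the talk: the former sees the whole dataset at once, the latter sees one sample at a time.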

So, this screenshot is just about a simple pipeline.

It's about collecting MNIST data, processing it, and doing MNIST classification in TensorFlow.

As you can see, in no more than 40 lines, you can define the whole process of collecting your data and training your model.

And, after you have the pipeline, you can run it with a single command.

Pipcook will run the pipeline from wherever the JSON file is located.

And after that, you will get the output, which basically contains four parts.

So, first there are the logs.

The logs contain everything that comes out during the training and evaluation process.

And the model, just as the name suggests, is the trained model.

Then there is index.js. If you run index.js, you will start a service that hosts the model, and you can pipe in whatever the input is and get the prediction result.

And the metadata and package.json are, as their names suggest, just about the metadata and the package dependencies.

And in terms of how to run it, just cd into the directory, run npm install to install the essential packages, and run the node command.

You can then start the service, pipe the data in, and get the result.

So, we also introduced Pipboard. As you can see, Pipboard is a GUI tool.

It helps you to manage and change the jobs and pipelines easily.

And, yeah, let's talk about use cases and capabilities here.

One use case in our internal community combines HTML component detection and segmentation, icon classification, and layout generation.

And we made a product called imgcook.

And the essential feature of the product is that we can feed in a screenshot and get code output directly.

And here in the demo, we feed in a screenshot of a form table.

And after a few seconds of processing, you can see that we get the code directly from the screenshot.


So, what can Pipcook do right now?

Pipcook allows you to train models for the following tasks: text classification, text generation, image classification, object detection, segmentation, and style transfer, with different kinds of sub-models.

And you know that Pipcook is an open source project.

So, this list will get longer and longer as the community keeps contributing to it.

And finally, we want to point out another thing: we made a library called BOA, B-O-A, which bridges the machine learning ecosystem in Python to JavaScript directly.

That means, as a JavaScript developer, you don't need to learn about Python syntax, or don't need to worry about Python syntax.

You can just use the Python libraries with JavaScript syntax, and that is amazing too.

And finally, we will introduce the future plans and some web standards that we are concerned about.

So, as you can see, Pipcook is a training-orientated library or framework.

So, we are also concerned with browser training, because we want to move some training jobs to the browser.

So, the first thing that comes to our mind is the WebNN API. And if we can have a more generic WebNN API, the possibility of shifting some training to the browser gets larger.

Another thing is training and inference performance, which determines whether we can do this kind of browser training or browser inference at a business level.

And another thing is the model format. We are wondering: is there a web-friendly model format?

Is it something like the TensorFlow.js format?

Is it something like the h5/pb format?

Or can we combine them to get the advantages of each?

And finally, we are thinking about model storage.

As you know, a deep learning model is essentially a graph with some weights.

So maybe we can come up with something that we call a neural-network-orientated database.

Basically, it's a graph orientated database that stores the information in a graph format.

And in this way, we can definitely reduce the serialization overhead.

We are just trying to put deep learning models into IndexedDB or some other regular database.
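One way to picture "a graph with some weights", and the difference between serializing a whole model and keeping it node by node, is the sketch below. Everything here is an assumption made for illustration: the `LayerNode` shape is hypothetical, and an in-memory `Map` merely stands in for a graph-orientated store; it is not IndexedDB and not a design proposed in the talk.

```typescript
// Hypothetical data model; an in-memory Map stands in for a graph-orientated store.
interface LayerNode {
  id: string;
  op: string;            // e.g. "dense", "relu"
  inputs: string[];      // ids of upstream nodes (the graph edges)
  weights: Float32Array; // empty for ops without parameters
}

const graph: LayerNode[] = [
  { id: "input", op: "input", inputs: [], weights: new Float32Array(0) },
  { id: "dense1", op: "dense", inputs: ["input"], weights: new Float32Array([0.5, -1.25, 2]) },
  { id: "relu1", op: "relu", inputs: ["dense1"], weights: new Float32Array(0) },
];

// Option A: serialize the whole model into one JSON string.
// Typed arrays must first be copied into plain number arrays.
const asJson = JSON.stringify(
  graph.map((n) => ({ id: n.id, op: n.op, inputs: n.inputs, weights: Array.from(n.weights) }))
);

// Option B: keep one record per node, keyed by id, with weights left as typed arrays.
const store = new Map<string, LayerNode>();
graph.forEach((node) => store.set(node.id, node));

// Reading one node back does not require parsing the rest of the graph.
const dense = store.get("dense1");
if (!dense) throw new Error("node not found");

console.log(store.size, dense.inputs[0], dense.weights.length); // 3 input 3
```

In option A, every read or write goes through a conversion of the entire model; in option B, the graph structure is the storage structure, which is the kind of overhead reduction the talk is hinting at.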

And, that's it.

And this is our presentation; thanks for listening. If you still have any questions about Pipcook, or if you are interested in Pipcook, don't hesitate to hit us up and join our community.

Any suggestions, discussions and collaborations are really welcome.

Thank you.



Thanks to Futurice for sponsoring the workshop!


Video hosted by WebCastor on their StreamFizz platform.