W3C Developer Meetup 2019: Lin Clark, Mozilla, on WebAssembly Interface Types

WebAssembly

WebAssembly (WASM) is a pretty big deal for the Web - it provides a compact, fast-to-parse, fast-to-run binary byte code for the Web, in the right conditions.

WASM complements JavaScript for tasks that are CPU intensive, and facilitates porting of non-JS code bases (e.g. game engines). It also opens the door to byte code that can be seamlessly run in various server-side and cloud environments.

Video

Video hosted by WebCastor on their StreamFizz platform.


Lin Clark, Code Cartoonist and Principal Research Engineer, Mozilla

Lin makes code cartoons, and is also a Principal Research Engineer at Mozilla, focusing on enabling WebAssembly's use outside of the browser. She collaborates with Mozilla’s Developer Technologies team on projects such as the WebAssembly system interface (WASI) and the wasmtime WebAssembly runtime. In previous lives, she was a core contributor to open source projects like Firefox’s developer tools, worked at npm, and contributed to HTML data standards.

Transcript

Hi everyone, I'm Lin Clark and I make code cartoons.

And I also work at Mozilla on the developer technologies team.

If you aren't familiar with our team, we work on the Rust core language, on WebAssembly and on the Rust to WebAssembly toolchain.

And if you aren't familiar with WebAssembly, well it started as a way to run programming languages besides JavaScript on the web.

So these are programming languages like C, and C++ and Rust.

So with this, we saw how we could bring some of the nice aspects of native development like predictable performance to the web.

But lately, our team has been looking at how we can use WebAssembly to bring some of the nice aspects of web development, things like the portability and the security model outside of the browser.

So how we can bring these to native development, going in the other direction.

With this, we're making it possible to interoperate with all the things.

So for example, taking a WebAssembly module that you can run on the web, and then running that very same module using rich APIs and high level types when you talk to a Python or Ruby or PHP module running in its own runtime.

And then turn around and take that very same module using that same API, the same high level types, when you're talking directly to the host or the operating system.

Even though the types that you're using there to talk to the operating system are different than the types that you were using in the Python runtime.

And then use those same high level APIs and those same types when talking to another WebAssembly module that might be written in a different source language.

So how you can take two WebAssembly modules, one written in Go, one written in Rust and have those talk to each other using that same high level type system.

Now why would you wanna do this?

Why would you wanna use it as a WebAssembly module in all of these cases?

Well, there are a few reasons.

If your app is in a scripting language like Python, then WebAssembly could be much faster.

You could get near native performance without the hassle of compiling a native extension.

If your app is in a low level language like C++, then WebAssembly can give you lightweight sandboxing.

The module can't access any memory or other resources that haven't been directly given to it.

So this can help make reusing code much more secure.

And for both scripting languages and lower level languages, being able to reuse code from any language ecosystem without having to rewrite it in your own language can really make you move faster as a developer and also help with the maintenance of your application.

If there were a way to run the same WebAssembly module across all of these different environments, that would unlock a lot of wins.

So that's something that we've been working on.

So we're working on this thing called the interface types proposal.

And I wrote more about this in a blog post on the Hacks blog, Mozilla's Hacks blog.

So if you wanna go in depth on how this works, you can read about that there.

So right now, I'm just gonna show you how this works and give you a high level overview.

Now, we need a scenario, we need something for our demo that we actually put together.

So let's say that we're trying to create a build tool for creating static sites.

And we wanna build the same build tool in a few different environments.

And for this build tool we want to support Markdown.

So we need a Markdown parser.

And we need it to be in a language that can be compiled to WebAssembly.

So something like Rust.

And we also need it to support interface types.

Now, since I'm not this module's author, I can't add that support directly.

So I'm gonna wrap this module in my own module in Rust.

So I'll create a render function which uses string types.

And I added an annotation up there, "wasm-bindgen", and that is what does all the magic.

And then I'm gonna compile it using a tool called wasm-pack.

And use the --wasm-interface-types flag 'cause this is still an experimental feature.

Now, this gives us the single wasm file that we're gonna use in all of these different environments.

For our first environment, let's go with pure WebAssembly.

So for this we need a WebAssembly runtime that comes without JavaScript.

So one that we can easily use outside of the browser.

And that's what our wasmtime runtime is.

So we're gonna download wasmtime from wasmtime.dev.

And then we can run this module and pass it a Markdown string.

And as you see, the WebAssembly module took that Markdown string and returned the HTML string.

Even though the runtime doesn't know anything about how Rust strings work, it was able to communicate with this module using a high level type.

So it's pretty easy and straightforward.

But what about Python?

Can we use this Markdown parser there?

Yes, and we may want to for speed.

To do this, we download the wasmtime extension.

And this makes it possible for a Python module to call WebAssembly functions.

And now all I need to do is import the extension, and the Markdown module, and then I can call the render function.

So now we run this.

And again, it works.

The types are different this time, we're passing in Python values.

But it still just works.

Because of the magic of interface types, we can use the same file in the same way using different types in a different environment.

We can also use the same WebAssembly module in Rust.

Now one reason you'd wanna use WebAssembly here is for that lightweight sandboxing I was talking about before which isolates the third party code from the rest of your application.

So let's walk through how this works.

First, we add wasmtime Rust as a dependency.

And this does the same thing as the Python extension before.

It makes it possible to run the WebAssembly function.

And then in the main file, we add the wasmtime Rust macro.

And then add a trait and a render method to that trait.

But we don't add an implementation of this trait where you might expect it to be.

Instead the implementation is actually the render function in our WebAssembly module.

And this macro actually does all the wiring up for us.

And it also adds in other methods on that trait like load file which instantiates a WebAssembly module from a file.

So in the main function we'll call load file to instantiate the module.

And then we can call render on that.

So now, let's use cargo to build and run it.

And again, it just works.

It's in a different environment and using different types, but it works.

Now, this might not seem impressive because we did actually compile this module originally from Rust.

But it would work just as seamlessly if this module had been compiled from C or C++ or Go.

As long as that module was using interface types.

Where else can we make this work?

Well I don't have enough time to show you, but this also already works in node and on the web through wasm-bindgen.

So that's the same Rust module compiled to WebAssembly, using the same rich API and types to talk to five wildly different environments and languages and runtimes.

And those are just a few examples.

There's no reason why this can't work in many more languages and many more runtimes.

So how does it work under the hood?

Well to explain that I have to walk through how the proposal developed over the last two years.

So the initial problem that I was trying to tackle was a more tractable one.

How can WebAssembly interact with the web platform using high level types?

And that still isn't a tiny problem, because at least up until now WebAssembly has only been able to talk in numbers.

If you had something more complex, like a string, which might not seem complex, even that wouldn't work.

You'd have to write all of this glue code in between the two sides to encode and decode it into numbers.

But web API parameters and return values are usually far more complex.

And there are a ton of different types in web APIs.

Now fortunately, there's a standard way to talk about the structure of these types, which is Web IDL. And because it's so structured, you can create mappings to Web IDL. So for JavaScript, there's a pretty straightforward mapping to Web IDL types.

So here we have an obvious solution.

Just create a mapping from WebAssembly to Web IDL, just as there is for JavaScript.

But that's not as straightforward as it might seem.

Now, I don't have enough time to explain why, but if you really wanna dig into that the post has more details.

At a high level, the main reason is that each language has its own way of representing data in memory.

And we can't pick just one, that would be a bad idea.

Even though the exact layout in memory for these different things is often different, there are some abstract concepts that they usually share in common.

So for example, strings in most languages often have a pointer to the start of the string in memory.

And then the length of the string.

So this means we can reduce this string down to a type that WebAssembly understands.

Two i32's, two integers.

So we need a way for a module to explicitly tell the engine something like, "I know that document.createElement takes a string, but when I call it I'm gonna pass you two integers. Use these to create a DOMString from the data in my linear memory. Use the first integer as the starting address of the string, and the second as the length."

This is what the early version of this proposal did.

It gave a WebAssembly module a way to map between the types that it uses and Web IDL's types.
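The "two integers" idea can be sketched in a few lines of Python. This is only an illustration, not the engine's or the proposal's actual mechanism: the bytearray stands in for a module's linear memory, UTF-8 is an assumed encoding, and the helper names lower_string and lift_string are hypothetical.

```python
from typing import Tuple


def lower_string(memory: bytearray, offset: int, text: str) -> Tuple[int, int]:
    """Write text as UTF-8 into memory at offset; return (pointer, length)."""
    data = text.encode("utf-8")
    if offset < 0 or offset + len(data) > len(memory):
        raise ValueError("string does not fit in linear memory")
    memory[offset:offset + len(data)] = data
    return offset, len(data)


def lift_string(memory: bytearray, ptr: int, length: int) -> str:
    """Read length bytes starting at ptr and decode them as UTF-8."""
    return bytes(memory[ptr:ptr + length]).decode("utf-8")


# Stand-in for a WebAssembly linear memory (assumption: 64 bytes is enough here).
memory = bytearray(64)

# The module side: only two integers cross the boundary.
ptr, length = lower_string(memory, 8, "div")

# The engine side: use the first integer as the address, the second as the length.
tag_name = lift_string(memory, ptr, length)
print(ptr, length, tag_name)
```

The point of the sketch is that neither side passes a "string" across the boundary, only a starting address and a length that the other side agrees to interpret.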

Now these mappings weren't hard coded in the engine, instead a module comes with its own little booklet of mappings.

And this makes it possible for different languages to map to Web IDL in different ways.
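One way to picture the booklet is as data that travels with the module and tells the engine how to read that module's values. The sketch below assumes a made-up booklet format (a Python dict) and two made-up layouts, one pointer-plus-length and one NUL-terminated; the real proposal defines its own format, so treat every name here as hypothetical.

```python
# Hypothetical booklets: each module declares how its strings are laid out.
RUST_LIKE_BOOKLET = {"string": {"layout": "ptr_len", "encoding": "utf-8"}}
C_LIKE_BOOKLET = {"string": {"layout": "nul_terminated", "encoding": "utf-8"}}


def lift_string(memory: bytearray, args: tuple, booklet: dict) -> str:
    """Turn a module's raw integers into a string by following its booklet."""
    rule = booklet["string"]
    if rule["layout"] == "ptr_len":
        ptr, length = args              # two integers: address and byte length
        raw = memory[ptr:ptr + length]
    elif rule["layout"] == "nul_terminated":
        (ptr,) = args                   # one integer: address only
        end = memory.index(0, ptr)      # scan forward to the terminating zero byte
        raw = memory[ptr:end]
    else:
        raise ValueError("unknown layout: " + rule["layout"])
    return bytes(raw).decode(rule["encoding"])


rust_like_memory = bytearray(b"\x00\x00hello world")
c_like_memory = bytearray(b"\x00\x00hello\x00junk")

from_rust_like = lift_string(rust_like_memory, (2, 5), RUST_LIKE_BOOKLET)
from_c_like = lift_string(c_like_memory, (2,), C_LIKE_BOOKLET)
print(from_rust_like, from_c_like)
```

Nothing about either layout is hard coded into the caller: the same lift_string works for both modules because each one brought its own description along.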

Now how do you generate this booklet?

The compiler takes care of adding this information into your module for you.

So for the most part in many language toolchains, the programmer doesn't have to do very much work at all.

Once we stepped back and looked at the solution, we realized there's actually a solution to a bigger hairier problem here.

And this is the one that I was talking about before.

Is there a feasible way for WebAssembly to talk to all of these different things using all of these different type systems?

So like I talked about before, you could try to create these mappings that are hard coded in the engine like the JS to Web IDL mapping.

But to do that for every language, you'd have to create a specific mapping between two languages.

And this creates a real mess.

You'd have these mappings going every which way between the different languages.

This is kind of how early compilers were designed, where you had mappings between different source languages and targets.

We need something more scalable than this.

How did compilers solve the scalability issue?

Well they split things up between a front end and a back end.

The front end goes from the source language to an abstract intermediate representation or an IR.

And then the back end goes from that IR to the target machine code.
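The scaling argument comes down to a little arithmetic. As a rough sketch, assume n languages that all need to talk to each other in both directions: direct mappings need one per ordered pair, while a shared intermediate representation needs only one mapping into it and one out of it per language. The function names are made up for illustration.

```python
def pairwise_mappings(n: int) -> int:
    """Direct mappings: one for every ordered pair of distinct languages."""
    return n * (n - 1)


def hub_mappings(n: int) -> int:
    """Shared IR: each language maps to the IR and back from it."""
    return 2 * n


for n in (2, 3, 5, 10):
    print(f"{n:>2} languages: {pairwise_mappings(n):>3} pairwise "
          f"vs {hub_mappings(n):>3} via a shared IR")
```

For two or three languages the shared representation doesn't buy anything, but at ten languages it is 90 direct mappings against 20, and the gap keeps widening.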

And this is where the insight from Web IDL comes in.

When you squint at it, Web IDL kind of looks like an intermediate representation.

Now Web IDL is pretty specific to the web, and there are lots of use cases for WebAssembly outside the web.

So Web IDL itself isn't a great intermediate representation for us to use.

But what if you just used Web IDL as inspiration and created a new set of abstract types?

This is how we got to the WebAssembly interface types proposal.

Now these types aren't concrete types, they aren't like the int32 and the float64 types in WebAssembly today.

We won't be adding specific operations for these types to WebAssembly itself.

So for example, there won't be any string concatenation operations added to WebAssembly.

Instead all operations happen on the concrete types that are on either end of this.

And this is enabled by a really important detail of the proposal.

With interface types the two sides aren't trying to share a representation.

Instead the default is to copy values between one side and the other.

So this makes it a lot easier for a single module to talk to many different languages.
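Here is a small sketch of what "copy instead of share" means. The two bytearrays stand in for two modules' separate memories, and the assumption is that one side stores strings as UTF-8 and the other as UTF-16; the Python str in the middle plays the role of the abstract string type. All names are hypothetical.

```python
def lift_utf8(memory: bytearray, ptr: int, length: int) -> str:
    """Producer side: read a UTF-8 string out of the producer's memory."""
    return bytes(memory[ptr:ptr + length]).decode("utf-8")


def lower_utf16(memory: bytearray, offset: int, text: str):
    """Consumer side: write the string as UTF-16 into the consumer's memory.

    Returns (pointer, length in 16-bit code units).
    """
    data = text.encode("utf-16-le")
    memory[offset:offset + len(data)] = data
    return offset, len(data) // 2


producer_memory = bytearray(32)
consumer_memory = bytearray(32)
producer_memory[0:2] = b"hi"

# The value is copied across through the abstract type; no memory is shared.
abstract_value = lift_utf8(producer_memory, 0, 2)
ptr, units = lower_utf16(consumer_memory, 4, abstract_value)

# Overwriting the producer's bytes afterwards doesn't affect the consumer's copy.
producer_memory[0:2] = b"XX"
print(bytes(consumer_memory[ptr:ptr + units * 2]).decode("utf-16-le"))
```

Because each side keeps its own concrete representation, neither module has to know how the other one stores strings.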

In some cases like in the browser, the mappings from the interface types to the host concrete types will be baked into the engine.

So one set of mappings will be baked in at compile time, and the other will be handed to the engine at load time.

But in other cases, like when two WebAssembly modules are talking to each other, they'll both send down their own little booklet.

And these each map their function types to the abstract types.

And because all these mappings are declarative, the engine can actually see when a translation is unnecessary.

Like when two modules on either side are already using the same type, so that the engine can skip that translation altogether.

Which allows for some really nice optimizations.
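As a toy version of that optimization, assume each side declares only its string encoding. When the two declarations match, the bytes can be moved as they are; when they differ, the value is decoded and re-encoded. This is a sketch of the idea, not how any particular engine implements it, and transfer_string is a made-up name.

```python
def transfer_string(src_memory: bytearray, ptr: int, length: int,
                    src_encoding: str, dst_encoding: str):
    """Move a string between modules; return (bytes for the receiver, path taken)."""
    raw = bytes(src_memory[ptr:ptr + length])
    if src_encoding == dst_encoding:
        # Both sides declared the same representation: no translation needed.
        return raw, "copied"
    # Representations differ: go through the abstract string and re-encode.
    return raw.decode(src_encoding).encode(dst_encoding), "translated"


source = bytearray("héllo".encode("utf-8"))

same, how_same = transfer_string(source, 0, len(source), "utf-8", "utf-8")
converted, how_converted = transfer_string(source, 0, len(source), "utf-8", "utf-16-le")
print(how_same, how_converted)
```

The check is only possible because the mappings are declared up front as data; if they were opaque glue code, the engine couldn't tell that the two sides already agree.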

So that's just a taste of how this all works under the hood.

And if you wanna learn more about this in more depth, you can read the blog post on Mozilla's Hacks blog.

And if you wanna try this out, you can use our wasmtime project.

We're happy to talk about this more, so just ping me on Twitter if you wanna talk about this or find me after the event and we're happy to discuss how this can be used in your own use cases.

I wanna say thank you to the team who put these demos together and who are working on the proposal, they're doing fantastic work.

And thank you all for listening.