Jeremy Keith: Switching to HTTPS on Apache 2.4.7 on Ubuntu 14.04 on Digital Ocean

I’ve been updating my book sites over to HTTPS.

They’re all hosted on the same (virtual) box as adactio.com—Ubuntu 14.04 running Apache 2.4.7 on Digital Ocean. If you’ve got a similar configuration, this might be useful for you.

First off, I’m using Let’s Encrypt. Except I’m not. It’s called Certbot now (I’m not entirely sure why).

I installed the Let’s Encertbot client with this incantation (which, like everything else here, needs root-level access, so if any of these commands fail, retry them with sudo in front):

wget https://dl.eff.org/certbot-auto
chmod a+x certbot-auto

Seems like a good idea to put that certbot-auto thingy into a directory like /etc:

mv certbot-auto /etc

Rather than have Certbot generate conf files for me, I’m just going to have it generate the certificates. Here’s how I’d generate a certificate for yourdomain.com:

/etc/certbot-auto --apache certonly -d yourdomain.com

The first time you do this, it’ll need to fetch a bunch of dependencies and it’ll ask you for an email address for future reference (should anything ever go screwy). For subsequent domains, the process will be much quicker.

The result of this will be a bunch of generated certificates that live here:

  • /etc/letsencrypt/live/yourdomain.com/cert.pem
  • /etc/letsencrypt/live/yourdomain.com/chain.pem
  • /etc/letsencrypt/live/yourdomain.com/privkey.pem
  • /etc/letsencrypt/live/yourdomain.com/fullchain.pem

Now you’ll need to configure your Apache gubbins. Head on over to…

cd /etc/apache2/sites-available

If you only have one domain on your server, you can just edit default-ssl.conf. I prefer to have separate conf files for each domain.

Time to fire up an incomprehensible text editor.

nano yourdomain.com.conf

There’s a great SSL Configuration Generator from Mozilla to help you figure out what to put in this file. Following the suggested configuration for my server (assuming I want maximum backward-compatibility), here’s what I put in.

The full configuration is in this gist: https://gist.github.com/adactio/f0e13a2f8b9f9f084676bb2a901c5c95
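In broad strokes, the idea is to pair an HTTPS virtual host (pointing at the certificate files above) with an HTTP one that redirects. This is a simplified sketch, not the exact contents of that gist; the Mozilla generator will also give you protocol and cipher settings that you should include:

<VirtualHost *:443>
    ServerName yourdomain.com
    DocumentRoot /path/to/yourdomain.com

    SSLEngine on
    SSLCertificateFile      /etc/letsencrypt/live/yourdomain.com/cert.pem
    SSLCertificateKeyFile   /etc/letsencrypt/live/yourdomain.com/privkey.pem
    SSLCertificateChainFile /etc/letsencrypt/live/yourdomain.com/chain.pem
</VirtualHost>

<VirtualHost *:80>
    ServerName yourdomain.com
    Redirect permanent / https://yourdomain.com/
</VirtualHost>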

Make sure you update the /path/to/yourdomain.com part—you probably want a directory somewhere in /var/www or wherever your website’s files are sitting.

To exit the infernal text editor, hit ctrl and o, press enter in response to the prompt, and then hit ctrl and x.

If yourdomain.com.conf didn’t previously exist, you’ll need to enable the configuration by running:

a2ensite yourdomain.com

Time to restart Apache. Fingers crossed…

service apache2 restart

If that worked, you should be able to go to https://yourdomain.com and see a lovely shiny padlock in the address bar.

Assuming that worked, everything is awesome! …for 90 days. After that, your certificates will expire and you’ll be left with a broken website.

Not to worry. You can update your certificates at any time. Test for yourself by doing a dry run:

/etc/certbot-auto renew --dry-run

You should see a message saying:

Processing /etc/letsencrypt/renewal/yourdomain.com.conf

And then, after a while:

** DRY RUN: simulating 'certbot renew' close to cert expiry
** (The test certificates below have not been saved.)
Congratulations, all renewals succeeded.

You could set yourself a calendar reminder to do the renewal (without the --dry-run bit) every few months. Or you could tell your server’s computer to do it by using a cron job. It’s not nearly as rude as it sounds.

You can fire up and edit your list of cron tasks with this command:

crontab -e

This tells the machine to run the renewal task at quarter past six every evening and log any results:

15 18 * * * /etc/certbot-auto renew --quiet >> /var/log/certbot-renew.log

(Don’t worry: it won’t actually generate new certificates unless the current ones are getting close to expiration.) Leave the crontab editor by doing the ctrl o, enter, ctrl x dance.

Hopefully, there’s nothing more for you to do. I say “hopefully” because I won’t know for sure myself for another 90 days, at which point I’ll find out whether anything’s on fire.

If you have other domains you want to secure, repeat the process by running:

/etc/certbot-auto --apache certonly -d yourotherdomain.com

And then creating/editing /etc/apache2/sites-available/yourotherdomain.com.conf accordingly.

I found these useful when I was going through this process:

That last one is good if you like the warm glow of accomplishment that comes with getting a good grade:

For extra credit, you can run your site through securityheaders.io to harden your headers. Again, not as rude as it sounds.

You know, I probably should have said this at the start of this post, but I should clarify that any advice I’ve given here should be taken with a huge pinch of salt—I have little to no idea what I’m doing. I’m not responsible for any flame-bursting-into that may occur. It’s probably a good idea to back everything up before even starting to do this.

Yeah, I definitely should’ve mentioned that at the start.

Bruce Lawson: On URLs in Progressive Web Apps

I’m writing this as a short commentary on Stuart Langridge’s post The Importance of URLs which you should read (he’s surprisingly clever, although he looks like the antichrist in that lewd hat).

Stuart says

I approve of the Lighthouse team’s idea that you don’t qualify as an add-to-home-screen-able app if you want a URL bar

Opera’s implementation of Progressive Web Apps differs from Chrome’s here (we only take the content layer of Chromium; we implement all the UI ourselves, precisely so we can do our own thing). Regardless of whether the developer has chosen display: standalone or display: fullscreen in order to hide the URL bar, Opera will display it if the app is served over HTTP because we think that the user should know exactly where she is if the app is served over an insecure connection. Similarly, if the user follows a link from your app that goes outside its domain, Opera spawns a new tab and forces display: browser so the URL bar is shown.
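For reference, those display values are set in the web app manifest; a minimal sketch (the name and start_url here are invented):

{
  "name": "Example App",
  "start_url": "/",
  "display": "standalone"
}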

But I take Jeremy Keith’s point:

I want people to be able to copy URLs. I want people to be able to hack URLs. I’m not ashamed of my URLs …I’m downright proud.

One of the superpowers of the Web is URLs, and fullscreen progressive web apps hide them (deliberately). After our last PWA meeting with the Chrome team in early February, I was talking about just this with Andreas Bovens, the PM for Opera for Android. We mused about some mechanism (a new gesture?) that would allow the user to see and copy (if they want) the URL of the current page. I’ve already heard of examples of developers making their own “share this” buttons — and devs re-implementing browser functionality is often a klaxon signalling that something is missing from the platform.

When I mentioned our musings on Twitter this morning, Alex Russell said “we’ve been discussing the same.” It is, as Chrome chappie Owen Campbell-Moore said, “a difficult UX problem indeed”, which is one reason that Andreas and I parked our discussion. One of Andreas’ ideas is a long press on the current page that brings up an option to copy/share the URL of the page you’re currently viewing. (This means a long press would no longer be available as an action for site owners to use on their sites. Probably not a big deal?)

What do you think? How can we best allow the user to see the current URL in a discoverable way?

Planet Mozilla: Outreachy: What? How? Why?

Today was my first day as an Outreachy intern with Mozilla! What does that even mean? Why is it super exciting? How did I swing such a sweet gig? How will I be spending my summer non-vacation? Read on to find out!

Outreachy logo

What is Outreachy?

Outreachy is a fantastic initiative to get more women and members of other underrepresented groups involved in Free & Open Source Software. Through Outreachy, organizations that create open-source software (e.g. Mozilla, GNOME, Wikimedia, to name a few) take on interns to work full-time on a specific project for 3 months. There are two internship rounds each year, May-August and December-March. Interns are paid for their time, and receive guidance/supervision from an assigned mentor, usually a full-time employee of the organization who leads the given project.

Oh yeah, and the whole thing is done remotely! For a lot of people (myself included) who don’t/can’t/won’t live in a major tech hub, the opportunity to work remotely removes one of the biggest barriers to jumping in to the professional tech community. But as FOSS developers tend to be pretty distributed anyway (I think my project’s team, for example, is on about 3 continents), it’s relatively easy for the intern to integrate with the team. It seems that most communication takes place over IRC and, to a lesser extent, videoconferencing.

What does an Outreachy intern do?

Anything and everything! Each project and organization is different. But in general, interns spend their time…

Coding (or not)

A lot of projects involve writing code, though what that actually entails (language, framework, writing vs. refactoring, etc.) varies from organization to organization and project to project. However, there are also projects that don’t involve code at all, and instead have the intern working on equally important things like design, documentation, or community management.

As for me specifically, I’ll be working on the project Test-driven Refactoring of Marionette’s Python Test Runner. You can click through to the project description for more details, but basically I’ll be spending most of the summer writing Python code (yay!) to test and refactor a component of Marionette, a tool that lets developers run automated Firefox tests. This means I’ll be learning a lot about testing in general, Python testing libraries, the huge ecosystem of internal Mozilla tools, and maybe a bit about browser automation. That’s a lot! Luckily, I have my mentor Maja (who happens to also be an alum of both Outreachy and RC!) to help me out along the way, as well as the other members of the Engineering Productivity team, all of whom have been really friendly & helpful so far.

Traveling

Interns receive a $500 stipend for travel related to Outreachy, which is fantastic. I intend, as I’m guessing most do, to use this to attend conference(s) related to open source. If I were doing a winter round I would totally use it to attend FOSDEM, but there are also a ton of conferences in the summer! Actually, you don’t even need to do the traveling during the actual 3 months of the internship; they give you a year-long window so that if there’s an annual conference you really want to attend but it’s not during your internship, you’re still golden.

At Mozilla in particular, interns are also invited to a week-long all-hands meet up! This is beyond awesome, because it gives us a chance to meet our mentors and other team members in person. (Actually, I doubly lucked out because I got to meet my mentor at RC during “Never Graduate Week” a couple of weeks ago!)

Blogging

One of the requirements of the internship is to blog regularly about how the internship and project are coming along. This is my first post! Though we’re required to write a post every 2 weeks, I’m aiming to write one per week, on both technical and non-technical aspects of the internship. Stay tuned!

How do you get in?

I’m sure every Outreachy participant has a different journey, but here’s a rough outline of mine.

Step 1: Realize it is a thing

Let’s not forget that the first step to applying for any program/job/whatever is realizing that it exists! Like most people, I think, I had never heard of Outreachy, and was totally unaware that a remote, paid internship working on FOSS was a thing that existed in the universe. But then, in the fall of 2015, I made one of my all-time best moves ever by attending the Recurse Center (RC), where I soon learned about Outreachy from various Recursers who had been involved with the program. I discovered it about 2 weeks before applications closed for the December-March 2015-16 round, which was pretty last-minute; but a couple of other Recursers were applying and encouraged me to do the same, so I decided to go for it!

Step 2: Frantically apply at last minute

Applying to Outreachy is a relatively involved process. A couple months before each round begins, the list of participating organizations/projects is released. Prospective applicants are supposed to find a project that interests them, get in touch with the project mentor, and make an initial contribution to that project (e.g. fix a small bug).

But each of those tasks is pretty intimidating!

First of all, the list of participating organizations is long and varied, and some organizations (like Mozilla) have tons of different projects available. So even reading through the project descriptions and choosing one that sounds interesting (most of them do, at least to me!) is no small task.

Then, there’s the matter of mustering up the courage to join the organization/project’s IRC channel, find the project mentor, and talk to them about the application. I didn’t even really know what IRC was, and had never used it before, so I found this pretty scary. Luckily, I was at RC, and one of my batchmates sat me down and walked me through IRC basics.

However, the hardest and most important part is actually making a contribution to the project at hand. Depending on the project, this can be long & complicated, quick & easy, or anything in between. The level of guidance/instruction also varies widely from project to project: some are laid out clearly in small, hand-holdy steps, others are more along the lines of “find something to do and then do it”. Furthermore, prerequisites for making the contribution can be anything from “if you know how to edit text and send an email, you’re fine” to “make a GitHub account” to “learn a new programming language and install 8 million new tools on your system just to set up the development environment”. All in all, this means that making that initial contribution can often be a deceptively large amount of work.

Because of all these factors, for my application to the December-March round I decided to target the Mozilla project “Contribute to the HTML standard”. In addition to the fact that I thought it would be awesome to contribute to such a fundamental part of the web, I chose it because the contribution itself was really simple: just choose a GitHub issue with a beginner-friendly label, ask some questions via GitHub comments, edit the source markup file as needed, and make a pull request. I was already familiar with GitHub so it was pretty smooth sailing.

Once you’ve made your contribution, it’s time to write the actual Outreachy application. This is just a plain text file you fill out with lots of information about your experience with FOSS, your contribution to the project, etc. In case it’s useful to anyone, here’s my application for the December-March 2015-16 round. But before you use that as an example, make sure you read what happened next…

Step 3: Don’t get in

Unfortunately, I didn’t get in to the December-March round (although I was stoked to see some of my fellow Recursers get accepted!). Honestly, I wasn’t too surprised, since my contributions and application had been so hectic and last-minute. But even though it wasn’t successful, the application process was educational in and of itself: I learned how to use IRC, got 3 of my first 5 GitHub pull requests merged, and became a contributor to the HTML standard! Not bad for a failure!

Step 4: Decide to go for it again (at last minute, again)

Fast forward six months: after finishing my batch at RC, I had been job-hunting & interview-prepping, but still hadn’t gotten a job. When the applications for the May-August round opened up, I took a glance at the projects and found some cool ones, but decided that I wouldn’t apply this round because a) I needed a Real Job, not an internship, and b) the last round’s application process was a pretty big time investment which hadn’t paid off (although it actually had, as I just mentioned!).

But as the weeks went by, and the application deadline drew closer, I kept thinking about it. I was no closer to finding a Real Job, and upheaval in my personal life made my whereabouts over the summer an uncertainty (I seem never to know what continent I live on), so a paid, remote internship was becoming more and more attractive. When I broached my hesitation over whether or not to apply to other Recursers, they unanimously encouraged me (again) to go for it (again). Then, I found out that one of the project mentors, Maja, was a Recurser, and since her project was one of the ones I had shortlisted, I decided to apply for it.

Of course, by this point it was once again two weeks until the deadline, so panic once again set in!

Step 5: Learn from past mistakes

This time, the process as a whole was easier, because I had already done it once. IRC was less scary, I already felt comfortable asking the project mentor questions, and having already been rejected in the previous round made it somehow lower-stakes emotionally (“What the hell, at least I’ll get a PR or two out of it!”). During my first application I had spent a considerable amount of time reading about all the different projects and fretting about which one to do, flipping back and forth mentally until the last minute. This time, I avoided that mistake and was laser-focused on a single project: Test-driven Refactoring of Marionette’s Python Test Runner.

From a technical standpoint, however, contributing to the Marionette project was more complicated than the HTML standard had been. Luckily, Maja had written detailed instructions for prospective applicants explaining how to set up the development environment etc., but there were still a lot of steps to work through. Then, because there were so many folks applying to the project, there was actually a shortage of “good-first-bugs” for Marionette! So I ended up making my first contributions to a different but related project, Perfherder, which meant setting up a different dev environment and working with a different mentor (who was equally friendly). By the time I was done with the Perfherder stuff (which turned out to be a fun little rabbit hole!), Maja had found me something Marionette-specific to do, so I ended up working on both projects as part of my application process.

When it came time to write the actual application, I also had the luxury of being able to use my failed December-March application as both a starting point and an example of what not to do. Some of the more generic parts (my background, etc.) were reusable, which saved time. But when it came to the parts about my contribution to the project and my proposed internship timeline, I knew I had to do a much better job than before. So I opted for over-communication, and basically wrote down everything I could think of about what I had already done and what I would need to do to complete the goals stated in the project description (which Maja had thankfully written quite clearly).

In the end, my May-August application was twice as long as my previous one had been. Much of that difference was the proposed timeline, which went from being one short paragraph to about 3 pages. Perhaps I was a bit more verbose than necessary, but I decided to err on the side of too many details, since I had done the opposite in my previous application.

Step 6: Get a bit lucky

Spoiler alert: this time I was accepted!

Although I knew I had made a much stronger application than in the previous round, I was still shocked to find out that I was chosen from what seemed to be a large, competitive applicant pool. I can’t be sure, but I think what made the difference the second time around must have been a) more substantial contributions to two different projects, b) better, more frequent communication with the project mentor and other team members, and c) a much more thorough and better thought-out application text.

But let’s not forget d) luck. I was lucky to have encouragement and support from the RC community throughout both my applications, lucky to have the time to work diligently on my application because I had no other full-time obligations, lucky to find a mentor who I had something in common with and therefore felt comfortable talking to and asking questions of, and lucky to ultimately be chosen from among what I’m sure were many strong applications. So while I certainly did work hard to get this internship, I have to acknowledge that I wouldn’t have gotten in without all of that luck.

Why am I doing this?

Last week I had the chance to attend OSCON 2016, where Mozilla’s E. Dunham gave a talk on How to learn Rust. A lot of the information applied to learning any language/new thing, though, including this great recommendation: When embarking on a new skill quest, record your motivation somewhere (I’m going to use this blog, but I suppose Twitter or a vision board or whatever would work too) before you begin.

The idea is that once you’re in the process of learning the new thing, you will probably have at least one moment where you’re stuck, frustrated, and asking yourself what the hell you were thinking when you began this crazy project. Writing it down beforehand is just doing your future self a favor, by saving up some motivation for a rainy day.

So, future self, let it be known that I’m doing Outreachy to…

  • Write code for an actual real-world project (as opposed to academic/toy projects that no one will ever use)
  • Get to know a great organization that I’ve respected and admired for years
  • Try out working remotely, to see if it suits me
  • Learn more about Python, testing, and automation
  • Gain confidence and feel more like a “real developer”
  • Launch my career in the software industry

I’m sure these goals will evolve as the internship goes along, but for now they’re the main things driving me. Now it’s just a matter of sitting back, relaxing, and working super hard all summer to achieve them! :D

Got any more questions?

Are you curious about Outreachy? Thinking of applying? Confused about the application process? Feel free to reach out to me! Go on, don’t be shy, just use one of those cute little contact buttons and drop me a line. :)

Planet Mozilla: [worklog] Make Web sites simpler.

Not a song this week, but a documentary to remind me that some sites are overly complicated, and that there are strong benefits and resilience in choosing a solid, simple framework for working. Not that it makes the work easier. I think it’s even the opposite: it’s basically harder to make a solid, simple Web site. But the cost is beneficial in the long term. Tune of the week: The Depth of Simplicity in Ozu’s movie.

Webcompat Life

Progress this week:

Today: 2016-05-16T10:12:01.879159
354 open issues
----------------------
needsinfo       3
needsdiagnosis  109
needscontact    30
contactready    55
sitewait        142
----------------------

In my journey to get the contactready and needscontact numbers lower, we are making progress. You are welcome to participate.

London agenda.

Reorganizing the wiki a bit so it better aligns with our current work. In progress.

Good news on the front of appearance in CSS.

The CSSWG just resolved that "appearance: none" should turn checkbox & radio <input> elements into a normal non-replaced element.

Learning how to use mozregression.

We are looking at creating a mechanism similar to Opera’s browser.js in Firefox. Read and participate in the discussion.

Webcompat issues

(a selection of some of the bugs worked on this week).

Reading List

Follow Your Nose

TODO

  • Document how to write tests on webcompat.com using test fixtures.
  • ToWrite: rounding numbers in CSS for width
  • ToWrite: Amazon prefetching resources with <object> for Firefox only.

Otsukare!

Planet Mozilla: One year of Rust

Rust is a language for writing highly reliable, screamingly fast software—and having fun doing it.

And yesterday, Rust turned one year old.

Rust in numbers

A lot has happened in the last 365 days:

  • 11,894 commits by 702 contributors added to the core repository;
  • 88 RFCs merged;
  • 18 compiler targets introduced;
  • 9 releases shipped;
  • 1 year of stability delivered.

On an average week this year, the Rust community merged two RFCs and published 53 brand new crates. Not a single day went by without at least one new Rust library hitting the central package manager. And Rust topped the “most loved language” list in this year’s StackOverflow survey.

Speaking of numbers: we recently launched a survey of our own, and want to hear from you whether you are an old hand at Rust, or have never used it.

One place where our numbers are not where we want them to be: community diversity. We’ve had ongoing local outreach efforts, but the Rust community team will soon be launching a coordinated, global effort following the Bridge model (e.g. RailsBridge). If you want to get involved, or have other ideas for outreach, please let the community team know.

Rust in production

This year saw more companies betting on Rust. Each one has a story, but two particularly resonated.

First, there’s Dropbox. For the last several years, the company has been secretively working on a move away from AWS and onto its own infrastructure. The move, which is now complete, included developing custom-built hardware and the software to drive it. While much of Dropbox’s back-end infrastructure is historically written in Go, for some key components the memory footprint and lack of control stood in the way of achieving the server utilization they were striving for. They rewrote those components in Rust. In the words of Jamie Turner, a lead engineer for the project, “the advantages of Rust are many: really powerful abstractions, no null, no segfaults, no leaks, yet C-like performance and control over memory.”

Second, there’s Mozilla. They’ve long been developing Servo as a research browser engine in Rust, but their first production Rust code shipped through a different vehicle: Firefox. In Firefox 45, without any fanfare, Rust code for mp4 metadata parsing went out to OSX and 64-bit Linux users; it will hit Windows in version 48. The code is currently running in test mode, with its results compared against the legacy C++ library: 100% correctness on 1 billion reported executions. But this code is just the tip of the iceberg: after laying a lot of groundwork for Rust integration, Firefox is poised to bring in significant amounts of new Rust code, including components from Servo—and not just in test mode.

We’re hearing similar stories from a range of other shops that are putting Rust into production: Rust helps a team punch above its weight. It gives many of the same benefits as traditional systems languages while being more approachable, safer and often more productive.

These are just a few stories of Rust in production, but we’d love to hear yours!

Rust, improved

Of course, Rust itself hasn’t been standing still. The focus in its first year has been growing and polishing its ecosystem and tooling.

There’s a lot more to say about what’s happened and what’s coming up in the Rust world—over the coming months, we’ll be using this blog to say it.

Rust in community

It turns out that people like to get together and talk Rust. We had a sold-out RustCamp last August, and there are several upcoming events in 2016:

  • September 9-10, 2016: the first RustConf in Portland, OR, USA;
  • September 17, 2016: RustFest, the European community conference, in Berlin, Germany;
  • October 27-28, 2016: Rust Belt Rust, a Rust conference in Pittsburgh, PA, USA;
  • 71 Rust-related meetup groups worldwide.

And that’s no surprise. From a personal perspective, the best part about working with Rust is its community. It’s hard to explain quite what it’s like to be part of this group, but two things stand out. First, its sheer energy: so much happens in any given week that This Week in Rust is a vital resource for anyone hoping to keep up. Second, its welcoming spirit. Rust’s core message is one of empowerment—you can fearlessly write safe, low-level systems code—and that’s reflected in the community. We’re all here to learn how to be better programmers, and support each other in doing so.

There’s never been a better time to get started with Rust, whether through attending a local meetup, saying hello in the users forum, watching a talk, or reading the book. No matter how you find your way in, we’ll be glad to have you.

Happy birthday, Rust!

Reddit: Browsers: Vortex Studio “Shadow Browser” benchmarks leaked!

Hello! I am someone who is associated with Vortex Studio, and I would like to release some information about their new product. I found some benchmark numbers that seem pretty impressive, so I want your input on them. All of the benchmark images will be uploaded here. The browser in its current state isn’t that impressive: it doesn’t have a lot of the features from the original idea. The original idea was the most secure web browser: one that automatically connects to a virtual private network (if you want it to) and hides all of your information while letting you access websites that are blocked (like Facebook or Yahoo; basically like the Tor browser). This browser’s code name was the Shadow Browser (most likely to change). It was called the Shadow Browser to keep the idea that the browser is highly optimized for security rather than performance, while still being good enough on performance to match up with a browser like Opera.

So, going into the information about what the browser is, here are some of the benchmarks. The name of the browser in these benchmarks is Shadow Browser Alpha; again, this name will most likely change because the creator of this web browser named it quickly.

Average Position in all of the benchmarks : http://s32.postimg.org/lunvml3d1/Average_Position.jpg

Base battle Benchmark : http://s32.postimg.org/t9d7eyp8l/Basebattle_Benchmark.jpg

Browserscope Security Benchmark : http://s32.postimg.org/3ueoi7b5x/Browserscope_Security_Benchmark.jpg

HTML5 Test Benchmark : http://s32.postimg.org/xsu1e7whx/HTML5_Test_Benchmark.jpg

Jet Stream Benchmark : http://s32.postimg.org/xespemced/Jet_Stream_Benchmark.jpg

Octane 2.0 Benchmark : http://s32.postimg.org/txqnbndc5/Octane_2_0_Benchmark.jpg

Peace Keeper Benchmark : http://s32.postimg.org/sc6s63dqd/Peace_Keeper_Benchmark.jpg

Speed Battle Benchmark : http://s32.postimg.org/u6o81kq4l/Speed_Battle_Benchmark.jpg

SpeedoMeter Benchmark : http://s32.postimg.org/ro2j0w4ed/Speedo_Meter_Benchmark.jpg

V8 Benchmark : http://s32.postimg.org/wt3qiihc5/V8_Benchmark.jpg

For a new browser in development for about 3 weeks now, the benchmarks are pretty good if they are not focusing on performance. What do you guys think?

If you have any questions, please list them below. I know a lot about the browser because I know who was developing it, but I am pretty sure he changed a lot of it, so I will answer your questions to the best of my ability.

submitted by /u/Tryrip

WHATWG blog: Defining the WindowProxy, Window, and Location objects

The HTML Standard defines how navigation works inside a browser tab, how JavaScript executes, what the overarching web security model is, and how all these intertwine and work together. Over the last decade, we’ve made immense progress in specifying previously-unspecified behavior, reverse-engineering and precisely documenting the de-facto requirements for a web-compatible browser. Nevertheless, there are still some corners of the web that are underspecified—sometimes because we haven’t yet discovered the incompatibility, and sometimes because specifying the behavior in a way that is acceptable to all implementers is really, really hard. What follows is an account of the latter.

Until recently, the HTML Standard lacked a precise definition of the WindowProxy, Window, and Location objects. As you might imagine, these are fairly important objects, so having them be underdefined was not great for the web. (Note that the global object used for documents is the Window object, though due to the way browsers evolved it is never directly exposed. Instead, JavaScript code accesses the WindowProxy object, which serves as a proxy and security boundary for the Window object.)

Each navigable frame (top-level tab, <iframe> element, et cetera) is called a browsing context in the HTML Standard. A browsing context has an associated WindowProxy and Window object. As you navigate a browsing context, the associated Window object changes. But the whole time, the WindowProxy object stays the same. Ergo, one WindowProxy object is a proxy for many Window objects.

To make matters more interesting, scripts in these different browsing contexts can access each other, through frame.contentWindow, self.opener, window.open(), et cetera. The same-origin policy generally forbids code from one origin from accessing code from a different origin, which prevents evil.com from prying into bank.com. The two legacy exceptions to this rule are the WindowProxy and Location objects, which have some properties that can be accessed across origins.

document.domain makes this even trickier, as it effectively allows you to observe a WindowProxy or Location object as cross-origin initially, and same-origin later, or vice versa. Since the object remains the same during that time, the same-origin versus cross-origin logic needs to be part of the same object and cannot be spread across different classes.

As JavaScript has many ways to inspect objects, WindowProxy and Location objects were forced to be exotic objects and defined in terms of JavaScript’s “meta-object protocol”. This means they have custom internal methods (such as [[Get]] or [[DefineOwnProperty]]) that define how they respond to the low-level operations that govern JavaScript execution.

Defining this all in detail has been a multi-year effort spearheaded by Bobby Holley, Boris Zbarsky, Ian Hickson, Adam Barth, Domenic Denicola, and Anne van Kesteren, and completed in the “define security around Window, WindowProxy, and Location objects properly” pull request. The basic setup we ended up with is that WindowProxy and Location objects have specific cross-origin branches in their internal method implementations. These take care to expose only specific properties and, even for those properties, to generate specific accessor functions per origin. This ensures that cross-origin access is not inadvertently allowed through something like Object.getOwnPropertyDescriptor(otherWindowProxy, "window").get. After filtering, a WindowProxy object will forward to its Window object as appropriate, whereas a Location object simply gives access to its own properties.
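To illustrate the resulting behavior (a rough sketch; the domain names here are invented):

// Suppose a page on https://attacker.example embeds an iframe
// pointing at https://bank.example.
const proxy = document.querySelector('iframe').contentWindow;

// The allowed cross-origin subset still works:
proxy.postMessage('hello', 'https://bank.example');  // fine
proxy.length;                                        // fine: number of subframes
proxy.location.href = 'https://bank.example/other';  // navigating is allowed

// Everything outside that subset is filtered out:
proxy.document;  // throws a "SecurityError" DOMException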

Having these objects defined in detail will make it easier for implementations to refactor, and for new novel implementations like Servo to achieve web-compatibility. It will reduce debugging time for web developers after implementations have converged on the edge cases. And it drastically simplifies extending these objects, as well as placing new restrictions upon them, within this well-defined subsystem. Well-understood, stable foundations are the key to future extensions.

(Many thanks to Bobby Holley for his contributions to this post.)

Planet Mozilla: Schools Of Thoughts In Web Standards

Last night, I had the pleasure of reading Daniel Stenberg’s blog post about URL standards. It led me to the discussion happening on the WHATWG URL spec, “It’s not immediately clear that "URL syntax" and "URL parser" conflict”. As you can expect, the debate is inflammatory on both sides, borderline hypocritical on some occasions, and full of the arguments I have seen in the 20 years I have followed discussions around Web development.

This post has no intent to be the right way to talk about it. It’s more a collection of impressions I had when reading the thread, carrying my baggage of ex-W3C staff, Web agency work, and ex-Opera and now-Mozilla Web Compatibility work.

"Le chat a bon dos". French expression to basically say we are in the blaming game in that thread. Maybe not that useful.

What is happening?

  • Deployed Web Content: Yes, there is a lot of broken content out there, and some of it will never be fixed, whatever effort you put into it. That’s normal, and it is not broken per se. Think about abandoned editions of old dictionaries with mistakes in them. History and the fabric of time. What should happen? When a mistake is frequent enough, it is interesting to make the parsing algorithm recover from it. The decision then becomes what “frequent enough” means. And that opens a new debate in itself, because it depends on countries, market shares, and specific communities: everything a society can provide in terms of economy, social behavior, history, etc.
  • Browsers: We can also often read in that thread that it’s not the browsers’ fault, it’s the Web content’s. Well, that’s not entirely true either. When a browser recovers from a previously-considered-broken pattern found on the Web, it entrenches that pattern. It’s not simply an act of saying “we need to be compatible with the deployed content” (aka “not our fault”); that would be a false pretense. It’s an implementation decision which further drags the once-broken pattern into the normal patterns of the Web, a standardization process (a kind of jurisprudence). Basically, it’s about recognizing that this pattern is now part of the bigger picture. There’s no such thing as saying “it is good for people who decide to be compatible with browsers” (read: “join us or go to hell, I don’t want to discuss with you”). There’s a form of understandable escapism here, hiding a responsibility and the burden of creating a community. It would be more exact to say “Yes, we made the decision that the Web should be this and not anything else.” It doesn’t make the discussion easier, but it’s more honest about the power play in place.
  • $BROWSER lord: In the discussion, the $BROWSER is Google’s Chrome. A couple of years ago, it was IE. Again, saying Chrome has no specific responsibility is escapism. The same way Safari has a lot of influence on the mobile Web, Chrome, through its market share, currently creates a tide which strongly influences the Web content and its patterns. I can guarantee that it’s easier now for Chrome to be stricter with regards to syntax than it is for Edge or Firefox. Opera had to give up its rendering engine (Presto) because of this and switched to Blink.

There are different schools for the Web specifications:

  1. Standards defining a syntax considered ideal, leaving implementations free to recover with their own strategies when the input is broken.
  2. Standards defining how to recover from all the possible ways the syntax can be mixed up. The intent is often to recover toward a previous, stricter syntax, but in the end this just defines, and expands, the possibilities.
  3. Standards defining different policies for parsing and producing, with certain nuances in between. [Kind of Postel’s law.]

I'm swaying in between these three schools all the time. I don't like the number 2 at all, but because of survival it is sometimes necessary. My preferred way it's 3, having a clear strict syntax for producing content, and a recovery parsing technique. And when possible I would prefer a sanitizer version of the Postel's law.

What did he say btw?

RFC 760

The implementation of a protocol must be robust. Each implementation must expect to interoperate with others created by different individuals. While the goal of this specification is to be explicit about the protocol there is the possibility of differing interpretations. In general, an implementation should be conservative in its sending behavior, and liberal in its receiving behavior. That is, it should be careful to send well-formed datagrams, but should accept any datagram that it can interpret (e.g., not object to technical errors where the meaning is still clear).

Then in RFC 1122, section 1.2.2, the Robustness Principle:

At every layer of the protocols, there is a general rule whose application can lead to enormous benefits in robustness and interoperability [IP:1]:

"Be liberal in what you accept, and conservative in what you send"

Software should be written to deal with every conceivable error, no matter how unlikely; sooner or later a packet will come in with that particular combination of errors and attributes, and unless the software is prepared, chaos can ensue. In general, it is best to assume that the network is filled with malevolent entities that will send in packets designed to have the worst possible effect. This assumption will lead to suitable protective design, although the most serious problems in the Internet have been caused by unenvisaged mechanisms triggered by low-probability events; mere human malice would never have taken so devious a course!

Adaptability to change must be designed into all levels of Internet host software. As a simple example, consider a protocol specification that contains an enumeration of values for a particular header field -- e.g., a type field, a port number, or an error code; this enumeration must be assumed to be incomplete. Thus, if a protocol specification defines four possible error codes, the software must not break when a fifth code shows up. An undefined code might be logged (see below), but it must not cause a failure.

The second part of the principle is almost as important: software on other hosts may contain deficiencies that make it unwise to exploit legal but obscure protocol features. It is unwise to stray far from the obvious and simple, lest untoward effects result elsewhere. A corollary of this is "watch out for misbehaving hosts"; host software should be prepared, not just to survive other misbehaving hosts, but also to cooperate to limit the amount of disruption such hosts can cause to the shared communication facility.

The important point in the discussion of Postel’s law is that he was talking about software behavior, not specifications. The new school of thought for Web standards is to create specifications which are “software-driven”, not “syntax-driven”. And that’s why you can read such entrenched debates about the technology.

My sanitizer version of the Postel's law would be something along:

  1. Be liberal in what you accept
  2. Be conservative in what you send
  3. Make conservative what you accepted (aka fixing it)

Basically, when you receive something broken and there is a clear path for fixing it, do it. Normalize it. In the debated version, about accepting http://////, it would be (a rough sketch follows the list):

  • parse it as http://////
  • communicate it to the next step as http://, possibly with a notification that it has been recovered.
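A minimal sketch of that sanitizing step in JavaScript (the helper is hypothetical, not from any spec):

// Collapse any run of slashes after a special scheme down to exactly two,
// and report whether recovery happened.
function sanitizeUrl(input) {
  const normalized = input.replace(/^(https?):\/+/i, '$1://');
  return { url: normalized, recovered: normalized !== input };
}

sanitizeUrl('http://////example.com');
// { url: 'http://example.com', recovered: true }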

Otsukare!

Planet Mozilla: My URL isn’t your URL

URLs

When I started the precursor to the curl project, httpget, back in 1996, I wrote my first URL parser. Back then, the universal address was still called URL: Uniform Resource Locators. That spec was published by the IETF in 1994. The term “URL” was then used as source for inspiration when naming the tool and project curl.

The term URL was later effectively changed to become URI, Uniform Resource Identifiers (published in 2005) but the basic point remained: a syntax for a string to specify a resource online and which protocol to use to get it. We claim curl accepts “URLs” as defined by this spec, the RFC 3986. I’ll explain below why it isn’t strictly true.

There was also a companion RFC posted for IRI: Internationalized Resource Identifiers. They are basically URIs but allowing non-ascii characters to be used.

The WHATWG consortium later produced their own URL spec, basically mixing formats and ideas from URIs and IRIs with a (not surprisingly) strong focus on browsers. One of their expressed goals is to “Align RFC 3986 and RFC 3987 with contemporary implementations and obsolete them in the process“. They want to go back to using the term “URL” since, as they rightfully state, the terms URI and IRI are just confusing and no humans ever really understood them (or often even knew they existed).

The WHATWG spec follows the good old browser mantra of being very liberal in what it accepts, trying to guess what the users mean and bending over backwards trying to fulfill that. (Even though we all know by now that Postel’s Law is the wrong way to go about this.) It means it’ll handle too many slashes, embedded white space, as well as non-ASCII characters.

From my point of view, the spec is also very hard to read and follow due to it not describing the syntax or format very much but focuses far too much on mandating a parsing algorithm. To test my claim: figure out what their spec says about a trailing dot after the host name in a URL.

On top of all these standards and specs, browsers offer an “address bar” (a piece of UI that often goes under other names) that allows users to enter all sorts of fun strings and they get converted over to a URL. If you enter “http://localhost/%41” in the address bar, it’ll convert the percent encoded part to an ‘A’ there for you (since 41 in hex is a capital A in ASCII) but if you type “http://localhost/A A” it’ll actually send “/A%20A” (with a percent encoded space) in the outgoing HTTP GET request. I’m mentioning this since people will often think of what you can enter there as a “URL”.

The above is basically my (skewed) perspective of what specs and standards we have so far to work with. Now we add reality and let’s take a look at what sort of problems we get when my URL isn’t your URL.

So what is a URL?

Or more specifically, how do we write them. What syntax do we use.

I think one of the biggest mistakes the WHATWG spec has made (and why you will find me argue against their spec in its current form with fierce conviction that they are wrong), is that they seem to believe that URLs are theirs to define and work with, and they limit their view of URLs to browsers, HTML and their address bars. Sure, they are the big companies behind the browsers almost everyone uses and URLs are widely used by browsers, but URLs are still much bigger than that.

The WHATWG view of a URL is not widely adopted outside of browsers.

colon-slash-slash

If we ask users, ordinary people with no particular protocol or web expertise, what a URL is, what would they answer? While it was probably more notable years ago when the browsers displayed it more prominently, the :// (colon-slash-slash) sequence will be high on the list. Seeing that marks the string as a URL.

Heck, going beyond users, there are email clients, terminal emulators, text editors, perl scripts and a bazillion other things out there in the world already that detect URLs for us and allow operations on them. It could be to open that URL in a browser, to convert it to a clickable link in generated HTML and more. A vast amount of said scripts and programs will use the colon-slash-slash sequence as a trigger.

The WHATWG spec says it has to be one slash and that a parser must accept an indefinite amount of slashes. “http:/example.com” and “http:////////////////////////////////////example.com” are both equally fine. RFC 3986 and many others would disagree. Heck, most people I’ve confronted the last few days, even people working with the web, seem to say, think and believe that a URL has two slashes. Just look closer at the google picture search screen shot at the top of this article, which shows the top images for “URL” google gave me.

We just know a URL has two slashes there (and yeah, file: URLs mostly have three but let’s ignore that for now). Not one. Not three. Two. But the WHATWG doesn’t agree.

“Is there really any reason for accepting more than two slashes for non-file: URLs?” (my annoyed question to the WHATWG)

“The fact that all browsers do.”

The spec says so because browsers have implemented the spec.

No better explanation has been provided, not even after I pointed out that the statement is wrong and far from all browsers do. You may find reading that thread educational.

In the curl project, we’ve just recently started debating how to deal with “URLs” having another amount of slashes than two, because it turns out there are servers sending back such URLs in Location: headers, and some browsers are happy to oblige. curl is not, and neither are a lot of other libraries and command line tools. Who do we stand up for?
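You can watch a WHATWG-style parser do the collapsing in any browser console that implements it. A quick sketch:

new URL('http:/example.com').href;       // "http://example.com/"
new URL('http://////example.com').href;  // "http://example.com/"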

Spaces

A space character (the ASCII code 32, 0x20 in hex) cannot be part of a URL. If you want it sent, you percent encode it like you do with any other illegal character you want to be part of the URL. Percent encoding is the byte value in hexadecimal with a percent sign in front of it. %20 thus means space. It also means that a parser that for example scans for URLs in a text knows that it reaches the end of the URL when the parser encounters a character that isn’t allowed. Like space.
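As a quick JavaScript illustration of that round trip:

encodeURIComponent(' ');    // "%20" (space is byte 0x20)
decodeURIComponent('%41');  // "A" (0x41 is a capital A in ASCII)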

Browsers typically show the address in their address bars with all %20 instances converted to space for appearance. If you copy the address there into your clipboard and then paste it again in your text editor you still normally get the spaces as %20 like you want them.

I’m not sure if that is the reason, but browsers also accept spaces as part of URLs when for example receiving a redirect in an HTTP response. That’s passed from a server to a client using a Location: header with the URL in it. The browsers happily allow spaces in that URL, encode them as %20 and send out the next request. This forced curl into accepting spaces in redirected “URLs”.

Non-ASCII

Making URLs support non-ASCII languages is of course important, especially for non-western societies and I’ve understood that the IRI spec was never good enough. I personally am far from an expert on these internationalization (i18n) issues so I just go by what I’ve heard from others. But of course users of non-latin alphabets and typing systems need to be able to write their “internet addresses” to resources and use as links as well.

In an ideal world, we would have the i18n version shown to users and there would be the encoded ASCII based version below, to get sent over the wire.

For international domain names, the name gets converted over to “punycode” so that it can be resolved using the normal system name resolvers that know nothing about non-ASCII names. URIs have no IDN names, while IRIs and WHATWG URLs do. curl supports IDN host names.

WHATWG states that URLs are specified as UTF-8 while URIs are just ASCII. curl gets confused by non-ASCII letters in the path part but percent encodes such byte values in the outgoing requests, which causes “interesting” side-effects when the non-ASCII characters are provided in encodings other than UTF-8, which for example is standard on Windows…

Similar to what I’ve written above, this leads to servers passing back non-ASCII byte codes in HTTP headers that browsers gladly accept, and non-browsers need to deal with…

No URL standard

I’ve not tried to write a conclusive list of problems or differences, just a bunch of things I’ve fallen over recently. A “URL” given in one place is certainly not certain to be accepted or understood as a “URL” in another place.

Not even curl follows any published spec very closely these days, as we’re slowly digressing for the sake of “web compatibility”.

There’s no unified URL standard and there’s no work in progress towards that. I don’t count WHATWG’s spec as a real effort either, as it is written by a closed group with no real attempts to get the wider community involved.

My affiliation

I’m employed by Mozilla and Mozilla is a member of WHATWG and I have colleagues working on the WHATWG URL spec and other work items of theirs but it makes absolutely no difference to what I’ve written here. I also participate in the IETF and I consider myself friends with authors of RFC 1738, RFC 3986 and others but that doesn’t matter here either. My opinions are my own and this is my personal blog.

Planet Mozilla: [worklog] Golden Week working at half speed.

Golden Week is the highlight of Japanese holidays (1 week), which is always a bit silly for me (being of French origin). No matter how long I have lived in North America or Japan, I still think something is wrong with 1 week or even 10 days of holidays (compared to the current 5 weeks in France). Tune of the week: Izumi Yukimura: Ain’t Cha-Cha Coming Out T-tonight.

Webcompat Life

Progress this week:

Today: 2016-05-09T10:15:31.053288
366 open issues
----------------------
needsinfo       4
needsdiagnosis  109
needscontact    32
contactready    87
sitewait        130
----------------------

You are welcome to participate.

London agenda.

Webcompat issues

(a selection of some of the bugs worked on this week).

Webcompat development

Reading List

Follow Your Nose

TODO

  • Document how to write tests on webcompat.com using test fixtures.
  • ToWrite: rounding numbers in CSS for width
  • ToWrite: Amazon prefetching resources with <object> for Firefox only.

Otsukare!

Anne van Kesteren: HTML components

Hayato left a rather flattering review comment to my pull request for integrating shadow tree event dispatch into the DOM Standard. It made me reflect upon all the effort that came before us with regards to adding components to DOM and HTML. It has been a nearly two-decade journey to get to a point where all browsers are willing to implement, and then ship. It is not quite a glacial pace, but you can see why folks say that about standards.

What I think was the first proposal was simply titled HTML Components, better known as HTC, a technology by the Microsoft Internet Explorer team. Then in 2000, published in early 2001, came XBL, a technology developed at Netscape by Dave Hyatt (now at Apple). In some form that variant of XBL still lives on in Firefox today, although at this point it is considered technical debt.

In 2004 we got sXBL and in 2006 XBL 2.0, the latter largely driven by Ian Hickson with design input from Dave Hyatt. sXBL had various design disputes that could not be resolved among the participants. Selectors versus XPath was a big one. Though even with XBL 2.0 the lesson that namespaces are an unnecessary evil for rather tightly coupled languages was not yet learned. A late half-hearted revision of XBL 2.0 did drop most of the XML aspects, but by that time interest had waned.

There was another multi-year gap and then from 2011 onwards the Google Chrome team put effort into a different, more API-y approach towards HTML components. This was rather contentious initially, but after recent compromises with regards to encapsulation, constructors for custom elements, and moving from selectors to an even more simplistic model (basically strings), this seems to be the winning formula. A lot of it is now part of the DOM Standard and we also started updating the HTML Standard to account for shadow trees, e.g., making sure script elements execute.
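As a minimal sketch of the API shape that won out (the element name here is invented):

// Register a custom element whose internals live in an encapsulated shadow tree.
customElements.define('my-greeting', class extends HTMLElement {
  constructor() {
    super();
    this.attachShadow({ mode: 'open' }).textContent = 'Hello!';
  }
});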

Hopefully implementations follow soon and then widespread usage to cement it for a long time to come.

Anne van Kesteren: Network effects affecting standards

In the context of another WebKit issue around URL parsing, Alexey pointed me to WebKit bug 116887. It handily demonstrates why the web needs the URL Standard. The server reacts differently to %7B than it does to {. It expects the latter, despite the IETF STD not even mentioning that code point.

Partly to blame here are the browsers. In the early days code shipped without much quality assurance and many features got added in a short period of time. While standards evolved there was not much of a feedback loop going on with the browsers. There was no organized testing effort either, so the mismatch grew.

On the other side, you have the standards folks ignoring the browsers. While they did not necessarily partake in the standards debate back then that much, browsers have had an enormous influence on the web. They are the single most important piece of client software out there. They are even kind of a meta-client. They are used to fetch clients that in turn talk to the internet. As an example, the Firefox meta-client can be used to get and use the FastMail email client.

And this kind of dominance means that it does not matter much what standards say, it matters what the most-used clients ship. Since when you are putting together some server software, and have deadlines to make, you typically do not start with reading standards. You figure out what bits you get from the client and operate on that. And that is typically rather simplistic. You would not use a URL parser, but rather various kinds of string manipulation. Dare I say, regular expressions.
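A hypothetical sketch of that kind of server-side string matching (everything here is invented for illustration):

// Naive check that expects a literal "{" in the query string.
// A client that sends the standards-correct %7B gets rejected,
// so the pressure lands on clients to send "{" unencoded.
function handleQuery(rawQuery) {
  return rawQuery.includes('{') ? 'ok' : 'error: malformed query';
}

handleQuery('q={user}');       // "ok"
handleQuery('q=%7Buser%7D');   // "error: malformed query"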

This might ring some bells. If it did, that is because this story also applies to HTML parsing, text encodings, cookies, and basically anything that browsers have deployed at scale and developers have made use of. This is why standards are hardly ever finished. Most of them require decades of iteration to get the details right, but as you know that does not mean you cannot start using any of it right now. And getting the details right is important. We need interoperable URL parsing for security, for developers to build upon them without tons of cross-browser workarounds, and to elevate the overall abstraction level at which engineering needs to happen.

Jeremy KeithConversational interfaces

<figure>

Psst… Jeremy! Right now you’re getting notified every time something is posted to Slack. That’s great at first, but now that activity is increasing you’ll probably prefer dialing that down.

<figcaption> —Slackbot, 2015 </figcaption> </figure>

<figure>

What’s happening?

<figcaption> —Twitter, 2009 </figcaption> </figure>

<figure>

Why does everyone always look at me? I know I’m a chalkboard and that’s my job, I just wish people would ask before staring at me. Sometimes I don’t have anything to say.

<figcaption> —Existentialist chalkboard, 2007 </figcaption> </figure>

<figure>

I’m Little MOO - the bit of software that will be managing your order with us. It will shortly be sent to Big MOO, our print machine who will print it for you in the next few days. I’ll let you know when it’s done and on its way to you.

<figcaption> —Little MOO, 2006 </figcaption> </figure>

<figure>

It looks like you’re writing a letter.

<figcaption> —Clippy, 1997 </figcaption> </figure>

<figure>

Your quest is to find the Warlock’s treasure, hidden deep within a dungeon populated with a multitude of terrifying monsters. You will need courage, determination and a fair amount of luck if you are to survive all the traps and battles, and reach your goal — the innermost chambers of the Warlock’s domain.

<figcaption> —The Warlock Of Firetop Mountain, 1982 </figcaption> </figure>

<figure>

Welcome to Adventure!! Would you like instructions?

<figcaption> —Colossal Cave, 1976 </figcaption> </figure>

<figure>

I am a lead pencil—the ordinary wooden pencil familiar to all boys and girls and adults who can read and write.

<figcaption> —I, Pencil, 1958 </figcaption> </figure>

<figure>

ÆLFRED MECH HET GEWYRCAN
Ælfred ordered me to be made

<figcaption> —The Ælfred Jewel, ~880 (Ashmolean Museum, Oxford) </figcaption> </figure>

Technical note

I have marked up the protagonist of each conversation using the cite element. There is a long-running dispute over the use of this element. In HTML 4.01 it was perfectly fine to use cite to mark up a person being quoted. In the HTML Living Standard, usage has been narrowed:

The cite element represents the title of a work (e.g. a book, a paper, an essay, a poem, a score, a song, a script, a film, a TV show, a game, a sculpture, a painting, a theatre production, a play, an opera, a musical, an exhibition, a legal case report, a computer program, etc). This can be a work that is being quoted or referenced in detail (i.e. a citation), or it can just be a work that is mentioned in passing.

A person’s name is not the title of a work — even if people call that person a piece of work — and the element must therefore not be used to mark up people’s names.

I disagree.

In the examples above, it’s pretty clear that I, Pencil and Warlock Of Firetop Mountain are valid use cases for the cite element according to the HTML5 definition; they are titles of works. But what about Clippy or Little Moo or Slackbot? They’re not people …but they’re not exactly titles of works either.
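For reference, the markup pattern in question boils down to something like this (abbreviated):

<figure>
  <blockquote>It looks like you’re writing a letter.</blockquote>
  <figcaption> —<cite>Clippy</cite>, 1997 </figcaption>
</figure>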

If I were to mark up a dialogue between Eliza and a human being, should I only mark up Eliza’s remarks with cite? In text transcripts of conversations with Alexa, Siri, or Cortana, should only their side of the conversation get attributed as a source? Or should their names also be written without the cite element because it must not be used to mark up people’s names …even though they are not people by any conventional definition?

It’s downright botist.

Planet MozillaMaking asm.js/WebAssembly compilation more parallel in Firefox

In December 2015, I worked on reducing the startup time of asm.js programs in Firefox by making compilation more parallel. As our JavaScript engine, SpiderMonkey, uses the same compilation pipeline for both asm.js and WebAssembly, this also benefited WebAssembly compilation. Now is a good time to talk about what that meant, how it was achieved, and what the next ideas are for making it even faster.

What does it mean to make a program "more parallel"?

Parallelization consists of splitting a sequential program into smaller independent tasks, then having them run on different CPU cores. If your program uses N cores, it can be up to N times faster.

Well, in theory. Let's say you're in a car, driving on a 100 km long road. You've already driven the first 50 km in one hour. Let's say your car can travel at unlimited speed from now on. What is the maximum average speed you can reach by the time you get to the end of the road?

People intuitively answer something like: "it can go as fast as I want, so something close to light speed sounds plausible". But this is not true! In fact, even if you could teleport from your current position to the end of the road, you'd still have traveled 100 km in one hour, so your maximum theoretical average speed is 100 km per hour. This result is a consequence of Amdahl's law. Applied to our initial problem, this means you can expect an N times speedup when running your program on N cores if, and only if, your program can be entirely run in parallel. This is usually not the case, which is why most wording refers to speedups of up to N times when it comes to parallelization.
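Stated as a formula (this is the standard form of Amdahl's law, not something from the original analogy): if a fraction p of the work can be parallelized across N cores, the overall speedup is

S(N) = 1 / ((1 - p) + p/N), which tends to 1 / (1 - p) as N grows.

In the car analogy, p = 1/2 (the second half of the road), so even at unlimited speed the overall speedup caps at 2: twice the 50 km/h average, i.e. 100 km/h.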

Now, say your program is already running some portions in parallel. To make it faster, one can identify parts of the program that are sequential and make them independent, so that they too can run in parallel. With respect to our car metaphor, this means lengthening the portion of the road on which you can drive at unlimited speed.

This is exactly what we have done with parallel compilation of asm.js programs under Firefox.

A quick look at the asm.js compilation pipeline

I recommend reading this blog post. It clearly explains the differences between JIT (Just In Time) and AOT (Ahead Of Time) compilation, and elaborates on the different parts of the engines involved in the compilation pipeline.

As a TL;DR, keep in mind that asm.js is a strictly validated, highly optimizable, typed subset of JavaScript. Once validated, it guarantees high performance and stability (no garbage collector involved!). That is ensured by mapping every single JavaScript instruction of this subset to a few CPU instructions, if not just a single one. This means an asm.js program needs to get compiled to machine code, that is, translated from JavaScript to the language your CPU directly manipulates (like what GCC would do for a C++ program). If you haven't heard, the results are impressive and you can run video games directly in your browser, without needing to install anything. No plugins. Nothing more than your usual, everyday browser.

Because asm.js programs can be gigantic in size (in number of functions as well as in number of lines of code), the first compilation of the entire program is going to take some time. Afterwards, Firefox uses a caching mechanism that prevents the need for recompilation and loads the code almost instantaneously, so subsequent loads matter less*. The end user will mostly wait for the first compilation, so that one needs to be fast.

Before the work explained below, the pipeline for compiling a single function (out of an asm.js module) would look like this:

  • parse the function and, as we parse, emit intermediate representation (IR) nodes for the compiler infrastructure. SpiderMonkey has several IRs, including the MIR (middle-level IR, mostly carrying semantic information) and the LIR (low-level IR, closer to the CPU memory representation: registers, stack, etc.). The one generated here is the MIR. All of this happens on the main thread.
  • once the entire IR graph is generated for the function, optimize the MIR graph (i.e. apply a few optimization passes). Then generate the LIR graph before carrying out register allocation (probably the most costly task of the pipeline). This can be done on supplementary helper threads, as the MIR optimization and LIR generation for a given function don't depend on other functions.
  • since functions within an asm.js module can call one another, they need references to each other. In assembly, a reference is merely an offset to somewhere else in memory. In this initial implementation, code generation is carried out on the main thread, at the cost of speed but for the sake of simplicity.

So far, only the MIR optimization passes, register allocation and LIR generation were done in parallel. Wouldn't it be nice to be able to do more?

* There are conditions for benefitting from the caching mechanism. In particular, the script should be loaded asynchronously and it should be of significant size.

Doing more in parallel

Our goal is to do more work in parallel: can we take MIR generation off the main thread? And can we take code generation off it as well?

The answer happens to be yes to both questions.

For the former, instead of emitting a MIR graph as we parse the function's body, we emit a small, compact, pre-order representation of the function's body. In short, a new IR. As work was starting on WebAssembly (wasm) at this time, and since asm.js semantics and wasm semantics mostly match, the IR could just be the wasm encoding, consisting of the wasm opcodes plus a few specific asm.js ones*. Then, wasm is translated to MIR in another thread.

Now, instead of parsing and generating MIR in a single pass, we parse and generate wasm IR in one pass, and generate the MIR out of the wasm IR in another. The wasm IR is very compact and much cheaper to generate than a full MIR graph, because generating a MIR graph requires some algorithmic work, including the creation of Phi nodes (join values after any form of branching). As a result, compilation time is not expected to suffer. This was a large refactoring: taking every single asm.js instruction, encoding it in a compact way, and later decoding it into the equivalent MIR nodes.

For the second part, could we generate code on other threads? One structure in the code base, the MacroAssembler, is used to generate all the code and it contains all necessary metadata about offsets. By adding more metadata there to abstract internal calls **, we can describe the new scheme in terms of a classic functional map/reduce:

  • the wasm IR is sent to a thread, which will return a MacroAssembler. That is a map operation, transforming an array of wasm IR into an array of MacroAssemblers.
  • When a thread is done compiling, we merge its MacroAssembler into one big MacroAssembler. Most of the merge consists of taking all the offset metadata in the thread's MacroAssembler, fixing up all the offsets, and concatenating the two generated code buffers. This is equivalent to a reduce operation, merging each MacroAssembler into the module's one.

At the end of the compilation of the entire module, there is still some light work to be done: offsets of internal calls need to be translated to their actual locations. All this work has been done in this bugzilla bug.
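Schematically, the scheme has the shape below (a JavaScript sketch of the idea only; the real implementation is C++ inside SpiderMonkey, and every name here is made up):

// map: compile each function's wasm IR independently (this is the part
// that can run on helper threads), yielding one small per-function
// result holding generated code plus offset metadata.
function compileModule(functionIRs, compileOne) {
  const parts = functionIRs.map(compileOne);

  // reduce: merge each per-function result into the module-wide one by
  // concatenating code buffers and shifting recorded call offsets.
  return parts.reduce((module, part) => {
    const base = module.code.length;
    for (const call of part.internalCalls) {
      module.internalCalls.push({ callee: call.callee, offset: base + call.offset });
    }
    module.code = module.code.concat(part.code);
    return module;
  }, { code: [], internalCalls: [] });
}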

* In fact, at the time when this was being done, we used a different superset of wasm. Since then, work has been done so that our asm.js frontend is really just another wasm emitter.

** referencing functions by their appearance order index in the module, rather than an offset to the actual start of the function. This order is indeed stable from one function to the next.

Results

Benchmarking has been done on a Linux x64 machine with 8 cores clocked at 4.2 GHz.

First, compilation times of a few massive asm.js games:

The X axis is the compilation time in seconds, so lower is better. Each value point is the best one of three runs. For the new scheme, the corresponding relative speedup (in percentage) has been added:

Compilation times of various benchmarks

For all games, compilation is much faster with the new parallelization scheme.

Now, let's go a bit deeper. The Linux CLI tool perf has a stat command that gives you the average number of utilized CPUs during the program's execution. This is a great measure of threading efficiency: the more a CPU is utilized, the less it sits idle waiting for other results to come in, and thus the more useful it is. For a constant amount of work, the more CPUs are utilized, the more quickly the program will execute.

The X axis is the number of utilized CPUs, according to the perf stat command, so higher is better. Again, each value point is the best one of three runs.

CPU utilized on DeadTrigger2

With the older scheme, the number of utilized CPUs rises quickly from 1 to 4 cores, then more slowly from 5 cores and beyond. Intuitively, this means that with 8 cores, we almost reached the theoretical limit of the portion of the program that can be made parallel (not considering the overhead introduced by parallelization or by altering the scheme).

But with the newer scheme, we get much more CPU usage even after 6 cores! Then it slows down a bit, although the rise is still more significant than the slow rise of the older scheme. So it is likely that with even more threads, we could get even better speedups than the ones mentioned beforehand. In fact, we have moved the theoretical limit mentioned above a bit further: we have expanded the portion of the program that can be made parallel. Or, to keep on using the initial car/road metaphor, we've shortened the constant-speed portion of the road to the benefit of the unlimited-speed portion, resulting in a shorter trip overall.

Future steps

Despite these improvements, compilation time can still be a pain, especially on mobile. This is mostly due to the fact that we're running a whole multi-million-line codebase through the backend of a compiler to generate optimized code. Following this work, the next bottleneck during the compilation process is parsing, which matters for asm.js in particular, whose source is plain text. Decoding WebAssembly is an order of magnitude faster, though, and it can be made even faster. Moreover, we have even more load-time optimizations coming down the pipeline!

In the meantime, we keep on improving the WebAssembly backend. Keep track of our progress on bug 1188259!

Planet MozillaBeer and Tell – April 2016

Once a month, web developers from across the Mozilla Project get together to talk about our side projects and drink, an occurrence we like to call “Beer and Tell”.

There’s a wiki page available with a list of the presenters, as well as links to their presentation materials. There’s also a recording available courtesy of Air Mozilla.

emceeaich: Memory Landscapes

First up was emceeaich, who shared Memory Landscapes, a visual memoir of the life and career of artist and photographer Laurie Toby Edison. The project is presented as a non-linear collection of photographs, in contrast to the traditionally linear format of memoirs. The feature that emceeaich demoed was “Going to Brooklyn”, which gives any link a 1/5 chance of showing shadow pictures briefly before moving on to the linked photo.

lorchard: DIY Keyboard Prototype

Next was lorchard, who talked about the process of making a DIY keyboard using web-based tools. He used keyboard-layout-editor.com to generate a layout serialized in JSON, and then used Plate & Case Builder to generate a CAD file for use with a laser cutter.

A flickr album is available with photos of the process.

lorchard: Jupyter Notebooks in Space

lorchard also shared eve-market-fun, a Node.js-based service that pulls data from the EVE Online API and pre-digests useful information about it. He then uses a Jupyter notebook to pull data from the API and analyze it to guide his market activities in the game. Neat!

Pomax: React Circle-Tree Visualizer

Pomax was up next with a new React component: react-circletree! It depicts a tree structure using segmented concentric circles. The component is very configurable and can be styled with CSS, as it is generated via SVG. While built as a side project, the component can be seen in use on the Web Literacy Framework website.

Pomax: HTML5 Mahjong

Also presented by Pomax was an HTML5 multiplayer Mahjong game. It allows four players to play the classic Chinese game online by using socket.io and a Node.js server to connect the players. The frontend is built using React and Webpack.

groovecoder and John Dungan: Codesy

Last up were groovecoder and John Dungan, who shared codesy, an open-source startup addressing the problem of compensation for fixing bugs in open-source software. They provide a browser extension that allows users to bid on bugs as well as name their price for fixing a bug. Users may then provide proof that they fixed a bug, and once it is approved by the bidders, they receive a payout.


If you’re interested in attending the next Beer and Tell, sign up for the dev-webdev@lists.mozilla.org mailing list. An email is sent out a week beforehand with connection details. You could even add yourself to the wiki and show off your side-project!

See you next month!

Planet MozillaLooking at summary details in HTML5

On the dev-platform mailing-list, Ting-Yu Lin has sent an Intent to Ship: HTML5 <details> and <summary> tags. So what about it?

The HTML 5.1 specification describes details as:

The details element represents a disclosure widget from which the user can obtain additional information or controls.

which is not that clear; luckily the specification has some examples. I put one on codepen (you need Firefox Nightly at this time, or Chrome/Opera, or the Safari dev edition to see it). At least the rendering seems to be pretty much the same.
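The example boils down to markup along these lines (a minimal version, not the exact codepen):

<details>
  <summary>Automated Status: Operational</summary>
  <p>Velocity: 12m/s</p>
  <p>Direction: North</p>
</details>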

But as usual, the evil is in the details (pun not intended at first). In case the developer would want to hide the triangle, the possibilities are for now not interoperable. Think possible Web compatibility issues here. I created another codepen for testing the different scenarios.

In Blink/WebKit world:

summary::-webkit-details-marker { display: none; } 

In Gecko world:

summary::-moz-list-bullet { list-style-type: none;  }

or

summary { display: block; } 

These work, though summary { display: block; } is a recipe for catastrophe.

Then on the thread there was the proposal of

summary { list-style-type: none; }

which does indeed hide the arrow in Gecko, but doesn't do anything whatsoever in Blink and WebKit. So it's not really a reliable solution from a Web compatibility point of view.

Then, as usual, I like to look at what people do on GitHub for their projects. So here is a collection of snippets showing how -webkit-details-marker gets used:

details summary::-webkit-details-marker { display:none; }

/* to change the pointer on hover */
details summary { cursor: pointer; }

/* to style the arrow widget on opening and closing */
details[open] summary::-webkit-details-marker {
  color: #00F;
  background: #0FF;
}

/* to replace the marker with an image */
details summary::-webkit-details-marker:after {
  content: url('file.png');
}

/* using content this time for a unicode character */
summary::-webkit-details-marker { display: none; }
details summary::before { content: "►"; }
details[open] summary::before { content: "▼"; }

JavaScript

On JavaScript side, it seems there is a popular shim used by a lot of people: details.js
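Such shims typically start from a feature detection along these lines (a minimal sketch, not necessarily details.js's exact code):

// Detect native <details> support: browsers that implement the element
// reflect its open attribute as a DOM property.
function supportsDetails() {
  return "open" in document.createElement("details");
}

if (!supportsDetails()) {
  // wire up a script-driven disclosure widget instead
}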

More reading

Otsukare!

WHATWG blogAdding JavaScript modules to the web platform

One thing we’ve been meaning to do more of is tell our blog readers more about new features we’ve been working on across WHATWG standards. We have quite a backlog of exciting things that have happened, and I’ve been nominated to start off by telling you the story of <script type="module">.

JavaScript modules have a long history. They were originally slated to be finalized in early 2015 (as part of the “ES2015” revision of the JavaScript specification), but as the deadline drew closer, it became clear that although the syntax was ready, the semantics of how modules load each other were still up in the air. This is a hard problem anyway, as it involves extensive integration between the JavaScript engine and its “host environment”—which could be either a web browser, or something else, like Node.js.

The compromise that was reached was to have the JavaScript specification specify the syntax of modules, but without any way to actually run them. The host environment, via a hook called HostResolveImportedModule, would be responsible for resolving module specifiers (the "x" in import x from "x") into module instances, by executing the modules and fetching their dependencies. And so a year went by with JavaScript modules not being truly implementable in web browsers: their syntax was specified, but their semantics were not yet.

In the epic whatwg/html#433 pull request, we worked on specifying these missing semantics. This involved a lot of deep changes to the script execution pipeline, to better integrate with the modern JavaScript spec. The WHATWG community had to discuss subtle issues like how cross-origin module scripts were fetched, or how/whether the async, defer, and charset attributes applied. The end result can be seen in a number of places in the HTML Standard, most notably in the definition of the script element and the scripting processing model sections. At the request of the Edge team, we also added support for worker modules, which you can see in the section on creating workers. (This soon made it over to the service workers spec as well!) To wrap things up, we included some examples: a couple for <script type="module">, and one for module workers.
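If you have not seen the feature in action yet, usage looks like this (the file names here are made up):

<!-- A module script: deferred by default, strict mode, import/export allowed. -->
<script type="module">
  import { render } from "./app.js";
  render(document.body);
</script>

<script>
  // Worker modules follow the same idea, per the worker-creation section:
  new Worker("worker.js", { type: "module" });
</script>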

Of course, specifying a feature is not the end; it also needs to be implemented! Right now there is active implementation work happening in all four major rendering engines, which (for the open source engines) you can follow in these bugs:

And there's more work to do on the spec side, too! There's ongoing discussion of how to add more advanced dynamic module-loading APIs, from something simple like a promise-returning self.importModule, all the way up to the experimental ideas being prototyped in the whatwg/loader repository.

We hope you find the addition of JavaScript modules to the HTML Standard as exciting as we do. And we'll be back to tell you more about other recent important changes to the world of WHATWG standards soon!

Planet Mozilla[worklog] Daruma, draw one eye, wish for the rest

Daruma is a little doll on which you draw one eye with a wish in mind, and then draw the second eye once the wish has been realized. This week Tokyo Metro and Yahoo! Japan fixed their markup. Tune of the week: Pretty Eyed Baby - Eri Chiemi.

Webcompat Life

Progress this week:

Today: 2016-04-11T07:29:54.738014
368 open issues
----------------------
needsinfo       3
needsdiagnosis  124
needscontact    32
contactready    94
sitewait        116
----------------------

You are welcome to participate

The feed of otsukare (this blog) doesn't have an updated element. That was bothering me too. There was an open issue about it on the Pelican issues tracker. Let's propose a pull request. It was accepted after a couple of rounds of back and forth.

We had a team meeting this week.

Understanding Web compatibility is hard. It doesn't mean the same exact thing for everyone. We probably need to better define for others what it means. Maybe the success stories could help, with concrete examples delineating the perimeter of what a Web compat issue is.

Looking at who is opening issues on WebCompat.com I was pleasantly surprised by the results.

Webcompat issues

(a selection of some of the bugs worked on this week).

Reading List

  • Sharing my volunteer onboarding experience at webcompat.com
  • Browsers, Innovators Dilemma, and Project Tofino
  • css-usage. "This script is used within our Bing and Interop crawlers to determine the properties used on a page and generalized values that could have been used. http://data.microsoftedge.com "
  • Apple is bad news for the future of the Web. Interesting read, though I would do this substitution: s/Apple/Market Share Gorilla/. Currently something very similar is happening in some ways with Chrome on the desktop market, just the other way around: implementing fancy APIs that Web developers rush to use on their sites, creating Web compatibility issues. The issue is not so much about implementing late or implementing early. The real issue is market share dominance, which warps the way people think about the technology and in the end makes it difficult for other players to even exist in the market. I have seen it with Opera on desktop, and I have seen it with Firefox on mobile. And bear with me: it was Microsoft (desktop) in the past, it is Google (desktop) and Apple (mobile) now, and it will be another company in the future, whichever dominates the market share.
  • The veil of ignorance

Follow Your Nose

TODO

  • Document how to write tests on webcompat.com using test fixtures.
  • ToWrite: rounding numbers in CSS for width
  • ToWrite: Amazon prefetching resources with <object> for Firefox only.

Otsukare!
