What’s next?

It can be frustrating seeing this Community Group go quiet, especially following a surge in traffic that very literally brought the Community Group site down. We have to understand (and I’m speaking in no small part to myself, here) that most standards folks have a massive backlog of issues and suggestions to work through. With that in mind: even as public-facing as this Community Group is, it’s a lot to ask of members of the standards bodies or browser teams to parse through several pages of prose and leaf through comments for the meat of this thing.

For that reason I’ll be making a first pass at a draft spec myself. Not with any intent to see it codified word for word, mind you, but to break down the core details of this new responsive images element in the most efficient and easily parsed way. All signal, no noise. This will be a big departure for me, since—as many of you know—I’m especially noise inclined.

We’ve done a lot of work reasoning this thing out and adapting it to community feedback to get it to this point: a point where it feels like it may be ready for sample implementation. I find this particularly exciting—like many of you, I fell into this website-making gig because it afforded me a chance to solve new and exciting problems. There’s not much I enjoy more than seeing a solution to a tricky issue come to life, and I’m certain that’s a sentiment shared by many browser developers.

If you’re a similarly-inclined browser developer, I’d love to work with you on a sample implementation of the <picture> element. I wouldn’t expect anyone to barrel ahead with this based on the content of this group or an email exchange with me, but rather once we have a pared-down and easily-distributed example spec to reference along the way. Of course, it goes without saying that I’m more than happy to work alongside vendors in whatever capacity I can.

If this is something you think that you or your team might be interested in exploring, I’d love to hear from you while the draft spec is coming together.

29 Responses to What’s next?

  1. Bruce Lawson says:

    Might be good to keep an eye on the www-style list. Edward O’Connor (@hober) of Apple proposed a new CSS4 property for allowing a UA to choose between several images for a background image (based on bandwidth etc) and said that the syntax could be re-used as an attribute on the HTML <img> element: http://lists.w3.org/Archives/Public/www-style/2012Feb/1103.html

    • Anselm Hannemann says:

      This is a really good point and I really would love to adapt this style for HTML so developers can write at least similar code.

      The proposed solution:

      is not really good imo. I’d prefer mine over that if we could use img-tag but this seems to be impossible due to backward-compatibility.
      I don’t like the notation of the resources. This is not very intuitive to developers and much more complicated than notation with attributes and namespaces.

  2. This is great news. I think a good step toward collaboration would be to put together a wiki page detailing the problem, current solutions + limitations, prior art, and proposed solution: essentially a cleaned-up version of the Etherpad doc. This is what the HTML5 team did before getting to the draft-writing phase; for example, check out this awesome WHATWG wiki page on the canvas tag.

    Taking this approach is rad because wiki pages give everyone a chance to pitch in. They also let us capture all the ideas before trimming them down to a spec, plus the end result is a document you can understand in one reading (something we don’t get from community discussion). I think that’s what we want to end up with in order for standards folks to be able to see where we’re coming from.

    I’ve taken the liberty of starting an outline in our community wiki. Care to join? http://www.w3.org/community/respimg/wiki/Main_Page

  3. Kevin Suttle says:

    This presentation from Yiibu breaks the limitations and swings/misses down even further. It’s much more complicated than at first glance.

    http://www.slideshare.net/yiibu/muddling-through-the-mobile-web/

  4. The good news is nothing can be a standard without the support of the community behind it.

    Go go gadget responsive images draft spec.

  5. Matt Wilcox says:

    I’ve been quiet because the things I feel are still core questions have not been answered, and I’ve yet to come up with any viable solutions to those concerns myself. I am concerned about the picture syntax on a number of levels:

    1) Using the syntax as stands is horrendously wasteful because for every picture you replicate exactly the same media tests. How many wasted bytes is that, and why are we happy to let it slide just because this is HTML and not JS, PHP, or CSS (where it wouldn’t)?

    2) Embedding the test into the HTML element itself is a bad choice because it is fundamentally future un-friendly. That mark-up will be horrid to manage in any re-design, especially if it’s got into a database of content. Which, let’s be honest, is a very common thing to happen.

    3) We’re using CSS media queries as our selector basis whilst knowing full well that screen size isn’t the only factor we should be looking at responding to; we actually care about bandwidth and latency as much as screen size. JS has APIs being designed to address this, so why aren’t we looking at that?

    Right now, forging ahead with picture feels like rushing to the first viable solution without looking properly into its flaws and implications in a wider context.

    With responsive designs all three technology stacks are responding to the same test cases – but expressed individually in three different languages. This is un-economical. Why aren’t we thinking about defining some better way to sense these things and react, instead of grafting things from other technology stacks directly into mark-up tags?

    More of my thinking about the overall impact on our proposed solutions and general approach to responsive design problems can be found here: http://mattwilcox.net/archive/entry/id/1081/

    • Mathew Marquis says:

      In my opinion, the concerns with wasteful markup and DRYness are made up for by the fact that it’s consistent with the video tag, both in terms of people picking it up and using it and in terms of getting the standards folks to acknowledge it. Through video sources’ `media` attribute, the standards bodies have effectively voiced that this is the way forward. It’s absolutely a matter of wasted bytes—but do the benefits of picking that fight outweigh the cost, here? “We want an element that better handles images, _and_ we want video changed accordingly.” If we had something better in mind—absolutely it could be an issue worth raising. But for now, we’d be saying “we don’t know how to solve this, but we know you’re doing it wrong.” There’s no endgame to that, and it certainly doesn’t get anything accomplished. We’d just be a group of frustrated developers picking a fight on the internet, crying “this isn’t good enough, and we have no suggestions.”

      To your third point: for now, screen size is a fair indicator of the largest possible image we could need on a page—and again, this is a realm of improvement that has every bit as much impact on <video> as it would <picture>. Would it be helpful to have a way to selectively deliver images based on bandwidth, or a combination of bandwidth and existing media queries? _Hell_ yes. Imagine being able to target high-density screens _and_ WiFi or better for high-res images, falling back to a standard resolution image on 3G.

      Fortunately, there has already been a lot of chatter around extending media queries to include bandwidth—because, again, this would put <picture> squarely in the same realm of existing problems as <video> and images served through CSS. <picture> doesn’t solve everything—it solves the problem of appropriately delivering content images, to the extent that such a thing is possible at present. And we don’t need to solve everything in a vacuum, right here and now. Once we have an element that’s in-line with current standards—we can move on to solving overarching problems, like bandwidth-based delivery. We’re not locked in to screen size—if we could be considered “locked into” anything in a moving target like HTML, it’s that we’re banking on the future viability of media queries.

      This isn’t a perfect solution. Thing is: one would be incredibly hard-pressed to find anything, in any spec, that’s absolutely ideal for all situations. What this gives us is a starting point; this gives us a solution to our fundamental problems with image delivery in a way that adheres to the currently specced standards. If a better method for selectively delivering sources for content media should present itself one day, both <picture> and <video> would surely go in that direction. If <video> should move away from markup-based sources and media attributes, <picture> would be right there with it. This new element would be a consideration in future changes and, more importantly, would adapt with them.

      The alternative is that, for the sake of not having the ideal backdrop for it, we abandon hope of solving the core problem we’ve set out to solve.

      • Matt Wilcox says:

        I don’t think it’s an either-or type of thing. I agree completely that the current picture syntax is brilliant in that it’s consistent with other syntaxes – that’s a huge positive point. It doesn’t, however, change the fact that as nice as that is it’s also horribly repetitive and wasteful – which isn’t a criticism of picture on its own, but a criticism of the entire approach to responsive elements at the moment. They’re all siloed into their own technology stack and all about individual items responding as though they are likely to have unique breakpoints. They aren’t. Some might, but the majority likely won’t.

        Why can’t we come up with something that all technology stacks can share (they all do the same test, so let’s test once per page load, not once per responsive item) and for which the current syntax becomes an over-ride when needed? Off the top of my head, I’d like to define something that sits in the HTML head, runs tests, and is a template for all responsive elements that follow in the body. We could keep the current syntax for over-rides of that templated behaviour.

        <head>
          <respond>
            <breakpoint case="one" media="(min-width: 200px)" />
            <breakpoint case="two" media="(min-width: 400px)" />
            <breakpoint case="three" media="(min-width: 600px)" />
          </respond>
        </head>

        <body>
          <picture alt="An alt tag for the image">
            <source src="mobile.jpg" match="one" />
            <source src="medium-res.jpg" match="two" />
            <source src="high-res.jpg" match="three" />
            <source src="high-res.jpg" media="(min-width: 800px)" /><!-- special case not templated -->
            <img src="mobile.jpg" /><!-- fallback -->
          </picture>
        </body>

        I’m glad to hear media queries are going to get better sensors – that effectively solves that issue for me.

        • Matt Wilcox says:

          I should point out, the other nice thing about taking this sort of approach is the CSS and the JS could all be made to inspect the breakpoints set in HEAD. Meaning breakpoints are shared through the entire technology stack instead of being individually assigned. Also, doing so would not stop using existing methods to over-ride or expand on them.
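
          As a rough illustration of the idea above, here is a minimal JavaScript sketch of how a script might resolve the shared breakpoints once and then match each source against the active case. Nothing here exists in any browser or spec: the `breakpoints` and `sources` arrays, the `activeCase` and `pickSource` helpers, the `extra-large.jpg` file name, and the "last matching rule wins" behaviour are all assumptions made for the example.

          ```javascript
          // Hypothetical model of the <respond> block from the comment above.
          // Assumed to be listed in ascending min-width order.
          const breakpoints = [
            { name: "one", minWidth: 200 },
            { name: "two", minWidth: 400 },
            { name: "three", minWidth: 600 },
          ];

          // Hypothetical model of the <source> children of one <picture>.
          // `match` refers to a shared case; `minWidth` stands in for an inline
          // media attribute that overrides the shared template.
          const sources = [
            { src: "mobile.jpg", match: "one" },
            { src: "medium-res.jpg", match: "two" },
            { src: "high-res.jpg", match: "three" },
            { src: "extra-large.jpg", minWidth: 800 },
          ];

          // Run the shared tests once per page load: the active case is the
          // last breakpoint whose min-width the viewport satisfies.
          function activeCase(viewportWidth, cases) {
            let active = null;
            for (const c of cases) {
              if (viewportWidth >= c.minWidth) active = c.name;
            }
            return active;
          }

          // Resolve one element's sources against the already-computed case.
          // The last source that matches wins (an assumption of this sketch).
          function pickSource(sourceList, viewportWidth, currentCase, fallback) {
            let chosen = fallback;
            for (const s of sourceList) {
              const matchesTemplate = s.match !== undefined && s.match === currentCase;
              const matchesInline = s.minWidth !== undefined && viewportWidth >= s.minWidth;
              if (matchesTemplate || matchesInline) chosen = s.src;
            }
            return chosen;
          }

          // Example: the case is computed once, then reused for every element.
          const current = activeCase(500, breakpoints);
          console.log(current, pickSource(sources, 500, current, "mobile.jpg"));
          ```

          The point of the sketch is only that the width test happens in one place; each element does a cheap name comparison afterwards, and an inline test can still override the template.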

  6. Scott Jehl says:

    Good points, Matt.

    The DRY problem is an interesting one, but I’m unsure if such repetition is necessarily expected in a real-world use case.

    In theory at least, content images shouldn’t be so tied to a design should they? Seems to me that the breakpoints for image references in the picture tags are just as likely (more likely?) to be inconsistent throughout a document, since their breakpoints would relate more to the assets themselves (and their relation to device size/resolution) than the design the assets are placed within. In that case, #1 and #2 wouldn’t be so problematic, especially in a fluid layout with fluid media.

    For #3, I agree those APIs are rich/useful/intriguing (if difficult to detect well) – it seems like that sort of feature would wind up in media queries eventually too.

    If we could have a simple markup pattern that solves the problem for the majority of use cases (picture), perhaps JS APIs could evolve to enhance things further for advanced use cases that media queries alone can’t solve.

    Barring a picture tag addition, I’d love to see img[prefetch=false] :)

    • Matt Wilcox says:

      I’d say that the majority of the time response points are indeed tied to the design. When you think about it the workflow is actually this:

      Get the device width > design for this width > Modify assets to fit the design.

      When we respond to breakpoints we are currently adapting the content directly to the viewport in order to have it then fit into a design. But from a process standpoint that’s wrong – we do a design for a specific viewport range and media should fit into that design. Right now the technology forces us to work a bit muddled up.

      The clearest example of why it should be this way is considering that many designs have a max-width, beyond which the assets inside the design never need to expand, regardless of the viewport width. That’s the manifestation of the design’s limit, and the design’s limit is what we care about.

      Viewport dimensions inform the design choices which inform the content representation.

      When you re-design a site, you will very likely change those breakpoints for the design. And so you will want to change the breakpoints for the images to ensure they fit the new design. Right now, with picture as it stands, that’s not possible without manually changing all of the mark-up throughout the site. And that, to me, is a major problem with picture (and video and any other responsive asset where the sensing is embedded directly into the element).
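
      The max-width argument can be shown with a small JavaScript sketch. It assumes a hypothetical `design` object with a single `maxWidth`, a hypothetical list of `assets` with made-up file names and pixel widths, and it ignores pixel density entirely; it is an illustration of the reasoning, not of any proposed syntax.

      ```javascript
      // Hypothetical design: fluid up to 960px, fixed beyond that.
      const design = { maxWidth: 960 };

      // Hypothetical image variants, described by their pixel width.
      const assets = [
        { src: "small.jpg", width: 480 },
        { src: "medium.jpg", width: 960 },
        { src: "large.jpg", width: 1920 },
      ];

      // The width the design actually occupies: the viewport width,
      // capped at the design's max-width.
      function designWidth(viewportWidth, d) {
        return Math.min(viewportWidth, d.maxWidth);
      }

      // Pick the smallest asset that is at least as wide as the design,
      // falling back to the largest asset if none is wide enough.
      function pickAsset(viewportWidth, d, list) {
        const needed = designWidth(viewportWidth, d);
        const sorted = [...list].sort((a, b) => a.width - b.width);
        for (const a of sorted) {
          if (a.width >= needed) return a.src;
        }
        return sorted[sorted.length - 1].src;
      }

      // A very wide viewport still only needs the asset that fits the design.
      console.log(pickAsset(2560, design, assets));
      ```

      In this model, changing the design means changing `design.maxWidth` in one place rather than editing a test embedded in every element.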

    • Matt Wilcox says:

      Oh, and yes I would *love* to see a toggle for turning off pre-fetching. Somehow I don’t think browser vendors would like the idea though.

  7. Kevin Suttle says:

    Excellent, excellent points, Mat, and great contributions, Scott.

    I like the idea of attributes on the img tag as a possible solution. I was under the impression that img was off limits though*.

    I guess where I struggle the most is standardizing a range of ballparking to accuracy in semantics. For example, is a logo a ‘picture’? Should that really be ‘graphic’? Does a bio photo need a ‘photo’ tag? I guess

    From a mobile-first perspective, it seems like if we’re going to create an element, we have to have a need for it, and that need should be reinforced by a fitting name. And really, we’re only creating a new element because img is apparently untouchable. *As Scott mentioned, I see no reason we can’t add attributes to it, but I digress.

    My point is, breaking it down, HTML5 was intended to be about adding more meaning to our markup. If we’re to be in line with standards, ‘picture’ sounds great on paper, but only up until the point of comparison with img. Does that make sense? It’s not even a question of DRY; it’s more a question of asking where the lines of the spectrum are drawn on element names so that they can be specific enough to have meaning, yet generic enough to be flexible. I think there’s an agreeable solution for all involved, but compromise is almost a guarantee.

    Anyway, I just was attempting to think a bit outside the box on some of the approaches I’ve seen mentioned. This was just one point I wanted to make. Thanks for the responses, guys.

    I’m going to do another post similar to the one above, but instead of markup-based solutions, I’ll focus on CSS concepts. Speaking of, Mat, how flexible is the standards body on the CSS spec? Is it as locked down as the img tag? If not, we could possibly negate the new HTML element discussion altogether.

  8. Jason Grigsby says:

    The proposal from Edward O’Connor (Apple) is pretty interesting. I’ve been thinking a lot more lately that the browser needs to decide what assets to download, particularly when it comes to dealing with bandwidth.

    The hack that Apple.com is rolling out to support the retina iPad is what got me thinking about this. Ideally, I’d like to let the person browsing make the choice about whether or not to download huge images. Especially given the conversations in the whatwg list about people wanting to zoom in, etc.

    Something like the imageset also helps prevent less competent authors from screwing things up too much. Provide a series of images and let the browser figure out the right density to download.

    I’d love to find a good balance though between just having browsers handle density and enabling authors.

    As an aside, I find it worrisome that Apple seems to continue to be focused on 2x. Our lesson should be that multiple image sizes are needed. I think what they propose is flexible enough to handle multiple resolutions, but it would be nice to verify.
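
    To make the "provide a series of images and let the browser pick" idea concrete, here is a minimal JavaScript sketch. It is not the algorithm from the www-style proposal, which leaves the choice to the user agent; the `candidates` list, the file names, the `chooseByDensity` helper, and the `preferLowBandwidth` flag are all hypothetical.

    ```javascript
    // Hypothetical set of candidates, each tagged with the pixel density
    // it was prepared for. Note that it is not limited to 1x and 2x.
    const candidates = [
      { src: "photo-1x.jpg", density: 1 },
      { src: "photo-1.5x.jpg", density: 1.5 },
      { src: "photo-2x.jpg", density: 2 },
    ];

    // Choose the lowest-density candidate that still covers the device,
    // or the highest available if none does. A user preference for saving
    // bandwidth (an assumption of this sketch) forces the smallest one.
    function chooseByDensity(list, devicePixelRatio, preferLowBandwidth = false) {
      const sorted = [...list].sort((a, b) => a.density - b.density);
      if (preferLowBandwidth) return sorted[0].src;
      for (const c of sorted) {
        if (c.density >= devicePixelRatio) return c.src;
      }
      return sorted[sorted.length - 1].src;
    }

    console.log(chooseByDensity(candidates, 1.3));
    console.log(chooseByDensity(candidates, 2, true));
    ```

    The author only lists what is available; the selection policy, including any choice made by the person browsing, lives outside the markup.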

    • Matt Wilcox says:

      I think Apple has a different use case. I imagine their reason for implementing high res graphics is nothing to do with web development and everything to do with the fact that iPads in shops will very often be on those pages, so those pages need to look excellent on an iPad 3. Any niceties for any other device are merely a happy side effect I think – which is why the 2x thing is OK for them. It’s not ok for our use cases.

      Why would we want the browser to do this automatically? That takes the control away from the author. And, if we’re going down the “just make it all automated so the author can’t screw it up” road, well, let’s forget HTML entirely and just do this server side through detection.

      • Anselm Hannemann says:

        Apple’s experiment is not a good solution for the real world. It is a working example for Apple but not a solution. I think they also know this and just did it this way because it is an easy and working solution for now.

    • Matt Wilcox says:

      PS: Do you have a link to that Apple proposal Jason? :)

  9. Matt Wilcox says:

    No feedback from anyone about the head template idea? Are we ignoring the efficient authoring problem for now? The idea’s up above in the comments and I wrote an exploratory post about it too: http://mattwilcox.net/archive/entry/id/1082/

    • Anselm Hannemann says:

      Uhm… I am not sure if that would work. It is too complex for HTML I think and would require a lot of work on other elements such as video, audio, etc., too. So maybe this should be postponed to a whole new spec?

      I know it is an authoring problem but I don’t think we can solve this properly in the near future for every kind of multi-file content.

      • Matt Wilcox says:

        Yeah, that’s the point: you could use the same process with the same mark-up in any element you want to be responsive. And it’d be backward compatible as the only thing being added to video etc is the “match” attribute – which is what triggers the look-up to the head code to find the matching media query. You essentially use match instead of media.

        “Too complex for HTML”. Why is it? Philosophically or technically?

        The ‘complex’ bit is in the head, as it’s not describing semantics but document properties. We want sensors in the HTML if we want a central point that CSS/JS/HTML can all use and rely on. Or is there something wrong with my thinking here?

        • Anselm Hannemann says:

          ““Too complex for HTML”. Why is it? Philosophically or technically?”
          I meant that this approach would have consequences on video, audio and other elements, too. I’m not sure if we can do this in the near term… if you think so, well then for me it would be okay ;)

          “The ‘complex’ bit is in the head, as its not describing semantics but document properties. We want sensors in the HTML if we want a central point that CSS/JS/HTML can all use and rely on. Or is there something wrong with my thinking here?”

          Okay, that’s a point I don’t like too much. HTML should always have a semantic value. If not, it has to be declared as non-semantic and should have a direct relation or reference to a semantic element.
          I must admit that I don’t remember too much of your proposal here. Do you have a summary on that where I can read the basics?

          • Matt Wilcox says:

            Well, isn’t that what the head does? Describe document properties and linked files? It tells us the title of the page, it allows us to set meta data and set document properties like character set encoding, and we link to assets required to render the page (stylesheets etc). Seems a perfect place to me.

            HTML semantics are for inside body. head does a different job.

            As for impact on existing elements – it has none. This could be added and such elements would simply ignore it and carry on working exactly as they do right now. Later, you add the match attribute to each in the spec and they can take advantage of it.

            What I am proposing here is not a *replacement* for any of the current tags or attributes, but an extension of them. My blog post has the most detail, but perhaps I ought to elaborate on it soon.

          • Anselm Hannemann says:

            Hm. What about performance when we reference these files in head? Then we urgently need defer and async attributes for the linked files there.
            But you’re right, it would work.

          • Matt Wilcox says:

            How do you mean, “reference these files in the head”? We are not referencing any files?

            The only thing my proposition does is allow us to centralise the media tests. It doesn’t even replace the existing way where we’d use media and do the test directly on the element.

            By using the head technique, the browser ought to be able to run the tests *one time*. Other elements that support match would then simply match their case to the document’s currently active case as found in the head. That is far more efficient in terms of processing and authoring than the current behaviour where every single responsive element has a test hard-coded into it which is run when it’s encountered in the mark-up.

          • Anselm Hannemann says:

            I am sorry. I just read your proposal again and now things are changed.

            But nevertheless – if you run tests in head, the browser does it before rendering the page which might have an impact on performance of the website, right? I don’t actually know what this could mean in time but this is a topic to test before we can add this to a draft.

          • Matt Wilcox says:

            It would indeed run the test immediately – and I call that a good thing. The same happens when you put script tags up there. It is a behaviour that is actually essential, because if those are done before the rest of the HTML has loaded, then you can get the right images (and any other assets) first time. And, getting the viewport width isn’t exactly time consuming computationally :)

          • Anselm Hannemann says:

            Okay then I’ll go with that. From my point of view this makes sense. And should be added to other elements’ specs, too (video e.g.).

  10. Anselm Hannemann says:

    Now I just discovered this technique from the webM project. Maybe this could be a base for a responsive file-format (WebP)?
    Just to let you know… http://downloads.webmproject.org/adaptive-demo/adaptive/dash-player.html
    And of course this is only additional to the html-solution.
