Florian’s Compromise
I’ve been fairly quiet in this group for a while now. Part of that was wanting to let the dust settle on all the chaos that was surrounding this topic a few short weeks ago, and part of that was giving myself some time to digest the bits of signal that came through amidst all the “responsive images” noise.
More and more, it seems a waste of effort to retrofit the original srcset proposal to cover all the use cases of the original picture proposal. As we attempt to do so, the srcset microsyntax grows more and more confusing, and shares an increasing amount of overlap with media queries. To that end, I asked Florian Rivoal, editor of the media query spec, to join the discussion on the WHATWG mailing list and offer his perspective.
Florian joined the list by posting a brilliantly thought-out compromise between the two syntax patterns. I’d like to share my thoughts on this proposal here, as I feel it combines the strengths of srcset and picture in a practical and logical way.
Let’s begin by taking a look at the proposed syntax.
Sample Markup Pattern
<picture alt="Description of image subject.">
<source srcset="small.jpg 1x, small-highres.jpg 2x">
<source media="(min-width: 18em)" srcset="med.jpg 1x, med-highres.jpg 2x">
<source media="(min-width: 45em)" srcset="large.jpg 1x, large-highres.jpg 2x">
<img src="small.jpg" alt="Description of image subject.">
</picture>
The chain of events followed by the above markup pattern is:
- If the picture element is unsupported, the img contained therein is shown as fallback markup.
- If picture is supported, use media attributes to determine which source element best suits the user’s viewport.
- Once an appropriate source element has been selected, the srcset attribute determines which image source is best suited to the user’s screen resolution. If only a single resolution is necessary, the src attribute will function as expected.
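The steps above can be sketched in code. This is a minimal illustration, not the proposal’s algorithm: the type and function names are hypothetical, media queries are reduced to a single min-width value in ems, and it assumes the last matching source wins, which the post leaves open by saying only that the source should “best suit” the viewport.

```typescript
// Hypothetical types for illustration; none of these names come from the proposal.
interface Candidate {
  url: string;
  density: number; // the "1x" / "2x" descriptor as a number
}

interface Source {
  minWidthEm?: number; // stands in for media="(min-width: Nem)"; undefined means no media attribute
  candidates: Candidate[];
}

// Parse a srcset value such as "small.jpg 1x, small-highres.jpg 2x".
function parseSrcset(srcset: string): Candidate[] {
  return srcset.split(",").map((entry) => {
    const [url, descriptor = "1x"] = entry.trim().split(/\s+/);
    return { url, density: parseFloat(descriptor) };
  });
}

// Pick a source by viewport width. Assumption: the last matching source wins.
function selectSource(sources: Source[], viewportEm: number): Source | undefined {
  let match: Source | undefined;
  for (const source of sources) {
    if (source.minWidthEm === undefined || viewportEm >= source.minWidthEm) {
      match = source;
    }
  }
  return match;
}

// Pick a candidate by device pixel ratio: the smallest density that covers
// the screen, or the largest on offer if none does.
function selectCandidate(candidates: Candidate[], devicePixelRatio: number): Candidate | undefined {
  const sorted = candidates.slice().sort((a, b) => a.density - b.density);
  const covering = sorted.filter((c) => c.density >= devicePixelRatio);
  return covering.length > 0 ? covering[0] : sorted[sorted.length - 1];
}

function pickImage(sources: Source[], viewportEm: number, devicePixelRatio: number): string | undefined {
  const source = selectSource(sources, viewportEm);
  return source ? selectCandidate(source.candidates, devicePixelRatio)?.url : undefined;
}

// The three source elements from the sample markup.
const sources: Source[] = [
  { candidates: parseSrcset("small.jpg 1x, small-highres.jpg 2x") },
  { minWidthEm: 18, candidates: parseSrcset("med.jpg 1x, med-highres.jpg 2x") },
  { minWidthEm: 45, candidates: parseSrcset("large.jpg 1x, large-highres.jpg 2x") },
];
```

Under these assumptions, a 20em viewport on a 2x display resolves to med-highres.jpg. If a user agent instead took the first matching source, the same markup would always resolve to the unconditional first source, so the matching rule and the source order have to be settled together.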
In terms of selecting a source element, this markup leverages all the strengths of media queries — the syntax created for this very purpose — to handle the “art direction” use case that Jason Grigsby has illustrated so eloquently.
However, as has been detailed at length on the WHATWG mailing list and elsewhere, device-pixel-ratio media queries are poorly suited to these decisions. For an author, using vendor-prefixed min-device-pixel-ratio media queries in the example above would involve a massive amount of text and twice as many source elements. This could get unwieldy for authors very quickly, a concern voiced numerous times in these ongoing discussions. Further, implementation of MQ-based resolution switching is far more difficult on the UA side: a very real concern.
Once we’ve used media queries to determine the most appropriate source element, srcset’s originally intended usage becomes absolutely ideal for our purposes: simply determining the appropriate image source for a user’s resolution.
It’s worth noting that this example is, in fact, the most convoluted this element can ever be. This pattern in no way precludes the use of srcset on an img tag for simply performing resolution switching, nor does it preclude the use of picture as originally proposed, with src in source elements rather than srcset.
Bandwidth
The dark cloud hanging over all these discussions is the concept of “bandwidth detection.” We cannot reliably make assumptions about bandwidth based on client capabilities — a MacBook Pro with a Retina display may be tethered to a 3G phone; a high-resolution mobile device is as likely to be connected to WiFi as it is an EDGE connection.
It would assume a great deal if authors were to make this decision for the users. It would add a point of failure: we would be taking the bandwidth information afforded us by the browser, and selectively applying that information. Some of us may do it wrong; some of us may find ourselves forced to make a decision as to whether we account for users with limited bandwidth or not. To not account for it would be, in my opinion, untenable — I’ve expressed that elsewhere, in no uncertain terms.
I feel that bandwidth decisions are best left to the browser. The decision to download high vs. standard resolution images should be made by user agents, depending on the bandwidth available. Further, I believe there should be a user-settable preference for “always use standard resolution images,” “always use high resolution images,” “download high resolution as bandwidth permits,” and so on. This is the responsibility of browser implementors, and they largely seem to be in agreement on this.
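To make the idea concrete, here is a rough sketch of how a user agent might combine such a preference with its own read on bandwidth. Everything here is hypothetical: no browser exposes these preference values or this function, the bandwidth judgment is reduced to a boolean the browser is assumed to supply, and “always high” is interpreted as “the best fit for the screen.”

```typescript
// Hypothetical names throughout: no browser exposes this API or these preference values.
type ResolutionPreference =
  | "always-standard"
  | "always-high"
  | "high-as-bandwidth-permits";

interface DensityOption {
  url: string;
  density: number; // 1 for "1x", 2 for "2x"
}

// Assumes `options` is non-empty.
function chooseForUser(
  options: DensityOption[],
  devicePixelRatio: number,
  preference: ResolutionPreference,
  bandwidthIsAmple: boolean // assumed to be something only the browser can judge
): DensityOption {
  const sorted = options.slice().sort((a, b) => a.density - b.density);
  const standard = sorted[0];

  // Best fit for the screen: the smallest density that covers it,
  // or the largest on offer if none does.
  const covering = sorted.filter((o) => o.density >= devicePixelRatio);
  const best = covering.length > 0 ? covering[0] : sorted[sorted.length - 1];

  switch (preference) {
    case "always-standard":
      return standard;
    case "always-high":
      return best;
    case "high-as-bandwidth-permits":
      return bandwidthIsAmple ? best : standard;
  }
}
```

The point of the sketch is that the author only lists the options; the final choice depends on inputs (the preference and the bandwidth judgment) that never appear in the markup.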
In discussing the final markup pattern, we have to consider the above. Somewhere, that markup is going to contain a suggestion, rather than an imperative. srcset affords us that opportunity: a new syntax _designed_ to be treated as such. I wouldn’t want to introduce that sort of variance to the media query spec, a syntax long established as a set of absolutes.
It seems srcset won’t be going anywhere, and that’s not an indictment. There is a time and a place for srcset. I feel that place is resolution switching, as it was originally intended. Our best efforts to bring srcset closer in line with the originally proposed picture element only stand to leave us with a siloed microsyntax that inconsistently serves the purpose of media queries. With that comes further opportunity for errors by implementors and authors alike: countless new potential points of failure.
An Updated Polyfill
In order to better wrap my head around this pattern, I’ve updated Scott Jehl’s Picturefill to make use of the proposed syntax. The source code is available on GitHub, and I’ve posted a demo as well.
Next Steps
I’ve been discussing the implementation details of this pattern with several vendors recently, and the feedback has been extremely promising.
I’d love to hear everyone’s thoughts on this compromise, and through your feedback put together a set of formal proposals: a change proposal returning srcset to its original resolution-only syntax, and a proposal for picture that focuses on the “art direction” use case and optimization for client displays through media queries, excluding resolution.
The Constant Caveat
It seems I always end my posts in much the same way, but it always seems to ring true: this solution is not the ultimate solution to every problem in the “selectively loading assets” arena — nor does it have to be, right now.
There will always be limitless room for improvement when it comes to markup — a better way to handle source management for rich media, for example. But we can’t solve everything now — we can’t fall into the trap of seeking the perfect solution all at once.
Our goal is a laser-focused solution with the potential to fall in line with other rich media elements as we solve the greater issues one by one: issues of bandwidth detection, issues of organization. Our goal is to solve a very real and increasingly urgent problem, in a way that serves as a canvas for future enhancements. I’m confident that this syntax affords us that opportunity.
We can’t predict the future. We can only strive to be future friendly, while solving the problems of today. I’m confident this proposal does so.
Great summary, Mat. Thanks!
FWIW, I do like Florian’s solution. I think it does the best job so far of addressing all of our concerns. I also see it being useful for serving specific images in the print context without having to resort to img image replacement.
I have a few notes/comments/clarifications:
It should be emphasized that the first source to match will be used (at least according to the media selection algorithm in use for media sources). The browser will not treat additional sources as potential overrides (as it does with CSS). There is no cascade here; the UA finds a match and moves on. This will make paying attention to source order very important.
There would be no need for additional sources to handle the prefixed versions of the media queries. Media assignments are not like selectors: a parsing error in a media assignment (media query or otherwise) will be ignored and other media assignments in the comma separated list would be applied as the UA is capable of doing so.
PS – Any way we could get some blockquote style love in here?
I like this hybrid pattern for picture’s encapsulation and the reduced srcset="" reach. I still don’t like the media="" attribute being hardcoded in markup.
Let’s say I write 20 blog posts, then change my design. I’d potentially have to scrape through every post to update each media="width:20em" attribute. This is no different than writing style="font-size:20px" directly into HTML.
There’s a very good reason media queries are defined in CSS and not in markup: separate content and presentation. It makes no sense to start writing them in markup now.
I suggested an alternative on the original Responsive Images Proposal:
http://www.w3.org/community/respimg/2012/05/11/respimg-proposal/#comment-699
Use a meta element to abstract the media query presets into the head (where they can be changed in one place).
name="var:breakpoint1" media="screen and (min-width:350px)"
<meta name="media:tablet" media="min-width: 20em" />
<meta name="media:desktop" media="min-width: 40em" />
<picture alt="Description of subject.">
<source srcset="small.jpg 1x, small-highres.jpg 2x" />
<source srcset="medium.jpg 1x, medium-highres.jpg 2x" media="tablet" />
<source srcset="large.jpg 1x, large-highres.jpg 2x" media="desktop" />
<img src="small.jpg" alt="Description of subject." />
</picture>
Note: this example uses a simpler meta syntax that Aaron Gustafson proposed afterward, which I’m a fan of: http://www.w3.org/community/respimg/2012/05/13/an-alternative-proposition-to-and-srcset-with-wider-scope/#comment-796
Or better, allow definition directly inside a CSS file:
@picture tablet: (min-width: 20em)
@picture desktop: (min-width: 40em)
As long as we’re inventing, why not? Perhaps this could introduce scoping of media queries for picture elements as opposed to a few global variables.
Hey Brendan!
Thanks for the comment. Matt Wilcox has done a ton of research in this area, and I do think it’s a brilliant idea.
It’s worth keeping in mind that picture would never serve as a replacement for the img tag. It would be reserved for larger “hero” images like you see on apple.com, or http://www.bostonglobe.com/magazine . I wouldn’t expect a huge number of them on a page, and even if there were, there’s a good chance they wouldn’t all occupy the same size container and we’d be left with almost as many variables as we have picture elements (though all those values would be in one place, which is nice).
But yeah, I do think this is a worthwhile consideration, just not a consideration for right now. It’s an enhancement; something that we could use for a number of purposes. This “@media variable” approach is something that should apply to any form of rich media that uses media attributes, so video as well. The specced picture solution stays in line with video. While it doesn’t account for the future, it is future-ready.
I don’t feel it’s something we need to solve right now. If we considered picture a non-starter without this developer convenience, we’d be doing so to the detriment of our users. Retina displays are here, and sites are being built to account for them as we speak. The folks that sign paychecks are already beginning to mandate that their websites look perfect on the newest and hottest machines, and that means defaulting to huge, high resolution images: a marginal benefit to few, but a huge incurred cost to many.
For users around the world that access the internet from feature phones, paying for every kilobyte they consume (users that are already forced to dodge websites that they know will mean a great deal of overhead), I cannot in good conscience allow this problem to go unsolved for the sake of a developer convenience.
Let’s get this done; let’s get the core problem solved. Then, once the users are accounted for: let’s work together to make it better. I’d love to be a part of those conversations.
I can get on board with that. I understand this is design by committee (to an extent) and that small steps are faster than big ones.
I only raised the issue because it wasn’t addressed here, though it has been previously. In the wild, there is a demand and application for viewport-aware elements beyond the “hero” images though.
If an abstracted syntax (nay, bandwidth-aware) were available today, I’d use it deeply. The case being presented here is a great first step, but realistically its hardcoded format limits the extent to which any developer would choose to utilize it.
I suppose reaching comes before stretching though. I’m excited to see any traction here.
I concur. The solution will need to be some sort of compromise between mediaquery and srcset syntax.
I just reviewed Kornel’s pic proposal again and given the choice between the two, I’d pick Florian’s proposal. It is more verbose, but I personally consider a little verbosity to be a feature, not a bug.
I’m also encouraged by feedback from some browser makers that the mediaquery syntax would not be crippling for their lookahead preparsing implementations.
In regard to the addition of the alt attribute on the picture element: this is just repeating the limitations of the img element. As the picture element is not a void element, it would be much better to provide the text alternative as a child of the picture element. By doing this, text alternatives can use structured markup where required and can be processed by assistive technology in a richer way.
I agree with Steve Faulkner about no alt on the proposed picture tag… first thing I thought of was, I don’t like a double alt at all. Suppose this proposal somehow went REC as-is (unlikely but… just saying), we all know somewhere there’d be a browser that insisted it knew what the picture element was but then implemented it improperly, resulting in the inner fallback img tag coming through. Possibly with two alts, depending on how badly the browser handled it (similar to what you get if you’re hearing redundant redundant title attributes on an img with alt). Plus, I don’t like the idea of requiring authors to repeat that twice… each of the sources is actually different, and we know we’re lazy, and one will get left out. Heck, we’re lucky if we get people to write just one.
But Steve mentioned a child text element (and I understand why that would be nice). Since img is the fallback for unsupporting browsers, this still makes authors write the same alt twice and some browser is still likely to repeat it twice, and it *must* be on the img for fallback’s sake… so I’d rather UAs who could support this new set of tags/syntaxes could grab the single, existing alt from the fallback img. Maybe this wouldn’t make it a real fallback; I don’t know how browsers currently deal with fallbacks in things like object or video. If they are always ignored then I guess that’s not a solution either.
But just my thoughts. Remember how developers (authors) have this constant inner struggle between “I want to use all of this stuff because it’s awesome” and “I’m lazy and hate writing all this for two lousy images” -type fights. I wouldn’t want alt to lose that fight.
I’d like to see the proof in the pudding with an example on a much larger scale, say a fake blog of some sort. This way we can actually see the verbosity or see the stuff that is working.
I also feel the same as @brendan that embedding @media queries into the markup seems to be a poor decision much like his comparison with inline styles that were once used when dinos roamed the planet reading websites.
Hunting down media queries in my markup does not sound appealing at the moment but maybe I need some more sleep on it.
Doesn’t it strike anyone else that the markup for this only seems to be applying to images? Surely the problem is the same for videos?
The user agent should decide and the smarter solution is to extend the image formats to behave like .ico files and serve the right format. While this might imply changes to the webserver you could mock it easily enough with javascript. The browser will know how big a hole it has to fill.
If we’ve been asked to kill the “time” element, I see even less justification for “picture”, which is, in terms of the DOM, a meaningless sticking plaster.
I’m down with this idea. It’s a good compromise that can work and (i hope) will be expanded upon as time permits.
I am assuming the long-term idea is to be able to drop the child image element entirely – and that will help remove some verbosity right there.
I also like the idea. Just curious about the naming of the element again. Hadn’t we heard that we should use img instead of the picture element? I think this is what I heard from the WHATWG recently…
The other things are great:
– make use of both solutions (MQ + srcset/resolution based)
For the alt-attr problem I would propose to allow the alt-attr in ‘picture’ (if it is named so) and allow fallback inline content as Steve proposed. So you can choose how to handle. If you only have a short alt-text, just use alt, if you want to replace with more information, use the inline fallback alt-content. Okay?
About the templating: I think it is worth considering but I would love to exclude it from a first draft. I think this should go to HTML-variables spec, not a dirty solution just for resp-images.
I’m not sure if this has been mentioned, but the specification should allow the encoding of the image to be specified, with the browser picking the most suitable one. The video element works in a similar way to get around format support, and allowing this on the picture element would enable web developers to use new image formats (like one that solves the bandwidth problem or does away with the necessity of having media queries in the mark up) without having to wait for it to be widely supported (such as what happened with PNG).
I agree with Steve and Stomme in that the picture element should not have an alt attribute. Text only browsers and screen readers would use the fallback, which would be an img element with alt text, or if 99% of browsers support the picture element so no img fallback is needed, just text.
As for the name, I wonder if turning a void element into a non-void element and then having it contain another of its kind as a child is really wise. Having the closing img tag close the outer of the opening pairs rather than the inner one seems to defy the usual parsing rules. It would also be much easier to market the picture element along with the video and audio elements than to have to explicitly mention having new syntax for the img element.
I also don’t see why the picture element can’t or shouldn’t eventually replace the img element. Many elements have been retired and replaced with newer ones that do the job better. If 10 years from now the picture element is supported by 100% of browsers in use, I don’t see any more justification to keep img around than I do for keeping applet.
I agree with Omar, I don’t see the practical reason for having the image tag sticking around. Is there a reason to have it in the proposed spec?
I was also looking at the polyfill, and I noticed that in the sample markup you have , and I assume this is there as a suggested default. Is this really what would be proposed as best practice when using the picture tag? With the polyfill, this causes both the chosen image and the image in the img tag to be loaded, which seems counterproductive to performance. I would just forgo the img tag completely, or at least wrap it in a noscript tag (which does seem to be the case with the example in your polyfill implementation on github).
In my comment above, “I noticed that in the sample you markup you have”, should have followed with the image tag in the sample in the blog post (with the alt text Description of image subject.). =)
And I’ve answered my own question :). The reason for the img tag to stick around should be for older browsers, and newer browsers that support the picture tag should just ignore the img tag fallback. Although during the transition/polyfill stage, the img tag shouldn’t be wrapped with a noscript tag, since that wouldn’t allow old browsers with javascript support to render that image.
I don’t see the need for different image names, we are still serving the same image all along. Keep it simple:
@media (min-width: 20em) { image-append-name: -tablet; }
@media (min-width: 40em) { image-append-name: -desktop; }
@media (device-pixel-ratio) { image-append-name: @2x; }
Serves these:
small.jpg
small@2x.jpg
small-tablet.jpg
small-tablet@2x.jpg
small-desktop.jpg
small-desktop@2x.jpg
Browser bandwidth setting can override the CSS.
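A quick sketch shows how the naming rules in this comment would produce the listed file names. image-append-name is the commenter’s proposed property, not an existing CSS feature, and the function below is a hypothetical stand-in for what a browser would do. Two things are assumed: the resolution rule applies at a device pixel ratio of 2 or more (the comment gives no value), and the resolution suffix stacks on top of the width suffix, as the file list implies.

```typescript
// Hypothetical helper: image-append-name is a proposed property from the
// comment above, not something browsers implement.
function appendNames(src: string, viewportEm: number, devicePixelRatio: number): string {
  // Split "small.jpg" into "small" and ".jpg" so the suffix lands before the extension.
  const dot = src.lastIndexOf(".");
  const base = dot === -1 ? src : src.slice(0, dot);
  const extension = dot === -1 ? "" : src.slice(dot);

  let suffix = "";
  // A later width rule overrides an earlier one, as it would in a stylesheet.
  if (viewportEm >= 20) suffix = "-tablet";
  if (viewportEm >= 40) suffix = "-desktop";
  // Assumption: the resolution suffix is appended to the width suffix rather
  // than replacing it, and applies from a device pixel ratio of 2.
  if (devicePixelRatio >= 2) suffix += "@2x";

  return base + suffix + extension;
}
```

For example, appendNames("small.jpg", 25, 2) gives small-tablet@2x.jpg. Note that an ordinary CSS cascade would have the last matching declaration replace the earlier ones, so the stacking behavior would need to be defined explicitly for this idea to work.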
Here is my humble understanding of the adaptive content (specifically images) problem, summarised mostly for my own sanity but maybe others’ too (let me know if I have misunderstood anything):
1. There are N number (perhaps an infinite number) of possible screen sizes (we’ll call them devices).
2. Each one of these N devices could (at various points in time) be constrained/inhibited by one or more of the following common factors: speed/size of bandwidth, actual cost of bandwidth, latency (and perhaps more).
3. Additionally, there are multiple vendors amongst these devices, and each vendor will effectively have its own rendering engine/platform.
4. So, with potentially unlimited devices, numerous vendor differences and bandwidth concerns, the problem is clear.
5. Bandwidth limits are an issue for now and for the foreseeable future; hence we must tackle the issue from another angle, or rather we must consider all angles at once. What can we do currently…?
6. Should the server decide which image to send to the client? Yes? Ok, how…?
a) Currently server-side User Agent checks can determine the client is a mobile/desktop. Great, but what about point 2.? It’s too rigid, it’s a one-size-fits-all approach (ok, two-sizes-fits-all), and draws a stark line between ‘mobile’ content and ‘desktop’ content; not exactly responsive/adaptive, and not very One Web.
7. Should the client decide which image to download? Yes? Ok, how…?
a) Mobile-first design to load the smallest size image, and then client-side Javascript libraries test for certain client capabilities/facets before loading the medium/large size image. Sounds good, but what about the unnecessary lookups and [initial] image download? What’s that – it doesn’t matter because ‘mobile’ will be fine, and ‘desktop’ connection can handle the extra calls? Remember point 2.? It also depends entirely on Javascript (much has already been said on this matter).
8. So, it isn’t solely about speed then, it’s also about usability and the ‘right’ solution for the context. Small screen device + fast (Wi-Fi / unlimited 3G) connection + super browser [+ touch screen] = great experience for user and scope for lots of content goodies. Lower end tablet + reasonable screen size + slow (2G/3G PAYG) connection + ok browser = oh dear.
9. While these issues are not a product of HTML5, it can be a springboard to resolving them.
10. External stylesheet (CSS) solutions alone might seem inappropriate given the long held idea of separating logic from presentation.
11. Inline CSS solutions (e.g. Florian Compromise) look and feel more appropriate. However, referring to point 2. again, it strikes me that we could end up with a single image being represented in HTML5 with 3,4,5,..n number of lines for each device (i.e. screen size) – and that’s just one image. While we’re future proofing things, we should consider the scalability factor – could/should the same Web content browsed on a 100x100px watch-face be delivered to a 60-inch plasma TV, albeit with larger images etc.? That seems like a lot of possible ‘min-width’ declarations in between.
12. A [part] browser solution based on bandwidth could then tie in with the Florian Compromise (as has been discussed).
13. Something has occurred to me, which may in fact be trivial: with all of these versions of the same image (different file name presumably, to benefit from caching) being downloaded into the client’s cache, that cache could potentially fill up a little quicker than before (given the number of Web sites then using the technique, the number of images downloaded, and the client’s bandwidth fluctuating between slow/medium/fast connections). Perhaps this is inevitable; however, once the browser has downloaded, say, 3 versions of essentially the same image (different file names) would it not make sense for the browser to select the best possible image from the cache (with different names this would be problematic), or should the markup always take precedence in choosing a small image for a small device (or slow connection)?
******
And so, allowing the browser to resize a given image sounds great, if we ignore what we know about bandwidth. Alternatively, if we have multiple versions/sizes of the same image marked-up, then the browser can download the right one, as long as bandwidth can be reliably determined.
Does it matter if our markup is increased by multiple image declarations (e.g. srcset) plus the fallback? Maybe, maybe not. In addition to specifying small/medium/large versions, there’s also the high resolution issue for the new retina display MacBook; will this lead to more similar tweaks for various hardware? Will this be manageable over time?
Is there a point where one says, “I’m content that my content can be viewed nicely on devices/sizes [E,F,G…R], while [A-D] and [S-Z], with their crappy or super hardware (respectively), will have to make do for now as I just cannot cater for every single possible case.”? If so, then we’re just expanding the existing spectrum somewhat while edge cases still exist on the periphery. If not, then [obviously] we must find a solution which is completely adaptable and entirely acceptable.
Just my thoughts.
Uhhh, bravo.. That’s freakin’ beautiful, i love it… standardize this princess and let’s get movin’ on it… it sounds so perfect for the real problems we’re having today.
So where are we with this?
Have plans been dropped for the additional srcset attribute on the img element, going instead with a new whatever-we-call-it element, or are the plans to produce both elements, with srcset scaled back for the img element?
I’m finding this difficult to follow…
I think we should allow width, height attributes on the source tags, so a browser will know the right dimensions as soon as possible and render the placeholder without having to wait on the image and as such avoiding the bounce layout problems (quick demo http://atix.be/Zks).
I also would like to see support for the class attribute.
I forked @Wilto’s code into https://github.com/attiks/picturefill-proposal.
Peter, I think you have a valid point there. Problem is that we are going to use a standardized element (source) here. This element currently doesn’t support these attributes, so we would have to fix that in another standardization process. But why would you ever need a class attribute for the source-tag? The source tag is not a presentational element but only a reference. You only define a source, maybe its natural size, the type and media which it is for. All presentational attributes should be in the parental picture-element.
You’re right about the ‘class’ attribute, we can add the ‘class’ on the picture tag.
Forgive my ignorance, but why do we need a new element? It seems to me that the definition of the img element would work just fine, by providing an extended syntax with the fallback src as an attribute on the opening tag. Legacy browsers should safely ignore the source and closing img tag, should they not? If not on their own, surely with a display:none rule they would cause no damage. “Image” is more appropriate than “picture” and more universal. I’m tired of memorizing new element titles that mean virtually the same things. Some of the new structural elements in HTML5 are great, but this is too far. I understand that it seems trivial taken as a single element, but when you look at the increasing bloat we’re dealing with here you’re introducing a ton of mental overhead that after a while becomes exhausting to deal with. Em or i? Strong or b? Abbr or acronym, blockquote section article div q or p? Img or picture? Object or video/audio? Will html6 include “movie”, “song” and “artwork” tags? C’mon, meaningless decisions suck. There’s a line beyond which semantics are better described in attributes than new tag names.
Maybe that rant is off-base and there’s a good technical reason for using a new element I didn’t catch. In that case, apologies. But think about bloat anyway.
Second point, there must be a better way to describe dot pitch than 1x/2x/Nx (I assume that’s what those are for, again apologies if that assumption is wrong).
Really all of this stuff is a job for the server IMHO, not the client, but HTTP doesn’t facilitate the sending of useful data about device parameters and capabilities beyond bad UA parsing hacks. You can count the weeks between when this spec gets implemented and someone creates a firefox plugin to trick the browser into reading the “better quality” images in all circumstances. Not that they wouldn’t do the same in the HTTP header I suppose, but it’s still a question that is better contextualized by the server, which will know things like how much load it is suffering, than the client, which only knows what its current state is.
As the subject says, I think the element is a great idea and to have it follow a similar pattern to the and elements make sense.
One small change which I think would make more sense from a readability point of view would be rename the attribute ‘srcset’ to just ‘set’. In its current suggestion the source attribute reads ‘source source set’
e.g.
However in the way I have suggested it would read as ‘source set’ and look like the following:
e.g.
Although very minor, I think it reads much nicer.