13:41:16 <RRSAgent> RRSAgent has joined #web-networks
13:41:16 <RRSAgent> logging to https://www.w3.org/2020/02/05-web-networks-irc
13:41:19 <Zakim> Zakim has joined #web-networks
13:41:37 <dom> Meeting: Web & Networks IG: Lessons from Network Information API WICG
13:42:03 <dom> Agenda: https://lists.w3.org/Archives/Public/public-networks-ig/2020Jan/0003.html
13:42:40 <dom> -> https://lists.w3.org/Archives/Public/public-networks-ig/2020Jan/att-0003/Network_Quality_Estimation_in_Chrome.pdf Slides: Network Quality Estimation In Chrome
13:43:44 <dom> Chair: Sudeep, DanD
13:43:50 <dom> RRSAgent, draft minutes v2
13:43:50 <RRSAgent> I have made the request to generate https://www.w3.org/2020/02/05-web-networks-minutes.html dom
13:43:54 <dom> RRSAgent, make log public
13:49:25 <sudeep> sudeep has joined #web-networks
13:52:28 <cpn> cpn has joined #web-networks
13:56:47 <dom> Present+ sudeep, cpn, dom
13:56:56 <dom> Chair+ Song
13:58:06 <dom> Present+ Piers_O_Hanlon
14:00:02 <dom> Present+ Jordi_Gimenez, Dario_Sabella, Tarun_Bansal, Eric_Siow
14:01:22 <dom> Present+ Dan_Druta
14:03:35 <dom> Sudeep: today's session is an important one for the IG
14:03:45 <dom> ... in the past, we've covered a lot about MEC, CDN, network prediction
14:04:00 <dom> ... today we have folks from Google's Chrome team who implemented some APIs around networking
14:04:41 <dom> ... we're glad to have our guest speaker Tarun Bansal from the Chrome Team to give us insights about the APIs implemented in the networking space; how it is used, how useful it is, what lessons to draw from it
14:05:00 <dom> Tarun: I work on the Google Team and will talk about network quality estimation in Chrome
14:05:21 <dom> ... the talk is divided in 2 parts: use cases, and then technical details about how it works
14:05:38 <dom> ... my focus in the Chrome team is on networking and web page loading
14:05:50 <dom> ... I focus on the tail end of performance, very slow connections e.g. 3G
14:06:13 <dom> ... about 20% of page loads happen on 3G-like connections - which feels very slow, e.g. 20s before first content
14:06:25 <dom> ... videos would also take a lot of buffering time in these circumstances
14:06:51 <dom> ... the 3G share varies from market to market; e.g. 5% in the US, but up to 40% in e.g. developing countries
14:07:14 <dom> ... We have a service that provides continuous estimates of network quality, covering RTT and bandwidth
14:07:30 <dom> ... we estimate network quality across all the paths, not specific to a single web server
14:07:43 <dom> ... this focuses on the common hop from browser to network carrier
14:07:59 <dom> ... [this work got help from lots of folk, esp. Ben, Ilya, Yoav]
14:08:21 <dom> ... Before looking at the use cases, we need to understand how browsers load Web pages and why Web page loads are slow on slow connections
14:08:43 <dom> ... First, it is very challenging to optimize performance of Web pages - takes a lot of resources
14:09:04 <dom> ... Web pages typically load plenty of resources before showing any content (e.g. css, js, images, ...)
14:09:28 <dom> ... Not all of these resources are equally important - some have no UX impact (e.g. tracking, below-the-fold content)
14:09:47 <dom> ... loading everything in parallel works fine on fast connections, but on slow connections, it slows everything down
14:09:51 <Chunming> Chunming has joined #web-networks
14:10:27 <piers> piers has joined #web-networks
14:10:32 <dom> ... an optimal web page load should keep the network pipe full, and a lower-priority resource should not slow down a higher-priority resource
14:10:55 <dom> ... e.g. loading a below-the-fold image should not slow down what's needed to show the top of the page
14:11:04 <dom> ... or a JS-for-ad shouldn't slow the core content of the page
14:11:23 <dom> ... this means a browser needs to understand the network capacity to optimize loading of resources
14:11:43 <dom> ... this is what led to the creation of this network quality estimation service
14:12:30 <dom> ... Other uses include so called "browser interventions" which are meant to improve the overall quality of the Web by deviating from standard behavior in specific circumstances
14:12:38 <dom> ... in our case, e.g. when a network is very slow
14:13:05 <dom> ... another use case is to feed back to the network stack - e.g. using network timeouts
14:13:31 <dom> ... in the future, this could also be used to set an initial timeout in a smarter way (e.g. higher timeout in poor connection contexts)
14:13:53 <dom> ... lots of use cases for the browser vendor - what use would Web dev make of it?
14:14:26 <dom> ... We've exposed a subset of these values to the developers: RTT estimate, a bandwidth estimate, and a rough-categorization of network quality (in 4 values)
14:14:33 <dom> ... This was released in 2016
14:14:49 <dom> ... and is being used in around 20% of web pages across all Chrome platforms
14:14:54 <dom> ... examples of usage:
14:15:38 <dom> ... the Shaka player (an open source video player) uses the network quality API to adjust the buffer; Facebook does this as well
14:16:02 <dom> ... some developers use it to inform the user that the slow connection will impact the time needed to complete an action
14:16:22 <dom> ... Now looking at the details of the implementation
14:16:35 <dom> Present+ Jonas_Svennebring
14:16:42 <dom> Present+ Doug_Eng
14:17:05 <dom> ... The first thing we look at is the kind of connection (e.g. wifi)
14:17:21 <dom> ... but that's not enough: there can be slow connections even on Wifi or 4G
14:17:49 <dom> ... a challenge in implementing this API is making it work on all the different platforms, which expose very different sets of APIs
14:18:21 <dom> ... We also need to make it work on devices as they are, with often very limited access to the network layer
14:18:47 <zkis> zkis has joined #web-networks
14:18:49 <dom> ... Typically, network quality is estimated by sending echo traffic to a server (e.g. speedtest)
14:19:18 <dom> ... but this isn't going to work for Chrome: privacy (don't want to send data to a server without user intent)
14:19:32 <dom> ... also don't want to maintain a server for this
14:19:47 <dom> ... we also want to make the measurement available to other Chromium-based browsers
14:19:55 <dom> ... so we're using passive estimation
14:20:07 <dom> ... for RTT, we use 3 sources of information based on the platform
14:20:19 <dom> ... the first is the HTTP layer which Chrome controls completely
14:20:32 <dom> ... the 2nd is the transport layer (TCP) for which some platforms provide information
14:20:46 <dom> ... the 3rd is the SPDY/HTTP2 and QUIC/HTTP3 layers
14:21:25 <dom> ... for HTTP, you measure the RTT by the time difference between request and response - this is available on all platforms, completely within the Chrome codebase
14:21:46 <dom> ... there are limitations: the server processing time is included in the measurement
14:22:33 <dom> ... for H2 and QUIC connections, the requests are serialized on the same TCP or UDP connection, which means the HTTP request can be queued behind other requests
14:22:42 <dom> ... which may inflate the measured RTT
14:23:01 <dom> ... it is mostly useful as an upper bound
14:23:31 <Louay> Louay has joined #web-networks
14:23:34 <dom> ... for the TCP layer, we look at all the TCP sockets the browser has opened, and ask the kernel what RTT it has computed for these sockets
14:23:42 <dom> ... then we take a median
14:23:46 <Louay> present+ Louay_Bassbouss
14:23:57 <dom> ... this is less noisy, but it still has its own limitations
14:24:12 <dom> ... it doesn't take into account packet loss; it doesn't deal with UDP sockets (e.g. if using QUIC)
14:24:25 <dom> ... and it's only available on some platforms - we can't do this on Windows or macOS
14:24:32 <dom> ... this provides a lower bound RTT estimate
14:24:46 <dom> ... The 3rd source is the QUIC/HTTP2 Ping
14:25:08 <dom> ... Servers are expected to respond immediately to HTTP2 PING
14:25:23 <dom> ... this is available in Chrome, and it removes some of the limitations we discussed earlier
14:25:43 <dom> ... but not all servers support QUIC/H2, esp in some countries
14:26:01 <dom> ... not all servers that support QUIC/H2 support PING despite the spec requirement
14:26:11 <dom> ... and it can still be queued behind other packets
14:26:43 <dom> ... So we have these 3 sources of RTT; for each source, we take all the samples and aggregate them with a weighted median
14:27:17 <dom> ... we give more weight to the recent samples; unlike TCP, which uses a weighted average, we use a weighted median to eliminate outliers
14:27:44 <dom> ... once we have these 3 values, we combine them into a single value using heuristics
14:27:53 <dom> ... these heuristics will vary from platform to platform
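
The per-source aggregation described above (a weighted median that favors recent samples) can be sketched as follows. This is a minimal illustration, not Chrome's implementation: the exponential half-life weighting, the 60-second default, and the names `RttSample`, `sampleWeight` and `weightedMedianRtt` are all assumptions made for the example.

```typescript
// Illustrative sketch only; names and the decay scheme are hypothetical.
interface RttSample {
  rttMs: number;       // observed round-trip time, in milliseconds
  timestampMs: number; // time at which the sample was taken
}

// Assumption: a sample's weight halves every `halfLifeMs` of age.
function sampleWeight(ageMs: number, halfLifeMs: number): number {
  return Math.pow(0.5, ageMs / halfLifeMs);
}

// Weighted median: sort samples by RTT, then walk them in order until the
// cumulative weight reaches half of the total weight.
function weightedMedianRtt(
  samples: RttSample[],
  nowMs: number,
  halfLifeMs: number = 60000
): number | undefined {
  if (samples.length === 0) return undefined;

  const weighted = samples
    .map((s) => ({
      rttMs: s.rttMs,
      weight: sampleWeight(Math.max(0, nowMs - s.timestampMs), halfLifeMs),
    }))
    .sort((a, b) => a.rttMs - b.rttMs);

  const totalWeight = weighted.reduce((sum, s) => sum + s.weight, 0);
  let cumulative = 0;
  for (const s of weighted) {
    cumulative += s.weight;
    if (cumulative >= totalWeight / 2) return s.rttMs;
  }
  // Only reachable through floating-point rounding; fall back to the largest RTT.
  return weighted[weighted.length - 1].rttMs;
}
```

With equally weighted samples of 50 ms, 60 ms and 2000 ms, the function returns 60, whereas a mean would be pulled toward the 2000 ms outlier.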
14:27:59 <dom> ... Is that RTT enough?
14:28:25 <dom> ... We have found that to estimate the real capacity, we need to estimate the bandwidth
14:28:38 <dom> ... there has been a lot of research on this, but none of it worked well for our use case
14:28:58 <dom> ... we do not want to check a server; we want a passive estimate
14:29:26 <dom> ... What are the challenges in estimating bandwidth? The first one is that we don't have cooperation from the server-side
14:29:48 <dom> ... e.g. we don't know what TCP flavor the server is using, we don't know their packet loss rates
14:30:49 <dom> ... so we use a simple approach: we measure how many bytes we get in a given time window with well defined properties (e.g. >128KB large, 5+ active requests)
14:31:00 <dom> ... the goal being to ensure the network is not under-utilized
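
The windowed bandwidth measurement described above can be sketched as follows. This is a hedged illustration, not Chrome's code: it assumes that "128KB" is a minimum number of bytes transferred in the window and that "5+ active requests" must hold throughout the window; the names `ObservationWindow` and `throughputKbps` are hypothetical.

```typescript
// Illustrative sketch only; the interpretation of the thresholds is an assumption.
interface ObservationWindow {
  bytesReceived: number;     // bytes received during the window
  durationMs: number;        // length of the window, in milliseconds
  minActiveRequests: number; // fewest requests in flight at any point in the window
}

const MIN_WINDOW_BYTES = 128 * 1024; // "more than 128KB", per the talk
const MIN_ACTIVE_REQUESTS = 5;       // "5+ active requests", per the talk

// Returns a throughput sample in kilobits per second, or undefined when the
// window does not qualify (the network may have been under-utilized).
function throughputKbps(w: ObservationWindow): number | undefined {
  if (w.durationMs <= 0) return undefined;
  if (w.bytesReceived <= MIN_WINDOW_BYTES) return undefined;
  if (w.minActiveRequests < MIN_ACTIVE_REQUESTS) return undefined;
  const bits = w.bytesReceived * 8;
  return bits / w.durationMs; // bits per millisecond equals kilobits per second
}
```

Discarding windows that fail the checks keeps idle or lightly loaded periods from being mistaken for a slow link.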
14:31:24 <dom> ... with all these estimates, how do they quickly adapt to changing network conditions?
14:31:46 <dom> ... e.g. entering a parking garage will slow down a 4G connection
14:32:14 <dom> ... we use the strength of the wireless signals
14:32:27 <dom> ... we also store information on well-known networks
14:32:48 <dom> ... To summarize, there are lots of use cases for knowing network quality - not just for browsers, also for Web developers
14:33:08 <dom> ... but there are lots of technical challenges in doing that from the app layer without access to the kernel layer
14:33:59 <dom> Piers: (BBC) I heard Yoav mention in the IETF that the netinfo RTT exposure might go away for privacy reasons
14:34:07 <dom> ... that was back at the last IETF meeting last year
14:34:38 <dom> Tarun: it's not clear if we should expose a continuous distribution of RTT - a more granular exposure could work
14:34:55 <Chunming> Chunming has joined #web-networks
14:34:56 <dom> Piers: so this is an ongoing discussion - can you say more about the privacy concerns?
14:35:03 <dom> Tarun: 2 concerns: one is fingerprinting
14:35:24 <dom> ... we round and add noise to the values to reduce fingerprinting
14:35:45 <dom> ... another concern is that a lot of Web developers may not know how to consume continuous values
14:35:52 <dom> ... simplifying it makes it easier to consume
14:36:52 <dom> ... we provide this in the Effective Connection Type - which can be easier to use to e.g. pick which image to load
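
A page could use the four-value categorization to pick an image roughly as sketched below. The four strings are the values of `effectiveType` in the Network Information API; the function name `pickImageVariant`, the variant labels, and the mapping itself are assumptions chosen for illustration.

```typescript
// The four categories exposed by the Network Information API.
type EffectiveConnectionType = "slow-2g" | "2g" | "3g" | "4g";

// Hypothetical mapping from connection category to an image variant.
// `undefined` covers browsers that do not expose the API.
function pickImageVariant(ect: EffectiveConnectionType | undefined): string {
  switch (ect) {
    case "slow-2g":
    case "2g":
      return "low";
    case "3g":
      return "medium";
    case "4g":
      return "high";
    default:
      return "medium"; // assumed fallback when no estimate is available
  }
}

// In a supporting browser, the input would come from
// navigator.connection.effectiveType (check that it exists first).
```

Because the input is one of four strings rather than a continuous value, the page only has to handle a handful of cases.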
14:37:22 <dom> Piers: we have ongoing work on TransportInfo in IETF that is trying to help with this
14:37:45 <dom> Tarun: if the server can identify the network quality and send it back to the browser, the browser could use it more broadly
14:38:52 <piers> https://github.com/bbc/draft-ohanlon-transport-info-header/blob/master/draft-ohanlon-transport-info-header.md
14:39:29 <dom> Piers: one of the use cases is adaptive video streaming; could also be useful for small object transports (which are hard to estimate in JS)
14:41:19 <dom> Tarun: is it mostly for short bursts of traffic?
14:41:29 <dom> Piers: it's for media as well
14:42:32 <dom> Tarun: so would the server keep data on typical quality from a given IP address?
14:42:44 <dom> Piers: it would be sent with a response header (e.g. along with the media)
14:43:24 <dom> DanD: (AT&T) for IETF QUIC, are you considering using the spin bit that is being specified?
14:43:36 <dom> Tarun: we're not using it, and I don't think there are plans to use it at the moment
14:43:55 <dom> ... QUIC itself maintains an RTT estimate which we're using
14:45:22 <dom> Dom: has there been work around network quality prediction? We had a presentation from an Intel team on the topic back in Sep
14:45:46 <dom> Tarun: not at the moment - we're relying on what the OS provides
14:46:17 <dom> Jonas: what we're doing for network prediction is to use info coming from the network itself (e.g. load shifting across cells)
14:46:25 <dom> ... we use this to do forward-looking prediction
14:46:38 <dom> Tarun: the challenge is that this isn't available at the application layer
14:47:08 <dom> ... e.g. they wouldn't be exposed to the Android APIs
14:47:53 <dom> ... an app wouldn't know the tower location - you can know which carrier it is, but not more than that
14:48:04 <dom> ... there is also a lot of variation across Android flavors
14:48:24 <dom> ... the common set is mostly signal strength and carrier identifier
14:48:48 <dom> Sudeep: would it be interesting for the browser to talk to interfaces to the carrier network (e.g. via MEC)?
14:49:05 <dom> ... The carrier/operator networks may have more info about the channel conditions
14:49:13 <dom> Tarun: definitely yes
14:49:19 <dom> ... Android has an API which exposes this information
14:49:30 <dom> ... but it never took off, and most device manufacturers don't support it
14:49:47 <dom> ... there is a way to expose this in Android
14:50:06 <dom> ... I'm not sure what the practical concerns were, but it never took off
14:50:13 <dom> ... it would be super-useful if it was available
14:50:36 <dom> Sudeep: you spoke about RTT, bandwidth that got defined in W3C
14:51:02 <dom> ... but implementations can vary from one browser to another - is there any standardization about how these would be measured, or would this be UA dependent?
14:51:30 <dom> Tarun: it's specified as a "best-effort estimate" from the browser, so it's mostly up to the browser
14:51:41 <dom> ... right now it's only available in Chromium-based browsers
14:52:24 <dom> ... even Chromium-based implementations will vary from platform to platform
14:52:36 <dom> Dom: can you say more about the fact that it is not available in other browsers?
14:53:01 <dom> Tarun: I think it's a question of priority - we have a lot of users in developing markets which helped drive some of the priority for us
14:53:22 <dom> Song: (China Mobile) I'm interested in the accuracy of the network quality monitoring
14:53:34 <dom> ... you mention aggregating data from 3 sources: HTTP, TCP and QUIC
14:53:51 <dom> ... are the weights for these 3 sources fixed, or do they vary based on the scenario?
14:54:16 <dom> Tarun: it's very hard to measure accuracy
14:54:43 <dom> ... in lab studies (with controlled network conditions), the algorithm is quite accurate
14:55:01 <dom> ... we also do A/B studies, but it's hard given we don't really know the ground truth
14:55:49 <dom> ... so we measure the behavior of the consumer of the API, e.g. on the overall page load performance
14:56:11 <dom> ... we've seen 10-15% improvements when tuning the algorithm the right way
14:56:32 <dom> Song: when you measure the data from these 3 sources, are they exposed to the Web Dev? or only the aggregated value?
14:56:53 <dom> ... is there any chance to make the raw source data available to Web developers?
14:57:01 <dom> Tarun: we only provide aggregated values
14:57:08 <dom> Piers: how often do you update the value?
14:57:21 <dom> Tarun: internally, every time we send or receive a packet
14:57:40 <dom> ... we throttle it on the Web API - when the values have changed by more than 10%
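
The throttling described above (only surfacing a new value once it has moved by more than 10%) can be sketched as follows. This is an assumption-laden illustration, not the browser's implementation: it assumes the change is measured relative to the last value reported to the page, and the class name `ThrottledEstimate` is hypothetical.

```typescript
// Illustrative sketch only; the comparison baseline is an assumption.
class ThrottledEstimate {
  private lastReported: number | undefined = undefined;
  private readonly threshold: number;
  private readonly onChange: (value: number) => void;

  constructor(threshold: number = 0.1, onChange: (value: number) => void = () => {}) {
    this.threshold = threshold;
    this.onChange = onChange;
  }

  // Feed a fresh internal estimate. Returns true if it was reported, i.e. it is
  // the first value or differs from the last reported value by more than the
  // threshold fraction.
  update(value: number): boolean {
    const prev = this.lastReported;
    if (prev !== undefined) {
      const change = Math.abs(value - prev);
      if (change <= this.threshold * Math.abs(prev)) return false;
    }
    this.lastReported = value;
    this.onChange(value);
    return true;
  }
}
```

Starting from a reported RTT of 100 ms, an update to 105 ms is suppressed while an update to 111 ms is reported.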
14:57:51 <dom> Piers: that's a pretty large margin for adaptation
14:58:06 <dom> Tarun: most of the developers don't care about very precise estimates
14:58:25 <dom> ... it's pretty hard to write pages that take into account that kind of continuous change
14:58:35 <dom> Piers: for media, more details are useful
14:58:46 <dom> Tarun: even then, you usually only have 2 or 3 resolutions to adapt to
14:59:04 <dom> Piers: but the timing of the adaptation might be sensitive
14:59:14 <dom> Piers: Any plans to provide more network info?
14:59:19 <dom> Tarun: no other plans as of now
14:59:34 <dom> ... we're open to it if there are other useful bits to expose
15:00:00 <dom> Sudeep: that's one of the topics the group is aiming to build on
15:00:14 <dom> ... are there other APIs in this space that you think would be useful to Web developers?
15:00:29 <dom> Tarun: I think most developers care about a few different values
15:00:47 <dom> ... it's not clear they would use very detailed info
15:01:06 <dom> ... another challenge we see is around caching (e.g. different network resources for different network quality)
15:01:52 <dom> ... you might be loading new resources because you're on a different network quality, which is counterproductive if that network is of low quality
15:02:52 <dom> ... In general, server-side estimates are likely more accurate
15:03:06 <dom> Sudeep: Thank you Tarun for a very good presentation!
15:03:46 <dom> ... Going forward, we want to look at how these APIs can and need to be improved based on Web developers' needs
15:04:07 <dom> ... we'll follow up with a discussion
15:04:41 <dom> ... Next week we have a presentation by Michael McCool on Edge computing - how to offload computing from a browser to the edge using Web Workers et al
15:04:47 <dom> ... call info will be sent to the list
15:06:17 <dom> RRSAgent, draft minutes v2
15:06:18 <RRSAgent> I have made the request to generate https://www.w3.org/2020/02/05-web-networks-minutes.html dom
16:30:17 <Zakim> Zakim has left #web-networks