Sep 25, 2024 2:45 PM
Attendees:
- - - - - - - - - - - - - -
Queue
- - - - - - - - - - - - - -
Slides: TPAC 2024_ Partitioning _visited links history.pdf
Scribe: Mike Taylor
Minutes:
John: I was surprised to hear that sending down the set of visited links before navigation commit wasn’t performant. Why?
Kyra: This was done via Navigation Throttle. We didn’t run an experiment on this - it was decided this wasn’t a risk worth taking, because it causes delays to the entire browser process.
John: Other things use them (safe browsing, etc), and are efficient. I would be interested in seeing the experiment run.
Artur: Kyra tried to make this argument, but there were still concerns. So we still don’t have actual data on this.
Kyra: For the sake of making progress, we went w/ per-origin salt option. Maybe something we can explore in the future.
John: Thanks - also would be interested in seeing how other rendering engines would handle this.
Jeremy: It makes sense, depending on what site you’re on (like google.com, having all the links you’ve ever clicked on from google).
Kyra: Right now it’s a very simple hashtable.
Ari: How early does the renderer need the information? Also, on browser startup is this entire thing loaded into memory? How is this even being initialized if it’s such a huge table?
Kyra: …post-commit, during load, pre-paint. Happens iteratively, per anchor element. Each element in the page has its own visited link state, where that info is queried from the hash table. I’ve been informed that it’s extremely hot, in the minds of people who work on CSS/rendering. At browser startup, every single time in the partitioned model, you will read from the VL database - if there’s any corruption in memory, it will read from the disk file.
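A minimal sketch of the per-anchor lookup described above, assuming the partitioned key is the triple (link URL, top-level site, frame origin) and that the per-origin salt is mixed into the hash. The minutes do not spell out the key layout or the hashing scheme, so the class and method names here (PartitionedVisitedLinks, add_visit, is_visited) are hypothetical and are not Chromium's actual API.

```python
import hashlib
import os


class PartitionedVisitedLinks:
    """Toy salted hash set for partitioned :visited state (illustration only)."""

    def __init__(self) -> None:
        self._salts: dict[str, bytes] = {}   # frame origin -> random salt (assumed)
        self._table: set[bytes] = set()      # fingerprints of visited triples

    def _salt_for(self, frame_origin: str) -> bytes:
        # Lazily create one random salt per origin.
        if frame_origin not in self._salts:
            self._salts[frame_origin] = os.urandom(16)
        return self._salts[frame_origin]

    def _fingerprint(self, link_url: str, top_level_site: str, frame_origin: str) -> bytes:
        h = hashlib.sha256()
        h.update(self._salt_for(frame_origin))
        for part in (link_url, top_level_site, frame_origin):
            data = part.encode("utf-8")
            # Length-prefix each field so different triples cannot collide
            # by shifting bytes between fields.
            h.update(len(data).to_bytes(4, "big"))
            h.update(data)
        return h.digest()

    def add_visit(self, link_url: str, top_level_site: str, frame_origin: str) -> None:
        self._table.add(self._fingerprint(link_url, top_level_site, frame_origin))

    def is_visited(self, link_url: str, top_level_site: str, frame_origin: str) -> bool:
        # Called once per anchor element while the page is being styled.
        return self._fingerprint(link_url, top_level_site, frame_origin) in self._table
```

In this sketch a link clicked on one site is reported as visited only when queried from the same top-level site and frame origin; the same URL queried from a different site misses the table.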
Kyra: In a normal iframe, we can understand VL in 2 ways.
… depending on the navigation, we may not store it, and depending on the frame, we may not style it.
… For credentialless iframes - they’re ~ephemeral to the page. It should only show up as visited when you’re on it, but not after navigating away. It shouldn’t store state at all from surrounding context. We found it was storing navigation, and styling links. In M130 experiment, this is no longer true. Identical to sandboxed iframes.
… Fenced frames do not go through HistoryTabHelper. As a result, we weren’t storing navigation. But, nothing to prevent styling if there was a key match. Would style links from outside context. Probably should not be interacting w/ outside context. In M130 experiment, this will no longer be true.
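The store/style behavior described above for the M130 experiment could be summarized roughly as follows. This is a simplified sketch: FrameKind, VisitedPolicy and policy_for are hypothetical names, and the normal-iframe case is reduced to "store and style" even though the minutes note that it further depends on the navigation and the frame.

```python
from dataclasses import dataclass
from enum import Enum


class FrameKind(Enum):
    NORMAL = "normal"
    SANDBOXED = "sandboxed"
    CREDENTIALLESS = "credentialless"
    FENCED = "fenced"


@dataclass(frozen=True)
class VisitedPolicy:
    store_navigations: bool  # may navigations in this frame be recorded?
    style_links: bool        # may links in this frame be styled as :visited?


def policy_for(kind: FrameKind) -> VisitedPolicy:
    if kind is FrameKind.NORMAL:
        # Simplification: the real decision also depends on the navigation.
        return VisitedPolicy(store_navigations=True, style_links=True)
    # Per the minutes, credentialless iframes behave like sandboxed iframes in
    # the M130 experiment, and fenced frames no longer style from outside context.
    return VisitedPolicy(store_navigations=False, style_links=False)
```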
… The future. There are improvements as we go along, including self-links.
… Another item: StorageKey could be beneficial to use, since it has a nonce we can re-use. And it has access to more state than we do currently.
… There have been questions around BFCache. What are the implications of a giant restore happening while we’re rebuilding the hashtable? Worth exploring.
… Potentially we could ship on iOS. If WebKit is interested in shipping this, we could ship there as well.
… Call for input. As a guide, the stuff I’d like to cover:
John: You mentioned you had good perf measurements. Do you have stats on how big it is vs unpartitioned table?
Kyra: There’s no evidence of a key explosion. Don’t have exact stats in front of me. Generally speaking, we can approximate the difference by measuring profile size. No significant difference, even in 99th percentile. This could change by using StorageKey.
Ari: Any thought to not exposing any visited info for iframes at all? Makes sense to partition. But to what extent do people expect visited links in iframes at all?
Kyra: Not a lot of data. The % of manual subframe navigation (link click, location.href) was way higher percentage than omnibox navigation. Surprising amount. Less than clicks from top-level - so significant that there was no conversation. But, the bar of what is acceptable is not clear.
Ari: The stats you had up front, it would be interesting to see specifically for iframes.
Kyra…
Artur: doesn’t that seem like a lot?
Kyra: Of all the links that we show on the page.
Artur: That’s larger than I would have expected, with no data.
Simon: I think we shouldn’t deprecate old CSS mitigations until all browsers implement this, to avoid webcompat issues for other browsers.
Kyra: Yes, thanks. Agree.
… On the WebKit side, there was discussion on key shapes. Do other browsers have opinions?
Simon: I don’t have immediate reaction to how it’s stored. It’s not my area (how things are partitioned in Gecko). High-level, I think this is the right direction and this, or something like this, should be implemented.
Kyra: The exact details likely won’t fit, thanks.
Ari: The main difference between current structure and StorageKey - you get nonce if you’re in anon iframe. The other part is that it will help w/ A > B > A case (i.e., you get an ancestor chain bit). For a same-site iframe embedded in a cross-site iframe, you would get visited links.
Kyra: Correct, right now we’re not super concerned about collusion/attacks here. However, that would unlock an additional protection. The main benefit is that StorageKey has additional info that the History system doesn’t have.
Artur: From security perspective, for some APIs it makes sense to have Ancestor Chain Bit - we were thinking about this. Collusion isn’t a huge concern, because you can use postMessage. For localStorage, you want second A in A>B>A to be different from outside A. For visited links, we didn’t see the risk in this case.
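To make the A > B > A discussion concrete, here is a small sketch of a partition key with and without an ancestor chain bit and a nonce. It assumes each frame is identified by a single site string and that the bit is set whenever any frame in the chain differs from the top-level site. PartitionKey and key_for are hypothetical names, not the actual StorageKey implementation.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class PartitionKey:
    top_level_site: str
    frame_origin: str
    cross_site_ancestor: bool  # the "ancestor chain bit" (assumed semantics)
    nonce: Optional[str]       # set for anonymous/credentialless frames


def key_for(chain: Sequence[str], use_ancestor_bit: bool,
            nonce: Optional[str] = None) -> PartitionKey:
    """chain lists sites from the top-level document down to the frame."""
    top, frame = chain[0], chain[-1]
    crossed = any(site != top for site in chain)
    return PartitionKey(top, frame, use_ancestor_bit and crossed, nonce)


A, B = "https://a.example", "https://b.example"

# Without the bit, the inner A in A > B > A shares a key with top-level A,
# so it would see A's visited links.
same_without_bit = key_for([A], False) == key_for([A, B, A], False)

# With the bit, the nested A gets its own partition.
same_with_bit = key_for([A], True) == key_for([A, B, A], True)
```

Here same_without_bit is True and same_with_bit is False, which mirrors the point that the ancestor chain bit would unlock an additional protection that the current key shape does not provide.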
Jeremy: What do we do for invalidation? And should we specify that? If you have 2 diff windows open, and you navigate - are there any guarantees about when its :visited style will be updated? I’m curious if browsers align on what they do, and if there are any guarantees that users/developers are relying on? It’s hard to rely on because you can’t read it.
Kyra: We don’t want to create a system where things start flashing. Browsers don’t align on when this takes place. In practice, the second you navigate back, it’s typically going to be refreshed and shown as visited. The way the architecture is created - it has no guarantees for when that will be populated. History is largely async, so that process happens in the background.
Yoav: This is great. I’ve been waiting for 10 years, thank you so much.
Kyra: We are running an experiment for self-links vs not-self-links. Maybe going to be in Chrome Beta in the next few days. The goal is to understand performance feasibility. The results of that will inform the code we try to ship. That’s what’s next on the table.
Artur: Is there a place where you will publish the results of the self-links experiment? That’s the last major remaining thing. Rather than wait for another TPAC, I wonder if we can publish it there.
Kyra: I’d be happy to provide a summary of the results on the blink-dev experiment. And hopefully in the reasoning for a future Intent to Ship. Please file a bug on the explainer if you have questions. https://github.com/kyraseevers/Partitioning-visited-links-history Or tweet at miketaylr.
Alex: Have you had any feedback from people saying “but I liked seeing my links in a different color”? Especially users.
Kyra: No feedback yet, no bugs filed. My anecdata is that most people don’t notice that their visited links are changing. History is wiped every 90 days, which most people don’t know. The research suggests that visited links are helpful. In link-dense environments (wikipedia, search engines) - they are very helpful. I’m willing to bet that most people wouldn’t notice.
Alex: The classic case I’ve heard, there’s a news aggregator site. You want to see which links you’ve read. But from what you’ve seen, people don’t notice.
Artur: A data point: we talked to a person doing UX stuff and asked them about this problem. The feedback was that it seemed more understandable for this state to not be shared. It’s surprising to see that in a different context. I am sympathetic to the use case you brought up, but I’m not sure it’s universal.
Jeremy: Do browsers sync this state?
Kyra: Yes, but sync is not in the partitioning model. This is kind of weird. When you enter the history system, you are tagged with how you were generated. When you sync, you lose some of that data. So it’s not compatible w/ the triple-key.
Mike: And other browsers probably do different things.
Kyra: History is interesting!
Simon: Wanted to comment on the seriousness of no longer styling visited links. One data point, there’s a large chunk of websites that style visited and unvisited links exactly the same. And users don’t complain, otherwise the sites would change. :visited is useful, but it probably depends on the use case.
Kyra: We did an informal check of top sites.
Simon: maybe it’s something you could add a use counter for, and query HTTP Archive.
Yoav: Not a User Researcher, but it seems like this is important for flows where users go back and forth, or explore a tree. This seems important. The other use case, I saw this on Twitter, and now I know I’ve seen it before. That seems like a price that’s worth paying, we’ve paid worse prices for partitioning caches.
Simon: We were discussing the impact of links in iframes generally. Maybe we don’t need to support them?
Kyra: I think that visited is a concept that is meant to give info to a user, rather than utility. What info should the user have? And how do we keep that from bad actors. On the UX note from Artur, this is kind of a mental model shift. Not just where is an appropriate context, we have the opportunity to shift where people are comfortable seeing their own interactions. That’s pretty cool.
… last minute thoughts?
Kyra: Thank you! Reach out to me, or Artur, or Mike.