Meeting minutes
<mt_hates_irc> hi, I will be your compere this evening. If you have something to say, first, consider keeping it to yourself. If that fails, q+ will register your interest in sharing it with the world
MarkN: broad discussion on understanding the impact of AI on keeping the Web open, following up on a similar conversation at IETF last week
<mt_hates_irc> topics to consider: what is open? what incentive structures exist in support of that openness?
DavidS: AI-Prefs at IETF revising robots.txt to give instructions to AI crawlers
… WebBotAuth looking at whether authenticating bots would help regulate their behavior
Mark: people have different ideas about what the "Open Web" is
… open availability of content has depended on a number of factors:
… - advertising for monetization
<mt_hates_irc> openness in standards, openness in implementation, openness in participation, openness in services and content.
Mark: - free for a while and for-pay
… - community participation / reputation
… what worries me the most is places where people want to make their content open for altruistic reasons and suddenly close it because of AI scraping
Mark: I'd love to hear what people think of directions for the open web
MT: the Web is not open right now
MarkN: would be interesting to measure at the moment
SeanT: what was discussed at IETF?
MarkN: we have notes linked from the calendar invite
… the IETF community focused on identity and authentication
cpn: we're having this conversation because of the emergence of generative AI
… in my opinion, generative AI has exploited the current state of copyright as a free-for-all to gather a huge amount of data for LLMs and so on
… as a rep of a publishing org, we find this deeply exploitative
… none of this has happened with the consent of content publishers
… we would be looking for a much more consensual model of data exploitation
… ML and AI is a great innovation, but let's make it happen in a consensual manner
… give a choice to content publishers on the use of their content
… huge imbalance, with so much energy and investment in an extractive approach
… the goal was always to provide content to people, not to feed a machine to displace the content publishers
Heather: after the IETF meeting, I did a little bit of research
<Zakim> EmLauber, you wanted to discuss Heather
Heather: this is very similar to the conversation about what is an open standards organization
… there is an ITU definition and the openstand principles
… mozilla also took a start at defining the open web
… maybe these would be worth considering for a definition of an open web
jyasskin: you asked what is the open web
… my start of an answer: if you're looking at a piece of content, you can find its name, share it with a friend with the agent of your choice
… and it costs them something proportionate to the value of the content
… people write content for a bunch of reasons
… when that content gets intermediated via an AI system, a lot of the reasons why people write content go away
… and this is not just for money, e.g. Wikipedia, BBC
… the value can come from other sources than direct monetization
bvandersloot: the production of content by generative AI is also an issue to consider
… dirt cheap content to make at scale, it makes it very easy to produce a lot of inauthentic content
… overloading the open Web with slop
DavidBaron: lots of different definitions of the open web, not sure we can come to consensus on one
… the ability to write new tools that deal with the Web
… either to publish, serve, consume content (e.g. new browsers)
<mt_hates_irc> bvandersloot: it's authentic, bespoke, LLM-rolled content, produced on lovingly crafted silicon, designed to deliver joy and engagement
tantek: +1 to DavidB
<Zakim> tantek, you wanted to discuss "what is the open web" question, I'll stand by my blog post from 15y ago: https://
tantek: I wrote a blog post on this 15 years ago, still stands
… being able to write such a blog post, being able to share it, being able to have it discovered
… plenty of open web not linked to advertising
… the open web shouldn't be based on the assumption of any particular business models
isaac_MS: I work on MS ads
… is content behind paywalls part of the Open Web?
<mgifford2> Just wanted to raise this perspective on the open web from the Drupal community https://
isaac_MS: some publishers are interested in enforcing publishing rights to avoid getting extracted by AI crawlers
… if they do, will we be left with just slop?
jyasskin: it's a spectrum, not a binary state
… when something that was freely available goes behind a paywall, it has been at least partially removed from the open web
… if this is a per-article cost, that feels proportional to the value of the content
… if this is a full subscription, that feels less proportional
MichaelKleber: (Chrome)
… interested in terms and conditions of how content is used after crawling
… people that are producing content on the Web who would like to have some more rigid expectations about how it is used after it is retrieved
… that has 2 aspects
… one on the policy side - attach legal conditions about how it can be used, e.g. copyright notice
… alternatively, a technical side: visible to some agents and not to others based on knowing/being promised something about how they're going to use the information they've retrieved
… either through promises, reputation, business relationship
… there are potential ways on both sides on expectations for information re-use
… if you work on the technical side, you need to know who the agent is for consistent treatment
DavidS: exact summary of the discussions we've been having at IETF on the two groups I mentioned earlier
Mark: AI Prefs and WebBotAuth
… it's not easy
MichaelK: on preferences - do we always expect this to come from publishers? or from the crawler? (my way or no way)
… or an action process? a negotiation, with possible fallbacks on ad-hoc processes
MarkN: part of the push back we've had in adding this to robots.txt is that crawlers fear they might be blocked for unexpected re-use
hober: I don't think defining the open web matters - what I care about is the ecosystem
… the forces acting on that ecosystem and the levers and parameters we have to tweak
… the current concern about AI related crawling is maybe a new pressure on the ecosystem
… or a different degree of an existing one
… the Web is a conversation between publishers and consumers, incl remixers
… we should focus on these pressures and the levers available to manage it
mt: why do people put web sites up? we don't know
… how is it that the sites putting stuff online sustain themselves?
… some of it is selling product, providing advertising
… there is an economic side
… the Web is a (particularly good) means to an end
… enabling people to have that conversation
<Jxck> https://
<Jxck> > The web should be a platform that helps people and provides a positive social benefit.
<Jxck> > The web should empower an equitable, informed, and interconnected society. It has been, and should continue to be, designed to enable communication and knowledge-sharing for everyone.
mt: Maybe that time is gone - mic drop
Hidde: a lot of people incl me have a site on the open web
… I don't want it to be crawled and have used multiple mechanisms to express it
… and it is being ignored
… part of it is regulatory
… are companies involved in AI prefs among those crawling this content? if not, can we get them involved
MarkN: they were not deeply involved, but the EU code of practice is calling to follow robots.txt
… which changed the composition of the group, incl the # of lawyers in it
RickB: getting more data about the Open Web - I would be interested in providing data from Chrome usage
… we have Crux which could help shed light, happy to hear suggestions
… like Tantek said, it shouldn't be tied to specific business models; have we made a mistake in assuming a particular business model? are there business models that are missing and that we should invest in?
MarkN: re data collection, I've been talking with CloudFlare Radar, CommonCrawl
… maybe we should all chat - but we may need common definitions
<Zakim> jyasskin, you wanted to address Tess: we need some sort of definition of what we want to preserve
jyasskin: Tess' point on definition: we need to look at what we value on the open web
… what to preserve - then work on preserving it
hadleybeeman: also on the TAG
… find this conversation difficult to have without use cases
… even with a paywall, as a developer still use open technologies
… as a user, paying the fair cost of content production feels open
… conversely I hate ads as a user
… esp with tracking
<mt_hates_irc> this is SUCH a different conversation from the one we had last week
hadleybeeman: in the TAG we have discussed agentic AI and how it relates to the notion of user agent
… some aspects are part of the open web, some aren't
<tantek> businesses
markN: "open" is way overused, there are various degrees of openness
reillyg: with my Chrome hat off, speaking as a publisher of content on the Web
… in the early days of the Web, people published content on the web just because they could
… a lot of the commercial publishing came later
<tantek> +1 reillyg: in the early days of the web people published content because they just wanted to
reillyg: it has made the Web better by bringing more content
… everything we do is for people, to make people's lives better
chrishtr: (Google) I like the Open Web, I find it hard to see a future where it goes away and gets replaced by something as good
… interested in hearing more from mt
<johannhof> +1 chrishtr
mt: I don't have a better solution, I was a bit inflammatory
chrishtr: still, what ends is the Web a means to?
mt: there are many ends it serves, not fair to prioritize them
… it's a platform for commerce, a platform for exchange of ideas
… social media has more positives than negatives
… the way people publish information for other purposes, that enriches the world
… all of this is possible because of the Web, and this is not an exhaustive list, nor would that list be short
… I struggle to think if something could be better than that
… but it's worth challenging ourselves on this, instead of just looking at ads revenue
<Zakim> tantek, you wanted to discuss Tess's point of what we value on the open web. frankly to give two examples: Wikipedia and Internet Archive are two of the most important sites on the open web, have zero ad-based business model, zero paywalls. There's tons of opportunities for more such things on the open web that don't depend on the enshittified models of many of today's and to also note Open Street Maps and all the personal sites and blogs of everyone here
tantek: what do we value on the Open Web? what are the use cases?
… ^
<mt_hates_irc> tantek identified wikipedia, open street map, zombo.com, internet archive
tantek: if you had pitched wikipedia and openstreetmap before they existed, you would have heard these were impossible given human nature, and yet, they were created and they exist
… also all the personal sites & blogs of everyone here who has one and beyond - they don't rely on advertising
<jyasskin> https://
<Zakim> jyasskin, you wanted to talk about commons
<mt_hates_irc> plus lots of blogs that people put up for no reason other than to share information or their love of <topic>
jyasskin: I spent the end of last year reading Elinor Ostrom's research on the commons
<tantek> +1 mt_hates_irc
jyasskin: one of her conclusions is that there are several design principles for commons, which I think the Web is
… successful commons have well defined boundaries
<hdv> +1 tantek
<mgifford2> anyone have a reference for that's book on the commons?
jyasskin: both in terms of what can be extracted from it and the boundaries of the commons itself
… that conflicts with the notion of "open"
<tantek> I think the web has disproven that assertion about the commons.
jyasskin: we need to define these boundaries
<tantek> I believe in the web more than a book written by an academic
jyasskin: we are the governance of that commons
Mark: we are A governance
kleber: (Google) want to speak in support of advertising given its property to make content available to both rich and poor people
… looking at new business models and new negotiation models between publishers and consumers should keep that property in mind
<mt_hates_irc> I spoke up in support of advertising last week, as a form of progressive taxation, but noted that it depends on having a reasonable portion of discretionary income available to many people in order to be effective, because of the way that advertising works
Mark: this was mentioned last week, in the context of microtransactions
… which can be regressive where advertising is progressive
<EmLauber> to "why" people were drawn to the Web, my research of W3C participants identifies a shared experience of being drawn to easily sharing and consuming knowledge. Important to acknowledge that as W3C's members experiences, not necessairly indictative of *everyone's* attraction to the Web
MT: +1
kleber: advertising is essentially a tax on all commerce in practice
<mt_hates_irc> think about taxation without representation though... who governs the flow of that taxation?
kleber: and that tax supports a big part of the open web
<mgifford2> Governance is very much dependent on a community to manage it. Just thought I'd highlight https://
cwilso: Google, but more importantly a strong believer in the Web as supporting humanity
… the open web is not a single thing; the things martin listed are important - each of them has to change, evolve, some fade away
… the growth of AI will have a massive effect on some of those
<jyasskin> amy: w.r.t. the research on the commons, Governing The Commons is Ostrom's summary and is very worth reading. https://
cwilso: these are shifting dynamics; see how we can keep what is good to keep, mitigate the other
<kleber> mt_hates_irc The 3rd-party cookies govern the flow, MT!
<mt_hates_irc> kleber: that might be a different sort of governor than the one I was thinking of
<Zakim> tantek, you wanted to react to kleber to respond to the "poor people vs rich people" as if that's exclusive to ads. it's not. the examples I gave, Wikipedia, OSM, IA are also available to "poor people vs rich people", in contrast, a lot of very ad-heavy sites do not perform or work on older devices or especially on slower networks, mobile networks around the world.
<ydaniv> +1 to cwilso! Maintaining quality of content is a key
tantek: ^
<samschlesinger> (typing as a user)
<samschlesinger> There is another side of openness that we haven't spoken much about, which is the ability to access content from arbitrary jurisdictions, devices, platforms, etc.
<samschlesinger> The progressive nature of advertising provides a natural incentive for websites to lock down content to a subset of platforms which provide a higher prior for expected value of advertising. These barriers are often insurmountable by users.
<csarven> strong +1 to tantek
<mt_hates_irc> tantek: if you have read articles on Math and statistics on wikipedia, you might reconsider your "accessible" assessment
MarkN: let's not dive into each perspective on the best part of the Web, and recognize there are different parts we want to preserve
… and figure out how to do so across different perspectives
Isaac: a persistent assumption is that all of adtech/monetization tech is tracking based
… most people in adtech don't know how adtech, let alone browsers, work
… not all publishers are supported by governments like the BBC is (and the world is not necessarily moving towards more of that happening)
… not all of adtech needs to keep working the way it works
… but we shouldn't assume all of it has to be removed
doniv: we haven't talked about the different types of users
<mt_hates_irc> calling on all economists, we need help here
doniv: producers tend to care more about different types of users
… there should be a way for producers to know they're engaging with humans
rbyers: we're talking about different types of trade-offs
… as a community, we haven't done a great job at managing difficult trade-offs
… our job is to be engineers that admit how society as a whole decides on these trade-offs
… rather than us making decisions on these trade-offs
simon: work for c-side
… the nature of the Open Web is that we don't regulate/control what happens on the client side
… allowing people to use publicly accessible information, while ignoring robots.txt is the status quo
… if this happens through an AI agent, that's the continuity
… there are countries where browsers that don't follow rules will be allowed, don't think we can prevent that
<mt_hates_irc> ignoring the social contract implied by robots.txt will backfire, not just on you, but on your entire class of use, maybe more; we're seeing that already
<tantek> +1 mt_hates_irc re: "ignoring the social contract implied by robots.txt will backfire, not just on you, but on your entire class of use"
reillyg: +1 - we're not a government of the Web, all of our standards are voluntary for publishers, consumers, people
mark: how can we understand the motivation to publish content? vs mandating how people do things
… we'll summarize this input and compare with the input of IETF
DavidS: very different conversations and direction compared to IETF
… while a lot of alignment on principles and properties we want to keep
… which is probably one thing we can surface as part of next steps
… to help frame future conversation on the topic
Mark: also getting meaningful metrics to help guide us
<brentz> I think it would be very valuable to outline the positive things we want to keep