Robustness and Archiving Community Group

The goal of this community is to design web architecture and specifications to mitigate problems such as link rot, content drift, Internet censorship, and denial-of-service attacks. If, after following a hyperlink, the content is missing or not what you expected, we want it to be easier to find what you were looking for.

Note: Community Groups are proposed and run by the community. Although W3C hosts these conversations, the groups do not necessarily represent the views of the W3C Membership or staff.

No Reports Yet Published

This group does not have a Chair and thus cannot publish new reports. Learn how to choose a Chair.

Harvard University’s Berkman Center Releases Amber, a “Mutual Aid” Tool for Bloggers & Website Owners to Help Keep the Web Available

The Berkman Center for Internet & Society at Harvard University is pleased to release Amber, a free software tool for WordPress and Drupal that preserves content and prevents broken links. When installed on a blog or website, Amber can take a snapshot of the content of every linked page, ensuring that even if those pages are interfered with or blocked, the original content will be available.

“The Web’s decentralization is one of its strongest features,” said Jonathan Zittrain, Faculty Chair of the Berkman Center and George Bemis Professor of International Law at Harvard Law School. “But it also means that attempting to follow a link might not work for any number of reasons. Amber harnesses the distributed resources of the Web to safeguard it. By allowing a form of mutual assistance among Web sites, we can together ensure that information placed online can remain there, even amidst denial of service attacks or broad-based attempts at censorship.”

The release of Amber builds on an earlier proposal from Zittrain and Sir Tim Berners-Lee for a “mutual aid treaty for the Internet” that would enable operators of websites to easily bolster the robustness of the entire web. It also aims to mitigate risks associated with increasing centralization of online content. Fewer and fewer entities host information online, creating choke points that can restrict access to web content. Amber addresses this by enabling the storage of snapshots via multiple archiving services, such as the Internet Archive’s Wayback Machine and Perma.cc.

Amber is useful for any organization or individual that has an interest in preserving the content to which their website links. In addition to news outlets, fact-checking organizations, journalists, researchers, and independent bloggers, human rights curators and political activists could also benefit from using Amber to preserve web links. The launch is the result of a multi-year research effort funded by the U.S. Agency for International Development and the Department of State.

“We hope supporters of free expression may use Amber to rebroadcast web content in a manner that aids against targeted censorship of the original web source,” said Genève Campbell, Amber’s technical project manager. “The more routes we provide to information, the more all people can freely share that information, even in the face of filtering or blockages.”

Amber is one of a suite of initiatives of the Berkman Center focused on preserving access to information. Other projects include Internet Monitor, which aims to evaluate, describe, and summarize the means, mechanisms, and extent of Internet content controls and Internet activity around the world; Lumen, an independent research project collecting and analyzing requests for removal of online content; and Herdict, a tool that collects and disseminates real-time, crowdsourced information about Internet filtering, denial of service attacks, and other blockages. It also extends the mission of Perma.cc, a project of the Library Innovation Lab at the Harvard Law School Library. Perma.cc is a service that helps scholars, courts and others create web citation links that will never break.

Amber is now available for sites that run on WordPress.org or Drupal. Find out more and download the plugin at amberlink.org.

What’s the Problem?

Before we enumerate use cases, we can state some initial indications of a problem.

Whether links fail because of DDoS attacks, censorship, or just plain old link rot, dead links are a problem for Internet users everywhere.

This isn’t a new problem (W3C – Cool URIs don’t change).

49% of links in Supreme Court opinions are dead
NYT – In Supreme Court Opinions, Web Links to Nowhere

136,312 Wikipedia articles contain dead external links
Wikipedia – Category:All articles with dead external links

Some initiatives, such as the Internet Archive, Perma.cc, and Memento, are attempting to snapshot and preserve the Internet and provide seamless access to those snapshots.

But more and more, just a handful of centralized entities host information online. Online centralization creates “choke points” that can restrict access to web content.

This Community Group intends to pursue complementary solutions to missing online content from various angles:

  • date-stamped archiving of web content
  • enabling content management systems and content authors to embed knowledge of archives and citation dates into links
  • providing browsing users with ways to discover this information
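As a rough illustration of the second and third points, the sketch below writes an archive location and a citation date into a link and then reads them back. The attribute names data-archive-url and data-cited-on are hypothetical placeholders chosen for this example; they are not the syntax of the mset attribute or of the Missing Link proposal. The function names and the example.org / example.net URLs are likewise assumptions.

```python
from datetime import date
from html import escape
from html.parser import HTMLParser


def annotated_link(href: str, text: str, archive_url: str, cited_on: date) -> str:
    """Authoring side: emit an anchor that carries an archive hint.

    The data-* attribute names are hypothetical, used only for this sketch.
    """
    return '<a href="{}" data-archive-url="{}" data-cited-on="{}">{}</a>'.format(
        escape(href, quote=True),
        escape(archive_url, quote=True),
        cited_on.isoformat(),  # ISO 8601 date, e.g. 2014-05-08
        escape(text),
    )


class ArchiveHintParser(HTMLParser):
    """Reading side: collect (href, archive URL, citation date) from anchors."""

    def __init__(self):
        super().__init__()
        self.hints = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attributes = dict(attrs)
        if "data-archive-url" in attributes:
            self.hints.append((
                attributes.get("href"),
                attributes["data-archive-url"],
                attributes.get("data-cited-on"),
            ))


def discover_hints(html: str) -> list:
    """Return every archive hint found in an HTML fragment."""
    parser = ArchiveHintParser()
    parser.feed(html)
    parser.close()
    return parser.hints


if __name__ == "__main__":
    link = annotated_link(
        "http://example.org/report?id=1&v=2",
        "Report",
        "http://archive.example.net/20140508/report",
        date(2014, 5, 8),
    )
    print(link)
    print(discover_hints(link))
```

A content management system could call something like annotated_link when a post is saved, and a browser extension or page script could use the reading side to offer the archived copy when the original link fails.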

The more routes we provide to information, the more all people can freely share that information, even in the face of filtering or blockages.

Missing Link proposal

I felt like mentioning that, in addition to the Memento protocol and the mset document mentioned by Ryan, there is also the Missing Link document that emerged last year as a result of work in the Hiberlink project. The document provides:

  • A motivation for annotating links with attributes aimed at increasing link robustness,
  • A couple of proposals for such attributes.

The Missing Link document predates the mset document and should be read with that in mind.

Thank you for supporting Robustness & Archiving

This group began as a collaboration between teams at Los Alamos National Laboratory and Old Dominion University and, more recently, the Berkman Center for Internet & Society at Harvard University and Perma.cc.

To start things off, here are links to a couple of efforts already in progress:

The Memento Protocol Specification (published as RFC 7089), an application-layer protocol to query & obtain prior states of an online resource: http://www.mementoweb.org/guide/rfc/
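To give a feel for how a client uses the protocol, here is a minimal sketch of two pieces of RFC 7089: building the Accept-Datetime request header that is sent to a TimeGate, and picking the rel="memento" entries (with their datetime attributes) out of a Link response header. It makes no network requests. The function names and the example.org / example.net URLs are assumptions for illustration, and the parser assumes every link parameter is double-quoted, which a production client should not rely on.

```python
import re
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime

# One link-value: <target> followed by ;name="value" parameters.
# Simplification: assumes every parameter value is double-quoted.
LINK_RE = re.compile(r'<([^>]*)>((?:\s*;\s*[\w-]+="[^"]*")*)')
PARAM_RE = re.compile(r'([\w-]+)="([^"]*)"')


def accept_datetime_header(when: datetime) -> dict:
    """Build the header a client sends to ask for a resource's state near `when`."""
    # RFC 7089 expects an RFC 1123 style date expressed in GMT.
    return {"Accept-Datetime": format_datetime(when.astimezone(timezone.utc), usegmt=True)}


def parse_mementos(link_header: str) -> list:
    """Return (URI, datetime) pairs for every link whose rel includes 'memento'."""
    mementos = []
    for target, params in LINK_RE.findall(link_header):
        attributes = dict(PARAM_RE.findall(params))
        # rel may hold several space-separated types, e.g. "first memento".
        relations = attributes.get("rel", "").split()
        if "memento" in relations and "datetime" in attributes:
            mementos.append((target, parsedate_to_datetime(attributes["datetime"])))
    return mementos


if __name__ == "__main__":
    print(accept_datetime_header(datetime(2014, 5, 8, 12, 0, tzinfo=timezone.utc)))
    sample = (
        '<http://a.example.org/>; rel="original", '
        '<http://archive.example.net/20010911203610/http://a.example.org/>; '
        'rel="memento"; datetime="Tue, 11 Sep 2001 20:36:10 GMT"'
    )
    for uri, captured in parse_mementos(sample):
        print(uri, captured.isoformat())
```

A real client would send the header to a TimeGate for the original resource and follow the redirect to the selected memento; that part is omitted here to keep the sketch self-contained.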

The mset attribute (working title), an HTML attribute to provide temporal context to & locations of copies of target content for a hyperlink: http://berkmancenter.github.io/cache-link/mset-attribute.html
(issue tracker for spec)

Apart from getting the word out that link rot, content drift, Internet censorship, and denial-of-service attacks are actually problems worth mitigating, implementations of RFC 7089 and design of the mset attribute are our top priorities.

Call for Participation in Robustness and Archiving Community Group

The Robustness and Archiving Community Group has been launched:


The goal of this community is to design web architecture and specifications to mitigate problems such as link rot, content drift, Internet censorship, and denial-of-service attacks. If, after following a hyperlink, the content is missing or not what you expected, we want it to be easier to find what you were looking for.


In order to join the group, you will need a W3C account.

This is a community initiative. This group was originally proposed on 2014-05-08 by Ryan Westphal. The following people supported its creation: Ryan Westphal, Nick Doty, Olivier Thereaux, Karl Dubost, Matisse VerDuyn. W3C’s hosting of this group does not imply endorsement of its activities.

The group now has access to W3C-hosted services for email, blog, wikis, irc, tracking tools, and more. Read more about tools and services available by default and upon request.

If you believe that there is an issue with this group that requires the attention of the W3C staff, please send us email at site-comments@w3.org

Thank you,
W3C Community Development Team