The goal of this community is to design web architecture and specifications to mitigate problems such as link rot, content drift, Internet censorship, and denial-of-service attacks. If, after following a hyperlink, the content is missing or not what you expected, we want it to be easier to find what you were looking for.
Note: Community Groups are proposed and run by the community. Although W3C hosts these
conversations, the groups do not necessarily represent the views of the W3C Membership or staff.
The Berkman Center for Internet & Society at Harvard University is pleased to release Amber, a free software tool for WordPress and Drupal that preserves content and prevents broken links. When installed on a blog or website, Amber can take a snapshot of the content of every linked page, ensuring that even if those pages are interfered with or blocked, the original content will be available.
“The Web’s decentralization is one of its strongest features,” said Jonathan Zittrain, Faculty Chair of the Berkman Center and George Bemis Professor of International Law at Harvard Law School. “But it also means that attempting to follow a link might not work for any number of reasons. Amber harnesses the distributed resources of the Web to safeguard it. By allowing a form of mutual assistance among Web sites, we can together ensure that information placed online can remain there, even amidst denial of service attacks or broad-based attempts at censorship.”
The release of Amber builds on an earlier proposal from Zittrain and Sir Tim Berners-Lee for a “mutual aid treaty for the Internet” that would enable operators of websites to easily bolster the robustness of the entire web. It also aims to mitigate risks associated with increasing centralization of online content. A shrinking number of entities host information online, creating choke points that can restrict access to web content. Amber addresses this by enabling the storage of snapshots via multiple archiving services, such as the Internet Archive’s Wayback Machine and Perma.cc.
Amber is useful for any organization or individual that has an interest in preserving the content to which their website links. In addition to news outlets, fact-checking organizations, journalists, researchers, and independent bloggers, human rights curators and political activists could also benefit from using Amber to preserve web links. The launch is the result of a multi-year research effort funded by the U.S. Agency for International Development and the Department of State.
“We hope supporters of free expression may use Amber to rebroadcast web content in a manner that aids against targeted censorship of the original web source,” said Genève Campbell, Amber’s technical project manager. “The more routes we provide to information, the more all people can freely share that information, even in the face of filtering or blockages.”
Amber is one of a suite of initiatives of the Berkman Center focused on preserving access to information. Other projects include Internet Monitor, which aims to evaluate, describe, and summarize the means, mechanisms, and extent of Internet content controls and Internet activity around the world; Lumen, an independent research project collecting and analyzing requests for removal of online content; and Herdict, a tool that collects and disseminates real-time, crowdsourced information about Internet filtering, denial of service attacks, and other blockages. It also extends the mission of Perma.cc, a project of the Library Innovation Lab at the Harvard Law School Library. Perma.cc is a service that helps scholars, courts and others create web citation links that will never break.
Amber is now available for sites that run on WordPress.org or Drupal. Find out more and download the plugin at amberlink.org.
Some initiatives, such as the Internet Archive, Perma.cc, and Memento, are attempting to snapshot and preserve the Internet and provide seamless access to those snapshots.
But more and more, just a handful of centralized entities host information online. Online centralization creates “choke points” that can restrict access to web content.
This Community Group intends to pursue complementary solutions to missing online content from various angles:
date-stamped archiving of web content
enabling content management systems and content authors to embed knowledge of archives and citation dates into links
providing browsing users with ways to discover this information
The more routes we provide to information, the more all people can freely share that information, even in the face of filtering or blockages.
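As a rough illustration of the second item above (embedding knowledge of archives and citation dates into links), the sketch below renders an HTML anchor that carries an archived copy's URL and the date the link was cited. The attribute names `data-versionurl` and `data-versiondate`, the function name `annotate_link`, and the example hostnames are assumptions made for this sketch; they are not taken from this group's documents and should not be read as the attributes the group settled on.

```python
from datetime import date
from html import escape


def annotate_link(href, text, archive_url, cited_on):
    """Render an <a> element annotated with an archive URL and a citation date.

    The attribute names below are illustrative placeholders only.
    """
    attrs = {
        "href": href,
        "data-versionurl": archive_url,            # hypothetical: where a snapshot lives
        "data-versiondate": cited_on.isoformat(),  # hypothetical: when the link was cited
    }
    rendered = " ".join(
        f'{name}="{escape(value, quote=True)}"' for name, value in attrs.items()
    )
    return f"<a {rendered}>{escape(text)}</a>"
```

A content management system could call something like `annotate_link(url, label, snapshot_url, date.today())` when an author saves a post, so that a browser or extension can later offer the snapshot if the original link fails.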
I felt like mentioning that, in addition to the Memento protocol and the mset document mentioned by Ryan, there is also the Missing Link document that emerged last year as a result of work in the Hiberlink project. The document provides:
A motivation for annotating links with attributes aimed at increasing link robustness,
A couple of proposals for such attributes.
The Missing Link document predates the mset document and should be read with that in mind.
This group began as a collaboration between teams at Los Alamos National Laboratory and Old Dominion University and, more recently, the Berkman Center for Internet & Society at Harvard University and Perma.cc.
To start things off, here are links to a couple of efforts already in progress:
The Memento Protocol Specification (published as RFC 7089), an application-layer protocol to query & obtain prior states of an online resource: http://www.mementoweb.org/guide/rfc/
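To make the protocol a little more concrete, here is a minimal sketch of the two pieces of HTTP plumbing a Memento client deals with: the `Accept-Datetime` request header sent to a TimeGate, and the `Link` response header that points to the original resource, a TimeMap, and mementos. The header names and `rel` values come from RFC 7089; the function names are invented for this sketch, and the parser is deliberately simplified (it assumes double-quoted parameter values and is not a full Web Linking parser). No network request is made.

```python
import re
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime


def accept_datetime_header(when):
    """Return the request header a client would send to a TimeGate.

    `when` must be timezone-aware; the value is written as an
    RFC 1123 date in GMT, the format RFC 7089 uses.
    """
    if when.tzinfo is None:
        raise ValueError("when must be timezone-aware")
    value = format_datetime(when.astimezone(timezone.utc), usegmt=True)
    return {"Accept-Datetime": value}


def parse_memento_links(link_header):
    """Parse a Link header value into a list of dicts (simplified).

    Each dict has "url", "rel" (a list of relation types) and, when
    present, "datetime" (an aware datetime parsed from the parameter).
    """
    links = []
    # Each link is <url> followed by parameters, up to the next "<".
    for match in re.finditer(r'<([^>]+)>([^<]*)', link_header):
        url, raw_params = match.groups()
        params = dict(re.findall(r'([\w-]+)="([^"]*)"', raw_params))
        entry = {"url": url, "rel": params.get("rel", "").split()}
        if "datetime" in params:
            entry["datetime"] = parsedate_to_datetime(params["datetime"])
        links.append(entry)
    return links
```

A client would send the header from `accept_datetime_header` with a GET or HEAD request to a TimeGate, then pass the response's `Link` header to `parse_memento_links` and pick the entries whose `rel` list contains `memento`.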
Apart from getting the word out that link rot, content drift, Internet censorship, and denial-of-service attacks are problems worth mitigating, our top priorities are implementing RFC 7089 and designing the mset attribute.
This is a community initiative. This group was originally proposed on 2014-05-08 by Ryan Westphal. The following people supported its creation: Ryan Westphal, Nick Doty, Olivier Thereaux, Karl Dubost, Matisse VerDuyn. W3C’s hosting of this group does not imply endorsement of its activities.