Automated Link Maintenance

HTTP has a redirection feature that lets a server tell a client that a document has moved. But few tools let authors set up a redirect easily.

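On traditional servers (CERN and NCSA), setting up a redirect requires administrative access to the server configuration. The Apache server's "asis" feature removes that barrier: an author can publish a file that carries its own HTTP headers, and so set up a redirection without touching the server configuration.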
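
As a rough illustration of the asis approach, here is a minimal Python sketch that builds the text of such a file. The function name make_asis_redirect and the example URL are made up for illustration, and the sketch assumes the server is already configured to send .asis files as-is (that part is still up to the server administrator).

```python
import html


def make_asis_redirect(new_url: str) -> str:
    """Return the text of an .asis file that redirects to new_url.

    Hypothetical helper: the file holds the HTTP headers first, then a
    blank line, then a short HTML body for clients that do not follow
    the redirect automatically.
    """
    # Refuse line breaks so the URL cannot inject extra header lines.
    if "\r" in new_url or "\n" in new_url:
        raise ValueError("URL must not contain line breaks")

    headers = (
        "Status: 301 Moved Permanently\n"
        f"Location: {new_url}\n"
        "Content-type: text/html\n"
    )
    safe = html.escape(new_url, quote=True)
    body = (
        "<html><head><title>Document moved</title></head>\n"
        f'<body><p>This document has moved to <a href="{safe}">{safe}</a>.</p>'
        "</body></html>\n"
    )
    # A blank line separates the headers from the body.
    return headers + "\n" + body


if __name__ == "__main__":
    # Example URL is an assumption, not a real document.
    print(make_asis_redirect("http://www.example.org/new/location.html"))
```

An author would save the output under the old document's name with the .asis extension (for example old.asis) in their own directory on the server.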

Some editing clients have a "check links" feature. When the checker encounters a "301 Moved Permanently" response, it should offer to update the link to the new address.
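
The decision a link checker would make can be sketched in a few lines of Python. This is only a sketch: the function name suggest_link_update is hypothetical, and it assumes the checker has already fetched the link and has the status code and the Location header value in hand.

```python
from typing import Optional
from urllib.parse import urljoin


def suggest_link_update(link_url: str, status: int,
                        location: Optional[str]) -> Optional[str]:
    """Return the URL a link should be updated to, or None.

    Only a permanent redirect (301) justifies rewriting the link;
    temporary redirects and other responses leave it alone.
    """
    if status != 301 or not location:
        return None
    # Location may be relative in practice; resolve it against the old URL.
    new_url = urljoin(link_url, location)
    # A redirect back to the same URL gives nothing to update.
    return new_url if new_url != link_url else None


if __name__ == "__main__":
    old = "http://www.example.org/old.html"  # made-up example URL
    proposal = suggest_link_update(old, 301, "/new.html")
    if proposal is not None:
        print(f"{old} has moved permanently; update link to {proposal}?")
```

The editor would still ask the author before changing anything, since the author may prefer to drop or reword the link instead.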

Research Notebook: Further Reading

Frank Kappe: A Scalable Architecture for Maintaining Referential Integrity in Distributed Information Systems, J.UCS, Vol. 1, No. 2, February 1995.
Abstract: One of the problems that we experience with today's most widespread Internet Information Systems (like WWW or Gopher) is the lack of support for maintaining referential integrity. Whenever a resource is (re)moved, dangling references from other resources may occur. This paper presents a scalable architecture for automatic maintenance of referential integrity in large (thousands of servers) distributed information systems. A central feature of the proposed architecture is the p-flood algorithm, which is a scalable, robust, prioritizable, probabilistic server-server protocol for efficient distribution of update information to a large collection of servers. The p-flood algorithm is now implemented in the Hyper-G system, but may in principle also be implemented as an add-on for existing WWW and Gopher servers.

Hmmm... cool protocol/mechanism. But what about policy? Do all the servers in the world trust each other? What do you do when you find a resource has gone away? Does the author get notified?

Supporting the Web: A Distributed Hyperlink Database System
James E. Pitkow & R. Kipp Jones
Graphics, Visualization, & Usability (GVU) Center
College of Computing
Georgia Institute of Technology
Atlanta, GA 30332-0280
pitkow@cc.gatech.edu kjones@harbinger.net


Abstract: In our last paper [Pitkow & Jones 1995], we presented an integrated scheme for an intelligent publishing environment that included a locally maintained hyperlink database. This paper takes our previous work full cycle by extending the scope of the hyperlink database to include the entire Web. While the notion of hyperlink databases has been around since the beginnings of hypertext, the Web provides the opportunity to experiment with the largest open distributed hypertext system. The addition of hyperlink databases to the Web infrastructure positively impacts several areas including: referential integrity, link maintenance, navigation and visualization. This paper presents an architecture and migration path for the deployment of a scalable hyperlink database server called Atlas. Atlas is designed to be scalable, autonomous, and weakly consistent. After introducing the concept and utility of link databases, this paper discusses the Atlas architecture and functionality. We conclude with a discussion of subscriber and publisher policies that exploit the underlying hyperlink infrastructure and intelligent publishing environments.


Keywords: World Wide Web, Internet infrastructure, distributed hyperlink databases, protocol extensions, consistency, integrity.

cool: free software. gotta try it out.

atlasp: add to http?

Connolly created Aug 96