What is Haystack?

Haystack is an extensible Semantic Web Browser developed by the Haystack research group at the MIT Computer Science and Artificial Intelligence Laboratory (http://haystack.lcs.mit.edu). The project explores how users can apply the Semantic Web data model (RDF) to better organize, navigate, and retrieve information, both personal and shared. The Haystack Semantic Web Browser was made available for download in 2004 to illustrate how the Semantic Web can improve not only machine agents' ability to automate tasks for users but also the end-user information management experience. Haystack was also designed to be extensible: because it is based on the Eclipse Rich Client Platform, developers can add new functionality in the form of Eclipse plug-ins. Moreover, in line with the Semantic Web model, end users can further refine their experience by creating and selecting user-interface components called semantic lenses.

Haystack is based on the idea that merely providing access to information is insufficient: information also needs to be aggregated in meaningful (semantic) ways. Applications must therefore help users filter and select the information that is relevant to the specific needs at hand, that is, to the context. Relevance depends on factors such as the kind of information being looked at and the task the user is performing. A semantic lens is a machine-readable description (expressed in RDF) of how to extract the relevant information for a specific set of circumstances. Like optical lenses, which bring particular aspects of an image into focus, semantic lenses allow a Semantic Web Browser to present a set of facets around an object to the user in an intelligent fashion. And just as an avid photographer owns various lenses for capturing different kinds of images, a Semantic Web Browser can be equipped with various semantic lenses that let users explore different forms of information.

As an example, consider looking up a gene in a browser. Collectively, the Internet, a local intranet, and genomic information records give access to information ranging from the gene's genomic location, sequence, splice variants, polymorphisms, and related homologues to lists of related research papers, annotations, known patterns of expression, and relation to diseases. While a single tool for reaching all of this information from one place might be useful, simply putting every piece of knowledge about a gene on one screen would be overwhelming. Instead, specific subsets are needed depending on whether the user is trying to find sequence motifs or is researching the different roles the protein product plays within molecular pathways. For the latter case, the "pathway model", "co-expression", and "disease link" semantic lenses might prove useful; the former case might require the "metabolic" and "signaling" semantic lenses to be shown. In this way, semantic lenses act as building blocks for information displays: they can be pieced together to bring the right subset of knowledge to the scientist for whatever task the scientist is doing at the moment.
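
The gene example can be sketched in a few lines of Python. This is only an illustration of the idea, not Haystack's implementation or API: Haystack describes lenses in RDF, whereas the sketch below models a lens as a plain set of predicates. The lens names come from the example above; every identifier, predicate name (the `ex:` prefix), and value is a hypothetical placeholder.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """One RDF-style statement: subject, predicate, object."""
    subject: str
    predicate: str
    obj: str


# Hypothetical facts about a gene; all identifiers and values are placeholders.
GENE_FACTS = [
    Triple("gene:example", "ex:genomicLocation", "placeholder-location"),
    Triple("gene:example", "ex:sequence", "placeholder-sequence"),
    Triple("gene:example", "ex:participatesInPathway", "placeholder-pathway"),
    Triple("gene:example", "ex:coExpressedWith", "gene:other"),
    Triple("gene:example", "ex:linkedToDisease", "placeholder-disease"),
    Triple("gene:other", "ex:linkedToDisease", "placeholder-disease-2"),
]

# Assumption: a lens is modelled as the set of predicates it brings into focus.
LENSES = {
    "pathway model": {"ex:participatesInPathway"},
    "co-expression": {"ex:coExpressedWith"},
    "disease link": {"ex:linkedToDisease"},
}


def apply_lens(lens_name, subject, facts):
    """Return the facts about `subject` that the named lens selects."""
    predicates = LENSES[lens_name]
    return [t for t in facts if t.subject == subject and t.predicate in predicates]


def build_display(lens_names, subject, facts):
    """Piece several lenses together into one display, keyed by lens name."""
    return {name: apply_lens(name, subject, facts) for name in lens_names}


# A researcher studying pathway roles picks only the lenses relevant to that task.
display = build_display(["pathway model", "disease link"], "gene:example", GENE_FACTS)
for lens_name, triples in display.items():
    print(lens_name, [(t.predicate, t.obj) for t in triples])
```

Choosing a different list of lens names yields a different view over the same underlying facts, which is the sense in which lenses act as building blocks for a display.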
