On October 11, people from W3C met with people from HP/MIT's DSpace project. Following are the proceedings.
Ralph Swick, Dan Brickley, Art Barstow, Eric Prud'hommeaux and Dave Beckett from W3C met Mick Bass, Margaret Benchofsky and David Stuve.

Mick Bass:
  project leader for DSpace
  HP employee
  DSpace has MIT, HP, and contractors
Eric:
Ralph Swick:
  RDF is a long-time project.
  integrated with other work under the name of metadata, though it's all data
  Danny W and ??? have had prior conversations, but thanks to Art and Mick
Margaret Benchofsky:
  long-time librarian
  faculty contact for DSpace
  UI interests
Dan Brickley:
  RDF interest group
  RDF schema spec
  other .5 time: working with Dave in Bristol
Dave Beckett:
  working on metadata for 5 years
  developing metadata library called Redland
David Stuve:
  HP donation to MIT for next 1.5 years
  DSpace design work
  archive aspects
  metadata storage
  may also be doing some UI stuff
Ralph:
  we have several metadata projects
  in the middle of defining next phase of metadata activity
  Art is here to do RDF tool development
  1 year ago we got .5 danbri from Bristol
  Bristol work is separately coordinated
danbri:
  another connection - HP labs in Bristol
  4-5 ILRT developers in lunch powwows
joyce:
  master's student working with Mick
  Hal Abelson is thesis advisor

sheet 1:
  purpose:
    establish W3C-DSpace relationship
    get feedback - avoid blindspots
  goals:
    W3C understand DSpace
    DSpace understand frontier/direction

sheet 2:
  Steps:
    intros - done
    DSpace Metadata Approach - dstuve
    W3C - RDF activities Review - art
    W3C - Metadata activities Review - ralph
    free-for-all
    lunch!

---- DSpace Metadata Approach ----
dstuve:
  sponsored by HP and MIT libraries
  goal: institutional intellectual capital archive
  others: LANL (physics + math), CogNet (MIT Press)
  lots of domain-specific
<enter peter>
  umbrella archive for everything produced by MIT
  cross discipline - therefore need metadata abstraction

+------------------------------------+
| math subspace                      |
|  periodicals   MIT press   theses  |
+------------------------------------+
| physics subspace                   |
...
Mick: there is metadata associated with the subspace itself as well as the individual ?things?
dstuve:
  Asset store - metadata and document store
<enter bill catty>
  Services layer - web server, harvesting, sharing, replication
Ralph: maintain access to papers in perpetuity - how about intellectual property rights?
dstuve: indeed it's a big part - hoping to be a pioneer - hope to cause conflict in the area
Mick: Eric Celeste has 1.5 positions to look at business models
  view libraries as source rather than cost center
Bill Catty:
  MIT Information Systems
  learned from Dublin Core, Warwick Framework
  must let disciplines define their own metadata set [schema - ed]
  must integrate with existing discipline-specific metadata sets (e.g. MEDLINE)
  automatic import of metadata standards
Mick: fundamental hard problem: continuum between efficiency of search and ability to extend
  hard coded tables <-----------> abstract
Ralph: multiple metrics of efficiency: DB efficiency, query creation efficiency, schema maintenance efficiency
Eric: each select decomposes to a join
art: do you see
Marg: yes
Ralph: slow evolution
Ralph: you hope to absorb a lot of data from existing repositories
Marg: will ask end users to submit
dstuve: bulk import is a way to jumpstart and build credibility/interest
  ?subspaces is an illusion to orient user?
Mick: there is an interest that disciplines publish their own stable schema
  how much can we automate schema import
Eric: not always necessary for mere data slurp and burp
danbri: how about UI extensions
Mick: that and what mods need to be made to the data store
Ralph: what sort of user incentives?
Marg: not a huge amount of incentive.
  math discipline has expressed this need before
  oceanography needs a system anyway
Ralph: need access to old data
Mick: feedthrough mechanism to share data with existing repositories that already have critical mass - probably possible
  Industrial Liaison Program has an incentive program
Ralph: goal: absorb *anything* that the author wants to say about their article
dstuve: MIT: if you build it, they won't come. you have to go to them.
  Margaret is on board to get influential users on board.
  extensible looks and feels
  policy-independent as we will adopt *your* policy
Margaret: some labs and centers have outside company support and have small DBs that are undeveloped and work to maintain
danbri: will you have folks maintain their own systems and export to DSpace
dstuve: document evolution is a draw
  pick up incarnations from various domains (whitepaper - thesis - book)
  may pass-through DSpace to other repositories
Ralph: we want to represent relationships between resources over time
  evolve a few that most are familiar with but need to hold all
Mick: provenance - new class of metadata i hadn't considered
  need to offer harvesting, we don't have a lot of leverage to change people's behavior
Ralph: will you be content-type agnostic
Mick: "for your content to be useful in the future, we recommend x"
danbri: if documents are blobs it's a non-issue
dstuve: don't want arbitrary UI
Ralph: what about the data itself?
dstuve: import/export mechanism
  system can put data between spaces
Ralph: what do you want to make efficient?
Mick: make certain structures efficient and general queries less so
danbri: "framework" is a problem word: disciplines' schemas are not germane to that discipline only
dstuve: author may mean different things depending on the discipline
  workflow items outside of the space
Ralph: it's different vocab, but does the system need to handle it differently
dstuve: metadata comes from user
  workflow comes from administrator
Mick: simply a question of roles
  let's move on to rights
  need to decide if there are two models
Eric: two? three? n? what about in-between data?
Ralph: can you issue workflow-based queries?

---- Rights ----
dstuve: there is a god admin who is allowed to define rights and empower folks
  rights - one is the ability to hang code off a right
  rights may be time-based, person-based.
  we expect to get some rights stuff wrong
  attach rights to any element
Ralph: what is the granularity?
dstuve: element is any file in the silo
  code on elements for indexing
  [more refs to Fedora]
  rights are namespaced to distinguish ASAP right from ... right
Eric: wanna see my model? (documents, ACLs) organized by homogeneity of rights
Mick: how volatile are those rights?
Ralph: user feedback for expensive non-optimizable queries
dstuve: let users know when data is non-local
Peter: CORBA is being superseded by XML stuff
< discussion of HTTP NG >
danbri: are you looking at something like CORBA?
dstuve (or maybe bill?): user interface would correspond to hard java code in the server
Peter: peter bretten - consultant on the project
  need ad hoc ...
  CORBA is too heavy and slow
  looking at SOAP
  perl can talk to java in SOAP
  SOAP isn't really mature
  looking at HP's E-Speak
  SOAP addresses marshalling
  E-Speak addresses what can i do, what types?
< XML protocol charter not including service advertisement >
dstuve: pushing not for RDF but a standard entity relationship diagram
  and think about it in less abstract terms for the first cut
  RDF is a new tool set and new nomenclature
  may want to use present tools and nomenclature
danbri: what you do in the privacy of your database is your own business
  Sergey Melnik is looking at UML and RDF
  merging data from multiple domains may not be addressable with conventional relational tools
Ralph: implications have to do with long-term extensibility
danbri: large tool gap - many more tools for UML
Ralph: there are *no* tools that use XML namespaces the way they are meant to be used.

< lunch >

Art, Ralph, danbri: discussion of current tools

---- future-proofing ----
Ralph:
Eric: contents, deconstructed?
  if you peer into an image/gif, you can get the size
  if you peer into a text/xml, you can harvest RDF metadata

---- rights ----
Ralph: need to decide whether to serve documents as well as tell the client what their rights are
Ralph describes P3P: not actually in RDF
danbri: salvageable with XSLT
Ralph describes XSLT
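
[ed: a minimal sketch of the "peer into the content" idea from the future-proofing discussion above. Nothing like this was shown at the meeting. The function names (gif_size, rdf_blocks, peek) and the returned dictionary keys are hypothetical, and the sketch only assumes the fixed GIF header layout and the standard RDF syntax namespace.]

```python
import struct
import xml.etree.ElementTree as ET

# Standard namespace for the RDF/XML syntax.
RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"


def gif_size(data: bytes) -> tuple:
    """Return (width, height) read from a GIF header (hypothetical helper)."""
    if data[:6] not in (b"GIF87a", b"GIF89a"):
        raise ValueError("not a GIF")
    # Width and height are little-endian 16-bit values right after the signature.
    return struct.unpack("<HH", data[6:10])


def rdf_blocks(data: bytes) -> list:
    """Return every rdf:RDF element found in an XML document (hypothetical helper)."""
    root = ET.fromstring(data)
    # iter() also yields the root itself when the root is rdf:RDF.
    return list(root.iter("{%s}RDF" % RDF_NS))


def peek(content_type: str, data: bytes) -> dict:
    """Dispatch on content type and return whatever metadata can be extracted."""
    if content_type == "image/gif":
        width, height = gif_size(data)
        return {"width": width, "height": height}
    if content_type == "text/xml":
        return {"rdf_blocks": len(rdf_blocks(data))}
    # Unknown types are treated as opaque blobs.
    return {}
```

[ed: unknown content types fall through as opaque blobs, which matches danbri's earlier remark that blobs are a non-issue for the store.]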