IRC log of simile on 2003-10-09
Timestamps are in UTC.
- 15:01:54 [RRSAgent]
- RRSAgent has joined #simile
- 15:02:21 [ericm]
- RRSAgent, pointer?
- 15:02:21 [RRSAgent]
- See http://www.w3.org/2003/10/09-simile-irc#T15-02-21
- 15:02:51 [marbut]
- marbut has joined #simile
- 15:03:39 [kevins2]
- kevins2 has joined #simile
- 15:06:30 [marbut]
- mickBass: we need to take a decision about whether to hold the plenary in November or December
- 15:06:51 [marbut]
- mickBass: I can't attend in November, but not certain I will be able to attend in Dec
- 15:06:57 [ericm]
- q+ to remind himself to discuss Leonardo,da Vinci,1452-1519 example as an algorithmic means that several communities use to create a unique id for people
- 15:07:24 [marbut]
- mickBass: couple of reasons to move it - we need to get the data, this will help us refine the demo script
- 15:07:38 [marbut]
- then have some working suggestions on what is required to drive this to completion
- 15:07:54 [marbut]
- with the November date, we might have the script in the required state, but it's a bit risky
- 15:08:08 [ericm]
- q+ to make the point that xml2003 is dec 7-12 http://www.xmlconference.org/xmlusa/ and as such at risk for dec f2f meeting 7-10
- 15:08:13 [marbut]
- the second objective is to do some planning for the project post demonstrator
- 15:08:33 [AndyS]
- re: eric's queue: we should be processing the values to extract composite information
- 15:08:38 [marbut]
- so we need to review key learnings, contributions from the team members going forward
- 15:08:50 [ericm]
- +1 AndyS
- 15:09:26 [jse]
- jse has joined #simile
- 15:10:16 [mickBass]
- marbut: key point, think it's important to include the hires in the plenary, important to use the plenary as a way to bring them into the team
- 15:10:31 [mickBass]
- marbut: may be logistics issues wrt start dates, but we should try to work around these
- 15:10:58 [mickBass]
- marbut: suggestion of position papers to capture current thinking, interested in feedback from the team and PIs
- 15:11:16 [mickBass]
- karger: in Vancouver dec 9-10
- 15:11:48 [marbut]
- karger: I am at NIPS from Dec 8 to the 13th
- 15:12:09 [marbut]
- mickBass: let's fork discussion into 2 threads - we need to find dates that are workable
- 15:12:19 [marbut]
- and we need to have a structure that is reasonable
- 15:13:06 [marbut]
- so we would like feedback on 1) using position papers to stimulate discussion and 2) the inclusion of the new hires in this process
- 15:13:31 [marbut]
- kevins2: I think day one issues are more important than day 2 issues, we feel off schedule, so we need to concentrate on getting
- 15:13:38 [marbut]
- the demonstrator complete
- 15:14:14 [marbut]
- invite rssagent
- 15:14:38 [marbut]
- AndyS: I have a concern that we are trying to cram too much into day 1
- 15:15:00 [marbut]
- mickBass: so I am hearing we need to allocate more time to the topics currently on day 1
- 15:15:47 [marbut]
- AndyS: there are different people present on day 1 and day 2, this means we can't reschedule
- 15:16:24 [marbut]
- mark: do we need the whole team involved in the demo discussion?
- 15:16:49 [marbut]
- kevins2: maybe we could make more progress with a smaller team. I think the people we need are David, MacKenzie and Mick
- 15:16:58 [marbut]
- we want a script of the user interface for the demo
- 15:17:10 [marbut]
- the rest of us would slow things down
- 15:18:06 [marbut]
- kevins2: the mapping, modelling, inference rules stuff is something Andy, Kevin, Mark could work on
- 15:18:22 [marbut]
- but without knowing what the output is (the script) that could be hard
- 15:18:33 [marbut]
- AndyS: we need to know what the script is
- 15:19:27 [marbut]
- mickBass: so by having the plenary in december, we could enter with some solid confidence in the demo script
- 15:19:55 [marbut]
- kevins2: there is a problem if Mick is not there, would Mark be filling in?
- 15:21:03 [marbut]
- ericm: I like the ideas of pushing this to december, having the new staff in place, and doing it in parallel, and also like the idea of
- 15:21:11 [marbut]
- using haystack to create user interfaces
- 15:21:37 [marbut]
- there is another aspect of this: how these demos are compelling to a number of audiences
- 15:22:07 [marbut]
- mickBass: I suggest Mark & I take this feedback and rework the proposal
- 15:22:32 [marbut]
- wrt dates, David you are out the week of the 8th, so are there preferences for week of the 15th or week of the 1st?
- 15:23:08 [marbut]
- ericm: my preference is for the 1st, the 15th is slipping towards the beginning of the holiday season
- 15:24:05 [marbut]
- david: I don't have a preference, but I need to check my availability first. If you put some candidate dates on the table
- 15:24:33 [marbut]
- AndyS: also the 1st is near thanksgiving, we need to know soon because getting on flights can be hard
- 15:25:13 [marbut]
- mickBass: I propose the 3rd/4th of december, or the 16th/17th of december. Please RSVP availability for those dates
- 15:26:18 [marbut]
- mickBass: corpus data - Eric?
- 15:27:10 [marbut]
- ericm: Martin (Doerr) is looking over the license agreement, he's also putting together a bundle of metadata in CIDOC.
- 15:27:17 [marbut]
- mickBass: what are the next steps?
- 15:27:48 [marbut]
- ericm: they haven't taken a decision yet, so they make a decision, they accept our proposal or modify our proposal. If they accept we
- 15:27:56 [marbut]
- get the data, if not we have to come back to them
- 15:28:41 [marbut]
- Martin has been very good at turning things round quickly, he's interested in participating in a more active way, either in
- 15:29:29 [marbut]
- an intellectual way, or as a user
- 15:29:50 [marbut]
- mark: so it would be good to get Martin involved in SIMILE?
- 15:29:58 [marbut]
- ericm: yes.
- 15:30:39 [marbut]
- mickBass: eric, you're running the link here, can you make an introduction for Mark at the appropriate time
- 15:31:20 [kevins2]
- http://www.w3.org/mid/5EDF4B64-F347-11D7-B049-000A9582FD3A@w3.org
- 15:31:47 [marbut]
- ericm: another quick status update: I'm making headway with the Getty folks, but I don't have specific numbers
- 15:32:12 [marbut]
- I'm trying to understand who may be able to help us, and answer some questions. They are in a transitional phase of
- 15:32:31 [marbut]
- providing their data in different forms, but they are not in consultation with any consumers. So part of it
- 15:32:51 [marbut]
- is getting the data, then manipulating, then making services available based on their data, so I'm trying to find
- 15:33:00 [marbut]
- out what we can/can't do, and the costs
- 15:33:13 [marbut]
- I'm hoping to be able to give you an answer here by next thursday.
- 15:33:50 [marbut]
- mickBass: I don't have an update from MacKenzie, apart from her messages to the list, not sure where we are on IMS metadata from OCW
- 15:34:29 [marbut]
- I think the artstor people are working on getting the records, but not the thumbnails
- 15:34:56 [marbut]
- ericm: there are ways around this, but I don't want to derail the conversation
- 15:35:41 [marbut]
- we might be able to negotiate with individual content owners, to get at least a collection of thumbnails and perhaps images
- 15:36:09 [marbut]
- mickBass: please send the suggestions to MS and myself, then we'll schedule a call if necessary
- 15:36:35 [marbut]
- kevins2: I have a question about OCW - it looked like MIT have done a new release in the last two weeks. MS said that there
- 15:36:49 [marbut]
- is more metadata available internally, is that going to be available?
- 15:37:23 [marbut]
- mickBass: there is more metadata in the Microsoft content management system they are using for publishing, but they don't have a good export mechanism for that metadata
- 15:37:42 [marbut]
- we are trying to get hold of some examples
- 15:38:32 [marbut]
- mickBass: I wanted to update the group on progress on getting haystack connected to Joseki
- 15:38:53 [marbut]
- and hand off of the history system code from Jason Kinnear to the DSpace / SIMILE team
- 15:39:22 [marbut]
- Jason needs to update the code to use Jena 2 / the latest version of Joseki.
- 15:40:04 [marbut]
- Jason can do that work, and support migrating his installation from mySQL to Postgres which might be easier to deal with in the SIMILE environment
- 15:40:21 [marbut]
- we are still working logistics, it looks like it might take a couple of weeks to get it done.
- 15:40:47 [marbut]
- we are also trying to get an RDF/XML snapshot of several thousand triples of history data, so that the haystack team can start to explore
- 15:40:57 [marbut]
- how to create a UI for the history data
- 15:41:39 [marbut]
- AndyS: we need to separate the issues: MS raised the issue about getting a publicly available server up, and we need to schedule that work, and
- 15:41:51 [marbut]
- the kind of system that David would need for testing.
- 15:42:38 [marbut]
- David: I sat down with an incoming faculty member at MIT, we looked at Jena 2 / Joseki. It looks like Postgres / mySQL can be tweaked for our
- 15:43:06 [marbut]
- purposes. It looks like we may be able to use Jena / mySQL as the one RDF repository for Haystack.
- 15:43:26 [marbut]
- mickBass: we have about 15 minutes left.
- 15:43:32 [mickBass]
- marbut: vra data
- 15:43:55 [mickBass]
- ... design decisions required to make a style sheet and schema for artstor data
- 15:44:04 [mickBass]
- ... in xml, have nested elements
- 15:44:17 [mickBass]
- ... these model three different things
- 15:44:26 [mickBass]
- ... 1. embedded classes
- 15:44:36 [mickBass]
- ... 2. superproperty/subp relationships
- 15:44:38 [mickBass]
- ... 3. context
- 15:44:41 [mickBass]
- ...
- 15:44:54 [mickBass]
- ... so key decision: which elements in artstor are classes?
- 15:45:36 [mickBass]
- ... decision: image, mediafiles/mediafile, collection, relation, and creator
- 15:45:40 [mickBass]
- ... are classes
- 15:45:41 [mickBass]
- ...
- 15:45:50 [mickBass]
- ... on subproperties:
- 15:46:35 [ericm]
- ?
- 15:46:51 [mickBass]
- ... der suggestion - add a "qualifier" to your schema
- 15:47:32 [mickBass]
- andys: if title.variant is a subproperty of title, then IF title.variant is "blah" THEN title is also "blah"
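Andy's point is the standard subproperty entailment: any statement made with a subproperty also holds for its superproperty. A minimal Python sketch of that rule follows. The triple representation, the `SUB_PROPERTY_OF` table and the `image1` subject are illustrative assumptions, not part of the ARTstor or VRA data discussed here.

```python
# Triples are (subject, predicate, object) tuples.
# SUB_PROPERTY_OF is a hypothetical table: subproperty -> superproperty.
SUB_PROPERTY_OF = {"title.variant": "title"}


def entail_superproperties(triples, sub_property_of):
    """Return the triples plus every triple implied by subproperty links."""
    inferred = set(triples)
    changed = True
    while changed:  # repeat so chains of subproperties are followed
        changed = False
        for s, p, o in list(inferred):
            parent = sub_property_of.get(p)
            if parent is not None and (s, parent, o) not in inferred:
                inferred.add((s, parent, o))
                changed = True
    return inferred


# "image1" is a made-up subject used only for illustration.
data = {("image1", "title.variant", "blah")}
closure = entail_superproperties(data, SUB_PROPERTY_OF)
```

Under this rule, declaring a property as a subproperty commits every one of its values to also being a value of the parent property, so the declaration should only be made where that is actually intended.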
- 15:48:07 [ericm]
- q+
- 15:48:32 [ericm]
- +1 of Andy's point
- 15:48:42 [mickBass]
- eric can you capture andy's point
- 15:50:00 [marbut]
- AndyS: series is a first class object
- 15:50:00 [mickBass]
- andys: my position is that there is a conceptual first-level object which is the series, which itself has a title
- 15:50:24 [marbut]
- kevins2: I would have thought of it the other way round
- 15:50:42 [mickBass]
- kevins2: series is a virtual object, not a real object?
- 15:50:53 [mickBass]
- em: no, it's very much real, bought & sold, has ip rights etc.
- 15:50:53 [marbut]
- ericm: you can think of series having a title, as well as the article having a title
- 15:52:36 [mickBass]
- andys: if series is a first class concept, then if article has title.series, it is not true that title.series == title
- 15:52:51 [mickBass]
- andys: second point (sorry missed it... Andy?)
- 15:53:54 [mickBass]
- andys: some vra elements are subproperties, some links to other objects, some I could not discern either way
- 15:54:06 [mickBass]
- andys: vra really a syntactic way of writing down certain info
- 15:54:20 [mickBass]
- andys: need an application profile for additional semantics
- 15:54:57 [mickBass]
- marbut: may have several instances of vra schema where individuals have made different decisions about usage
- 15:58:21 [ericm]
- haystack rdf
- 15:58:21 [ericm]
- oops
- 15:59:44 [mickBass]
- em: agree andys that different communities will use VRA differently
- 16:00:00 [mickBass]
- em: probably we'll need a transformation for each store or collection of data
- 16:00:39 [mickBass]
- andys: hope to get in common a vra vocabulary?
- 16:00:57 [mickBass]
- andys: particular transformations will be messy, hacky
- 16:03:41 [mickBass]
- ericm: artstor data should be quite consistent
- 16:04:09 [mickBass]
- andys: key question: how consistent will the data be? We have sample size of 1 - hard to make observations/decisions about modelling without risk of them becoming unstuck
- 16:06:15 [mickBass]
- em: artstor is an intermediary, so data has been cleansed/crosschecked
- 16:06:43 [mickBass]
- em: but especially wrt names, we may need to do some parsing on names to tease out e.g. name, birthday, death date
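The name parsing mentioned here could look roughly like the Python sketch below. It assumes the comma-separated "given,family,birth-death" layout of the "Leonardo,da Vinci,1452-1519" example raised earlier in this log. Real ARTstor name strings may be laid out differently, and `ParsedName` and `parse_name` are hypothetical names.

```python
import re
from typing import NamedTuple, Optional


class ParsedName(NamedTuple):
    given: str
    family: str
    birth_year: Optional[int]
    death_year: Optional[int]


# Assumed layout: "given,family[,birth-death]" with four-digit years.
_PATTERN = re.compile(
    r"^\s*([^,]+?)\s*,\s*([^,]+?)\s*(?:,\s*(\d{4})\s*-\s*(\d{4})?\s*)?$"
)


def parse_name(value: str) -> Optional[ParsedName]:
    """Split a composite name string; return None if it does not fit."""
    match = _PATTERN.match(value)
    if match is None:
        return None
    given, family, birth, death = match.groups()
    return ParsedName(
        given,
        family,
        int(birth) if birth else None,
        int(death) if death else None,
    )
```

Strings that do not match the assumed layout return `None` rather than a guess, so unparsed values can be left untouched in the source record.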
- 16:07:02 [mickBass]
- andys: this would ease the task of merge w/ non-image sources e.g. OCW
- 16:07:17 [mickBass]
- marbut: back to the point of first class objects for various concepts
- 16:08:01 [mickBass]
- marbut: on artstor leave data untouched, but also have an entity "artStorName" with reference to a vcard with firstname lastname bdate deathdate etc.
- 16:09:20 [mickBass]
- kevins: don't necessarily want to bulk out, say, da vinci record with all the metadata from any of the sources
- 16:14:11 [mickBass]
- andys: here's what we can do now:
- 16:14:25 [mickBass]
- 1. work thru vra spec and create an abstract spec of what's happening
- 16:14:40 [mickBass]
- ... what's clear and what's confusing
- 16:14:59 [mickBass]
- ... crosscheck vs. observations from artstor data
- 16:17:14 [mickBass]
- marbut: given larger dataset we can run a translate to DC, this will show up some of the errors that Andy's pointed out
- 16:17:29 [mickBass]
- kevins: we can discuss how we want to represent these crosswalks technically
- 16:17:45 [mickBass]
- kevins: gets at core problem of how to represent records from foreign sources
- 16:18:26 [mickBass]
- mark: rdfs for IMS already exists
- 16:18:41 [mickBass]
- mark: but we may find it's not correct
- 16:18:52 [mickBass]
- imsproject.org/rdf (em)
- 16:20:03 [ericm]
- q+
- 16:26:21 [ericm]
- Mark - sample CIDOC records http://cidoc.ics.forth.gr/data_transformations.html
- 16:26:31 [ericm]
- from my previous email http://lists.w3.org/Archives/Public/www-rdf-dspace/2003Sep/0062.html
- 16:36:17 [ericm]
- ericm has left #simile