IRC log of simile on 2003-10-09

Timestamps are in UTC.

15:01:54 [RRSAgent]
RRSAgent has joined #simile
15:02:21 [ericm]
RRSAgent, pointer?
15:02:21 [RRSAgent]
See http://www.w3.org/2003/10/09-simile-irc#T15-02-21
15:02:51 [marbut]
marbut has joined #simile
15:03:39 [kevins2]
kevins2 has joined #simile
15:06:30 [marbut]
mickBass: we need to take a decision about whether to hold the plenary in November or December
15:06:51 [marbut]
mickBass: I can't attend in November, but I'm not certain I will be able to attend in Dec
15:06:57 [ericm]
q+ to remind himself to discuss Leonardo,da Vinci,1452-1519 example as an algorithmic means that several communities use to create a unique id for people
15:07:24 [marbut]
mickBass: couple of reasons to move it - we need to get the data, this will help us refine the demo script
15:07:38 [marbut]
then have some working suggestions on what is required to drive this to completion
15:07:54 [marbut]
with the november date, we might have the script in the required state, but it's a bit risky
15:08:08 [ericm]
q+ to make the point that xml2003 is dec 7-12 http://www.xmlconference.org/xmlusa/ and as such at risk for dec f2f meeting 7-10
15:08:13 [marbut]
the second objective is to do some planning for the project post demonstrator
15:08:33 [AndyS]
re: eric's queue: we should be processing the values to extract composite information
15:08:38 [marbut]
so we need to review key learnings, contributions from the team members going forward
15:08:50 [ericm]
+1 AndyS
15:09:26 [jse]
jse has joined #simile
15:10:16 [mickBass]
marbut: key point, think it's important to include the hires in the plenary, important to use the plenary as a way to bring them into the team
15:10:31 [mickBass]
marbut: may be logistics issues wrt start dates, but we should try to work around these
15:10:58 [mickBass]
marbut: suggestion of position papers to capture current thinking, interested in feedback from the team and PIs
15:11:16 [mickBass]
karger: in Vancouver dec 9-10
15:11:48 [marbut]
karger: I am at NIPS from Dec 8 to the 13th
15:12:09 [marbut]
mickBass: let's fork discussion into 2 threads - we need to find dates that are workable
15:12:19 [marbut]
and we need to have a structure that is reasonable
15:13:06 [marbut]
so we would like feedback on 1) using position papers to stimulate discussion and 2) the inclusion of the new hires in this process
15:13:31 [marbut]
kevins2: I think day one issues are more important than day 2 issues, we feel off schedule, so we need to concentrate on getting
15:13:38 [marbut]
the demonstrator complete
15:14:14 [marbut]
invite rssagent
15:14:38 [marbut]
AndyS: I have a concern that we are trying to cram too much into day 1
15:15:00 [marbut]
mickBass: so I am hearing we need to allocate more time to the topics currently on day 1
15:15:47 [marbut]
AndyS: there are different people present on day 1 and day 2, this means we can't reschedule
15:16:24 [marbut]
mark: do we need the whole team involved in the demo discussion?
15:16:49 [marbut]
kevins2: maybe we could make more progress with a smaller team. I think the people we need are David, MacKenzie and Mick
15:16:58 [marbut]
we want a script of the user interface for the demo
15:17:10 [marbut]
the rest of us would slow things down
15:18:06 [marbut]
kevins2: the mapping, modelling, inference rules stuff is something Andy, Kevin, Mark could work on
15:18:22 [marbut]
but without knowing what the output is (the script) that could be hard
15:18:33 [marbut]
AndyS: we need to know what the script is
15:19:27 [marbut]
mickBass: so by having the plenary in december, we could enter with some solid confidence in the demo script
15:19:55 [marbut]
kevins2: there is a problem if Mick is not there, would Mark be filling in?
15:21:03 [marbut]
ericm: I like the ideas of pushing this to december, having the new staff in place, and doing it in parallel, and also like the idea of
15:21:11 [marbut]
using haystack to create user interfaces
15:21:37 [marbut]
there is another aspect of this: how these demos are compelling to a number of audiences
15:22:07 [marbut]
mickBass: I suggest Mark & I take this feedback and rework the proposal
15:22:32 [marbut]
wrt dates, David you are out the week of the 8th, so are there preferences for the week of the 15th or the week of the 1st?
15:23:08 [marbut]
ericm: my preference is for the 1st, the 15th is slipping towards the beginning of the holiday season
15:24:05 [marbut]
david: I don't have a preference, but I need to check my availability first. If you put some candidate dates on the table
15:24:33 [marbut]
AndyS: also the 1st is near thanksgiving, we need to know soon because getting on flights can be hard
15:25:13 [marbut]
mickBass: I propose the 3rd/4th of december, or the 16th/17th of december. Please RSVP availability for those dates
15:26:18 [marbut]
mickBass: corpus data - Eric?
15:27:10 [marbut]
ericm: Martin (Doerr) is looking over the license agreement, he's also putting together a bundle of metadata in CIDOC.
15:27:17 [marbut]
mickBass: what are the next steps?
15:27:48 [marbut]
ericm: they haven't taken a decision yet, so either they accept our proposal or they modify it. If they accept we
15:27:56 [marbut]
get the data, if not we have to come back to them
15:28:41 [marbut]
Martin has been very good at turning things round quickly, he's interested in participating in a more active way, either in
15:29:29 [marbut]
an intellectual way, or as a user
15:29:50 [marbut]
mark: so it would be good to get Martin involved in SIMILE?
15:29:58 [marbut]
ericm: yes.
15:30:39 [marbut]
mickBass: eric, you're running the link here, can you make an introduction for Mark at the appropriate time
15:31:20 [kevins2]
http://www.w3.org/mid/5EDF4B64-F347-11D7-B049-000A9582FD3A@w3.org
15:31:47 [marbut]
ericm: another quick status update: I'm making headway with the Getty folks, but I don't have specific numbers
15:32:12 [marbut]
I'm trying to understand who may be able to help us, and answer some questions. They are in a transitional phase of
15:32:31 [marbut]
providing their data in different forms, but they are not in consultation with any consumers. So part of it
15:32:51 [marbut]
is getting the data, then manipulating, then making services available based on their data, so I'm trying to find
15:33:00 [marbut]
out what we can/can't do, and the costs
15:33:13 [marbut]
I'm hoping to be able to give you an answer here by next thursday.
15:33:50 [marbut]
mickBass: I don't have an update from MacKenzie, apart from her messages to the list, not sure where we are on IMS metadata from OCW
15:34:29 [marbut]
I think the artstor people are working on getting the records, but not the thumbnails
15:34:56 [marbut]
ericm: there are ways around this, but I don't want to derail the conversation
15:35:41 [marbut]
we might be able to negotiate with individual content owners, to get at least a collection of thumbnails and perhaps images
15:36:09 [marbut]
mickBass: please send the suggestions to MS and myself, then we'll schedule a call if necessary
15:36:35 [marbut]
kevins2: I have a question of OCW - it looked like MIT have done a new release in the last two weeks. MS said that there
15:36:49 [marbut]
is more metadata available internally, is that going to be available?
15:37:23 [marbut]
mickBass: there is more metadata in the microsoft content management system they are using for publishing, but they don't have a good export mechanism for that metadata
15:37:42 [marbut]
we are trying to get hold of some examples
15:38:32 [marbut]
mickBass: I wanted to update the group on progress on getting haystack connected to Joseki
15:38:53 [marbut]
and hand off of the history system code from Jason Kinnear to the DSpace / SIMILE team
15:39:22 [marbut]
Jason needs to update the code to use Jena 2 / the latest version of Joseki.
15:40:04 [marbut]
Jason can do that work, and support migrating his installation from mySQL to Postgres which might be easier to deal with in the SIMILE environment
15:40:21 [marbut]
we are still working logistics, it looks like it might take a couple of weeks to get it done.
15:40:47 [marbut]
we are also trying to get an RDF/XML snapshot of several thousand triples of history data, so that the haystack team can start to explore
15:40:57 [marbut]
how to create a UI for the history data
15:41:39 [marbut]
AndyS: we need to separate the issues: MS raised the issue about getting a publicly available server up, and we need to schedule that work, and
15:41:51 [marbut]
the kind of system that David would need for testing.
15:42:38 [marbut]
David: I sat down with an incoming faculty member at MIT, we looked at Jena 2 / Joseki. It looks like Postgres / mySQL can be tweaked for our
15:43:06 [marbut]
purposes. It looks like we may be able to use Jena / mySQL as the one RDF repository for Haystack.
15:43:26 [marbut]
mickBass: we have about 15 minutes left.
15:43:32 [mickBass]
marbut: vra data
15:43:55 [mickBass]
... design decisions required to make a style sheet and schema for artstor data
15:44:04 [mickBass]
... in xml, have nested elements
15:44:17 [mickBass]
... these model three different things
15:44:26 [mickBass]
... 1. embedded classes
15:44:36 [mickBass]
... 2. superproperty/subp relationships
15:44:38 [mickBass]
... 3. context
15:44:41 [mickBass]
...
15:44:54 [mickBass]
... so key decision: which elements in artstor are classes?
15:45:36 [mickBass]
... decision: image, mediafiles/mediafile, collection, relation, and creator
15:45:40 [mickBass]
... are classes
15:45:41 [mickBass]
...
15:45:50 [mickBass]
... on subproperties:
15:46:35 [ericm]
?
15:46:51 [mickBass]
... der suggestion - add a "qualifier" to your schema
15:47:32 [mickBass]
andys: if title.variant is a subproperty of title, then IF title.variant is "blah" THEN title is also "blah"
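The rule Andy states here is the standard subproperty entailment: a statement made with a subproperty also holds for its superproperty. A minimal sketch in plain Python follows, assuming triples are held as (subject, property, value) tuples; the subject name image1 and the one-entry property map are hypothetical and not taken from the artstor data.

```python
# Hypothetical property map, modelled on the title / title.variant example.
# Each key is declared a subproperty of its value.
SUBPROPERTY_OF = {"title.variant": "title"}


def entail(triples, subproperty_of):
    """Close a set of (subject, property, value) triples under the
    subproperty rule: if p is a subproperty of q and (s, p, o) holds,
    then (s, q, o) holds too."""
    inferred = set(triples)
    changed = True
    while changed:  # repeat so that chains of subproperties are followed
        changed = False
        for s, p, o in list(inferred):
            parent = subproperty_of.get(p)
            if parent is not None and (s, parent, o) not in inferred:
                inferred.add((s, parent, o))
                changed = True
    return inferred


# "image1" is an illustrative subject, not a real record identifier.
record = {("image1", "title.variant", "blah")}
closed = entail(record, SUBPROPERTY_OF)
```

After the call, closed holds both the original title.variant statement and the inferred title statement with the same value.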
15:48:07 [ericm]
q+
15:48:32 [ericm]
+1 of Andy's point
15:48:42 [mickBass]
eric can you capture andy's point
15:50:00 [marbut]
AndyS: series is a first class object
15:50:00 [mickBass]
andys: my position is that there is a conceptual first-level object which is the series, which itself has a title
15:50:24 [marbut]
kevins2: I would have thought of it the other way round
15:50:42 [mickBass]
kevins2: series is a virtual object, not a real object?
15:50:53 [mickBass]
em: no, it's very much real, bought & sold, has ip rights etc.
15:50:53 [marbut]
ericm: you can think of series having a title, as well as the article having a title
15:52:36 [mickBass]
andys: if series is a first class concept, then if article has title.series, it is not true that title.series == title
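One way to picture this point: if the series is a resource in its own right, the series title belongs to the series, and the article keeps only its own title. The sketch below is plain Python over (subject, property, value) tuples; article1, the "#series" naming scheme and the partOfSeries property are all hypothetical names introduced for illustration.

```python
# All identifiers below (article1, partOfSeries, "#series") are hypothetical.
flat = {
    ("article1", "title", "An Article"),
    ("article1", "title.series", "A Series"),
}


def lift_series(triples):
    """Rewrite each title.series statement into a separate series
    resource that carries its own title."""
    out = set()
    for s, p, o in triples:
        if p == "title.series":
            series = s + "#series"  # assumed naming scheme for the new resource
            out.add((s, "partOfSeries", series))
            out.add((series, "title", o))
        else:
            out.add((s, p, o))
    return out


modelled = lift_series(flat)

# Titles attached directly to the article after the rewrite.
article_titles = {o for s, p, o in modelled if s == "article1" and p == "title"}
```

With this modelling the article has exactly one title, and the series title is reachable through the partOfSeries link rather than being mistaken for a title of the article.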
15:52:51 [mickBass]
andys: second point (sorry missed it... Andy?)
15:53:54 [mickBass]
andys: some vra elements are subproperties, some links to other objects, some I could not discern either way
15:54:06 [mickBass]
andys: vra really a syntactic way of writing down certain info
15:54:20 [mickBass]
andys: need an application profile for additional semantics
15:54:57 [mickBass]
marbut: may have several instances of vra schema where individuals have made different decisions about usage
15:58:21 [ericm]
haystack rdf
15:58:21 [ericm]
oops
15:59:44 [mickBass]
em: agree andys that different communities will use VRA differently
16:00:00 [mickBass]
em: probably we'll need a transformation for each store or collection of data
16:00:39 [mickBass]
andys: hope to get in common a vra vocabulary?
16:00:57 [mickBass]
andys: particular transformations will be messy, hacky
16:03:41 [mickBass]
ericm: artstor data should be quite consistent
16:04:09 [mickBass]
andys: key question: how consistent will the data be? We have sample size of 1 - hard to make observations/decisions about modelling without risk of them becoming unstuck
16:06:15 [mickBass]
em: artstor is an intermediary, so data has been cleansed/crosschecked
16:06:43 [mickBass]
em: but especially wrt names, we may need to do some parsing on names to tease out e.g. name, birthday, death date
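A rough sketch of the kind of name parsing meant here, using the "Leonardo,da Vinci,1452-1519" string quoted earlier in the log. The three-field comma layout, the reading of the first two fields as given and family name, and the output keys are assumptions; real artstor name values may well differ.

```python
import re

# Assumed layout: "<given>,<family>,<birth year>-<death year>".
NAME_PATTERN = re.compile(
    r"^(?P<given>[^,]+),(?P<family>[^,]+),(?P<birth>\d{4})-(?P<death>\d{4})$"
)


def parse_name(value):
    """Split a composite name string into its parts.
    Returns None when the value does not follow the assumed layout."""
    match = NAME_PATTERN.match(value.strip())
    if match is None:
        return None
    return {
        "given": match["given"].strip(),
        "family": match["family"].strip(),
        "birth": int(match["birth"]),
        "death": int(match["death"]),
    }


parsed = parse_name("Leonardo,da Vinci,1452-1519")
```

Values that do not fit the pattern return None, so they can be set aside for inspection instead of being guessed at.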
16:07:02 [mickBass]
andys: this would ease the task of merge w/ non-image sources e.g. OCW
16:07:17 [mickBass]
marbut: back to the point of first class objects for various concepts
16:08:01 [mickBass]
marbut: on artstor leave data untouched, but also have an entity "artStorName" with reference to a vcard with firstname lastname bdate deathdate etc.
16:09:20 [mickBass]
kevins: don't necessarily want to bulk out, say, da vinci record with all the metadata from any of the sources
16:14:11 [mickBass]
andys: here's what we can do now:
16:14:25 [mickBass]
1. work thru vra spec and create an abstract spec of what's happening
16:14:40 [mickBass]
... what's clear and what's confusing
16:14:59 [mickBass]
... crosscheck vs. observations from artstor data
16:17:14 [mickBass]
marbut: given larger dataset we can run a translate to DC, this will show up some of the errors that Andy's pointed out
16:17:29 [mickBass]
kevins: we can discuss how we want to represent these crosswalks technically
16:17:45 [mickBass]
kevins: gets at core problem of how to represent records from foreign sources
16:18:26 [mickBass]
mark: rdfs for IMS already exists
16:18:41 [mickBass]
mark: but we may find it's not correct
16:18:52 [mickBass]
imsproject.org/rdf (em)
16:20:03 [ericm]
q+
16:26:21 [ericm]
Mark - sample CIDOC records http://cidoc.ics.forth.gr/data_transformations.html
16:26:31 [ericm]
from my previous email http://lists.w3.org/Archives/Public/www-rdf-dspace/2003Sep/0062.html
16:36:17 [ericm]
ericm has left #simile