15:59:15 RRSAgent has joined #simile
15:59:29 no parameters needed, apparently :)
15:59:33 OK - who did that? :-)
15:59:55 I think you did. Apparently RRSAgent doesn't respect embedded quoted strings ...
15:59:59 I tried /invite - did you? I wondered if it was because I am not an admin in the room (@rob)
16:00:23 no I was just watching. No action taken
16:00:39 Rob - did you try?
16:00:40 that might be it (the ops thing)
16:00:43 yes it was me
16:00:57 So the error message is - err - wrong.
16:01:00 jse has joined #simile
16:01:18 kevins2 has joined #simile
16:01:52 ericm has joined #simile
16:02:57 eric, are you dialing?
16:03:17 Eric Miller
16:03:21 MacKenzie Smith
16:03:24 Kevin Smathers
16:03:26 Paul Shabajee
16:03:29 John Erickson
16:03:31 John Erickson
16:03:33 Rob Tansley
16:03:42 Andy Seaborne
16:03:52 Eric, MacKenzie, Paul, Rob, Mick, Andy, Kevin, John
16:04:14 Time discussion
16:04:28 Action to Dave to check in with each PI and work out schedule
16:05:50 Corpus discussion
16:06:56 ms, artstore most likely for large volume.
16:07:20 ms, in deathmarch to beta release. should be able to get data to us next week. 80k records with thumbnails
16:07:42 mick, what targets?
16:07:47 ms, currently up to them.
16:08:13 ms, subject focus seems to be chosen, so may have more input for them now.
16:08:52 mick, paul, any subject area to focus on?
16:09:23 paul, need to check what collections are available. Need readily available secondary community. Higher ed, school teachers, etc.
16:09:40 ms, all content is aimed toward art education generally
16:10:13 ms, only metadata though. possibly thumbnails will be metadata.
16:10:22 paul, no user access to images?
16:10:37 mick, artstore is a subscription service
16:10:49 ms, i thought the demo was for the reviewers.
16:11:06 ms, we'd like to move off of artstore data to MIT IMS data as it becomes available.
16:12:03 mick, sorry, this is a sidetrack. I was considering ...
if there is a corpus for simile, then is there an advantage to sharing metadata in that corpus if the rights can be made to align.
16:12:20 ms, just for metadata alone, probably no big problem.
16:12:40 paul, for the demonstrator, seeing images in addition to thumbs is probably important
16:12:56 ms, courseware output will be rated for public consumption.
16:13:10 paul, there is a concern of timescale and risk for choosing an appropriate corpus
16:13:27 ms, the things that are available are mostly small collections.
16:13:48 ms, i'm working on large collection available.
16:14:18 em, another option is not just one dataset, but many datasets.
16:14:27 em, working on access to BBC creative archive
16:15:06 em, bbc is opening up access to their repository of content behind a lot of their services. Lessig talked to them about creative commons and they bought in.
16:15:31 em, many images, etc, + simple metadata publicly available. richness unknown. large body though.
16:15:38 em, hp team know more?
16:15:55 paul, nothing here
16:16:23 em, SWAD europe contacted BBC for test corpus?
16:16:34 paul, sounds appropriate, any timeline?
16:16:43 em, probably not in line with SWAD-E
16:17:09 em, didn't arkive have contact with BBC?
16:17:18 paul, BBC was probably the largest donor.
16:17:43 em, following these leads might provide results both for SWAD-E and Simile.
16:17:50 paul, takes action.
16:18:28 em, even if only say 5 full images from artstore?
16:18:49 ms, probably, but need to contact artstore again after beta is complete.
16:18:55 (ie next week.)
16:19:54 mick, explore quality of links from artstore to other image sources, other repositories, etc. Maybe contact SWAD-E for collaboration.
16:20:50 ms, this is not what i was discussing with the CTO. That type of interaction could possibly be negotiated at a higher level, but right now the data is available just to get us going on some data.
16:21:05 em, i think it is worth pursuing.
16:21:43 em, follow-up as the demo proceeds and it becomes clear what a few detailed images could do.
16:21:55 ms, will discuss with Tony.
16:21:59 em, please copy me.
16:22:11 ms, will discuss f2f.
16:22:30 em, message from Martin Duerr
16:23:22 ms, haven't followed up because of our interest in VRA and IMS. If we want to add CIDOC then I can contact him.
16:23:47 em, if more than two is too much, then maybe not, but we could get the data first and see whether it is useful later.
16:23:59 em, could be useful for many things, maybe even SWAD-E.
16:24:10 em, Action to follow up with Martin.
16:24:38 ms, I have someone who can follow up for getting the actual data from Martin.
16:24:57 em, on Amico, no progress, I think they are on vacation.
16:25:45 em, currently have one record. hope to get 1000 records in original XML.
16:26:43 em, not currently in IMS or VRA though.
16:26:50 em, not meant to conform to any standard.
16:27:25 mick, there seem to be two threads; 1 is modelling; 2 is increasing the size of the corpus.
16:27:36 mick, is there any need to serialize?
16:27:49 em, i think they can be done in parallel.
16:28:11 mick, i think mark was hoping for help from em for the script changes.
16:29:14 mick, mark isn't in the call, let's sort it out early next week.
16:29:34 mick, i think mark thought you were leading the charge in getting IMS data.
16:29:47 em, i think he was waiting for me to get more of the RAW xml.
16:30:12 em, but the script modifications can be progressed separately.
16:30:27 Demo Script discussion
16:31:14 mick, tried to draw out types of data and value to the user. Particular and specific example with manufactured image examples.
16:31:19 mick, any feedback?
16:31:46 ms, prefer to read at leisure and send comments by e-mail.
16:31:54 mick, ok to the list.
16:32:04 ms, action to read it this weekend.
16:32:19 em, scanned over it quickly, seems a good starting point to me.
16:32:41 em, will comment more as time to read thoroughly.
16:33:20 em, Mark's aha moment on the data is very positive. Almost taken for granted by ms and me, but very good to have identified it in the team.
16:33:50 mick, i'm concluding that it will be very difficult to make this valuable without making controlled vocabs in scope in some way.
16:34:44 ms, diane is working on crosswalking controlled vocabularies. informed her that oclc has big market for this. gave her pointers to some of the controlled vocabs.
16:35:03 very interested in services to do this kind of mapping once for everybody.
16:35:13 another paper i heard on schema crosswalking.
16:35:32 rdf is interesting to them. schema used to represent crosswalking between schemas.
16:35:48 em, RDF schema or OWL.
16:36:11 ms, not technology, but content. how do these specific two schemas get mapped?
16:36:28 ms, need to bring Jean and Diane to talk to the team if possible.
16:36:39 ms, either way we are delighted to collaborate with them.
16:37:56 em, link authority files link lots of different representations together. including controlled vocabularies at some level. merge or not to merge depends on the application. the same ideas apply to many different topics.
16:38:36 mick, inferencing about schema synonyms probably not sufficient to get over hump of compelling demonstrator. need also to consider relationship of contents.
16:39:03 em, implication is inferencing of equivalence of classes.
16:40:06 mick, select specific domain for schemas, and contents for same. is there a scope for concept space or controlled vocabulary. if you continue with Frank Lloyd Wright... Getty thesaurus is very large --
16:41:25 ms, domain vocabularies are often very large.
16:42:08 mick, when reviewing the story board, review with eye to concept mapping.
16:42:22 ms, will depend on data sources
16:42:33 ms, will subset appropriately based on the data.
16:43:05 ms, if creating IMS records from VRA, then we get to decide, or may even have to map by hand.
16:43:34 ms, if not indexed against the same vocabulary then need to eg. map Getty to CIT.
16:43:53 mick, like to get this reflected in the storyboard.
16:44:04 ms, often controlled vocabs don't map.
16:44:27 ms, sometimes not right for discipline. need to solve approximate mappings.
16:44:40 paul, looked at this with arkive.
16:45:00 paul, if 10k records then we can't hand crosswalk the data.
16:45:36 em, As I told Diane, there are people who do the hand crosswalk.
16:46:00 em, Part of the thing is if there is a human involved and they do the work, then they do it for the team and others benefit from it.
16:46:32 em, the other is statistical matching and manipulation. Diane's work incorporates them all and keeps track of what mapping was applied.
16:47:08 paul, and if the mapping is impossible or tortured?
16:47:38 ms, if no good mapping then either don't map, or dumb down the mapping to the next most general identifier that is in vocabulary.
16:48:37 mick, kevin and andy, does the demo script indicate likely areas of mapping?
16:49:01 mick, what inferencing for schemas, what for controlled lists of concepts.
16:49:57 ks, some conv on list
16:50:03 ks, prob not complete conv yet
16:50:10 ks, my take is to make the controlled vocab an index
16:50:43 ks, works well going from a concept to a list of corresponding records
16:50:52 ks, doesn't work particularly well the other way
16:51:13 ks, could use bidirectional pointers, but then have overhead of info that needs to be kept in sync
16:51:24 ks, you end up with duplicate pieces of data
16:51:38 as, one approach is to give significant concepts URI's
16:52:11 ks, that works fine unless you have a literal
16:54:59 andy, demos in short timescale are driven by availability of data presumably.
16:55:15 andy, need to see exactly the types of queries, the mappings, etc?
16:55:38 andy, as soon as the corpus is available we really need to expand the demo in that area.
16:56:00 andy, demo needs to illustrate research agenda.
16:56:14 andy, not *be* research agenda.
16:56:28 paul, thesaurus project in SWAD-E may be useful.
16:56:43 em, announcement just posted on ESW list.
16:56:52 http://www.w3c.rl.ac.uk/SWAD/thesaurus/tif/deliv81/final.html
16:58:07 mick, let's break here. vocabs to be discussed on list.
16:58:19 wrap
16:58:47 kevin can you post notes to list?
16:59:14 eric - can you set the security and post the URL for the minutes, please?
17:02:32 rrsagent, pointer?
17:02:32 See http://www.w3.org/2003/09/19-simile-irc#T17-02-32
17:03:32 http://www.w3.org/2003/09/19-simile-irc should be set to world-readable in next couple sec