05 Jan 2010

See also: IRC log





I think there is a class of tasks that need metadata to operate properly

<dbooth> jar, perhaps the killer app for metadata is content-type.

from discovery to trust

not sure if I understand dbooth - example?

or do you mean the HTTP header?


<dbooth> yes, exactly

sorry, you don't know the context

I criticised it recently via Twitter

RSS vs RDF etc.

<dbooth> link?

see http://twitter.com/mhausenblas/status/7244707353

<jar> ok, so metadata in regard to "core web functions" is an angle i hadn't thought about

<jar> i was thinking more along the lines of itunes

and Geoffrey was so nice explaining it http://twitter.com/gsnedders ;)

jar, please define core Web functions :)

<jar> but data re "core web functions" is not the same as metadata.

<jar> nor is data generally (e.g. linked data)

<jar> my term for what you were saying... let me find it

hm, this sounds like an artificial distinction to me "but data re "core web functions" is not the same as metadata."

<jar> metadata is very clearly data about data. canonical example is DC

<jar> metadata is a subset of all data


<jar> molecular weight is not metadata, nor is a user's public key

but there is also data about other things such as services

what is a WSDL file?

<jar> i don't consider that metadata.

<dbooth> wsdl is not metadata. it's data about a service

<jar> that's why i suggested "data related to core web functions"

<jar> some of which is metadata, some of which isn't

wow, then you have a very very very focused definition of metadata. Mine is broader ;)

<jar> google define:metadata

<jar> mine is the majority view

ok, jar, I believe you ;)

<dbooth> but content type seems to me to be exactly metadata: it allows a string of bytes to be re-interpreted for a particular use.

still, what are the core Web functions?

may I ask what a DOAP description then is?

(that is http://trac.usefulinc.com/doap)

it's data about a project

aka metadata

or am I totally wrong?

<jar> i'm fudging in order to characterize what i perceived was your interest. core web functions would include HTTP, authentication, web services, content-type, site-meta, link:, POWDER

<jar> there's tons of metadata; data about a project qualifies, i think, although there is a slippery slope from the data to the social process that created / will create it

<jar> i.e. a journal is a data source (social institution), but its corpus to date is data

ok, thanks for the clarification

<jar> in a proper ontological treatment the two would be distinct entities... but no need to get into that, we could take data sources to be honorary data, if pressed

so, back to your draft 282

how about starting with this sort of definition for metadata

and then discuss the core Web functions

and then list examples for each of these domains?

<jar> i think we need to start with a menu of potential efforts, listing 3-4 of them, and then pick one effort, and dive in

(whether or not I agree with your definition doesn't matter for now)


but why only dive into one?

<jar> (1) semweb, (2) data re core web functions, (3) classic metadata a la XMP / DC, (4) ...

drop (1)

<jar> because the topic is too big. ocean-boiling.


still, drop 1 ;)

sorry to say this, but this is something 89% of the audience is not interested in

<jar> fine, but many people will drag it back in that direction (molecular weight), so it needs to be *explicitly* listed and then dropped

hehe, I see your point

<jar> "this"?

this = Semantic Web at large

when we talk about concrete technologies, say RDFa or URIs, fine

<jar> oh. right. i think we agree.

within certain use cases such as GoodRelations in RDFa yielding a new sort of SEO then people are interested


<jar> SEO = ?

Search Engine Optimisation

<jar> don't know GoodRelations

wanna show up in Google in first place? use GR and RDFa ;)

<jar> ok. so this is why i want it to be content-oriented and application-oriented, not technology-oriented


<jar> sorry


I agree

<jar> so rdf and rdfa are just generic subroutines you invoke when needed.

<jar> as is XMP

yeah, sort of - for certain tasks usable but not always and everywhere

or if you contrast Atom with RDF, etc,

ok, so we agree to have it content/app oriented

what is the list of the efforts, now?

1. data re core web functions such as HTTP, auth, trust, etc.

<DanC> I have very little interest in the "what is metadata?" question. I'm more interested in models of HTTP that help with anarchic scalability; i.e. models that help independently-developed apps work together.

sorry, DanC, this is a bit confusing. I was sort of hijacking this telco into 282 action of jar

<jar> i'm thinking about what you (mh) said, covering the union of "core web data" (the web as application?) and metadata sensu stricto (itunes exemplar)... i dislike documents that don't have unique focus... but maybe could live with one that is admittedly bifocal; or with two; or with finding some common thread

aha, yes, I see

<jar> the question is not "what is metadata", it's "what problem do we want to work on"

<jar> the latter masquerades as the former

and +1 to DanC's distributed app on Web-scale approach

jar was sort of briefing me re metadata, so forget about the question what is metadata, please :)

<jar> of course. but no open metadata apps are emerging that i can see. so why bother.

hm. wouldn't OpenCalais Freebase and the like fall into this category?

is 1. ok with you jar?

<jar> freebase open & anarchic??

open sort of

<DanC> no one project is anarchic; it's the whole that's anarchic

anarchic (maybe internally ;)

yup, agree, DanC

<DanC> as to why bother: some proposals get the "aboutness" bit wrong. that means a web site owner can't use that technology along with others.

no single entity can be, but the collective operations can. but one can create rules that allow or disallow certain behaviour

<jar> well, mosaic both exploited and encouraged anarchy, that's what i meant. you don't need an architecture if there are no integration points

DanC, not sure what you're talking about. Concrete example, please?

<DanC> integration points for freebase are clients that use it.

<DanC> i.e. clientXYZ wants to use freebase _and_ OpenCalais

ok, so?

<DanC> so if freebase and OpenCalais have conflicting models, clientXYZ has a hard life.

<jar> you think there is or soon will be demand? or that the TAG can be effective at promoting things like this somehow?

both have a linked data interface

or what model are you talking about? the schema? sorry, /me a bit dense as it seems

<DanC> right... but if one uses http://www.w3.org/ to refer to "the web consortium" and the other uses it to refer to "the home page of the web consortium", then life is hard for clientXYZ


anyway, shall we come back to the 4 domains for the 282, jar?

<jar> this is just RDF semantics. does it need a champion? (not a rhetorical question)

my guess would be: no. the community will sort it out

<DanC> perhaps that was a bad example... but I think the aboutness stuff is a good example. Maybe not life-changing, but useful in that it keeps coming up on www-tag

I *love* aboutness

<jar> 4 domains. i guess the sensible thing is to keep the 2 we've talked about, what i call "core web" and "metadata sensu stricto", but proceed in parallel with them. maybe split to 2 docs later

my approach is simple: you'll always need a human in the loop (see http://sig.ma) to disambiguate

<jar> yes, aboutness is very important, it's the same problem as using URIs to refer, and is addressed by RDF model theory

agree jar

the question is: which ocean? :D

<jar> yes, i'm trying not to boil the ocean. am desperate to gain focus

<jar> larry masinter favors the "metadata sensu stricto" ocean

ok, on to a better term for "metadata sensu stricto"

<jar> the TAG's usual audience would probably be more interested in "core web data"

<jar> they are just different i think, but with common subroutines

but "metadata sensu stricto" might be a bit over the top, can I have a more casual title for it, plz

<jar> i would just call it "metadata" except that this confuses everyone outside the library community

<DanC> umm... "metadata per se"? or "data about digital artifacts"?

<jar> how about "data about data"

how about bibliographic metadata

hm, too narrow maybe

<jar> itunes and flickr are bibliographic?

on the other hand digital artifacts reminds me of MPEG21

<jar> data about documents (where 'document' is term of art including images, audio, video) ?

<DanC> sure

ok, I guess I can live with that

lemme quickly get the brainstorm generator hat ... what else ... digital media item, digital artefact, digital asset

blech. let's stick with 'document' ;)

<jar> well we can figure this out later. that will do for now. so michael, i think we have a way forward, yes? how about this: 1 document with 3 parts (1) content and applications around data about documents-broadly-construed; (2) content and applications around data about "core web functions" (see above); (3) subroutines common to both

sounds like a plan!

<jar> with the focus on content and applications, e.g. Dan's example above


will you draft that in http://www.w3.org/2001/tag/2009/12/metameta.html and then I start to fill in, or ...?

<DanC> hmm... all 3? I thought you were trying to focus, jar.

<jar> hmm. i think i can do this in a day or two

<DanC> oh... it's mostly 2 areas.

<jar> i want to focus but am indecisive and i believe i'm being asked to do both... also mh is volunteering :)

<jar> anyway, 3 is common, so is properly part of 1 and 2, so really there are only 2 oceans


ok, I think I'm gonna call it a day (re IRC) and head out to my next meeting

<jar> ok me too.

thanks for the enlightening discussion and lemme know when I can start to input, jar, please

<jar> ok will do

Summary of Action Items

[End of minutes]

Minutes formatted by David Booth's scribe.perl version 1.135 (CVS log)
$Date: 2010/01/05 15:12:21 $
