SIOC/WorkingGroup/2008-04-10

From W3C Wiki

In this meeting we discussed which URIs to use for instances of various SIOC classes, and related best practices. One outcome is the guidelines for SIOC URIs.

SIOC URI Working Group on April 10th, 2008

Attendees

  • Uldis Bojars
  • Dan Brickley
  • Richard Cyganiak
  • Sergio Fernández
  • Tuukka Hastrup
  • Alexandre Passant (via skype chat)
  • Thomas Schandl

URIs for sioc:Post

With content negotiation

cygri: use the same URI as the webpage URI for the post. If you have this uri and you want rdf, you will get rdf.

It is useful to have an additional triple saying: here is a version which guarantees html - use sioc:link for that.

  • TODO: need to revise spec - revise description of sioc:link: take out "it is recommended to limit its use"
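To make the content-negotiation option concrete, here is a minimal sketch of how a server might pick a representation for a post URI from the HTTP Accept header. The function name, the two offered media types and the HTML fallback are assumptions made for illustration and were not discussed in the meeting; wildcards such as */* are deliberately not handled.

```python
def choose_representation(accept_header: str) -> str:
    """Return 'rdf' or 'html' for a post URI, based on an Accept header."""
    # Media types this hypothetical server can produce for a post.
    offered = {"application/rdf+xml": "rdf", "text/html": "html"}
    best_kind, best_q = "html", 0.0  # fall back to HTML if nothing matches
    for item in accept_header.split(","):
        parts = [p.strip() for p in item.split(";")]
        media_type = parts[0].lower()
        q = 1.0  # default quality value when no q parameter is given
        for param in parts[1:]:
            if param.startswith("q="):
                try:
                    q = float(param[2:])
                except ValueError:
                    q = 0.0
        # Keep the offered type with the highest q; the first one wins ties.
        if media_type in offered and q > best_q:
            best_kind, best_q = offered[media_type], q
    return best_kind
```

A client asking for application/rdf+xml gets the RDF description, and a browser sending text/html gets the web page, both at the same URI.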

Without content negotiation

cygri: Make a new uri for the RDF page:

former-link-for-the-uri/sioc/

Whether "sioc" is appended or prepended doesn't matter; just make a new URI. On slash vs. hash: the RDF is a document, so it has to be a different document from the HTML page, which means the new URI must use a slash. If you just append a hash and ask for that URI, you will still get the original HTML page.
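The slash vs. hash point can be checked with a small sketch: an HTTP client strips the fragment before sending the request, so a hash URI retrieves the same document as the HTML page, while a slash URI names a different document. The example.org URIs and the helper name below are made up for illustration.

```python
from urllib.parse import urldefrag


def document_to_fetch(uri: str) -> str:
    """Return the URI a client actually requests (the fragment is dropped)."""
    return urldefrag(uri).url


# Hypothetical URIs, not taken from any real site.
html_page = "http://example.org/blog/my-post"
hash_uri = html_page + "#sioc"
slash_uri = html_page + "/sioc/"

same_document = document_to_fetch(hash_uri) == html_page        # True
different_document = document_to_fetch(slash_uri) != html_page  # True
```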

uldis: but we are adding another level of indirection.

danbri: so you are suggesting that you content-negotiate the post and its RDF description at the same URI? Depending on the content type you ask for, you get one of those.

danbri quotes timbl: "usually the rdf version of something is too reduced and too fragmented - too lossy to count as the same thing" e.g. the foaf spec vs. the schema at the same URI - those are two separate things

cygri: true when there is clearly more information, but with a sioc:Post all the content is there in the RDF (danbri: "just missing the banner ads"), only less dressed up in terms of layout. I think for sioc:Post it's appropriate to say the HTML and the RDF are the same thing - just HTML and RDF versions of the same thing.

danbri: In a couple of years it will be RDFa.

uldis: It is safer without content negotiation.

cygri: There will be systems where you can't do it because the hooks in the API won't let you. But if you can do it cleanly, then yes, you should do it.

Sergio's approach

Sergio uses another approach to integrate his blog enriched with RDFa and an external service. Unfortunately we don't have a whiteboard picture of that, but basically he has something like this embedded in the markup:

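@prefix foaf: <http://xmlns.com/foaf/0.1/> .
@prefix sioc: <http://rdfs.org/sioc/ns#> .
@prefix dc:   <http://purl.org/dc/elements/1.1/> .
@prefix dct:  <http://purl.org/dc/terms/> .
@prefix xhv:  <http://www.w3.org/1999/xhtml/vocab#> .

<http://www.wikier.org/blog/www2008-a-pity> a foaf:Document ;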
  foaf:primaryTopic <http://www.wikier.org/blog/www2008-a-pity#post> ;
  xhv:alternate <http://www.wikier.org/blog/rss> .

<http://www.wikier.org/blog/www2008-a-pity#post> a sioc:Post ;
  dc:title "WWW2008, a pity"@en ;
  sioc:has_creator <http://www.wikier.org/blog#wikier> ;
  dct:created "2008-04-21"@en ;
  sioc:content "..."@en .

And then he has deployed autodiscovery and content negotiation to get the RDF/XML representation using an external service.

Paging

cygri: seeAlso is too generic - it breaks RDF browsers like the Tabulator. Tabulator would automatically retrieve all seeAlso(s) for a resource which in the case of John's blog leads to retrieving many, many pages.

uldis: Using another paging property would mean that work for a crawler becomes more difficult / changes. Do we need best practices for RDF crawlers?

cygri: big RDF crawlers follow all different kinds of properties, not just seeAlso, since linked data says that any kind of triple is (or can be) a link pointing to more RDF data.

Options for paging:

  1. use seeAlso
  2. use the link property from the HTML5 namespace that Dan pointed to, plus seeAlso

Archives

This is about modelling the existing artifacts that are already represented in HTML pages, so that archive pages have a proper RDF representation. It's just another way to navigate - redundancy.

Downside if we don't say how to do that: no permanent links for containers and you cannot use an RDF browser to navigate to see e.g. what someone wrote in April '03.

Feeds

We could use sioc:Feed to describe what category an RSS feed is about - you could attach it to anything, e.g. a user. There might be an interesting use case, e.g.: What is that feed about? It is about a particular category within this forum.

Discussion about URI for sioc:User

danbri: We could say that sioc:User is first and foremost a document, if we wanted to use real HTML pages' URIs for a sioc:User.

alex: I do not agree. I think we must use different URIs - a webpage describes a sioc:User but is not a user - use case: if I want to describe both the user and the page and they have the same URI, everything is broken

uldis / danbri: the problem is how many indirections we need. XFN is much simpler if you compare it with RDF spaghetti with 4-6 indirections

alex: e.g. for the flickr exporter, I created a special URI for the user, which indeed points to the webpage if using an HTML browser - using the same URI as the page would have led to things like "this page is the creator of this picture gallery"

cygri: a sioc:User is not a person, it's an account.

alex: neither a person nor a document, we agree.

uldis: do we want to do anything in SIOC where the distinction between an account and that account's page matters?

alex: yes uldis. or more precisely, we do not want to deal with things where not making the distinction breaks everything (imho). For me a User is not a document; it is a virtual representation of a Person on an online service, which can have an HTML page describing properties of that account

cygri: sioc:User is an account, not a user

alex: yes that's what I said, e.g. http://twitter.com/captsolo - is that your account, or the webpage describing the status of your account? For me, it's the second. If you say that's your account, it will lead to tw:captsolo a sioc:User ; a foaf:Document => I think we will bump into some inconsistency in OWL if we use inference

danbri: there is an intuitive inconsistency since it is 'obvious' that a User/OnlineAccount is an Agent:

_x foaf:mbox p .
_x a sioc:User .
implied by domain of mbox
_x a :Agent
_x NOT a Document
_x.toURI() http:code '200'
etc

danbri: it's a bit related to HTTP behaviour, so OWL reasoners probably won't notice it

alex: you're right, there's no disjointness in OWL between OnlineAccount and Document, unless we want to explicitly subclass OnlineAccount from Agent (which makes sense)

uldis: it is a compromise saying to use the same URI, but it also makes things simpler

uldis: and we then need a concrete use case showing why it is absolutely wrong to use the same URI. If we think in terms of test-driven development: we currently have a use case where we can use the document URI for a sioc:User, because it is the simplest thing. Is there a test case showing this is totally wrong?

alex: but imagine a simple example: I create an account on flickr, but still do not have a screen name, i.e. do not have a homepage on flickr. Then I create my page one month later. If I run the query ":myuser dc:created ?when" I'll get the timestamp of the page, not of the account

danbri: flickr has 3 things: ugly identifiers, path identifiers which you can change once and then a screen name.

Discussion summed up by uldis: Sometimes you may want to conflate things into one URI for simplicity.

alex: I think we must think in terms of semantics / reasoning / machine-readable view rather than simplicity (to a certain extent, I agree)

Discussion summed up by uldis: when conflating things you have options:

  1. to use different properties to express properties of each of the things conflated into one URI
  2. say that one of the things is primary and that you don't care that much about the other

E.g., in Alex's case we could just ignore the timestamp of the page: when exporting to SIOC, we give the account creation date as the dc:created of the URI. This is a conscious compromise where we choose to ignore the creation date of the page, which is not important for us in this context.

It was also mentioned that there is a page (the actual HTML generated / retrieved) and a thing denoted by the page. The timestamp of the last regeneration of the HTML (which can be useful for caching purposes) will in any case not be the creation date of the resource denoted by that HTML.

foaf:accountName has to be a plastic property so that it can be different on Monday and Tuesday.

cygri: one test which indicates that you need to keep those things apart is when there is clearly a piece of information in our domain that is different about the two things - like creation date. if the account and the page for that account were clearly created at a different time and that distinction matters, then keep them apart. but if not, why not gloss a bit over the details and pretend they are the same

Additional question for sites like flickr: main photo page of a user vs. users' profile page - which one to use for URI for sioc:User?

cygri: in the end, we see, there is no consensus; people will always have different opinions on this. So I don't think it makes sense to prescribe anything for SIOC in this regard, i.e. whether a sioc:User is the same as the account's profile page or not.

4 positions we could take:

  • web page (account page) and account (user) are the same.
  • web page and account are different things.
  • we don't know, we don't prescribe anything.
  • it is okay to do either one, you choose, whatever you choose is right.

danbri: You can also distinguish between what is possible to say and what is useful to say.

danbri: There is also a difference in saying "you can do this, but in 1 year we might ban it" or "in a year we might make it legal".

URI for sioc:Usergroup

The same question as for sioc:User:

Is the usergroup an information resource or not?

  • if yes: you could use the RDF document (assuming there is no HTML doc for a usergroup) without a hash
  • if not: append #group (or whatever) to the RDF document (appending a hash should always be done with the RDF version, not the HTML)

Case sensitivity

E-mail addresses

Prescribe lower casing email addresses before calculating their sha1-sums?

sergio, from rfc821:

For some hosts the user name is case sensitive, and SMTP implementations must take care to preserve the case of user names as they appear in mailbox arguments. Host names are not case sensitive.

-> host names are not case sensitive, user names can be

uldis: but that doesn't prevent us from still saying they should all be brought to lowercase before doing sha1. I mean, we decide what the rules for calculating the foaf sha1sum should be.

cygri: it means that you potentially smush different people.
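The trade-off can be illustrated with a short sketch. It assumes the usual FOAF convention that the sha1sum is computed over the full mailto: URI; the helper names and the example addresses are hypothetical, and lower_host_only assumes the address contains an "@".

```python
import hashlib


def mbox_sha1sum(email: str) -> str:
    """SHA1 hex digest of the mailto: URI for an email address."""
    return hashlib.sha1(("mailto:" + email).encode("utf-8")).hexdigest()


def lower_all(email: str) -> str:
    """Lowercase the whole address (the rule uldis proposes)."""
    return email.lower()


def lower_host_only(email: str) -> str:
    """Lowercase only the host part, which RFC 821 says is case insensitive."""
    local, at, host = email.rpartition("@")
    return local + at + host.lower()


# Two addresses that a case-sensitive host could treat as different mailboxes.
a, b = "John@Example.org", "john@example.org"

differ_as_written = mbox_sha1sum(a) != mbox_sha1sum(b)                    # True
smushed = mbox_sha1sum(lower_all(a)) == mbox_sha1sum(lower_all(b))        # True
kept_apart = mbox_sha1sum(lower_host_only(a)) != mbox_sha1sum(lower_host_only(b))  # True
```

Lowercasing everything makes the two addresses hash to the same value, which is the smushing risk cygri points out; lowercasing only the host keeps them distinct while still removing the case differences that are guaranteed to be insignificant.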

Web page URIs

There are algorithms for normalizing URIs. You can normalize the domain name to lower case, but not the rest - there, case makes a difference.

uldis: you could instruct a reasoner to ignore the case of URIs (maybe that's not the way to go, but you can always check and see what its lower case is). There is an RFC, but we probably won't need it - http://gbiv.com/protocols/uri/rfc/rfc3986.html#comparison
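As a rough sketch of the case normalization that is safe, the function below lowercases only the scheme and the host and leaves any userinfo, the path, the query and the fragment untouched. The function name and the example URI are made up for illustration, and other normalization steps from RFC 3986 (such as percent-encoding case) are not covered.

```python
from urllib.parse import urlsplit, urlunsplit


def normalize_case(uri: str) -> str:
    """Lowercase the scheme and host of a URI; keep everything else as-is."""
    parts = urlsplit(uri)
    # Userinfo (before "@") may be case sensitive, so only the host:port part
    # of the authority is lowercased.
    userinfo, at, hostport = parts.netloc.rpartition("@")
    netloc = userinfo + at + hostport.lower()
    return urlunsplit(
        (parts.scheme.lower(), netloc, parts.path, parts.query, parts.fragment)
    )


normalized = normalize_case("HTTP://Example.ORG/Blog/Post?Id=1#Top")
```

With this, two URIs that differ only in the case of the domain name compare as equal, while URIs that differ in the case of the path stay distinct.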

Actions to do, follow-up

URI decisions

alex: work on wiki page about pros / cons about different URIs for sioc:User and related document + announcement on mailing list so there is some public discussion and record

Best practices guide

uldis: alex had an idea for a best practices guide, i.e. an actual document (which might be an outcome of this and other meetings) for publishing social semantic web data

  • which uris you use is one question
  • the properties, vocabularies/ontos you use another
  • even the patterns for how you express different things, e.g. info about your online account. Probably that is best described by lots of examples: "this is how you would describe.." ..a person, ..a tag

Collection of test cases

uldis: should we make a collection of examples of expressing different kinds of pages in RDF, e.g. DanBri's page on Dopplr, and then see how we distinguish between properties that pertain to a foaf:Document and to a sioc:User?

uldis: a wiki page where we have done these modelling exercises, saying: that is what's on the web, say: danbri has a flickr account, and that is how we model it. Not even as a rule - it could be: these are 3 options for how we could model it, and then we can discuss and choose one of them. It is hard to think about without having concrete examples -> wiki, with the possibility to add comments there

Whiteboards

Whiteboard 1

Whiteboard 2