W3C

Data on the Web Best Practices Working Group Teleconference

06 Nov 2015

Agenda

See also: IRC log

Attendees

Present
phila, yaso, annette_g, MTCarrasco, hadleybeeman, newtonca_, ericstephan, antoine, laufer, Ig_Bittencourt, BartvanLeeuwen, gatemezi, Caroline, fradulov
Regrets
Dee
Chair
hadley
Scribe
phila

<trackbot> Date: 06 November 2015

<annette_g> Is someone having construction done?

<newton> newton

<scribe> scribe: phila

<scribe> scribeNick: phila

<hadleybeeman> http://www.w3.org/2013/meeting/dwbp/2015-10-30

PROPOSED: Accept last week's minutes

+0 (absent)

<hadleybeeman> +0

<ericstephan> +1

<yaso> 0

<newton> +0 (absent)

<Ig_Bittencourt> +0

<yaso> +0

<laufer> 0 (not there)

<antoine> 0 (absent)

<BernadetteLoscio> 0

<annette_g> +1

RESOLUTION: Accept last week's minutes

<annette_g> *now I see why we had trouble with phila's name

Update on the vocabs

<ericstephan> https://www.w3.org/2013/dwbp/wiki/New_thoughts_on_citations

Dataset Usage...

<ericstephan> https://www.w3.org/2013/dwbp/wiki/Newer_thoughts_on_citations

ericstephan: I talked last week about new thoughts on data citations. Not spent a lot of time on it this week, but...
... I've got something called even newer thoughts on citations - we're trying to move forward.
... We've found a good citation model from Force11 and OKFN
... They've been looking at how they tie in to JSON-LD
... They also form a community that we can call on for validation of the model
... Hoping for closure next week.

hadleybeeman: Do you need anything from the rest of the WG before next week?

<BernadetteLoscio> ;)

ericstephan: No, I just need to talk to Sumit and Berna
... We just need the editors to get together and move forward.

Data Quality Vocabulary

antoine: Usual scheduling problems.

<scribe> ... New sections have been added

UNKNOWN_SPEAKER: Looked at DCAT Dataset and Distribution
... dqv:computedOn and dqv:asQualityMeasure may not need to have strict dcat:Dataset/Distribution for domain and ranges. We would probably say that we expect them to be used with those classes, but they need not be
... I believe this was discussed last week and the general consensus was that an informal domain and range (D&R) is probably better

<hadleybeeman> s/dvq:asQualityMeausure/dqv:asQualityMeasure

phila: +1 to not defining domain and range unless necessary

antoine: If there is an objection, OK, we can discuss by e-mail. If not then we'll take it as resolved

<laufer> no objection

RESOLUTION: dqv:asQualityMeasure and dqv:computedOn will not have defined domains and ranges but will encourage use of relevant DCAT classes
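
To illustrate the practical effect of the resolution above, here is a minimal sketch of RDFS range inference over plain tuples. It assumes, purely for illustration, that a formal rdfs:range of dcat:Dataset might have been declared for dqv:computedOn; the resource names (ex:measurement1, ex:someResource) and the helper function are hypothetical and not taken from the DQV draft.

```python
RDF_TYPE = "rdf:type"


def infer_types_from_range(triples, ranges):
    """Apply the RDFS range rule: if predicate P has range C, then
    the object of every triple using P is inferred to be of type C."""
    inferred = set()
    for _subject, predicate, obj in triples:
        if predicate in ranges:
            inferred.add((obj, RDF_TYPE, ranges[predicate]))
    return inferred


# Hypothetical data: a quality measurement computed on some resource
# that may or may not be a dcat:Dataset.
data = [("ex:measurement1", "dqv:computedOn", "ex:someResource")]

# With a formal range (an assumption, not what the WG resolved),
# a reasoner would type the object as a dcat:Dataset.
strict = infer_types_from_range(data, {"dqv:computedOn": "dcat:Dataset"})

# With no declared range, as resolved, nothing is inferred.
informal = infer_types_from_range(data, {})
```

With the formal range, ex:someResource is silently typed as a dcat:Dataset whatever it really is; without it, the property can be used on other kinds of resource with no unintended inference.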

antoine: Next question is one of schedule
... We have had feedback that is significant and we'd like to address that before we go for a new draft. OK? This will put us back a little but it feels right

phila: What length of delay?

antoine: I don't know, maybe a couple of weeks, maybe 3 at most
... Other issues are more tech so we should continue that discussion on the mailing list

hadleybeeman: So please jump in to the mailing list discussions

Best Practices

hadleybeeman: It occurs to me that we should talk about the two things in the agenda, but are there other things we should cover today?

-> http://www.w3.org/2013/share-psi/bp/ Share-PSI BPs

<hadleybeeman> phila: this is for information. The Share-PSI project has a project review in 2 weeks time, in front of the European Commission

<hadleybeeman> ...One of the things we are being judged on is the best practices of that project.

<hadleybeeman> ...To remind you: we all said that if those BPs are technical, then we should take care of it (which is happening).

<hadleybeeman> ...But if it is policy-related, and out of scope for this group, then Share-PSI should handle it.

<hadleybeeman> ...As a result, there are a lot of them.

<hadleybeeman> ...I just wanted this group to be aware of this set of BPs.

<hadleybeeman> ...If you are writing a policy statement, the Share PSI BPs might be useful.

<hadleybeeman> ...There are 40 partners and 22-23 sets of guidelines being created across Europe.

<hadleybeeman> ...This will help us in DWBP to prove implementation experience.

<hadleybeeman> ...(to implement what we are recommending)

<antoine> this is really great!

<hadleybeeman> ...We agreed in São Paulo that our next working group meeting will be held with the Share PSI meeting, 14-16 March in Zagreb, Croatia

<hadleybeeman> ...There are a few of us in both groups.

<yaso> thanks for the report, Phil

Webby URIs -- links to values within datasets

<hadleybeeman> phila: This came from Jeremy Tandy, from CSV on the Web and Spatial Data on the Web. He pointed out that it was missing, so I quickly wrote it.

-> http://w3c.github.io/dwbp/bp.html#identifiersWithinDatasets

<hadleybeeman> It's now this BP, number 11.

<hadleybeeman> ...What was missing was persistent URIs within the data, or something that can be turned into a persistent URI

<hadleybeeman> ...I'm pleased that BernadetteLoscio reviewed it. We need others to look at it and agree/comment.

<hadleybeeman> ...It agrees with the "webby data" thing that Erik Wilde keeps talking about.

<hadleybeeman> ...I want to make sure we don't write things we aren't comfortable with. And that doesn't go against what he or a future group might write.

<hadleybeeman> ...Grateful for feedback.

<MTCarrasco> +

<hadleybeeman> annette_g: Subsetting data? Can we address that here too?

<MTCarrasco> ?

<ericstephan> +1 annette_g

<hadleybeeman> phila: This has come up a lot in the Spatial Web group. When they met in Sapporo, they were trying to establish the relationship between them and us.

<hadleybeeman> ...Geospatial datasets are enormous; so subsets are helpful.

<hadleybeeman> ...This raises the question: who writes it? Us or them?

<hadleybeeman> ...They are thinking: Protocol OpenSearch, which is a standard query string that you send to a search engine.

<hadleybeeman> ...It's what allows you in your browser to select different search engines, since the query that is sent to Google or Yandex or Yahoo is the same.

<hadleybeeman> ...q=......

<antoine> http://www.opensearch.org/

<hadleybeeman> ...That search string is independent of the technology that uses it.

<hadleybeeman> ...OGC added to that — but only in 2 dimensions

<hadleybeeman> ...Spatial data are looking at adding in time and space

<hadleybeeman> ...Where this fits between the two groups — open to debate.
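
As an illustration of the OpenSearch query string described above, here is a minimal sketch of filling an OpenSearch-style URL template. The {searchTerms} placeholder and the trailing "?" for optional parameters follow the OpenSearch URL template convention; the host search.example.org, the bbox parameter and the use of geo:box here are assumptions made for the example, not something agreed in the meeting.

```python
import re
from urllib.parse import quote

# Matches {name} (required) and {name?} (optional) template parameters.
_PARAM = re.compile(r"\{([^{}?]+)(\?)?\}")


def fill_template(template, values):
    """Substitute percent-encoded values into an OpenSearch-style template.

    Optional parameters with no value become empty strings;
    a missing required parameter raises KeyError.
    """
    def substitute(match):
        name, optional = match.group(1), match.group(2)
        if name in values:
            return quote(str(values[name]), safe="")
        if optional:
            return ""
        raise KeyError(f"missing required parameter: {name}")

    return _PARAM.sub(substitute, template)


# Hypothetical template for an imaginary data portal.
TEMPLATE = "http://search.example.org/?q={searchTerms}&bbox={geo:box?}"

plain_url = fill_template(TEMPLATE, {"searchTerms": "river levels"})
boxed_url = fill_template(
    TEMPLATE, {"searchTerms": "river levels", "geo:box": "-10,35,30,60"}
)
```

The same template works for a plain keyword search and for a search restricted to a bounding box, which is the sense in which the query is independent of the engine behind it.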

<hadleybeeman> ...@annette_g, is your use case under spatial/temporal?

<hadleybeeman> annette_g: No. Commonly done with a data cube.

<hadleybeeman> ...Certainly can be.

<ericstephan> +1 annette_g

<hadleybeeman> ...My feeling is we don't need to define a new way of referencing it, but publishers should provide it.

<hadleybeeman> phila: Are you happy to leave it at that? Or should we define a way to identify a subset of a dataset?

<hadleybeeman> annette_g: Since our charter doesn't indicate we define new technologies, that's beyond our scope.

<hadleybeeman> BernadetteLoscio: @annette_g, if we have a best practice for this, where does it fit? Data access?

<hadleybeeman> annette_g: Yes, that is a good place. But I'm also seeing if you want identifiers for subsets, it's also related to identifiers.

<hadleybeeman> BernadetteLoscio: So it's to have identifiers for subsets, and a way to access subsets?

I talked about this yesterday at SemWebPro (Paris) and used this slide to illustrate it http://www.w3.org/2015/Talks/1105_phila_semwebpro/#(14)

<hadleybeeman> annette_g: Primarily it's about access. A corollary of that is that they need identifiers. The sections should reference each other.

<hadleybeeman> BernadetteLoscio: It would be nice. We could give an example? It's hard to see how to adjust the document.

<hadleybeeman> annette_g: I wrote this in the OpenMSI use case. They allow selection of subsets of the data.

<hadleybeeman> ...The data is terabytes, so too big to work with by download.

<hadleybeeman> BernadetteLoscio: Okay, I'll look at it.

<hadleybeeman> ...I don't know how to include this yet.

ericstephan: +1 There are a number of communities that need to do this
... I think this has an opportunity for domains that don't do that, and are burdened by always using the full dataset
... Some domains will be very opinionated about how to do that

<MTCarrasco> There should be a way for URIs to directly identify at least the variants: language, format, version and subsets (granularity)

<MTCarrasco> http://dragoman.org/comuri.html#variant

<MTCarrasco> http://dragoman.org/webdata.html#granularity

MTCarrasco: The question of granularity is more general. We need to address the variance of things like language, format etc. For subsetting - it's not the same
... if you want a single term that's a few bytes, you don't want 2 TB of data
... And this should be done with URIs
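
To make the point about identifying variants and subsets with URIs concrete, here is a minimal sketch that derives a stable URI for a subset from a dataset URI plus selectors. The query-parameter pattern, the selector names and the data.example.org address are hypothetical; this is not the scheme in MTCarrasco's draft nor anything the group agreed.

```python
from urllib.parse import urlencode


def subset_uri(dataset_uri, **selectors):
    """Build a URI for a variant or subset of a dataset.

    Selectors are sorted so that the same selection always yields
    the same URI, whatever order the caller gives them in.
    """
    if not selectors:
        return dataset_uri
    query = urlencode(sorted(selectors.items()))
    return f"{dataset_uri}?{query}"


BASE = "http://data.example.org/dataset/bus-stops"

whole = subset_uri(BASE)
variant = subset_uri(BASE, version="2015-11", lang="en")
same_variant = subset_uri(BASE, lang="en", version="2015-11")
```

Because the selectors are sorted, both calls produce one identifier for the same selection, so a client that needs a few bytes can cite and fetch exactly that part rather than the whole dataset.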

<hadleybeeman> +1 to using URIs as identifiers

MTCarrasco: this is in my draft

phila: +1 on using URIs for this as well

MTCarrasco: Memento is not always webby.
... It relies on an off-the-Web protocol

<Zakim> phila, you wanted to say that subsetting is something for next iteration of the doc

<hadleybeeman> phila: on Memento: I'm not disagreeing, but I think that's a side issue.

<hadleybeeman> ...I think we all agree that we need URIs for identifiers, and subsets are useful.

<hadleybeeman> ...I'd suggest we make a note that we are talking about this, and Spatial Data are talking about this.

<ericstephan> At times it is cheaper to move the code to the data to do something such as extraction.

<BernadetteLoscio> we're gonna include a note about this

<hadleybeeman> ...It will take a while to sort this.

<BernadetteLoscio> yes

<hadleybeeman> ...I'd suggest we have an action for the editors to note in the document that we are discussing this, and talking to Spatial Data on the Web too.

<scribe> ACTION: BernadetteLoscio to add note to the BP doc that we are discussing the issue of subsetting data, and identifying those subsets. And that we're talking to the SDW WG about this issue too [recorded in http://www.w3.org/2015/11/06-dwbp-minutes.html#action01]

<trackbot> Created ACTION-216 - Add note to the bp doc that we are discussing the issue of subsetting data, and identifying those subsets. and that we're talking to the sdw wg about this issue too [on Bernadette Farias Loscio - due 2015-11-13].

issue: What to say about subsetting data, and identifying those subsets. This to be done in conjunction with SDW WG about this issue

<trackbot> Created ISSUE-208 - What to say about subsetting data, and identifying those subsets. this to be done in conjunction with sdw wg about this issue. Please complete additional details at <http://www.w3.org/2013/dwbp/track/issues/208/edit>.

BernadetteLoscio: There was another message from Erik about Webby data about typed links
... I think I'll raise an issue about this and discuss it by mail
... I will send a message about this
... We are reviewing the document and complementing it with examples
... We really need some help with some sections
... One of them is the data preservation section created by Christophe
... Who is not in the WG any more
... So I need some help to create some examples. This is a subject that isn't my thing

<gatemezi> Is the term "Webby data" owl:sameAs "Data on the Web"?

BernadetteLoscio: We need help to write some text
... And they're not all to the same standard

<Zakim> phila, you wanted to offer to contact CG

<hadleybeeman> phila: I will take an action to write to Christophe (who hasn't disappeared, he just isn't in the group anymore) and ask if he can help the group by writing some examples.

<hadleybeeman> BernadetteLoscio: Great. I can do this with him.

<scribe> ACTION: phila to write to Christophe and ask for help with writing the examples for data preservation [recorded in http://www.w3.org/2015/11/06-dwbp-minutes.html#action02]

<trackbot> Created ACTION-217 - Write to christophe and ask for help with writing the examples for data preservation [on Phil Archer - due 2015-11-13].

<gatemezi> I would suggest to clarify the term in our glossary?!

BernadetteLoscio: The other section I need help with is data access
... But I was discussing this with Newton before the meeting
... We're going to change some stuff in that section. Again we need help and feedback
... We don't have examples now. We need to review this section. I hope to have the section done by next meeting
... We also need help with data identifiers examples
... You said, phil, that you'd do that

<hadleybeeman> phil: I will.

<scribe> ACTION: phila To add the example for the data identifiers section [recorded in http://www.w3.org/2015/11/06-dwbp-minutes.html#action03]

<trackbot> Created ACTION-218 - Add the example for the data identifiers section [on Phil Archer - due 2015-11-13].

<hadleybeeman> BernadetteLoscio: I created a document — feel free to improve the example.

BernadetteLoscio: Everyone is free to improve things
... Please use the same example in each case
... So we show how to apply the BPs to the same example

<Ig_Bittencourt> I can help with examples to Data Vocabularies.

BernadetteLoscio: I also need help with the data vocabularies section, with examples
... In the way that...
... Also some of the BPs need rewriting, especially in the intended outcomes section.
... Don't know if others agree, but there is a uniformity/standard we're trying to reach for each BP
... 'It should be possible to...'
... I think it would be nice to have this type of description for each BP
... But I don't feel comfortable changing BPs I didn't write

antoine: I was editing this section. not sure I'll have time to do a lot of work on it, but I can't do examples in next week or so
... Is it the case that the intended outcomes in the vocab section are not good?

BernadetteLoscio: Not quality, the way it's written
... If the WG agrees, I think it's nice to have a common form of words (scribe interpretation)

phila: +1 on uniformity of style

BernadetteLoscio: It's about style, not substance

<annette_g> +1 for consistency

antoine: OK, I can try, but I don't see a consistent style, e.g. no. 11

BernadetteLoscio: That's why I need help.

<hadleybeeman> phila: @BernadetteLoscio, could you point us toward one BP where you think the style is right?

<hadleybeeman> ...I can then look at the ones I've written and will edit them to match the style you want.

<MTCarrasco> Common style +1

BernadetteLoscio: BPs for data formats and data versioning, quality - all follow the same style
... until no. 9 they're the same, then they become less consistent
... IDs, vocabs, need changing

<hadleybeeman> phila: Thanks, I will look at the ones I need to.

<hadleybeeman> ... Also, I will do a "native English" check.

<hadleybeeman> BernadetteLoscio: Is it best to start this now, or after this review is done?

phila: Offers a native speaker check

<annette_g> without!!

<ericstephan> I rely on my technical editor to tell me ;-)

reuse as the verb

<ericstephan> I can ask

also no hyphen for the noun

<annette_g> :)

<hadleybeeman> Hadley: from the dictionary: http://www.merriam-webster.com/dictionary/reuse

antoine: Rewriting the intended outcome...

<annette_g> *loll, lau-fer*

antoine: is the idea that every IO starts "it should be possible to..."?

BernadetteLoscio: Basically, yes. The idea was to use RFC terms, which we've got rid of

antoine: Then feel free to do it in the vocab section
... So if we're doing that then I wonder why we decided to drop the RFC keywords. Anyway, I don't have time to do it
... You can do it, I don't think I have time to

<ericstephan> Sorry I have to go....

(scribe notes tone is more, I could but I don't think I want to, but I'm not going to stop you, you're the editor)

BernadetteLoscio: I included a new section in the doc 'BP Benefits'

phila: Really likes that new section

<BernadetteLoscio> http://w3c.github.io/dwbp/bp.html#bp-benefits

phila: It actually matches what we're doing in the Share-PSI work

BernadetteLoscio: So it's an action on the group to review

newton: My point is about a possible new BP about content negotiation. Berna and I were talking about this. We don't have a BP about conneg. Should we have?

<yaso> +1 to newton

<hadleybeeman> phila: my initial reaction is: where do we stop?

<yaso> maybe citing conneg in the best practices that already exist

<annette_g> +1 to phil

<newton> -> we have an item about content negotiation http://www.w3.org/TR/cooluris/#implementation

<hadleybeeman> ...If we tell people they should use conneg, which I think they should — then where do we stop? Should we tell them to use HTML?
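
For readers unfamiliar with the content negotiation (conneg) under discussion, here is a minimal sketch of server-side selection of a representation from an HTTP Accept header. It is a simplification made for illustration: it picks the highest q-value and ignores the rule that a more specific media range takes precedence over a wildcard. The function names and the media types offered are assumptions, not part of any BP text.

```python
def parse_accept(header):
    """Split an Accept header into (media_range, q) pairs; q defaults to 1.0."""
    prefs = []
    for part in header.split(","):
        fields = [field.strip() for field in part.split(";")]
        media_range = fields[0].lower()
        if not media_range:
            continue
        q = 1.0
        for param in fields[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0  # treat an unparsable q-value as "not acceptable"
        prefs.append((media_range, q))
    return prefs


def matches(media_range, media_type):
    """True if media_type falls within media_range (exact, type/*, or */*)."""
    if media_range in ("*/*", media_type):
        return True
    return media_range.endswith("/*") and media_type.startswith(media_range[:-1])


def choose(header, available):
    """Return the available media type with the highest q, or None."""
    best, best_q = None, 0.0
    for media_range, q in parse_accept(header):
        for candidate in available:
            if q > best_q and matches(media_range, candidate):
                best, best_q = candidate, q
    return best


OFFERED = ["text/html", "text/turtle"]
picked = choose("text/turtle;q=0.9, application/ld+json, text/html;q=0.5", OFFERED)
```

Here the client prefers JSON-LD, which is not offered, so the server falls back to Turtle rather than HTML; one URI serves several formats without the client needing format-specific addresses.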

hadleybeeman: Please continue this on the list, but we're out of time.

<annette_g> I was just going to say what Phil said.

<BartvanLeeuwen> thx hadleybeeman

<laufer> bye bye all

<newton> I'll start a thread in the mailing list

<BartvanLeeuwen> bye

<Ig_Bittencourt> thanks. Bye

<annette_g> bye all!

<BernadetteLoscio> thanks

hadleybeeman: back on vocabs next week. Probably DQV next week, but as always editors, please think about what you most want

<MTCarrasco> bye

<gatemezi> Bye!

<yaso> bye all!

Summary of Action Items

[NEW] ACTION: BernadetteLoscio to add note to the BP doc that we are discussing the issue of subsetting data, and identifying those subsets. And that we're talking to the SDW WG about this issue too [recorded in http://www.w3.org/2015/11/06-dwbp-minutes.html#action01]
[NEW] ACTION: phila To add the example for the data identifiers section [recorded in http://www.w3.org/2015/11/06-dwbp-minutes.html#action03]
[NEW] ACTION: phila to write to Christophe and ask for help with writing the examples for data preservation [recorded in http://www.w3.org/2015/11/06-dwbp-minutes.html#action02]
 
[End of minutes]