W3C

DWBP WG F2F meeting, day 1

31 Mar 2014

Agenda

See also: IRC log

Attendees

Present
+44.207.202.aabb, ericstephan, Hadley, Yaso, PhilA, deirdrelee, BrianMatthews, Ig_Bittencourt, Antoine, MakxDekkers, +33.4.93.00.aacc, gatemezi, HadleyBeeman
Regrets
Chair
HadleyBeeman
Scribe
PhilA, Ig_Bittencourt, yaso, Caroline_


<MakxDekkers> passcode 3927# is not valid, don't get voice connection

<MakxDekkers> was there also going to be a skype session?

<PhilA> Yes, Makx

<PhilA> Deirdre is working on getting a laptop set up for that

<MakxDekkers> OK hasn't the meeting started yet?

<HadleyBeeman> makxdekkers we're getting set up

<MakxDekkers> OK, gives me time to grab a coffee

<HadleyBeeman> go for it

<HadleyBeeman> :)

<gatemezi> ah ok.. it gives me time to do other stuff ;)

<HadleyBeeman> We're sorting out the A/V and the phone line … bear with us

<ericstephan> Okay standing by

<ericstephan> zakim +1.509.554.aaaa is ericstephan

<MakxDekkers> Connected voice

<deirdrelee> For those wanting to join webcam, we can connect via my skype: deirdrelee

Welcome

<PhilA> Steve: Welcomes everyone, thanks those who have flown/travelled to be here

<ericstephan> I wish I could be there with you all

<PhilA> ... attended conference on 50 years of alleviating poverty, programme begun by LBJ

<PhilA> scribe: PhilA

Steve: Percentage of population in poverty reduced, absolute numbers increased
... tells a story about how people know each other, but many people don't work together. Seems a lost opportunity
... Describes outcome of meeting, plan was to put collected papers into a new book
... problem is that people aren't working together
... so our job is to provide standards that allow people to work together. That's what we're trying to achieve - better collaboration through open data
... we have a chance to make a big difference
... people are looking at us

The Name Game

Guest: Rick Robinson

Rick: Works on smart cities for IBM

JohnGoodwin: Introduces himself

markharrison: Here today to represent GS1 (although works for Cambridge University)
... mentions LOD project at GS1

Laufer: Introduces self

<MakxDekkers> Deirdre, my skype name is makxdekkers

<gatemezi> I can now see them all.. Thanks Deirdre ;)

Vagner_Br: Introduces himself

bernadette: Introduces herself, from Recife in NE Brazil

Ig_Bittencourt: Introduces self

Newton: Introduces self (from Nic.br)

Flavio: Introduces self from NIC.br, Sao Paulo

Guest: Phil Tetlow

PhilT: Introduces self and recognises fellow IBMers

Adriano: Introduces self
... interested in big data, data mining etc

Antoine: Introduces self from Europeana

CarlosIglesias: Introduces self
... experience with OD in Spain, working with Web Foundation, CTIC etc.

Guest: Brian Matthews

<laufer> +laufer

BrianMatthews: Introduces self from STFC (Science and Technology Facilities Council) - astronomy, physics etc.
... refers to SKOS, see http://www.w3.org/TR/skos-primer/

deirdrelee: Introduces self

yaso: Introduces self

PhilA: Introduces self

HadleyBeeman: Introduces self

gatemezi: Introduces self (from Eurecom)

ericstephan: Introduces self

<deirdrelee> https://plus.google.com/hangouts/_/7ecpi31t3gsbcrq6onet5o0280

<deirdrelee> I'll post it on wiki too

PhilA: Just to note that ericstephan and BrianMatthews have similar interests here

MakxDekkers: Introduces self
... Consultant in Spain, works with PhilA on an EC project

<ericstephan> BrianMatthews Kerstin Kleese Van Dam (my manager) sends her greetings to you.

<MakxDekkers> maybe, I don't hear you guys so well either

<deirdrelee> Google Hangout details on wiki: https://www.w3.org/2013/dwbp/wiki/London_2014#Google_Hangout

<MakxDekkers> Disconnected phone line. Hearing you on hangout

<gatemezi> @all , I will send regret in 30 minutes for one hour because I have an appointment with my Doctor.. Sorry for that :(

<HadleyBeeman> Oh dear — sorry makxdekkers. We'll see if we can fix that

<HadleyBeeman> Bye for now, gatemezi!

Target for meeting

HadleyBeeman: We hope to get to the point by end of tomorrow that we can publish the First Public Working Draft of the Use cases document
... an issue that has come up a lot, is how we work together. So I want to talk about how we can organise ourselves
... we need to do things that make sense to us
... we have to be working in the open
... everyone can see what we're doing (e-mails are publicly archived)
... but we can use whatever we like
... to build the deliverables
... quick reminder of the deliverables
... minimum things are 3 docs
... the Best Practices Recommendation (some bullet points)

-> http://www.w3.org/2013/05/odbp-charter charter

HadleyBeeman: We can work through that bullet list. We can drop them, add new ones, they're just a guide
... in addition, the WG agreed to create UCR
... we may well decide that we need to write more, split the BP into multiple documents. We have a lot of leeway
... but this is what we begin with
... Vagner was concerned about keeping track of our work on the UCR
... the wiki makes it easy to make lists, centralise things etc.

-> https://www.w3.org/2013/dwbp/wiki/Use_cases_timetable Use Cases timetable

HadleyBeeman: probably want to change the wiki homepage
... a big part of the chairs' job is to make the wiki useful and useable

laufer: I think the main page can be a little confusing if it has too many items
... the main things of the groups are there

The Use Case document

deirdrelee: Following on from what Hadley said... we've been working on the UCR doc
... aim is that it is a lead into the deliverables (BP, QDV, DUV)
... goal is FPWD of UCR
... We suggest that we start from the challenges
... what were the problems/issues - so this is where we could potentially help
... yaso made the point about highlighting positive aspects
... the challenges are in the Google doc https://docs.google.com/spreadsheet/ccc?key=0AhTZf3B9yQ3odGVvU3pBazFsY3pyUVppNDFSZGtyQkE&usp=sharing&richtext=true#gid=2
... is the challenge relevant?
... so data on the Web should ... to make it reusable for example
... data should be in format X to be reusable
... or should include metadata Y to be reusable
... and then we need to consider whether it's in scope
... is it in scope of W3C in general
... (i.e. about the technical infrastructure of the World Wide Web)
... and is it within our own expertise
... if so, then it becomes a requirement for one of the deliverables (potentially more than one)
... refers to

-> http://www.w3.org/TR/csvw-ucr/ CSVW UCR

Steve: I just want to ask about data on the Web, Open data, and data
... do we mean that these things are the same or that these things are different?

laufer: I think for each of us it's different
... I am one of the people discussing this and the metamodels that we have
... things like CKAN and Socrata have their own metamodels

Steve: So what did you mean when you wrote the charter?

<HadleyBeeman> scribenick: hadleybeeman

phila: It began as the "Open Data Best Practices" working group. But discussing it, it became clear that was too narrow.

… for example, one of the papers presented at that workshop was from Fujitsu, who use open data to augment a private system. Healthcare needs. The technology they use is the same, whether open or not.

… So, from a tech point of view, any distinction is irrelevant.

… A lot of our use cases will be open data, but we must not exclude non-open data.

<PhilA> yaso: I'd like to suggest a radical way through this

<scribe> scribenick: phila

yaso: I think today we can talk about data on the Web, not open data

yaso: There's too much focus on open data. We spent too much time talking about publishing and not enough about reuse

<HadleyBeeman> steve: W3C does web standards. A city may do a lot of things before publishing data that may be out of scope for the W3C. Does "Data on the Web" show the distinction between what a group does before publication, vs out on the web?

<HadleyBeeman> Laufer: When you want to put this data on the Web, you have to make a transformation. This is the issue here. Our recommendation.

<HadleyBeeman> Phila: That's not ON the web — that's using the web as a glorified file transfer system

<ericstephan> Do we need a "Data on the Web" definition document to bound this? CSV on the Web made something similar to define and bound tabular data http://w3c.github.io/csvw/syntax/

RickRobinson: I'll start from the city councils that I work with
... they have a lot of data that they want to open, but don't have the tech skills to do some of the things we talk about
... I think it would be helpful to have a common language for tech and non-tech people

PhilA: Nods to 5 stars of LOD

RickRobinson: There's a section in the UCR on the revenue models that implies that open data is freely available
... that's contentious
... is this WG getting into this area?

PhilA: Nods to recent Web Payments Workshop

ericstephan: Do we need a data on the web definition doc
... in the CSVW WG we found ourselves in the predicament of defining tabular data - surprisingly
... in science we're always pushing at the edge of the definition
... so maybe we need a separate definition doc to define that

antoine: On open data, I agree that we should try to ignore it for now. Maybe consider it later
... and then see what others have done
... as for data on the Web... it's up to us to make it better, not say that your council's PDF is bad
... You started with data vs data on the Web - but in our rec we should make sure that some of the work applies earlier in the process
... if we want them to be implemented properly, X needs to be done before it makes it to the Web

Steve: I think there's a way to do this. The charter preserves scope but gives us a limit to where we can go, i.e. not reinventing data dictionaries for mainframes
... but we can say what we expect to find in data and metadata on the Web
... we expect lineage, names etc.

PhilA: We can recommend methods for doing that

yaso: I want to highlight the definition of open data. I don't want to discuss those now

<yaso> http://opendefinition.org/

yaso: it's another reason for me to forget about OD for now
... we don't want to discuss licences?

Steve: Some of our use cases are open data oriented?

yaso: I want some use cases on closed data

Steve: You want the WG to cover non-open data? Copyrighted data?

<ericstephan> Agreed Steve to not covering copyrighted or closed data

yaso: There are many licences that offer less than openness

RickRobinson: Is your point that there are issues that are technical?

<BrianMatthews> +q

RickRobinson: and these are separate from open data as a legal or financial issue?

yaso: Yes
... I can collect data from my car and put it on the Web, say with a CC licence
... maybe CC-NC

markharrison: Going back to what Steve was saying, I'm not sure we need to differentiate between open/closed, dumps.
... what about liability issues?
... there are new EU food labelling obligations saying that it must be available online and the same as found on the packaging
... the retailer may want to reuse/reference that data in their own site, and there can be apps that reuse it
... so we need to think about licences, yes, but also liability, up to date

CarlosIglesias: On open/other - I tend to agree that from the tech perspective that's not important
... we also have the linked/non-LD approaches
... most of us work with LOD
... but it's not only about technology
... most of us are familiar with the underlying principles
... we often talk about data not following 5 Star paradigm

->http://5stardata.info/ In case anyone here is unfamiliar with the 5 stars of open data

CarlosIglesias: There are some basic principles that we all agree on? Licences etc?
... we are already not focussing on non-tech issues
... We are already talking about all these points, open government principles, bring value to open data

<CarlosIglesias> http://www.w3.org/TR/gov-data/

<CarlosIglesias> https://public.resource.org/8_principles.html

<ericstephan> To me if data is not following 5 star, it is not in scope of this group that means not even meeting the 1 star criteria. How could it be data on the web then?

Steve: Tells story about Long Beach defibrillator.
... idea was to publish data on where these things are around the city
... what happens if the data is wrong because it was used and not put back
... need to have indemnity
... we can't stipulate that, but we can't offer advice on legal areas
... So I think we can be aware of these issues

<MakxDekkers> lost my connection. Now conference bridge does not allow dial-in: this conference is restricted at this time

Ig_Bittencourt: I think we should ignore the open/non-open distinction
... it's not important for us as such

<HadleyBeeman> oh dear, makxdekkers. We'll try to sort it in 2 or 3 minutes

<Zakim> HadleyBeeman, you wanted to ask about the technical differences between open and closed (closed licenced) data

<MakxDekkers> I am back on hangout with sound

HadleyBeeman: I keep trying to work out what is tech and can be written and what is out of our scope
... We can provide the mechanism through which people can describe how accurate/reliable their data is
... what they say is out of our ken

BrianMatthews: I also wanted to reiterate that open data might be a red herring
... from the science perspective, the data might be free but it's not necessarily open
... sometimes specific people/groups are able to read the data
... doesn't matter about that here - it's delivered by Web protocols
... as for whether we should have stars etc. We should look for best practices that help take people forward

<Ig_Bittencourt> +1 BrianMatthews proposal

BernadetteLoscio: I think we're talking about data which can be unstructured, structured. I don't think we're concerned about non-structured data
... Structured data may be relational data, Excel etc. Non-structured can be anything such as text
... When we start to think about this we come back to some of the principles of open data

<Vagner_Br> +1 to support the idea of having principles

BernadetteLoscio: Maybe we should think about principles for data on the web which may not be the same as principles of open data

<scribe> scribe: Ig_Bittencourt

<Ig_Bittencourt_> PhilA: Does everybody know what 5 star data are?

-> http://5stardata.info/ 5 star LOD

<Ig_Bittencourt_> ... the difference we might have to change

<Ig_Bittencourt_> ... first of all about open and closed

<Ig_Bittencourt_> ... there is another 5 star scheme which is useful

Tim Davies 5 stars of data engagement http://www.timdavies.org.uk/2012/01/21/5-stars-of-open-data-engagement/

<Ig_Bittencourt_> ... not just about use the data but feedback

<Ig_Bittencourt_> ... and these are other 5 stars that could be useful

<ericstephan> Available on coffee mug! :-) http://www.cafepress.com/mf/45953815/five-star-linked-data_mugs?productId=480759174

<HadleyBeeman> spot on, ericstephan!

<markharrison> http://www.opendataimpacts.net/engagement/

<Zakim> CarlosIglesias, you wanted to talk more about structured vs. unstructured data

<Ig_Bittencourt_> CarlosIglesias: I would like to get back to the format

<yaso> Antoine: This is the original from TimBL? http://www.w3.org/DesignIssues/LinkedData.html

<Ig_Bittencourt_> ... a PDF already could be data on the web

<Ig_Bittencourt_> ... whether a PDF is structured or non-structured data depends on how you publish it

<Ig_Bittencourt_> ... it is not about how the data is, but how to use it

-> http://www.w3.org/DesignIssues/LinkedData.html The original LOD definitions from TimBL

<Ig_Bittencourt_> ... my point here is not to discuss philosophical points

<Ig_Bittencourt_> ... but we need some definition and background definition

<Ig_Bittencourt_> ... to build the best practices

<Ig_Bittencourt_> ... everyone has a different understanding about data

<Ig_Bittencourt_> Vagner_Br: I want to go back to what deirdrelee presented

<Ig_Bittencourt_> ... i would like to understand more about the methodology

<Ig_Bittencourt_> ... about challenges

<Ig_Bittencourt_> ... when we are talking about challenges

<Ig_Bittencourt_> ... we want to reach some certain points

<Ig_Bittencourt_> ... I would like to support the idea of defining basic principles

<Ig_Bittencourt_> ... if we define any challenge without a basic reference or common definitions could be bad

-> https://docs.google.com/spreadsheet/ccc?key=0AhTZf3B9yQ3odGVvU3pBazFsY3pyUVppNDFSZGtyQkE&usp=sharing&richtext=true#gid=2 Challenges

<Ig_Bittencourt_> ... for instance, interoperability, data granularity

<Ig_Bittencourt_> ... another example is like privacy

<Ig_Bittencourt_> ... what are the basics about privacy

<Ig_Bittencourt_> ... in order to define the challenges

<Ig_Bittencourt_> ... even licenses

<Ig_Bittencourt_> laufer: i think we have some issues here

<Ig_Bittencourt_> ... that are the formats of the data

<Ig_Bittencourt_> .. we can have some information on a PDF

<Ig_Bittencourt_> ... or in a CSV

<Ig_Bittencourt_> ... but we are talking about the content and if it is relevant or not

<Ig_Bittencourt_> ... it is valuable if the data is not structured

<Ig_Bittencourt_> ... but if it is relevant.

<Ig_Bittencourt_> ... we can have the CSV

<Ig_Bittencourt_> ... if we have a recommendation about PDF we can achieve

<Ig_Bittencourt_> ... so I think that is not related to the format

<ericstephan> just a few thoughts...I think we need to discuss data on the web, not data near the web, but when it is on the web, what are the best practices?

<Ig_Bittencourt_> ... if a human can extract the information, it does not matter

=== 10 Minute Break ===

<Ig_Bittencourt_> HadleyBeeman: we have a 10 minute break

<Ig_Bittencourt_> HadleyBeeman: welcome back

<ericstephan> I am drinking espresso

<ericstephan> 2.5 hours sleep and a full day ahead after this :-)

<Ig_Bittencourt_> HadleyBeeman: useful discussion about the scope

<Ig_Bittencourt_> ... we could spend next days talking about what is useful

<Ig_Bittencourt_> ... about what are the use cases

<Ig_Bittencourt_> ... I would like to ask the editors about the use cases and what is useful.

<Ig_Bittencourt_> deirdrelee: points related to the scope

<Ig_Bittencourt_> ... but also about the methodology

<scribe> scribe: PhilA

<scribe> scribe: yaso

Bernadette: looking into the use cases, we tried to collect the main problems

<Ig_Bittencourt> scribe: Ig_Bittencourt

…what we need to have clear: what do we want from the use cases document?

<Ig_Bittencourt> BernadetteLoscio: we have to agree that the goal of the uc doc is about the potential BP

… help to identify potential best practices? What kind of information can we extract from the UC elements?

<Ig_Bittencourt> ... and then we can look and ask if the information we have is enough

… do we need something else?

…I think these are important questions for us

<Ig_Bittencourt> deirdrelee: what do we want the group to achieve...

<Ig_Bittencourt> ... BP in terms of challenges

<Ig_Bittencourt> ... it would be based on different levels

deirdrelee: what do we want the group to provide? These challenges are the core, but they can be based on use cases of different levels

…the best practices can be focused on the maturity of the publisher

<Ig_Bittencourt> ... the BP could be about different levels

<scribe> scribe: ig_Bittencourt

markharrison: we have to think not just about people publishing data on the web
... but also on both sides

PhilT: for me it is more about preserving data on the web
... characteristics of the data
... my suggestion to the group is about information management
... and information dissemination
... and how to best take care of the data

<ericstephan> +1 PhilT

PhilT: if we look at the practice of generating data
... it is about the creation and structuring of the data
... how do we make the reference correctly

<ericstephan> I was +1 PhilT comment

<ericstephan> :-)

PhilT: how do we know when it is not relevant anymore
... the reason could work on collecting data...

antoine: there is an agenda about life cycle

<BrianMatthews> +1 PhilT

antoine: we don't actually know about BP or requirements
... I would hope that this ends up in requirements

<PhilA> +1 to antoine around the data lifecycle, which is also a +1 to PhilT

<yaso> +1 to this

antoine: I would keep it in challenges for now.

HadleyBeeman: how much do we do

<MakxDekkers> +1 to antoine

deirdrelee: I think it would be good to keep it open

Vagner_Br: I also agree that we need to add more UC
... if we want to publish BP
... we need more UC
... such as from Asia or Africa

<ericstephan> good point+

deirdrelee: we need the foundations of BP
... perhaps we could start about 5 stars
... the more stars you have the more you have BP

HadleyBeeman: it is about the maturity of the data.

BernadetteLoscio: It is also about when a beginner wants to publish data on the web
... he would like to publish data on the web based on the BP
... so he could be interested on data integration
... for example, a small startup publishing data on the web might not be interested in advanced points.

deirdrelee: maybe the BP could try to encourage an easy way to put data on the web

BernadetteLoscio: It depends on the scenario
... if you want to publish a dataset

<ericstephan> For beginners are there "core" best practices that could be recommended? Interesting Deirdrelee

BernadetteLoscio: and i think it is more about the problems when you have lot of projects
... you know how to solve some simple problems
... and you don't know how to solve big problems.

<Zakim> CarlosIglesias, you wanted to comment on data publishing, consuming and the full data cycle of life

CarlosIglesias: I really like the way the discussion is moving
... from technical issues
... about the preparation of the data
... I think that was my ambition about the BP group
... and it is connected with my initial point
... about guidance
... I think it is really important we agree about the scope
... for example we don't have any UC about people demanding data and trying to reuse it, rather than from the data publishing perspective
... the first thing is about data publishing
... I think we should be working on this
... licensing issues
... we are a technical group
... and there are other groups that can point about this

laufer: I think we have to make recommendations about the distribution of data
... and we have to make recommendations about the way they link data
... if the tools can make it easy to do
... so I think we can forbidden about publishing the data
... and we need to make recommendation about the nature of the data too
... another one is about the skeleton of the UC
... we could have a running example
... we have the skeleton and we don't have an example about the UC

<PhilA> scribe: yaso

<Ig_Bittencourt> Thanks.

<yaso> Ig_Bittencourt: :-)

PhilT: Second one is Value

… You can get value proposition on the use cases

… the 3rd one is: we have to construct credibility for this BP

PhilA: there’s a lot of agreement in the group, this is positive.

<Zakim> PhilA, you wanted to suggest a DWBP Primer?

…Bernadette’s lifecycle is really important

…the cycle around it is really important

<ericstephan> +1 DWBP Primer

…we’re trying to get some use cases on the developer’s point of view

<HadleyBeeman> This is what PhilA is referring to: http://www.w3.org/TR/mobile-bp/

PhilA: There’s a section on the Best practices document that says: this is how to get value..

…I think i’d rather see this in the Best Practices Document

…so that’s kind of a beginner’s guide

<Ig_Bittencourt> +1 about beginners guide

+1 to Markharrison

<Zakim> PhilA, you wanted to talk about ShEX etc

PhilA: we’re not talking about testing

…it’s about the ability to say “for this tool, you must include the title…”

<HadleyBeeman> PhilA: New W3C working group coming on RDF validation.

…this is going to be useful for us

<HadleyBeeman> We are looking at this: https://docs.google.com/spreadsheet/ccc?key=0AhTZf3B9yQ3odGVvU3pBazFsY3pyUVppNDFSZGtyQkE&usp=sharing&richtext=true#gid=3

<ericstephan> loading very slow :-)

<HadleyBeeman> Sorry, ericstephan — we're going to the document linked from the bottom of the Dimensions tab

<ericstephan> okay thanks

<newton_> https://docs.google.com/presentation/d/13gakj4BzYcAMf1NCNIpXXpPwr35qkVwfKgGbsZ-fpHE/edit#slide=id.p

<newton_> Link to the slides

Bernadette: the description is there. We had a discussion about the main steps or how can we organize the steps of publishing and using data on the web

…we have 4 steps

…we can have more steps

…if we need it

<PhilT> I would change "Data Usage" to "Data Application and Management"

…we can have best practices for each step

…these steps are related to the challenges that we identified

…how we relate the challenges to each one of the steps

…we’re talking about this, so if you want to do something about this, use that BP.. Like a framework

Hadley: how do the use cases fit your spiral?

HadleyBeeman: my question is more about “Data Usage"

…is there any difference between data usage and data reuse?

PhilT: reuse is a potential not an action

…in software engineering the term “use” is a verb

<Zakim> HadleyBeeman, you wanted to ask about "data usage" as a term

…and the term “Reuse” is a potential so it’s a property, not a verb

PhilA: my understanding is about the source of the data
... how do we know that we included everything?
... somewhere in this feedback it should say: it
... having metrics for the value of this

Bernadette: I’m not sure if everything is there

PhilA: how do we know that we’ve got enough?

Bernadette: maybe this is a draft

…maybe we need a methodology for our work

…I’m not sure if this would be a problem

<antoine> for the record, philT's 3 points: invariance, value, measurement

PhilT: for me that slide represents a data lifecycle

tks antoine

<HadleyBeeman> Yaso: About the data usage problem: we have to worry about the provider of the data and the people using the data.

<HadleyBeeman> … They are not necessarily the same people.

<PhilA> Here are some graphics of life cycles... https://www.google.com/search?q=lod2+data+lifecycle&client=opera&hs=S0b&channel=suggest&tbm=isch&tbo=u&source=univ&sa=X&ei=RFA5U_blL8boywOouoHoDg&ved=0CGQQsAQ&biw=1366&bih=577

<ericstephan> +1 Yaso

<HadleyBeeman> … We have to find a place in this lifecycle to differentiate the two personas.

<markharrison> +1 Yaso

<HadleyBeeman> … We have already referenced work in the Linked Data working group about lifecycles of data. Michael Hausenblas did it, if I remember correctly.

<ericstephan> Many times consumers are not considered from the producers perspective.

PhilT: we should look at http://www.slideshare.net/mediasemanticweb/linked-data-life-cycles

<antoine> philT: we need to reference other lifecycle definitions. it's a matter of reputation for the group

<HadleyBeeman> Yaso: The linked data lifecycle isn't the same but there are many intersections for us.

<PhilA> -> GLD Best Practices http://www.w3.org/TR/ld-bp/

<ericstephan> losing sound?

CarlosIglesias: I would add more about sources

<CarlosIglesias> Just one Open Data lifecycle more http://www.slideshare.net/carlosiglesiasmoro/estrategias-open-government-data (in Spanish, but happy to elaborate on this if needed)

Bernadette: this is about the use cases that we have now

<PhilA> The LOD2 Data Life Cycle

<ericstephan> Yes thank you

deirdrelee: maybe it’s just a way of thinking about it

Bernadette: this is just a draft, we have to work on it

<Zakim> CarlosIglesias, you wanted to talk about a similar model he has been applying in OD projects

CarlosIglesias: I really like the model, because what we’re building on is similar

<deirdrelee> There are a lot of 'data use' use-cases as well as data publication use-cases

…the main difference is that we are working on indicators, and actions depending on the indicators

<deirdrelee> use-cases can cover multiple stages of the data life-cycle

…also, we are talking about licencing

…what are the current licence issues within data usage

…I miss a lot of things in data reuse, because we have to think about the 5 stars of data engagement

… I really like the IBM http://www.businessofgovernment.org/report/designing-open-projects-lessons-internet-pioneers

<CarlosIglesias> http://www.businessofgovernment.org/report/designing-open-projects-lessons-internet-pioneers

Vagner_br: I can see interoperability in this lifecycle here; we should think about other elements like updating data

Bernadette: these are the main actions, and the lifecycle is based on it. We have 2 things: the elements and the process

<CarlosIglesias> Previous reference is on the role of openness and user engagement for the success of the Web and it can be applied also to data reuse

…for example: versioning is a process. Interoperability can be a principle

… Machine-readability is a principle

…I think there are different things: aspects, processes and ???

…I call these dimensions. There is the aspects elements

Bernadette: I think we can divide in to principles and processes,

…for example: traceability is a principle

…data versioning… I’m not sure if this is a principle

…we have some aspects, or some elements that we have to look at

…we have problems, for example: heterogeneity: it will be there

Vagner_Br: my question is about data collecting

…should we now consider some other aspects, like the management aspects of how the data is available before we can collect them?

…consider some ecosystem aspects

…because in this lifecycle we are considering that data is available

…the government should consider legal aspects, for example

HadleyBeeman: it’s off the web

PhilA: these are issues that you have to consider, but these are legal aspects and W3C will not deal with it in this WG

deirdrelee: we have to go back to the use cases requirements, based on the life cycle

<Ig_Bittencourt> +1 deirdrelee proposal

…do we need more use cases? Maybe we’re not in position to answer it now, but we can decide if we need more use cases if we look at the challenges

<HadleyBeeman> yaso: Principle that fits in best practice for data collection are the problems of performance for REST APIs, for example.

<HadleyBeeman> … When we collect data, we have to think about how big this dataset will be. Data for four cars is one thing; data for billions of cars, using REST APIs: how will we do that? And how will it affect the performance of applications using this data?

<PhilA> +1 to Yaso

<PhilA> (and +1 to Deirdre's plan too, hence I took myself off the queue)

<HadleyBeeman> +1 to yaso

PhilT: using use cases is generally a tool used to set up a scope

<CarlosIglesias> +1 to Deirdre's plan and to Phil's plan of taking myself off the q for that

<ericstephan> +1 to yaso,

PhilT: we have to look for properties

…best practices should be applied to all this range

<ericstephan> scale versus efficiency

<PhilT> ISO Data management and interchange - http://www.iso.org/iso/home/store/catalogue_tc/catalogue_tc_browse.htm?commid=45342

<CarlosIglesias> +1 for starting with group discussion on challenges

<PhilT> Organization for the Advancement of Structured Information Standards - https://www.oasis-open.org/

Dinner venue

<PhilA> Jamie Oliver Italian

<PhilT> Other - perhaps useful URL's - http://www.usgs.gov/datamanagement/index.php

<HadleyBeeman> Break for lunch — back for 14:00 BST

<PhilA> === LUNCH ===

<ericstephan> ok

<PhilT> Also perhaps - http://www.dama.org/i4a/pages/index.cfm?pageid=3364

<JoaoPauloAlmeida> : I am trying to get online on the conference bridge and google hangout but can't

<JoaoPauloAlmeida> are you in session?

<JoaoPauloAlmeida> ok, I see you are probably breaking for lunch. Please let me know when the google hangout is back on!

<PhilA> == Starting Again==

<JoaoPauloAlmeida> Hi, I see you are back from lunch :-)

<gatemezi> We are alone in the call with JoaoPauloAlmeida ?

<JoaoPauloAlmeida> PhilA, are you dialing in so we can hear what's going on in the room?

<markharrison> PhilA is dialling in

<JoaoPauloAlmeida> ok thanks

<ericstephan> I will have to leave the meeting early to attend a Force 11 Implementation telecon at I believe 4pm London time.

<deirdrelee> New google hangout for the afternoon: https://www.w3.org/2013/dwbp/wiki/London_2014#Google_Hangout

<PhilA> scribe: Caroline_

Challenges

<PhilA> Looking at second sheet on the challenges

deirdrelee: we don't have to go into details on how to solve all the problems, it is more about the scope
... regarding the challenges we can go through one by one
... metadata
... 1st challenge is on metadata
... do we need to put more details?

<PhilA> Guest: Jeremy Debattista

<PhilA> ODI Certificates

Steve: When you publish the data some people are deciding what to publish, but none of these is documented

<PhilA> The Provenance Ontology

Steve: Chicago decided to build its own metadata to describe its data
... NYC also did it

<JohnGoodwin> I just asked about the metadata work discussed here could/should, for example, fit in with other initiatives like INSPIRE

Laufer: I think we have 2 issues: metadata is not standardized, and whether it is machine readable
... will we decide the format of the metadata?

<markharrison> http://inspire-geoportal.ec.europa.eu/

Hadley: do we want to take all metadata or some kinds of metadata?

<Zakim> PhilA, you wanted to pick up INSPIRE use case

PhilA: data should be machine readable and also could be human readable

<adler1> +q

PhilA: if we are talking about a data tool to describe data catalogues, this group would decide for dcat
... best practices includes to describe metadata, it includes data vocab

<Zakim> CarlosIglesias, you wanted to comment on metadata to add it should be unambiguous defined

PhilA: you might have general cases and some others with 80%... etc. How do you handle that

CarlosIglesias: 1. an agreement on standardized metadata, 2. a good description of metadata, 3. a machine readable format for metadata

Steve: can we define a metadata vocab that is agnostic?
... can we provide use cases examples?
... use cases that could use different tools

BernadetteLoscio: asks CarlosIglesias what is the difference between metadata and vocab
... for example: if we describe information about hospitals

CarlosIglesias: dcat is vocab about metadata
... you may have lots of different vocabs
... there is a specific domain which is metadata, and then you have many other domains
... geographic domains for example
... metadata is just a particular use of a given vocab

BernadetteLoscio: metadata can be used to describe a data catalog
... if you are going to describe a csv file would you consider that metadata?

<ericstephan> +q

CarlosIglesias: the data is what you have inside of the csv file
... from my perspective they are metadata

BernadetteLoscio: we have different levels of metadata, that is why I think we should have an agreement on that
... we can use vocabs to describe metadata
... describing specific domains is not the idea
... defining vocabs would be interesting for understanding what kind of metadata

PhilA: we should talk with the CSV WG

<ericstephan> PhilA +1

Vagner_Br: are we saying that metadata should be standardized and in a machine readable format?
... are we saying that any kind of metadata standard is part of our scope or should we have at least a minimum to consider that?

<ericstephan> I think there should be a joint telecon at some point PhilA

<HadleyBeeman> to ericstephan phila: we can definitely make that happen

deirdrelee: it is a challenge according to the use cases. Now we are discussing if this should be adopted by the wg
... maybe the metadata should be generic, but should have a specific domain. That is what we should discuss with the challenges

<Zakim> PhilA, you wanted to mention something I thought of earlier around BP doc

deirdrelee: what is the requirement? What part of these challenges do we want to address?

PhilA: I wonder if it might be helpful to start writing best practices.

<gatemezi> I guess a metadata in UML is out of our scope.. but yes, if it is in CSV , it could be ok for us...

PhilA: if we know which are the best practices we want to write we might start writing them
... metadata should be available in different formats, e.g. JSON
... we might find people who have implemented before we get to the end of the process
... we must see the reality

laufer: I am thinking about the granularity
... we have collections, catalogues, datasets, resources
... how can I identify the resources and put semantics on them
... how do you describe resources

<yaso> Hi nathalia

<JoaoPauloAlmeida> +1 to laufer's point

<nathalia> hello

laufer: we need a metadata to describe these things

<nathalia> the sound is not good here

<PhilA> +1 to Laufer, but I know Eric is about to talk about this point

laufer: if csv wants to describe this kind of metadata, it is a kind of transformation from csv file to another file
... if I have xml file I have another kind of metadata to transform
... How do we describe things?

<nathalia> tks Caroline

<nathalia> I'm at Hangout

ericstephan: I agree that the csv wg will be very helpful with some of the discussions
... metadata is mentioned in the use cases and has become very important

<PhilA> Outputs from CSVW WG

ericstephan: the second point is that, from my point of view, a vocab is a data model
... technology agnostic
... terms, relationships and definitions

<laufer> +1

<PhilA> I always start vocabs that way

<CarlosIglesias> +1 to data models as the central point

HadleyBeeman: does anyone have comments about data models and vocabs?

<JoaoPauloAlmeida> what we are calling vocabularies are data models with some level of sophistication

<JoaoPauloAlmeida> this is not all a "vocabulary" could be, but seems to be the prevalent meaning of the term as it is being used

Steve: we have a mandate to create data quality, comparability and vocab

<JoaoPauloAlmeida> ... in w3c setting

<PhilA> The two vocabs are data quality and data usage, not comparability

Steve: also to define what we expect

<markharrison> i.e. provenance metadata?

Steve: I think we can do that in an open standard way so anybody can use any tech they want to

<PhilA> Steve: Talking about the vocabs we have to do. Quality and granularity etc.

<ericstephan> http://www.w3.org/TR/2013/REC-prov-dm-20130430/ PROV model

PhilA: the provenance is very important

<gatemezi> +1

PhilA: it is a huge subject and we don't have to define it
... it is stuff we can point to

<PhilA> Provenance ontology is at http://www.w3.org/TR/prov-o/

Steve: after publishing, it might be enough just to indicate the archive

<Zakim> HadleyBeeman, you wanted to talk about how we choose our vocabularies: grounding in the problems

Steve: or having a metadata filled for that
... that might be enough
... I keep bringing the open tech angle because when I talk with a city they say they are "only 4 guys" in the city and they don't have resources to study RDF, for example
... they don't have resources to do the perfect job, they just do what they can
... rdf is a little academic
... in the world we live in, people just do what they can
... I think we have a lot of value to add and that is why I keep pushing the group to recommend what is useful

<Zakim> PhilA, you wanted to answer Steve's RDF points

PhilA: LA is a big city. Of course they don't know how to use RDF. But they can specify what they want

<HadleyBeeman> philA: Because they are paying for a tool, they can specify how that tool works.

PhilA: we are not going to say things aren't best practices only because some people won't use them
... they have to provide the metadata
... we must point that on the best practices
... it might have different formats

laufer: any URI can be associated to a column.
... you don't have to do it, but you may
... they do it in Socrata
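
Illustration of laufer's point, as a minimal Python sketch: an optional mapping from CSV column names to URIs. The column names, the sample row and the choice of the Dublin Core `title` URI are assumptions made for this example; they were not named in the discussion, and this does not claim to show how Socrata does it.

```python
import csv
import io

# Hypothetical CSV content, invented for the example.
CSV_TEXT = "name,opened\nCentral Library,1998-05-01\n"

# Optional mapping from column name to a URI saying what the column means.
# Columns without an entry stay undescribed: you don't have to do it, but you may.
COLUMN_URIS = {
    "name": "http://purl.org/dc/terms/title",
}


def describe_columns(csv_text, column_uris):
    """Return (column, uri-or-None) pairs for the header row of csv_text."""
    header = next(csv.reader(io.StringIO(csv_text)))
    return [(col, column_uris.get(col)) for col in header]


columns = describe_columns(CSV_TEXT, COLUMN_URIS)
for col, uri in columns:
    print(col, "->", uri or "(no URI given)")
```

Keeping the mapping outside the CSV file leaves the data untouched, so a publisher could add URIs column by column over time.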

PhilT: we should just not overlap with the previous work

<PhilA> gatemezi: I wanted to agree on metadata - we have to help the publishers to make their metadata at least in 3 star data

<PhilA> ... I think 5 star is better of course but 3 star is a good start

<PhilA> ... and we can refer to the CSVW

<JoaoPauloAlmeida> we need to keep in mind that we need to offer a gradual path

<JoaoPauloAlmeida> for implementers of the practices we recommend

<JoaoPauloAlmeida> perhaps we can be explicit on "levels" of compliance? (are we aiming at "compliance" at all?)

CarlosIglesias: we should use metadata with data to make it machine readable

<JoaoPauloAlmeida> "machine readable" is too coarse a statement, we should be more specific in our communication

CarlosIglesias: it is important not only to provide but also to look at the demand side

<JoaoPauloAlmeida> a stream of bytes is "machine readable"

CarlosIglesias: the good thing about all this is that both solutions can be done

deirdrelee: let's try to refocus every now and then!
... we spent 45min talking about one challenge
... the other challenges on metadata are on metadata standards and how to bring them together
... how often and regularly the data is published
... there are different challenges related to metadata
... we are missing: metadata not being available in a machine readable format
... we need agnostic models, not only RDF
... what are the actual requirements on metadata?

PhilA: would you find it useful for each of these things to be treated as an issue?

deirdrelee: I think we could talk all together
... metadata should be machine readable. Would that be enough for a requirement?

<JoaoPauloAlmeida> in my opinion that is not enough

HadleyBeeman: we might have to define machine readable at some point

<JoaoPauloAlmeida> again, a stream of bytes is machine readable (?)

<PhilA> PROPOSED: Include a requirement that metadata should be machine readable

CarlosIglesias: you can have notes on literature describing it
... maybe we should be careful to use the official meanings of these terms

<yaso> +1

<gatemezi> PhilA: could you add both human and machine readable ?

+1

<adler1> +1

<Ig_Bittencourt_> +1

<JohnGoodwin> +1

<Vagner_Br> +1

<markharrison> +1

<ericstephan> +1

<PhilA> +1

<BernadetteLoscio> +1

<adrianov> +1

<fkyanai> +1

<jeremy> +1

<MakxDekkers> +1

<CarlosIglesias> +1

<gatemezi> +1

<newton> +1

<laufer> +1

<PhilA> Resolved: Include a requirement that metadata should be machine readable

<JoaoPauloAlmeida> thanks for reading my comment :-)

HadleyBeeman: I would suggest that human readable is a separate discussion

<yaso> JoaoPauloAlmeida what about “browser-readable”? hehe

Steve: we don't yet see streaming data as part of open data
... but it might become
... as telephone crossing might become also

<Vagner_Br> As far as I am understanding the definition of machine readable format is a separate discussion

Steve: use case example: we are measuring the traffic, pollution, all kinds of things
... these data can become open data

<fkyanai> I agree with Vagner_BR

Steve: there should be metadata that came from there

<JoaoPauloAlmeida> but if we say it must be machine readable, and we don't agree on what machine readable means then what we say seems vacuous

HadleyBeeman: are we asking if anyone else that creates metadata should make it machine readable?

<PhilA> Semantic Sensor Network Ontology, *may* be standardised in near-future WG

PhilT: the best practices should add value in all use cases
... can you measure the impact of best practices
... in case all of them should be machine readable?
... there might be use cases where the data are not machine readable

<JoaoPauloAlmeida> machine readable just means not in natural language? have a minimum level of structuring?

PhilT: we could understand that it could be recognized with any open standard

Steve: we have often mentioned that the data we get from NYC are PDFs

<PhilA> On streaming - XSLT 3 includes transformations for streaming data see http://www.w3.org/TR/xslt-30/ for more

Steve: we have to build new type of ??? that can understand metadata

PhilA: stop using pdf

<gatemezi> PhilA: that's one of our bp message... ;)

PhilT: if you follow this example. Some organization will publish information with pdf. You could use internal standards to read the document with metadata

Steve: I guess the question is: does the metadata follow the document or the repository follows it?

<Zakim> markharrison, you wanted to point out that also many companies need guidance on using (unfamiliar) Linked Data technologies - which tools?, which formats?, which vocabularies? how

HadleyBeeman: or can we say that because the data is on the web it does matter what format is inside or we should consider metadata

markharrison: whether to put the metadata inline or provide it as a block
... sometimes doing inline makes it more difficult

<laufer> +1 mark

yaso: I just want to ask a question
... a requirement that each resource has its metadata to be data on the web
... I understand that is too naive, but can we say: having metadata is a requirement?
... if you publish a pdf should you make metadata about this content?

<HadleyBeeman> ?

yaso: can this group recommend to use metadata in this case?

<Zakim> PhilA, you wanted to comment on PDFs

laufer: yes

PhilA: of course .pdf is going to be around for a long time
... we had jimmy from adobe during the workshop
... he said what you can do with a pdf
... of course no one does it
... as long as anyone can use it should be there
... perhaps what we can say is that if your pdf includes tables, please use metadata
... give people an explanation of why a pdf on its own is only usable by humans

Steve: maybe a recommendation is when you scrape pdf it should have metadata

PhilA: somebody publishes a pdf and I spend the next 3 weeks reading it and I create a table based on that

<ericstephan> tracking...

PhilA: then I have to refer back to the pdf
... and say that refers to the metadata I referred to

<HadleyBeeman> scribenick: carlosiglesias

<HadleyBeeman> ericstephan: I'll take that back. We have a careful delineation in the CSV on the Web working group — if you scrape data from a PDF file,

<HadleyBeeman> … if you put it into a tabular format — that is the same as taking it from a database.

<HadleyBeeman> … I'll document this discussion. It fits with our other use cases where we're pulling data from an external source.

deirdrelee: any BPs editors yet?
... we are discussing a lot about that
... would be useful to have somebody nominated

phila: any volunteers?

<ericstephan> I'd like to help out. Sounds attractive :-)

<gatemezi> I suggest Deirdre ...

<deirdrelee> thanks Ghislain!!

<HadleyBeeman> Notes page for best practice https://www.w3.org/2013/dwbp/wiki/Best_practices_notes

adler1: we should include not only best practices but also examples of why they are useful to give background

<HadleyBeeman> Thank you so much for writing down our relevant comments on that wiki page, ericstephan. You're amazing!

<deirdrelee> PROPOSED: There should be metadata

<yaso> +1

<HadleyBeeman> +1

<JoaoPauloAlmeida> +1

<laufer> +1

<BernadetteLoscio> +1

<nathalia> +1

<markharrison> +1

<PhilA> +1

<Ig_Bittencourt_> +1

<gatemezi> +1

<adrianov> +1

<MakxDekkers> +1

<ericstephan> +1

+1

<jeremy> +1

<Vagner_Br> +1

<antoine> +1

<fkyanai> +1

<newton> +1

deirdrelee: machine readable?

everyone: already agreed

<PhilA> Resolved: There should be metadata

deirdrelee: should then include human readable requirement?

<PhilA> PROPOSED: That metadata should be human readable

<nathalia> +1

<yaso> +1

<fkyanai> +1

<ericstephan> +1

<JoaoPauloAlmeida> -1

<HadleyBeeman> -1

<MakxDekkers> -1

<gatemezi> -2

<BernadetteLoscio> +1

<yaso> use pdf to describe metadata pdf to describe metadata pdf…. that’s a (infinite) loop :-)

<deirdrelee> RESOLVED: There should be metadata

<MakxDekkers> can we say something about minimal metadata: who, what, when, where?

<MakxDekkers> who=the responsible org

<yaso> +1 to MakxDekkers

<MakxDekkers> what=at least a short description or name

hadleybeeman: lot of metadata is encoded
... not human readable

<ericstephan> Is it a best practice or common practice?

<MakxDekkers> when=date of publication

hadleybeeman: but still useful

<MakxDekkers> where=download link

<MakxDekkers> Sorry no voice connection

hadleybeeman: not to mandate to be human readable
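
Illustration of MakxDekkers' minimal metadata (who, what, when, where), as a small Python sketch that checks a record for the four fields and serialises it as JSON so it is machine readable. The sample values, the `example.org` URL and the helper name `missing_fields` are hypothetical and not from the discussion.

```python
import json

# The four minimal fields suggested on IRC: responsible organisation,
# short description or name, date of publication, download link.
REQUIRED_FIELDS = ("who", "what", "when", "where")


def missing_fields(record):
    """Return the required fields that are absent or empty in record."""
    return [field for field in REQUIRED_FIELDS if not record.get(field)]


# Hypothetical dataset description, invented for the example.
record = {
    "who": "Example City Council",
    "what": "Street tree inventory",
    "when": "2014-03-31",
    "where": "http://example.org/data/trees.csv",
}

serialized = json.dumps(record, sort_keys=True)
print(serialized)
print(missing_fields(record))
print(missing_fields({"what": "Untitled dataset"}))
```

The record stays small on purpose, in line with the point that every extra requirement on metadata reduces how much of it gets published.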

<Zakim> markharrison, you wanted to say that metadata needs to provide context (geographic scope, time range, type of data, domain-specific vocabularies used) so that similar / comparable

<HadleyBeeman> acck me

<BernadetteLoscio> +q

<JoaoPauloAlmeida> the key point is that metadata should be defined in a format that is well described

adler1: can encourage human readability

<JoaoPauloAlmeida> there must be rules for interpretation

adler1: not require

<JoaoPauloAlmeida> Hadley just exemplified that

<JoaoPauloAlmeida> She used integers with clear interpretation rules (1 for school, 2 for ...)

antoine: if it is not human readable it won't be reused

<JoaoPauloAlmeida> it doesn't have to be human readable it has to be MEANINGFUL (sorry to shout I am far away :-))

<MakxDekkers> if it is machine-readable, the machine can make it human-readable
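
Illustration of the two points above (coded values with clear interpretation rules, and machine-readable metadata being rendered for humans), as a minimal Python sketch. Only the rule "1 for school" comes from the discussion; code 2, the field name `building_type` and the sample record are placeholders invented for the example.

```python
# Documented interpretation rules for a coded field.
# Code 1 -> "school" is from the discussion; code 2 is a made-up placeholder.
BUILDING_TYPE_LABELS = {
    1: "school",
    2: "hospital (hypothetical)",
}


def render(record, labels):
    """Turn a coded record into a line a human can read."""
    code = record["building_type"]
    label = labels.get(code, "undocumented code")
    return f"{record['name']}: {label} (code {code})"


print(render({"name": "Site A", "building_type": 1}, BUILDING_TYPE_LABELS))
print(render({"name": "Site B", "building_type": 9}, BUILDING_TYPE_LABELS))
```

The second call shows what happens without documentation: the value is still machine readable, but nobody can say what it means.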

<gatemezi> Is an html document human readable?

<ericstephan> good point Makx

discussion on what human readable means

<JoaoPauloAlmeida> are we going to discuss human readable? we haven't finalized the discussion about machine readable? :-0

<JoaoPauloAlmeida> ... (the definition)

<HadleyBeeman> carlos: I think what we mean is that metadata should be comprehensible.

<JoaoPauloAlmeida> +1 to carlos that's UNDERSTANDABLE

<JoaoPauloAlmeida> what I meant with MEANINGFUL

<HadleyBeeman> … At some point in the chain, that metadata should be expressed such that humans can read it.

carlosiglesias: human-readable vs. comprehensible metadata

adler1: these are different things

BernadetteLoscio: it's more about metadata documentation and not human readability

<laufer> +1

BernadetteLoscio: metadata description

<JoaoPauloAlmeida> it's not an issue of cognitive limitations, it's an issue of having minimum descriptions that allows one to interpret it

<Caroline_> +1 to Bernadette

<markharrison> Encourage development of tools that make machine-readable metadata understandable to humans (even non-technical humans that don't read XML)

<JoaoPauloAlmeida> to map the data to situations in reality, to interpret it

laufer: human understandable

<nathalia> +1 to Bernadette

<MakxDekkers> the more requirements you put on metadata, the less you are going to get

BernadetteLoscio: it's about metadata documentation

<HadleyBeeman> +1 to makxdekkers

<JoaoPauloAlmeida> thanks, Vagner_Br

BernadetteLoscio: ... with description of metadata

<Ig_Bittencourt_> I think it is just like add an rdf:about to that metadata

<markharrison> Note that not all of the requirements on metadata need to fall on the data / metadata publishers - third-party tool developers can help to make metadata more understandable.

deirdrelee: human readable? well documented? easy to understand? comprehensible?

<ericstephan> Easy to understand or well defined?

adler1: nobody will understand the human-readable thing

<deirdrelee> PROPOSAL: Metadata should be well-documented and easy to understand

+1

<deirdrelee> PROPOSED: Metadata should be well-documented and easy to understand

<yaso> +1

<nathalia> +1

<JoaoPauloAlmeida> we could mention: in such a way that it can be interpreted

<Caroline_> +1

<PhilA> 0

<ericstephan> +1

philt: metadata should also be relevant

<MakxDekkers> -0

<laufer> +1

<adrianov> +1

<MakxDekkers> -1

<MakxDekkers> -1 to mark

<HadleyBeeman> makxdekkers, why?

philt: ... documentation should be relevant to the data it describes

vagner: machine readable and well documented

<MakxDekkers> general point: limit the reqs on metadata to the minimum

<JoaoPauloAlmeida> isn't that obvious (relevant?)

<JoaoPauloAlmeida> if someone publishes data that is not relevant, what are they doing?

<PhilA> I am sympathetic JoaoPauloAlmeida

overall metadiscussion about metadata discussion

<MakxDekkers> it is in the interest of the publisher to provide information that helps people understand what it is!

<yaso> https://www.lib.umn.edu/datamanagement/metadata

<Caroline_> +1 to JoaoPauloAlmeida

<JoaoPauloAlmeida> could you hear me?

<markharrison> (batteries failing on laptop again)

<BernadetteLoscio> +1

<HadleyBeeman> joaopauloalmeida: no, we couldn't. Sorry!

<adler1> +1

<gatemezi> +1

hadleybeeman: @mark and @joao please follow-up by irc

phila: like the purpose but don't like the wording
... thinking of dublin core

<JoaoPauloAlmeida> easy to understand is not good; we should not use terms that suggest any level of cognitive effectiveness

<JoaoPauloAlmeida> scientific data is very hard to understand

<JoaoPauloAlmeida> +1 to PhilA

phila: it is something included in the metadata that makes it easy to understand

<Ig_Bittencourt_> +1 to PhilA

<MakxDekkers> PhilA isn't that obvious?

<ericstephan> If you understand the vocabulary you have a better chance of understanding the dublin core records

<JoaoPauloAlmeida> the key issue is that there should be sufficient documentation such that the intended audience can extract the meaning

<MakxDekkers> E.g. a Chinese publisher will provide a description in Chinese if that is the audience for it

phila: several potential audiences
... DNA metadata may not be so easy to understand

<JoaoPauloAlmeida> I disagree that there is no intended audience.

<HadleyBeeman> I'm with you, joaopaulo

adler1: if we want metadata to be used and the history of the data be understood

<JoaoPauloAlmeida> You could have a very wide intended audience ("the public")

adler1: then metadata should be easy to understand

<ericstephan> I still have trouble reading the metadata about the ingredients on my cereal box

adler1: same data can be reused by different audiences

<HadleyBeeman> ericstephan: I have a food allergy, so I've put many years into becoming an expert on that :)

<JoaoPauloAlmeida> cognitive effectiveness or "ease to understand" must not be unqualified without referring to the audience

<Zakim> markharrison, you wanted to make proposal that machine readable metadata should include or refer to (*OR* be capable of being automatically transformed into) human readable

<ericstephan> :-) Getting better over time Hadley :-)

adler1: make it possible to everyone if that's useful for them

<Vagner_Br> +1 to JoaoPauloAlmeida metadata should be easy to understand to the "intended audience"

adler1: especially important for governments

<yaso> http://irsa.ipac.caltech.edu/applications/DDGEN/Doc/ipac_tbl.html

philt: include documentation with metadata
... to avoid broken pointers

<HadleyBeeman> yaso: that link ^ is well-documented metadata

<PhilA> +1 to PhilT talking about persistent data and equally persistent documentation (I paraphrase)

<JohnGoodwin> another useful document potentially http://www.agi.org.uk/storage/standards/uk-gemini/MetadataGuidelines1.pdf

yaso: we have standards for documentation and we can make use of them

<Zakim> HadleyBeeman, you wanted to talk about intended users

<Caroline_> +1 to Hadley

<PhilA> HadleyBeeman: I struggle to think in terms other than with a user in mind. You need a target audience in mind (developer/ public etc)

hadleybeeman: it is important to know who you will be speaking to while writing documentation

<MakxDekkers> maybe also useful https://joinup.ec.europa.eu/asset/dcat_application_profile/description

<Caroline_> +1 to the 1st proposal

<deirdrelee> Option 1: Metadata should be well-documented and useful for the intended audience

<MakxDekkers> which option?

<JoaoPauloAlmeida> can you put these in the IRC?

<JoaoPauloAlmeida> thanks

<HadleyBeeman> I like option 1

<deirdrelee> Option2: Metadata should be capable of being automatically transformed into human readable documentation

<yaso> +1 to the 1st option

<MakxDekkers> option 1: define well-documented

<HadleyBeeman> I feel like option 2 is more "it would be nice if". I suspect we'll have fewer use cases for it though.

<gatemezi> +1 for option 1

<jeremy> +1 for option 1

<MakxDekkers> option 2: not necessary

<PhilA> Option 3 - Metadata should include or refer to documentation useful for the intended audience

<nathalia> +1 for option 1

<laufer> +1 for option 1

<JohnGoodwin> +1 option 1

<markharrison> +1 for option 1, +1 (nice to have) for option 2

<JoaoPauloAlmeida> I prefer option 1

antoine: if it is machine readable it will be easy to have human readable documentation

<adrianov> +1 for option 1

<MakxDekkers> metadata IS a form of (structured) documentation

<ericstephan> +1 for Option 2

<BernadetteLoscio> +1 option 1

<newton> +1 option 1

antoine: drop option 2, not necessary if we have machine-readability

<JoaoPauloAlmeida> can we generalize this to data (which would include metadata)?

<JoaoPauloAlmeida> any data (including metadata) should be structured in such a way that it is possible for the intended audience to extract its meaning; one way of doing this is to supply documentation

<antoine> +1 option3, it matches well philT's point

<JoaoPauloAlmeida> what is option 3?

philt: every description should include the provenance

phila: that will be coming

<gatemezi> Provenance should be part of the metadata..

phila: the dataset should have metadata

<MakxDekkers> +1 for provenance. it is the WHO I suggested earlier

<JoaoPauloAlmeida> PhilA, that's why I think we should generalize

<ericstephan> Provenance is a type of metadata

<JoaoPauloAlmeida> data or metadata must be meaningful to the intended audiences

phila: and on the other hand data should be documented

<gatemezi> And provenance used to cover most of 75% of metadata ....

<JoaoPauloAlmeida> Why don't we generalize? we don't need to go infinitely meta

deirdrelee: it's more about metadata that could have different interpretations
... to avoid ambiguity we need a good description, a label is enough

<JoaoPauloAlmeida> Labels help the intended audience to extract the meaning of data

vagner: metadata is also data

<JoaoPauloAlmeida> thanks for reading the comment (I am sorry I don't know your name)

vagner: the description of metadata should also follow data best practices

phila: when talking about metadata we don't restrict ourselves to the vocabularies in scope of the wg
... but also any other

<JoaoPauloAlmeida> thanks Vagner_Br, I didn't realize it was you as the image in the hang out is fuzzy

hadleybeeman: can jump on the following challenges and come back to metadata later

<MakxDekkers> can you record what deirdre just said?

<deirdrelee> option 4: Metadata vocabulary, or values if vocabulary is not standardised, should be well-documented

<Zakim> antoine, you wanted to ask about our role for defining metadata

<MakxDekkers> option 4 ++++

<PhilA> +1

<deirdrelee> PROPOSED: Metadata vocabulary, or values if vocabulary is not standardised, should be well-documented

<HadleyBeeman> +1

<PhilA> +1

<laufer> +1

<JohnGoodwin> +1

<adler1> +1

<ericstephan> +1

<Caroline_> +1

<markharrison> +1

<newton> +1

<JoaoPauloAlmeida> +1 to coffee

<gatemezi> +1

<antoine> +1

<adrianov> +1

<jeremy> +1

<Vagner_Br> +1

<nathalia> +1

<MakxDekkers> +1 again

<Vagner_Br> +1 happy

<Ig_Bittencourt_> +1

<fkyanai> +1

<deirdrelee> RESOLVED: Metadata vocabulary, or values if vocabulary is not standardised, should be well-documented

<JoaoPauloAlmeida> I hope somewhere we can produce a more general statement: Data must be produced using conventions that enable the intended audience to extract its meaning; usually, this is achieved through documentation