W3C

- DRAFT -

SV_MEETING_TITLE

01 Apr 2014

See also: IRC log

Attendees

Present
Regrets
Chair
SV_MEETING_CHAIR
Scribe
Caroline, JohnGoodwin_, markharrison

<Caroline> Scribe: Caroline

BernadetteLoscio_: do we go through the list or do we choose some?

Ig_Bittencourt: let's start with the subjects related only to BP. If we have time to discuss the others that are also related to Q&G, we will do it later

BernadetteLoscio_: We can skip metadata since we discussed it yesterday

<nathalia> I agree

laufer_: we can check what we have written about it

<nathalia> long discussion about it yesterday

BernadetteLoscio_: check https://docs.google.com/spreadsheet/ccc?key=0AhTZf3B9yQ3odGVvU3pBazFsY3pyUVppNDFSZGtyQkE&usp=sharing#gid=5

<nathalia> which tab are we looking at?

the one above

sorry

this one https://docs.google.com/spreadsheet/ccc?key=0AhTZf3B9yQ3odGVvU3pBazFsY3pyUVppNDFSZGtyQkE&usp=sharing#gid=6

laufer_: how can we define real time?
... if we have an update time do we have the old data archived?

<JoaoPauloAlmeida> to me a challenge seems to be that data is often about phenomena in reality which change

<JoaoPauloAlmeida> so data may be added or change to reflect that

<JoaoPauloAlmeida> I can't hear you guys on the hang out

BernadetteLoscio_: real time
... if we have a dataset we can have it in a catalogue, it might be in an API

laufer_: we can have this in datasets that are not in real time
... we have a close time to update the data

<JoaoPauloAlmeida> now we can hear!!!

<Ig_Bittencourt> +1 to CarlosIglesias view.

<JohnGoodwin_> According to Wikipedia...Real Time Data: Real-time data denotes information that is delivered immediately after collection. There is no delay in the timeliness of the information provided. Real-time data is often used for navigation or tracking.

BernadetteLoscio_: real time is to update the data

markharrison_: very real time can be observation data

<JoaoPauloAlmeida> There are two issues: the date/time when the observation was made

<JoaoPauloAlmeida> The time it takes for data to reach the intended audience

<JoaoPauloAlmeida> If the time it takes for data to reach the intended audience (from observation) is known, then the date/time when the observation was made can be derived

laufer_: in the position of the consumer: I need data. Some data with one week of update is okay

<JoaoPauloAlmeida> To = the date/time when the observation was made

laufer_: they say they will update weekly

<nathalia> why is this a best practice?

<JoaoPauloAlmeida> Td = The time it takes for data to reach the intended audience

<JoaoPauloAlmeida> The problem is that Td is usually unknown

laufer_: if the data goes 3 weeks without an update, it could be a problem

<gatemezi> It all depends on the type of data...

<JoaoPauloAlmeida> So, the best practice is how to deal with the fact that we may need to know To
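The relation between To and Td described above can be sketched as To = Ta - Td, where Ta is the time the data reaches the consumer. Ta and the function name are assumptions introduced only for this illustration; they were not used in the discussion.

```python
from datetime import datetime, timedelta
from typing import Optional


def observation_time(arrival: datetime, delay: Optional[timedelta]) -> Optional[datetime]:
    """Derive To from the arrival time Ta and the delivery delay Td.

    Returns None when Td is unknown, which is the usual case raised above.
    """
    if delay is None:
        return None
    return arrival - delay


arrival = datetime(2014, 4, 1, 12, 0, 0)
print(observation_time(arrival, timedelta(minutes=10)))  # 2014-04-01 11:50:00
print(observation_time(arrival, None))  # None
```

When Td is unknown the sketch cannot recover To, which is the reason given for recording the observation time with the data itself.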

laufer_: if it is only one data set we must guarantee this data set is updated

<nathalia> yes, agree gatemezi

BernadetteLoscio_: this is one requirement
... the question is: one of the use cases is on real time.

<JoaoPauloAlmeida> Data may have to be indexed in time

BernadetteLoscio_: this data can be available

<Zakim> CarlosIglesias, you wanted to say bp should be just provide data in a timely manner and then elaborate defs or whatever on that basis

<nathalia> someone is making noise

<gatemezi> In weather domain, you might need more frequent update (10 minutes?), while geodata for districts can be updated each year?

CarlosIglesias: we should try to define what best practices are
... defining what is real time

<JoaoPauloAlmeida> Zakim can't track the sound of hangout

<CarlosIglesias> http://sunlightfoundation.com/policy/documents/ten-open-data-principles/

CarlosIglesias: if you have real time data you must update it
... some data have value only in real time
... we could have at least the titles of the best practices

<Zakim> markharrison_, you wanted to say that metadata should express frequency of updates, timestamp of last update and to say that in addition to specifying expected update frequency for

markharrison_: the metadata should express the frequency
... the expectation to provide the data with some frequency

<BrianMatthews> markharrison +1

<JoaoPauloAlmeida> yes!

<nathalia> +1 markharrison_

<JoaoPauloAlmeida> Data is a requirement: data may have to be indexed in time, in order to cope with the fact that we do not know Td

Vagner_Br: I want to support CarlosIglesias
... and to add that in terms of requirements the point is that data and metadata should be available in a timely manner
... we should change the title

BrianMatthews: the data should be available at a determined frequency

<JoaoPauloAlmeida> I meant "that is a requirement"

<gatemezi> Is it possible to have metadata without the data it is about?

laufer_: so the publisher has the obligation to do it on time

<markharrison_> Useful to declare update frequency in metadata to avoid the need to poll more frequently than the update frequency
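A minimal sketch of the polling point above, assuming the metadata exposes a last-update timestamp and an update frequency and that the publisher keeps to the schedule. The function and parameter names are hypothetical.

```python
from datetime import datetime, timedelta


def next_poll_time(last_update: datetime, update_frequency: timedelta, now: datetime) -> datetime:
    """Return the next scheduled update strictly after `now`.

    A consumer that waits until this time avoids polling more often
    than the declared update frequency.
    """
    if update_frequency <= timedelta(0):
        raise ValueError("update_frequency must be positive")
    elapsed = now - last_update
    if elapsed < timedelta(0):
        return last_update + update_frequency
    periods_passed = elapsed // update_frequency  # whole periods since last update
    return last_update + (periods_passed + 1) * update_frequency


print(next_poll_time(datetime(2014, 4, 1), timedelta(days=7), datetime(2014, 4, 3)))
# 2014-04-08 00:00:00
```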

Ig_Bittencourt: if you have data from the stock market you must update it every 5 min
... in this case, the best practices should establish that the publisher should release the data with a certain frequency

<JoaoPauloAlmeida> but when you update you may: (i) add new time-indexed entries or (ii) change the content of data [no time-indexing]

<JoaoPauloAlmeida> these are two different approaches

CarlosIglesias: we can write also general best practices regarding this issue

gatemezi: Is it possible to have metadata without the data it is about?

JoaoPauloAlmeida: but when you update you may: (i) add new time-indexed entries or (ii) change the content of data [no time-indexing]
... these are two different approaches
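The two update approaches can be contrasted in a short sketch. The data structures and the `update` helper are hypothetical and only illustrate the difference; they are not a proposed format.

```python
from datetime import datetime
from typing import Dict, List, Tuple

# (i) Time-indexed: every update appends a new (timestamp, value) entry.
time_indexed: Dict[str, List[Tuple[datetime, float]]] = {}

# (ii) Not time-indexed: every update overwrites the previous value.
latest_only: Dict[str, float] = {}


def update(key: str, value: float, observed_at: datetime) -> None:
    """Apply the same update under both approaches."""
    time_indexed.setdefault(key, []).append((observed_at, value))
    latest_only[key] = value


update("temperature", 21.5, datetime(2014, 4, 1, 9, 0))
update("temperature", 23.0, datetime(2014, 4, 1, 10, 0))

print(len(time_indexed["temperature"]))  # 2
print(latest_only["temperature"])  # 23.0
```

Under (i) the earlier observation is still available; under (ii) it is gone unless it was archived elsewhere.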

laufer_: I think this is an issue of archiving
... you must maintain the old data

BrianMatthews: I don't think we need to worry about the tech being used
... we should stick to the method
... what policy and frequency the publisher

<gatemezi> +1 to BrianMatthews point

BrianMatthews: not worry about tech

<Zakim> markharrison_, you wanted to ask about support for standing query (publish/subscribe) capabilities for streaming data feeds? and to respond to laufer - it depends whether the

markharrison_: we never delete anything regarding data

laufer_: we now have to make a sentence summarizing all this

<JoaoPauloAlmeida> markharrison_, if the data is not time-indexed then we have to delete!

<JoaoPauloAlmeida> so, there should be best practices for time-indexing, this is my point

CarlosIglesias: we could create an action for people to detail it

+1 João Paulo

<nathalia> ok, nice

<markharrison_> PROPOSAL: Metadata should declare 1) expected/scheduled frequency of update, 2) if the dataset is journalled (i.e. no deletions, only append), 3) if the dataset is timestamped (can request data for a specific time interval), 4) actual timestamp of last update

<JoaoPauloAlmeida> +1

<Ig_Bittencourt> +1

<gatemezi> Just to understand the 2) point..you mean adding in a different URI?

<JoaoPauloAlmeida> This is a good proposal, I think we should just also note that there should be guidelines/best practices for the specification of time

<gatemezi> ..when you "append"...

<JoaoPauloAlmeida> if you point the laptop to the person with the floor, it will help us a lot (sorry to ask you guys that)

markharrison_: it doesn't change

<JoaoPauloAlmeida> thanks

<gatemezi> No just to understand laufer_

<markharrison_> gatemezi: URI / access method for the dataset should not change, in my opinion

laufer_: we are not saying how the data will be provided to the consumer

<gatemezi> because imagine you were already consuming data in time to

laufer_: you can have an URI or an API

<gatemezi> ok

laufer_: we don't know how the publisher will define the scheme

<gatemezi> +1 then

<JohnGoodwin_> +1

<laufer_> +1

<BrianMatthews> +1

<nathalia> +1

<Ig_Bittencourt> +1

+1

<markharrison_> +1

<CarlosIglesias> +1

<Vagner_Br> +1

<nathalia> Why we don't use wiki for this?

<nathalia> Ok Carol, I understand

https://docs.google.com/spreadsheet/ccc?key=0AhTZf3B9yQ3odGVvU3pBazFsY3pyUVppNDFSZGtyQkE&usp=sharing#gid=6

it is in the Group Challenges

Caroline: Can someone put on the wiki later?

<gatemezi> RESOLVED: Metadata should declare 1) expected/scheduled frequency of update, 2) if the dataset is journalled (i.e. no deletions, only append), 3) if the dataset is timestamped (can request data for a specific time interval), 4) actual timestamp of last update

<nathalia> I can put

<scribe> ACTION: nathalia will put RESOLVED 1 on the Wiki (RESOLVED: Metadata should declare 1) expected/scheduled frequency of update, 2) if the dataset is journalled (i.e. no deletions, only append), 3) if the dataset is timestamped (can request data for a specific time interval), 4) actual timestamp of last update) [recorded in http://www.w3.org/2014/04/01-dwbpbestpractices-minutes.html#action01]
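The four items in the resolution could be carried in a metadata record along these lines. This is only a sketch: the class name, field names and the `is_overdue` check are hypothetical and do not come from any vocabulary discussed in the meeting.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class UpdateMetadata:
    expected_update_frequency: timedelta  # 1) expected/scheduled frequency of update
    journalled: bool                      # 2) append-only, no deletions
    timestamped: bool                     # 3) data can be requested for a time interval
    last_updated: datetime                # 4) actual timestamp of last update

    def is_overdue(self, now: datetime) -> bool:
        """True when the last update is older than the declared frequency."""
        return now - self.last_updated > self.expected_update_frequency


meta = UpdateMetadata(
    expected_update_frequency=timedelta(weeks=1),
    journalled=True,
    timestamped=True,
    last_updated=datetime(2014, 3, 10),
)
print(meta.is_overdue(datetime(2014, 4, 1)))  # True
```

The example mirrors the case raised earlier of a weekly dataset that has gone about three weeks without an update.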

Caroline: lets talk about "tools"

<nathalia> I'm not seeing you

Vagner_Br: who could explain better what is the idea about "tools"

Tools

laufer_: we must look at the use cases to understand what are tools

<nathalia> the camera is looking to the roof

<nathalia> much better now

Ig_Bittencourt: Berna, can you explain about the tools?

<scribe> scribe: Caroline

Bernadette: is related to skill and expertise

Vagner_Br: when you talk about tools as a challenge
... how can we generalize that as a challenge?

Bernadette: I saw this in the use case from Recife

laufer_: NYC uses Socrata for example

Vagner_Br: tools about catalogue?

Bernadette: in general

CarlosIglesias: provide a single or centralized access point for the data
... could be a ckan catalogue
... or another kind
... matching access to data

Bernadette: what could be a best practice for this?

Ig_Bittencourt: we should be agnostic
... a question on documentation: if we have components with APIs, we should provide documentation

laufer_: the publisher might have a best practice to publish the data
... he must choose a tool that can do what the publisher wants
... the choice of the tool will depend on what the publisher wants
... if there is a tool that can do what he or she expects
... people who have Excel need a kind of tool

<gatemezi> I am wondering if we should recommend a tool here...

laufer_: if the publisher thinks that tool is good, we can recommend what is the best practice to choose a tool

<Ig_Bittencourt> No gatemezi. According to the charter, we need to be agnostic.

laufer_: I think we have to think about the consumer and the publisher
... the publisher has to choose a tool that can do what we want

Vagner_Br: I agree with Ig_Bittencourt and laufer_

<nathalia> agree with Laufer

Vagner_Br: it is hard to find any kind of requirements

<gatemezi> so maybe we can skip this and come back later ?

Vagner_Br: even if we must be agnostic
... if we say any kind of tools you use should be well documented
... if you don't want to say that just drop this topic

<gatemezi> A tool can also be an implementation of our bp

Caroline: we should try to resolve this
... at least to have a way to go

BrianMatthews: we can write a recommendation about a method description about a dataset
... we can say a dataset could publish a data description

<JoaoPauloAlmeida> I can't hear a thing

laufer_: this is a requirement of the publisher
... the tool he chooses will understand what he needs

<nathalia> +1 João Paulo

laufer_: we can make a recommendation that he should use
... you should interoperate with the tools in a standard way

Caroline: has talked already

<Zakim> CarlosIglesias, you wanted to say the tool is just a mean, the bp is provide centralized access to data and to compare this discussion with the a11y use case

CarlosIglesias: the best practices do not require any tool
... sometimes it will be an API, sometimes data catalogue, or another thing

<Ig_Bittencourt> BrianMatthews, you meant we could make a recommendation about metadata about the tool used to publish the data?

CarlosIglesias: the BP could provide different set of tools
... we could do something similar to??
... on one side you have a content for a city guidance
... on the other side after creating it we can have a data tools guidelines

<Zakim> markharrison_, you wanted to say that Tools are very useful for helping data publishers to check that translation of data into different formats retain their meaning

markharrison_: tools are very useful for data publishers to expose data in multiple formats

<CarlosIglesias> similar case to WCAG and UAAG use case

markharrison_: but they can also be useful to check that the meaning of the data is not lost

Vagner_Br: we are not defining any kind of requirements, only a few recommendations

laufer_: we don't have to say what is the tool
... but a best practice to use the tools

<CarlosIglesias> one is general best practices and the other is about how tools should implement best practices

<CarlosIglesias> we can follow here a similar approach

<gatemezi> ok, waiting for the proposal..

<CarlosIglesias> proposal: bp is to provide a single access point for data

<JoaoPauloAlmeida> ?

<JoaoPauloAlmeida> I don't understand the proposal

<gatemezi> Me neither

<CarlosIglesias> is a general bp for providing an access point (i.e data catalog, api, sparql endpoint, etc.)

<CarlosIglesias> technology agnostic

<JoaoPauloAlmeida> but "single" is quite strong

<CarlosIglesias> centralized?

<CarlosIglesias> is that better?

<JoaoPauloAlmeida> centralized to me is not good, ... the web has a distributed nature

<nathalia> I think centralised is not good

<nathalia> agreed to JoaoPaulo

<Ig_Bittencourt> +1 to JoaoPauloAlmeida

<Vagner_Br> New text is coming out

<markharrison_> Proposal: Data might be provided via various access mechanisms including (but not limited to) Data catalogues, APIs, SPARQL endpoints, REST interfaces, dereferenceable URIs - and best practice is that data publishers should make use of available tools to support multiple access mechanisms

<JoaoPauloAlmeida> ok

<nathalia> much better

<JoaoPauloAlmeida> now I get it

CarlosIglesias: "single" or "centralized" means to catalogue

ack

BrianMatthews: could we provide a mechanism vocab?
... regarding centralized issue
... if you can find a description of the dataset you can find them in different places

laufer_: specification HYDRA is a way to specify APIs

<laufer_> hydra

<BrianMatthews> HYDRA

<Ig_Bittencourt> http://www.hydra-cg.com/spec/latest/core/

<JohnGoodwin_> could we extend VOID http://www.w3.org/TR/void/#access

<laufer_> is a way of describing web apis

<Vagner_Br> ack meq?

JohnGoodwin_: maybe we could extend VOID

+w

+q

<gatemezi> In DCAT, http://www.w3.org/TR/vocab-dcat/ there are different ways to access a dcat:Distribution

<gatemezi> ..like dcat:accessURL , API and so on
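As an illustration of the DCAT point, a dataset description with two distributions might look like the JSON-LD sketch below. The example.org URLs, the title and the `access_points` helper are hypothetical; `dcat:accessURL`, `dcat:downloadURL` and `dcat:mediaType` are properties from the DCAT vocabulary.

```python
import json
from typing import Any, Dict, List

dataset: Dict[str, Any] = {
    "@context": {
        "dcat": "http://www.w3.org/ns/dcat#",
        "dct": "http://purl.org/dc/terms/",
    },
    "@id": "http://example.org/dataset/weather",  # hypothetical URI
    "@type": "dcat:Dataset",
    "dct:title": "Example weather observations",
    "dcat:distribution": [
        {
            # A file that can be downloaded directly.
            "@type": "dcat:Distribution",
            "dcat:mediaType": "text/csv",
            "dcat:downloadURL": {"@id": "http://example.org/dataset/weather.csv"},
        },
        {
            # An access point such as an API or landing page.
            "@type": "dcat:Distribution",
            "dcat:accessURL": {"@id": "http://example.org/api/weather"},
        },
    ],
}


def access_points(ds: Dict[str, Any]) -> List[str]:
    """Collect every access or download URL declared for the dataset."""
    urls: List[str] = []
    for dist in ds.get("dcat:distribution", []):
        for prop in ("dcat:accessURL", "dcat:downloadURL"):
            if prop in dist:
                urls.append(dist[prop]["@id"])
    return urls


print(json.dumps(access_points(dataset), indent=2))
```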

<markharrison_> Proposal 2: There is value in provision of a small number of well-known data catalogues - and registration of data with such catalogues (or auto-discovery and indexing/classification by such catalogues based on published metadata) - so that data can be found easily

Caroline: we have another proposal "2"

Vagner_Br: I don't think it makes sense to talk about centralized data
... I agree with JoaoPauloAlmeida that centralized is against the spirit of the Web

<gatemezi> Second proposal 2 does not mention "tools" ?

<JoaoPauloAlmeida> +1 to Vagner_Br and markharrison_'s first proposal

<nathalia> I don't understand this proposal

<markharrison_> Proposal 2 is additional - to try to address concerns expressed by CarlosIglesias

<nathalia> ok

<gatemezi> +1 for the first proposal

ack to vote

<nathalia> proposal 2 is not related to tools

<JoaoPauloAlmeida> the text of the 2nd proposal is obscure

BrianMatthews: proposal 2 is like saying we should not have more than a few catalogues

+1 JoaoPauloAlmeida and Brian

<nathalia> it is not related to the discussed topic

<JoaoPauloAlmeida> is not about tools, but about discoverability

CarlosIglesias: we should think more about the federation concept

laufer_: we are talking about tools to provide access
... how we are organizing this information

<BrianMatthews> Reworded: Proposal 2-a: The registration of data within data-set catalogues (or auto-discovery and indexing/classification by such catalogues based on published metadata) should be supported so that data can be found easily.

laufer_: maybe we could change "multiple" for "federated"

<Vagner_Br> Proposal 1: Data might be provided via various access mechanisms including (but not limited to) Data catalogues, APIs, SPARQL endpoints, REST interfaces, dereferenceable URIs - and best practice is that data publishers should make use of available tools to support multiple access mechanisms

<markharrison_> +1 to BrianMatthews proposal 2a

<nathalia> I think the text is better now

<CarlosIglesias> +1

<Vagner_Br> +1

+1

<gatemezi> +1

<markharrison_> +1 to Proposal 1

<laufer_> +1

<BrianMatthews> +1

<nathalia> +1 to proposal 1

<Ig_Bittencourt> +1 to Proposal 1

<CarlosIglesias> proposal: to further discuss the federation concept in relation with previous proposal

<JohnGoodwin_> +1

RESOLUTION: Data might be provided via various access mechanisms including (but not limited to) Data catalogues, APIs, SPARQL endpoints, REST interfaces, dereferenceable URIs - and best practice is that data publishers should make use of available tools to support multiple access mechanisms

<laufer_> +1 to carlos

<CarlosIglesias> +1

<Vagner_Br> for voting now proposal 2: to further discuss the federation concept in relation with previous proposal

<CarlosIglesias> +1

<Vagner_Br> +1

0

<JoaoPauloAlmeida> ok, after the break will the whole group reconvene?

<JohnGoodwin_> +1

<nathalia> According to the agenda

<Vagner_Br> For voting now Brian's proposal Proposal 2-a: The registration of data within data-set catalogues (or auto-discovery and indexing/classification by such catalogues based on published metadata) should be supported so that data can be found easily.

<markharrison_> +1

+1 to proposal 2-a

<nathalia> we continue the discussion in groups

<laufer_> +1 to brian

<BrianMatthews> +1

<Vagner_Br> +1 for proposal 2a

<Ig_Bittencourt> +1 to Proposal 2-a

RESOLUTION: The registration of data within data-set catalogues (or auto-discovery and indexing/classification by such catalogues based on published metadata) should be supported so that data can be found easily.

<Vagner_Br> RESOLVED: to further discuss the federation concept in relation with previous proposal

<nathalia> it is time to break?

<HadleyBeeman> Yes, nathalia — I think they are getting coffee

<nathalia> ok

<scribe> Scribe: JohnGoodwin_

<HadleyBeeman> Your minutes are here: http://www.w3.org/2014/04/01-dwbpbestpractices-irc

<JoaoPauloAlmeida> yes

Privacy/security

<JoaoPauloAlmeida> I am a total newbie in this topic

Caroline: is everybody awake?

<Ig_Bittencourt> Capability URLs: http://www.w3.org/TR/capability-urls/

Ig_Bittencourt: we could look at capability URLs as a means to hide data on the web

<JoaoPauloAlmeida> I think we should start by reviewing the challenges in the spreadsheet

<JoaoPauloAlmeida> as listed in the spreadsheet it is not about confidentiality-integrity and availability of the data itself

<JoaoPauloAlmeida> it is about the content of data

markharrison: these are a form of obfuscation?

Ig_Bittencourt: yes

laufer_: we can't have private data published on the web

Ig_Bittencourt: can provide access to a group of people

<Zakim> markharrison, you wanted to say that capability URLs only work for one-time access in a limited time window

markharrison: capability URLs only work for one-time access, but not for repeated access of data
... we have to respect data protection legislation

Ig_Bittencourt: example - data from health areas
... medical history of individual people

laufer_: no use cases for publishing such personal data due to issues of privacy
... what is the metadata about privacy and security?

Ig_Bittencourt: also related to quality!?

<Zakim> CarlosIglesias, you wanted to comment on his vision on this (1) published data by default with only limitation of privacy/security (2) data combination and merging (3) personal

CarlosIglesias: three things here

<Zakim> markharrison, you wanted to say that often the solution / best practice is to publish aggregated data at coarser granularity, so that the original sensitive raw data cannot be

Vagner_Br: privacy and security is the data publisher's responsibility - not our concern to make any recommendation/requirement?

markharrison: if the publisher is aware that raw data is sensitive, then they have the responsibility to decide whether to publish fine-grained data or to aggregate it so that individuals cannot be identified
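A minimal sketch of publishing at coarser granularity, assuming records keyed by district and a small-group threshold. Both the record layout and the threshold of 3 are hypothetical, and a threshold alone does not guarantee that individuals cannot be re-identified.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple


def aggregate_counts(records: Iterable[Tuple[str, str]], min_group_size: int = 3) -> Dict[str, int]:
    """Count records per district and drop groups below min_group_size.

    Each record is (district, person_id); only per-district counts are
    returned, never the identifiers themselves.
    """
    counts = Counter(district for district, _person_id in records)
    return {district: n for district, n in counts.items() if n >= min_group_size}


raw = [("North", "p1"), ("North", "p2"), ("North", "p3"), ("South", "p4")]
print(aggregate_counts(raw))  # {'North': 3}
```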

<Caroline> +q

BrianMatthews: are people aware of P3P? W3C initiative closed some years ago.

<BrianMatthews> http://www.w3.org/P3P/

BrianMatthews: guidelines about privacy etc.
... P3P could provide useful material to consult

<Caroline> +q

Laufer: best practice is that publishers should provide mechanisms to control privacy

<markharrison> +q to say that an example of commercially sensitive data is serial-level traceability data (which can reveal inventory volumes, trading relationships, flow patterns) - such data is shared on a 'need-to-know' basis but consumers might be interested in a high-level summary of this data, without needing access to every observation event

<BrianMatthews> P3P is more about gathering and using private data - focussed on data consumers rather than data providers

Caroline: proposes two directions: 1) Principles for personal data, and also consider technical issues e.g. P3P

<Zakim> markharrison, you wanted to say that an example of commercially sensitive data is serial-level traceability data (which can reveal inventory volumes, trading relationships, flow

markharrison: need permissions
... it's complicated

<Zakim> CarlosIglesias, you wanted to say minimum requirement is legislation

CarlosIglesias: minimal requirement is to comply with data legislation

<Zakim> need, you wanted to respect Data Protection legislation - especially for personally identifiable data

<Zakim> markharrison, you wanted to ask if there is a Freedom of Information aspect to request not only what data a company collects about an individual - but to ask whether - and where

markharrison: are there Freedom of Information (FOI) consequences for publishing data on the web?

<laufer_> provide different permissions to access data

laufer_: why/how can we give permission to access data?

<nathalia> I can't hear you

<Zakim> Vagner_Br, you wanted to say Data Security/Privacy is a matter of eithe public legislation or internal policy. Shouldn't we avoid any requirements? Should we make any

Vagner_Br: data security and privacy are topics for government legislation and internal policy of organisations

<nathalia> \o/

BrianMatthews: are we in a good position today to make concrete recommendations - should we park this discussion for now?

+1

<markharrison> +1

<Ig_Bittencourt> +1

<Zakim> markharrison, you wanted to suggest considering the concept of security realms - to indicate in metadata what security credentials should be presented in order to gain access to

<laufer_> +1

<nathalia> agree with markharrison

<Ig_Bittencourt> An RDF Schema for P3P 1.0: http://www.w3.org/TR/p3p-rdfschema/

<Ig_Bittencourt> and http://www.w3.org/TR/P3P/

<markharrison> Proposal: Acknowledging that much further discussion is needed on security, metadata could include information about security realms (see OASIS SAML/XACML) that apply to restricted-access data on the web. Realms indicate which security credentials need to be presented in order to be considered for access to the data.

+1

<JoaoPauloAlmeida> `=1

<nathalia> +1

<JoaoPauloAlmeida> +1

<laufer_> +1

<Caroline> +1

<CarlosIglesias> +1

<markharrison> +1 and also note that other technologies such as http://www.w3.org/TR/P3P/ may also be relevant to consider in metadata

<Vagner_Br> +1

<Caroline> RESOLVED: Acknowledging that much further discussion is needed on security, metadata could include information about security realms (see OASIS SAML/XACML) that apply to restricted-access data on the web. Realms indicate which security credentials need to be presented in order to be considered for access to the data.
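One way to picture the resolution: metadata lists the realms that apply, and a consumer checks which of them it has no credentials for before requesting access. The `security_realms` key, the realm name and the helper are hypothetical; this sketch does not use or model the SAML or XACML specifications themselves.

```python
from typing import Any, List, Mapping, Set

# Hypothetical metadata: "security_realms" is an illustrative key, not a standard term.
dataset_metadata: Mapping[str, Any] = {
    "title": "Serial-level traceability events",
    "security_realms": ["supply-chain-partners"],
}


def missing_realms(metadata: Mapping[str, Any], presented_realms: Set[str]) -> List[str]:
    """Return the declared realms for which no credential was presented.

    An empty list means the consumer can be considered for access;
    it does not mean access is granted.
    """
    required = metadata.get("security_realms", [])
    return [realm for realm in required if realm not in presented_realms]


print(missing_realms(dataset_metadata, {"public"}))  # ['supply-chain-partners']
print(missing_realms(dataset_metadata, {"supply-chain-partners"}))  # []
```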

<BrianMatthews> +1

<markharrison> Proposal: Lunch now?

<Ig_Bittencourt> 0

<markharrison> +1

<Ig_Bittencourt> +1

+10

<nathalia> +1 to breakfast

<nathalia> tks

<laufer_> +1

<Vagner_Br> +11111

<Caroline> s/houe/hour

<nathalia> good breakfast to you JoaoPaulo

<nathalia> see you

<HadleyBeeman> We are moving on to skills and expertise.

<HadleyBeeman> scribenick: hadleybeeman

Laufer: A best practice is to study.

<laufer> +laufer

<Ig_Bittencourt> http://www.w3.org/TR/2014/NOTE-ld-bp-20140109/

Ig_Bittencourt: The W3C has some best practices about that.

…: For example, best practices on publishing linked data. Not useful for any kind of data, but useful for linked data.

<Zakim> CarlosIglesias, you wanted to add again that, again, we should not focus only on data offer but include also demand

CarlosIglesias: We should broaden the focus to include data demand too. Engage with data reusers, help them to acquire the skills needed for data reuse. Universities with business, the civil society organisations, etc. Not just about the skills of the government; all those in the ecosystem.

… We should help data reusers as well. Skills and collaboration. Better data culture within society, etc.

Laufer: Searching for people who are interested in the same things, and participating in a community will encourage the reuse of the data.

CarlosIglesias: Will also help address the problems faced by data reusers. For example, civil society organisations often don't have skills in IT. Most people don't have these skills. It's the entire data chain, involve them all in the process of opening the data.

<CarlosIglesias> Some useful references for this point:

<CarlosIglesias> http://www.businessofgovernment.org/report/designing-open-projects-lessons-internet-pioneers

<BrianMatthews> +1 markharrison

markharrison: We should recommend to publishers of data on the web to include simple examples of how to use their data. Gives data reusers confidence to try.

<BrianMatthews> And other good documentation

<CarlosIglesias> http://www.timdavies.org.uk/2012/01/21/5-stars-of-open-data-engagement/

<CarlosIglesias> these were already mentioned on the first f2f day

<Ig_Bittencourt> +1 markharrison

Skills/Expertise

<hadley> google hangout https://plus.google.com/hangouts/_/72cpi53b5160goccfvbb33ho18

sure

<nathalia> it is ok here too

<Caroline> Scribe: markharrison

<scribe> scribenick: markharrison

<Caroline> Thank you Hadley!

<JohnGoodwin> +q

CarlosIglesias: was discussing about capability / capacity of data publisher and also data re-users / potential re-users.

<JohnGoodwin> should we consider the '5-stars of data engagement' http://www.timdavies.org.uk/2012/01/21/5-stars-of-open-data-engagement/ and build on these

CarlosIglesias: Not only provide the data best practices for data provider. Data providers should encourage re-use by building capacity for external re-users

laufer: useful to incentivise community of data providers and data users - so both sides can enhance their skills and understand each other's needs

JohnGoodwin: 5-star scheme of data engagement - see link above (Tim Davies)

<Ig_Bittencourt> http://www.businessofgovernment.org/report/designing-open-projects-lessons-internet-pioneers

CarlosIglesias: and also see link http://www.businessofgovernment.org/report/designing-open-projects-lessons-internet-pioneers

Vagner_Br: concern about burdening data publishers with encouraging re-use
... more of a recommendation than requirement

CarlosIglesias: perhaps a SHOULD not a MUST

<gatemezi> One way of encouraging reuse is to have metrics for data, by "promoting" them each time they are reused and reported

CarlosIglesias: also include for each best practice, some real-world example of how it is being done
... serves as a proof of implementation of best practices

laufer: so it's a use case we can point to?

Vagner_Br: just wanted to say that we should take care not to only consider the perspective / responsibilities of the data provider
... expectations in terms of expertise, opening up the data. Would also like to consider the perspective of the consumers and re-users of the data
... they should encourage government / publishers to open up the data - it's a two-way street

laufer: CarlosIglesias also raised these points - and the need for synergies between data providers and data re-users / consumers - and incentivisation and feedback

<gatemezi> +1

Vagner_Br: the entire data ecosystem needs to play a role

<nathalia> I'm not hearing you

Caroline: should we explain which roles / actors are in the ecosystems? - data publishers, data re-users, end-consumers of data

CarlosIglesias: also talk about collaboration and co-operation

<Vagner_Br> http://blog.okfn.org/2011/03/31/building-the-open-data-ecosystem/

<CarlosIglesias> Participation and collaboration lessons from Internet pioneers:

Caroline: also mention about the value of the communities that are already engaged

<CarlosIglesias> 1 - Let everyone play

<CarlosIglesias> 2 - Play nice

Caroline: to encourage existing communities to grow

<CarlosIglesias> 3 - Tell what you are doing while you are doing it

<CarlosIglesias> 4 - Use multiple communication channels

<CarlosIglesias> 5 - Give it away

+1 to CarlosIglesias

<CarlosIglesias> 6 - Reach for the edges

<CarlosIglesias> 7 - Take advantage of all organizations

<CarlosIglesias> 8 - Design for participation

<CarlosIglesias> 9 - Increase network impact

<CarlosIglesias> and 10 - Build platforms

<Caroline> +1

<CarlosIglesias> from http://www.businessofgovernment.org/report/designing-open-projects-lessons-internet-pioneers

<Ig_Bittencourt> +1 to CarlosIglesias

<CarlosIglesias> more details there

+q to say W3C already maintain lists of tools - but do we need more commentary to say why these are useful

<CarlosIglesias> on a side note

<Zakim> markharrison, you wanted to say W3C already maintain lists of tools - but do we need more commentary to say why these are useful

<CarlosIglesias> may be worth also looking at http://www.webfoundation.org/wp-content/uploads/2013/06/OGD-Indonesia-FINAL-for-publication.pdf

<CarlosIglesias> and http://data.worldbank.org/sites/default/files/1/od_readiness_-_revised_v2.pdf

<CarlosIglesias> to see what are the dimensions usually associated to (open) data

e.g. provide more background in addition to what is already at https://www.w3.org/2001/sw/wiki/Tools

laufer: interaction within ecosystem is a way to improve skills and expertise of all actors
... and to provide feedback and incentivisation

Ig_Bittencourt: even if we provide links to tools, we need to provide more info to guide which tools to use for specific purposes
... e.g. could also be helpful to provide benchmarking

<Zakim> Vagner_Br, you wanted to say Are we saying that: The interaction between the ecosystem's actors is the way to increase the expertise and skill among them?

Ig_Bittencourt: if we consider all actors, may be interesting to provide some step-by-step guide for publishing data / linked data

<gatemezi> I guess the interaction among the actors in the ecosystem could help increase the quality of the data (detecting and reporting errors, etc)

<nathalia> I think it is important to consider the consumers too

<Caroline> +1 to gatemezi and nathalia

CarlosIglesias: In all projects, a big part is to consider participation - via hackathons, collaboration with entrepreneurs - we can provide some high-level guidance on this

<nathalia> how people can reuse the data for other purposes

CarlosIglesias: also including educational materials for school and university at every level
... think about engagement techniques - discuss further later

Vagner_Br: perhaps not a step-by-step guide. Many already exist. I see our role as providing real examples of interactions among actors.

<gatemezi> CarlosIglesias: not only hack events, but also direct access to the platform (one click registration) and sharing reuse cases and applications.

<Ig_Bittencourt> Agree that the goal is not to build a step-by-step guide but to link to existing ones, e.g. http://www.w3.org/TR/2014/NOTE-ld-bp-20140109/

<nathalia> +1 to Ig e Vagner

<Zakim> Vagner_Br, you wanted to say we may suggest examples of interaction among actors, not a how-to

<Vagner_Br> Like this? --- the interaction among the actors in the ecosystem could help increase the skills among them and the value of the data (detecting and reporting errors, etc)

<JoaoPauloAlmeida> we should make sure that we state certain practices that when followed lead to better interaction among actors in the ecosystem

<Caroline> +1 to JoaoPauloAlmeida

<JoaoPauloAlmeida> this should be our mission, to then increase the value of the whole ecosystem

<JoaoPauloAlmeida> if you follow this advice (best practices) then the data you publish can be more valuable to others

+1

<JoaoPauloAlmeida> I mean in general to all our best practices

<JoaoPauloAlmeida> our mission is to produce advice to make this ecosystem viable and valuable

JoaoPauloAlmeida: we can only read you - we cannot hear you by audio on Google Hangout

<JoaoPauloAlmeida> ok, I am also only able to follow in text. So, perhaps my comment is a bit out of context. I understood that Vagner_Br was reasoning on the mission of the group, ...

laufer: collective effect of collaboration improves the value of the data by improving expertise of all actors

<gatemezi> JoaoPauloAlmeida: I think what we are saying is that there should be a "way" for different actors to come together and "speak about" the data, the way they are used, etc. And this can be achieved via many channels (hackathons, other events, etc.)

+1 to gatemezi

<JoaoPauloAlmeida> thanks gatemezi that clarifies

- need feedback loop from users to publishers

<Caroline> who on the hangout can hear us?

<JoaoPauloAlmeida> I can't...

<nathalia> I can't hear well

<Caroline> is it too noisy or is the sound too low?

it's noisy - we're all in the same room

<nathalia> the sound is low

<Caroline> of course not! Your participation is very important :)

<laufer> don't do that, Joao

It's not you annoying us

<Ig_Bittencourt> JoaoPauloAlmeida, your comments have been very useful and it would be great if you could continue.

It's just problematic that we don't have separate break-out rooms without background noise from other discussions in parallel

<Caroline> +1 to markharrison

<JohnGoodwin> +1

we're checking if we have WiFi outside...

maybe we can move outside to reduce background noise

<Caroline> I think we have a good wifi outside

<nathalia> JoaoPauloAlmeida +1

<Ig_Bittencourt> +1

<Caroline> do you want to come here?

<Caroline> there is no noise

We'll move outside

<nathalia> now it is much better

<JoaoPauloAlmeida> ok

<JoaoPauloAlmeida> the connectivity is clearly worse now

<JoaoPauloAlmeida> we are being let down by technology. please go on guys...

<nathalia> the image is worse now but I prefer hearing them to seeing them well

<nathalia> it is a trade off

Caroline: we were trying to make a proposal

(ignore the video image)

<nathalia> ok

Who writes the proposal?

<nathalia> I'm ignoring

<JoaoPauloAlmeida> ok go on

<Caroline> ok

<Caroline> I turned off the video

<nathalia> good

(no video means more bandwidth for audio - may be better anyway)

<nathalia> maybe the sound will be better

<Caroline> can you hear us now?

<JoaoPauloAlmeida> yes but please remember that the microphone is sensitive only to those in front of the laptop

<nathalia> yes

We are in a semi-circle around the laptop

<Vagner_Br> PROPOSAL: the interaction among the actors in the ecosystem could help increase the value of the data (detecting and reporting errors, etc) and the skills among them.

+1

<nathalia> +1

<JoaoPauloAlmeida> +1

<Caroline> +1

+1 : not only detecting and reporting errors but also providing feedback on potential improvements

<CarlosIglesias> +1

<Ig_Bittencourt> +1

<BrianMatthews> +1

<laufer> +1

<Vagner_Br> ... by interaction we mean a "way" for different actors to come together and "speak about" the data, the way they are used, etc. And this can be achieved via many channels (hackathons, other events, etc.)

proposal accepted

<Caroline> RESOLVED: the interaction among the actors in the ecosystem could help increase the value of the data (detecting and reporting errors, etc) and the skills among them.

<Caroline> NOTE: by interaction we mean a "way" for different actors to come together and "speak about" the data, the way they are used, etc. And this can be achieved via many channels (hackathons, other events, etc.)

proposal accepted with note (15:29 Vagner_Br )

<Vagner_Br> +1

<nathalia> 2 minutes?

<nathalia> ok

<JoaoPauloAlmeida> ok

<gatemezi> ok

<gatemezi> +1

<JoaoPauloAlmeida> I will break for lunch and probably when I am back you will have finished the meeting

<JoaoPauloAlmeida> thanks to you all and talk to you soon in the teleconferences

<Caroline> Sorry, we took more than that! We are coming back

back now

Revenue

CarlosIglesias: Best use of teleconference is to review prepared material rather than discussing one challenge per week
... need to share load for drafting ideas about best practice on topics - then vote, refine and distribute, review on conference calls
... use teleconferences as 'check-points'

Caroline: let's distribute actions among us now

CarlosIglesias: will provide ideas about best practices - but will need some time

<nathalia> +1 to Caroline idea

discussing next steps - not revenue

please see irc chat scribe notes

CarlosIglesias: start populating wiki with notes for best practice on topics

<gatemezi> markharrison: I am on irc, that's why i am asking

<nathalia> I think it is important to list the tasks now and assign someone responsible for each one

we have stopped before discussing revenue - discussing what to do over next few weeks to divide up the work

agreed

Caroline: suggest to make edits on the wiki, then use mailing list to highlight the edit you've made?

<nathalia> I can put all of today's resolutions on the wiki

Thanks nathalia

<laufer> +1

<Caroline> +1 to nathalia

+1

<Ig_Bittencourt> +1 to nathalia

<Caroline> ACTION: nathalia will put all of today's resolutions on the wiki [recorded in http://www.w3.org/2014/04/01-dwbpbestpractices-minutes.html#action02]

<nathalia> after that I will send an e-mail to the list

agreed not to create a specific task force for this group - CarlosIglesias

<nathalia> can someone scribe the discussions? We can't hear anything on the Hangout

I'll read through the proposals we agreed on, when we re-group with the whole group in 5 minutes

scribe: to provide a summary of our work today and progress so far

<nathalia> ok

<nathalia> I'm hearing only ghost voices

<nathalia> something like that

<Ig_Bittencourt> We are not discussing anymore.

<Ig_Bittencourt> we are just waiting to re-group.

<nathalia> What is the agenda now?

<nathalia> ah ok

<Caroline> we will go back to discuss with everyone

<nathalia> ok

<nathalia> can we use zakim again?

<Caroline> what do you mean by using Zakim again?

<gatemezi> If we should change the channel from #dwbpbestpractices to #dwbp

<nathalia> https://www.w3.org/2013/dwbp/wiki/Data_on_the_Web_Best_Practices#Challenges

<Caroline> thank you nathalia

<nathalia> I don't know if that is the best place to put it

<nathalia> let me know if it is not

<BrianMatthews> On Academic Data Citation this report is useful: https://www.jstage.jst.go.jp/article/dsj/12/0/12_OSOM13-043/_article

<Caroline> I don't know either

<Caroline> maybe you can just ask after markharrison finishes the presentation

<Caroline> Thank you, BrianMatthews. Should we add this on the wiki?

<BrianMatthews> Sorry, meant to put it on the #DWBP irc - which I now have :-)

Summary of Action Items

[NEW] ACTION: nathalia will put all of today's resolutions on the wiki [recorded in http://www.w3.org/2014/04/01-dwbpbestpractices-minutes.html#action02]
[NEW] ACTION: nathalia will put RESOLVED 1 on the Wiki (RESOLVED: Metadata should declare 1) expected/scheduled frequency of update, 2) if the dataset is journalled (i.e. no deletions, only append), 3) if the dataset is timestamped (can request data for a specific time interval), 4) actual timestamp of last update) [recorded in http://www.w3.org/2014/04/01-dwbpbestpractices-minutes.html#action01]
 
[End of minutes]

Minutes formatted by David Booth's scribe.perl version 1.138 (CVS log)
$Date: 2014/04/01 16:18:17 $
