CEO-LD Face to face Day 2

30 Sep 2015

See also: IRC log


chunming, phila, jianhui, jitao, MaikRiechert, jtandy, adina, geoffrey_boulton
chunming, phila, Jeremy Tandy


<phila> Notes from the meeting

<phila> https://www.w3.org/2015/ceo-ld/wiki/London_Kick_Off_meeting#Structure

phila: notes that we're working in plenary to write our report in note form at the address above
... so there will be limited notes in the IRC channel from this morning's session

<chunming> [discussion on principles]

<chunming> scribe: chunming

GB: be open, contributing something distinctive to W3C/OGC WG
... No wheel reinvention. Something should be deliverable/actionable, not an academic exercise
... minimum cost

jtandy: focusing on data sharing, not links between satellite and station.

Phil: [shows HTML5 spec]
... a document is produced over time.
... another example is the Spatial Data on the Web use cases & requirements doc
... we have formal snapshot publications
... each is a step towards publishing a formal doc; a WG can produce any number of WDs
... hope this group will have a kind of recommendation
... implement, test. We need at least two full implementations of everything
... when you can prove that, with comments answered and people satisfied with the feedback
... when that is proven, we can call the Director to push the WD to Last Call (we have evidence, implementations, we're done)
... then we need to push to the W3C membership, as well as OGC

jtandy: there is an e-vote process at OGC
... simple majority

phila: similar process in W3C.
... then we have a Recommendation; OGC calls them standards.
... if ISO wants to, it could take our results.

phila: the docs are copyrighted by W3C's 4 hosts, and OGC
... we have a strict site policy: use for free, and make sure there are no patent issues in the document
... that's the process.

jianhui: members from different organizations join the group; how are their contributions shown in the spec?

phila: you can join representing yourself or your organization.
... but the organization makes the decision on IPR issues.

[moving to Issues section ] https://www.w3.org/2015/ceo-ld/wiki/London_Kick_Off_meeting#Issues

scribe: how do you describe coverage data
... how do we use the geospatial data
... in terms of access, discovery is important
... schema.org, if you want to be discovered

jtandy: schema.org and W3C have overlapping membership
... http://schema.org
... Schema.org is a collaborative, community activity with a mission to create, maintain, and promote schemas for structured data on the Internet, on web pages, in email messages, and beyond.
... inaccurate and incomplete metadata is harmful for search engines
... schema.org encourages you to provide more metadata
... so search engines will get more information from your page; that's the incentive
... therefore, if you want your data to be found, you could use schema.org
... I do agree with the idea.

phila: schema.org is extensible, we can extend it.
... we could discuss in w3c community group.
... you can provide extensions based on proving the need

phila: need to access a subset, slices
... then we need to represent the data
... RDF data cube, csv, geotiff, and other ways
... number of choices

MaikRiechert: does Data Cube support ordered data?

phila: yes. it does.

jtandy: you may think of using Data Cube to describe data slices, or any format like GeoTIFF.

jianhui: how to use RDF to describe data granularity?
... we can use rdf to describe some slices
... then it will be useful, but if we describe data in detail, we need to cut big data into slices

phila: slices as images?

jianhui: yes

phila: geospatial data could be described as linked data
... a wikipage: well-known_text


Well-known text (WKT) is a text markup language for representing vector geometry objects on a map, spatial reference systems of spatial objects and transformations between spatial reference systems.
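
For illustration only, a minimal Python sketch of reading two common WKT geometry types, POINT and POLYGON, into plain coordinate tuples. The function names are hypothetical and the parser handles only simple 2D cases; a real system would more likely rely on an established geometry library.

```python
import re


def parse_wkt_point(wkt: str) -> tuple:
    """Parse a 2D WKT POINT such as 'POINT (30 10)' into (x, y)."""
    match = re.fullmatch(
        r"\s*POINT\s*\(\s*([^\s()]+)\s+([^\s()]+)\s*\)\s*", wkt, flags=re.IGNORECASE
    )
    if match is None:
        raise ValueError(f"not a 2D WKT POINT: {wkt!r}")
    return float(match.group(1)), float(match.group(2))


def parse_wkt_polygon(wkt: str) -> list:
    """Parse a 2D WKT POLYGON into a list of rings, each a list of (x, y)."""
    match = re.fullmatch(
        r"\s*POLYGON\s*\(\s*(.*)\s*\)\s*", wkt, flags=re.IGNORECASE | re.DOTALL
    )
    if match is None:
        raise ValueError(f"not a WKT POLYGON: {wkt!r}")
    rings = []
    # Each ring is a parenthesised, comma-separated list of "x y" pairs.
    for ring_text in re.findall(r"\(([^()]*)\)", match.group(1)):
        ring = []
        for pair in ring_text.split(","):
            x, y = pair.split()
            ring.append((float(x), float(y)))
        rings.append(ring)
    return rings
```

In WKT a polygon ring is closed, so the first and last coordinate pairs of each ring are expected to be identical.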

<jtandy> (the RDF Data Cube could be used to describe the structure of a multi-dimensional dataset, 2d slices of this dataset could be encoded as imagery, e.g. GeoTIFF, that are referenced from the Data Cube description ... I called this a hybrid approach)

jtandy: APIs?

GB: the possibility of doing test?

<jtandy> (the hybrid approach suggested above needs to be verified ... I've not seen this used in practice)
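
For illustration only, a rough Python sketch of what the hybrid approach might look like: a function that emits a small Turtle description in which an RDF Data Cube slice points at a GeoTIFF file. The `qb:` terms come from the RDF Data Cube vocabulary; everything else is an assumption, including the `ex:time` dimension, the example URIs and the use of `rdfs:seeAlso` as the link to the image. A complete description would also need a data structure definition and slice key, which are omitted here.

```python
def slice_turtle(dataset_uri: str, slice_uri: str, time_value: str, geotiff_url: str) -> str:
    """Return Turtle describing one time slice of a dataset, linked to a GeoTIFF.

    ex:time and the rdfs:seeAlso link are illustrative assumptions, not an
    agreed way of referencing imagery from a Data Cube description.
    """
    return f"""@prefix qb: <http://purl.org/linked-data/cube#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
@prefix ex: <http://example.com/def#> .

<{dataset_uri}> a qb:DataSet ;
    qb:slice <{slice_uri}> .

<{slice_uri}> a qb:Slice ;
    ex:time "{time_value}"^^xsd:dateTime ;
    rdfs:seeAlso <{geotiff_url}> .
"""


if __name__ == "__main__":
    print(
        slice_turtle(
            "http://example.com/dataset/sst",
            "http://example.com/dataset/sst/slice/2015-09-30",
            "2015-09-30T00:00:00Z",
            "http://example.com/files/sst-2015-09-30.tif",
        )
    )
```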

MaikRiechert: for MELODIES, @@@
... if you define RDF describing coverage data
... for MELODIES, for small subsets of data, it is ok
... but if we talk about big data, who are the users of the RDF data?
... even if we just expose it like that, it is still very large.

phila: one test would be, is it ok for a big data subset to do that?
... would need a usecase like this.

MaikRiechert: there would be a query like: given a temperature, get back all the points with that temperature; the result would be a large dataset.
... it depends how you use the data
... we could do statistics, but I have never found a case for using that.
... that's my concern.

jtandy: are people taking satellite data as a big set, and processing it locally to provide services?
... this is interesting, not providing a coverage data to browser to show something, but tell me the possible coverages under a condition.

MaikRiechert: some need server processing
... but not only server processing; there are cases that integrate different data (coverage data, measurement data) and calculate outside the server.

<phila> ACTION: PhilA to ask BillR about potential use cases for subsets of a coverage [recorded in http://www.w3.org/2015/09/30-ceo-ld-minutes.html#action01]

jtandy: there are JS implementations; you have an online software agent, and you may expose other services. You have a coverage space; the API could use coverage data and may query other relevant datasets.

phila: we need use cases you will be happy with :-)
... the possible tests.

MaikRiechert: observe property, @@@

phila: but you don't do it in the browser, you do it server-side

MaikRiechert: think about JSON-LD
... describe lots of things you don't need

jtandy: there are subset of cases, we use code coverage data in the format.
... we need to discover what is in the coverage
... and in some cases, we need to download the dataset and handle that.

MaikRiechert: for the MELODIES project, scientists may just download the datasets.
... then may link to some APIs not in RDF form.

phila: need to find @@ to have an example.
... scalability, is another issue.

<phila> https://portal.opengeospatial.org/files/?artifact_id=56866

phila: OpenSearch Geo and Time Extensions (from OGC)

jtandy: in terms of coverage, you need the query to relate to an area (2D geo extension)

<phila> jtandy: Geo temporal Open Search only works in 2D. but there's an overlap in terms of what it returns

phila: that looks like a potential start point for this. (simple query)
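
For illustration only, a minimal Python sketch of the kind of simple query this could start from: filling an OpenSearch URL template with a bounding box and a time range. The `{geo:box}`, `{time:start}` and `{time:end}` placeholders follow the OpenSearch Geo and Time extensions as I understand them (box order west, south, east, north); the endpoint URL, the query parameter names `bbox`, `start`, `end` and the function name `fill_template` are hypothetical.

```python
import re
from urllib.parse import quote

# Hypothetical template of the kind a service would advertise in its
# OpenSearch description document. A trailing "?" marks an optional parameter.
TEMPLATE = "http://example.com/search?bbox={geo:box}&start={time:start?}&end={time:end?}"


def fill_template(template: str, values: dict) -> str:
    """Substitute OpenSearch template parameters with percent-encoded values."""

    def substitute(match: re.Match) -> str:
        name = match.group(1)
        optional = match.group(2) == "?"
        if name in values:
            # Keep commas and colons readable in the resulting URL.
            return quote(str(values[name]), safe=",:")
        if optional:
            return ""
        raise KeyError(f"missing required parameter: {name}")

    return re.sub(r"\{([^{}?]+)(\??)\}", substitute, template)


if __name__ == "__main__":
    print(
        fill_template(
            TEMPLATE,
            {"geo:box": "-10.0,49.0,2.0,61.0", "time:start": "2015-09-01T00:00:00Z"},
        )
    )
```

The sketch only covers the 2D bounding box and time interval discussed here; it says nothing about further dimensions.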

jtandy: if we get more layers, 2d geo extension is not enough. need to be extended
... rfc 7111

<phila> RFC 7111

URI Fragment Identifiers for the text/csv Media Type

These fragment identifiers make it possible to refer to parts of a text/csv MIME entity identified by row, column, or cell. Fragment identification can use single items or ranges.

<phila> http://example.com/data.csv#row=5-7

jtandy: this could help you refer to part of the data

<phila> Again, that's only 2D, i.e. a single table
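
For illustration only, a minimal Python sketch of resolving a row-based fragment such as `#row=5-7` against a CSV document. It assumes rows are counted from 1 over every record in the file, including any header line, which is my reading of RFC 7111; it handles only single rows, ranges and `*` for the last row, and ignores the column, cell and multi-selection forms. The function name `select_rows` is hypothetical.

```python
import csv
import io


def select_rows(csv_text: str, fragment: str) -> list:
    """Return the rows of csv_text addressed by a 'row=' fragment identifier."""
    scheme, _, spec = fragment.lstrip("#").partition("=")
    if scheme != "row":
        raise ValueError("only row= fragments are handled in this sketch")
    rows = list(csv.reader(io.StringIO(csv_text)))
    start_text, _, end_text = spec.partition("-")
    start = int(start_text)
    if not end_text:
        end = start          # single row, e.g. row=4
    elif end_text == "*":
        end = len(rows)      # open-ended range, e.g. row=5-*
    else:
        end = int(end_text)  # closed range, e.g. row=5-7
    # Fragment positions are 1-based and inclusive.
    return rows[start - 1:end]


if __name__ == "__main__":
    data = "id,value\n1,a\n2,b\n3,c\n4,d\n5,e\n6,f\n7,g\n"
    print(select_rows(data, "#row=5-7"))
```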

MaikRiechert: how about the subset in native coordinates?

chunming: this goes back to the balancing between serverside and browser

jtandy: lots of the geospatial data services provide this kind of complex mappings (into native coordinates).

Adina: the image that comes from the satellite is just pixels
... the processing will choose which coordinates will be used

<chunming_> jtandy: CEO-LD should provide a view on Issue-28 - whether SDW WG should define a default CRS

<chunming_> http://www.w3.org/2015/spatial/track/issues/28

jtandy: what have you heard about how people use the data?

<phila> GB: Pitches idea of gamification of coverages usage

<scribe> scribe: phila

Adina: Talked about software packages they use.
... ESRI has an image handling group within ArcGIS
... There are others and several open source options
... Commercial missions are changing from bringing the data to the people to bringing the people to the data: creative environments where you can access the data, play with it and do stuff there.
... And you start paying when you have created your product.
... That's a trend among the commercial services

Adina: ENVI and PCI are the other commercial packages we tend to use
... There's one called Cloud EO funded by Russia
... different packages for different types of data (radar, imagery)

<scribe> ACTION: Adina to provide list of relevant software packages [recorded in http://www.w3.org/2015/09/30-ceo-ld-minutes.html#action02]

<jtandy> scribe: Jeremy Tandy

<jtandy> scribenick: jtandy

Next steps

phila: We need to talk about
... i) creating impact in china
... ii) the capacity of this group
... iii) TPAC in Sapporo
... iv) our next F2F in Beijing
... can chunming talk about the F2F meeting

chunming: one constraint is to hold before the W3C Advisory Committee meeting

[discussion]: concludes week of 29 Feb?

Denise: notes that mid-March is bad for OGC- TC meeting in Washington and World Bank Land & Poverty conference

phila: China is 'closed' (more or less) for the Spring Festival (?)

<chunming> Spring Festival Feb 8.

jianhui: Spring Festival is early next year

[discussion] concludes that w/c 22 Feb is workable

(general agreement)

<chunming> plan 1: Feb 22-23, 2016

Denise: can we avoid 24 Feb please

<chunming> plan 2: Feb 25-26, 2016

phila: (looks at his diary) ...

Denise: that week is Mobile World Congress
... can we go back to w/c 29Feb?

GB: early that week please
... RDA plenary is 1-3 March in Japan

phila: so - 2-day meeting in Beihang University, Monday 29 Feb, Tue 1 March

GB: are we prevented from meeting on Sunday?

jianhui: good to avoid the RDA conference

phila: Sunday 28- Monday 29th
... then people can leave to go RDA plenary (in Tokyo)

Denise: (looks online) and notes that RDA starts on 29th

phila: RDA and CEO-LD together makes for easier travel request

RESOLUTION: next F2F meeting Sunday 28-Monday 29 Feb, hosted by Beihang University

phila: asks Jianhui if he can attend TPAC?

jianhui: not enough time to get visa

<chunming> TPAC 2015: http://www.chinaw3c.org/member-meetings.html#tpac

jianhui: because previous meeting in Paris (?)

phila: (describes TPAC)
... 2-day working group + 1-day plenary + 2-day working group

<chunming> http://www.w3.org/2015/10/TPAC/Overview.html

Spatial Data on the Web meets on the Monday and Tuesday

phila: SDW will spend (at least) half day on the coverages deliverable
... you can attend other groups too as observer
... it's not free; USD85/day
... travel is not funded by CEO-LD
... so it will be expensive
... If you are going, please book _NOW_ as hotel prices will increase very soon
... W3C Members meet 2x per year ... at TPAC and at the Advisory Committee (AC)
... Working Groups meet as often as they need

jianhui: If you're an AC member do you still have to pay?

phila: yes - W3C is not a rich organisation
... next year TPAC is in Lisbon, Portugal

jianhui: can we invite more participants?

phila: yes - subject to OGC & W3C membership criteria
... the budget is constrained

GB: the current budget covers four attendees from UK ... but I'd be surprised if we are limited to four in reality

phila: Next sub-topic ... how do we increase the impact of this work in China
... we have a small amount of time and money to make progress
... but what next?
... we've talked today about Gamification, commercial interest [and GEO etc.]
... but how does this work in China?

chunming: we could have a local meeting in China once we return to introduce more people to these topics
... we can identify other people to get involved
... and introduce them into the meetings

jianhui: impact? do you mean funding agencies?

phila: in time, yes ... but the UK Foreign Office want to increase collaboration between UK and China
... in this case focused on exploitation (?) of satellite data

<chunming> Guo Huadong

jianhui: cites a potential contact

<chunming> general director of RADI

(see above)

<chunming> Li Wei (former president of Beihang University)

GB: lists other people who he knows who could be influential
... there are 4 or 5 people at an influential level in CAS and Universities who we can engage
... once we have a plan in plain english
... we can think about developing the rhetoric about this project
... stating the benefits- including economic

jianhui: can we hold a high-level half-day workshop in Beijing ahead of the F2F meeting?

GB: could we get president of CAS to host such a meeting?

jianhui: on return to Beijing, I will need to coordinate head of CAS, as well as Ministry of Science & Technology (MOST)

Denise: from government side, who are the key people to engage?
... notes that UN GGIM is co-chaired by China ... would be good to ensure that we coordinate there
... and by Government I really mean Public Service
... GGIM has regional groups ... see GGIM AP (regional group for Asia Pacific)
... that would be the group to engage with in China

Adina: 21AT is a company in China that would be worth engaging (21st Century Aerospace Technology)

GB: Invitation letters are best issued from CODATA

phila: From the Chinese office of CODATA?

GB: likely from Simon H
... the key is that this is not just a show case- but an awareness exercise requesting support

<phila> jtandy: We (Met Office) have extensive relations with the Chinese met office and CAS etc.

<phila> ... I'm not involved with them but I know the colleagues who are

<phila> ... Newton project (coordinated by Royal Society?)

<phila> ... Working on science expertise, not technology expertise

<phila> ... I note that there are (political) constraints on exchange of tech expertise

phila: so we have a long list of people to invite- but, right now, no strong reason for their attendance

GB: so the sequence is:
... 1) get our report out - including the high level message
... 2) use that as the basis for engaging high-level invitees from China

jianhui: then we can organise the first half-day event

chunming: how do the half-day and two-day events relate?

GB: the half-day event happens first
... best to have them in the same location
... [for continuity]

Phila: moves onto the next subtopic
... what will we have achieved by the F2F meeting in Feb?
... all this group is chartered to do is to write a document as input to SDW
... we need to keep expectations low
... but we also have an offer from Maik to build something tangible

GB: what could be built?

MaikRiechert: depends on alignment with MELODIES

phila: so what are you building for MELODIES?

MaikRiechert: lists a few things ... WCS-like APIs, JSON-encoding, harmonising the coverage metadata
... [...]

phila: Thinking of Catalogues [...] using an OGC CSW, do you have one page per dataset?

MaikRiechert: we want to use DCAT in MELODIES - setting up a CKAN instance that uses the GeoDCAT and RDF capabilities

phila: that alone makes the data more discoverable - answering Adina's primary concern

MaikRiechert: yes- that's a minimum that we need

phila: so- we're limited by time (Feb 2016)

GB: the key concern is to communicate why we're doing [this]
... we need to focus on [where the value is] to people remote from us rather than impressing ourselves
... is it right that this work will also enhance MELODIES?

MaikRiechert: yes

phila: and this will be good for the MELODIES product to increase impact

MaikRiechert: I can do the catalogue stuff
... when the metadata is ingested into the catalogue it needs to link to the data
... describing the API or media type etc. for the data
... I want to use a new feature of CKAN to provide a "preview" of the data
... in the catalogue
... This is realistic by February
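
For illustration only, a minimal Python sketch of a catalogue record that links metadata to the data and states its media type, expressed as DCAT in JSON-LD. `dcat:Dataset`, `dcat:distribution`, `dcat:Distribution`, `dcat:downloadURL`, `dcat:mediaType` and `dct:title` are terms from the DCAT and Dublin Core vocabularies; the URIs, the title and the function name `dcat_dataset` are hypothetical, and this is not a description of what MELODIES or CKAN actually produce.

```python
import json


def dcat_dataset(dataset_uri: str, title: str, download_url: str, media_type: str) -> dict:
    """Build a JSON-LD description of a dataset with one distribution."""
    return {
        "@context": {
            "dcat": "http://www.w3.org/ns/dcat#",
            "dct": "http://purl.org/dc/terms/",
        },
        "@id": dataset_uri,
        "@type": "dcat:Dataset",
        "dct:title": title,
        "dcat:distribution": {
            "@type": "dcat:Distribution",
            # Where the data itself can be fetched from.
            "dcat:downloadURL": {"@id": download_url},
            # Tells a client how the data is encoded before it downloads it.
            "dcat:mediaType": media_type,
        },
    }


if __name__ == "__main__":
    record = dcat_dataset(
        "http://example.com/dataset/land-cover",
        "Example land cover coverage",
        "http://example.com/files/land-cover.tif",
        "image/tiff",
    )
    print(json.dumps(record, indent=2))
```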

MaikRiechert: The question of how to represent the data is a bit more tricky

phila: In semantic web you can refer to stuff that doesn't exist ... you can talk about subsets even if you can't [directly] access the subset resource
... asks Jitao if he has capacity to write the document for CEO-LD
... and then to incorporate that into the Coverages deliverable for the SDW WG
... we want to get an advance draft of that document by end March
... we have to be able to show something;
... the best way is to show the CEO-LD output doc, and then how this is used in the broader SDW deliverable

jitao: indicates that he has capacity as a co-editor to work on this document

phila: we may be able to get another editor (a third one) if the need arises
... logistics? do we need telcos?

GB: email until the need for a telco arises

[general agreement]

<phila> Mailing list

phila: I have set up a group mailing list - but this has not been used yet
... everyone is on the list - except Adina ... would you like to be?

Adina: yes please- with caveat that it won't be a massive engagement

phila: notes that the list is publicly visible
... you can get to this email list via the wiki

[Phila spams everyone with a test email]

[oops- mailing list needs some work]

<scribe> ACTION: phila to add everyone to the mailing list [recorded in http://www.w3.org/2015/09/30-ceo-ld-minutes.html#action03]

phila: any other tasks?

GB: yes- to get a feel for ongoing funding [...]
... earlier this year the responses to this proposal were enthusiastic, now I think they would be even more enthusiastic because we have something more tangible
... this is based on the report from this meeting

Cross referencing

<phila> jtandy: We need to focus on... on the kick off meeting list


<phila> ... I'm looking under the structure section, item 3

<phila> ... identifiers for datasets and distributions

<phila> ... we need to recognise that that best practice is being defined elsewhere (DWBP)

<phila> ... however, we do need to make some recommendations about how to identify slices, subsets, samples etc.

<phila> ... How to refer to those

<phila> jtandy: We want to be able to talk about parts of the whole thing

<phila> ... How to discover the dataset - relates to what metadata is needed for that coverage

<phila> ... Maik will be building a catalogue based on CKAN that will eat some GeoDCAT that might be helpful for describing the structure of the coverage

<phila> MaikRiechert: Our dataset is not a coverage, it's a lot of coverages

<phila> ... we don't describe the metadata of an individual coverage with the same things as the collection

<phila> ... we don't include the structure

<phila> jtandy: I think we are going to need to make a recommendation on what structural metadata should be provided

<phila> ... Once we've defined that, we need to determine how we're going to test that. If not MELODIES, who?

<phila> ... what is useful for coverage data?

<phila> MaikRiechert: If there's a rec from this group, how to represent metadata of a coverage, then if it's useful we'll do it in MELODIES

<phila> ... What you search for in our CKAN instance will be the higher level groupings, not the individual coverages

<phila> jtandy: So there may be a follow-on piece of work: once you've discovered the dataset, you can browse to an individual coverage and find out more about that.

<phila> MaikRiechert: The thing that's missing is how do you describe the links between coverages

<phila> ... If you model GeoTIFF as a coverage,

<phila> ... how do you link that to your collection of... in a searchable way

<phila> jtandy: Does that fit within the MELODIES project?

<phila> scribe: phila

MaikRiechert: Not completely sure

<scribe> ACTION: Maik to work with MELODIES partners to spec out what can be produced and how it fits in with CEO-LD [recorded in http://www.w3.org/2015/09/30-ceo-ld-minutes.html#action04]

MaikRiechert: If the metadata is right, it should just work. (Famous last words)

jtandy: rather than the metadata, the coverage data itself - I think that we should provide a rec on the Web-friendly formats that you can use to encode coverage and when their use is sensible
... We have had a long discussion when it is not sensible to use JSON arrays for coverage data (it's too big). Do we think we need to provide guidance on the non-Web-friendly formats
... things that can't be used in a browser or other user agent
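
For illustration only, a minimal Python sketch of why JSON arrays become unwieldy for large coverages: it compares the size of a list of values encoded as JSON text with the same values packed as 32-bit floats. The function name `encoded_sizes` and the sample values are hypothetical, and packing to 32 bits loses precision, so this is a size comparison rather than a like-for-like encoding.

```python
import json
import struct


def encoded_sizes(values: list) -> tuple:
    """Return (json_size, binary_size) in bytes for a list of floats."""
    json_bytes = json.dumps(values).encode("utf-8")
    # Little-endian 32-bit floats: exactly 4 bytes per value.
    binary_bytes = struct.pack(f"<{len(values)}f", *values)
    return len(json_bytes), len(binary_bytes)


if __name__ == "__main__":
    # A made-up row of temperature-like values in kelvin.
    sample = [273.15 + 0.01 * i for i in range(1000)]
    json_size, binary_size = encoded_sizes(sample)
    print(f"JSON: {json_size} bytes, packed float32: {binary_size} bytes")
```

The gap grows with the number of cells, which is one reason a coverage of many millions of values is usually shipped in a binary format, with JSON reserved for small subsets.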

MaikRiechert: Just say that they exist

jtandy: Maybe we should have a list of common formats that we see
... I think it would be interesting to do some work on using PROV-O to describe the processing that has been done to create the coverage
... As far back as the public data allows

Adina: You can usually have a block diagram of the steps used
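
For illustration only, a minimal Python sketch of turning such a chain of processing steps into PROV-O statements. `prov:used`, `prov:wasGeneratedBy` and `prov:wasDerivedFrom` are PROV-O properties; the function name `provenance_triples`, the step names and all URIs are hypothetical.

```python
PROV = "http://www.w3.org/ns/prov#"


def provenance_triples(steps: list) -> list:
    """Describe processing steps as (subject, predicate, object) triples.

    steps is a list of (activity_uri, input_uri, output_uri) tuples in
    processing order, i.e. one tuple per box in a block diagram.
    """
    triples = []
    for activity, source, product in steps:
        triples.append((activity, PROV + "used", source))
        triples.append((product, PROV + "wasGeneratedBy", activity))
        triples.append((product, PROV + "wasDerivedFrom", source))
    return triples


if __name__ == "__main__":
    chain = [
        ("http://example.com/activity/calibrate",
         "http://example.com/data/raw",
         "http://example.com/data/level1"),
        ("http://example.com/activity/regrid",
         "http://example.com/data/level1",
         "http://example.com/data/coverage"),
    ]
    for s, p, o in provenance_triples(chain):
        print(f"<{s}> <{p}> <{o}> .")
```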

jtandy: That's likely to be useful
... we also need examples of how to describe the observation context, the observation platform etc. For that we need the Semantic Sensor Network vocab formalised
... We can't start on this until SSN is under way
... we talked about using the Data Quality Vocabulary

-> http://w3c.github.io/dwbp/vocab-dqg.html Data Quality Vocabulary

MaikRiechert: In CHARME they use Open Annotation Ontology
... Not sure how they refer to a subset of a dataset
... Sounds like a good use case to me

-> http://www.w3.org/annotation/ Web Annotation WG

jtandy: Last thing on the list... what APIs should look like for working with coverage data, especially for the non-expert user, e.g. get stuff by geomapping, by time, by observed property
... We could start by looking at what access patterns people are using.
... One of the themes in the SDW BP is exposing data through APIs and how to do that so we can wait for the broader group to make some recommendations
... So we can pick off things we can start now without trying to do everything. So for now we should talk about:

- The metadata for describing a coverage structure

- metadata for it

MaikRiechert: That's what you'll index

jtandy: And understand how tools will use that to help people understand what the coverage contains
... The other one we can progress with immediately is how to identify a subset of a coverage (slice, cell, region)
... If we can do that then we can add value straight away
... That covers SSTL's priority for example

MaikRiechert: Can you have a DCAT dataset that contains other datasets?

jtandy: VOID has a subset relationship
... Those are tangible things that we can deliver
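
For illustration only, a minimal Python sketch of using the VoID subset relationship to say that a collection contains individual coverages, written out as N-Triples lines. `void:subset` is a property of the VoID vocabulary; whether it is the right choice for DCAT datasets was not settled in the discussion, and the URIs and the function name `subset_triples` are hypothetical.

```python
VOID_SUBSET = "http://rdfs.org/ns/void#subset"


def subset_triples(collection_uri: str, member_uris: list) -> list:
    """Return one N-Triples line per member, linking the collection to it."""
    return [f"<{collection_uri}> <{VOID_SUBSET}> <{member}> ." for member in member_uris]


if __name__ == "__main__":
    lines = subset_triples(
        "http://example.com/dataset/land-cover",
        [
            "http://example.com/dataset/land-cover/2014",
            "http://example.com/dataset/land-cover/2015",
        ],
    )
    print("\n".join(lines))
```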

MaikRiechert: Is there a definition of a coverage? Can you have multiple range sets?

jtandy: Yes

phila: If we send a mail to the ML asking a question, who is going to answer?
... Asks for Chinese input

chunming: I think we can choose some of the platforms or projects to link to this
... Find use cases and issues

MaikRiechert: How much do you use ontologies and RDF in general?

jianhui: We use them but we don't generally define them
... We use RDF to handle genomic data but not geospatial
... It's possible that when I get back, my colleague in charge of system development can maybe work with Maik and do some demonstrations with you

MaikRiechert: Yes, we're just experimenting with semantics etc.

Meeting adjourned

Summary of Action Items

[NEW] ACTION: Adina to provide list of relevant software packages [recorded in http://www.w3.org/2015/09/30-ceo-ld-minutes.html#action02]
[NEW] ACTION: Maik to work with MELODIES partners to spec out what can be produced and how it fits in with CEO-LD [recorded in http://www.w3.org/2015/09/30-ceo-ld-minutes.html#action04]
[NEW] ACTION: phila to add everyone to the mailing list [recorded in http://www.w3.org/2015/09/30-ceo-ld-minutes.html#action03]
[NEW] ACTION: PhilA to ask BillR about potential use cases for subsets of a coverage [recorded in http://www.w3.org/2015/09/30-ceo-ld-minutes.html#action01]
[End of minutes]