Spatial Data on the Web Working Group Teleconference -- 23 Mar 2016

<scribe> Meeting: SDW coverages

<phila> issue-1?

<trackbot> issue-1 -- Blah blah test -- closed

<trackbot> http://www.w3.org/2015/spatial/track/issues/1

<scribe> scribe: kerry

<scribe> scribeNick: Kerry

<billroberts> https://www.w3.org/2016/03/09-sdwcov-minutes

<phila> PROPOSED: Accept last week's minutes

<billroberts> +1

<jtandy> +1

<eparsons> +0 Not there

Proposed: accept last week's minutes https://www.w3.org/2016/03/09-sdwcov-minutes

<phila> +1

RESOLUTION: Accept last week's minutes

<billroberts> https://www.w3.org/2015/spatial/wiki/Patent_Call

<billroberts> https://www.w3.org/2015/spatial/wiki/Meetings:Coverage-Telecon20160323

Review of use cases

welcome to editor Maik Reichert

topic: Review of use cases (action)

<billroberts> https://www.w3.org/2015/spatial/wiki/Coverage_UCR_notes

billroberts: at bottom in summary see that subsetting came out a lot
... assign an identifier to a subset of a coverage of a dataset
... also for provenance so you can point to how the processing happened
... the question of delivering a full coverage is a special case of delivering a subset -- if we address addressing and formatting it will be solved
... also some use cases for point cloud and time series -- need to keep these in mind
... also note that the region of interest might be complicated, not just a bounding box, may be polygon or tunnel underground
... any comments?

jtandy: they are the things I can recall

phila: note the way subsetting tumbles out because we are struggling in dwbp to say something that is *not* spatially-specific
... dwbp does not have good use cases for this

billroberts: also we have time subsets and variable subsets

<Zakim> jtandy, you wanted to query predefined subsets or on-the-fly query

<eparsons> jtandy

jtandy: we had a long email thread on subsetting for BP
... one kind is subsetting for useful chunks to be manageable (a predefined set)
... other kind is an on-the-fly query chunk
... we need both
... rdf datacube does predefined type but not query type

billroberts: datacube can be used for query-type but perhaps less flexible

jtandy: when i assign an identifier to a subset it could be anything
... but a query type identifier is also an api, effectively
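
A minimal sketch of the two kinds of subset identifier jtandy distinguishes: a predefined chunk with a fixed identifier, and an on-the-fly query whose identifier is effectively an API call. The base URI, the subset names and the `_min`/`_max` parameter convention are all hypothetical, chosen only for illustration; nothing here was agreed in the meeting.

```python
from urllib.parse import urlencode

# Hypothetical base URI for a coverage dataset (an assumption, not from the minutes).
BASE = "http://example.org/coverages/temperature"

# Predefined subsets: the publisher decides the chunks up front.
PREDEFINED = {
    "2016-03": ("2016-03-01", "2016-03-31"),
    "2016-04": ("2016-04-01", "2016-04-30"),
}


def predefined_uri(name):
    """Return the fixed identifier of a publisher-defined subset."""
    if name not in PREDEFINED:
        raise KeyError(name)
    return f"{BASE}/subsets/{name}"


def query_uri(**ranges):
    """Build an identifier for an on-the-fly subset from (low, high) ranges.

    Each keyword is a dimension name; dimensions are sorted so the same
    query always yields the same identifier.
    """
    params = []
    for dim, (low, high) in sorted(ranges.items()):
        params.append((f"{dim}_min", low))
        params.append((f"{dim}_max", high))
    return f"{BASE}?{urlencode(params)}"


print(predefined_uri("2016-03"))
print(query_uri(time=("2016-03-16", "2016-03-23"), lat=(-10, 10)))
```

The second function shows why a query-type identifier behaves like an API: the set of possible identifiers is open-ended and the client constructs them.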

<phila> kerry: I hate us calling it subsetting given all the different dimensions that we need to talk about

kerry: does not like "subsetting"

<phila> Discussion between phila and kerry about whether audience for Coverages doc is only spatial folks

<phila> kerry: How about 'sub coverage?'

<phila> billroberts: That makes sense to me

<phila> phila: Doesn't like 'sub coverage'

<scribe> ACTION: kerry to present some suggestions for renaming "subsetting" [recorded in http://www.w3.org/2016/03/23-sdwcov-minutes.html#action01]

<trackbot> Created ACTION-152 - Present some suggestions for renaming "subsetting" [on Kerry Taylor - due 2016-03-30].

<BernadetteLoscio> yes!

<billroberts> http://w3c.github.io/dwbp/bp.html#EnableDataSubsetting

DWBP subsetting

BernadetteLoscio: we have a proposal as in the irc, but it is difficult to test
... it is generic and important but there are different approaches
... e.g. apis, queries
... we are not sure whether we should have this as a bp or to just describe it
... what would be helpful to you and how would it be testable?

<Zakim> jtandy, you wanted to ask if you could cover 'subsetting' as an example operation in your API

jtandy: when I look at subsetting I think it is one example of the way you could work with data... there are other BP about offering an API in DWBP
... data subsetting makes a lot of sense for slices for statistical, etc, but when I look more generically it is really just an operation you provide through an API
... could be just an illustrative example
... but it makes a lot of sense for time series and satellite data (somehow differently)

bernadette: should we also talk about subsetting for download?

jtandy: you should also talk about the data you take away after downloading
... I would suggest when working with large datasets a typical use case would be an api to select parts of that dataset
... difficult for you to reference what we do, but I suggest just describe an illustrative example of a convenience API

BernadetteLoscio: perhaps we can talk about subsetting along with downloads as another example

billroberts: the problem with api/query is that it is futile to specify upfront what it should look like in general
... maybe all we can do is say "you need an API" or else we end up inventing yet another query language
... needs to be up to the data provider

jtandy: agrees

phila: <moved us with his absent speech>

<phila> phila: Requirement no. 1 can assign an identifier to a subset of a coverage dataset

phila: we have been saying "you just give it a uri", although a uri *is* an api
... for bulk download is it useful to say you can use the api and you can give it an example of its own, e.g. meteorological data for the last week
... should this go in dwbp or sdw?
... should dwbp do this ... your first ucr says you need to assign an identifier to a subset

billroberts: yes it would be useful

jtandy: it makes sense for dwbp to provide some advice -- if you have data that is too big for a web application then provide a mechanism to get hold of bits of it
... eg. using predefined slices or an API
... test by "here is a massive dataset -- can you work with it in a browser app?"

billroberts: use cases where this emerged was wanting to attach some metadata to it, something that is the full set, not a subset

<phila> is that helpful newton_dwbp?

<newton_dwbp> I liked jtandy point

billroberts: need to look again at email thread on this, any other comments?

BernadetteLoscio: we like jtandy's idea and will bring to our dwbp discussion. thank you very much

RDF datacube action

billroberts: which aspects of rdf datacube would be good for defining subsets?

<Zakim> jtandy, you wanted to note qb:slice

billroberts: bill will write note on pros and con of datacube and mechanisms that would be helpful for subsets

dmitrybrizhinev: ... we are a group of students working on an example implementation for coverages, we are worried about verbosity of datacube
... flipside is taking a subset with lots of granularity with a sparql query is useful but verbose
... i have been converting the CoverageJSON to rdf but this is the query... is there a best of both worlds?

billroberts: please share anything written up

jtandy: agree about way too verbose, jonblower keeps saying this cannot be used to carry the data, but the metadata might be useful

<jtandy> https://www.w3.org/TR/vocab-data-cube/#slices

jtandy: for describing subsets there is qb:slice and also a mechanism for creating arbitrary groups in the spec
... leaving the data in a dense array is arguably no different to the way we deal with geospatial stuff all the time, eg geometry objects in WKT or in GML
... because we want to treat the whole geometry as an object (we don't break it up), the same can apply to a dense array of data
... in the same way geosparql can provide operations on data, when we are working with coverage data in a webby form we need to provide some additional mechanism for querying inside

billroberts: e.g. 75th point of array needs to be accessible, and you need some coordinates that stick with the points... that kind of conciseness is needed for whole grid but when there are only bits it may work well
... datacube could work well itself for a small subset if not the entire grid

jtandy: if you just want ith column and jth row ...

<phila> kerry: were you suggesting, Bill, that the QB model could be used as a response format for a query over a bigger set

<phila> billroberts: Not precisely, but that structure of an observation

<phila> ... If you just have one data point, you need all the dimensional info and the metadata. Some metadata applies to the whole dataset, some to a specific point.

<phila> ... If you have a grid, you don't need all the coords 'cos you can work them out but a point cloud does need them.
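
A minimal sketch of billroberts's point that a regular grid does not need to carry every coordinate while a point cloud does. The grid parameters (origin, step, row length) and the sample values are invented for illustration, and a row-major dense array with zero-based indices is an assumption.

```python
def grid_position(flat_index, n_cols, x0, dx, y0, dy):
    """Derive (x, y) for a cell of a row-major dense array on a regular grid.

    Only the origin (x0, y0), the steps (dx, dy) and the row length are
    stored; every coordinate is worked out from the flat index.
    """
    row, col = divmod(flat_index, n_cols)
    return (x0 + col * dx, y0 + row * dy)


# Regular grid: 10 columns, 0.5-degree cells, origin at the top-left corner.
# Zero-based flat index 75 falls in row 7, column 5.
print(grid_position(75, n_cols=10, x0=0.0, dx=0.5, y0=50.0, dy=-0.5))

# Point cloud: no such rule exists, so each value carries its own coordinates.
point_cloud = [
    (2.31, 46.87, 12.4),  # (x, y, value)
    (2.95, 47.02, 11.9),
]
for x, y, value in point_cloud:
    print(x, y, value)
```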

billroberts: the structure of an observation is very useful for datacube way

jtandy: index space querying , natural coord subsetting, more work to do here...
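
One way to relate the two kinds of subsetting jtandy mentions, sketched under the assumption of a single regular axis with ascending coordinates: a request in natural coordinates is translated into an index range over the dense array. The function name and parameters are hypothetical.

```python
import math


def coord_range_to_index_range(low, high, origin, step, size):
    """Translate a coordinate interval [low, high] into a half-open
    index range (start, stop) on a regular ascending axis.

    Index i sits at coordinate origin + i * step, for 0 <= i < size.
    An interval that misses the axis yields an empty range (start == stop).
    """
    if step <= 0:
        raise ValueError("step must be positive for an ascending axis")
    start = min(size, max(0, math.ceil((low - origin) / step)))
    stop = min(size, math.floor((high - origin) / step) + 1)
    return (start, max(start, stop))


# Axis with 20 cells at 0.0, 0.5, ..., 9.5.
print(coord_range_to_index_range(2.0, 4.0, origin=0.0, step=0.5, size=20))
print(coord_range_to_index_range(2.1, 3.9, origin=0.0, step=0.5, size=20))
```

The resulting pair can be used directly as a slice of the array, which is the index-space form of the same request.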

phila: what proportion of coverage data is on a regular grid?
... I am thinking of those with only 2 or 3 lines with regular definition and you can work the rest out
... in such cases a template uri could be generated that does identify a "slice"
... so we could say "if you have a regular grid pattern this is how you generate the uri template"
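
A minimal sketch of what phila's suggestion could look like, assuming a regular grid described only by the size of each axis. The template string, the dataset name and the axis names are hypothetical; the group did not settle on any URI pattern.

```python
# Hypothetical URI template for a slice of a regular grid (an assumption).
TEMPLATE = "http://example.org/coverages/{dataset}/slice/{axis}/{index}"


def slice_uri(dataset, axis, index, axis_sizes):
    """Generate the identifier of one slice along one axis.

    axis_sizes maps each axis name to its number of cells, which is all
    that is needed to know which slice identifiers are valid.
    """
    if axis not in axis_sizes:
        raise ValueError(f"unknown axis: {axis}")
    if not 0 <= index < axis_sizes[axis]:
        raise IndexError(f"index {index} outside axis {axis}")
    return TEMPLATE.format(dataset=dataset, axis=axis, index=index)


sizes = {"time": 7, "lat": 180, "lon": 360}
print(slice_uri("sst-weekly", "time", 3, sizes))
```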

<Zakim> jtandy, you wanted to respond to phila's question about regular grids

jtandy: yes it is a large fraction by volume and number of datasets, eg satellite imagery,
... but there are other important cases such as in-situ observations by radiosondes or buoys or gliders, where irregular coverages happen more (like opendap/netcdf index-based subsetting)

eparsons: agrees. my meta-question is, where is the stuff with a more semantic approach -- are we just reinventing the wheel of tools in other places?

jtandy: phil had said data is easy, metadata is challenge. the metadata is the bit to get the advantage of linked data such as what you are measuring etc
... metadata as linked data is key, then something else for dense arrays of data

eparsons: so let's not get worked up on data size then as there are other approaches

billroberts: index array of data is good approach for some stuff, but we need to think harder about others

Jeremy Stole our time slot

billroberts: jeremy stole our timeslot
... we had been proposing to follow the main group for time changes

meeting times

kerry: kick them off!

s/Jermey/Jeremy/

<phila> May I suggest a Doodle poll?

billroberts: moans about "the australian issue"

+1 to bill's idea

<phila> 16:00 CEST, 15:00 BST etc.

billroberts: meeting close

<billroberts> bye everyone
