Re: DQV, DAQ and Data Cube graphs

Hi Antoine, Riccardo, Christophe,

Sorry for my late reply but I got stuck on other things. I’ll try to group replies from all emails in one.

(1) Compatibility between daQ and DQV

If we resolve ISSUE-180 [1] by not re-using directly the DaQ elements, we can solve both issues at once. Here are two proposals from me:

PROPOSAL 1: Replace daq:QualityGraph by a new class (say, dqv:QualityMeasureGraph)
PROPOSAL 2: Drop daq:QualityGraph and represent quality measures in graphs of the same class as the other quality metadata (i.e., graphs of type dqv:QualityMetadata)

I think if dqv:QualityMetadata and daq:QualityGraph are equivalent classes, then both will be compatible because they will have the same set of individuals. Please correct me if I’m wrong.

We still have the issue of backward compatibility between DQV and DAQ. I guess this issue might be solved by defining daq:QualityGraph as a subclass of dqv:QualityDataset; we might discuss this with the DAQ designers (@Jeremy, what do you think? Does it work?)

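I don’t think subclasses will work because instances will only be compatible one way (i.e., either DQV -> DAQ or DAQ -> DQV): every instance of the subclass is also an instance of the superclass, but not the other way round.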
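
To make the one-way point concrete, here is a minimal Python sketch of class-membership inference. It is not a real reasoner: it only follows rdfs:subClassOf axioms and treats owl:equivalentClass as subclassing in both directions. The function names and the two axiom sets are assumptions made for illustration, not part of daQ or DQV.

```python
def inferred_types(asserted_type, axioms):
    """Return the asserted class plus every superclass reachable from it.

    `axioms` is a list of (subclass, superclass) pairs.
    """
    seen = {asserted_type}
    stack = [asserted_type]
    while stack:
        current = stack.pop()
        for sub, sup in axioms:
            if sub == current and sup not in seen:
                seen.add(sup)
                stack.append(sup)
    return seen


def equivalent(a, b):
    """Model owl:equivalentClass as subclass axioms in both directions."""
    return [(a, b), (b, a)]


# Option A (illustrative): plain subclassing, one direction only.
subclass_only = [("dqv:QualityMetadata", "daq:QualityGraph")]

# Option B (illustrative): the two classes are declared equivalent.
equivalent_classes = equivalent("dqv:QualityMetadata", "daq:QualityGraph")

# A DQV instance is also a daQ instance under both options ...
print(inferred_types("dqv:QualityMetadata", subclass_only))
# ... but a daQ instance becomes a DQV instance only under equivalence.
print(inferred_types("daq:QualityGraph", subclass_only))
print(inferred_types("daq:QualityGraph", equivalent_classes))
```

Under option A, data typed with the daQ class is not recognised as DQV data, which is the one-way compatibility described above; under option B both classes end up with the same individuals.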

Therefore, in order to keep the data cube feature without using rdfg:Graphs, one possible solution is as Riccardo proposed:

Proposal 4: define in DQV the new class dqv:QualityDataset, which replaces daq:QualityGraph and is defined as a subclass of qb:DataSet, but not as a subclass of rdfg:Graph.

but making dqv:QualityDataset a subclass of qb:DataSet using the cube data structure definition defined in daQ (daq:dsd).

If what I’m saying is correct, then I think it will make daQ and DQV compatible with each other. What do you think?

On the other hand, if you think that the compatibility from DQV to daQ is not important, then subclassing should be enough.


(2) Usage of rdfg:Graph in daQ and Provenance

You are right, we don't need to keep daq:QualityGraph as an rdfg:Graph, but we still need a subclass of qb:DataSet as the range of the property qb:dataSet to facilitate the visualisation of the data as an RDF cube.

The idea of storing quality metadata as graphs was to make a distinction between the data itself and the metadata. In my opinion it also makes crawling and querying easier, whilst also tracking the provenance of quality observations. Each daQ observation is also a PROV-O entity [1].
I agree that provenance at different granularity levels - as Nandana wrote in his reply - is required.
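
As a rough illustration of the separation described above, the following Python sketch stores statements as quads (subject, predicate, object, graph) and selects them by graph name. All ex: names, the ex:value property and the example value are hypothetical; daq:Observation and prov:wasGeneratedBy are used as I understand them from daQ and PROV-O, so treat the details as assumptions.

```python
# Each quad is (subject, predicate, object, graph). Graph names are hypothetical.
quads = [
    # The data itself.
    ("ex:dataset", "rdf:type", "dcat:Dataset", "ex:dataGraph"),
    # Quality metadata, kept in its own graph.
    ("ex:obs1", "rdf:type", "daq:Observation", "ex:qualityGraph"),
    ("ex:obs1", "ex:value", "0.93", "ex:qualityGraph"),
    ("ex:obs1", "qb:dataSet", "ex:qualityGraph", "ex:qualityGraph"),
    # Provenance at two granularity levels: the whole graph and one observation.
    ("ex:qualityGraph", "prov:wasGeneratedBy", "ex:assessmentRun", "ex:provGraph"),
    ("ex:obs1", "prov:wasGeneratedBy", "ex:assessmentRun", "ex:provGraph"),
]


def triples_in(graph, quads):
    """Return the triples stored in one named graph."""
    return [(s, p, o) for s, p, o, g in quads if g == graph]


def graphs_describing(subject, quads):
    """Return the sorted names of all graphs that say something about a subject."""
    return sorted({g for s, _, _, g in quads if s == subject})


# Quality metadata can be fetched without touching the data graph.
quality_triples = triples_in("ex:qualityGraph", quads)

# Provenance is available for the graph as a whole and for a single observation.
graph_provenance = graphs_describing("ex:qualityGraph", quads)
observation_graphs = graphs_describing("ex:obs1", quads)
```

A crawler that only wants quality metadata can request the one graph, while provenance statements can target either the graph or an individual observation.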

Cheers,
Jer

[1] http://purl.org/eis/vocab/daq

On 27 Aug 2015, at 01:12, Antoine Isaac <aisaac@few.vu.nl> wrote:

Dear all,

While preparing the last mail, Riccardo and I started a longer discussion on which the group's input would be welcome.

This is about the following issues:

http://www.w3.org/2013/dwbp/track/issues/181 - Should we have only the existing class daq:QualityGraph or keep the new class dqv:QualityMetadata?
http://www.w3.org/2013/dwbp/track/issues/182 - The label of daq:QualityGraph does not fit well with the current model



Riccardo has pointed out that we should also keep in mind issue 191 [2].
DaQ has been made consistent with RDF Data Cube (qb: namespace [5]): daq:QualityGraph is a subclass of qb:DataSet, so that the results of the quality measures can be visualised by an RDF cube visualiser (see [3]). This is a very useful feature and he thinks we should preserve it in DQV. And I agree.
So when dropping daq:QualityGraph, we have to think about where to put the qb:DataSet subclassing and how to rearrange the qb:dataSet property in our graph at [4].

Riccardo suggests another (orthogonal) proposal:

PROPOSAL 3: define dqv:QualityMetadata as a subclass of daq:QualityGraph.

With this we keep compatibility with DaQ and Data Cube. We don't have nested graphs anymore - only a graph grouping together all the measures and annotations, and whose provenance can be easily tracked.

The problem is semantics: qb:DataSet is defined as a "collection of statistical data" and a daq:QualityGraph is meant to "contain all metadata about quality metrics on the dataset". So these are rather numerical observations, while our dqv:QualityMetadata can include more diverse metadata, for example textual annotations.

What do you think?

Best,

Antoine

[1] http://www.w3.org/2013/dwbp/track/issues/180
[2] http://www.w3.org/2013/dwbp/track/issues/191
[3] http://eis-bonn.github.io/Luzzu/papers/semantics2014.pdf
[4] http://www.w3.org/TR/vocab-dqv/#vocabulary-overview
[5] http://www.w3.org/TR/vocab-data-cube/

Received on Thursday, 3 September 2015 11:24:30 UTC