[dxwg] Two things that our profiles are not (#976)

aisaac has just created a new issue for https://github.com/w3c/dxwg:

== Two things that our profiles are not ==
Wrt [Action 342](https://www.w3.org/2017/dxwg/track/actions/342) from the [July 2 call](https://www.w3.org/2019/07/02-dxwg-minutes.html), here's a super quick attempt to introduce two "profile" notions that should be excluded from our scope. 
Comments are really welcome - I had to rush this as I probably won't have time to work on it between now and next week.

1. "Syntactic profiles".
These are variations of a data structure that do not alter its semantics at all. The same properties and classes are used; only their arrangement and the use of abbreviations differ. The MIME type does not even change, i.e. there can be different profiles of exactly the same data (statements) in JSON.
[JSON-LD profiles](https://www.w3.org/TR/json-ld11/#forms-of-json-ld) are an example of this.
It is revealing that while JSON-LD has kept the name of its parameter as "profile", the specification focuses more on the term "form".
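
To make the point concrete, here is a rough sketch (not a real JSON-LD processor) of the same single statement written in compacted and expanded form. The example IRIs and the `statements` helper are made up for illustration; the helper only handles these two toy shapes.

```python
# Both documents would be served as application/ld+json; only the "profile"
# parameter (e.g. http://www.w3.org/ns/json-ld#compacted vs #expanded) differs.
compacted = {
    "@context": {"title": "http://purl.org/dc/terms/title"},
    "@id": "http://example.org/dataset/1",
    "title": "Example dataset",
}

expanded = [
    {
        "@id": "http://example.org/dataset/1",
        "http://purl.org/dc/terms/title": [{"@value": "Example dataset"}],
    }
]


def statements(doc):
    """Toy extraction of (subject, property, value) statements.

    Hypothetical helper: it resolves terms through a flat @context and
    unwraps {"@value": ...} objects, nothing more.
    """
    nodes = doc if isinstance(doc, list) else [doc]
    result = set()
    for node in nodes:
        context = node.get("@context", {})
        subject = node["@id"]
        for key, value in node.items():
            if key.startswith("@"):
                continue  # skip keywords such as @context and @id
            prop = context.get(key, key)  # expand an abbreviated term if needed
            values = value if isinstance(value, list) else [value]
            for v in values:
                literal = v["@value"] if isinstance(v, dict) else v
                result.add((subject, prop, literal))
    return result


# Different arrangement and abbreviations, identical statements.
same_data = statements(compacted) == statements(expanded)
```

In this sketch `same_data` comes out true: the two forms differ in layout only, which is why this kind of "profile" says nothing about which properties and classes the data must use.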

2. "Data profiling" as a data analysis activity (cf [Wikipedia](https://en.wikipedia.org/wiki/Data_profiling))
This activity aims at providing insight into data, by examining what it contains, often producing statistics about it.
Data profiling can produce listings of classes and properties, and even infer axioms that hold for the data that is examined. It may also conclude that the data conforms to a given data specification or profile (according to the DXWG definitions for these terms). However, the data profiles produced by this activity are post hoc. The profiling activity works on data distributions (in the DCAT sense) that are given as input, and tries to reconstruct the data models behind them, so to speak. It could be that data profiling recognizes that the given data conforms to a pre-existing specification. It could also happen that a data publisher uses such a conclusion to add a new CONNEG option to their portfolio, arguing that the data they serve is compatible with yet another specification. But this is then a by-product of the data profiling process. Basically, data profiling is not prescriptive about what's in the data, while the DXWG profiles and specifications are artifacts that probably always have this ambition.
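
As a rough illustration of this post-hoc character, the sketch below profiles a handful of triples by counting the classes and properties that actually occur. The sample data, the `profile_triples` and `uses_all` helpers, and the idea of reducing a specification to a set of required properties are all simplifying assumptions, not anything defined by DXWG.

```python
from collections import Counter

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
DCAT_DATASET = "http://www.w3.org/ns/dcat#Dataset"
DCT_TITLE = "http://purl.org/dc/terms/title"
DCT_DESCRIPTION = "http://purl.org/dc/terms/description"

# Made-up input distribution, already parsed into (subject, predicate, object).
triples = [
    ("http://example.org/d1", RDF_TYPE, DCAT_DATASET),
    ("http://example.org/d1", DCT_TITLE, "First dataset"),
    ("http://example.org/d2", RDF_TYPE, DCAT_DATASET),
    ("http://example.org/d2", DCT_TITLE, "Second dataset"),
    ("http://example.org/d2", DCT_DESCRIPTION, "Has a description"),
]


def profile_triples(data):
    """Count the classes and properties observed in the data."""
    class_counts = Counter()
    property_counts = Counter()
    for _subject, predicate, obj in data:
        if predicate == RDF_TYPE:
            class_counts[obj] += 1
        else:
            property_counts[predicate] += 1
    return class_counts, property_counts


def uses_all(property_counts, required_properties):
    """Very crude after-the-fact check: does every required property occur?"""
    return all(property_counts[p] > 0 for p in required_properties)


class_counts, property_counts = profile_triples(triples)
looks_compatible = uses_all(property_counts, {DCT_TITLE})
```

The counts describe what happens to be in the data; nothing here tells a publisher what the data should contain, which is the job of a profile or specification in the DXWG sense.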

Please view or discuss this issue at https://github.com/w3c/dxwg/issues/976 using your GitHub account

Received on Tuesday, 2 July 2019 21:30:50 UTC