Re: ISSUE-7 (whyAccessUrl): Drop dcat:accessUrl, use the URI of the dcat:Download resource instead [DCAT] from Sarven Capadisli on 2012-02-10 (public-gld-wg@w3.org from February 2012)

From: Sarven Capadisli <sarven.capadisli@deri.org>
Date: Fri, 10 Feb 2012 04:53:19 +0000
To: Richard Cyganiak <richard@cyganiak.de>
CC: Government Linked Data Working Group WG <public-gld-wg@w3.org>, Government Linked Data Working Group Issue Tracker <sysbot+tracker@w3.org>
Message-ID: <4F34A2BF.4050000@deri.org>

On 12-02-09 08:56 PM, Richard Cyganiak wrote:
> On 9 Feb 2012, at 15:31, Sarven Capadisli wrote:
>> I agree with Ed Summers' proposal to drop dcat:accessURL and simply use dcat:distribution.
>
> The problem is that accessURL is optional – some datasets are not distributed online but through other means. In that case, what would be used as the node representing the dcat:Distribution? A blank node? A made-up URI? So there needs to be a special case in both production and consumption code to deal with this case, and that just increases the probability that implementers with less RDF experience get it wrong.

I find Ed's example [3] fairly straightforward:

ex:dataset1
     a dcat:Dataset ;
     dcat:distribution <http://example.gov/downloads/1> .

<http://example.gov/downloads/1>
     dcat:format "text/csv" .

No bnode or made-up URI is needed. As far as I understand, this approach
can still be used to describe any distribution method.

>> With the exception of allowing dcat:distribution to have an HTML page as one of the ranges, it is actually similar to void:dataDump [1].
>>
>> On that note, and perhaps this is a separate issue in and of itself, we should drop the HTML page for the downloads as a possibility.
>
> The problem is that a number of catalogs don't distinguish between “direct download link” and “link to a web page that explains where and how to get the data”. A notable example is data.gov.uk. If we forbid the use of an HTML page as accessURL, then data.gov.uk cannot use dcat without re-annotating their entire catalog.

That's a fair concern, and I agree that it is important to allow several
ways to get to the data. I retract the idea of allowing only data dumps.
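
As a rough sketch of what that could look like, here is one dataset with
both a direct download and an HTML page as distributions. The ex:
namespace and both example.gov URLs are made up for illustration, and I'm
assuming http://www.w3.org/ns/dcat# as the dcat: namespace.

```turtle
@prefix dcat: <http://www.w3.org/ns/dcat#> .    # assumed namespace
@prefix ex:   <http://example.gov/datasets/> .  # hypothetical namespace

ex:dataset1
     a dcat:Dataset ;
     dcat:distribution <http://example.gov/downloads/1> ,
                       <http://example.gov/datasets/1/how-to-get-it> .

# Direct download link, usable by automated spiders.
<http://example.gov/downloads/1>
     a dcat:Download ;
     dcat:format "text/csv" .

# Web page that explains where and how to get the data (hypothetical URL).
<http://example.gov/datasets/1/how-to-get-it>
     dcat:format "text/html" .
```

A consumer that only wants machine-readable files could then filter on
dcat:format or on the class of the distribution node.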

>> VoID's dataDump definition has a note on this that's worth considering:
>>
>> "The void:dataDump property should not be used for linking to a download web page. It should only be used for linking directly to dump files. This is to ensure that the link can be used by automated spiders that cannot find their way through an HTML page. If a publisher desires to provide a link to a download page as well, then they should use the foaf:page property instead."
>>
>> If the possible range is unclear or confusing for the publisher (as raised by Fadi Maali [2]), we should reconsider the term. I personally find something along the lines of "dataDump" to be far clearer about its intent than "download", "access" or "distribution".
>
> There are many uses of accessURL that simply are not covered by the term “data dump”. Consider RSS feeds, SPARQL endpoints, PDF reports, …

I brought up "data dump" in the context of direct download links, where
I was comparing it with the other candidate terms.

Here is an example for a SPARQL service:

ex:dataset1
     a dcat:Dataset ;
     dcat:distribution <http://example.gov/sparql> .

<http://example.gov/sparql>
     a dcat:WebService .
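
Along the same lines, a feed could presumably be described by typing the
distribution node. This is only a sketch: I'm assuming dcat:Feed is
available alongside dcat:WebService and dcat:Download, the feed URL and
the ex: namespace are made up, and http://www.w3.org/ns/dcat# is assumed
as the dcat: namespace.

```turtle
@prefix dcat: <http://www.w3.org/ns/dcat#> .    # assumed namespace
@prefix ex:   <http://example.gov/datasets/> .  # hypothetical namespace

ex:dataset1
     a dcat:Dataset ;
     dcat:distribution <http://example.gov/feeds/dataset1.rss> .

# Hypothetical feed URL; dcat:Feed is assumed to exist in the vocabulary.
<http://example.gov/feeds/dataset1.rss>
     a dcat:Feed ;
     dcat:format "application/rss+xml" .
```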

[3] http://lists.w3.org/Archives/Public/public-egov-ig/2010May/0060.html

-Sarven

> Best,
> Richard
>
>
>
>>
>> [1] http://www.w3.org/TR/void/#dumps
>> [2] http://lists.w3.org/Archives/Public/public-egov-ig/2010Jun/0004.html
>>
>> -Sarven
>>
>

Received on Friday, 10 February 2012 04:53:51 UTC