dct:language range WAS: ISSUE-2 (olyerickson): dct:language should be added to DCAT [Best Practices for Publishing Linked Data]

On 9 December 2011 00:39, Richard Cyganiak <richard@cyganiak.de> wrote:
> On 8 Dec 2011, at 20:36, Stasinos Konstantopoulos wrote:
>> I also find the use of URIs more agreeable than plain literals.
>>
>> Notice that lingvoj.org recommend using the lexvo.org vocabulary over
>> their own. lexvo.org have published URIs derived from ISO-639 codes,
>> eg http://www.lexvo.org/page/iso639-3/ell for Modern Greek. These are
>> instances of http://lexvo.org/ontology#Language which is, conveniently
>> enough, a subClassOf http://purl.org/dc/terms/LinguisticSystem [1].
>
> So what do we say in dcat then? Do we say that Lexvo URIs should be used in dcat?
>
> That would bring us right back to some of the issues we discussed in the call today. Should we ask governments to rely on a service that is provided by an individual without any kind of organisation behind him?
>
> Another option would be to recommend that a URI be used to identify languages, but leaving it to each publisher what URI to use. This would probably mean that many catalogs mint and use many different URIs for the same thing, e.g., the English language.
>
> One more idea would be to rely on the xsd:language datatype of XML Schema, for example:
>
>    [] dcterms:language "en"^^xsd:language.
>
> The lexical representations of this datatype are again taken from BCP 47 (in XSD 1.1; it was RFC 3066 in XSD 1.0).
>
> The datatype would ensure that there is no doubt about how to interpret the string "en".
>
> Best,
> Richard

There are various alternatives that the group might want to consider,
depending on how specific we want to be and on whether we insist on
defining dct:language semantics in RDFS or would also accept a natural
language definition. Some ideas:

1. Specify explicitly that dct:language rdfs:range
http://lexvo.org/ontology#Language . This would be, IMHO, ideal if
this vocabulary were maintained by ISO themselves but that is not the
case. There are other alternatives, such as ISO 639 RDF [1] and SIL
[2], but I think lexvo is most appropriate.

2. Specify only that dct:language rdfs:range
http://purl.org/dc/terms/LinguisticSystem , which is less restrictive
and would allow one to immediately switch to whatever supersedes
lexvo, but leaves a software agent not much wiser about what to expect
as a value of dct:language.

3. Specify, in natural language, that the range can be any vocabulary
with a direct and obvious mapping to ISO 639. Very flexible, but again
not very informative for software agents; unless there is a way to
provide this mapping in a machine-friendly notation like a regular
expression.

4. Specify our own vocabulary in W3C space, combining the advantages
of (1) with persistent URIs. W3C started this discussion some years
back [3], but there seems to be no concrete outcome; please shout if I
have overlooked something.

Some further thoughts on combining (3) and (4): ISO 639 is actively
maintained, and the RDF vocabulary needs to be updated every time a
language is added. Since (again, please shout if I am wrong) ISO do
not publish an RDF version of ISO 639, it seems to me that it makes
more sense for this group to define a persistent languages namespace
within W3C, plus a regular expression that maps from that namespace to
ISO 639, so that nothing ever needs to change in order to stay up to
date with ISO 639.
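
To make the regular-expression idea concrete, here is a minimal Python
sketch. Everything named in it is an assumption for illustration only:
the namespace http://example.org/lang/ stands in for whatever namespace
W3C might define, and the function names are hypothetical. The pattern
checks only the shape of a code (two letters as in ISO 639-1, three as
in ISO 639-2 and 639-3), not whether ISO has actually assigned it.

```python
import re

# Hypothetical namespace, assumed only for illustration; a stand-in for
# a persistent W3C namespace that does not exist yet.
LANG_NS = "http://example.org/lang/"

# Two lowercase letters (ISO 639-1 shape) or three (ISO 639-2 / 639-3 shape).
# This checks the form of the code only, not whether ISO has assigned it.
LANG_URI = re.compile(re.escape(LANG_NS) + r"(?P<code>[a-z]{2,3})")


def uri_to_code(uri):
    """Return the ISO 639 code embedded in a language URI, or None."""
    match = LANG_URI.fullmatch(uri)
    return match.group("code") if match else None


def code_to_uri(code):
    """Build the language URI for a string shaped like an ISO 639 code."""
    if not re.fullmatch(r"[a-z]{2,3}", code):
        raise ValueError(f"not shaped like an ISO 639 code: {code!r}")
    return LANG_NS + code
```

Because the mapping is purely syntactic, a code newly added to ISO 639
gets a URI without anyone editing the vocabulary; the price is that a
URI for an unassigned code is also accepted.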

Stasinos



[1] http://downlode.org/Code/RDF/ISO-639

[2] http://www.ethnologue.com/language_index.asp This is very rich LOD
but lacks an RDF formalization, so there is no language class defined
or anything, but there is a semi-structured mapping of macrolanguages
such as "Arabic" to their constituents
(http://www.sil.org/iso639-3/macrolanguages.asp) and there are also
cross-links between 639-2 and 639-3 codes (cf, eg,
http://www.sil.org/iso639-3/documentation.asp?id=gre and
http://www.sil.org/iso639-3/documentation.asp?id=ell) and the
Ethnologue database
(http://www.ethnologue.com/show_language.asp?code=ell)

[3] http://www.w3.org/wiki/Languages_as_RDF_Resources

Received on Friday, 9 December 2011 03:12:59 UTC