Best practices - previous notes
Some annotations and background material for the topic's discussion
Naming and dereferencing
Topics
- IRIs vs URIs
- Opaque URIs vs descriptive URIs
- Namespace selection
- Language based content negotiation
Patterns for Naming
ID | Name | Description | Example | Arguments in favor | Arguments against | References |
---|---|---|---|---|---|---|
P1.1 | Descriptive URIs | Descriptive URIs | http://example.org/Armenia | | | |
P1.2 | Opaque URIs | Opaque URIs | http://example.org/#I23AX45 | | | Examples: http://lemon-model.net/lexica/uby/wn/WN_Lexicon_0, Princeton WordNet, EuroWordNet, Agrovoc, FRBR models and ISBD in the bibliographic domain |
P1.3 | Full IRIs | Full IRIs | http://օրինակ.օրգ/#Հայաստան | | | |
P1.4 | Internationalized paths only | Internationalized paths only | http://example.org/#Հայաստան, http://example.org/Հայաստան | | | Local DBpedias |
P1.5a | Include language in host name of URI | Include language in URIs | http://hy.example.org/#Հայաստան | | | Local DBpedias |
P1.5b | Include language in path of URI | Include language in URIs | http://example.com/en/Armenia, http://example.com/Armenia.en, http://example.com/Armenia?lang=en | | | |
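To make the difference between P1.3 (full IRIs) and P1.4 (internationalized paths only) concrete, the following is a minimal sketch of how an IRI can be mapped to an ASCII-only URI. The function name `iri_to_uri` is hypothetical, and the sketch assumes IRIs without ports or user information; it relies only on the Python standard library (the built-in `idna` codec and `urllib.parse`).

```python
from urllib.parse import quote, urlsplit, urlunsplit


def iri_to_uri(iri):
    """Map an IRI to an ASCII-only URI (sketch: no ports or userinfo handled)."""
    parts = urlsplit(iri)
    # Host name: each non-ASCII label becomes its ASCII-compatible "xn--" form.
    host = parts.netloc.encode("idna").decode("ascii")
    # Path, query and fragment: percent-encode the UTF-8 bytes of non-ASCII
    # characters, leaving existing escapes and delimiters untouched.
    path = quote(parts.path, safe="/%")
    query = quote(parts.query, safe="=&%")
    fragment = quote(parts.fragment, safe="%")
    return urlunsplit((parts.scheme, host, path, query, fragment))


# P1.4: only the path or fragment is internationalized.
print(iri_to_uri("http://example.org/Հայաստան"))
# P1.3: the host name is internationalized as well.
print(iri_to_uri("http://օրինակ.օրգ/#Հայաստան"))
```

With P1.4 only the path or fragment changes when the IRI is mapped, whereas with P1.3 the host name is rewritten too, which is one reason why the two patterns may behave differently in tools that only accept URIs.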
Related links:
See thread about "Fragment issues in ITS/HTML/XML mapping to NIF" at http://lists.w3.org/Archives/Public/public-bpmlod/2013Sep/0002.html
See interesting e-mail from Berners-Lee on this subject http://lists.w3.org/Archives/Public/public-lod/2011Apr/0282.html
QUESTIONS:
Which pattern for naming do we propose as preferred in multilingual scenarios?
Which other patterns are acceptable and under which conditions?
Patterns for Dereferencing
ID | Name | Description | Example | Arguments in favor | Arguments against | References |
---|---|---|---|---|---|---|
P1.6 | No language negotiation | Always return the same triples, without taking the HTTP Accept-Language header into account. | | All the information is available. In RDF, if literals carry language tags, content negotiation is not needed. | | |
P1.7 | Language content negotiation | The server honors the language preferences of the user agent, as given in the Accept-Language header, and returns different data for each language preference. | See example here | Saves bandwidth | Information can be lost, especially if there is multilingual content | |
P1.8 | Language content redirection | The server honors the language preferences of the user agent, as given in the Accept-Language header, and returns a 303 (See Other) redirect to a resource with triples in that language | Example: if the URI to dereference is http://example.org and the Accept-Language header is es, the server returns a 303 (See Other) redirect to the URI http://example.org/?lang=es, which contains triples with Spanish content | Maintains the difference between the generic representation of a resource in any language and the representation of that resource in a given language | It is not feasible for all resources to have representations in different languages. |
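The redirection described in P1.8 can be sketched as follows. This is only an illustration under stated assumptions: the function names `parse_accept_language` and `redirect_for` are hypothetical, the `?lang=` query parameter follows the example in the table, and the header parsing is simplified (only `q` weights are read, and a regional tag such as es-MX falls back to its primary language).

```python
def parse_accept_language(header):
    """Return the language ranges of an Accept-Language header, best first."""
    weighted = []
    for position, item in enumerate(header.split(",")):
        item = item.strip()
        if not item:
            continue
        tag, _, params = item.partition(";")
        quality = 1.0  # a range without an explicit weight counts as q=1
        params = params.strip()
        if params.startswith("q="):
            try:
                quality = float(params[2:])
            except ValueError:
                quality = 0.0
        if quality > 0:
            # Sort by descending weight, then by position in the header.
            weighted.append((-quality, position, tag.strip().lower()))
    return [tag for _, _, tag in sorted(weighted)]


def redirect_for(uri, accept_language, available):
    """Return (303, language-specific URI), or (200, uri) if nothing matches."""
    for tag in parse_accept_language(accept_language):
        primary = tag.split("-")[0]  # "es-mx" falls back to "es"
        for candidate in (tag, primary):
            if candidate in available:
                return 303, f"{uri}?lang={candidate}"
    # No language-specific resource: serve the generic representation.
    return 200, uri


print(redirect_for("http://example.org/", "es", {"en", "es"}))
```

Returning the generic resource when no language matches keeps the distinction that P1.8 argues for: the generic URI still identifies the resource in any language, and only the redirect targets are language-specific.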
Related Links
W3C Internationalization Activity's Best Practice on selecting language tags
Comments
In principle we see few arguments in favour of language content negotiation in the context of Linked Data (http://www.w3.org/2014/02/21-bpmlod-minutes.html)
Textual Information
Topics
- Labels with language tag
- Labels without language tag
- Longer descriptions
- Lexicalizations and linguistic information
- Localization information; see the related discussion on localization workflows at the 7 March 2014 call
Patterns for Textual information
ID | Name | Description | Example | Arguments in favour | Arguments against | References |
---|---|---|---|---|---|---|
P2.1 | Use rdfs:label for Everything | Linked data datasets should provide labels with the property rdfs:label for all resources: individuals, concepts and properties, not just the main entities. |
:juan rdfs:label "Juan" . :Professor rdfs:label "Professor"@en . :position rdfs:label "Position"@en ; rdfs:label "Posición"@es . |
|
|
Label Everything |
P2.2 | Multilingual labels | In a multilingual setting, it is necessary to attach language tags to textual information, in order to identify the appropriate label for localized applications. |
:juan :position "Professor"@en ; :position "Catedrático"@es . |
Multilingual labels are part of the RDF standard and are well supported by semantic web tools. | Querying a dataset with language-tagged labels in SPARQL is slightly harder than querying plain literals, because a plain literal does not match a language-tagged one. For example, the following query returns no results:
SELECT * WHERE { ?x :position "Professor" . } The language must be specified, so the query that works is: SELECT * WHERE { ?x :position "Professor"@en . } If the language is unknown, the lexical form can be compared instead: SELECT * WHERE { ?x :position ?p . FILTER ( str(?p) = "Professor" ) } |
Multilingual labels |
P2.3 | Labels without language tags | Apart from language-tagged labels, one can also associate plain labels without a language tag. |
:juan :position "Professor"@en ; :position "Catedrático"@es ; :position "Professor" . |
This pattern can facilitate SPARQL queries when one does not know the language of the literals. | In general, this is perceived as a bad practice. In which language should we write the label without language tag? | Labels without language tags |
P2.4 | Annotate long descriptions | Annotate long descriptions using new resources that can be represented with shorter labels or lexical entities. |
:juan :job "Professor at the University of León"@en . would turn into :juan :job "Professor at the University of León"@en ; :position :professor ; :workPlace :unileón . :professor rdfs:label "Professor"@en . :unileón rdfs:label "University of León"@en . |
It facilitates their future translation to other languages. | It is not always feasible to find the right resources.
Sometimes the long description can be replaced by the annotations. |
divide longer descriptions |
P2.5 | Provide lexical information | Provide lexical information in an external lexicon, for instance using lemon model |
:unileón a lemon:LexicalEntry ; lemon:decomposition ( [ lemon:element :University ] [ lemon:element :Of ] [ lemon:element :León ] ); rdfs:label "University of León"@en . :University a lemon:LexicalEntry ; lexinfo:partOfSpeech lexinfo:commonNoun ; rdfs:label "University"@en ; rdfs:label "Universidad"@es . :Of a lemon:LexicalEntry ; lexinfo:partOfSpeech lexinfo:preposition ; rdfs:label "of"@en ; rdfs:label "de"@es . :León a lemon:LexicalEntry ; lexinfo:partOfSpeech lexinfo:properNoun ; rdfs:label "León" . |
Providing lexical metadata for a resource can help linked data applications to visualize and manage textual information. | It can also add a complexity overhead to the dataset that may be undesired. | Labels Provide lexical information.
Lexical annotations can also be done using NIF to reference parts of the strings. |
P2.6 | Structured literals | Literals in the RDF model can also be structured values in XML or HTML. |
:unileón :desc '<p>University of <span translate="no">León</span>, Spain.</p>'^^rdf:XMLLiteral . |
Longer descriptions can be offered by leveraging the internationalization practices already proposed for those markup languages. | Overusing structured literals forfeits the advantages of semantic modeling in RDF | Structured literals |
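As an illustration of how a localized application might consume labels published under P2.2 and P2.3, here is a minimal sketch of label selection by language preference. The function name `pick_label` and the representation of literals as `(text, language)` pairs are assumptions made for this example; an untagged literal is represented with `None` as its language.

```python
def pick_label(labels, preferred):
    """Pick a label from (text, language) pairs given a preference list.

    `language` is None for literals without a language tag.
    """
    by_lang = {}
    for text, lang in labels:
        # Keep the first label seen for each language (tags are case-insensitive).
        by_lang.setdefault(lang.lower() if lang else None, text)
    for lang in preferred:
        if lang.lower() in by_lang:
            return by_lang[lang.lower()]
    # No preferred language available: fall back to the untagged label (P2.3),
    # then to any label at all.
    if None in by_lang:
        return by_lang[None]
    return next(iter(by_lang.values()), None)


position_labels = [("Position", "en"), ("Posición", "es")]
print(pick_label(position_labels, ["es", "en"]))
print(pick_label([("Juan", None)], ["es"]))
```

The fallback order is a design choice of this sketch, not something the patterns prescribe; it shows one way in which an untagged label can still be useful when none of the preferred languages is present.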
Related Links
W3C Internationalization Activity's Best Practice on selecting language tags
Comments
Whatever naming convention is used, it is important to provide descriptions or explanations of what the terms actually mean. In particular, it is a good idea to provide titles and if possible descriptions in multiple languages - ideally languages with different structures.
Linking
See http://oa.upm.es/8848/1/Multiling.pdf and http://www.weso.es/MLODPatterns/catalog.html
Patterns for Linking at the Conceptual level
ID | Name | Description | Example | Arguments in favour | Arguments against | References |
---|---|---|---|---|---|---|
P3.1 | Cross-lingual identity links | Use `owl:sameAs` to link resources expressed in different languages | Suppose we have information about Armenia in Armenian, identified by http://hy.example.org#Հայաստան, and information in English, identified by http://en.example.org#Armenia. The two resources can be declared identical:
<http://hy.example.org#Հայաստան> owl:sameAs <http://en.example.org#Armenia> . |
owl:sameAs is a well-known property which is supported by several linked data applications.
|
The semantics of owl:sameAs has some implications which may be undesirable. For
example, it could be that the information about Armenia in the different languages comes from different
sources and thus, contains different data. Using |
Inter-language Identity links |
P3.2 | Cross-lingual soft links | Use a soft property to state that two resources are inter-language linked (e.g., rdfs:seeAlso, skos:closeMatch, skos:exactMatch ).
|
The example in pattern P3.1 can be expressed as:
<http://hy.example.org#Հայաստան> rdfs:seeAlso <http://en.example.org#Armenia> . |
Soft links carry weaker semantic implications than owl:sameAs.
|
Using a custom property like dbo:wikiPageInterLanguageLink can provide
more freedom but those properties are usually not well recognized by automated software agents.
Thus, the use of more common properties with similar semantics (i.e. |
Soft inter-language links |
P3.3 | Cross-lingual taxonomical relations | Use taxonomical relations such as rdfs:subClassOf or skos:broader across languages, thus considering not only identity links as in P3.1 but any other possible taxonomical relationship. [This pattern somewhat subsumes P3.1] |
ontology1:Person rdfs:label "person"@en . ontology2:Hombre rdfs:label "hombre"@es . ontology2:Hombre rdfs:subClassOf ontology1:Person . |
|||
P3.4 | Domain dependent relations | Use properties coming from other ontologies, e.g., foaf:currentProject, dbpedia-owl:capital, mo:artist, etc. |
ontology1:Москва rdfs:label "Москва"@ru . ontology2:Russia rdfs:label "Russia"@en . ontology2:Russia dbpedia-owl:capital ontology1:Москва . |
|||
P3.5 | Linkage by using common background knowledge | In case related ontology entities are linked to a common external ontology, dataset (e.g., BabelNet, DBpedia) or lexicon, this background knowledge can be used as pivot for inferring a relation between such ontology entities. |
:bench-en a lemon:LexicalEntry ; lemon:form [lemon:writtenRep "bench"@en] . :bench-en-sense_1 a lemon:LexicalSense ; lemon:isSenseOf :bench-en ; lemon:reference ontology1:bench . :bench-en-sense_2 a lemon:LexicalSense ; lemon:isSenseOf :bench-en ; lemon:reference ontology2:banco . |
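The argument against P3.1 (descriptions in different languages may come from different sources and contain different data) can be illustrated with a small sketch of what an identity link implies for a consumer. The function `merge_same_as`, the property `ex:population` and the placeholder values are all hypothetical; the sketch merges descriptions pairwise only and does not compute a transitive closure.

```python
def merge_same_as(descriptions, same_as):
    """Union the property values of each pair of resources declared identical.

    descriptions: {uri: {property: set of values}}
    same_as: iterable of (uri, uri) pairs (pairwise only, no transitive closure)
    """
    merged = {uri: {prop: set(values) for prop, values in props.items()}
              for uri, props in descriptions.items()}
    for left, right in same_as:
        combined = {}
        for uri in (left, right):
            for prop, values in merged.get(uri, {}).items():
                combined.setdefault(prop, set()).update(values)
        # Under identity semantics, both URIs now carry every statement.
        merged[left] = merged[right] = combined
    return merged


hy = "http://hy.example.org#Հայաստան"
en = "http://en.example.org#Armenia"
descriptions = {
    hy: {"ex:population": {"value from source A"}},  # placeholder data
    en: {"ex:population": {"value from source B"}},  # placeholder data
}
merged = merge_same_as(descriptions, [(hy, en)])
print(sorted(merged[en]["ex:population"]))
```

After the merge each resource carries both values, even though each source asserted only one. A soft link as in P3.2 would leave the two descriptions separate.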
Patterns for Linking at the Linguistic level
Here the links are established not between the concepts (or instances) themselves but between their associated linguistic information. This kind of mapping can be very useful when keeping the conceptual and linguistic information decoupled is a major requirement. To allow two ontologies to interoperate at the linguistic level, mappings are established between the linguistic descriptions of their concepts, which are not necessarily exact equivalents but the closest correspondences between culture-specific concepts (Gracia et al. 2012).
ID | Name | Description | Example | Arguments in favour | Arguments against | References |
---|---|---|---|---|---|---|
P3.6 | Implicit translations | Let us suppose that entities in the ontology point to lexical entries in different monolingual lexicons. If lexical entries in different lexicons share the same (or equivalent) ontological referent, a translation can be inferred between them. | In the following example, "bench"@en and "banco"@es share the same ontological referent, so a translation between them can be inferred.
:lexiconEN lemon:term :bench-en . :bench-en a lemon:LexicalEntry ; lemon:form [lemon:writtenRep "bench"@en] . :lexiconES lemon:term :banco-es . :banco-es a lemon:LexicalEntry ; lemon:form [lemon:writtenRep "banco"@es] . :bench-en-sense a lemon:LexicalSense ; lemon:isSenseOf :bench-en ; lemon:reference ontology1:bench . :banco-es-sense a lemon:LexicalSense ; lemon:isSenseOf :banco-es ; lemon:reference ontology1:bench . |
|||
P3.7 | Linkage by using explicit translations | When the lexical information of the ontology is represented in an external lexicon, explicit translations can be declared among their senses. See http://purl.org/net/translation | In the following example "bench"@en and "banco"@es are represented using lemon and their senses linked through a Translation object.
:bench-en-sense a lemon:LexicalSense ; lemon:isSenseOf :bench-en ; lemon:reference ontology1:bench . :bench-en a lemon:LexicalEntry ; lemon:form [lemon:writtenRep "bench"@en] . :banco-es-sense a lemon:LexicalSense ; lemon:isSenseOf :banco-es ; lemon:reference ontology2:banco . :banco-es a lemon:LexicalEntry ; lemon:form [lemon:writtenRep "banco"@es] . :bench_banco-trans a tr:Translation ; tr:translationSource :bench-en-sense ; tr:translationTarget :banco-es-sense . |
This pattern allows translations to be represented explicitly and, because the relation is reified, additional information (provenance, confidence, etc.) can be attached to it. | On the other hand, it adds complexity. | A dataset using this representation mechanism can be found at http://linguistic.linkeddata.es/apertium/ |
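The inference behind P3.6 (implicit translations) can be sketched in a few lines: lexical entries from different languages that point to the same ontological referent are treated as translations of each other. The function name `implicit_translations` and the flattened input (pairs of entry and referent, plus a mapping from entry to language) are assumptions of this sketch; in a real dataset they would be extracted from the lemon senses.

```python
def implicit_translations(senses, entry_language):
    """Infer translation pairs from senses that share an ontological referent.

    senses: iterable of (lexical_entry, referent) pairs
    entry_language: mapping from lexical entry to its language code
    """
    by_referent = {}
    for entry, referent in senses:
        by_referent.setdefault(referent, []).append(entry)
    pairs = set()
    for entries in by_referent.values():
        for source in entries:
            for target in entries:
                # Only entries in different languages count as translations.
                if entry_language[source] != entry_language[target]:
                    pairs.add((source, target))
    return sorted(pairs)


senses = [(":bench-en", "ontology1:bench"), (":banco-es", "ontology1:bench")]
languages = {":bench-en": "en", ":banco-es": "es"}
print(implicit_translations(senses, languages))
```

Unlike the explicit Translation objects of P3.7, nothing is stored here: the pairs are recomputed from the shared referents, so no provenance or confidence can be attached to them.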
Ontologies and vocabularies
Patterns for Vocabulary Reuse
ID | Name | Description | Example | Arguments in favour | Arguments against | References |
---|---|---|---|---|---|---|
P4.1 | Monolingual vocabularies | Define vocabularies with terms defined in a single language, usually English | Many popular vocabularies and ontologies for the semantic web (FOAF, Dublin Core, OWL, RDF Schema, etc.) are monolingual in English, both for labels and comments. | Easy to control the vocabulary evolution and avoid the appearance of bad translations or ambiguities between language versions | Translation is needed if the ontology is used in a multilingual setup. | Monolingual vocabularies |
P4.2 | Multilingual vocabularies | Define vocabularies and ontologies where the concepts contain translations for several languages. |
:position a owl:DatatypeProperty ; rdfs:domain :UniversityStaff ; rdfs:label "Position"@en ; rdfs:label "Puesto"@es . :UniversityStaff a owl:Class ; rdfs:label "University staff"@en ; rdfs:label "Trabajador universitario"@es . Some multilingual ontologies are Agrovoc and Eurovoc. |
|
|
Multilingual vocabularies |
P4.3 | Localize existing vocabularies | Enrich existing vocabularies with local translations, externally to the original vocabulary. | A linked data application in Spanish may use the Dublin Core vocabulary to indicate the contributors of a given work. The end user should see the labels in their own language. To that end, one can add a localized label to dc:contributor as:
dc:contributor rdfs:label "Colaborador"@es . |
|
|
Localize existing vocabularies |
P4.4 | Create new localized vocabularies | Create new localized properties and classes and relate them to existing ones using the owl:sameAs, owl:equivalentProperty or owl:equivalentClass properties. |
dc:contributor owl:equivalentProperty :colaborador . :colaborador rdfs:label "Colaborador"@es . |
This pattern gives freedom to vocabulary creators to tailor the vocabulary according to their exact needs. | However, it can be more difficult for both humans and software agents to recognize and consume these new properties and classes. | Create new localized vocabularies |
P4.5 | Use Lemon to enrich the multilingual semantics of existing vocabularies | This pattern is about using well established ontologies, such as Lemon, to add multilingual semantics and translations to existing vocabulary terms. |
:colaborador a lemon:LexicalEntry ; rdfs:label "Colaborador"@es ; lemon:language "es" ; lemon:sense :colaborador_sense . :colaborador_sense lemon:reference dc:contributor . |
Users can provide translations to vocabularies using well established multilingual standards (Lemon) which also add finer-grained semantics to the terms. | However, agents not familiar with Lemon may find it harder to interpret the translations. | lemon - The Lexicon Model for Ontologies |