Some annotations and background material for the topic's discussion

Naming and dereferencing

Topics

IRIs vs URIs
Opaque URIs vs descriptive URIs
Namespace selection
Language based content negotiation

Patterns for Naming

ID	Name	Description	Example	Arguments in favor	Arguments against	References
P1.1	Descriptive URIs	Descriptive URIs	http://example.org/Armenia	Readability (but only for English/Latin alphabet). Tool support. The only way of describing the concept in ontologies that lack labels and comments	Unreadable for non-Latin alphabet users. Difficult to be descriptive enough in a URI in certain contexts (biomedical, financial, ...); descriptive names are sometimes cryptic
P1.2	Opaque URIs	Opaque URIs	http://example.org/#I23AX45	Independence between content and language. Changes in textual descriptions do not imply changes in URIs. Suitable for automatic LD generation from existing resources	Not human-readable. Worse for developers	Examples: http://lemon-model.net/lexica/uby/wn/WN_Lexicon_0 Princeton WordNet, EuroWordNet, Agrovoc, FRBR models and ISBD in the bibliographic domain
P1.3	Full IRIs	Full IRIs	http://օրինակ.օրգ/#Հայաստան	Readable (for one language)	Security issues (spoofing). Unreadable for speakers of other languages. Issues with right-to-left languages like Arabic or Hebrew
P1.4	Internationalized paths only	Internationalized paths only	http://example.org/#Հայաստան, http://example.org/Հայաստան	Fewer security risks. Path readable (for one language)	Unreadable for speakers of other languages. Problem: where the namespace does not end in / or #, it is difficult to see where the term starts (e.g., the namespace is http://w3.org/html and the full name is http://w3.org/htmldiv). You should define prefixes that end with / or #	Local DBpedias
P1.5a	Include language in host name of URI	Include language in URIs	http://hy.example.org/#Հայաստան	Practical reasons: divide into different datasets, as in DBpedia	Where do we put it (beginning, end, ...)? Dialects. Actually, "es" in a DBpedia URI identifies not the language but the source. May be technically challenging (requires a DNS entry for each language). Confuses the server/data distinction	Local DBpedias
P1.5b	Include language in path of URI	Include language in URIs	http://example.com/en/Armenia http://example.com/Armenia.en http://example.com/Armenia?lang=en	Compatible with content negotiation
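
To make the trade-off between P1.3 and P1.4 more concrete, the following is a minimal sketch of how an IRI can be mapped to a plain ASCII URI: the host name is converted with Python's built-in `idna` codec and the path, query and fragment are percent-encoded as UTF-8. The function name `iri_to_uri` is an assumption made for this illustration, and the sketch ignores user information in the authority part.

```python
from urllib.parse import quote, urlsplit, urlunsplit


def iri_to_uri(iri: str) -> str:
    """Map an IRI to an ASCII-only URI (illustrative sketch, not a full RFC 3987 implementation)."""
    parts = urlsplit(iri)
    # Internationalized host names are converted label by label to their "xn--" form.
    host = parts.hostname.encode("idna").decode("ascii") if parts.hostname else ""
    netloc = host if parts.port is None else f"{host}:{parts.port}"
    # Non-ASCII characters elsewhere are percent-encoded as UTF-8 bytes;
    # "%" is kept safe so already-encoded sequences are not encoded twice.
    path = quote(parts.path, safe="/%")
    query = quote(parts.query, safe="=&%")
    fragment = quote(parts.fragment, safe="%")
    return urlunsplit((parts.scheme, netloc, path, query, fragment))


if __name__ == "__main__":
    print(iri_to_uri("http://օրինակ.օրգ/#Հայաստան"))
    print(iri_to_uri("http://example.org/Հայաստան"))
```

With P1.4 only the path or fragment changes, while with P1.3 the host name itself is rewritten, which is where the spoofing concerns arise.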

QUESTIONS:

Which pattern for naming do we propose as preferred in multilingual scenarios?

Which other patterns are acceptable and under which conditions?

Patterns for Dereferencing

ID	Name	Description	Example	Arguments in favor	Arguments against
P1.6	No language negotiation	Always return the same triples, without taking into account the HTTP Accept-Language header.		You have all the information. In RDF, if you have the language tag, you do not need content negotiation
P1.7	Language content negotiation	The server honours the language preferences of the user agent, presented in the Accept-Language header, and returns different data for each language preference.	See example here	Saves bandwidth	You can lose information, especially if there is multilingual content
P1.8	Language content redirection	The server honours the language preferences of the user agent, presented in the Accept-Language header, and returns a 303 (See Other) redirect to a resource with triples in that language	Example: if the URI to dereference is http://example.org and the Accept-Language header is es, the server returns 303 (See Other) to the URI http://example.org/?lang=es, which contains triples with Spanish content	Maintains the difference between the generic representation of a resource in any language and the representation of that resource in a given language	Not feasible for all the resources to have representations in different languages.
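
The redirection pattern (P1.8) could be sketched roughly as follows. This is only an illustration under stated assumptions: the function names `parse_accept_language` and `negotiate`, the `?lang=` query parameter (taken from the example in the table) and the set of available languages are hypothetical, and a real server would normally rely on its framework's content negotiation support.

```python
def parse_accept_language(header: str) -> list[str]:
    """Return language tags from an Accept-Language header, best first."""
    prefs = []
    for item in header.split(","):
        item = item.strip()
        if not item:
            continue
        tag, _, params = item.partition(";")
        params = params.strip()
        q = 1.0  # a missing q-value means full preference
        if params.startswith("q="):
            try:
                q = float(params[2:])
            except ValueError:
                q = 0.0  # treat malformed weights as "not acceptable"
        prefs.append((tag.strip().lower(), q))
    # sorted() is stable, so tags with equal weight keep their header order
    ranked = sorted(prefs, key=lambda pref: -pref[1])
    return [tag for tag, q in ranked if q > 0]


def negotiate(uri: str, accept_language: str, available: set[str]) -> tuple[int, str]:
    """Return (status, location): 303 to a language-specific URI, else 200 for the generic one."""
    for tag in parse_accept_language(accept_language):
        primary = tag.split("-")[0]  # fall back from "en-gb" to "en"
        for candidate in (tag, primary):
            if candidate in available:
                return 303, f"{uri}?lang={candidate}"
    return 200, uri


if __name__ == "__main__":
    print(negotiate("http://example.org/", "es, en;q=0.8", {"en", "es"}))
```

When no requested language is available, the sketch falls back to the generic representation, which corresponds to the behaviour of P1.6.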

Comments

In principle we see few arguments in favour of language content negotiation in the context of Linked Data (http://www.w3.org/2014/02/21-bpmlod-minutes.html)

Textual Information

Topics

Labels with language tag
Labels without language tag
Longer descriptions
Lexicalizations and linguistic information
Localization information, see related discussion on localization workflows at 7 March 2014 call

Patterns for Textual information

ID	Name	Description	Example	Arguments in favour	Arguments against	References
P2.1	Use rdfs:label for Everything	Linked data datasets should provide labels with the property rdfs:label for all resources: individuals, concepts and properties, not just the main entities.	:juan rdfs:label "Juan" . :Professor rdfs:label "Professor"@en . :position rdfs:label "Position"@en ; rdfs:label "Posición"@es .	rdfs:label is a well established property which is supported by most of the tools	There may be some tools that don't support rdfs:label. It may be difficult to associate an rdfs:label with some resources, especially automatically generated resources	Label Everything
P2.2	Multilingual labels	In a multilingual setting, it is necessary to attach language tags to textual information, in order to identify the appropriate label for localized applications.	:juan :position "Professor"@en ; :position "Catedrático"@es .	Multilingual labels are part of the RDF standard and well supported by semantic web tools.	Querying a dataset with multilingual labels in SPARQL is slightly harder than querying one without language tags. For example, the following SPARQL query would return no results: SELECT * WHERE { ?x :position "Professor" . } It is necessary to specify the language, so the query that works is: SELECT * WHERE { ?x :position "Professor"@en . } If the language is unknown, it can be expressed as: SELECT * WHERE { ?x :position ?p . FILTER ( str(?p)="Professor" ) }	Multilingual labels
P2.3	Labels without language tags	Apart from language-tagged labels, one can also associate plain labels without a language tag.	:juan :position "Professor"@en ; :position "Catedrático"@es ; :position "Professor" .	This pattern can facilitate SPARQL queries when one does not know the language of the literals.	In general, this is perceived as a bad practice. In which language should we write the label without language tag?	Labels without language tags Rules of thumb by Richard Cyganiak
P2.4	Annotate long descriptions	Annotate long descriptions using new resources that can be represented with shorter labels or lexical entities.	:juan :job "Professor at the University of León"@en . would turn into :juan :job "Professor at the University of León"@en ; :position :professor ; :workPlace :unileón . :professor rdfs:label "Professor"@en . :unileón rdfs:label "University of León"@en .	It facilitates their future translation to other languages.	It is not always feasible to find the right resources. Sometimes the long description can be replaced by the annotations.	divide longer descriptions
P2.5	Provide lexical information	Provide lexical information in an external lexicon, for instance using the lemon model	:unileón a lemon:LexicalEntry ; lemon:decomposition ( [ lemon:element :University ] [ lemon:element :Of ] [ lemon:element :León ] ); rdfs:label "University of León"@en . :University a lemon:LexicalEntry ; lexinfo:partOfSpeech lexinfo:commonNoun ; rdfs:label "University"@en ; rdfs:label "Universidad"@es . :Of a lemon:LexicalEntry ; lexinfo:partOfSpeech lexinfo:preposition ; rdfs:label "of"@en ; rdfs:label "de"@es . :León a lemon:LexicalEntry ; lexinfo:partOfSpeech lexinfo:properNoun ; rdfs:label "León" .	Providing lexical metadata for a resource can help linked data applications to visualize and manage textual information.	It can also add a complexity overhead to the dataset that may be undesired.	Labels Provide lexical information. Lexical annotations can also be done using NIF to reference parts of the strings.
P2.6	Structured literals	Literals in the RDF model can also be structured values in XML or HTML. Using structured literals.	:unileón :desc "<p>University of <span translate=\"no\">León</span>, Spain.</p>"^^rdf:XMLLiteral .	It is possible to offer longer descriptions leveraging the Internationalization practices that have already been proposed for those languages.	Overusing structured literals can forfeit the advantages of semantic modeling in RDF	Structured literals
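
As an illustration of how a localized application might consume the labels from P2.2 and P2.3, here is a minimal sketch of label selection with fallback. The function name `pick_label` and the representation of labels as `(text, language)` pairs are assumptions made for this example; `None` stands for a label without a language tag, and only exact tag matches are considered.

```python
from typing import Optional

# Each label is a (text, language tag) pair; None means "no language tag".
Label = tuple[str, Optional[str]]


def pick_label(labels: list[Label], preferred: list[str]) -> Optional[str]:
    """Pick the best label for an ordered list of preferred language tags."""
    # 1. Try each preferred language in order (language tags are case-insensitive).
    for lang in preferred:
        for text, tag in labels:
            if tag is not None and tag.lower() == lang.lower():
                return text
    # 2. Fall back to a label without a language tag, if there is one (P2.3).
    for text, tag in labels:
        if tag is None:
            return text
    # 3. As a last resort, return any label rather than nothing.
    return labels[0][0] if labels else None


if __name__ == "__main__":
    position = [("Professor", "en"), ("Catedrático", "es"), ("Professor", None)]
    print(pick_label(position, ["es", "en"]))
```

The second step shows why untagged labels are convenient as a fallback, and also why they are ambiguous: the application cannot tell which language it ends up displaying.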

Comments

Whatever naming convention is used, it is important to provide descriptions or explanations of what the terms actually mean. In particular, it is a good idea to provide titles and if possible descriptions in multiple languages - ideally languages with different structures.

Linking

See http://oa.upm.es/8848/1/Multiling.pdf and http://www.weso.es/MLODPatterns/catalog.html

Patterns for Linking at the Conceptual level

ID	Name	Description	Example	Arguments in favour	Arguments against	References
P3.1	Cross-lingual identity links	Use `owl:sameAs` to link resources expressed in different languages	Suppose we have information about Armenia in Armenian which is identified by `http://hy.example.org#Հայաստան` while the URI `http://en.example.org#Armenia` contains information about Armenia in English. We can declare that both URIs refer to the same thing by asserting: <http://hy.example.org#Հայաստան> owl:sameAs <http://en.example.org#Armenia> .	`owl:sameAs` is a well-known property which is supported by several linked data applications.	The semantics of `owl:sameAs` has some implications which may be undesirable. For example, it could be that the information about Armenia in the different languages comes from different sources and thus contains different data. Using `owl:sameAs` can then lead to inconsistencies.	Inter-language Identity links
P3.2	Cross-lingual soft links	Use a soft property to state that two resources are inter-language linked (e.g., `rdfs:seeAlso, skos:closeMatch, skos:exactMatch`).	The example in pattern P3.1 can be expressed as: <http://hy.example.org#Հայաստան> rdfs:seeAlso <http://en.example.org#Armenia> .	Soft links are weaker regarding semantic implications than an `owl:sameAs` link.	Using a custom property like `dbo:wikiPageInterLanguageLink` can provide more freedom but those properties are usually not well recognized by automated software agents. Thus, the use of more common properties with similar semantics (i.e. `rdfs:seeAlso`, `skos:related`, etc) should be considered.	Soft inter-language links
P3.3	Cross-lingual taxonomical relations	For instance rdfs:subClassOf, skos:broader, etc., thus considering not only identity links as in P3.1 but any other possible taxonomical relationship [This pattern somehow subsumes P3.1]	ontology1:Person rdfs:label "person"@en . ontology2:Hombre rdfs:label "hombre"@es . ontology2:Hombre rdfs:subClassOf ontology1:Person .
P3.4	Domain dependent relations	That is, using properties coming from other ontologies. E.g., foaf:currentProject, dbpedia-owl:capital, mo:artist, etc.	ontology1:Москва rdfs:label "Москва"@ru . ontology2:Russia rdfs:label "Russia"@en . ontology2:Russia dbpedia-owl:capital ontology1:Москва .
P3.5	Linkage by using common background knowledge	In case related ontology entities are linked to a common external ontology, dataset (e.g., BabelNet, DBpedia) or lexicon, this background knowledge can be used as pivot for inferring a relation between such ontology entities.	:bench-en a lemon:LexicalEntry ; lemon:form [lemon:writtenRep "bench"@en] . :bench-en-sense_1 a lemon:LexicalSense ; lemon:isSenseOf :bench-en ; lemon:reference ontology1:bench . :bench-en-sense_2 a lemon:LexicalSense ; lemon:isSenseOf :bench-en ; lemon:reference ontology2:banco .

Patterns for Linking at the Linguistic level

Here the links would not be established between the concepts (or instances) themselves but between their associated linguistic information. This sort of mapping can be very useful when keeping the conceptual and linguistic information decoupled is a major requirement. In order to allow two ontologies to interoperate at the linguistic level, mappings would be established between the linguistic descriptions of their concepts, which are not necessarily exact equivalents but the closest correspondences between culture-specific concepts (Gracia et al. 2012).

ID	Name	Description	Example	Arguments in favour	Arguments against	References
P3.6	Implicit translations	Let us suppose that entities in the ontology point to lexical entries in different monolingual lexicons. If lexical entries in different lexicons share the same (or equivalent) ontological referent, a translation can be inferred between them.	In the following example, "bench"@en and "banco"@es belong to different monolingual lexicons but share the same ontological referent, so a translation between them can be inferred. :lexiconEN lemon:term :bench-en . :bench-en a lemon:LexicalEntry ; lemon:form [lemon:writtenRep "bench"@en] . :lexiconES lemon:term :banco-es . :banco-es a lemon:LexicalEntry ; lemon:form [lemon:writtenRep "banco"@es] . :bench-en-sense a lemon:LexicalSense ; lemon:isSenseOf :bench-en ; lemon:reference ontology1:bench . :banco-es-sense a lemon:LexicalSense ; lemon:isSenseOf :banco-es ; lemon:reference ontology1:bench .
P3.7	Linkage by using explicit translations	When the lexical information of the ontology is represented in an external lexicon, explicit translations can be declared among their senses. See http://purl.org/net/translation	In the following example "bench"@en and "banco"@es are represented using lemon and their senses linked through a Translation object. :bench-en-sense a lemon:LexicalSense ; lemon:isSenseOf :bench-en ; lemon:reference ontology1:bench . :bench-en a lemon:LexicalEntry ; lemon:form [lemon:writtenRep "bench"@en] . :banco-es-sense a lemon:LexicalSense ; lemon:isSenseOf :banco-es ; lemon:reference ontology2:banco . :banco-es a lemon:LexicalEntry ; lemon:form [lemon:writtenRep "banco"@es] . :bench_banco-trans a tr:Translation ; tr:translationSource :bench-en-sense ; tr:translationTarget :banco-es-sense .	This pattern allows translations to be represented explicitly and, as the relation is reified, additional information can be attached to it (provenance, confidence, etc.).	On the other hand, it adds complexity.	A dataset using this representation mechanism can be found at http://linguistic.linkeddata.es/apertium/
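
The inference behind implicit translations (P3.6) can be sketched as grouping lexical entries by their ontological reference and pairing entries of different languages within each group. This is only an illustration over in-memory tuples, not a lemon or SPARQL implementation; the function name `infer_translations` and the sample data (including the extra `bank-en` entry) are hypothetical.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical sample data: (lexical entry, language, ontology reference).
SENSES = [
    ("bench-en", "en", "ontology1:bench"),
    ("banco-es", "es", "ontology1:bench"),
    ("bank-en", "en", "ontology1:bank"),
]


def infer_translations(senses):
    """Return unordered pairs of entries that share a reference but differ in language."""
    by_reference = defaultdict(list)
    for entry, lang, reference in senses:
        by_reference[reference].append((entry, lang))

    pairs = set()
    for entries in by_reference.values():
        for (entry_a, lang_a), (entry_b, lang_b) in combinations(entries, 2):
            # Entries in the same language are synonyms at best, not translations.
            if lang_a != lang_b:
                pairs.add(frozenset((entry_a, entry_b)))
    return pairs


if __name__ == "__main__":
    for pair in infer_translations(SENSES):
        print(sorted(pair))
```

With explicit translations (P3.7) no such inference is needed, since each pair is stated directly and can carry its own provenance or confidence, at the price of more triples.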

Ontologies and vocabularies

Patterns for Vocabulary Reuse

ID	Name	Description	Example	Arguments in favour	Arguments against	References
P4.1	Monolingual vocabularies	Define vocabularies with terms defined in a single language, usually English	Many popular vocabularies and ontologies for the semantic web (FOAF, Dublin Core, OWL, RDF Schema, etc.) are monolingual in English, both for labels and comments.	Easy to control the vocabulary evolution and avoid the appearance of bad translations or ambiguities between language versions	Translation is needed if the ontology is used in a multilingual setup.	Monolingual vocabularies
P4.2	Multilingual vocabularies	Define vocabularies and ontologies where the concepts contain translations for several languages.	:position a owl:DatatypeProperty ; rdfs:domain :UniversityStaff ; rdfs:label "Position"@en ; rdfs:label "Puesto"@es . :UniversityStaff a owl:Class ; rdfs:label "University staff"@en ; rdfs:label "Trabajador universitario"@es . Some multilingual ontologies are Agrovoc and Eurovoc.	Good to support multilingual applications. Ontologies easier to understand by more people (users, developers, ...).	Ontology harder to maintain. Some concepts are difficult to translate and ambiguities may appear in the translations. For example, the label Professor may be translated as Profesor in Spanish, although the meaning of the two terms differs (in Spanish, Catedrático is usually the preferred equivalent).	Multilingual vocabularies
P4.3	Localize existing vocabularies	Enrich existing vocabularies with local translations, externally to the original vocabulary.	A linked data application in Spanish may use the Dublin Core vocabulary to indicate the contributors of a given work. The end-user should see the labels in their own language. To that end, one can add a localized label to dc:contributor as: dc:contributor rdfs:label "Colaborador"@es .	A multilingual linked data application could transparently select the tagged literals in its preferred language. Easier to maintain than the multilingual ontology solution	The data is not centralised in a single ontology, thus being more difficult to discover. Polluting well known vocabularies with localized literals may be controversial and should be handled with caution.	Localize existing vocabularies
P4.4	Create new localized vocabularies	This pattern is about creating new localized properties and classes and relate them to existing ones using the owl:sameAs, owl:equivalentProperty or owl:equivalentClass properties.	dc:contributor owl:equivalentProperty :colaborador . :colaborador rdfs:label "Colaborador"@es .	This pattern gives freedom to vocabulary creators to tailor the vocabulary according to their exact needs.	However, it can be more difficult for both humans and software agents to recognize and consume these new properties and classes.	Create new localized vocabularies
P4.5	Use Lemon to enrich the multilingual semantics of existing vocabularies	This pattern is about using well established ontologies, such as Lemon, to add multilingual semantics and translations to existing vocabulary terms.	:colaborador a lemon:LexicalEntry ; rdfs:label "Colaborador"@es ; lemon:language "es" ; lemon:sense :colaborador_sense . :colaborador_sense lemon:reference dc:contributor .	Users can provide translations to vocabularies using well established multilingual standards (Lemon) which also add finer-grained semantics to the terms.	However, agents not familiar with Lemon may find it harder to interpret the translations.	lemon - The Lexicon Model for Ontologies

Best practices - previous notes

Naming and dereferencing

Topics

Patterns for Naming

Related links:

QUESTIONS:

Patterns for Dereferencing

Related Links

Comments

Textual Information

Topics

Patterns for Textual information

Related Links

Comments

Linking

Patterns for Linking at the Conceptual level

Patterns for Linking at the Linguistic level

Ontologies and vocabularies

Patterns for Vocabulary Reuse

Quality