Re: ISSUE-137: Proposal to add sh:langShape

Taking this and Andy's input into consideration, maybe sh:langShape is 
overkill and all we really need is a new parameter such as 
sh:languageIn (spelled sh:langMatches in the example below) which, for 
a value node that has a language tag, verifies that the tag matches one 
of the provided languages following the SPARQL langMatches semantics. 
For example:

ex:MyShape
     a sh:Shape ;
     sh:property [
         sh:predicate skos:prefLabel ;
         sh:or ( [ sh:datatype xsd:string ] [ sh:datatype rdf:langString ] ) ;
         sh:langMatches ( "en" "fr" "de" )
     ] .

sh:langMatches could take just a single language, but a list is 
shorter for the (apparently) common case of multilingual countries 
such as Belgium. I didn't know the RFC supports wildcards; this should 
hopefully be flexible enough to cover all given use cases, but others 
may need to confirm.
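
To make the intended matching concrete, here is a rough Python sketch. It 
assumes the basic filtering scheme of RFC 4647, which is what SPARQL 
langMatches is defined on; the function names lang_matches and 
matches_any are made up for illustration and are not part of any 
proposal.

```python
from typing import Iterable


def lang_matches(tag: str, lang_range: str) -> bool:
    """Basic filtering in the style of SPARQL langMatches (RFC 4647)."""
    if not tag:
        # No language tag: nothing matches, not even the "*" wildcard.
        return False
    if lang_range == "*":
        # The wildcard accepts any non-empty language tag.
        return True
    tag, lang_range = tag.lower(), lang_range.lower()
    # Case-insensitive: exact match, or the range is a prefix ending at "-".
    return tag == lang_range or tag.startswith(lang_range + "-")


def matches_any(tag: str, ranges: Iterable[str]) -> bool:
    """Hypothetical list form: the tag must match at least one range."""
    return any(lang_matches(tag, r) for r in ranges)


if __name__ == "__main__":
    allowed = ["en", "fr", "de"]
    for tag in ["en", "en-GB", "fr-BE", "nl"]:
        print(tag, matches_any(tag, allowed))
```

Under these assumptions "fr-BE" is accepted by the range "fr", while 
"nl" is rejected unless the list contains "nl" or "*".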

Regards,
Holger

PS: Andy, I prefer sh:datatype rdf:langString because it would be one 
thing less to check (by form builders etc), and furthermore I believe 
the semantics of sh:langMatches needs to be that it only does something 
if the literal really has a language tag. Otherwise it would be harder 
to express mixed cases of either string or langString (which I believe 
is quite common).
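
A minimal sketch of that "only applies if there is a language tag" 
reading, combined with the sh:or over the two datatypes from the example 
above. The Literal class and the conforms function are hypothetical 
helpers for illustration only, and the matching again assumes RFC 4647 
basic filtering.

```python
from dataclasses import dataclass
from typing import Iterable, Optional

XSD_STRING = "http://www.w3.org/2001/XMLSchema#string"
RDF_LANGSTRING = "http://www.w3.org/1999/02/22-rdf-syntax-ns#langString"


@dataclass(frozen=True)
class Literal:
    """Hypothetical stand-in for an RDF literal."""
    lexical: str
    datatype: str
    lang: Optional[str] = None


def lang_matches(tag: str, lang_range: str) -> bool:
    """Basic filtering (RFC 4647), as assumed for SPARQL langMatches."""
    if not tag:
        return False
    if lang_range == "*":
        return True
    tag, lang_range = tag.lower(), lang_range.lower()
    return tag == lang_range or tag.startswith(lang_range + "-")


def conforms(value: Literal, ranges: Iterable[str]) -> bool:
    # sh:or ( [ sh:datatype xsd:string ] [ sh:datatype rdf:langString ] )
    if value.datatype not in (XSD_STRING, RDF_LANGSTRING):
        return False
    # Assumed semantics: without a language tag the language check
    # does nothing, so plain strings pass.
    if value.lang is None:
        return True
    return any(lang_matches(value.lang, r) for r in ranges)


if __name__ == "__main__":
    allowed = ["en", "fr", "de"]
    print(conforms(Literal("Cat", XSD_STRING), allowed))
    print(conforms(Literal("Kat", RDF_LANGSTRING, "nl"), allowed))
```

With this reading a plain xsd:string label and an "en" label both 
conform, while an "nl" label does not.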


On 9/09/2016 23:02, Dimitris Kontokostas wrote:
> What Holger proposes is flexible and we have the option to reuse some 
> existing constructs, but I have some concerns about this design.
>
> The reason is that we currently have focus node constraints and 
> property (path) constraints.
> With this approach we create a new construct only for languages, and 
> it is not clear what it is and how it operates, e.g.
>  - whether there are any differences in the meaning of e.g. sh:in when 
> it is used in a language context and when not
>  - how sh:langShape interoperates with the extension mechanism and
>  - what it means to have e.g. sh:class in a sh:langShape (do all 
> constraints apply in all places?)
>
> I would prefer the creation of a few new constraint components, e.g. 
> sh:languageIn, that allow us to enable (if we want) the RFCs Andy 
> suggested.
>
> Another option would be to generalize the mechanism Holger suggested 
> and provide transformation functions on the focus nodes / values a 
> shape selects.
> This way we would be able to e.g. create sets/lists of language 
> tags, unwrap RDF lists, etc. and apply the SHACL core components to 
> the transformed values.
> However, I think it is a bit late to try something in this direction.
>
> Best,
> Dimitris
>
> On Fri, Sep 9, 2016 at 2:58 AM, Holger Knublauch 
> <holger@topquadrant.com> wrote:
>
>     I was given the task of writing up sh:langShape today. I already
>     did a few months back:
>
>     https://lists.w3.org/Archives/Public/public-data-shapes-wg/2016Mar/0262.html
>
>     From the list of requirements at
>
>     https://www.w3.org/2014/data-shapes/wiki/Proposals#ISSUE-137_Missing_constraint_for_language_tag
>
>       * In SKOS, there can be only one prefLabel per language tag
>
>     Already exists: sh:uniqueLang true
>
>       * Constrain the valid language tags to a provided set, e.g.
>         (@en, @de, @fr)
>
>     See my email, sh:langShape [ sh:in ( "en" "de" "fr" ) ]
>
>       * Require that all literals have/do not have a language tag
>
>     Already exists: sh:datatype rdf:langString
>
>       * Require that a particular property have a set of literals, one
>         each language tag, e.g. "there must be 3 instances of
>         dct:abstract; the values must be literals; there must be one
>         literal for each valid language code (@en, @de, @fr)"
>
>     Can be expressed through a combination of sh:minCount = 3,
>     sh:maxCount = 3, sh:uniqueLang. (What are "instances of
>     dct:abstract"?)
>
>       * Check that the language tag is 2-letter | 3-letter | does/does
>         not have hyphens
>
>     sh:langShape [ sh:minLength 2 ; sh:maxLength 2 ; or: sh:pattern
>     "... regex ..." ]
>
>       * Check that the 2 or 3-letter tag is valid
>
>
>     Assuming that the list of valid tags is stored somewhere, e.g. in
>     an rdf:List iso:ValidLanguages:
>
>     sh:langShape [ sh:in iso:ValidLanguages ]
>
>     I don't think maintaining such a list ourselves is within the
>     scope of the WG, yet it could be expressed in the Core language.
>
>
>     PROPOSAL: Add sh:langShape as outlined. Meaning: if a value node
>     has a language tag then the string of the language tag itself
>     needs to have the given sh:Shape.
>
>
>     Holger
>
>
>
>
> -- 
> Dimitris Kontokostas
> Department of Computer Science, University of Leipzig & DBpedia 
> Association
> Projects: http://dbpedia.org, http://rdfunit.aksw.org, 
> http://aligned-project.eu
> Homepage: http://aksw.org/DimitrisKontokostas
> Research Group: AKSW/KILT http://aksw.org/Groups/KILT
>

Received on Sunday, 11 September 2016 23:30:34 UTC