Wikidata data-quality

Eric Prud'hommeaux <eric@w3.org>

migrant ShEx worker

Overview

query countries

  • complex and idiomatic
  • reflects both
    • geopolitical realities and
    • arbitrary modeling choices

[Figure: SO screenshot]

# List of present-day countries and capital(s)
SELECT DISTINCT ?country ?countryLabel ?capital ?capitalLabel
WHERE
{
    ?country wdt:P31 wd:Q3624078 .
    # not a former country
    FILTER NOT EXISTS {?country wdt:P31 wd:Q3024240}
    # and not an ancient civilisation (needed to exclude ancient Egypt)
    FILTER NOT EXISTS {?country wdt:P31 wd:Q28171280}
    OPTIONAL { ?country wdt:P36 ?capital } .
    SERVICE wikibase:label { bd:serviceParam wikibase:language "en" }
}
ORDER BY ?countryLabel
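
To make the query's filtering logic easier to follow, here is a minimal Python sketch that mimics it over a small in-memory set of triples. The class identifiers are the ones used in the query; the `ex:` items, the `TRIPLES` data, and the function names are hypothetical and exist only for illustration. This is not how the Wikidata Query Service evaluates SPARQL.

```python
# Class identifiers taken from the query above.
SOVEREIGN_STATE = "wd:Q3624078"
FORMER_COUNTRY = "wd:Q3024240"
ANCIENT_CIVILISATION = "wd:Q28171280"

# Toy data: the ex: items are invented for this sketch.
TRIPLES = {
    ("ex:Alpha", "wdt:P31", SOVEREIGN_STATE),
    ("ex:Alpha", "wdt:P36", "ex:AlphaCity"),
    ("ex:Beta", "wdt:P31", SOVEREIGN_STATE),
    ("ex:Beta", "wdt:P31", FORMER_COUNTRY),
    ("ex:Beta", "wdt:P36", "ex:BetaCity"),
    ("ex:Gamma", "wdt:P31", SOVEREIGN_STATE),
    ("ex:Gamma", "wdt:P31", ANCIENT_CIVILISATION),
    ("ex:Delta", "wdt:P31", SOVEREIGN_STATE),  # no capital recorded
}


def objects(triples, subject, predicate):
    """Return the sorted objects of all (subject, predicate, ?o) triples."""
    return sorted(o for s, p, o in triples if s == subject and p == predicate)


def present_day_countries(triples):
    """Mimic the query: sovereign states minus excluded classes, capital optional."""
    candidates = sorted(
        {s for s, p, o in triples if p == "wdt:P31" and o == SOVEREIGN_STATE}
    )
    rows = []
    for country in candidates:
        classes = set(objects(triples, country, "wdt:P31"))
        # The two FILTER NOT EXISTS clauses.
        if classes & {FORMER_COUNTRY, ANCIENT_CIVILISATION}:
            continue
        # OPTIONAL: keep the country even when no capital is stated.
        capitals = objects(triples, country, "wdt:P36") or [None]
        rows.extend((country, capital) for capital in capitals)
    return rows


for row in present_day_countries(TRIPLES):
    print(row)
```

The sketch shows why the query is idiomatic: each exclusion is a modeling choice that has to be known in advance, and an item with several capitals yields one row per capital.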
contributors' Sisyphean task

documentation by example

no central place to establish/review community practice

documentation by definition

ShEx features

  • structural validation
  • executable; verifies
    • instance data
    • schema
  • import and extend other schemas for specialization
IMPORT <E91>

start=@<#FLOSSemulator>
<#FLOSSemulator>
  EXTENDS @<E91#emulator>
  EXTRA p:P31  {
    # copyright license
    p:P275 @<#license> + ;
    # source code repository
    p:P1324 { ps:P1324 IRI } ? ;
    # Framalibre ID
    wdt:P4107 xsd:string ? ;
    # Pro-Linux.de DBApp ID
    wdt:P6665 xsd:string ? ;
    # SWH Release ID
    wdt:P6138 xsd:string ? ;
    …
}

<#license> {
     p:P31 [wd:Q3943414]
}
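
As a rough illustration of what "structural validation" with cardinalities means, the following Python sketch checks a node against a toy shape. It is not a ShEx implementation: it ignores EXTENDS, EXTRA, IMPORT, and shape references. The `FLOSS_SHAPE` table, the `validate` function, and the sample nodes are all hypothetical; only the property identifiers are borrowed from the schema above.

```python
import math

# Toy shape: predicate -> (min count, max count, value test).
# '+' is modeled as 1..inf and '?' as 0..1, as in the schema above.
FLOSS_SHAPE = {
    "p:P275": (1, math.inf, lambda v: isinstance(v, dict)),  # copyright license +
    "p:P1324": (0, 1, lambda v: isinstance(v, dict)),        # source code repository ?
    "wdt:P4107": (0, 1, lambda v: isinstance(v, str)),       # Framalibre ID ?
}


def validate(node, shape):
    """Return a list of human-readable violations; an empty list means conformant."""
    errors = []
    for predicate, (lo, hi, value_ok) in shape.items():
        values = node.get(predicate, [])
        if not lo <= len(values) <= hi:
            errors.append(
                f"{predicate}: expected {lo}..{hi} values, found {len(values)}"
            )
        errors.extend(
            f"{predicate}: unexpected value {v!r}" for v in values if not value_ok(v)
        )
    return errors


# Hypothetical nodes: each maps a predicate to its list of values.
good = {"p:P275": [{"ps:P275": "ex:SomeLicense"}], "wdt:P4107": ["some-id"]}
bad = {"wdt:P4107": ["a", "b"]}  # no license, two Framalibre IDs

print(validate(good, FLOSS_SHAPE))
print(validate(bad, FLOSS_SHAPE))
```

A real ShEx processor additionally resolves imported schemas and reports which triples matched which constraint; the sketch only conveys the counting and value-testing idea.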