See also: IRC log
<phila> scribe: phila
<scribe> scribeNick: phila
betehess: We've released a lot of
schema.org, planning to release more.
... have a few things we'd like to see in schema.org
danbri: what about W3C infrastructure etc.
betehess: I don't know what it means to operate with W3C in this case.
danbri: It seemed a shame not to
get together. We used to have very energetic hacky
meetings.
... Over time SWIG stopped meeting at TPAC... fizzled out into
a bunch of mailing lists.
... schema.org has had a weird relationship with W3C. We use a
W3C mailing list, now a CG
... series of conversations with W3C what it might do in this
space.
... schema.org has 400 open issues. I'd really like to fix some
of those if we can.
... Maybe given who's here we should talk about process
issues.
jtandy: Introduces self.
... I want to surface my data on the web. If it's not in a
search engine, it's not really on the Web. In my/geospatial
community, people publish through Web services (WFS etc.)
... These aren't indexed. I'd like that data to be
accessed.
... There are shortfalls in schema.org
... Also want to understand what users have to do...
... I'm not doing it to create structured data on the
web.
... I personally don't believe we're in a situation where
machines can automagically infer things found on the Web.
... It's about consistent naming etc.
... I've worked in the WMO to publish a lot of their controlled
vocabulary. I've been using SKOS-based registry
-> http://codes.wmo.int WMO Codes Registry
jtandy: I have a weather schema, can we try and unpack the black art of data description, data feeds
<Ralph> PhilA: I'm interested particularly in what W3C should do to be better at [vocab support]
<Ralph> ... in hindsight it was a bad decision for W3C to not get involved in supporting big vocabulary development
<jtandy> [ weather schema open issue: https://github.com/schemaorg/schemaorg/issues/362 ]
<Ralph> ... the REC Track process doesn't suit many vocabs
<Ralph> ... there's lack of understanding of the difference between the specification and the namespace document
<Ralph> ... we had discussions this week on some namespace questions
fsasaki: I've been in
multilingual area at W3C. I'm a fellow of DFKI language tech
centre
... I want to use schema.org for cross lingual access
<Ralph> PhilA: Doug Schepers has a demo consisting of an SVG document containing structured data
<fsasaki> (and for using structured data that is off the web for generation schema.org on the web)
fsasaki: In weather reports, these are often auto-generated from data that is not on the Web. What's generated is just text. The text is already structured
<Ralph> ... the structured data allows for a screen reader to provide an incredibly rich description of the data in the file
phila: Talks about Doug Schepers' accessibility SVG demo
Ralph: I'd like to see how the CG
can/is working with other bits of W3C. If there are unmet
needs in terms of process. We'd like to meet those unmet
needs.
... I think the schema.org experience can be replicated for
others. Tooling, process etc.
... I'd like to understand what we can do.
Francesco: Working in Florence
research institute. Want to know more about how schema.org
works
... Where I work, about 18 months ago we were thinking of
adopting schema.org to integrate all the research data in our
various databases.
danbri: Do you find yourself developing schemas?
Francesco: No. We couldn't find a
lot of support and couldn't see the benefit of doing this on
our site.
... I'm here as a user more than a contributor.
danbri: schema.org grew quickly
because for some people there were clear benefits. It may be
being used deep within Google as well.
... In the context of W3C, one of our roles ... if you publish
it, we think people on your side should be able to use it. If a
widget in your browser tells you about what's there you'd see
problems.
Francesco: People are interested
in schema.org because it's an SEO booster. But what are the
RESTful architecture advantages?
... We're still very interested.
danbri: We've struggled to get people to adopt this for 15 years, we have components...
ericP: How do you optimise the
balance between chaos and process to make sure that the output
is useful.
... The more process, the more restrictions, but the less you
get the more chaos you have.
... There are two different poles and people tend to cluster
around one or the other.
fabgandon: I'm a leader of a
research team at INRIA, working since 1999. So as part of that
we have a stable namespace server. We publish all those with LD
principles.
... Multilingual schemas
... Should multilingualism be important? We annotate, we
publish in LOV. We publish the French chapter of dbpedia.
The latest is the whole history of the French dbpedia released
as LD.
... We are interested in every step of the lifecycle.
danbri: How do you feel about W3C's role? What should happen?
fabgandon: My initial perception is that W3C shouldn't be involved in vocabs not related to the Web. LDP clearly is, for example.
danbri: Hosting of namespaces?
fabgandon: Hmm... I don't want it
to be a bottleneck.
... Can be tricky in terms of governance.
ericP: You mean process bottleneck or tech?
fabgandon: I wouldn't want all the namespaces hosted at one place.
<Zakim> jtandy, you wanted to ask what "fully compliant" means
jtandy: When I look at schema.org, I don't imagine making my websites fully compliant. Is it a little smattering? Is it a full thing?
Francesco: Standards help with
accessibility, integration etc.
... But it's useful to have under control, what's the
structured data in a university?
<Zakim> Ralph, you wanted to comment on decentralized namespace
Ralph: Phil clarified one part of
Fabien's point - we're not seeking to centralise
namespaces
... Recent experience suggests that developers don't find the
proliferation of namespaces helpful
... So the attraction of schema.org is that it's one place.
fabgandon: If the issue is namespaces. OGP didn't want the webmasters to have to add 10 namespaces to their pages
<jtandy> http://www.w3.org/2011/rdfa-context/rdfa-1.1
Ralph: The context file in JSON-LD is meant to help that.
danbri: A lot of this dates back
to 1997 when DC held their first workshop.
... They saw themselves as one of a future many metadata
schemas.
... When we did RDFS, we didn't need people to separate out
different aspects. We haven't got a social process to match the
distributed nature.
... JSON-LD context files are meant to be easy for
webmasters.
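The point about context files can be sketched as follows. This is a minimal illustration, assuming a page that embeds JSON-LD; the person name and URL are invented for the example, and the helper `to_script_tag` is hypothetical. It shows one `@context` reference standing in for a list of namespace declarations.

```python
import json

# Hypothetical page data; the name and URL values are invented for illustration.
doc = {
    "@context": "http://schema.org",  # one context reference instead of many prefixes
    "@type": "Person",
    "name": "Example Person",
    "url": "http://example.org/people/1",
}


def to_script_tag(data: dict) -> str:
    """Serialise a JSON-LD document as a <script> element for embedding in HTML."""
    body = json.dumps(data, indent=2, sort_keys=True)
    return '<script type="application/ld+json">\n' + body + "\n</script>"


html_snippet = to_script_tag(doc)
print(html_snippet)
```

The webmaster only has to know JSON and the term names; the mapping from `name` to a full IRI lives in the context file.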
betehess: If you want to put
schema.org on your website you have to think about what you're
trying to do.
... It's important to bear in mind validation. You have to use
Google's tools, follow their assumptions etc.
... Adoption is always a problem. There are many people who know
nothing about Sem Web so schema.org helps in that
education.
... The use of JSON-LD by Google has got several new people
looking at JSON-LD
... They know JSON, so they're already comfortable.
... [Something about OG]
... The perception of og is ... FB only uses a fraction of
it.
danbri: People don't use schema.org because it looks good but because it's useful.
betehess: It's becoming a way for our website to talk to others.
danbri: So people are consuming your schema.org?
betehess: Yes
danbri: That would be good to
document.
... It's an unfortunate pressure on schema.org that people see
it as a Google-only thing. Bing, Yandex and Yahoo are there
too.
<Zakim> jtandy, you wanted to ask how to get schema.org to play nicely with other vocabs
jtandy: How do you get schema.org
to play nicely with other vocabs
... The big one people want to use... we're happy making SKOS
concept schemes - it might be useful to create a SKOS-like
thing in schema.org
danbri: We did SKOS on the back
of SWAD Europe, it had much the same purpose as schema.org.
Structured data for people who don't think in triples
... What we've done well in RDF is establish SKOS as widely
deployed
... In schema.org we have a thing on job postings. The
definition includes links to various other things including
some spreadsheets that look a lot like SKOS.
... It would be good if that data were exposed in a more
accessible way. schema.org is not the place to define
a hierarchy of restaurant cuisines.
ericP: We don't want to be a
central point of evil, let alone a central point of
failure.
... W3C has various failure avoidance mechanisms. We're a
pretty safe place to do stuff.
... If you want a central place, we're a reasonably good place.
But there's no complacency
... Typical question - is it rdf:Class or rdfs:Class? The more
we have a single namespace the better.
... We can also improve tooling to help of course.
... Working on FHIR, people like schema.org so maybe we can
share health records using schema.org markup
... Few people want to share health records online
... issues around namespaces
danbri: Much as I love the RDF
community... it's just a name for a technology... Semantic Web
rebranding brought in some new people... then we had another
rebranding for a subset.
... So we ended up with two communities at two poles.
... Both ends naive. LD thought you could find production grade
data on the open web.
... We're never going to get to the stage where you just query
a SPARQL endpoint and use it.
[Discussion of clean data, DBPedia, pharma resources]
danbri: BBC teams found DBPedia useful but it drifts and can break. You need to keep track
[Demo from Felix]
<fsasaki> http://fsasaki.github.io/stuff/tekom2016/
[Discussion]
fsasaki: Take a term like snow in
EN, schnee in DE
... They all have a certain meaning
... many cultures have lots of meaning for a term
... At a high level they refer to the same concept of
snow
... want a language-agnostic level
... Concept, meaning, expression is the hierarchy
danbri: Big discussion with the
i18n group about RDF in the old days
... Where is the RDF world now with JSON?
... I just joined the WPWG to try and clean up microdata
fsasaki: Whatever you do you're
screwed. You have the option of a separate field for a string
and its language
... Currently discussing Activity Streams, Web Annotations...
OK, some people say do the separate field solution. Better in
JSON-LD. All have their own ways to interpret
... Spans with language info don't work in JSON
... People want some control characters inside the string. It's
a no-go
... Another thing is directionality
... No one is happy...
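The "separate field for a string and its language" option can be sketched as below. This is an illustration only, assuming JSON-LD value objects (`@value` / `@language`); the document contents and the helper `label_for` are invented for the example.

```python
# Plain JSON: the language is lost unless some convention is invented for it.
plain = {"name": "Schnee"}

# JSON-LD value objects keep the string and its language in separate fields.
tagged = {
    "@context": "http://schema.org",
    "@type": "Thing",
    "name": [
        {"@value": "snow", "@language": "en"},
        {"@value": "Schnee", "@language": "de"},
    ],
}


def label_for(doc: dict, prop: str, lang: str):
    """Return the value of `prop` tagged with `lang`, or None if there is none."""
    values = doc.get(prop, [])
    if not isinstance(values, list):
        values = [values]
    for value in values:
        if isinstance(value, dict) and value.get("@language") == lang:
            return value.get("@value")
    return None


print(label_for(tagged, "name", "de"))  # Schnee
print(label_for(plain, "name", "de"))   # None
```

The sketch tags whole strings only. It does not mark a span inside a string and says nothing about directionality, which are the gaps raised in the discussion.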
danbri: Microdata is OK but when you go to JSON-LD it breaks
fsasaki: The micropub spec builds on microformats which is i18n bad
<fsasaki> [background on the json / i18n metadata issue, see this document: http://w3c.github.io/i18n-discuss/notes/json-bidi.html ]
newton: Would like to talk about translations
VeraMeister: Introduces topic of
CMS adding structured data automatically. Not all concepts are
available in schema.org.
... Course and Course Instance are in pending
danbri: Do you like the definitions?
VeraMeister: Yes. It makes sense.
It's a kind of thinking. We also use the concept of
CreativeWork
... A year ago there was a request for an education CG, but her
way to think is more commercial. We're more concerned with
organisational side of universities.
danbri: That led to a separate CG looking at courses etc. We think it's finished and I expect it to be in the next release.
<jtandy> scribe: Jeremy Tandy
<jtandy> scribenick: jtandy
betehess: there are a number of
issues around tooling
... when putting the data out there
... we need to validate
... we need tools to do the validation
... same thing with facebook validator
... there was no good way of doing this
... so we started developing our own tools
... using schema [schema.org] validation
... it's all done "manually" - which is to say that we write
bespoke code to do this
... just like the W3C validator, the tool needs to tell you
what's wrong
... if the community was to use SHACL or [SHEX?]
... to describe the rules, then [we wouldn't need to start from
scratch]
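The difference between bespoke validation code and rules written down as data can be sketched as follows. This is not SHACL or ShEx, only a plain-Python stand-in for the idea; the rule set `ARTICLE_RULES`, the property choices, and the sample node are all hypothetical.

```python
# Hypothetical rule set for one consumer; the property choices are illustrative only.
ARTICLE_RULES = {
    "headline": {"required": True, "type": str},
    "datePublished": {"required": True, "type": str},
    "image": {"required": False, "type": str},
}


def validate(node: dict, rules: dict) -> list:
    """Check a JSON-LD node against declarative rules and return readable messages."""
    problems = []
    for prop, rule in rules.items():
        if prop not in node:
            if rule["required"]:
                problems.append(f"missing required property '{prop}'")
            continue
        if not isinstance(node[prop], rule["type"]):
            problems.append(
                f"property '{prop}' should be {rule['type'].__name__}, "
                f"got {type(node[prop]).__name__}"
            )
    return problems


node = {"@type": "NewsArticle", "headline": "Example", "image": 42}
for message in validate(node, ARTICLE_RULES):
    print(message)
```

Because the rules are data, a second consumer with different requirements needs a second rule set rather than a second validator, which is the saving a shared shapes language would offer.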
danbri: we phrased this in terms of validation - but also we need to think about meeting the needs of particular consumers
betehess: agrees - [different outcomes require use of different schema.org terms]
phila: I also saw this when I was
doing some SEO stuff
... all was "CreativeWork" - but I wanted subtypes
... (but I was too tired to figure that out at 3AM)
... even in your own system, you [implicitly] use profiles
danbri: the structured data
testing tool from Google does several things
... it will check syntax
... then it will look up the latest version of schema.org and
try to validate against that
... errors are reported like "red ink on the page"
... it's intimidating
... we need to move toward saying that "you've passed the basic
tests"
... we have triples coming through
... even if they won't fit into Google tool x & y
phila: that's what SHACL is for -
ericP: another question is how do
I find what systems / applications can consume this data
... if you're already marking your data up, it would be useful
to say what systems / components you know can use the data
betehess: [@@]
ericP: you wouldn't add extra triples [to describe]
betehess: but we don't really know how people are using data
phila: and we never will
know
... I tend only to use microdata because that's the only format
universally used
betehess: things have moved on
danbri: yandex has json validator
... most read RDFa now
... but I wrote a page that describes how Google uses the
structured data
phila: it would be useful to
provide SHACL rules that describe the profiles used by each of
the services (Bing, Yandex, Google search) and other
tools
...
danbri: we do something close to
SHACL when we submit to the schema.org repo
... mostly it's SPARQL queries - these aim to ensure that
what's in the repo is well structured
... some of these tests are about policies for managing terms;
e.g. inverse properties being redundantly asserted
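A policy check of this kind can be sketched in plain Python over a set of triples. This is not the project's actual test suite, which the minutes say is mostly SPARQL; the sample triples and the helper `missing_inverses` are assumptions made for illustration. The check flags any inverseOf statement whose mirror-image statement is absent.

```python
INVERSE_OF = "http://schema.org/inverseOf"

# Hypothetical sample of (subject, predicate, object) triples.
triples = {
    ("http://schema.org/hasPart", INVERSE_OF, "http://schema.org/isPartOf"),
    ("http://schema.org/isPartOf", INVERSE_OF, "http://schema.org/hasPart"),
    ("http://schema.org/alumni", INVERSE_OF, "http://schema.org/alumniOf"),
}


def missing_inverses(graph: set) -> list:
    """List the mirror-image inverseOf statements that are absent from the graph."""
    missing = []
    for s, p, o in sorted(graph):
        if p == INVERSE_OF and (o, INVERSE_OF, s) not in graph:
            missing.append((o, INVERSE_OF, s))
    return missing


for triple in missing_inverses(triples):
    print("missing:", triple)
```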
phila: on that sanity checking,
how much effort would it be to add "stable", "testing" etc.
categories to each term in schema.org?
... marking terms as "stable" would give confidence to user
danbri: this is pretty much doable ... testing and [@@] are "pending" ... stable is stable ... everything else is on the spectrum
betehess: it's useful to see the usage numbers for each term - the more people that use it, the more stable the term
phila: so long as the 3M websites used the term correctly
danbri: talks about some new
features being prepped for US election; it's only on a few
sites
... we want to avoid objective rules like "it's only stable if
it's on 1000+ sites"
... we don't delete things
... we've just introduced the schema.org "attic" which is where
we can hide stuff we don't use anymore
... we're reticent to say that we won't ever change things any
more
... [talks about changes to schema:Person]
... it would be useful to collect evidence of the use of terms
in formal settings
phila: we're looking for
mechanisms to say "this term is stable"
... ericp was suggesting that if a term is used in a REC it
must be stable
... this is quite a good idea
RESOLUTION: This is not a formal resolution but it creates a link in the page:
<phila> If we let schema.org know that a standard references a term, that term can refer back to the standard. That makes it harder/less likely that schema.org will make changes
[and this is a way of asserting the stability of a term]
fabien: you might want to amend
the definitions of terms to reflect the way that people
actually use the terms
... we do this in the LOV community (?)
<fabgandon> ... http://lov.okfn.org/
vera: question - can someone explain [@@]
danbri: it came out of the
experience of matching tidy RDFa into scruffy HTML /
WHATWG
... the microdata "fork" of RDFa made a lot of concessions to
simple publication
... but at the cost to machine readability
... schema.org tries to re-use terms in many places
> domainIncludes & rangeIncludes ... is a little looser than the "neat and tidy" OWL
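The looser reading of domainIncludes can be sketched as an advisory lookup. The tables below are a hypothetical, simplified excerpt (one parent per type), not the full schema.org hierarchy, and the helper names are invented for the example.

```python
# Hypothetical excerpt of schema.org-style definitions; not the full vocabulary.
DOMAIN_INCLUDES = {
    "servesCuisine": {"FoodEstablishment"},
    "recipeCuisine": {"Recipe"},
    "name": {"Thing"},
}

# Simplified: each type is given a single parent here.
SUPERTYPES = {
    "Restaurant": "FoodEstablishment",
    "FoodEstablishment": "LocalBusiness",
    "LocalBusiness": "Thing",
    "Recipe": "CreativeWork",
    "CreativeWork": "Thing",
}


def ancestors(type_name: str) -> list:
    """Return the type followed by its chain of supertypes."""
    chain = [type_name]
    while chain[-1] in SUPERTYPES:
        chain.append(SUPERTYPES[chain[-1]])
    return chain


def is_expected(prop: str, type_name: str) -> bool:
    """True if `prop` lists `type_name` or one of its supertypes in domainIncludes."""
    allowed = DOMAIN_INCLUDES.get(prop, set())
    return any(t in allowed for t in ancestors(type_name))


print(is_expected("servesCuisine", "Restaurant"))  # True
print(is_expected("servesCuisine", "Recipe"))      # False
```

A `False` here is a hint to the publisher and nothing more. Under rdfs:domain the same usage would instead entail that the subject belongs to the domain class, which is the stricter behaviour schema.org chose to avoid.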
betehess: back on subject about
SHACL
... schema.org reflects what people are actually doing
... but many people don't refer to the text
<fabgandon> paper "Analyzing Schema.org" http://iswc2014.semanticweb.org/raw.githubusercontent.com/lidingpku/iswc2014/master/paper/87960257-analyzing-schemaorg.pdf?raw=true
betehess: SHACL could be used to show how you are _using_ the ontology in a given context
danbri: suspects that SHACL and
SHEX could be part of the critical infrastructure in the next
couple of years
... schema.org retains a lot of flexibility
<phila> schema.org Data Model
<betehess> [[ We also expect that often, where we expect a property value of type Person, Place, Organization or some other subClassOf Thing, we will get a text string, even if our schemas don't formally document that expectation. In the spirit of "some data is better than none", search engines will often accept this markup and do the best we can. ]]
danbri: we don't promise that these types go with these properties - this might evolve over time
[ericP does the adapter dance ... HDMI to VGA ... sigh]
betehess: shares his screen
<betehess> https://www.apple.com/newsroom/2016/09/highlights-from-apple-music-festival-10.html
betehess: looking at the source
for the above page
... line 1324
... if we'd had a "shape" for this block of code, then we could
have asserted some additional rules about properties like
datePublished, dateModified
... we have additional rules about how these terms are used
based on our policies
ericP: notes that you are using
schema.org's JSON-LD context
... does that cause problems
betehess: no - schema.org's
context does what we need here
... including "width" and "height"
ericP: so you're saying all the properties are in the @context - but Apple only wanted to constrain the use of those properties?
betehess: yes
danbri: ecosystem question
... for a long time we didn't use the @context
... we = Google BTW
... but we do now
... we still don't use other people's contexts
... would people here like to see Google consuming multiple
contexts?
... everything [more or less] gets converted into triples;
right now we only support the schema.org context - but we
aspire to do more
ericP: does something go wrong in Google's parsing if more contexts are used?
danbri: there are two cases
... i) people override the schema.org context with some extra
stuff e.g. facebook.schema.org (??)
ii) people referencing multiple contexts ... see https://github.com/schemaorg/schemaorg/issues/1186
newton: I would like to use schema:Person and foaf:Person - how does Google decide which one to use
danbri: Google made a decision a while ago, we used to use things from multiple namespaces ... but now we try to use just one "big" namespace
betehess: so what next?
ericP: do you want a tutorial?
<newton> +1
> folks seem to be happy for a 10-minute aside on SHACL
fabgandon: my research team have a validator and would love to work with interesting use cases ... like these
<betehess> betehess: is it an implementation of SHACL? or ShEx? or custom thing?
fabgandon: but [there's lots of moving parts]
phila: hopefully the SHAPEs stuff is moving to just one spec
ericP: does his data-shapes/SHACL tutorial ... [not minuted]
Ed Draft spec is at https://w3c.github.io/data-shapes/shacl/
and https://w3c.github.io/data-shapes/shacl-abstract-syntax/
<phila> schema.org's context file
<phila> danbri: This file is very big
<phila> issue 1186
https://github.com/schemaorg/schemaorg/issues/1186
<ericP> schema.org JSON-LD
<phila> danbri: We wanted to be able to say for each property, whether it expects strings or URIs
<phila> ... http://gs1.org/voc/cheeseFirmness
<phila> danbri: They have lots of terms...
<phila> ... JSON-LD contexts allow us to flatten it all down to one file
<phila> ... Found that using two simple contexts leads to mistakes/ambiguities
<phila> ... Because you don't know which namespace Person or Product came from
<phila> ... Change the order of the context files and you get different triples
<phila> ... Same discussion around XML 10 years ago
<phila> ... People were parsing RDF/XML files with XSLT etc.
<phila> ... Web devs won't parse RDF/XML and handle triples
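The ordering problem with multiple contexts can be sketched with a much-simplified model of context processing, in which a later context overrides an earlier one for the same term. The two contexts below are hypothetical and deliberately tiny; real JSON-LD processing does considerably more.

```python
# Two hypothetical, deliberately simple contexts that both define "Person".
CONTEXT_A = {"Person": "http://schema.org/Person", "name": "http://schema.org/name"}
CONTEXT_B = {"Person": "http://xmlns.com/foaf/0.1/Person"}


def merge_contexts(contexts: list) -> dict:
    """Combine term definitions in order; a later context overrides an earlier one."""
    merged = {}
    for ctx in contexts:
        merged.update(ctx)
    return merged


def expand_type(node: dict, contexts: list) -> str:
    """Resolve the node's @type to a full IRI using the merged contexts."""
    return merge_contexts(contexts).get(node["@type"], node["@type"])


node = {"@type": "Person", "name": "Example Person"}
print(expand_type(node, [CONTEXT_A, CONTEXT_B]))  # http://xmlns.com/foaf/0.1/Person
print(expand_type(node, [CONTEXT_B, CONTEXT_A]))  # http://schema.org/Person
```

The data is identical in both calls; only the order of the contexts changes which Person the resulting triple refers to.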
<phila> ericP: Do devs care about any of this?
<phila> danbri: That's the problem.
<phila> ... We want to decentralise. To plug in GS1 and wikidata, we have to make our context file a lot bigger.
<phila> ... Talks about Foreign Fetch
<phila> ... Can be used to access local copy held in your service worker, rather than having to get the original. It's cache plus logic. It can work across multiple sites.
<phila> [Lunch]
<ericP> schema.org JSON-LD
<danbri> see also https://github.com/schemaorg/schemaorg/issues/894 from Richard Wallis
<danbri> http://pending.schema.org/partOfEnumerationValueSet
<phila> [Unscribed session looking at schema.org issues]
<phila> https://github.com/schemaorg/schemaorg/issues/894
<danbri> <http://schema.org/codeValue> <http://www.w3.org/2000/01/rdf-schema#comment> "The actual code." <http://health-lifesci.schema.org/#3.2> .
<danbri> <http://schema.org/code> <http://www.w3.org/2000/01/rdf-schema#label> "code" <http://health-lifesci.schema.org/#3.2> .
<danbri> Consider http://health-lifesci.webschemas.org/code http://health-lifesci.webschemas.org/codeValue
<danbri> health-lifesci.schema.org
<danbri> https://twitter.com/danbri/status/763391811603861505
<phila> We think that RJW's proposal can be handled using existing schema.org terms. Make code into MedicalCode, make code a super property
<phila> ... then use codeValue and codingSystem
<danbri> phila, can you click on http://pending.webschemas.org/identifier ?
<danbri> on http://schema.org/JobPosting http://schema.org/occupationalCategory
<danbri> http://schema.org/director
<danbri> http://schema.org/servesCuisine
<danbri> http://schema.org/recipeCuisine
<danbri> بيتزا
<danbri> https://www.w3.org/wiki/WebSchemas/Accessibility
<danbri> via http://schema.org/CreativeWork -> http://schema.org/accessibilityFeature
<danbri> https://github.com/schemaorg/schemaorg/wiki/Issue-Reorg
<danbri> https://github.com/schemaorg/schemaorg/1
<danbri> https://github.com/schemaorg/schemaorg/issues/1
<phila> [General Discussion about schema.org]
<phila> ADJOURNED - that's all from Lisbon