ISSUE-35: Named entity syntax could use microdata and RDFa in HTML5, and a dedicated syntax in XML only

named-entity-syntax-could-use-rdfa-and-microdata


State:
CLOSED
Product:
MLW-LT Standard Draft
Raised by:
Felix Sasaki
Opened on:
2012-07-05
Description:
Hi Tadej esp. and all,

Today I looked at some automatic annotation output that I got from Michael. It was created by Enrycher. As I understand it, this is experimental, but I wanted to bring one aspect to your attention. The example below is simplified:

Input:

<p>After a century of near domination from the likes of Italy and Germany, international soccer is entering the era of the Cinderella. Russia's Yuri Zhirkov ...</p>

Output:

<p>After a century of near domination from the likes of <span itsx-lexicalizes="dbr:Italy" itsx-entity-type="http://schema.org/Place">Italy</span> and <span itsx-lexicalizes="dbr:Germany" itsx-entity-type="http://schema.org/Place">Germany</span>, international soccer is entering the era of the Cinderella. Russia's <span itsx-lexicalizes="dbr:Yuri_Zhirkov" itsx-entity-type="http://schema.org/Person">Yuri Zhirkov</span> ...</p>


It strikes me that we are probably re-inventing the wheel: large parts of the web community are now heading towards RDFa (Lite) and microdata for named entities, while we are inventing a new syntax.

So I am wondering whether we shouldn't just describe a best practice to create something like this out of an automatic annotation process:

<p>After a century of near domination from the likes of <span itemscope='' itemtype="http://schema.org/Place" itemprop="name">Italy</span> and <span itemscope='' itemtype="http://schema.org/Place" itemprop="name">Germany</span>, international soccer is entering the era of the Cinderella. Russia's <span itemscope='' itemtype="http://schema.org/Person" itemprop="name">Yuri Zhirkov</span> ...</p>
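The step from the itsx-* output to the microdata form can be sketched as a small post-processing pass. This is only an illustration, not part of Enrycher or of any ITS tooling: the helper name itsx_to_microdata is hypothetical, the input is a shortened copy of the sample above, and it assumes the annotated fragment is well-formed so that an XML parser can read it. The itsx-lexicalizes reference is dropped because the microdata form proposed above does not carry it.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the simplified Enrycher-style output from this issue.
ANNOTATED = (
    '<p>After a century of near domination from the likes of '
    '<span itsx-lexicalizes="dbr:Italy" '
    'itsx-entity-type="http://schema.org/Place">Italy</span> and '
    '<span itsx-lexicalizes="dbr:Germany" '
    'itsx-entity-type="http://schema.org/Place">Germany</span>, '
    'international soccer ...</p>'
)


def itsx_to_microdata(root):
    """Hypothetical helper: rewrite itsx-* attributes as microdata, in place."""
    for el in root.iter():
        entity_type = el.attrib.pop("itsx-entity-type", None)
        if entity_type is None:
            continue  # element carries no entity annotation
        # The microdata form proposed in this issue omits the entity reference.
        el.attrib.pop("itsx-lexicalizes", None)
        el.set("itemscope", "")
        el.set("itemtype", entity_type)
        el.set("itemprop", "name")
    return root


root = itsx_to_microdata(ET.fromstring(ANNOTATED))
print(ET.tostring(root, encoding="unicode"))
```

A real page would need an HTML parser rather than an XML one; the XML parser is used here only because the sample fragment happens to be well-formed.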


For this, we can already expect uptake from search engines, and lots of tools: http://schema.rdfs.org/tools.html

I still see a use case for a dedicated "named entity" data category, but rather in a localization chain and in XML, in a workflow like this:

1) HTML is enriched with the microdata result described above, or its RDFa Lite 1.1 counterpart.

2) We specify dedicated local markup for entities only in XML, e.g. its:entityType

3) To "glue" 1) and 2) together, we then have a mapping rule like

<its:namedEntityRule selector="//*[@itemtype]" entityTypePointer="@itemtype"/>
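How such a rule might be evaluated can be sketched as follows. This is a rough illustration under stated assumptions, not an ITS processor: apply_entity_rule is a hypothetical helper, the pointer is reduced to a plain attribute name, and Python's ElementTree only supports a subset of XPath, so the selector "//*[@itemtype]" is written in its relative form ".//*[@itemtype]".

```python
import xml.etree.ElementTree as ET

# Microdata-enriched fragment from step 1), shortened.
HTML = (
    '<p>After a century of near domination from the likes of '
    '<span itemscope="" itemtype="http://schema.org/Place" '
    'itemprop="name">Italy</span> and '
    '<span itemscope="" itemtype="http://schema.org/Place" '
    'itemprop="name">Germany</span>, international soccer ... Russia\'s '
    '<span itemscope="" itemtype="http://schema.org/Person" '
    'itemprop="name">Yuri Zhirkov</span> ...</p>'
)


def apply_entity_rule(root, selector, pointer_attr):
    """Hypothetical helper: return (text, entity type) for each selected node.

    selector     -- plays the role of the rule's selector attribute
    pointer_attr -- plays the role of entityTypePointer, reduced to an
                    attribute name on the selected node
    """
    return [(el.text, el.get(pointer_attr)) for el in root.findall(selector)]


root = ET.fromstring(HTML)
entities = apply_entity_rule(root, ".//*[@itemtype]", "itemtype")
for text, entity_type in entities:
    print(f"{text}: {entity_type}")
```

The point of the design is that the entity type is never copied into separate ITS markup in the HTML; the rule only points at the microdata attribute that is already there.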

No. 1) would also help us with our charter issue, btw.

This approach would also relate to

ISSUE-2 microdata mapping, since we won't need to map named entities to microdata and RDFa: they would be available in these formats from the beginning.
ISSUE-18 dropping RDFa, since we won't drop it but actually use it, at least RDFa Lite 1.1.
ISSUE-29 ITS and RDF, since we do

Thoughts?
Related Actions Items:
No related actions
Related emails:
  1. Feedback (Re: [all] Call for consensus on disambiguation [ACTION-181]) (from fsasaki@w3.org on 2012-09-06)
  2. Re: [all] Call for consensus on disambiguation - feedback integrated [ACTION-181] (from hellmann@informatik.uni-leipzig.de on 2012-08-09)
  3. Re: [all] Call for consensus on disambiguation - feedback integrated [ACTION-181] (from fsasaki@w3.org on 2012-08-09)
  4. Re: [all] updated agenda for August 9, 2012, 14:00 UTC (from David.Filip@ul.ie on 2012-08-09)
  5. Re: [all] Call for consensus on disambiguation - feedback integrated [ACTION-181] (from fsasaki@w3.org on 2012-08-09)
  6. Re: [all] Call for consensus on disambiguation - feedback integrated [ACTION-181] (from tadej.stajner@ijs.si on 2012-08-09)
  7. [all] agenda for August 9, 2012, 14:00 UTC (from David.Filip@ul.ie on 2012-08-08)
  8. Re: [all] Call for consensus on disambiguation - feedback integrated [ACTION-181] (from fsasaki@w3.org on 2012-08-03)
  9. Re: [all] Call for consensus on disambiguation - feedback integrated [ACTION-181] (from tadej.stajner@ijs.si on 2012-08-03)
  10. Re: [all] Call for consensus on Entity - feedback integrated [ACTION-181] (from tadej.stajner@ijs.si on 2012-08-02)
  11. Re: [all] Call for consensus on Entity - feedback integrated [ISSUE-181] (from tadej.stajner@ijs.si on 2012-08-02)
  12. [all] Important Update to Agenda Re: AGENDA MLW-LT Call 2 August, 2 p.m. UTC (from David.Filip@ul.ie on 2012-08-02)
  13. AGENDA MLW-LT Call 2 August, 2 p.m. UTC (from David.Filip@ul.ie on 2012-07-31)
  14. [all] Call for consensus on Entity (from tadej.stajner@ijs.si on 2012-07-26)
  15. Re: mlw-lt-track-ISSUE-35 (named-entity-syntax-could-use-rdfa-and-microdata): Named entity syntax could use microdata and RDFa in HTML5, and a dedicated syntax in XML only [MLW-LT Standard Draft] (from tadej.stajner@ijs.si on 2012-07-10)
  16. mlw-lt-track-ISSUE-35 (named-entity-syntax-could-use-rdfa-and-microdata): Named entity syntax could use microdata and RDFa in HTML5, and a dedicated syntax in XML only [MLW-LT Standard Draft] (from sysbot+tracker@w3.org on 2012-07-05)

Related notes:

[fsasaki_]: http://www.w3.org/2012/07/19-mlw-lt-minutes.html

24 Jul 2012, 15:23:33

[fsasaki]: closed by proposal at http://lists.w3.org/Archives/Public/public-multilingualweb-lt/2012Jul/0280.html

27 Jul 2012, 13:35:44


