Semantic Web Tutorial

Part D3: Semantic Web Application Integration: Travel Tools

DanC, Sandro, and TimBL
$Id: all.htm,v 1.9 2003/05/19 07:55:35 connolly Exp $

Semantic Web Application Integration: Overview

  1. Motivation
  2. Approach, Principles
  3. How-to
  4. Future work

Semantic Web Application Integration: Motivation

stubborn computer cartoon

The bane of my existence is doing things I know the computer could do for me.*

The XML Revolution, Oct 1998 in Nature's Web Matters

Personal Information Disaster: legacy data

Proposed itinerary comes in email from the travel agency like this:

07 APR 03 - MONDAY
AIR AMERICAN AIRLINES FLT:3199 ECONOMY
OPERATED BY AMERICAN AIRLINES
LV KANSAS CITY INTL 940A EQP: MD-80
DEPART: TERMINAL BUILDING B 01HR 36MIN
AR DALLAS FT WORTH 1116A NON-STOP

Personal Information: COTS calendar tools

This is what I want to see, of course:

evo screenshot

AARGH! Do I really have to copy each field by hand?!?!?

Personal Information: reusable data

I want to put LV KANSAS CITY INTL and AR DALLAS FT WORTH on a map:

MCI to YMX and back for Extreme 2002

Semantic Web Application Integration: Approach

Modelling people, places, events for travel tool integration

travel terms

How-To: some integration tasks

After some preparation, we'll...

How-To work with legacy data

07 APR 03 - MONDAY
AIR AMERICAN AIRLINES FLT:3199 ECONOMY
OPERATED BY AMERICAN AIRLINES
LV KANSAS CITY INTL 940A EQP: MD-80
DEPART: TERMINAL BUILDING B 01HR 36MIN
AR DALLAS FT WORTH 1116A NON-STOP

Evangelism or perl + duct tape?

I hope that before too long they'll dump it from their database directly into RDF/XML using some travel industry vocabulary, but

Perl + duct tape... away!

itin.txt -> itin.n3/rdf

Perl + duct tape: extracting statements

AIR AMERICAN AIRLINES FLT:3199 ECONOMY
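	  # ...
	  if(/FLT:\s*(\d+)\s+([A-Z][A-Z]+)?/){
	      # $1 is the flight number, $2 the (optional) class of service
	      my($flightNum, $flightClassName) = ($1, $2);
	      # mint a fresh blank-node label for this flight event
	      $event = genSym("flt$flightNum");
	      makeStatement($event, $tNS . "flightNumber", '', $flightNum);
	  # ...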
	
_:flt3199_3 <http://www.w3.org/2000/10/swap/pim/travelTerms#flightNumber> "3199".
_:AMERICANAIRLINES_4 <http://opencyc.sourceforge.net/daml/cyc.daml#nameOfAgent> "AMERICAN AIRLINES".
_:AMERICANAIRLINES_4 <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://opencyc.sourceforge.net/daml/cyc.daml#AirlineCompany>.
_:flt3199_3 <http://www.w3.org/2000/10/swap/pim/travelTerms#carrier> _:AMERICANAIRLINES_4.
_:ECONOMY_5 <http://www.w3.org/1999/02/22-rdf-syntax-ns#value> "ECONOMY".
_:flt3199_3 <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> _:ECONOMY_5.
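For readers more at home in Python, the same extraction step might look like the sketch below. It mirrors the Perl excerpt: match the FLT: field, mint a blank-node label, and emit an N-Triples line. The names gen_sym and flight_statements are hypothetical stand-ins for the script's genSym and makeStatement, whose real definitions are not shown here, and the counter-based labelling is an assumption read off the sample output.

```python
import re
from itertools import count

T_NS = "http://www.w3.org/2000/10/swap/pim/travelTerms#"

# Assumption: blank-node labels are a hint plus a running number,
# as the sample output (_:flt3199_3) suggests.
_serial = count(1)


def gen_sym(hint):
    """Return a fresh N-Triples blank-node label built from the hint."""
    return "_:%s_%d" % (hint, next(_serial))


def flight_statements(line):
    """Extract the flight number from one itinerary line as N-Triples."""
    match = re.search(r"FLT:\s*(\d+)\s+([A-Z][A-Z]+)?", line)
    if match is None:
        return []  # not a flight line
    flight_num = match.group(1)
    event = gen_sym("flt" + flight_num)
    return ['%s <%sflightNumber> "%s".' % (event, T_NS, flight_num)]


if __name__ == "__main__":
    for triple in flight_statements("AIR AMERICAN AIRLINES FLT:3199 ECONOMY"):
        print(triple)
```

Lines that do not contain a FLT: field simply yield no statements, so the function can be applied to every line of the itinerary.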
	

How-To: pretty-print n-triples to N3 with cwm

python cwm.py itin.n3.nt >itin.n3

yields

    :_gflt3199_3     a :_gECONOMY_5;
         k:endingDate :_gdayMONDAY07_2;
         k:fromLocation <http://www.daml.org/cgi-bin/airport?MCI>;
         k:startingDate :_gdayMONDAY07_2;
         k:toLocation <http://www.daml.org/cgi-bin/airport?DFW>;
         t:arrivalTime "11:16";
         t:carrier :_gAMERICANAIRLINES_4;
         t:departureTime "09:40";
         t:flightNumber "3199" .

    :_gAMERICANAIRLINES_4     a k:AirlineCompany;
         k:nameOfAgent "AMERICAN AIRLINES" .
    
    :_gECONOMY_5     r:value "ECONOMY" .

    :_gdayMONDAY07_2     a k:Monday;
         dt:date "2003-04-07" .
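Note that the literals above are already normalized: 940A in the email became "09:40" and 07 APR 03 became "2003-04-07". The conversion code is not shown in these slides, so the following is only a sketch of how it might be done. It assumes a trailing A or P means AM or PM and that two-digit years fall in the 2000s; normalize_time and normalize_date are hypothetical names.

```python
import re

MONTHS = {
    name: number
    for number, name in enumerate(
        "JAN FEB MAR APR MAY JUN JUL AUG SEP OCT NOV DEC".split(), start=1
    )
}


def normalize_time(text):
    """Turn an itinerary time such as '940A' into 24-hour 'HH:MM'."""
    match = re.fullmatch(r"(\d{1,2})(\d{2})([AP])", text)
    if match is None:
        raise ValueError("unrecognized time: %r" % text)
    hour, minute = int(match.group(1)) % 12, int(match.group(2))
    if match.group(3) == "P":  # assumption: P marks an afternoon time
        hour += 12
    return "%02d:%02d" % (hour, minute)


def normalize_date(text):
    """Turn '07 APR 03' into ISO 8601 'YYYY-MM-DD'."""
    day, month, year = text.split()
    # Assumption: two-digit years belong to the 2000s.
    return "%04d-%02d-%02d" % (2000 + int(year), MONTHS[month], int(day))


if __name__ == "__main__":
    print(normalize_date("07 APR 03"), normalize_time("940A"))
```

Normalizing to ISO 8601 early pays off later: the rules further on compare and concatenate these strings directly.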
    

	

How-to: mixing vocabularies

my($rdfNS) = "http://www.w3.org/1999/02/22-rdf-syntax-ns#";
my($rdfsNS) = "http://www.w3.org/2000/01/rdf-schema#";
my($dcNS) = "http://purl.org/dc/elements/1.1/";
my($kNS) = "http://opencyc.sourceforge.net/daml/cyc.daml#";
my($dtNS) = "http://www.w3.org/2001/XMLSchema#";
my($tNS) = "http://www.w3.org/2000/10/swap/pim/travelTerms#";
my($aNS) = "http://www.daml.org/2001/10/html/airport-ont#";
travel terms
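Mixing vocabularies amounts to nothing more than using full URIs from several namespaces side by side. As a rough illustration, the same namespace table could be kept in a Python dict and used to expand prefixed names. The prefixes are taken from the Perl variable names above; the expand helper is hypothetical.

```python
# Namespace table copied from the Perl declarations above.
NAMESPACES = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
    "dc": "http://purl.org/dc/elements/1.1/",
    "k": "http://opencyc.sourceforge.net/daml/cyc.daml#",
    "dt": "http://www.w3.org/2001/XMLSchema#",
    "t": "http://www.w3.org/2000/10/swap/pim/travelTerms#",
    "a": "http://www.daml.org/2001/10/html/airport-ont#",
}


def expand(qname):
    """Expand a prefixed name such as 't:flightNumber' to a full URI."""
    prefix, _, local = qname.partition(":")
    return NAMESPACES[prefix] + local


if __name__ == "__main__":
    # Terms from three different vocabularies describing one flight.
    for term in ("t:flightNumber", "k:AirlineCompany", "rdf:type"):
        print(expand(term))
```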

Mixing Vocabularies: wasn't that easy?

How-To: Integrate with iCalendar Tools

calendar integration toolchain

How-to: convert between vocabularies

itin2ical.n3 has rules like:


{ :FLT
    k:startingDate [ dt:date :YYMMDD];
    k:endingDate [ dt:date :YYMMDD2];
    t:departureTime :HH_MM;
    k:fromLocation [ :timeZone [ cal:tzid :TZ] ];
    t:arrivalTime :HH_MM2;
    k:toLocation [ :timeZone [ cal:tzid :TZ2] ].
  :DTSTART is str:concatenation of
    (:YYMMDD "T" :HH_MM ":00").
  :DTEND is str:concatenation of
    (:YYMMDD2 "T" :HH_MM2 ":00").

  ( :FLT!log:rawUri "@uri-2-mid.w3.org") str:concatenation :UID. #@@hmm... kludge?
}
 log:implies {
  :FLT a cal:Vevent;
    cal:uid :UID;
    cal:dtstart [ cal:tzid :TZ; cal:dateTime :DTSTART ];
    cal:dtend [ cal:tzid :TZ2; cal:dateTime :DTEND ].
}.
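To make the rule's effect concrete, here is roughly the same transformation written as a plain Python function over a dict. This is only a sketch of what the rule computes, not how cwm evaluates it. The dict keys, the example flight URI, and the America/Chicago time zone identifier are illustrative assumptions.

```python
def flight_to_vevent(flight):
    """Map a flight description onto iCalendar-style event fields."""
    # DTSTART / DTEND: date, a literal "T", the time, then ":00" seconds.
    dtstart = flight["start_date"] + "T" + flight["departure_time"] + ":00"
    dtend = flight["end_date"] + "T" + flight["arrival_time"] + ":00"
    # UID: the flight's URI with a fixed suffix (the kludge noted in the rule).
    uid = flight["uri"] + "@uri-2-mid.w3.org"
    return {
        "uid": uid,
        "dtstart": {"tzid": flight["from_tz"], "dateTime": dtstart},
        "dtend": {"tzid": flight["to_tz"], "dateTime": dtend},
    }


if __name__ == "__main__":
    flight = {
        "uri": "http://example.org/itin#flt3199",  # hypothetical URI
        "start_date": "2003-04-07",
        "end_date": "2003-04-07",
        "departure_time": "09:40",
        "arrival_time": "11:16",
        "from_tz": "America/Chicago",  # assumed tzid for MCI
        "to_tz": "America/Chicago",    # assumed tzid for DFW
    }
    print(flight_to_vevent(flight))
```

The advantage of the N3 rule over code like this is that it is declarative: cwm finds every matching flight in the graph without an explicit loop.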

How-To: export to non-XML syntax

toIcal.py uses the cwm API:

    # ...
    sts = load(addr)

    #...
        for cal in sts.each(pred = RDF.type, obj = ICAL.Vcalendar):
            w("BEGIN:VCALENDAR" + CRLF) #hmm... SAX interface?

            for comp in sts.each(subj = cal, pred = ICAL.component):
                if sts.statementsMatching(RDF.type, comp, ICAL.Vevent):
                    self.exportEvent(sts, comp)
                elif sts.statementsMatching(RDF.type, comp, ICAL.Vtimezone):
                    self.exportTimezone(sts, comp)
        
            w("END:VCALENDAR" + CRLF)
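The excerpt leaves out what exportEvent writes. As a rough, cwm-free sketch of the serialization step, the function below turns a list of event dicts into iCalendar text. The dict layout and the PRODID value are assumptions, and a production exporter would also emit fields such as DTSTAMP and the VTIMEZONE components, which are omitted here.

```python
CRLF = "\r\n"  # iCalendar lines end in CR LF


def ical_datetime(iso):
    """Convert '2003-04-07T09:40:00' to the iCalendar form '20030407T094000'."""
    return iso.replace("-", "").replace(":", "")


def export_calendar(events):
    """Serialize event dicts as a minimal VCALENDAR string."""
    lines = [
        "BEGIN:VCALENDAR",
        "VERSION:2.0",
        "PRODID:-//example//itinerary sketch//EN",  # hypothetical product id
    ]
    for event in events:
        lines.append("BEGIN:VEVENT")
        lines.append("UID:" + event["uid"])
        for name in ("dtstart", "dtend"):
            value = event[name]
            lines.append(
                "%s;TZID=%s:%s"
                % (name.upper(), value["tzid"], ical_datetime(value["dateTime"]))
            )
        lines.append("END:VEVENT")
    lines.append("END:VCALENDAR")
    return CRLF.join(lines) + CRLF


if __name__ == "__main__":
    event = {
        "uid": "flt3199@example.org",  # hypothetical UID
        "dtstart": {"tzid": "America/Chicago", "dateTime": "2003-04-07T09:40:00"},
        "dtend": {"tzid": "America/Chicago", "dateTime": "2003-04-07T11:16:00"},
    }
    print(export_calendar([event]), end="")
```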

How-To: import into iCalendar tool

results:

evo screenshot

How-to: plain text reports

{
  [ k:subEvents [
      k:startingDate [ dt:date :YYYY_MM_DD; a [ k:nameString :DOW ] ];
      t:departureTime :HH_MM;
      t:arrivalTime :HH_MM2;
      k:fromLocation [ apt:iataCode :IATA ];
      k:toLocation [ apt:iataCode :IATA2 ];
      t:carrier [ k:nameOfAgent :CARRIER ];
      t:flightNumber :NUM;
   ]
  ].

  :WHEN is str:concatenation of
   (:YYYY_MM_DD :HH_MM).

  :TXT is str:concatenation of
   (:YYYY_MM_DD " " :HH_MM " - " :HH_MM2 " " :IATA "->" :IATA2 " "
    :DOW " " :CARRIER " #" :NUM "\n").

} log:implies {
  :WHEN log:outputString :TXT
}.

How-to: plain text reports

run cwm like this...

python cwm.py itinBrief.n3 itin3.n3 --think --strings >itin-brief.txt

and out comes...

2003-04-07 09:40 - 11:16 MCI->DFW Monday AMERICAN AIRLINES #3199
2003-04-07 12:03 - 15:49 DFW->MIA Monday AMERICAN AIRLINES #68
2003-04-10 19:12 - 21:32 MIA->ORD Thursday AMERICAN AIRLINES #1477
2003-04-10 22:33 - 23:54 ORD->MCI Thursday AMERICAN AIRLINES #1081
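The combination of the rule and --strings can be mimicked in a few lines of Python: build a (key, text) pair per flight leg, then print the texts ordered by key. This is a sketch of the behaviour, not of cwm's implementation; the dict keys are hypothetical.

```python
def brief_report(legs):
    """Return one report line per leg, ordered by date and departure time."""
    keyed = []
    for leg in legs:
        # The key plays the role of :WHEN, the text the role of :TXT.
        when = leg["date"] + leg["dep"]
        text = "%s %s - %s %s->%s %s %s #%s\n" % (
            leg["date"], leg["dep"], leg["arr"],
            leg["from"], leg["to"],
            leg["dow"], leg["carrier"], leg["num"],
        )
        keyed.append((when, text))
    # Emit the strings sorted by key, like the --strings output above.
    return "".join(text for _, text in sorted(keyed))


if __name__ == "__main__":
    legs = [
        {"date": "2003-04-07", "dep": "12:03", "arr": "15:49", "from": "DFW",
         "to": "MIA", "dow": "Monday", "carrier": "AMERICAN AIRLINES", "num": "68"},
        {"date": "2003-04-07", "dep": "09:40", "arr": "11:16", "from": "MCI",
         "to": "DFW", "dow": "Monday", "carrier": "AMERICAN AIRLINES", "num": "3199"},
    ]
    print(brief_report(legs), end="")
```

Because dates and times are zero-padded ISO strings, plain string ordering of the key is also chronological ordering.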

How-to: integration with mapping tools

tool chain:

map viz toolchain

How-to: reach out to airport data

#...
{
  ?WHERE apt:iataCode ?CCC.
  ?PG log:uri [ is str:concatenation of
               ("http://www.daml.org/cgi-bin/airport?" ?CCC) ];
}
  log:implies {
   ?WHERE :airportInfo ?PG
}.

#...

# believe what daml.org says about airport latitudes/longitudes...
:AirportProperty is rdf:type of
  apt:latitude,
  apt:name,
  apt:iataCode,
  apt:icaoCode,
  apt:location,
  apt:latitude,
  apt:longitude,
  apt:elevation.

{
  :P a :AirportProperty.
  ?WHERE a :InterestingPlace; apt:iataCode :K; :airportInfo ?PG.
  ?PG log:semantics [
      log:includes {
        [] apt:iataCode :K; :P :X.
      }
    ].
} log:implies {
  ?WHERE :P :X.
}.
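In procedural terms, the two rules say: build a lookup URL from the IATA code, then copy over any airport property the fetched document asserts about that same code. The sketch below shows that logic without doing any network access or RDF parsing; the fetched descriptions are passed in as plain dicts, and the function names and dict layout are hypothetical.

```python
AIRPORT_SERVICE = "http://www.daml.org/cgi-bin/airport?"

# The properties we are willing to believe, as listed in :AirportProperty.
AIRPORT_PROPERTIES = (
    "name", "iataCode", "icaoCode", "location",
    "latitude", "longitude", "elevation",
)


def airport_info_url(iata_code):
    """Build the lookup URL for an airport, as the first rule does."""
    return AIRPORT_SERVICE + iata_code


def believe(place, descriptions):
    """Copy trusted properties from descriptions with a matching IATA code."""
    merged = dict(place)
    for description in descriptions:
        if description.get("iataCode") != place["iataCode"]:
            continue  # only trust statements about the same airport
        for prop in AIRPORT_PROPERTIES:
            if prop in description:
                merged[prop] = description[prop]
    return merged


if __name__ == "__main__":
    place = {"iataCode": "DFW"}
    fetched = [{"iataCode": "DFW",
                "latitude": "32.896388888888886", "longitude": "-97.0375"}]
    print(airport_info_url(place["iataCode"]))
    print(believe(place, fetched))
```

Restricting the copy to a fixed property list is the point of :AirportProperty: the remote document is trusted for those facts only.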
	

These rules are slightly bogus, though they work. @@clean-up

How-to: using cwm --strings to export to xplanet

{ :P a :ArLv.
  [ k:subEvents [
      :P [
        apt:latitude :LAT1;
        apt:longitude :LON1;
        apt:iataCode :IATA;
      ]
  ] ].


  (:LAT1 " " :LON1 " \"" :IATA "\" color=blue\n") str:concatenation :TXT.
}
 log:implies { :IATA log:outputString :TXT }.

example output:

32.896388888888886 -97.0375 "DFW" color=blue
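The string the rule assembles is easy to reproduce by hand. This sketch formats one marker line per airport in the same layout as the example output, keyed by IATA code so that each airport appears once. Coordinates are kept as strings, as they would arrive from RDF literals; the function names are hypothetical.

```python
def marker_line(latitude, longitude, iata_code):
    """Format one marker line: latitude, longitude, quoted label, colour."""
    return '%s %s "%s" color=blue\n' % (latitude, longitude, iata_code)


def marker_file(airports):
    """Return marker lines for a list of airport dicts, one per IATA code."""
    by_code = {}
    for airport in airports:
        # Keying on the code drops duplicates (an airport is both
        # an arrival and a departure point on a connecting itinerary).
        by_code[airport["iataCode"]] = marker_line(
            airport["latitude"], airport["longitude"], airport["iataCode"]
        )
    return "".join(by_code[code] for code in sorted(by_code))


if __name__ == "__main__":
    dfw = {"iataCode": "DFW",
           "latitude": "32.896388888888886", "longitude": "-97.0375"}
    print(marker_file([dfw, dfw]), end="")
```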
	

How-to: mapping tool results

The resulting map shows that we have given the machine a fairly deep understanding of the itinerary:

MCI to YMX and back for Extreme 2002

Checking Constraints

Itineraries that have me leaving before 30 July are no good:

{
 ?D a k:ItineraryDocument; k:containsInformationAbout-Focally ?TRIP.
 ?TRIP k:subEvents
    [ k:startingDate [ dt:date ?D1 ];
      k:fromLocation [ apt:iataCode "MCI" ];
      t:departureTime ?T1;
    ].
  ?D1 str:lessThan "2001-07-30".
} => {
   ?TRIP <#leavesDaysTooSoon> ?D1;
          <#at> ?T1.
}.

Check with something like...

$ python cwm.py proposed-itinerary.nt --think=constraints.n3

... and look for <#leavesDaysTooSoon> in the output.
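The heart of the constraint is a plain string comparison: ISO 8601 dates sort the same way lexically as chronologically, which is why str:lessThan is enough. A small Python sketch of the same check follows; the dict keys are hypothetical, and the defaults echo the rule's "MCI" and "2001-07-30".

```python
def leaves_too_soon(legs, origin="MCI", earliest="2001-07-30"):
    """Return (date, time) for each leg leaving origin before earliest."""
    return [
        (leg["date"], leg["dep"])
        for leg in legs
        # String comparison works because dates are zero-padded ISO 8601.
        if leg["from"] == origin and leg["date"] < earliest
    ]


if __name__ == "__main__":
    proposed = [
        {"from": "MCI", "date": "2001-07-29", "dep": "09:40"},  # made-up leg
        {"from": "MCI", "date": "2001-07-30", "dep": "09:40"},  # made-up leg
    ]
    for date, time in leaves_too_soon(proposed):
        print("leaves too soon:", date, time)
```

An empty result means the proposed itinerary passes this constraint.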

Conclusions and Future Work

Future Work

Ideas for future work include: