Part D3: Semantic Web Application Integration: Travel Tools
DanC, Sandro, and TimBL
The bane of my existence is doing things I know the computer could do for me.*
The XML Revolution, Oct 1998 in Nature's Web Matters
Proposed itinerary comes in email from the travel agency like this:
07 APR 03 - MONDAY AIR AMERICAN AIRLINES FLT:3199 ECONOMY OPERATED BY AMERICAN AIRLINES LV KANSAS CITY INTL 940A EQP: MD-80 DEPART: TERMINAL BUILDING B 01HR 36MIN AR DALLAS FT WORTH 1116A NON-STOP
This is what I want to see, of course:
AARGH! Do I really have to copy each field by hand?!?!?
I want to put LV KANSAS CITY INTL and AR DALLAS FT WORTH on a map:
Meanwhile...
After some preparation, we'll...
we need to
07 APR 03 - MONDAY AIR AMERICAN AIRLINES FLT:3199 ECONOMY OPERATED BY AMERICAN AIRLINES LV KANSAS CITY INTL 940A EQP: MD-80 DEPART: TERMINAL BUILDING B 01HR 36MIN AR DALLAS FT WORTH 1116A NON-STOP
I hope that before too long they'll dump it from their database directly into RDF/XML using some travel industry vocabulary, but
AIR AMERICAN AIRLINES FLT:3199 ECONOMY
# ...
if (/FLT:\s*(\d+)\s+([A-Z][A-Z]+)?/) {
    my ($flightNum, $flightClassName) = ($1, $2);
    $event = genSym("flt$flightNum");
    makeStatement($event, $tNS . "flightNumber", '', $flightNum);
    # ...
_:flt3199_3 <http://www.w3.org/2000/10/swap/pim/travelTerms#flightNumber> "3199".
_:AMERICANAIRLINES_4 <http://opencyc.sourceforge.net/daml/cyc.daml#nameOfAgent> "AMERICAN AIRLINES".
_:AMERICANAIRLINES_4 <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://opencyc.sourceforge.net/daml/cyc.daml#AirlineCompany>.
_:flt3199_3 <http://www.w3.org/2000/10/swap/pim/travelTerms#carrier> _:AMERICANAIRLINES_4.
_:ECONOMY_5 <http://www.w3.org/1999/02/22-rdf-syntax-ns#value> "ECONOMY".
_:flt3199_3 <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> _:ECONOMY_5.
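For readers who don't read Perl, the extraction step can be paraphrased in Python. This is a minimal sketch, not the actual script: `gen_sym` and `make_statement` are hypothetical stand-ins for the Perl `genSym` and `makeStatement` helpers (whose bodies aren't shown here), and the counter starts at 3 only so that the label matches the sample output.

```python
import re
from itertools import count

T_NS = "http://www.w3.org/2000/10/swap/pim/travelTerms#"

# Same pattern as the Perl snippet: flight number, then an optional class name.
FLT = re.compile(r"FLT:\s*(\d+)\s+([A-Z][A-Z]+)?")

_counter = count(3)  # assumption: start at 3 only to mirror "_:flt3199_3"


def gen_sym(hint):
    """Hypothetical stand-in for genSym: return a fresh blank-node label."""
    return "_:%s_%d" % (hint, next(_counter))


def make_statement(subject, predicate, literal):
    """Hypothetical stand-in for makeStatement: format one N-Triples line."""
    return '%s <%s> "%s".' % (subject, predicate, literal)


def flight_statements(line):
    """Return the flightNumber statement for an itinerary line, if any."""
    m = FLT.search(line)
    if m is None:
        return []
    flight_num, _class_name = m.group(1), m.group(2)
    event = gen_sym("flt" + flight_num)
    return [make_statement(event, T_NS + "flightNumber", flight_num)]


triples = flight_statements("AIR AMERICAN AIRLINES FLT:3199 ECONOMY")
print("\n".join(triples))
```

The carrier and class statements would follow the same pattern; they are left out to keep the sketch short.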
python cwm.py itin.n3.nt >itin.n3
yields
:_gflt3199_3 a :_gECONOMY_5;
    k:endingDate :_gdayMONDAY07_2;
    k:fromLocation <http://www.daml.org/cgi-bin/airport?MCI>;
    k:startingDate :_gdayMONDAY07_2;
    k:toLocation <http://www.daml.org/cgi-bin/airport?DFW>;
    t:arrivalTime "11:16";
    t:carrier :_gAMERICANAIRLINES_4;
    t:departureTime "09:40";
    t:flightNumber "3199" .

:_gAMERICANAIRLINES_4 a k:AirlineCompany;
    k:nameOfAgent "AMERICAN AIRLINES" .

:_gECONOMY_5 r:value "ECONOMY" .

:_gdayMONDAY07_2 a k:Monday;
    dt:date "2003-04-07" .
my($rdfNS) = "http://www.w3.org/1999/02/22-rdf-syntax-ns#";
my($rdfsNS) = "http://www.w3.org/2000/01/rdf-schema#";
my($dcNS) = "http://purl.org/dc/elements/1.1/";
my($kNS) = "http://opencyc.sourceforge.net/daml/cyc.daml#";
my($dtNS) = "http://www.w3.org/2001/XMLSchema#";
my($tNS) = "http://www.w3.org/2000/10/swap/pim/travelTerms#";
my($aNS) = "http://www.daml.org/2001/10/html/airport-ont#";
itin2ical.n3 has rules like:
{ :FLT
    k:startingDate [ dt:date :YYMMDD];
    k:endingDate [ dt:date :YYMMDD2];
    t:departureTime :HH_MM;
    k:fromLocation [ :timeZone [ cal:tzid :TZ] ];
    t:arrivalTime :HH_MM2;
    k:toLocation [ :timeZone [ cal:tzid :TZ2] ].

  :DTSTART is str:concatenation of (:YYMMDD "T" :HH_MM ":00").
  :DTEND is str:concatenation of (:YYMMDD2 "T" :HH_MM2 ":00").
  ( :FLT!log:rawUri "@uri-2-mid.w3.org") str:concatenation :UID. #@@hmm... kludge?
} log:implies {
  :FLT a cal:Vevent;
    cal:uid :UID;
    cal:dtstart [ cal:tzid :TZ; cal:dateTime :DTSTART ];
    cal:dtend [ cal:tzid :TZ2; cal:dateTime :DTEND ].
}.
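The rule above is mostly string concatenation, which may be easier to see in plain Python. A minimal sketch: the function name `vevent_fields`, the flight URI and the time zone name are made up for illustration; only the concatenation pattern comes from the rule.

```python
def vevent_fields(flight_uri, date, dep, tz, date2, arr, tz2):
    """Mirror the rule's three str:concatenation steps with plain strings."""
    return {
        # (:FLT!log:rawUri "@uri-2-mid.w3.org") str:concatenation :UID
        "uid": flight_uri + "@uri-2-mid.w3.org",
        # :DTSTART is str:concatenation of (:YYMMDD "T" :HH_MM ":00")
        "dtstart": {"tzid": tz, "dateTime": date + "T" + dep + ":00"},
        # :DTEND is str:concatenation of (:YYMMDD2 "T" :HH_MM2 ":00")
        "dtend": {"tzid": tz2, "dateTime": date2 + "T" + arr + ":00"},
    }


# Hypothetical inputs: the URI and the tzid values are placeholders.
event = vevent_fields(
    "#flt3199",
    "2003-04-07", "09:40", "America/Chicago",
    "2003-04-07", "11:16", "America/Chicago",
)
print(event["dtstart"]["dateTime"])
```

Note that departure and arrival each carry their own time zone, which matters as soon as a leg crosses a zone boundary.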
toIcal.py uses the cwm API:
# ...
sts = load(addr)
#...
for cal in sts.each(pred = RDF.type, obj = ICAL.Vcalendar):
    w("BEGIN:VCALENDAR" + CRLF) #hmm... SAX interface?
    for comp in sts.each(subj = cal, pred = ICAL.component):
        if sts.statementsMatching(RDF.type, comp, ICAL.Vevent):
            self.exportEvent(sts, comp)
        elif sts.statementsMatching(RDF.type, comp, ICAL.Vtimezone):
            self.exportTimezone(sts, comp)
    w("END:VCALENDAR" + CRLF)
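To show roughly what the export step has to write, here is a standalone sketch that serializes events from plain dictionaries rather than from the cwm store. It is not toIcal.py: the event layout, the `PRODID` value and the helper names are assumptions, and it skips the `VTIMEZONE` components that `exportTimezone` handles, so the output is not a complete calendar.

```python
CRLF = "\r\n"


def basic_datetime(iso):
    """'2003-04-07T09:40:00' -> '20030407T094000' (iCalendar DATE-TIME form)."""
    return iso.replace("-", "").replace(":", "")


def export_calendar(events):
    """Serialize events as iCalendar text; every line ends with CRLF."""
    lines = [
        "BEGIN:VCALENDAR",
        "VERSION:2.0",
        "PRODID:-//example//itinerary sketch//EN",  # placeholder product id
    ]
    for ev in events:
        lines.append("BEGIN:VEVENT")
        lines.append("UID:" + ev["uid"])
        for prop in ("dtstart", "dtend"):
            tzid, when = ev[prop]
            lines.append("%s;TZID=%s:%s" % (prop.upper(), tzid, basic_datetime(when)))
        lines.append("END:VEVENT")
    lines.append("END:VCALENDAR")
    return CRLF.join(lines) + CRLF


# Hypothetical event; the uid and tzid values are placeholders.
ics = export_calendar([{
    "uid": "#flt3199@uri-2-mid.w3.org",
    "dtstart": ("America/Chicago", "2003-04-07T09:40:00"),
    "dtend": ("America/Chicago", "2003-04-07T11:16:00"),
}])
print(ics)
```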
results:
{ [ k:subEvents [
      k:startingDate [ dt:date :YYYY_MM_DD; a [ k:nameString :DOW ] ];
      t:departureTime :HH_MM;
      t:arrivalTime :HH_MM2;
      k:fromLocation [ apt:iataCode :IATA ];
      k:toLocation [ apt:iataCode :IATA2 ];
      t:carrier [ k:nameOfAgent :CARRIER ];
      t:flightNumber :NUM;
  ] ].
  :WHEN is str:concatenation of (:YYYY_MM_DD :HH_MM).
  :TXT is str:concatenation of (:YYYY_MM_DD " " :HH_MM " - " :HH_MM2 " " :IATA "->" :IATA2 " " :DOW " " :CARRIER " #" :NUM "\n").
} log:implies { :WHEN log:outputString :TXT }.
run cwm like this...
python cwm.py itinBrief.n3 itin3.n3 --think --strings >itin-brief.txt
and out comes...
2003-04-07 09:40 - 11:16 MCI->DFW Monday AMERICAN AIRLINES #3199
2003-04-07 12:03 - 15:49 DFW->MIA Monday AMERICAN AIRLINES #68
2003-04-10 19:12 - 21:32 MIA->ORD Thursday AMERICAN AIRLINES #1477
2003-04-10 22:33 - 23:54 ORD->MCI Thursday AMERICAN AIRLINES #1081
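The brief-summary rule pairs each line of text with a sort key (date followed by departure time). A minimal Python sketch of the same idea, assuming that `--strings` writes the strings in key order; the dictionary field names are made up, and two of the four legs from the output above are used as data.

```python
# Legs deliberately listed out of order to show the effect of the key.
flights = [
    {"date": "2003-04-07", "dep": "12:03", "arr": "15:49", "frm": "DFW",
     "to": "MIA", "dow": "Monday", "carrier": "AMERICAN AIRLINES", "num": "68"},
    {"date": "2003-04-07", "dep": "09:40", "arr": "11:16", "frm": "MCI",
     "to": "DFW", "dow": "Monday", "carrier": "AMERICAN AIRLINES", "num": "3199"},
]


def brief_line(f):
    """Return (key, text), like the rule's :WHEN and :TXT."""
    key = f["date"] + f["dep"]
    text = "%s %s - %s %s->%s %s %s #%s\n" % (
        f["date"], f["dep"], f["arr"], f["frm"], f["to"],
        f["dow"], f["carrier"], f["num"])
    return key, text


# Sort by key, then emit only the text.
summary = "".join(text for _key, text in sorted(brief_line(f) for f in flights))
print(summary, end="")
```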
tool chain:
#...
{ ?WHERE apt:iataCode ?CCC.
  ?PG log:uri [ is str:concatenation of ("http://www.daml.org/cgi-bin/airport?" ?CCC) ];
} log:implies { ?WHERE :airportInfo ?PG }.
#...
# believe what daml.org says about airport latitude/longitudes...
:AirportProperty is rdf:type of apt:latitude, apt:name, apt:iataCode, apt:icaoCode, apt:location, apt:latitude, apt:longitude, apt:elevation.

{ :P a :AirportProperty.
  ?WHERE a :InterestingPlace; apt:iataCode :K; :airportInfo ?PG.
  ?PG log:semantics [ log:includes { [] apt:iataCode :K; :P :X. } ].
} log:implies { ?WHERE :P :X. }.
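The second rule is a form of selective trust: from the fetched page, only whitelisted airport properties are believed, and only for the airport whose IATA code matches. A rough Python sketch of that filter over (subject, predicate, object) tuples. The sample page content and the `example.org` predicate are invented for illustration, and binding `apt:` to the airport-ont namespace is an assumption.

```python
APT = "http://www.daml.org/2001/10/html/airport-ont#"  # assumed apt: namespace
AIRPORT_PAGE = "http://www.daml.org/cgi-bin/airport?"

# The whitelist, as in the :AirportProperty declaration.
AIRPORT_PROPERTIES = {
    APT + name
    for name in ("latitude", "longitude", "name", "iataCode",
                 "icaoCode", "location", "elevation")
}


def airport_info_uri(iata):
    """Build the page URI the first rule derives from an IATA code."""
    return AIRPORT_PAGE + iata


def believed(page_triples, iata):
    """Keep whitelisted properties of subjects whose iataCode matches."""
    subjects = {s for (s, p, o) in page_triples
                if p == APT + "iataCode" and o == iata}
    return [(p, o) for (s, p, o) in page_triples
            if s in subjects and p in AIRPORT_PROPERTIES]


# Hypothetical page content, standing in for what log:semantics would fetch.
page = [
    ("_:a", APT + "iataCode", "DFW"),
    ("_:a", APT + "latitude", "32.896388888888886"),
    ("_:a", APT + "longitude", "-97.0375"),
    ("_:a", "http://example.org/ns#rating", "5 stars"),  # not whitelisted
    ("_:b", APT + "iataCode", "MCI"),                     # different airport
]
facts = believed(page, "DFW")
print(airport_info_uri("DFW"), facts)
```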
these rules are slightly bogus, though they work. @@clean-up
{ :P a :ArLv.
  [ k:subEvents [ :P [ apt:latitude :LAT1; apt:longitude :LON1; apt:iataCode :IATA; ] ] ].
  (:LAT1 " " :LON1 " \"" :IATA "\" color=blue\n") str:concatenation :TXT.
} log:implies { :IATA log:outputString :TXT }.
example output:
32.896388888888886 -97.0375 "DFW" color=blue
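The marker line is again plain concatenation. A small sketch of the same format in Python; the function name is made up, and the coordinates are kept as strings so they come out exactly as they went in.

```python
def map_point(lat, lon, iata):
    """Format one marker line: latitude, longitude, quoted label, color."""
    return '%s %s "%s" color=blue\n' % (lat, lon, iata)


line = map_point("32.896388888888886", "-97.0375", "DFW")
print(line, end="")
```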
The resulting map shows that we have given the machine a fairly deep understanding of the itinerary:
itineraries that have me leaving before 30 July are no good
{ ?D a k:ItineraryDocument;
    k:containsInformationAbout-Focally ?TRIP.
  ?TRIP k:subEvents [
    k:startingDate [ dt:date ?D1 ];
    k:fromLocation [ apt:iataCode "MCI" ];
    t:departureTime ?T1;
  ].
  ?D1 str:lessThan "2001-07-30".
} => {
  ?TRIP <#leavesDaysTooSoon> ?D1; <#at> ?T1.
}.
check ala...
$ python cwm.py proposed-itinerary.nt --think=constraints.n3
... and look for <#leavesDaysTooSoon> in the output.
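The constraint compares dates as strings, which works because `YYYY-MM-DD` dates sort the same way lexically as they do chronologically. A minimal Python sketch of the same check; the sample legs are hypothetical, not taken from a real itinerary.

```python
CUTOFF = "2001-07-30"


def leaves_too_soon(legs, origin="MCI", cutoff=CUTOFF):
    """Return (date, time) for legs leaving the origin before the cutoff."""
    return [(leg["date"], leg["dep"])
            for leg in legs
            # String comparison is safe for ISO dates, as with str:lessThan.
            if leg["frm"] == origin and leg["date"] < cutoff]


# Hypothetical proposed itinerary.
proposed = [
    {"date": "2001-07-29", "dep": "09:40", "frm": "MCI"},  # too soon
    {"date": "2001-07-29", "dep": "12:03", "frm": "DFW"},  # not leaving MCI
    {"date": "2001-07-31", "dep": "08:15", "frm": "MCI"},  # fine
]
violations = leaves_too_soon(proposed)
print(violations)
```

An empty list means the itinerary passes this particular constraint.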
Ideas for future work include: