Part D3: Semantic Web Application Integration: Travel Tools
DanC, Sandro, and TimBL
The bane of my existence is doing things I know the computer
could do for me.*
The XML Revolution, Oct 1998 in Nature's Web Matters
Proposed itinerary comes in email from the travel agency like this:
07 APR 03 - MONDAY
AIR AMERICAN AIRLINES FLT:3199 ECONOMY
OPERATED BY AMERICAN AIRLINES
LV KANSAS CITY INTL 940A EQP: MD-80
DEPART: TERMINAL BUILDING B 01HR 36MIN
AR DALLAS FT WORTH 1116A NON-STOP
This is what I want to see, of course:
AARGH! Do I really have to copy each field by hand?!?!?
I want to put LV KANSAS CITY INTL and AR DALLAS FT WORTH on a map:

Meanwhile...
After some preparation, we'll...
we need to
07 APR 03 - MONDAY
AIR AMERICAN AIRLINES FLT:3199 ECONOMY
OPERATED BY AMERICAN AIRLINES
LV KANSAS CITY INTL 940A EQP: MD-80
DEPART: TERMINAL BUILDING B 01HR 36MIN
AR DALLAS FT WORTH 1116A NON-STOP
I hope that before too long they'll dump it from their database directly into RDF/XML using some travel industry vocabulary, but

AIR AMERICAN AIRLINES FLT:3199 ECONOMY
# ...
if (/FLT:\s*(\d+)\s+([A-Z][A-Z]+)?/) {
    # $1 is the flight number, $2 the (optional) class name, e.g. ECONOMY
    my ($flightNum, $flightClassName) = ($1, $2);
    $event = genSym("flt$flightNum");
    makeStatement($event, $tNS . "flightNumber", '', $flightNum);
    # ...
}
_:flt3199_3 <http://www.w3.org/2000/10/swap/pim/travelTerms#flightNumber> "3199".
_:AMERICANAIRLINES_4 <http://opencyc.sourceforge.net/daml/cyc.daml#nameOfAgent> "AMERICAN AIRLINES".
_:AMERICANAIRLINES_4 <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://opencyc.sourceforge.net/daml/cyc.daml#AirlineCompany>.
_:flt3199_3 <http://www.w3.org/2000/10/swap/pim/travelTerms#carrier> _:AMERICANAIRLINES_4.
_:ECONOMY_5 <http://www.w3.org/1999/02/22-rdf-syntax-ns#value> "ECONOMY".
_:flt3199_3 <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> _:ECONOMY_5.
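For readers who don't speak Perl, here is a rough Python sketch of the same scraping step. It is not the actual script: gen_sym and flight_triples are hypothetical names, the counter starts at 1 rather than wherever the real genSym happens to be, and only the flightNumber statement is produced.

```python
import itertools
import re

T_NS = "http://www.w3.org/2000/10/swap/pim/travelTerms#"

# Same pattern as the Perl excerpt: flight number, then an optional class name.
FLT = re.compile(r"FLT:\s*(\d+)\s+([A-Z][A-Z]+)?")

_counter = itertools.count(1)


def gen_sym(hint):
    """Hypothetical stand-in for genSym: make a fresh blank-node label."""
    return "_:%s_%d" % (hint, next(_counter))


def flight_triples(line):
    """Return N-Triples lines for the flight number in `line`, if there is one."""
    m = FLT.search(line)
    if m is None:
        return []
    flight_num = m.group(1)
    event = gen_sym("flt" + flight_num)
    return ['%s <%sflightNumber> "%s".' % (event, T_NS, flight_num)]


triples = flight_triples("AIR AMERICAN AIRLINES FLT:3199 ECONOMY")
print("\n".join(triples))
```

The point is only that each pattern match turns into one or more statements about a freshly named node; carrier, class and airports would be handled the same way.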
python cwm.py itin.n3.nt >itin.n3
yields
:_gflt3199_3 a :_gECONOMY_5;
    k:endingDate :_gdayMONDAY07_2;
    k:fromLocation <http://www.daml.org/cgi-bin/airport?MCI>;
    k:startingDate :_gdayMONDAY07_2;
    k:toLocation <http://www.daml.org/cgi-bin/airport?DFW>;
    t:arrivalTime "11:16";
    t:carrier :_gAMERICANAIRLINES_4;
    t:departureTime "09:40";
    t:flightNumber "3199" .

:_gAMERICANAIRLINES_4 a k:AirlineCompany;
    k:nameOfAgent "AMERICAN AIRLINES" .

:_gECONOMY_5 r:value "ECONOMY" .

:_gdayMONDAY07_2 a k:Monday;
    dt:date "2003-04-07" .
my($rdfNS) = "http://www.w3.org/1999/02/22-rdf-syntax-ns#";
my($rdfsNS) = "http://www.w3.org/2000/01/rdf-schema#";
my($dcNS) = "http://purl.org/dc/elements/1.1/";
my($kNS) = "http://opencyc.sourceforge.net/daml/cyc.daml#";
my($dtNS) = "http://www.w3.org/2001/XMLSchema#";
my($tNS) = "http://www.w3.org/2000/10/swap/pim/travelTerms#";
my($aNS) = "http://www.daml.org/2001/10/html/airport-ont#";
itin2ical.n3 has rules like:
{ :FLT
    k:startingDate [ dt:date :YYMMDD];
    k:endingDate [ dt:date :YYMMDD2];
    t:departureTime :HH_MM;
    k:fromLocation [ :timeZone [ cal:tzid :TZ] ];
    t:arrivalTime :HH_MM2;
    k:toLocation [ :timeZone [ cal:tzid :TZ2] ].
  :DTSTART is str:concatenation of
    (:YYMMDD "T" :HH_MM ":00").
  :DTEND is str:concatenation of
    (:YYMMDD2 "T" :HH_MM2 ":00").
  ( :FLT!log:rawUri "@uri-2-mid.w3.org") str:concatenation :UID. #@@hmm... kludge?
}
log:implies {
  :FLT a cal:Vevent;
    cal:uid :UID;
    cal:dtstart [ cal:tzid :TZ; cal:dateTime :DTSTART ];
    cal:dtend [ cal:tzid :TZ2; cal:dateTime :DTEND ].
}.
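In procedural terms, the rule above just glues strings together. A minimal Python sketch of the same computation follows; the function names, the dictionary keys, the "flt3199" identifier and the "America/Chicago" time zone values are all assumptions made for illustration, not part of itin2ical.n3.

```python
def ical_datetime(date, hh_mm):
    """Join a date and an HH:MM time the way the rule builds :DTSTART and :DTEND."""
    return date + "T" + hh_mm + ":00"


def flight_to_vevent(flight_uri, flight):
    """Build the event fields the rule concludes for one flight leg."""
    return {
        # Same kludge as the rule: flight identifier plus a fixed suffix.
        "uid": flight_uri + "@uri-2-mid.w3.org",
        "dtstart": {
            "tzid": flight["from_tz"],
            "dateTime": ical_datetime(flight["start_date"], flight["departure"]),
        },
        "dtend": {
            "tzid": flight["to_tz"],
            "dateTime": ical_datetime(flight["end_date"], flight["arrival"]),
        },
    }


# Dates and times from the MCI->DFW leg; time zone values are assumed.
flight = {
    "start_date": "2003-04-07", "departure": "09:40", "from_tz": "America/Chicago",
    "end_date": "2003-04-07", "arrival": "11:16", "to_tz": "America/Chicago",
}
vevent = flight_to_vevent("flt3199", flight)
print(vevent["dtstart"]["dateTime"])
print(vevent["dtend"]["dateTime"])
```

The difference is that the N3 rule states the relationship once and cwm finds every flight it applies to; the Python version has to be handed each flight.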
toIcal.py uses the cwm API:
# ...
sts = load(addr)
# ...
for cal in sts.each(pred = RDF.type, obj = ICAL.Vcalendar):
    w("BEGIN:VCALENDAR" + CRLF) #hmm... SAX interface?
    for comp in sts.each(subj = cal, pred = ICAL.component):
        if sts.statementsMatching(RDF.type, comp, ICAL.Vevent):
            self.exportEvent(sts, comp)
        elif sts.statementsMatching(RDF.type, comp, ICAL.Vtimezone):
            self.exportTimezone(sts, comp)
    w("END:VCALENDAR" + CRLF)
results:
{
  [ k:subEvents [
      k:startingDate [ dt:date :YYYY_MM_DD; a [ k:nameString :DOW ] ];
      t:departureTime :HH_MM;
      t:arrivalTime :HH_MM2;
      k:fromLocation [ apt:iataCode :IATA ];
      k:toLocation [ apt:iataCode :IATA2 ];
      t:carrier [ k:nameOfAgent :CARRIER ];
      t:flightNumber :NUM;
    ]
  ].
  :WHEN is str:concatenation of
    (:YYYY_MM_DD :HH_MM).
  :TXT is str:concatenation of
    (:YYYY_MM_DD " " :HH_MM " - " :HH_MM2 " " :IATA "->" :IATA2 " "
     :DOW " " :CARRIER " #" :NUM "\n").
} log:implies {
  :WHEN log:outputString :TXT
}.
run cwm like this...
python cwm.py itinBrief.n3 itin3.n3 --think --strings itin-brief.txt
and out comes...
2003-04-07 09:40 - 11:16 MCI->DFW Monday AMERICAN AIRLINES #3199
2003-04-07 12:03 - 15:49 DFW->MIA Monday AMERICAN AIRLINES #68
2003-04-10 19:12 - 21:32 MIA->ORD Thursday AMERICAN AIRLINES #1477
2003-04-10 22:33 - 23:54 ORD->MCI Thursday AMERICAN AIRLINES #1081
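The brief listing can be mimicked in a few lines of Python, which may help in reading the rule: each leg yields a (key, text) pair, and the texts come out in key order. This sketch assumes that the strings are emitted sorted by their :WHEN key; the tuple layout and the name brief are made up for illustration.

```python
# Legs taken from the output above, deliberately out of order.
# Fields: date, departure, arrival, from, to, day of week, carrier, flight number.
legs = [
    ("2003-04-10", "22:33", "23:54", "ORD", "MCI", "Thursday", "AMERICAN AIRLINES", "1081"),
    ("2003-04-07", "12:03", "15:49", "DFW", "MIA", "Monday", "AMERICAN AIRLINES", "68"),
    ("2003-04-10", "19:12", "21:32", "MIA", "ORD", "Thursday", "AMERICAN AIRLINES", "1477"),
    ("2003-04-07", "09:40", "11:16", "MCI", "DFW", "Monday", "AMERICAN AIRLINES", "3199"),
]


def brief(leg):
    """Return the (:WHEN, :TXT) pair that the rule builds for one leg."""
    date, dep, arr, frm, to, dow, carrier, num = leg
    when = date + dep  # sort key: date and departure time concatenated
    txt = "%s %s - %s %s->%s %s %s #%s\n" % (date, dep, arr, frm, to, dow, carrier, num)
    return when, txt


# Sort on the key, then emit only the text.
report = "".join(txt for _, txt in sorted(brief(leg) for leg in legs))
print(report, end="")
```

Because the key is an ISO date followed by a 24-hour time, plain string ordering is also chronological ordering.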
tool chain:
#...
{
?WHERE apt:iataCode ?CCC.
?PG log:uri [ is str:concatenation of
("http://www.daml.org/cgi-bin/airport?" ?CCC) ];
}
log:implies {
?WHERE :airportInfo ?PG
}.
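Read procedurally, this rule says: the page with information about an airport is found by appending its IATA code to the daml.org lookup address. A one-function Python sketch (the function name is an assumption):

```python
AIRPORT_SERVICE = "http://www.daml.org/cgi-bin/airport?"


def airport_info_uri(iata_code):
    """Build the address of the airport description for a three-letter IATA code."""
    return AIRPORT_SERVICE + iata_code


print(airport_info_uri("MCI"))
print(airport_info_uri("DFW"))
```

These are the same addresses that appear as k:fromLocation and k:toLocation in the cwm output earlier.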
#...
# believe what daml.org says about airport latitudes/longitudes...
:AirportProperty is rdf:type of
apt:latitude,
apt:name,
apt:iataCode,
apt:icaoCode,
apt:location,
apt:latitude,
apt:longitude,
apt:elevation.
{
  :P a :AirportProperty.
  ?WHERE a :InterestingPlace; apt:iataCode :K; :airportInfo ?PG.
  ?PG log:semantics [
    log:includes {
      [] apt:iataCode :K; :P :X.
    }
  ].
} log:implies {
  ?WHERE :P :X.
}.
these rules are slightly bogus, though they work. @@clean-up
{ :P a :ArLv.
  [ k:subEvents [
      :P [
        apt:latitude :LAT1;
        apt:longitude :LON1;
        apt:iataCode :IATA;
      ]
  ] ].
  (:LAT1 " " :LON1 " \"" :IATA "\" color=blue\n") str:concatenation :TXT.
}
log:implies { :IATA log:outputString :TXT }.
example output:
32.896388888888886 -97.0375 "DFW" color=blue
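The output line is plain string concatenation again. A small Python sketch of the same formatting follows; the function name is hypothetical, and the coordinates are passed as strings so that they are reproduced exactly as the airport data gives them.

```python
def map_point(lat, lon, iata, color="blue"):
    """Format one marker line: latitude, longitude, quoted label, color."""
    return '%s %s "%s" color=%s\n' % (lat, lon, iata, color)


# Coordinates for DFW as shown in the example output.
line = map_point("32.896388888888886", "-97.0375", "DFW")
print(line, end="")
```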
The resulting map shows that we have given the machine a fairly deep understanding of the itinerary:

itineraries that have me leaving before 30 July are no good
{
  ?D a k:ItineraryDocument; k:containsInformationAbout-Focally ?TRIP.
  ?TRIP k:subEvents
    [ k:startingDate [ dt:date ?D1 ];
      k:fromLocation [ apt:iataCode "MCI" ];
      t:departureTime ?T1;
    ].
  ?D1 str:lessThan "2001-07-30".
} => {
  ?TRIP <#leavesDaysTooSoon> ?D1;
    <#at> ?T1.
}.
check ala...
$ python cwm.py proposed-itinerary.nt --think=constraints.n3
... and look for <#leavesDaysTooSoon> in the output.
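The constraint relies on a handy property of YYYY-MM-DD dates: comparing them as strings gives the same answer as comparing them as dates, which is why str:lessThan suffices. A Python sketch of the same check, with made-up legs (the dates, times, dictionary keys and function name are all hypothetical):

```python
CUTOFF = "2001-07-30"  # the date used in the rule


def leaves_too_soon(legs, origin="MCI", cutoff=CUTOFF):
    """Return (date, time) for each leg that departs `origin` before `cutoff`."""
    return [
        (leg["date"], leg["departure"])
        for leg in legs
        # Plain string comparison is enough for ISO-formatted dates.
        if leg["from"] == origin and leg["date"] < cutoff
    ]


# Hypothetical proposed itinerary, for illustration only.
proposed = [
    {"from": "MCI", "date": "2001-07-29", "departure": "09:40"},
    {"from": "DFW", "date": "2001-07-29", "departure": "12:03"},
    {"from": "MCI", "date": "2001-08-02", "departure": "07:15"},
]
violations = leaves_too_soon(proposed)
print(violations)
```

An empty list corresponds to finding no <#leavesDaysTooSoon> statement in the cwm output.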
Ideas for future work include: