Copyright © 2008 W3C® (MIT, ERCIM, Keio), All Rights Reserved. W3C liability, trademark and document use rules apply.
This is a report of the W3C Common Web Language Incubator Group (CWL-XG) as specified in the Deliverables of its charter.
This report defines the Common Web Language (CWL) specifications and introduces the various representation forms of CWL and the platform for working with CWL.
Specifically the report:
This section describes the status of this document at the time of its publication. Other documents may supersede this document. A list of Final Incubator Group Reports is available. See also the W3C technical reports index at http://www.w3.org/TR/.
This document was developed by the W3C Common Web Language Incubator Group, part of the W3C Incubator Activity.
Publication of this document by W3C as part of the W3C Incubator Activity indicates no endorsement of its content by W3C, nor that W3C has, is, or will be allocating any resources to the issues addressed by it. Participation in Incubator Groups and publication of Incubator Group Reports at the W3C site are benefits of W3C Membership.
Incubator Groups have as a goal to produce work that can be implemented on a Royalty Free basis, as defined in the W3C Patent Policy. Participants in this Incubator Group have made no statements about whether they will offer licenses according to the licensing requirements of the W3C Patent Policy for portions of this Incubator Group Report that are subsequently incorporated in a W3C Recommendation.
The CWL, a common web language for humans and computers, is intended to solve two major problems of the present web. One is the language barrier on the web; the other is the lack of machine understandability of web content.
The objectives of CWL are to exchange information through the web and to enable computers to process information semantically. The CWL allows people to describe the contents and meta-data of web pages written in natural languages, which helps realize a web free of language barriers, and it enables computers to extract semantic information and knowledge from web pages accurately.
The requirements for CWL to achieve the above objectives are as follows.
The CWL is a graph language based on semantic hyper directed graphs: a node represents a concept, an arc represents a relation between nodes, and a node can be annotated with attributes. The CWL can be expressed in three languages: UNL, CDL and RDF/OWL. The same information can be described in each language, but in a different manner. CWL.unl is based on UNL, CWL.cdl is based on CDL, and CWL.rdf is based on RDF/OWL.
The three representations of CWL allow the same information to be treated in different ways. CWL.unl is for multilingualism, CWL.cdl is compatible with semantic computing systems, and CWL.rdf is for working with various data navigation and aggregation systems (such as SPARQL).
Compatibility among the expressions in UNL (CWL.unl), CDL (CWL.cdl), and RDF (CWL.rdf) is guaranteed. The following are the expressions in each representation for the sentence `I purchased a computer yesterday.'.
{unl} //Table Form of UNL expression
agt(purchase(icl>buy(agt>person,obj>thing)).@entry.@past, I)
obj(purchase(icl>buy(agt>person,obj>thing)).@entry.@past, computer(icl>machine))
tim(purchase(icl>buy(agt>person,obj>thing)).@entry.@past, yesterday(icl>day))
{/unl}
{#S Situation;
{#A Event tmp='past';
{#A1 purchase(icl>buy(agt>person,obj>thing));}
{#A2 I ral='def';}
{#A3 computer(icl>machine) ral='def';}
{#A4 yesterday(icl>day) ral='def';}
[#A1 cdd.nl#agt #A2]
[#A1 cdd.nl#obj #A3]
[#A1 cdd.nl#tim #A4]
}
}
RDF // N-Triples representation: Subject Property Object "."
#S rdf:type Situation.
#A rdf:type Event.
#S hasComplexEntity #A.
#A hasElementalEntity #A1.
#A hasElementalEntity #A2.
#A hasElementalEntity #A3.
#A hasElementalEntity #A4.
#A1 rdf:type purchase(icl>buy(agt>person,obj>thing)).
#A2 rdf:type I.
#A2 ral 'def'.
#A3 rdf:type computer(icl>machine).
#A3 ral 'def'.
#A4 rdf:type yesterday(icl>day).
#A4 ral 'def'.
#A1 agt #A2.
#A1 obj #A3.
#A1 tim #A4.
#A tmp 'past'.
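As a rough illustration of how the nodes, attributes and arcs of this example map onto the simplified Subject Property Object statements, the following Python sketch emits the same kind of lines. The function name `to_statements` and the dictionary layout are hypothetical and are not part of any CWL tool; the output follows the simplified notation of this example, not strict N-Triples syntax.

```python
def to_statements(container, nodes, arcs):
    """Emit 'Subject Property Object.' lines for one event.

    container: identifier of the enclosing entity, e.g. "#A" (hypothetical layout)
    nodes:     {node_id: (concept, {attribute: value})}
    arcs:      [(source_id, relation, target_id)]
    """
    lines = []
    # Membership of each elemental node in the enclosing entity.
    for node_id in nodes:
        lines.append(f"{container} hasElementalEntity {node_id}.")
    # Concept type and attributes of each node.
    for node_id, (concept, attributes) in nodes.items():
        lines.append(f"{node_id} rdf:type {concept}.")
        for name, value in attributes.items():
            lines.append(f"{node_id} {name} '{value}'.")
    # One statement per arc (relation) between nodes.
    for source, relation, target in arcs:
        lines.append(f"{source} {relation} {target}.")
    return lines


nodes = {
    "#A1": ("purchase(icl>buy(agt>person,obj>thing))", {}),
    "#A2": ("I", {"ral": "def"}),
    "#A3": ("computer(icl>machine)", {"ral": "def"}),
    "#A4": ("yesterday(icl>day)", {"ral": "def"}),
}
arcs = [("#A1", "agt", "#A2"), ("#A1", "obj", "#A3"), ("#A1", "tim", "#A4")]

statements = to_statements("#A", nodes, arcs)
print("\n".join(statements))
```

The sketch covers only the statements about the event #A and its elemental entities; the enclosing Situation and the tmp attribute of the event would need to be added in the same manner.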
The CWL is designed to describe the meta-data and contents of web pages in order to break language barriers and to enable computers to process web information semantically. The CWL is a graph language based on extended semantic hyper directed graphs: a node represents a concept, an arc represents a relation between nodes, and a node can be annotated with attributes. Every node, including nodes inside hyper nodes, can have relations with any other node.
A concept can be expressed by a morpheme, a word, a phrase, a clause or a sentence of any language, or by a Universal Word (UW) developed by the UNDL Foundation. A concept (a character string) expresses the meaning that the morpheme, word, phrase, clause or sentence expresses in its language.
The meaning of a concept in CWL depends on the context: it is restricted by the relations with other concepts, that is, by the arcs going in and out of its node.
The following is a CWL.unl graph representation for the sentence `Long ago, in the city of Babylon, people began to build a huge tower, which seemed about to reach the heavens.'
begin.entry.past-- tim->long ago
--plc->city.def
--agt->people.def
--obj->build.past--obj-> tower<- aoj--huge<- seem.past
--obj->reach.begin.soon--obj->tower
--gol->heaven.def.pl
The same graph is written in CWL.unl as follows.
{cwl.unl}
tim(begin.@entry.@past,long ago)
mod(city.@def,Babylon)
plc(begin.@entry.@past,city.@def)
agt(begin.@entry.@past,people.@def)
obj(begin.@entry.@past,build.@past)
agt(build,people.@def)
obj(build,tower)
aoj(huge,tower)
aoj(seem.@past,tower)
obj(seem.@past,reach.@begin.@soon)
obj(reach.@begin.@soon,tower)
gol(reach.@begin.@soon,heaven.@def.@pl)
{/cwl.unl}
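To make the table form concrete, the following Python sketch splits one line into its relation label and two nodes, and separates each node into a headword and its attributes. It assumes that every line has the shape rel(node1,node2) and that attributes are written as ".@name" suffixes, as in the listing above; the function names are hypothetical and this is not an official CWL parser.

```python
import re

# Assumed shape of one table-form line: rel(node1,node2).
LINE = re.compile(r"^(\w+)\((.*)\)$")


def split_top_level(body):
    """Split at the first comma that is not inside parentheses."""
    depth = 0
    for i, ch in enumerate(body):
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
        elif ch == "," and depth == 0:
            return body[:i].strip(), body[i + 1:].strip()
    raise ValueError("no top-level comma in: " + body)


def parse_node(text):
    """Separate a node into its headword and its list of attributes."""
    head, *attrs = text.split(".@")
    return head, attrs


def parse_relation(line):
    """Return (relation label, (head1, attrs1), (head2, attrs2))."""
    match = LINE.match(line.strip())
    if match is None:
        raise ValueError("not a binary relation: " + line)
    left, right = split_top_level(match.group(2))
    return match.group(1), parse_node(left), parse_node(right)


for text in ["tim(begin.@entry.@past,long ago)",
             "gol(reach.@begin.@soon,heaven.@def.@pl)"]:
    print(parse_relation(text))
```

Splitting at the top-level comma rather than at every comma keeps Universal Words with restrictions, such as computer(icl>machine), intact as a single node.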
Relations constitute the syntax of the CWL. Together with UWs, they express objectivity by describing how concepts (UWs) relate to each other to constitute a sentence. We adopt the UNL relation set as the basis of the CWL relation set. Pseudo relations will be added as the need arises.
The relation labels are listed below (46 labels): "agt", "aoj", "bas", "ben", "cag", "cao", "cnt", "cob", "con", "coo", "dur", "equ", "fmt", "frm", "gol", "icl", "ins", "int", "iof", "man", "met", "mod", "nam", "nxt", "obj", "or", "per", "plc", "plf", "plt", "pof", "pos", "ptn", "pur", "qua", "ref", "rsn", "scn", "seq", "shd", "src", "tim", "tmf", "tmt", "to", and "via".
Attributes mainly describe subjectivity. We also adopt the UNL attribute set as the basis of the CWL attribute set. Attributes will be added as the need arises. At present, attributes are categorized into the following 7 groups.
A concept can be expressed by a morpheme, a word, a phrase, a clause or a sentence of any language, or by a Universal Word (UW) developed by the UNDL Foundation. A concept (a character string) expresses the meaning that the morpheme, word, phrase, clause or sentence expresses in its language.
The CWL vocabulary is defined as the CWL ontology. Every concept must be defined in the CWL ontology. The CWL ontology provides the linguistic knowledge of the CWL. The following hierarchy of concepts is a part of the top ontology.
concept
nominal concept
thing
abstract thing
attribute
quality
feature
event
action
mental action
physical action
process
phenomenon
mental phenomenon
physical phenomenon
change
process
information
statement
description
quantity
rule
state*
condition
mental state
physical state
situation
way
arrangement
behavior
manner
method
attributive thing
group
group(icl>volitional thing)
set
concrete thing
living thing
human
animal
plant
natural world
substance
...
functional thing
facilities
symbol
tool
...
volitional thing
animal
group
human
...
place
area
relative place
...
time
period
...
predicative concept
do
act
express
state
get
obtain
give
provide
make
produce
take
change
move
put
mentally do
physically do
do(agt>thing)
do(agt>thing,obj>thing)
...
treat
deal
perform
occur
become
happen
change
move
mentally happen
physically happen
...
occur(obj>thing)
occur(gol>thing,obj>thing)
...
be
be(aoj>thing,obj>thing)
be(aoj>thing)
...
attributive concept
(qua<thing) :quantifier
(mod<thing) :modifier
...
adverbial concept
(qua>quantifier)
(man<predicative concept)
how
The CWL ontology also defines every possible relation between concepts.
There are many factors to be considered in choosing an inventory of relations between concepts. Taking different factors into account leads to different sets of relations. We adopt the relation set of the UNL. The UNL relations are selected basically according to the following principles:
Relation
predicative relation
agt(agent)
aoj(thing with attribute)
cag(co-agent)
cao(co-thing with attribute)
ptn(partner)
ben(beneficiary)
cob(affected co-thing)
obj(affected thing)
opl(affected place)
ins(instrument)
met(method or means)
man(manner)
plc(place)
plf(initial place)
plt(final place)
scn(scene)
gol(goal, final state)
src(source, initial state)
via(intermediate place or state)
dur(duration)
tim(time)
tmf(initial time)
tmt(final time)
inter-concept relation
and(conjunction)
or(disjunction, alternative)
fmt(range/from-to)
frm(origin)
to(destination)
equ(equivalent)
icl(included/a kind of)
iof(an instance of)
inter-event relation
con(condition)
coo(co-occurrence)
pur(purpose or objective)
rsn(reason)
seq(sequence)
qualification relation
bas(basis for expressing a standard)
cnt(content, namely)
man(manner)
mod(modification)
nam(name)
per(proportion, rate or distribution)
pof(part-of)
pos(possessor)
qua(quantity)
The following are the relations defined according to the above principles. A relation label is a string of 3 characters or fewer.
Agt defines a thing (an agent) in focus that initiates an action. It can be a relation between:
do--agt->thing action--agt->thing
An agent initiates an action indicated by "do" or "action" and is thought of as having a direct role in making the action happen.
| break--agt->John | John breaks … |
| translate--agt->computer | computer translates … |
| run--agt->car | car runs … |
| destroy--agt->explosion | explosion destroys … |
And defines a partner to have conjunctive relation. It can be a relation between:
and(concept, concept)
A conjunction is defined as the relation between a concept, and another concept.
| quickly--and->easily | … easily and quickly |
| dance--and->sing | … singing and dancing |
| Mary--and->John | … John and Mary |
Aoj defines a thing that is in a state or has an attribute. It can be a relation between:
be--aoj->thing thing--aoj->thing
| red--aoj->leaf | … leaf is red. |
| available--aoj->information | This information is available for … |
| nice--aoj->ski | Skiing is nice. |
| teacher--aoj->John | John is a teacher. |
| I<-aoj--have--obj->pen | I have a pen. |
| know--aoj->John | John knows … |
Bas defines a thing used as the basis (standard) of comparison. It can be a relation between:
be--bas->thing do--bas->thing how--bas->thing
| more--bas->7 | more than seven. |
| more--bas->Jack | Betty weighs more than Jack (does). |
| beautiful--man->more--bas->rose | A tulip is more beautiful than a rose |
| (quiet(aoj>thing)--man->more--bas->shy)--aoj->John | John is more quiet than shy. |
| prefer--bas->(living--plc->city) | Many people prefer living in the country to living in a city |
Ben defines an indirectly related beneficiary or victim of an event or state. It can be a relation between:
predicative concept--ben->thing
| give--ben->country | To give one's life for one's country. |
| good--ben->John | It is good for John to … |
Cag defines a thing not in focus that initiates an implicit event that is done in parallel. It can be a relation between:
do--cag->thing
| walk--cag->John | To walk with John |
| live--cag->aunt | To live with … aunt |
Cao defines a thing not in focus that is in a parallel state. It can be a relation between:
be--cao->thing thing--cao->thing
| exist--cao->you | … be with you |
Cnt defines or shows the content of a concept. It can be a relation between:
concept--cnt->concept
| Internet--cnt->amalgamation | The Internet: an amalgamation |
| language generator--cnt->"deconverter" | a language generator "deconverter"… |
| risk--cnt->(losing--obj->money) | the risk of losing money |
Cob defines a thing that is directly affected by an implicit event done in parallel or an implicit state in parallel.
predicative concept--cob->thing
| die--cob->Mary | … dead with Mary |
| John<-obj--injure--cob->friend.pl--pos->he | John was injured in the accident with his friends |
Con defines a non-focused event or state that conditions a focused event or state.
predicative concept--con->concept
| you<-aoj--tired--con->go | If you are tired, we will go straight home. |
Coo defines a co-occurrent event or state for a focused event or state.
predicative concept--coo->predicative concept
| cry--coo->run | … was crying while running |
| red--coo->hot | … is red while … is hot |
Dur defines a period of time during which an event occurs or a state exists.
predicative concept--dur->predicative concept predicative concept--dur->event predicative concept--dur->period predicative concept--dur->state
| work--dur->hour--qua->9 | … work nine hours (a day) |
| talk--dur->meeting | … talk … during meeting |
| come--dur->absence | … come during (my) absence |
Equ defines an equivalent concept.
concept--equ->concept
| deconverter--equ->generator | the deconverter (a language generator) |
Fmt defines a range between two things.
thing--fmt->thing
| z--fmt->a | the alphabets from a to z |
| New York--fmt->Osaka | the distance from Osaka to New York |
| Friday--fmt->Monday | the weekdays from Monday to Friday |
Frm defines an initial state of a thing or a thing initially associated with the focused thing.
thing--frm->thing thing--frm->be
| visitor--frm->Japan | a visitor from Japan |
Gol defines a final state of object or a thing finally associated with the object of an event.
be--gol->thing do--gol->thing do--gol->be occur--gol->thing occur--gol->be
| change--gol->red | the lights changed from green to red |
| deposit--gol->account | millions were deposited in a Swiss bank account |
Icl defines an upper concept or a more general concept.
concept--icl->concept
| mammal--icl->animal | a mammal is a (kind of) animal |
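Because icl arcs chain from a concept to ever more general concepts, the upper concepts of a given concept can be collected by following the arcs upward. The Python sketch below illustrates this under two assumptions that are not stated in this report: each concept has at most one icl parent, and "animal" is placed under "living thing" as the top ontology listing suggests. The function name `ancestors` is hypothetical.

```python
def ancestors(concept, icl):
    """Follow icl arcs upward and return every more general concept reached.

    icl maps a concept to its single upper concept (an assumption of this
    sketch; a real ontology may give a concept several upper concepts).
    """
    seen = []
    current = icl.get(concept)
    # Stop at the top of the chain, or if a cycle is detected.
    while current is not None and current not in seen:
        seen.append(current)
        current = icl.get(current)
    return seen


# "mammal--icl->animal" is taken from the example above; the second arc is
# an assumed reading of the top ontology listing.
icl = {"mammal": "animal", "animal": "living thing"}
print(ancestors("mammal", icl))
```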
Ins defines an instrument to carry out an event.
do--ins->concrete thing
| look--ins->telescope | look at stars through [with] a telescope |
| write--ins->pencil | write [draw] with a pencil |
| cut--ins->scissors | He cut the string with a pair of scissors |
Int defines an intersection between concepts, that is, a partner concept with which the intersection is taken.
concept--int->concept
Iof defines a class concept that an instance belongs to.
concept--iof->concept
| Tokyo--iof->city in Japan | Tokyo is a city in Japan |
Man defines a way to carry out an event or the characteristics of a state.
predicative concept--man->how
| move--man->quickly | move quickly |
| visit--man->often | I often visit him. |
| beautiful--man->very | it is very beautiful. |
Met defines a means to carry out an event.
do--met->abstract thing do--met->do
| solve--met->dynamics | … solve … with dynamics |
| solve--met->algorithm | … solve … using … algorithm |
| separate--met->cut | … separate … by cutting … |
Mod defines a thing that restricts a focused thing.
thing--mod->thing thing--mod->attributive concept
| story--mod->whole | the whole story |
| plan--mod->master | a master plan |
| part--mod->main | the main part |
Nam defines a name of a thing.
thing--nam->thing
| tower--nam->Tokyo | Tokyo tower |
Obj defines a thing in focus that is directly affected by an event or state.
predicative concept--obj->thing
| move--obj->table | the table moved. |
| melt--obj->sugar | the sugar melts into … |
| cure--obj->patient | to cure the patient. |
| have--obj->pen | I have a pen. |
Opl defines a place in focus affected by an event.
do--opl->thing occur--opl->thing
| pat--opl->shoulder | … pat … on shoulder |
| cut--opl->middle | … cut … in middle |
Or defines a disjunctive relation between two concepts.
concept--or->concept
| leave--or->stay | Will you stay or leave? |
| blue--or->red | Is it red or blue? |
| Jack--or->John | Who is going to do it, John or Jack? |
Per defines a basis or unit of proportion, rate or distribution.
thing--per->thing
| 8<-qua--hour--per->day | eight hours a day |
| 2<-qua--time--per->week | … twice a week |
Plc defines a place where an event occurs, or a state that is true, or a thing that exists.
predicative concept--plc->place thing--plc->place
| cook--plc->kitchen | … cook … in the kitchen |
| sit--plc->beside | … sit beside me |
| cool--plc->here | It's cool here. |
Plf defines a place where an event begins or a state that becomes true.
predicative concept--plf->thing
| travel--plf->Tokyo | travelling from Tokyo |
| deep--plf->there | The sea is deep from there to here. |
Plt defines a place where an event ends or a state that becomes false.
predicative concept--plt->thing
| travel--plt->Boston | to travel to Boston |
| deep--plt->here | The sea is deep from there to here |
Pof defines a concept of which a focused thing is a part.
thing--pof->thing
| preamble--pof->document | the preamble of a document |
| initial--pof->machine translation | the initials of Machine Translation |
Pos defines the possessor of a thing.
thing--pos->volitional thing
| dog--pos->John | John's dog |
| book--pos->I | my book |
Ptn defines an indispensable non-focused initiator of an action.
do--ptn->thing
| compete--ptn->John | … compete with John |
| share--ptn->poor | … share … with the poor |
| collaborate--ptn->he | … collaborate with him … |
Pur defines the purpose or objective of an agent of an event or a purpose of a thing that exists.
do--pur->do do--pur->thing thing--pur->concept
| come--pur->see | … come to see you |
| work--pur->money | … work for money |
| budget--pur->research | our budget for research |
Qua defines the quantity of a thing or unit.
thing--qua->quantity
| coffee--qua->cup--qua->2 | Two cups of coffee |
| kilogram--qua->many | many kilograms |
| dog--qua->2 | two dogs |
Rsn defines a reason why an event or a state happens.
predicative concept--rsn->predicative concept predicative concept--rsn->thing
| go.not--rsn->rain | … didn't go because of the rain |
| start--rsn->(Mary<-agt--arrive) | They can start because Mary arrived. |
| city<-aoj--known--rsn->beauty | a city known for its beauty |
Scn defines a scene where an event occurs, or state is true, or a thing exists.
predicative concept--scn->thing thing--scn->thing
| win--scn->contest | … win a prize in a contest |
| appear--scn->program | … appear on a TV program |
| play--scn->movie | … play in movie |
Seq defines a prior event or state of a focused event or state.
predicative concept--seq->predicative concept
| leap--seq->look | Look before you leap. |
| red--seq->green | It was green and then red. |
| take off--seq->come in | She came in and took her coat off. |
Src defines the initial state of an object or thing initially associated with the object of an event.
do--src->thing do--src->be occur--src->thing occur--src->be
| change--src->red | The lights changed from green to red. |
| withdraw--src->stove | I quickly withdrew my hand from the stove. |
Tim defines the time an event occurs or a state is true.
predicative concept--tim->time thing--tim->time
| leave--tim->Tuesday | … leave on Tuesday |
| do--tim->o'clock | … do … at … o'clock |
| start--tim->come | Let's start when … come |
Tmf defines the time an event starts or a state becomes true.
predicative concept--tmf->time thing--tmf->time
| work--tmf->morning | … work from morning to [till] night |
| change--tmf->live | … has changed … since I have lived here. |
Tmt defines a time an event ends or a state becomes false.
predicative concept--tmt->time thing--tmt->time
| work--tmt->night | … work from morning to [till] night |
| full--tmt->tomorrow | … be full till tomorrow |
To defines a final state of a thing or a final thing (destination) associated with the focused thing.
thing--to->thing
| train--to->London | a train for London |
| letter--to->you | a letter to you |
Via defines an intermediate place or state of an event.
do--via->thing occur--via->thing
| go--via->New York | … go … via New York |
| bike--via->Alps | … bike … through the Alps |
| drive--via->tunnel | … drive … by way of the tunnel |
Attributes mainly describe the subjectivity information of sentences. We adopt the attribute set of the UNL. Attributes show what is said from the speaker's point of view, that is, how the speaker views what is said. This includes phenomena technically called "speech acts", "propositional attitudes", "truth values", etc. Attributes are also used to express the range of a concept, for example to indicate that a concept is generic. In addition, we newly introduce attributes for logical expressions in order to strengthen the expressiveness of the CWL.
Relations and concepts describe the objectivity information of sentences. Attributes modify concepts, including hyper nodes (compound concepts), to indicate subjectivity information, such as how the speaker views these states of affairs and his attitudes toward them, and to indicate properties of the concepts.
Attributes are divided into the following groups:
Attribute:
attribute of concept
attribute of nominal concept
attribute of predicative concept
aspect (view on aspects of event)
begin :beginning of an event or a state
complete :finishing/completion of a (whole) event
continue :continuation of an event
custom :customary or repetitious action
end :end/termination of an event or a state
experience :experience
progress :an event is in progress
repeat :repetition of an event
state :final state or the existence of the
object on which an action has been taken
time (with respect to speaker)
past :happened in the past
present :happening at present
future :will happen in future
view of emphasis, focus and topic
contrast :contrasted concept
emphasis :emphasized concept
entry :entry or main UW of a sentence or a scope
qfocus :focused concept of a question
theme :instantiates an object from a different class
title :title
topic :topic
attitudes(modality)
affirmative :affirmation
confirmation :confirmation
exclamation :exclamation
humility :in a humility manner
imperative :imperative
interrogative :interrogation
invitation :inducement
polite :polite way
request :request
respect :respectful way
vocative :vocative
feelings and judgments
ability :ability, capability of doing something
get-benefit :speaker's feeling of receiving
benefits through the fact or result of
something (to be) done by somebody else
give-benefit :speaker's feeling of giving benefits
by doing something for somebody else
conclusion :logical conclusion due to a certain condition
consequence :logical consequence
sufficient :sufficient condition
consent :consent feeling of the speaker about something
dissent :dissent feeling of the speaker about something
grant :to give/get consent/permission to do something
grant-not :not to give consent to do something
although :something follows against [contrary
to] or beyond expectation
discontented :discontented feeling of the speaker
about something
expectation :expectation of something
wish :wishful feeling, to wish something is
true or has happened
insistence :strong determination to do something
intention :intention about something or to do something
want :desire to do something
will :determination to do something
need :necessity to do something
obligation :obligation to do something according
to (quasi-) law, contract, or …
obligation-not :obligation not to do something, forbid
to do something according to (quasi-)
law, contract or …
should :to do something as a matter of course
unavoidable :unavoidable feeling of the speaker
about doing something
certain :certainty that something is true or happens
inevitable :logical inevitability that something
is true or happens
may :practical possibility that something
is true or happens
possible :logical possibility that something is
true or happens
probable :(practical) probability that something
is true or happens
rare :rare logical possibility that
something is true or happens
unreal :unreality that something is true or happens
admire :admiring feeling about something
blame :blameful feeling about something
contempt :contemptuous feeling about something
regret :regretful feeling about something
surprised :surprised feeling about something
troublesome :troublesome feeling about occurrence
of something
logicality
transitive :has transitivity
symmetric :has symmetricity
identifiable :can identify the subject
disjointed :all element concepts have no common
instance, that is, connected concepts do
not share instances
view of reference
generic :generic concept
def :already referred
indef :non-specific class
not :complement set
ordinal :ordinal number
modifying attribute on aspect
just :expresses an event or a state that has
just begun or ended/completed
soon :expresses an event or a state that is
about to begin or end/completed
yet :expresses an event or a state that has
not yet started or ended/completed,
together with not
Where does the speaker situate his description in time, taking his moment of speaking as a point of reference? A time before he spoke? After? At approximately the same time? This is the information that defines "narrative time" as past, present or future. These Attributes are attached to the main predicate.
Although in many languages this information is signaled by tense markings on verbs, the concept is not tense, but "time with respect to the speaker". The clearest example is the simple present tense in English, which is not interpreted as the present time, but as "independently of specific times".
Consider the example:
The earth is round.
This sentence is true in the past, present and future, independently of the speaker's time, so although the tense is "present" it is not interpreted as the present time.
| Attribute | Meaning | Example |
|---|---|---|
| past | happened in the past | ex) It was snowing yesterday |
| present | happening at present | ex) It is raining hard. |
| future | will happen in future | ex) He will arrive tomorrow |
A speaker can emphasize or focus on part of an event or treat it as a whole unit. This is closely linked to how the speaker places the event in time. These Attributes are attached to the main predicate.
The speaker can focus on the beginning "begin" of the event, looking forward to it "begin.soon", or backward to it "begin.just".
He can also focus on the end "end" or completion "complete" of the event, looking forward to it "end.soon" or "complete.soon", or backward to it "end.just" or "complete.just".
He can focus on the middle "progress" or continuation "continue" of the event.
The speaker can choose to focus on the lasting effects or final state of the event "state" or on the event as a repeating unit "repeat", experience "experience" or custom "custom".
He can also focus on the incompleteness or the fact that it has not yet happened, by using "yet".
| Attribute | Meaning | Example |
|---|---|---|
| begin | beginning of an event or a state | ex) It began to work again. |
| complete | finishing/completion of a (whole) event. | ex) I've looked through the script. look.entry.complete |
| continue | continuation of an event | ex) He went on talking. talk.continue.past |
| custom | customary or repetitious action | ex) I used to visit [I would often go] there when I was a boy. visit.end.present |
| end | end/termination of an event or a state | ex) I have done it. do.end.present |
| experience | experience | ex) Have you ever visited Japan? visit.experience.interrogation ex) I have been there. visit.experience |
| progress | an event is in progress | ex) I am working now. work.progress.present |
| repeat | repetition of an event | ex) It is so windy that the tree branches are knocking against the roof. knock.entry.present.repeat |
| state | final state or the existence of the object on which an action has been taken | ex) It is broken. break.state |
These attributes are used to modify the attributes above, to express a variety of aspects of natural languages.
| Attribute | Meaning | Example |
|---|---|---|
| just | Expresses an event or a state that has just begun or ended/completed | ex) He has just come. come.complete.just |
| soon | Expresses an event or a state that is about to begin or end/completed | ex) The train is about to leave. leave.begin.soon |
| yet | Expresses an event or a state that has not yet started or ended/completed, together with "not". | ex) I have not yet done it. do.complete.not.yet |
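The way "just", "soon" and "yet" attach to an aspect attribute can be illustrated with a short Python sketch that reads a dotted expression such as leave.begin.soon and pairs each modifier with the nearest preceding aspect attribute. The pairing rule and the function name `pair_modifiers` are assumptions made for illustration; this report does not prescribe an ordering of attributes.

```python
# Aspect attributes and their modifiers as listed in this report.
ASPECTS = {"begin", "complete", "continue", "custom", "end",
           "experience", "progress", "repeat", "state"}
MODIFIERS = {"just", "soon", "yet"}


def pair_modifiers(expression):
    """Return (headword, [(aspect, modifier), ...]) for a dotted expression.

    Assumption of this sketch: a modifier applies to the nearest aspect
    attribute written before it; other attributes (e.g. "not") are skipped.
    """
    head, *attrs = expression.split(".")
    pairs = []
    last_aspect = None
    for attr in attrs:
        if attr in ASPECTS:
            last_aspect = attr
        elif attr in MODIFIERS:
            if last_aspect is None:
                raise ValueError(f"'{attr}' has no aspect attribute to modify")
            pairs.append((last_aspect, attr))
    return head, pairs


for text in ["come.complete.just", "leave.begin.soon", "do.complete.not.yet"]:
    print(pair_modifiers(text))
```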
Whether an expression refers to a single individual, a small group or a whole set is often not clear. The expression "the lion" is not sufficiently explicit for us to know whether the speaker means "one particular lion" or "all lions". Consider the following examples:
The lion is a feline mammal.
The lion is eating an antelope.
In the first example, it seems reasonable to suppose that the speaker understood "the lion" as "all lions", whereas in the second example as "one particular lion".
The following Attributes are used to make explicit what the speaker's view of reference seems to be.
| Attribute | Meaning | Example |
|---|---|---|
| generic | generic concept | ex) The dog is a faithful animal. |
| def | already referred | ex) the book you lost |
| indef | non-specific class | ex) There is a book on the desk. |
| not | complement set | ex) Don't be late! |
| ordinal | ordinal number | ex) the 2nd door |
These attributes are usually attached to UWs that denote things.
The speaker can choose to focus or emphasize parts of a sentence to show how important he thinks they are in the situation described. This is often related to sentence structure.
| Attribute | Meaning | Example |
|---|---|---|
| contrast | Contrasted UW | For instance, "but" in the examples below is used to introduce a word or phrase that contrasts with what was said before. ex) It wasn't the red one but the blue one. ex) He's poor but happy. |
| emphasis | Emphasized UW | ex) I do like it. |
| entry | Entry or main UW of a sentence or a scope | ex) He promised (entry of the sentence) that he would come (entry of the scope) |
| qfocus | Focused UW of a question | ex) Are you painting the bathroom blue? To this question, the answer will be `No, I'm painting the LIVING-ROOM blue' |
| theme | Instantiates an object from a different class | |
| title | Title | |
| topic | Topic | ex) He("topic") was killed by her. ex) The girl("topic") was given a doll. ex) This doll("topic") was given to the girl. |
One concept marked with "entry" is essential for each CWL expression or in a compound concept.
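A simple way to check this requirement on a table-form expression is to collect the distinct nodes that carry the entry attribute. The Python sketch below does so under the assumptions that each line has the shape rel(node1,node2), that attributes are written as ".@name" suffixes, and that node names contain no commas; the function name `entry_nodes` is hypothetical.

```python
def entry_nodes(lines):
    """Return the distinct headwords marked with @entry in table-form lines."""
    entries = set()
    for line in lines:
        # Text between the outermost parentheses holds the two nodes.
        body = line[line.index("(") + 1 : line.rindex(")")]
        for node in body.split(","):
            head, *attrs = node.strip().split(".@")
            if "entry" in attrs:
                entries.add(head)
    return entries


expression = [
    "agt(begin.@entry.@past,people.@def)",
    "obj(begin.@entry.@past,build.@past)",
    "obj(build,tower)",
]
print(entry_nodes(expression))
```

An empty result would indicate an expression that lacks its entry concept. The same node may appear with @entry in several lines, which is why the sketch collects a set rather than counting occurrences.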
The speaker can also express, directly or indirectly, what his attitudes or emotions are towards what is being said or whom it is being said to. This includes respect and politeness towards the listener and surprise toward what is being said.
| affirmative | Affirmation |
|---|---|
| confirmation | Confirmation ex) You won't say that, will you? ex) It's red, isn't it? ex) Then you won't come, right? |
| exclamation | Feeling of exclamation ex) kirei na! (`How beautiful (it is)!' in Japanese) ex) Oh, look out! |
| humility | In a humility manner to express something ex) That is quite impossible for the likes of me.@humility. |
| imperative | Imperative ex) Get up! ex) You will please leave the room. |
| interrogative | Interrogation ex) Who is it? |
| invitation | Inducement to do something ex) Will / Won't you have some tea? ex) Let's go, shall we? |
| polite | Polite way to express something ex) Could you (please)... ex) If you could … I would … |
| request | Request ex) Please don't forget… |
| respect | Respectful feeling. In many cases, some special words are used. ex) o taku (`(your) house' in Japanese) ex) Good morning, sir. |
| vocative | Vocative ex) Boys, be ambitious! |
These attributes express the speaker's feelings or how the speaker views or judges what is said.
This sort of subjective information depends heavily on the language. It should be possible to express every kind of subjective information from all languages. The set of attributes is therefore open to the developers of each language, who can introduce a new attribute when no existing attribute expresses the required meaning. A new attribute must also be introduced in the same way.
The following attributes are used to clarify the speaker's viewpoint information.
| ability | Ability, capability of doing something ex) The child can't walk yet. ex) He can speak English but he can't write it very well. |
|---|---|
| get-benefit | Speaker's feeling of receiving benefits through the fact or result of something (to be) done by somebody else ex) I'll have my secretary type the letter. *In Japanese the auxiliary verb `~te morau' is used to express the feeling of getting benefits. For instance, it is frequently used in a sentence in the sense of `to have somebody do something'. |
|---|---|
| give-benefit | Speaker's feeling of giving benefits by doing something for somebody else ex) Be kind to old people. *In Japanese the auxiliary verb `~te ageru' is used to express the feeling of giving benefits. For instance, it is frequently used in a sentence in the sense of `Be kind to old people'. |
| conclusion | Logical conclusion due to a certain condition ex) He is her husband; she is his wife. |
|---|---|
| consequence | Logical consequence ex) He was angry, wherefore I left him alone. |
| sufficient | Sufficient condition ex) only have to |
|---|---|
| consent | Consent feeling of the speaker about something |
|---|---|
| dissent | Dissent feeling of the speaker about something ex) But that's not true. |
| grant | To give/get consent/permission to do something ex) Can I smoke in here? ex) You may borrow my car if you like. |
| grant-not | Not to give consent to do something ex) You {mustn't/are not allowed to/may not} borrow my car. |
| although | Something follows against [contrary to] or beyond expectation ex) Although he didn't speak, I felt a certain warmth in his manner. |
|---|---|
| discontented | Discontented feeling of the speaker about something ex) (I'll tip you 10 pence.) But that's not enough! |
| expectation | Expectation of something ex) Children ought to be able to read by the age of 7. ex) If you leave now, you should get there by five o'clock. |
| wish | Wishful feeling, to wish something is true or has happened ex) If only I could remember his name! (~I do wish I could remember his name!) ex) You might have just let me know. |
| insistence | Strong determination to do something ex) He will do it, whatever you say. |
|---|---|
| intention | Intention about something or to do something ex) He shall get this money. (Speaker's intention) ex) We shall let you know our decision. |
| want | Desire to do something ex) I want to go to France. |
| will | Determination to do something ex) I'll write as soon as I can. ex) We won't stay longer than two hours. |
| need | Necessity to do something ex) You need to finish this work today. ex) I must be going now. ex) I always have to work hard. |
|---|---|
| obligation | Obligation to do something according to (quasi-) law, contract, or … ex) The vendor shall maintain the equipment in good repair. ex) You must come by nine. |
| obligation-not | Obligation not to do something, forbid to do something according to (quasi-) law, contract or … ex) Cars must not park in front of the entrance. ex) No smoking |
| should | To do something as a matter of course ex) You should do as he says. ex) You ought to start at once. |
| unavoidable | Unavoidable feeling of the speaker about doing something ex) I could not help speaking the truth. |
| certain | Certainty that something is true or happens ex) If Peter had the money, he would have bought a car. ex) They should be home by now. |
|---|---|
| inevitable | Logical inevitability that something is true or happens ex) All living things must die. |
| may | Practical possibility that something is true or happens ex) It may be true. ex) It could be. |
| possible | Logical possibility that something is true or happens ex) Anybody can make mistakes. ex) If Peter had the money, he would buy a car. |
| probable | (Practical) probability that something is true or happens ex) That would be his mother. ex) He must be lying. |
| rare | Rare logical possibility that something is true or happens ex) If such a thing should happen, what shall we do? ex) If I should fail, I will [would] try again. |
| unreal | Unreality that something is true or happens ex) If we had enough money, we could buy a car. ex) If Peter had the money, he could buy a car. |
| admire | Admiring feeling of the speaker about something |
|---|---|
| blame | Blameful feeling of the speaker about something ex) A sailor, and afraid of the sea! |
| contempt | Contemptuous feeling of the speaker about something ex) You never could do it *In Japanese the postpositional particles of `nado', `nanka' or `nante' as in `kimi nado niha..' can be used to express the contemptuous feeling of the speaker about the target, mainly in a negative sentence |
| regret | Regretful feeling of the speaker about something ex) It's a pity that he should miss such a golden opportunity. |
| surprised | Surprised feeling of the speaker about something ex) (He has succeeded!) But that's great! |
| troublesome | Troublesome feeling of the speaker about the occurrence of something ex) My house was [I had my house] broken into.@troublesome yesterday. *There is a troublesome feeling of the speaker when using a passive form of the verb in this case in Japanese. |
Typical CWL structures can be expressed by attributes to avoid the complexity of enconverting and deconverting. What marks are used for enclosing a word or phrase can also be expressed by attributes. The attributes for indicating enclosure must be attached to the scope node of the enclosed phrase if it consists of a (set of) binary relation(s) of CWL.
| passive | passive form | ex) Being bitten.@passive by a dog ... |
|---|---|---|
| pl | more than one | ex) children: child(icl>young person).@pl |
| angle_bracket | < > are used | |
| brace | { } are used | |
| double_parenthesis | (( )) are used | |
| double_quote | " " are used | |
| parenthesis | ( ) are used | |
| single_quote | ' ' are used | |
| square_bracket | [ ] are used | |
No dedicated CWL system exists at this moment, so the UNL system is used to input CWL and to output, in natural languages, what is expressed in CWL. The UNL system thus serves as the platform of CWL.
The CWL platform consists of the CWL Editor, the CWL converter and the UNL System, which in turn consists of the Enconverter (which converts natural languages into UNL) and the Deconverter (which converts UNL into natural languages).
Because the three forms of CWL expression (CWL.unl, CWL.cdl and CWL.rdf) are compatible with one another, the UNL system together with the conversion system allows people to make web pages in CWL.unl, CWL.cdl or CWL.rdf, and also allows people to read those web pages in their mother tongues.
Besides the UNL system, we provide a conversion system (the CWL converter) among CWL.unl, CWL.cdl and CWL.rdf. We also provide the necessary vocabulary for CWL based on the UNLKB, which provides the semantic background of the Universal Words of UNL. Knowledge about the words (CWL ontology) of each language is stored in the UNLKB (CWL.unl), CDD.nl (CWL.cdl) and OWL (CWL.rdf).
UNL is an acronym for `Universal Networking Language'. CWL.unl is a representation of CWL in UNL. UNL, including the UNLKB and UWs, was developed under the United Nations University / Institute of Advanced Studies in 1996, and research and development were transferred to the UNDL Foundation in 2001 (http://www.undl.org/unlsys/uw/unlkb.htm).
UNL expresses information or knowledge in the form of a semantic network with hyper-nodes. A UNL semantic network is made up of a set of binary relations; each binary relation is composed of a relation and two UWs that hold the relation. A binary relation of UNL is expressed in the following format:
<relation> ( <uw1>, <uw2> )
In <relation>, one of the relations defined in the UNL Specifications is described. In <uw1> and <uw2>, the two UWs that have the relation given by <relation> are described. The semantic network of a UNL expression is a directed graph formed by these directed binary relations. The three elements of each binary relation have the following interrelationship:
<uw1> – <relation> → <uw2>
Such a binary relation is interpreted as follows:
the UW given in <uw2> plays the role indicated by the relation given in <relation> held by the UW given in <uw1>; whereas the UW given in <uw1> holds the relation given in <relation> with the UW given in <uw2>.
A UNL expression is a hyper semantic network. That is, each node of the graph, <uw1> and <uw2> of a binary relation, can be replaced with a semantic network. Such a node consists of a semantic network of a UNL expression and is called a `scope'. A scope can be connected with other UWs or scopes. The UNL expressions in a scope are distinguished from others by assigning an ID to the <relation> of each binary relation that belongs to the scope.
The general description format of binary relations for a hyper-node of UNL expression is the following:
<relation> : <scope-id> ( <node1>, <node2> )
Where,
UNL expressions are provided in the format of UNL documents. A UNL document is a text file that includes the original sentences, UNL document tags, UNL expressions, etc.
A UNL document is enclosed with tags `[D:<dinf>]' and `[/D]'. Within these tags, each paragraph is enclosed with a pair of tags `[P:<p_num>]' and `[/P]', and each sentence is enclosed with a pair of tags `[S:<s_num>]' and `[/S]'. Inside a sentence, the source text is enclosed with `{org:<l_tag>}' and `{/org}', and its UNL expression is enclosed with `{unl:<uinf>}' and `{/unl}'. Sentences of target languages can also be stored in the UNL document. Each target sentence is enclosed with a pair of language tags `{<l_tag>}' and `{/<l_tag>}' following the UNL expression of each sentence.
The description format of a UNL document is the following:
| <UNL Document> | ::= "[D:" <dinf> "]" {"[P:" <paragraph number> "]" {"[S:" <sentence number> "]" <sentence> "[/S]"} ... "[/P]" } ... "[/D]" |
| <dinf> | ::= <document name> "," <author name> [ "," <document ID> "," <date> "," <email address> ] |
| <document name> | ::= "dn=" <character string> |
| <author name> | ::= "on=" <character string> |
| <document ID> | ::= "did=" <character string> |
| <date> | ::= "dt=" <character string> |
| <email address> | ::= "mid=" <character string> |
| <sentence> | ::= "{org:" <l-tag> [ "=" <code> ] "}" <source sentence> "{/org}" "{unl" [ ":" <uinf> ] "}" <UNL expression> "{/unl}" { "{" <l-tag> [ "=" <code> ] [ ":" <sinf> ] "}" <target sentence> "{/" <l-tag> "}" } ... /* all the information necessary for a sentence */ |
| <l-tag> | ::= "ab" | "cn" | "de" | "el" | "es" | "fr" | "id" | "hd" | "it" | "jp" | "lv" | "mg" | "pg" | "ru" | "sh" | "th"; /* language codes : language tags */ |
| <code> | ::= <character code name> |
| <character code name> | ::= <character string> |
| <source sentence> | ::= <character string> |
| <target sentence> | ::= <character string> |
| <uinf> | ::= <system name> "," <post-editor name> "," <reliability> [ "," <date> "," <email address> ] |
| <sinf> | ::= <system name> "," <post-editor name> "," <reliability> [ "," <date> "," <email address> ] |
| <system name> | ::= "sn=" <character string> |
| <post-editor name> | ::= "pn=" <character string> |
| <reliability> | ::= "rel=" <a number> |
| <paragraph number> | ::= <a number> |
| <sentence number> | ::= <a number> |
The tags used in a UNL document are the following:
| [D:<dinf>] | indicates the Beginning of a document and the necessary information about the document |
|---|---|
| [/D] | indicates the End of a document |
| [P:<p_num>] | indicates the Beginning of a paragraph |
| [/P] | indicates the End of a paragraph |
| [S:<s_num>] | indicates the Beginning of a sentence and the sentence number |
| [/S] | indicates the End of a sentence |
| {org:<l_tag>=<code>} | indicates the Beginning of an original/source sentence, language and character code, "=<code>" can be omitted. |
| {/org} | indicates the End of an original sentence |
| {unl:<uinf>} | indicates the Beginning of the UNL expressions of a sentence and necessary information, `:<uinf>' can be omitted. |
| {/unl} | indicates the End of the UNL expressions of a sentence |
| {<l_tag>} | indicates the Beginning of a target sentence of the language indicated by <l_tag> |
| {/<l_tag>} | indicates the End of a target sentence of the language indicated by <l_tag> |
See the following section about <UNL expression>.
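As a rough illustration of the tag layout described above, the following Python sketch extracts the source sentence and the lines of the UNL expression from a small UNL document. The function name `extract_sentences`, the sample document and its `dn`/`on` values are invented for this illustration. The regular expressions cover only the `[S]`, `{org}` and `{unl}` tags and do not interpret the optional `<uinf>` or `<code>` fields.

```python
import re

# Hypothetical sample document; the dn/on values are made up for illustration.
SAMPLE = """[D:dn=sample,on=anonymous]
[P:1]
[S:1]
{org:el}
I can hear a dog barking outside.
{/org}
{unl}
agt(hear(icl>perceive(agt>person,obj>thing)).@entry, I)
obj(hear(icl>perceive(agt>person,obj>thing)).@entry, :01)
{/unl}
[/S]
[/P]
[/D]
"""

# One pattern per tag pair; DOTALL lets '.' span the line breaks inside a block.
SENTENCE_RE = re.compile(r"\[S:(\d+)\](.*?)\[/S\]", re.DOTALL)
ORG_RE = re.compile(r"\{org:([a-z]+)(?:=[^}]*)?\}(.*?)\{/org\}", re.DOTALL)
UNL_RE = re.compile(r"\{unl(?::[^}]*)?\}(.*?)\{/unl\}", re.DOTALL)


def extract_sentences(document):
    """Return one dict per [S]...[/S] block found in the document text."""
    results = []
    for number, body in SENTENCE_RE.findall(document):
        org = ORG_RE.search(body)
        unl = UNL_RE.search(body)
        unl_lines = []
        if unl:
            unl_lines = [ln for ln in unl.group(1).splitlines() if ln.strip()]
        results.append({
            "number": int(number),
            "language": org.group(1) if org else None,
            "source": org.group(2).strip() if org else None,
            "unl": unl_lines,
        })
    return results


sentences = extract_sentences(SAMPLE)
```

Each returned entry keeps the sentence number, the language tag of the source sentence, the source text and the raw lines of the UNL expression, which can then be passed to a parser for binary relations.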
A UNL expression of a sentence is identified with the following tags: {unl} and {/unl}.
Any component, such as a word, phrase and, of course, a sentence of a natural language can be represented with UNL expressions. A UNL expression therefore consists of a UW or a (set of) binary relation(s). In UNL documents, a UNL expression for a sentence is enclosed by the tags {unl} and {/unl} inside [S] and [/S]. If a UNL expression consists of a UW, this UW should be enclosed further by the tags [W] and [/W]. If necessary, the whole sentence can also be expressed as a scope. In this case, the Compound UW-ID of the scope should be enclosed by [W] and [/W].
There are two forms for expressing UNL expressions: the table form and the list form. The table form is made up of a set of binary relations, and each binary relation is expressed by connecting the two related UWs directly. The list form is divided into two parts: a list of UWs with their corresponding IDs, and a list of binary relations described by those IDs. The table form of a UNL expression is more readable than the list form, but the list form is more compact than the table form. The two forms are convertible with each other.
{unl}
<binary relation>
...
{/unl}
{unl}
[W]
<UW><attribute list>
[/W]
{/unl}
{unl}
[W]
":" <compound UW-ID><attribute list>
[/W]
<binary relation>
...
{/unl}
Each tag and binary relation should end with a return code: `0x0a'.
Description format of a binary relation of the table form is the following:
| <binary relation> | ::= <relation> [":"<compound UW-ID>] "(" {{ <UW1> [":" <UW-ID1>]} | { ":" <compound UW-ID1> }}[<attribute list>] "," {{<UW2> [":" <UW-ID2>]} | { ":" <compound UW-ID2> }}[<attribute list>] ")" |
| <relation> | ::= a relation label, defined in `Chapter 2 Relations' |
| <UW> | ::= a UW, see `Chapter 3 Universal Words' |
| <attribute list> | ::= { `.' <attribute>} … |
| <attribute> | ::= an attribute, see `Chapter 4 Attributes' |
| <UW-ID> | ::= two alphanumeric characters of '0' - '9' and 'A' - 'Z' |
| <compound UW-ID> | ::= two digits of `00' - `99'. `00' must be used for the main sentence and can be omitted. |
A UNL expression can include more than one scope. Compound UW-IDs are for identifying each concept specified by compound UWs (scopes) in a UNL expression. A scope is a group of binary relations that can be referred to as a UW by indicating its compound UW-ID in the format of `:<Compound UW-ID>'. A node described in this way in the UNL expression network that refers to a scope is called a `Scope Node'. For details about the scope please refer to `3.2 Compound UWs'.
UW-IDs are for identifying each concept specified by UWs in a UNL expression. If a UW appears in a UNL expression more than once and means different concepts (things or events), a unique UW-ID must be given to each concept of the UWs.
The following shows an example of UNL expressions of the sentence `I can hear a dog barking outside':
{unl}
agt(hear(icl>perceive(agt>person,obj>thing)).@entry, I)
obj(hear(icl>perceive(agt>person,obj>thing)).@entry, :01)
agt:01(bark(agt>dog).@entry, dog(icl>canine))
plc:01(bark(agt>dog).@entry, outside(icl>place))
{/unl}
In the above UNL expression, `agt', `obj' and `plc' are relation labels, and `I', `bark(agt>dog)', `dog(icl>canine)', `hear(icl>perceive(agt>person,obj>thing))' and `outside(icl>place)' are UWs. `a dog barking outside' is expressed by a scope, and `01' is given as the compound UW-ID to the scope. `:01', which appears in the position of a UW, is the scope node that refers to the scope. Binary relations indicated by the Compound UW-ID define the contents of the scope.
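The structure of a table-form binary relation can be made concrete with a small Python sketch. It splits a line into the relation label, the optional compound UW-ID and the two nodes, taking care that commas inside a UW's constraint list are not mistaken for the separator between the nodes. The names `BinaryRelation`, `split_top_level` and `parse_binary_relation` are hypothetical, and the sketch assumes well-formed input rather than validating labels against the UNL Specifications.

```python
from typing import NamedTuple, Optional


class BinaryRelation(NamedTuple):
    relation: str
    scope_id: Optional[str]   # compound UW-ID, or None when it is omitted
    node1: str
    node2: str


def split_top_level(text):
    """Split 'a, b' on the first comma that lies outside all parentheses."""
    depth = 0
    for i, ch in enumerate(text):
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
        elif ch == "," and depth == 0:
            return text[:i].strip(), text[i + 1:].strip()
    raise ValueError("no top-level comma in %r" % text)


def parse_binary_relation(line):
    """Parse '<relation>[:<compound UW-ID>](<node1>, <node2>)'."""
    line = line.strip()
    label, _, rest = line.partition("(")
    if not rest.endswith(")"):
        raise ValueError("missing closing parenthesis in %r" % line)
    relation, _, scope_id = label.partition(":")
    node1, node2 = split_top_level(rest[:-1])
    return BinaryRelation(relation, scope_id or None, node1, node2)


inner = parse_binary_relation("agt:01(bark(agt>dog).@entry, dog(icl>canine))")
outer = parse_binary_relation(
    "obj(hear(icl>perceive(agt>person,obj>thing)).@entry, :01)")
```

Here `inner` belongs to scope `01`, while `outer` belongs to the main sentence and has the scope node `:01` as its second node. Attributes such as `.@entry` are kept as part of the node string.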
The list form of a UNL expression consists of a set of UWs and a set of encoded binary relations (expressed by UW-IDs) of a sentence. In case a whole sentence is treated as a scope, the Compound UW-ID of the scope for the sentence can be included in the UW list between [W] and [/W].
{unl}
[W]
{<UW> | {`:'<compound UW-ID>}}[<attribute list>]":" <UW-ID>
…
[/W]
[R]
<binary relation by UW-IDs>
…
[/R]
{/unl}
The tags used above have the following meanings.
| [W] | indicates the Beginning of the UW list. |
|---|---|
| [/W] | indicates the End of the UW list. |
| [R] | indicates the Beginning of the encoded binary relations. |
| [/R] | indicates the End of the encoded binary relations. |
Each tag, encoded binary relation and UW should end with a return code: `0x0a'.
UWs of a UNL expression must be listed between [W] and [/W] with different (unique) UW-IDs for different concepts. This means that the same UW expression but expressing different concepts (instances) must be given different UW-IDs. A scope must be defined again in the UW list.
| <binary relation by UW-IDs> | ::= <UW-ID1><relation>[`:'<Compound UW-ID>]<UW-ID2> |
| <UW-ID> | ::= two alphanumeric characters of `0' – `9' and `A' – `Z' |
| <Compound UW-ID> | ::= two digits of `00' – `99' |
For instance, the following shows an example of the list form of a UNL expression of the sentence `I can hear a dog barking outside'.
{unl}
[W]
I:01
hear(icl>perceive(agt>person,obj>thing)).@entry:02
dog(icl>canine):03
bark(agt>dog).@entry:04
outside(icl>place):05
:01:06
[/W]
[R]
02aoj01
02obj06
04agt:0103
04plc:0105
[/R]
{/unl}
In the above example, between [W] and [/W], the UWs 'I', 'hear(icl>perceive(agt>person,obj>thing))', 'dog(icl>canine)', 'bark(agt>dog)', 'outside(icl>place)' and the scope node `:01' are given the UW-IDs 01 to 06 respectively.
Between [R] and [/R], binary relations are described using the UW-IDs defined in the UW list. For example, `02obj06' in the second line shows that the concept identified by UW-ID 06 is the 'obj' of the concept identified by UW-ID 02. UW-ID 06 means the concept of scope 01, and UW-ID 02 means the concept of 'hear(icl>perceive(agt>person,obj>thing))'.
Binary relations `04agt:0103' and `04plc:0105' express the UNL expression of scope 01. This is indicated by the Compound UW-ID `01' that follows the relations 'agt' and 'plc'.
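The conversion from the list form back to the table form can be sketched in Python as follows. The sketch reads the UW list into a mapping from UW-ID to UW (or scope node) and then decodes each encoded binary relation with a regular expression that follows the syntax given above. The function names are hypothetical, and the sketch assumes that relation labels consist of lowercase letters only.

```python
import re

# <UW-ID1><relation>[:<Compound UW-ID>]<UW-ID2>
RELATION_RE = re.compile(r"^([0-9A-Z]{2})([a-z]+)(?::(\d{2}))?([0-9A-Z]{2})$")


def parse_uw_list(lines):
    """Map each UW-ID to its UW or scope node; the ID follows the last colon."""
    table = {}
    for line in lines:
        node, uw_id = line.strip().rsplit(":", 1)
        table[uw_id] = node
    return table


def decode_relation(encoded, uw_table):
    """Turn e.g. '04agt:0103' into a table-form binary relation string."""
    match = RELATION_RE.match(encoded.strip())
    if match is None:
        raise ValueError("not an encoded binary relation: %r" % encoded)
    id1, relation, scope_id, id2 = match.groups()
    label = relation if scope_id is None else "%s:%s" % (relation, scope_id)
    return "%s(%s, %s)" % (label, uw_table[id1], uw_table[id2])


# The [W] and [R] parts of the list-form example above.
UW_LINES = [
    "I:01",
    "hear(icl>perceive(agt>person,obj>thing)).@entry:02",
    "dog(icl>canine):03",
    "bark(agt>dog).@entry:04",
    "outside(icl>place):05",
    ":01:06",
]
ENCODED = ["02aoj01", "02obj06", "04agt:0103", "04plc:0105"]

uw_table = parse_uw_list(UW_LINES)
decoded = [decode_relation(r, uw_table) for r in ENCODED]
```

Because the UW-ID always follows the last colon of a line, the scope node entry `:01:06` is read as the node `:01` with UW-ID `06`.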
qua [":"<Compound UW-ID>] "(" {<UW1>|":" <Compound UW-ID>} "," {<UW2>|":"<Compound UW-ID>} ")"
A UW is made up of a character string (an English-language word) followed by a list of constraints. The meaning and function of each of these parts is described in the next section, on Interpretation.
The following is the syntax of description of UWs:
| <UW> | ::= <headword> [<constraint list>] |
| <headword> | ::= <character>… |
| <constraint list> | ::= "(" <constraint> [ "," <constraint>]… ")" |
| <constraint> | ::= <relation label> { `>' | `<' } <UW> [ <constraint list>] | <relation label> { `>' | `<' } <UW> [ <constraint list>] [ { `>' | `<' } <UW> [<constraint list>] ] … |
| <relation label> | ::= "agt" | "and" | "aoj" | "obj" | "icl" | ... |
| <character> | ::= "A" | ... | "Z" | "a" | ... | "z" | "0" | "1" | "2" | ... | "9" | "_" | " " | "#" | "!" | "$" | "%" | "=" | "^" | "~" | "|" | "@" | "+" | "-" | "<" | ">" | "?" |
The headword is an English word/compound word/phrase/sentence that is interpreted as a label for a set of concepts: the set made up of all the concepts that may correspond to that in English. A Basic UW (with no restrictions or constraint list) denotes this set. Each Restricted UW denotes a subset of this set that is defined by its constraint list. Extra UWs denote new sets of concepts that do not have English-language labels.
Thus, the headword serves to organize concepts and make it easier to remember which is which.
The constraint list restricts the interpretation of a UW to a subset or to a specific concept included within the Basic UW, thus the term `Restricted UWs'. The Basic UW `drink', without a constraint list, includes the concepts of `putting liquids in the mouth', `liquids that are put in the mouth', `liquids with alcohol', `absorb' and others. The Restricted UW 'drink(agt>thing,obj>liquid)' denotes the subset of these concepts that includes `putting liquids in the mouth', which in turn corresponds to verbs such as `drink', `gulp', `chug' and `slurp' in English.
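The split between headword and constraint list can be illustrated with a short Python sketch. It separates a UW into its headword and its top-level constraints, leaving nested constraint lists intact. The function name `parse_uw` is hypothetical, and the sketch does not check that relation labels are defined in the UNL specifications.

```python
def parse_uw(uw):
    """Split a UW into (headword, [constraint, ...]) at the top nesting level."""
    head, paren, rest = uw.partition("(")
    if not paren:
        return uw, []            # Basic UW: no constraint list
    if not rest.endswith(")"):
        raise ValueError("unbalanced constraint list in %r" % uw)
    body = rest[:-1]
    constraints, depth, start = [], 0, 0
    for i, ch in enumerate(body):
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
        elif ch == "," and depth == 0:
            # A comma outside nested parentheses separates two constraints.
            constraints.append(body[start:i])
            start = i + 1
    constraints.append(body[start:])
    return head, constraints


basic = parse_uw("drink")
restricted = parse_uw("drink(agt>thing,obj>liquid)")
nested = parse_uw("provide(icl>give(agt>thing,gol>thing,obj>thing))")
```

A Basic UW yields an empty constraint list, while the Restricted UW `drink(agt>thing,obj>liquid)` yields the two constraints `agt>thing` and `obj>liquid`.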
A restriction of a UW is made up of a pair of a relation and a pre-defined UW (or part expression of a pre-defined UW) that holds the relation with this UW. If more than one restriction is necessary, a comma `,' should be used between restrictions. A Restricted UW is defined through a Master Definition (for details please refer to the UW Manual). In a Master Definition, full expressions of pre-defined UWs must be described in the restrictions, whereas in a UW, part of a pre-defined UW (its headword or part of its restrictions) can be used in the restrictions if and only if uniqueness is preserved. Relation labels used in the constraint list must be defined in the UNL specifications and should be sorted in alphabetical order if more than one restriction is used.
In order to define the meaning of a UW more accurately, a subset concept is always defined under an upper UW that has the closest but more general meaning. This is implemented by linking the UW to be defined with the upper UW using the 'icl' relation. For example, the UW 'provide(icl>give(agt>thing,gol>thing,obj>thing))' is defined as a subset concept of the UW 'give(agt>thing,gol>thing,obj>thing)'. However, if the headword of the upper UW is one of `be', `do', `occur' and `uw', the headword does not need to remain in the restrictions of lower UWs, because the set of restrictions of each of these upper UWs is sufficient to restrict their lower UWs. For example, from the Master Definition 'drink({icl>do(}agt>thing,obj>liquid{)})' a UW 'drink(agt>thing,obj>liquid)' and a binary relation 'icl(drink(agt>thing,obj>liquid), do(agt>thing,obj>liquid))' are generated. The part related to the headword `do' is removed from the lower UW expression, and the binary relation that will be described in the UNLKB shows that 'drink(agt>thing,obj>liquid)' is a subset concept of 'do(agt>thing,obj>liquid)'. For details of the description of UWs please refer to the UW Manual.
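The `drink` example can be reproduced with a minimal Python sketch. It handles only the single braced pattern shown in the text, `headword({icl>upper(}restrictions{)})`; the full Master Definition syntax is defined in the UW Manual and is not reproduced here. The function name `expand_master_definition` and the regular expression are assumptions made for this illustration.

```python
import re

# Matches only the pattern headword({icl>upper(}restrictions{)}) from the text.
MASTER_RE = re.compile(r"^([^({]+)\(\{icl>([^({]+)\(\}(.*)\{\)\}\)$")


def expand_master_definition(definition):
    """Return (UW, icl binary relation) generated from a Master Definition."""
    match = MASTER_RE.match(definition)
    if match is None:
        raise ValueError("unsupported Master Definition: %r" % definition)
    headword, upper_headword, restrictions = match.groups()
    # The braced part is dropped from the lower UW ...
    uw = "%s(%s)" % (headword, restrictions)
    # ... and the upper UW takes the same restrictions under its own headword.
    upper_uw = "%s(%s)" % (upper_headword, restrictions)
    return uw, "icl(%s, %s)" % (uw, upper_uw)


uw, relation = expand_master_definition("drink({icl>do(}agt>thing,obj>liquid{)})")
```

For the definition in the text, `uw` is `drink(agt>thing,obj>liquid)` and `relation` is `icl(drink(agt>thing,obj>liquid), do(agt>thing,obj>liquid))`.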
A UW is a character string, and most UWs are made up of an English expression with restrictions. A UW can express concepts at various levels depending on the restrictions, and can be used to express a more specific or particular concept or an instance by giving attributes and IDs or restrictions from other UNL expressions. The UWs are divided into four types:
Basic UWs: go, take, house, state
Restricted UWs: state(icl>express(agt>thing,gol>person,obj>thing)), state(icl>country), state(icl>region), state(icl>abstract thing), state(icl>government)
Extra UWs: ikebana(icl>flower arrangement), samba(icl>dance), souffle(icl>food)
Temporary UWs: 1234, xyz
Basic UWs are character strings that correspond to English words. A Basic UW denotes all the concepts that may correspond to that English expression. A Basic UW is used on its own only if the English expression has no ambiguity; if the expression is ambiguous, the Basic UW is instead used as the headword of Restricted UWs for its various specific concepts.
Restricted UWs are by far the most important. A Restricted UW is made up of a headword (English expression) with restrictions. It is necessary when the English expression of the headword has a broader sense (more meanings) than the concept to be defined. The restrictions restrict the range of the concept that an English expression represents. Each Restricted UW made from an English expression represents a more specific or particular concept, or a subset of the concepts of the English expression.
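For example, the following are Restricted UWs made from the English word `state': state(icl>express(agt>thing,gol>person,obj>thing)), state(icl>country), state(icl>region), state(icl>abstract thing) and state(icl>government).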
The information in parentheses is the constraint list and it describes some conceptual restrictions; this is why they are called Restricted UWs. Informally, the restrictions mean `restrict your attention to this particular sense of the word'.
Extra UWs denote concepts that are not found in English and therefore have to be introduced as extra categories. Foreign-language words are used as headwords using English (Alphabetical) characters.
For example, the following are Extra UWs: ikebana(icl>flower arrangement), samba(icl>dance) and souffle(icl>food).
To the extent that these concepts exist for English speakers, they are expressed with foreign-language loanwords and do not always appear in English dictionaries. So they simply have to be added to be able to use these specific concepts in the UNL system. The restrictions give the idea of what kind of concept is associated with these Extra UWs and the constraints provide the binary relations between this concept and other, more general, concepts already defined. Needless to say, an Extra UW is also defined through a Master Definition, and a pre-defined UW or its part expression must be used in the restrictions of an Extra UW.
A number or an email address that is used as it is does not need to be defined. Such items can appear in a UNL document and are treated as temporary UWs.
ex) 1234
xyz
Compound UWs are sets of binary relations that are grouped together to express a complex concept. A sentence itself is considered a compound UW. Compound UWs denote complex concepts that are to be interpreted/understood as a whole so that one can talk about their parts all at the same time. A compound UW is expressed by a scope in UNL expressions. A scope makes it possible to connect a compound UW with other UWs when necessary.
Consider the following example:
[Women who wear big hats in movie theaters] should be asked [to leave].
The part of the sentence within square brackets is what should be asked. Only when they are grouped together and considered as a whole unit can the correct interpretation be obtained.
Attributes can be attached to them to express negation, speaker attitudes, etc., which are usually interpreted as modifying the main UW (attached with @entry) and its coordinate UWs within the compound UW (scope).
A Compound UW is defined by placing a Compound UW-ID immediately after the Relation Label in all of the binary relations that are to be grouped together. Thus, in the example below, `:01' indicates all of the elements that are to be grouped together to define Compound UW number 01.
agt:01(wear(aoj>thing,obj>hat), woman(icl>person).@pl)
obj:01(wear(aoj>thing,obj>hat), hat(icl>wear))
aoj:01(big(aoj>thing), hat(icl>wear))
plc:01(wear(aoj>thing,obj>hat), theater(icl>facilities))
mod:01(theater(icl>facilities), movie(icl>art))
agt:01(leave(agt>thing,obj>place).@entry, woman(icl>person).@pl)
After this group has been defined, the Compound UW-ID, `01' in the above example, can be used wherever necessary to cite the Compound UW. The way to cite a Compound UW is explained in the next section.
A Compound UW is considered as a sentence or sub-sentence, so in the definition of a Compound UW one entry node marked by @entry is necessary.
Once defined, a Compound UW can be cited or referred to by simply using the Compound UW-ID as a UW. The method is to indicate the Compound UW-ID following a colon `:'. The reference to a Compound UW is also called a Scope Node. The Scope Node has the following syntax:
| <Scope Node> | ::= ":" <Compound-ID> [ <Attribute List> ] |
| <Compound-ID> | ::= two digits of a number "01"–"99", except "00" |
| <Attribute List> | ::= { `.' <Attribute> } … |
| <Attribute> | ::= `@entry' | `@may' | `@past' | … |
| <Node> | ::= <UW> [`:' <Compound-ID> | <Sentence-ID>]… [ <Attribute List> ] |
| <Compound-ID> | ::= two digits of a number "01"–"99", except "00" |
| <Attribute List> | ::= { `.' <Attribute> } … |
| <Attribute> | ::= `@entry' | `@may' | `@past' | … |
To complete the UNL expression of `[Women who wear big hats in movie theaters] should be asked [to leave]', the following are necessary:
obj(ask(agt>thing,gol>person,obj>uw).@should.@entry, :01)
gol(ask(agt>thing,gol>person,obj>uw).@should.@entry, woman(icl>person).@pl.@topic)
'obj(ask(agt>thing,gol>person,obj>uw).@should.@entry, :01)' shows that Scope 01 is the obj of `ask'.
':01' shows the scope node. It is interpreted as the whole set of binary relations defined above. It means that `:01' should be understood as comprising all of these binary relations. Compound UWs can be cited within other Compound UWs.
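The grouping of binary relations into scopes, and the requirement that a Compound UW has an entry node, can be checked mechanically. The Python sketch below uses the binary relations of `I can hear a dog barking outside' as data. The function names are hypothetical, `00` is used for the main sentence, and treating the requirement as "exactly one @entry node per scope" is an interpretation of the text rather than a rule quoted from it.

```python
from collections import defaultdict

# (relation, compound UW-ID, node1, node2); "00" stands for the main sentence.
RELATIONS = [
    ("agt", "00", "hear(icl>perceive(agt>person,obj>thing)).@entry", "I"),
    ("obj", "00", "hear(icl>perceive(agt>person,obj>thing)).@entry", ":01"),
    ("agt", "01", "bark(agt>dog).@entry", "dog(icl>canine)"),
    ("plc", "01", "bark(agt>dog).@entry", "outside(icl>place)"),
]


def group_by_scope(relations):
    """Collect binary relations under the compound UW-ID they carry."""
    scopes = defaultdict(list)
    for relation, scope_id, node1, node2 in relations:
        scopes[scope_id].append((relation, node1, node2))
    return dict(scopes)


def entry_nodes(scope_relations):
    """Return the distinct nodes in one scope that carry the @entry attribute."""
    found = set()
    for _, node1, node2 in scope_relations:
        for node in (node1, node2):
            if "@entry" in node.split(".")[1:]:
                found.add(node)
    return found


def check_scopes(relations):
    """Report scopes without a single entry node and undefined scope nodes."""
    scopes = group_by_scope(relations)
    problems = []
    for scope_id, rels in sorted(scopes.items()):
        if len(entry_nodes(rels)) != 1:
            problems.append("scope %s needs exactly one @entry node" % scope_id)
        for _, node1, node2 in rels:
            for node in (node1, node2):
                # A node starting with ':' is a scope node citing a Compound UW.
                if node.startswith(":") and node[1:3] not in scopes:
                    problems.append("scope node %s is not defined" % node[:3])
    return problems


problems = check_scopes(RELATIONS)
```

For the example data the list of problems is empty: the main sentence and scope `01` each have one entry node, and the scope node `:01` refers to a scope that is defined.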
The UNLKB is a semantic network comprising every directed binary relation between UWs. All the binary relations of the UNLKB are in the following format: 'relation(UW1, UW2)=c', where 'c' is the degree of certainty, which has the value 0 (impossible) or 1 to 128 (certain). This binary relation means `UW1 takes UW2 as the relation in certainty value c', or `UW2 plays the role specified by relation for UW1 in certainty value c'.
The UNLKB provides concepts and the UWs of those concepts. A UW (Universal Word) is a label for a concept. Concepts labelled by UWs are defined by describing the set of possible relations that each concept can have with other concepts. Definitions of possible relations with other concepts describe the behaviour of concepts in terms of other concepts. This behaviour is the property of a concept in the sense that the descriptions of behaviour characterize the concept and provide enough information to understand the semantic structure of the sentence.
The behaviour of a concept is considered as linguistic knowledge about the concept. This knowledge is used to provide the semantic structure of sentences of natural languages. For example, an `author' is a `person', and can therefore take the various actions that a person can take: a person can write something, that something can be a book, and so forth. This level of knowledge is necessary to provide the semantic background of natural language sentences. Further knowledge, for example real world knowledge, will be established based on this linguistic knowledge using the UWs.
In the UNLKB, all UWs are linked with each other through 'icl' (subclass), 'iof' (element/instance) or 'equ' relations. 'icl' links a UW of a subclass concept to a super-class concept UW; 'iof' links a UW expressing an instance to a UW of a class concept. The UWs related to each other through 'icl', 'iof' and 'equ' relations make up a hierarchy of UWs. This hierarchy of UWs is the UW system. The UW system allows a UW to have multiple super-class concepts. Accordingly, the UW system is a lattice type of network.
The hierarchy of the UW system is constructed by taking the mechanisms of property inheritance and replacement by super-class concepts into consideration. In the UW system, lower UWs inherit the properties of upper UWs, and upper UWs can replace lower UWs to convey a more general sense in the specific context of the lower UWs. All such inheritance and replacement is carried out through the relations 'icl', 'iof' and 'equ'.
In the UNLKB, all possible relations, such as 'agt', 'obj', etc., that a UW can have with others are defined for each UW. Every possible relation is defined between the two most general UWs of the two categories (of lower UWs) that can have the relation. Utilizing the property inheritance mechanism of the UW system, possible relations of lower concepts are deductively inferred, and this inference mechanism can reduce the number of binary relations.
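The inheritance mechanism can be sketched in Python with a miniature knowledge base. The UWs, the 'icl' links and the two declared relations below are invented for illustration, loosely following the `author' example above, and are not taken from the actual UNLKB. A relation between two UWs is accepted when it is declared for some pair of their upper UWs.

```python
# Hypothetical miniature knowledge base; the UWs and links are illustrative only.
# Each UW maps to a list of upper UWs, so multiple super-classes are allowed.
ICL = {
    "author": ["person"],
    "person": ["thing"],
    "book": ["thing"],
}

# Possible relations are declared once, between the most general UWs.
POSSIBLE = {("agt", "write", "person"), ("obj", "write", "thing")}


def ancestors(uw):
    """Return uw together with every upper UW reachable through 'icl' links."""
    seen, stack = set(), [uw]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(ICL.get(current, []))
    return seen


def is_possible(relation, uw1, uw2):
    """A relation is possible if it is declared for some pair of upper UWs."""
    return any((relation, a, b) in POSSIBLE
               for a in ancestors(uw1) for b in ancestors(uw2))


author_can_write = is_possible("agt", "write", "author")
book_can_be_written = is_possible("obj", "write", "book")
book_can_write = is_possible("agt", "write", "book")
```

Only two relations are stored, yet `agt(write, author)` and `obj(write, book)` are both inferred through the hierarchy, while `agt(write, book)` is rejected because `book` is not under `person`.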
Replacement of lower UWs by upper UWs can cause problems by introducing ambiguities if the upper UWs are not close in meaning to the lower UWs. To avoid this, the upper UWs must be the closest UWs among all of the more general UWs . In other word, every UW must be positioned under the closest upper UWs.
The UNLKB defines the syntax and semantics of the language UNL. In the UNL System, UNLKB is used in sentence analysis for disambiguation and in sentence generation for finding more general concepts when encountering unknown concept to the language. The UNLKB also is used to verify UNL expressions since it provide syntax and semantics of UNL.
CWL.cdl is a representation of CWL in CDL.nl. CDL.nl is appropriate for the logical treatment of contents together with information in multimedia. UNL is appropriate for communication across language barriers. RDF/OWL is appropriate for dealing with metadata of web pages and already has many working applications.
Concept Description Language (CDL) is a language that describes a wide variety of representation media and content, as well as conceptualization of their meaning, in a common format. CDL is a simple extension of the basic principles of XML and is able to inherit stored data tagged with XML.
In contrast to XML, which annotates the document structure (syntactic structure) of content, CDL describes the concept structure (semantic structure) of content. This paper begins with the characteristics of CDL compared to those of XML.
The design of the XML specification is summarized as follows.
The following are the key concepts of the CDL specification.
A mechanism that is partially compliant with CDL can be set up for XML. This mechanism is realized by RDF/OWL in the Semantic Web. The following are the key characteristics of the RDF/OWL specification design.
In comparison, the following are the key CDL specifications.
In contrast to RDF/OWL, which was designed bottom-up from XML for metadata description, CDL was designed top-down from the basic perspective of concept structure, for the purpose of describing content concepts. For this reason, the CDL specification is consistent, simple and clear. For CDL implementation, both RDF/OWL and XML implementation specifications are prepared.
In CDL, a CDL.xxx has to be set up for each representation medium and each content format, based on CDL.core, which provides the meta-concepts for describing concept definitions. The respective concept vocabularies are defined in the corresponding CDD (CDD.xxx). For natural language media, the concepts shared with natural language are defined as CDD.nl. These are the concepts necessary for conceptualizing the meanings of terms, phrases, sentences and text in natural language. CDLs corresponding to each language, such as Japanese, English, and Chinese, are created as CDL.jpn, CDL.eng, CDL.chi, etc., and CDDs defining concepts for each language are prepared. Furthermore, CDDs for everyday language, for specialized areas and communities, for controlled language, etc., are created so that CDL can cover a variety of language usage.
In addition, CDLs can also be made for mathematical equations, programming languages, video image, music sound, etc., in the form of CDL.math, CDL.prog, CDL.movie, CDL.music, etc. CDDs therefore consist of concepts inherent to each representation media and concepts borrowed from natural language.
In CDL, maintaining continuity with XML is important. This is because the document structure (syntactic structure) serves as a guide for the concept structure (semantic structure), and because the huge amount of content that is already XML-tagged must be appropriately preserved. Two schemes are prepared for appropriate coordination between CDL and XML, as follows.
Model and Syntax determine how concepts or conceptualization are to be modeled and what kind of data format or grammar will be set up to express the model. In other words, they determine the common form of expression for all CDL family languages. Therefore, CDL.core follows this notation format.
First, the basic categories for the model of conceptualization are determined as follows.
Concept
  Entity
    ElementalEntity
    ComplexEntity
  Relation
    ElementalRelation
    ComplexRelation
  Attribute
Concepts and concept descriptions are modeled in the following network structure, namely:
Two notations are available for the network structure of concepts: graph notation using diagrams and text notation using symbolic text. The syntax of the text notation, specifically the basic syntax, is shown below. The following four meta symbols are used to describe the grammar.
| `::=' | Left side is defined as right side |
| `|' | Selection `or' |
| `Symbol' | Meta symbol (Non-terminal symbol) |
| `…' | Repetition zero or more times |
ConceptDescription ::= EntityDescription
EntityDescription ::= ElementalEntityDescription | ComplexEntityDescription
ElementalEntityDescription ::= {RealizationLabel DefinitionLabel AttributeValuePair… ;}
ComplexEntityDescription ::= {RealizationLabel DefinitionLabel AttributeValuePair… ; EntityDescription… RelationDescription… Arc…}
RelationDescription ::= ElementalRelationDescription | ComplexRelationDescription
ElementalRelationDescription ::= {RealizationLabel DefinitionLabel AttributeValuePair… ;}
ComplexRelationDescription ::= {RealizationLabel DefinitionLabel AttributeValuePair… ; From To EntityDescription… RelationDescription… Arc…}
AttributeValuePair ::= Attribute =Value
Arc ::= [EntityRealizationLabel-from RelationRealizationLabel EntityRealizationLabel-to]
RealizationLabel ::= #ArbitraryCharacterString
DefinitionLabel ::= DefinitionLabelDefinedInCDD
Attribute ::= AttributeDefinedInCDD
Value ::= ValueDefinedInCDD
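As an illustration of the grammar above, the following Python sketch builds elemental and complex descriptions in memory and writes them out in the text notation. It is a sketch under assumptions, not a reference implementation: the class `Description`, the function `to_text`, the definition labels (`Sentence`, `write`, `author`, `agt`) and the attribute `tense=past` are hypothetical and do not come from any CDD, and the whitespace between tokens is a choice made here because the grammar does not fix it. The special From and To entities of a complex relation are not modeled.

```python
from dataclasses import dataclass, field


@dataclass
class Description:
    """An entity or relation description (elemental if it has no structure)."""
    label: str                                       # realization label, without '#'
    definition: str                                  # definition label (from a CDD)
    attributes: dict = field(default_factory=dict)   # attribute -> value
    entities: list = field(default_factory=list)     # nested entity descriptions
    relations: list = field(default_factory=list)    # nested relation descriptions
    arcs: list = field(default_factory=list)         # (from, relation, to) labels


def to_text(d):
    """Serialize a description as '{#label definition attr=value... ; structure}'."""
    head_tokens = [f"#{d.label}", d.definition]
    head_tokens += [f"{attr}={value}" for attr, value in d.attributes.items()]
    parts = [to_text(child) for child in d.entities + d.relations]
    parts += [f"[#{src} #{rel} #{dst}]" for src, rel, dst in d.arcs]
    body = (" " + " ".join(parts)) if parts else ""
    return "{" + " ".join(head_tokens) + ";" + body + "}"


sentence = Description(
    label="s1",
    definition="Sentence",
    attributes={"tense": "past"},
    entities=[Description("w1", "write"), Description("a1", "author")],
    relations=[Description("r1", "agt")],
    arcs=[("w1", "r1", "a1")],
)
print(to_text(sentence))
```

An elemental description is simply one whose three structure lists are empty, so nothing follows the `;` delimiter in its output.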
For both entities and relations, concept description is unified, simple and clear. The difference between an elemental concept and a complex concept is whether a structure description follows the delimiter (;). The ComplexRelationDescription includes the special ElementalEntity items From and To, which match the inside and the outside of an arc. The braces "{" and "}" are used so that the set of elements in a structure description is perceived as a bag (set). Each concept description in a structure description is therefore identified uniquely by its realization label. A scope can be set on a realization label according to the nested structure of the concept description. The scope of a realization label is set in either open scope or closed scope mode. Open scope is the default, and the description examples hereafter are shown in open scope. Arcs are triples directed from left to right.
Concept descriptions are either elemental or complex. One definition can be used in many realizations. A concept description requires localized attribute-value pairs. These points are easy to understand for entity concepts but require explanation for relation concepts. A full explanation depends on the descriptions in the following chapters; a summary is given below.
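The role of realization labels as unique identifiers can be illustrated with a small consistency check. The sketch below is an assumption-laden simplification: the function `check_structure` and the example labels are hypothetical, and the check looks at a single structure description only, ignoring the open and closed scope modes, whose exact rules are not covered here.

```python
from collections import Counter


def check_structure(labels, arcs):
    """Check one structure description.

    labels: realization labels (without '#') of the descriptions it contains.
    arcs:   triples (from_label, relation_label, to_label).
    Returns a list of human-readable problems; an empty list means no problem found.
    """
    problems = []
    # Each description in the bag must be identified by a unique label.
    for label, count in Counter(labels).items():
        if count > 1:
            problems.append(f"duplicate realization label #{label}")
    # Every label used in an arc must name a description in this structure.
    known = set(labels)
    for arc in arcs:
        for label in arc:
            if label not in known:
                problems.append(f"arc {arc} refers to unknown label #{label}")
    return problems


print(check_structure(["w1", "a1", "r1"], [("w1", "r1", "a1")]))
print(check_structure(["a1", "a1"], []))
```

Under open scope an arc may be able to refer to labels defined outside the current description, so a fuller check would need to walk the nesting rather than a single level.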