Specification of Requirements/Lexico-Syntactic Patterns

From Ontology-Lexica Community Group

Summary of Requirements on Lexico-Syntactic Patterns (Synthesis by PC)

The patterns considered here recur across genres/domains featuring a certain invariance. Lexico-syntactic patterns (LSP, cf. LSPs at ontologydesignpatterns.org) are generalized linguistic structures or schemas that indicate semantic relationships between terms and can be applied to the identification of formalized concepts and conceptual relations in natural language text.

The lexicon-ontology model must allow such patterns to be represented, though not necessarily as lexical entries. It must allow the representation of the lexico-syntactic structure of a pattern as well as of the semantic relation the pattern expresses.

Let us take the following pattern as an example: $NP1 be (the same as/synonym of)/know as/call/(refer to as) $NP2. This pattern might be said to express an equivalence relation between an OWL class C1 labelled with NP1 and an OWL class C2 labelled with NP2. Thus, the semantics of this lexico-syntactic pattern is captured by the following template:

c1 rdfs:label "$NP1".
c2 rdfs:label "$NP2".
c1 owl:equivalentClass c2.

where c1 and c2 (called class1 and class2 in the lemon templates below) are new or existing constants.
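As a minimal sketch, the template can be filled in mechanically once the pattern has bound $NP1 and $NP2. The function name `equivalence_triples` and its parameters are hypothetical and only serve as illustration; the constants default to the placeholder names c1 and c2.

```python
from typing import List


def equivalence_triples(np1: str, np2: str, c1: str = "c1", c2: str = "c2") -> List[str]:
    """Fill the OWL equivalence template with the two matched noun phrases.

    c1 and c2 stand for new or existing class constants (placeholders here).
    """
    return [
        f'{c1} rdfs:label "{np1}".',
        f'{c2} rdfs:label "{np2}".',
        f"{c1} owl:equivalentClass {c2}.",
    ]


if __name__ == "__main__":
    for triple in equivalence_triples("statement of financial position", "balance sheet"):
        print(triple)
```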

What I am not sure about is whether we would rather infer a relation between senses here, i.e.

le1 rdf:type lemon:LexicalEntry.
le1 lemon:hasWrittenForm "$NP1".
le1 lemon:hasSense s1.
s1 lemon:reference class1.
le2 rdf:type lemon:LexicalEntry.
le2 lemon:hasWrittenForm "$NP2".
le2 lemon:hasSense s2.
s2 lemon:reference class2.
s1 lemon:synonym s2.


The lexico-syntactic pattern above would match an expression such as "Statement of financial position also known as balance sheet", which would have the following semantic pattern as its counterpart:

le1 rdf:type lemon:LexicalEntry.
le1 lemon:hasWrittenForm "statement of financial position".
le1 lemon:hasSense s1.
s1 lemon:reference class1.
le2 rdf:type lemon:LexicalEntry.
le2 lemon:hasWrittenForm "balance sheet".
le2 lemon:hasSense s2.
s2 lemon:reference class2.
s1 lemon:synonym s2.
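The matching step can be approximated with a regular expression over surface forms. The sketch below is only an illustration: the list of connectives, the lowercasing of the matched phrases, and the function names `match_equivalence` and `lemon_triples` are assumptions, and a real implementation would work on lemmatised, chunked input rather than on raw strings.

```python
import re
from typing import List, Optional, Tuple

# Surface realisations of the lemmatised pattern (illustrative, not exhaustive).
CONNECTIVES = [
    r"is the same as",
    r"is a synonym of",
    r"(?:is )?(?:also )?known as",
    r"(?:is )?(?:also )?called",
    r"(?:is )?(?:also )?referred to as",
]
PATTERN = re.compile(
    r"^(?P<np1>.+?),?\s+(?:" + "|".join(CONNECTIVES) + r")\s+(?P<np2>.+?)\.?$",
    re.IGNORECASE,
)


def match_equivalence(sentence: str) -> Optional[Tuple[str, str]]:
    """Return the two noun phrases bound by the pattern, or None if it does not match."""
    m = PATTERN.match(sentence.strip())
    if m is None:
        return None
    # Lowercasing is an assumption made to mirror the written forms used above.
    return m.group("np1").lower(), m.group("np2").lower()


def lemon_triples(np1: str, np2: str) -> List[str]:
    """Instantiate the sense-level template for a matched pair of noun phrases."""
    return [
        "le1 rdf:type lemon:LexicalEntry.",
        f'le1 lemon:hasWrittenForm "{np1}".',
        "le1 lemon:hasSense s1.",
        "s1 lemon:reference class1.",
        "le2 rdf:type lemon:LexicalEntry.",
        f'le2 lemon:hasWrittenForm "{np2}".',
        "le2 lemon:hasSense s2.",
        "s2 lemon:reference class2.",
        "s1 lemon:synonym s2.",
    ]


if __name__ == "__main__":
    pair = match_equivalence("Statement of financial position also known as balance sheet")
    if pair is not None:
        print("\n".join(lemon_triples(*pair)))
```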


Other requirements

The lexicon model should represent how lexical information links to semantic structuring information in order to:

  • provide semantic dependencies between concepts
  • map syntactic details to semantic roles (transformation rules) possibly to achieve a standardized representation of meaning
  • determine semantic rules and syntactic composition rules of lexicon elements (compositionality)
  • provide basic constituency and dependency information (head phrase/noun, modifier, …)
  • join axioms and lexical characteristics


Definition:
Patterns recur across genres/domains featuring a certain invariance. Lexico-syntactic patterns (LSP, cf. LSPs at ontologydesignpatterns.org) are generalized linguistic structures or schemas that indicate semantic relationships between terms and can be applied to the identification of formalized concepts and conceptual relations in natural language text.


Representation (to be discussed):

  • as part of the lexicon-ontology interface or
  • in a separate domain-independent lexicon that captures the possible meaning of lexico-syntactic patterns with respect to OWL.


Examples:
Semantic dependencies
LSP corresponding to equivalence relation between ontology classes:

      NP<class> be (the same as/synonym of)/know as/call/(refer to as) NP<class>
      Statement of financial position also known as balance sheet

The ontological equivalence relation corresponds to the lexical relation of synonymy. Both terms denote a statement of a corporation's liabilities and assets and are frequently used interchangeably, even if the former usually refers to for-profit and the latter to non-profit organizations.

LSP corresponding to datatype property:

      NP<class> sein [(AP<property>,)*] oder AP<property>
      Aggregatszustände sind flüssig, fest oder gasförmig. (German: "States of matter are liquid, solid or gaseous.")
      :flüssig a owl:DatatypeProperty ;
               rdfs:domain :Aggregatszustand .
      :fest a owl:DatatypeProperty ;
               rdfs:domain :Aggregatszustand .
      :gasförmig a owl:DatatypeProperty ;
               rdfs:domain :Aggregatszustand .
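A rough sketch of how such an enumeration could be turned into property declarations. The regular expression is only a loose approximation of the pattern, the lemma table `LEMMAS` is a hypothetical stand-in for a real lemmatiser, and the function names are illustrative.

```python
import re
from typing import List, Optional, Tuple

# Hypothetical lemma table; a real system would call a lemmatiser instead.
LEMMAS = {"Aggregatszustände": "Aggregatszustand"}

# NP "ist"/"sind" followed by a comma-separated list that ends in "oder AP".
ENUMERATION = re.compile(
    r"^(?P<np>\w+)\s+(?:ist|sind)\s+(?P<aps>(?:\w+,\s*)*\w+\s+oder\s+\w+)\.?$"
)


def parse_enumeration(sentence: str) -> Optional[Tuple[str, List[str]]]:
    """Return (class lemma, list of property adjectives), or None if there is no match."""
    m = ENUMERATION.match(sentence.strip())
    if m is None:
        return None
    noun = LEMMAS.get(m.group("np"), m.group("np"))
    adjectives = [a for a in re.split(r",\s*|\s+oder\s+", m.group("aps")) if a]
    return noun, adjectives


def to_turtle(cls: str, properties: List[str]) -> str:
    """Emit one datatype property declaration per adjective, with cls as domain."""
    return "\n".join(
        f":{p} a owl:DatatypeProperty ;\n    rdfs:domain :{cls} ." for p in properties
    )


if __name__ == "__main__":
    parsed = parse_enumeration("Aggregatszustände sind flüssig, fest oder gasförmig.")
    if parsed is not None:
        print(to_turtle(*parsed))
```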

Datatype properties as presented in this pattern represent the lexical relation of hyponymy. The trichotomic characteristic of the hypernym state of matter divides it into the hyponyms liquid, solid, and gaseous.

Reversing the pattern facilitates the detection or generation of variants and compounds:

      AP<property> NP<class>: flüssiger Aggregatszustand; fester Aggregatszustand; gasförmiger Aggregatszustand
      AP<property> oder AP<property> NP<class>: flüssiger oder fester Aggregatszustand; flüssiger oder gasförmiger Aggregatszustand,...
      ...
      NP<class> können [(AP<property>,)*] oder AP<property> sein
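Generating the attributive variants can be sketched as follows. The function name `attributive_variants` is hypothetical, and the fixed adjective ending "-er" is a strong simplification that only covers the masculine nominative singular form used in the examples; proper German inflection would need morphological information from the lexicon.

```python
from itertools import combinations
from typing import List


def attributive_variants(adjectives: List[str], noun: str, ending: str = "er") -> List[str]:
    """Build 'AP NP' and 'AP oder AP NP' variants from predicative adjectives.

    The single inflectional ending is an assumption (masculine nominative singular).
    """
    singles = [f"{a}{ending} {noun}" for a in adjectives]
    pairs = [
        f"{a}{ending} oder {b}{ending} {noun}" for a, b in combinations(adjectives, 2)
    ]
    return singles + pairs


if __name__ == "__main__":
    for variant in attributive_variants(["flüssig", "fest", "gasförmig"], "Aggregatszustand"):
        print(variant)
```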

Additionally, multi-word expressions (e.g. small appliance industry) can be represented by LSPs:

      AP<property> NP<class> => AMOD(small-JJ, industry-NN), NN(appliance-NN, industry-NN) 

The same pattern does not apply to the hyphenated compound word small-appliance industry consisting of the same elements but designating an industry producing small appliances.

One possible ontological relation for this multi-word expression might be that "industry" has the subclass "appliance industry" featuring the property "small".
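The difference between the two readings can be illustrated on the dependency level. In the sketch below, the `Dep` tuple and the function `interpret_compound` are hypothetical; the function derives the superclass, the subclass and the properties of the head noun from a list of dependencies, under the assumption that the analysis contains exactly one noun-compound (nn) dependency.

```python
from typing import List, NamedTuple, Tuple


class Dep(NamedTuple):
    relation: str   # e.g. "amod" or "nn"
    dependent: str
    head: str


def interpret_compound(deps: List[Dep]) -> Tuple[str, str, List[str]]:
    """Return (superclass, subclass, properties attached to the head noun)."""
    compounds = [d for d in deps if d.relation == "nn"]
    if len(compounds) != 1:
        raise ValueError("expected exactly one noun-compound dependency")
    head = compounds[0].head
    subclass = f"{compounds[0].dependent} {head}"
    # Only adjectives that modify the head noun itself count as its properties.
    properties = [d.dependent for d in deps if d.relation == "amod" and d.head == head]
    return head, subclass, properties


if __name__ == "__main__":
    # "small appliance industry": the adjective modifies "industry".
    print(interpret_compound([Dep("amod", "small", "industry"), Dep("nn", "appliance", "industry")]))
    # "small-appliance industry": the adjective modifies "appliance".
    print(interpret_compound([Dep("amod", "small", "appliance"), Dep("nn", "appliance", "industry")]))
```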

Semantic roles
Frames describe relational arguments of verbs in a predicate-argument structure (cf. frames in lemon).

      X is the capital of Y
      :capital_of lemon:synBehavior [ lemon:synArg :subject ;
                                      lemon:synArg :pobject ] .
      :noun_pp_pobject lemon:marker :of .

Semantic roles label these arguments:

      CITY is the capital of COUNTRY or NP<class> is the capital of NP<class>
      AGENT sieht/betrachtet ENTITY  or NP<class> sieht/betrachtet NP<class>  (German: "sees/looks at")
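A small sketch of how the syntactic arguments of the frame could be mapped to semantic role labels. The regular expression, the `ROLES` table and the function name `label_roles` are assumptions for illustration only; the argument names subject and pobject follow the frame above.

```python
import re
from typing import Dict, Optional

# Surface form of the frame "X is the capital of Y" with named syntactic arguments.
FRAME = re.compile(r"^(?P<subject>.+?) is the capital of (?P<pobject>.+?)\.?$")

# Mapping from syntactic arguments to semantic roles (illustrative).
ROLES = {"subject": "CITY", "pobject": "COUNTRY"}


def label_roles(sentence: str) -> Optional[Dict[str, str]]:
    """Return a role-to-filler mapping, or None if the frame does not match."""
    m = FRAME.match(sentence.strip())
    if m is None:
        return None
    return {ROLES[arg]: filler for arg, filler in m.groupdict().items()}


if __name__ == "__main__":
    print(label_roles("Berlin is the capital of Germany."))
```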

Lexico-Syntactic Patterns as Constructions

Another potential use case of LSPs is the modelling of constructions. For example, consider the property population, which may be lexicalized as $NP has $NUM inhabitants. It would be unsatisfactory to treat this as a lexical entry of inhabitants, as the meaning is unique to this particular construction and it is unclear whether it is a sense of "to have" or of "inhabitants". Instead, this may be readily expressed as a pattern as follows:

   :has_inhabitants a lemon:Construction ;
      lemon:decomposition (
        [ lemon:element :arg1 ]
        [ lemon:element :have ]
        [ lemon:element :arg2 ]
        [ lemon:element :inhabitant ]
      ) ;
      lemon:sense [
        lemon:reference onto:population
      ] .
   :have a lemon:Word ;
     lemon:canonicalForm [
       lemon:writtenRep "have"@en 
     ] .
   :inhabitant a lemon:Word ;
     lemon:canonicalForm [
       lemon:writtenRep "inhabitant"@en
     ] .
   :arg1 a lemon:Argument ;
     lexinfo:phraseType lexinfo:nounPhrase .
   :arg2 a lemon:Argument ;
     lexinfo:phraseType lexinfo:numeral .

Phrase trees can also be used to represent this if desired.
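To illustrate how the construction could be applied to text, the sketch below matches the surface form "$NP has $NUM inhabitants" and returns a triple for onto:population. The regular expression, the treatment of commas as thousands separators, and the function name `population_triple` are assumptions and not part of the model above.

```python
import re
from typing import Optional, Tuple

# arg1 = noun phrase, arg2 = numeral; commas are assumed to be thousands separators.
CONSTRUCTION = re.compile(
    r"^(?P<arg1>.+?) (?:has|have|had) (?P<arg2>\d[\d,]*) inhabitants?\.?$"
)


def population_triple(sentence: str) -> Optional[Tuple[str, str, int]]:
    """Return (subject, 'onto:population', number), or None if there is no match."""
    m = CONSTRUCTION.match(sentence.strip())
    if m is None:
        return None
    number = int(m.group("arg2").replace(",", ""))
    return m.group("arg1"), "onto:population", number


if __name__ == "__main__":
    print(population_triple("Berlin has 3,500,000 inhabitants."))
```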

Possibly relevant use cases:
Natural Language Generation from Triples (NLG): mapping of linguistic knowledge to ontological predicates
Lexical Linked Data (LLD): linking LRs to each other, as well as to ontologies
Ontology-based Machine Translation (OMT): syntactic roles and corresponding semantic classes
Lexicon driven Ontology Evolution (LOE): lexical semantic relations, constituency and dependency information
Ontology Transformation (OT): Patterns matching syntactical constructs to semantic constructs