OWLED 2014: Day 2 LiveBlogging

I was a bit late, but here we go!

Chris Mungall asked yesterday for a link to the papers and/or slides. Here’s the link to the proceedings:

http://ceur-ws.org/Vol-1265/

I’ll try to go back and link through directly and call for slides.


Tahani Alsubait, Bijan Parsia and Uli Sattler. Generating Multiple Choice Questions From Ontologies: Lessons Learnt presented by Tahani

What is an MCQ and its parts (stem, key, distractors)
Example of good and bad distractors.
What makes a good one: Item response theory: Tuned difficulty, low guessability, and right discrimination.

MCQ generation is difficult and time consuming and you need a lot of them (exam coverage, security, and practice exams).

Automation!
Difficulty prediction. Needed to increase validity and exam quality. People are bad at it.

Similarity conjecture (degree of similarity between keys and distractors is proportional to difficulty). Existing measures didn’t work so we developed new ones (see ISWC poster).
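
To make the conjecture concrete, here is a minimal sketch. It is not the authors' measure (the talk said existing measures didn't work and new ones were developed); it assumes plain Jaccard similarity over made-up sets of subsumers, and every concept name is hypothetical.

```python
def jaccard(a: set, b: set) -> float:
    """Jaccard similarity of two sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def predicted_difficulty(key: str, distractors: list, subsumers: dict) -> float:
    """Mean similarity between the key and each distractor.

    Under the similarity conjecture, a higher value suggests a harder question.
    """
    sims = [jaccard(subsumers[key], subsumers[d]) for d in distractors]
    return sum(sims) / len(sims)


# Hypothetical toy data: each concept mapped to the set of its subsumers.
SUBSUMERS = {
    "Lion": {"Animal", "Mammal", "Carnivore"},
    "Tiger": {"Animal", "Mammal", "Carnivore"},
    "Cow": {"Animal", "Mammal", "Herbivore"},
    "Oak": {"Plant", "Tree"},
}

# Distractors close to the key should score as harder than distant ones.
hard = predicted_difficulty("Lion", ["Tiger", "Cow"], SUBSUMERS)
easy = predicted_difficulty("Lion", ["Oak", "Cow"], SUBSUMERS)
```

With this toy data the question with Tiger and Cow as distractors scores higher than the one with Oak and Cow, which is the direction the conjecture predicts.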

Key questions: Can we control difficulty? Can we generate exams? Is it cost effective?

Experiment description: 1) Build ontology. 2) Generate questions. 3) Expert (3 — Two instructors and one domain expert) review. 4) Test with students.

Two rounds of student tests (in review session and online).

Usefulness rating: High heterogeneity (only 1 question agreed upon by all reviewers).
Distractor utility: For all 6 questions, 2 out of 3 distractors were useful.
In 5 out of 6 questions, the key was picked more frequently than the distractors.
Discrimination: in class better than online
Difficulty: 4 out of 6 correctly predicted by tool.
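
For readers who want to compute this kind of item statistic themselves, here is a rough sketch. It uses the classical upper/lower-group discrimination index rather than a full item response theory model, and the response data is invented; none of it comes from the study.

```python
def item_difficulty(responses: list) -> float:
    """Proportion of students answering the item correctly (higher = easier)."""
    return sum(correct for _, correct in responses) / len(responses)


def discrimination_index(responses: list) -> float:
    """Proportion correct in the top half (by total score) minus the bottom half.

    Assumes at least two responses.
    """
    ranked = sorted(responses, key=lambda r: r[0], reverse=True)
    half = len(ranked) // 2
    upper, lower = ranked[:half], ranked[-half:]
    p_upper = sum(c for _, c in upper) / half
    p_lower = sum(c for _, c in lower) / half
    return p_upper - p_lower


# Hypothetical data: (total exam score, answered this item correctly?)
RESPONSES = [(9, True), (8, True), (7, True), (4, False), (3, True), (2, False)]

difficulty = item_difficulty(RESPONSES)
discrimination = discrimination_index(RESPONSES)
```

A positive discrimination value means stronger students got the item right more often than weaker ones, which is what you want from a good question.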

Check out the tool:
http://edutechdeveloper.com/MCQGen

QA

QUESTION: About reusability: Often the labels are general and not specific for domains, so may not be suitable. I.e., reusability in general might hurt applicability.


Catherine Chavula and C. Maria Keet. Is Lemon Sufficient for Building Multilingual Ontologies for Bantu Languages? presented by Maria.

Need to be able to write things in one’s own language (and there are a lot of languages).

OWL has challenges (e.g., Manchester syntax not suitable for Spanish).
We need a lot more linguistic annotation than alternative terms.

Some approaches separate linguistic and ontological layer. Community Group ontolex-lemon.

Bantu languages: spoken by >200 million. Being picked up by the likes of Microsoft, Google, and Facebook.

Each noun in Bantu (a class in OWL) belongs to one of between 10 and 23 noun classes, depending on the language. Complexity which must be captured!

Concordial agreement of the verb with the noun class (nc) of the noun naming the OWL class.

No Semantic Web stuff for Bantu. Some XML. Some work on multilingual ontologies.

lemon defines verbalisation of ontological elements in a particular language.
lemon advocates GOLD and ISOcat which don’t do the job for Bantu noun classes. E.g.,  nc is like gender but with semantic significance.

To address this, developed a Noun Class Ontology (small, e.g., 42 classes, 6 op, and 130 axioms.).

Word variation. lemon generally uses Perl-like regexes. Chichewa has too much complexity for this.

Agglutination is a challenge.

Application: Lexicalise FOAF and GoodRelations in Chichewa.

(foaf:knows requires a big chunk of rules!)

FOAF: 1:1 correspondence for classes; some odd things; object properties are usually verb phrases; data properties are easier. FOAF lemon covers 90% of foaf.

GoodRelations: Very domain specific thus difficult due to lack of terms in Chichewa. Only 25% of the entities were lexicalized.

http://www.meteck.org/files/ontologies/

QA

QUESTION: What are the actual remaining problems with OWL re: multilingualism? Annotations don’t quite do it. E.g., the name of an object property (OP) varies with the type of its object.
QUESTION: Is this a problem with programming languages as well? Yes, people are tackling it.
QUESTION: You mentioned issues with Manchester Syntax, can we localise it by varying things a bit? No, it’s much harder. [Ed: Much, much harder!]


Matteo Matassoni, Marco Rospocher, Mauro Dragoni and Paolo Bouquet. Authoring OWL 2 ontologies with the TEX-OWL syntax presented by Mauro.

Motivation: Write onts quickly; XML syntaxes are verbose; don’t want to learn a heavy tool.

https://github.com/matteomatassoni/TexOwl/blob/master/docs/grammar.pdf

Example: animal \c == Class(Animal).
animal \cdisjoint plant.
branch \cisa \oforall{is_part_of}{tree}

Evaluation:

Two tasks:
surveys http://goog.gl/Cjpqtg 10 examples for intuitiveness, conciseness, and understandability
usability for writing small ontology

Results: LaTeX was nearly as good as Manchester on intuitiveness and better than everything else. On concision, LaTeX was way ahead, with functional next (9.7; 5.3).

re ont building: Difficulty 3.5, syntax easy to remember 3.17. Compare to prior syntax 3.67 (all on 5 point scale).

Come see the poster!

http://dkmlab.fbk.eu:8080/converter-webapp

QA

QUESTION: Does it work with latex files? Nope!
QUESTION: What are the characteristics of the participants? See poster!
POINT: The fragments displayed are non equivalent in many ways.
QUESTION: Do you need declarations in your syntax? Need to check grammar.


Alessandro Solimando, Ernesto Jimenez-Ruiz and Giovanna Guerrini. A Multi-strategy Approach for Detecting and Correcting Conservativity Principle Violations in Ontology Alignments presented by Alessandro.

Ontology matching problem (same domain, variant modelling, how to align?)

Three matching principles: consistency principle (no new incoherent classes), conservativity principle (no new entailments inside a single signature), locality (map neighbourhoods)

Conservativity (Deductive)/Deductive difference (are there different entailments over a given signature).
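
As an illustration of what a violation looks like, here is a toy sketch. It is not the paper's algorithm: it only considers atomic subsumptions and a naive transitive closure, with equivalence mappings written as two subsumptions. All entity names are made up.

```python
from itertools import product


def subsumption_closure(axioms: set) -> set:
    """Transitive closure of atomic subsumptions given as (sub, sup) pairs."""
    closure = set(axioms)
    changed = True
    while changed:
        changed = False
        # Iterate over a snapshot so the set can grow during the pass.
        for (a, b), (c, d) in product(list(closure), repeat=2):
            if b == c and a != d and (a, d) not in closure:
                closure.add((a, d))
                changed = True
    return closure


def conservativity_violations(onto: set, signature: set, other: set, mappings: set) -> set:
    """Subsumptions over `signature` that hold only after adding the alignment."""
    before = subsumption_closure(onto)
    after = subsumption_closure(onto | other | mappings)
    return {(a, b) for (a, b) in after - before if a in signature and b in signature}


# Hypothetical ontologies and alignment.
O1 = {("o1:Cat", "o1:Animal"), ("o1:Pet", "o1:Animal")}
O1_SIGNATURE = {"o1:Cat", "o1:Pet", "o1:Animal"}
O2 = {("o2:Cat", "o2:Pet")}
MAPPINGS = {
    ("o1:Cat", "o2:Cat"), ("o2:Cat", "o1:Cat"),  # o1:Cat equivalent to o2:Cat
    ("o1:Pet", "o2:Pet"), ("o2:Pet", "o1:Pet"),  # o1:Pet equivalent to o2:Pet
}

violations = conservativity_violations(O1, O1_SIGNATURE, O2, MAPPINGS)
```

Here the alignment makes the first ontology entail that every cat is a pet, which it did not say on its own; that new entailment inside a single signature is the kind of thing the conservativity principle forbids.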

Approach: Approx diff (atomic); use modularisation, projection to propositional Horn, reuse semantic indexing from LogMap to help scalability; disjointness assumption [Sch05] helps reduce basic violations to SAT; equivalence violations detected using answer set programming

[[sorry, had to prepare for covering the next talk!]]

Lots of violations in experiments. Reasonable computation time (80 secs)
Fully automated and “conservative” repair (as well as detection)

30 million violations in some cases!!!


Birte Glimm, Yevgeny Kazakov, Ilianna Kollia and Giorgos Stamou. OWL Query Answering based on Query Extension presented by Bijan

(Links to slides and explanatory text coming soon)


Marcelo Arenas, Bernardo Cuenca Grau, Evgeny Kharlamov, Sarunas Marciuska and Dmitriy Zheleznyakov. Enabling Faceted Search over OWL 2 with SemFacet presented by Evgeny

Tons of data and ontologies.
What can we do with it? SPARQL Queries (yeek!); Controlled Natural Languages for query (e.g., Quelo); Visual Query formulation (geek); and Faceted Search!

Lots of work on faceted search on semantic web. What’s the common principle?
Faceted Search over RDF

  • Search over several sets of items
  • Progressive filters (which extend a query)
  • output a user-chosen subset of items

Variance in systems is within this paradigm (what you can filter, what filters to add, etc.)

Existing Solutions: Don’t use ontologies but are data driven; no theoretical underpinnings (e.g., what fragments of SPARQL are covered; complexity of those fragments; formal capture of update)

Formalised faceted interface tailored toward RDF and OWL abstracted from GUI.
Study expressive power and complexity
Study interface generation and update
Result: SemFacet system (scales to millions of triples).

Simple mapping of facets to predicates and conjunctions and disjunctions. Then translate to FO (i.e., positive Existential formulas, monadic, directed tree rooted at free variable; disjunctive connected share one variable).
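
A minimal sketch of the mapping just described, under simplifying assumptions: facets are predicates, values chosen within one facet are OR-ed, and different facets are AND-ed. It works on explicit triples only, with no ontology reasoning and no nesting, so it is far simpler than SemFacet; the data is invented.

```python
def faceted_search(triples: set, selections: dict) -> set:
    """Subjects matching every selected facet (AND) with any of its values (OR)."""
    subjects = {s for s, _, _ in triples}
    for predicate, values in selections.items():
        matching = {s for s, p, o in triples if p == predicate and o in values}
        subjects &= matching  # each added facet progressively narrows the result
    return subjects


# Hypothetical data.
TRIPLES = {
    ("alice", "worksAt", "Oxford"), ("alice", "type", "Professor"),
    ("bob", "worksAt", "Oxford"), ("bob", "type", "Student"),
    ("carol", "worksAt", "Manchester"), ("carol", "type", "Professor"),
}

anywhere = faceted_search(TRIPLES, {"worksAt": {"Oxford", "Manchester"}})
professors = faceted_search(
    TRIPLES, {"worksAt": {"Oxford", "Manchester"}, "type": {"Professor"}}
)
```

Adding the second facet narrows the first result, which is the progressive filtering behaviour listed earlier.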

Combined complexity: Faceted query answering over RDF datasets is tractable. With respect to the OWL 2 profiles: (active domain semantics) RL: P-complete; EL: P-complete; QL: in P. (Classic semantics) RL: P-complete; EL (guarded): P-complete; QL: NP-complete.

Bottom up evaluation.

Interface generation & update. Algos guided by ontology and data. Every facet in the initial interface is justified by an entailment.

Each update is semantically justified.

Facet graph. Project ont and data on graph. Nodes are possible facet values. Edges are facet names. Every edge must be justified by an entailed axiom or fact.
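
A very rough sketch of the idea, assuming only explicit type and property facts (the real system also uses entailed axioms, which this ignores). An edge is added only when some fact in the data justifies it; all names are hypothetical.

```python
def facet_graph(types: dict, facts: set) -> set:
    """Edges (class, facet name, class), each justified by at least one fact."""
    edges = set()
    for s, p, o in facts:
        for c in types.get(s, ()):
            for d in types.get(o, ()):
                edges.add((c, p, d))
    return edges


# Hypothetical data: individual -> asserted classes, plus property facts.
TYPES = {"alice": {"Person"}, "oxford": {"University"}, "uk": {"Country"}}
FACTS = {("alice", "worksAt", "oxford"), ("oxford", "locatedIn", "uk")}

graph = facet_graph(TYPES, FACTS)
```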

System (SemFacet) combines keyword and faceted search.
Auto generation of f-search
In memory
online and offline reasoning
Scales to millions of triples
Configurable


Keynote: Claudia d’Amato. Machine Learning for Ontology Mining: Perspectives and Issues

Use ML for Ontology mining
Inductive learning (robust)
Supervised, Unsupervised & semi-supervised concept learning (basic refresher)

Instance retrieval regarded as a classification problem; challenging for State of the Art ML Classification

  • SOTA applied to feature vector representation (not relational DL expressivity)
  • Implicit closed world assumption (unlike DLs)
  • SOTA treat classes as disjoint (unlike DLs)

Solutions: New semantic similarity measures for DL representations; cope with all problems

Problem Definition: Given a populated ontology, a query concept Q, and a training set with +1, 0, -1 as targets, learn a classification function f s.t. f(a) = +1 if a is in Q, -1 if a is in ~Q, and 0 otherwise.
Dual problem: given an individual, find all concepts C it belongs to.

Example, nearest neighbour (given a similarity metric) voting.
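
A small sketch of nearest neighbour voting with the three-valued target. The similarity measure is passed in as a parameter; the Jaccard measure over feature sets used here is a stand-in, not one of the semantic similarity measures from the talk, and the individuals are invented.

```python
from collections import Counter


def knn_classify(query, training: dict, similarity, k: int = 3) -> int:
    """Majority vote in {+1, 0, -1} among the k training individuals most similar to `query`."""
    neighbours = sorted(training, key=lambda ind: similarity(query, ind), reverse=True)[:k]
    votes = Counter(training[ind] for ind in neighbours)
    return votes.most_common(1)[0][0]


# Hypothetical feature sets standing in for a DL-aware similarity measure.
FEATURES = {
    "a": {"f1", "f2"}, "b": {"f1", "f2", "f3"}, "c": {"f1"},
    "d": {"f9"}, "e": {"f8", "f9"}, "q": {"f1", "f2"},
}


def jaccard_similarity(x: str, y: str) -> float:
    """Jaccard similarity between the feature sets of two individuals."""
    fx, fy = FEATURES[x], FEATURES[y]
    return len(fx & fy) / len(fx | fy)


# Labels: +1 member of Q, -1 member of ~Q, 0 unknown.
TRAINING = {"a": 1, "b": 1, "c": -1, "d": -1, "e": 0}

label = knn_classify("q", TRAINING, jaccard_similarity, k=3)
```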

How to evaluate classifiers.
Compare to a standard reasoner (Pellet), but this didn’t help with the “new” knowledge
Added evaluation parameters: match rate, omission error rate, commission error rate, induction rate. Key bit: the classic reasoner is indecisive where the classifier is determined
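
A sketch of how these four rates might be computed. The definitions are my reading of the names and were not spelled out in these notes: match means the two agree, omission means the classifier abstains (0) where the reasoner decides, commission means they give opposite answers, and induction means the classifier decides where the reasoner is indecisive (0). The label lists are invented.

```python
def evaluation_rates(reasoner: list, classifier: list) -> dict:
    """Compare classifier labels with reasoner labels, all in {+1, 0, -1}."""
    counts = {"match": 0, "omission": 0, "commission": 0, "induction": 0}
    for r, c in zip(reasoner, classifier):
        if r == c:
            counts["match"] += 1
        elif c == 0:
            counts["omission"] += 1    # classifier abstains, reasoner decides
        elif r == 0:
            counts["induction"] += 1   # classifier decides, reasoner cannot
        else:
            counts["commission"] += 1  # opposite answers
    n = len(reasoner)
    return {name: count / n for name, count in counts.items()}


# Hypothetical labels for six individuals.
REASONER = [1, 1, -1, 0, 0, -1]
CLASSIFIER = [1, 0, 1, 1, 0, -1]

rates = evaluation_rates(REASONER, CLASSIFIER)
```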

Commission and omission rates are basically null; induction rate is not null! New knowledge (perhaps supporting semi-automated ontology population)
Most of the time the most effective method is relational k-NN
Most scalable: kernel method embedded in blah blah

Concept Drift and Novelty Detection via Conceptual clustering methods

Clustering: Intra-cluster similarity is high and inter-cluster similarity is low.

(Key idea: Global Decision boundary; if a new candidate cluster is outside, then it’s either novel or drift depending on how it relates to existing clusters)

Evaluation needs domain expert 🙁

How to learn intensional description of the new clusters (separate and conquer vs. divide and conquer)

Ontology enrichment as pattern discovery problem

[[holy moly, a lot more stuff; learning DL Safe rules for a variety of things using a variety of techniques]]

Data driven tableaux — drive/guide the reasoning process using data induced rules.

QA
QUESTION: Is the data driven tableau for unsound inferences or for optimisation? Both!
QUESTION: Tell us a bit more about scalability? What’s the size of the ontologies? Started with toy onts with 1000s of individuals. Scaling up!


ORE report slides and stuff coming
 


Feature popularity contest
 


 
