User preference modeling for top-k answers


Version	3
Date/Time	February 20, 2008
Original author	PeterVojtas
Current lead	lead
Last Modified By	PeterVojtas
Primary Actors	user searching web, recommender system, web service
Secondary Actors
Application domain	web search
Triggering event
Relation to other use cases	Soft Shopping Agent, Discovery

Purpose/Goals

This is in a sense a generalization of some aspects of the Discovery use case. Given a catalogue of items populated by some extraction tool (see the use case about extraction) and a user’s criteria and/or multicriteria utility function for items potentially listed in the catalogue, retrieve the best, top-k, matches.
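The task stated above can be illustrated with a minimal sketch. It assumes that attribute values are already normalized to [0, 1] and that the user's multicriteria utility function is a weighted sum; the catalogue contents, the weights, and the names utility and top_k are hypothetical and serve only as an illustration.

```python
import heapq

# Hypothetical catalogue: attribute scores assumed to be normalized to [0, 1].
catalogue = {
    "item1": {"price": 0.9, "quality": 0.6},
    "item2": {"price": 0.4, "quality": 0.9},
    "item3": {"price": 0.7, "quality": 0.7},
    "item4": {"price": 0.2, "quality": 0.3},
}

# Assumed user weights; in the use case these would be learned, not given.
weights = {"price": 0.5, "quality": 0.5}


def utility(attrs, weights):
    """Weighted-sum utility (one possible multicriteria utility function)."""
    return sum(weights[name] * attrs[name] for name in weights)


def top_k(catalogue, weights, k):
    """Return the k item names with the highest utility, best first."""
    return heapq.nlargest(
        k, catalogue, key=lambda item: utility(catalogue[item], weights)
    )


best = top_k(catalogue, weights, 2)
print(best)
```

This naive version scores every item in the catalogue, so it only shows what "top-k" means; it does not address efficient retrieval.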

Usually, the main problem is to learn user preferences. This can be done either by implicit information collection (the system tracks user behavior, click streams, …) or by explicit information collection (the system poses questions, the user answers). Sometimes a recommender system finds similar users (UncAnn UncertaintyModel: SimilarityModels). Another problem is the efficient retrieval of search results ordered by these preferences (usually top-k answers suffice).

Issues and Relevance to Uncertainty

In what follows we present issues and relevance to uncertainty that are specific to this use case, and we annotate them (UncAnn) with references to the Uncertainty Ontology (UncertaintyOntology) and to the extensions of classes and properties described in the fine-grained version of the Uncertainty Ontology.

As with the result of any data mining procedure, the results of such user preference mining will be uncertain.

A typical sentence that is subject to uncertainty assignment is: (UncAnn Sentence) User1 prefers item1 most (the list of top-k most preferred items for User1 consists of item1, ..., itemk).

This statement can be made by the user himself/herself or by another human (UncAnn Agent:HumanAgent). For the semantic web, the more interesting case is a statement made by a machine (UncAnn Agent:MachineAgent). The statement can be produced by a combination of an inductive procedure (UncAnn Agent:MachineAgent:InductiveAgent - mining user preferences) and a deductive procedure (UncAnn Agent:MachineAgent:DeductiveAgent - optimizing the search for top-k answers on the web).

The uncertainty assigned to the above statement is typically of machine-epistemic nature (UncAnn UncertaintyNature:Epistemic:MachineEpistemic).

A user's preference is in no case Boolean (yes-no). The typical uncertainty type is vagueness (UncAnn UncertaintyType:Vagueness), which arises when the boundaries of meaning of the user's objective are indistinct.

There are models that use partially ordered sets to represent preferences, and various ad hoc ranking approaches are also used. A possible model is UncAnn UncertaintyModel:FuzzySets or UncertaintyModel: PreferenceModels. To make these uncertainty annotations usable by other machine agents, a fine-grained specification of UncAnn World:DomainOntology:Instances and/or World:DomainOntology:Evidence has to be made to support an agent's decision on how to proceed with this information.
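As a small illustration of the fuzzy-set view of preference, the sketch below models two vague user objectives, "cheap" and "near", as membership functions with values in [0, 1] and combines them with the minimum as fuzzy conjunction. The linear shape, the breakpoints, and all function names are assumptions made for this example only; they are not taken from the Uncertainty Ontology or from the referenced papers.

```python
def descending(value, ideal, worst):
    """Membership degree that is 1 up to `ideal`, 0 from `worst` on,
    and falls linearly in between (assumed shape)."""
    if value <= ideal:
        return 1.0
    if value >= worst:
        return 0.0
    return (worst - value) / (worst - ideal)


def cheap(price):
    # Hypothetical breakpoints: fully cheap up to 100, not cheap from 300.
    return descending(price, ideal=100.0, worst=300.0)


def near(distance_km):
    # Hypothetical breakpoints: fully near up to 1 km, not near from 10 km.
    return descending(distance_km, ideal=1.0, worst=10.0)


def preference(price, distance_km):
    """Degree to which an item is 'cheap AND near' (minimum as conjunction)."""
    return min(cheap(price), near(distance_km))


print(preference(200.0, 0.5))
print(preference(50.0, 5.5))
```

The point of the sketch is that the answer is a degree rather than a yes-no value, so items can be ranked by it.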

Assumptions/Preconditions

As in Discovery: the catalogue has been populated using a property set and property values that have a machine-processable representation of the vocabulary used
The data have been extracted and annotated

Required resources

Populated catalogue
User preference model and instances for a single user

Associate methodologies that could help

We have developed an inductive method that learns user preferences from a given evaluation of a sample of objects (see reference list)

Recommend those aspects that are considered most important to be included in a standard representation of vagueness and uncertainty

A concept of truth value as a comparative notion of relevance / preference

Successful End

The user gets answers that fit his/her preferences
He/she gets the answers fast, because the top-k answers are retrieved without computing all answers

Failed End

There is a danger in uncertainty methods, namely combinatorial explosion: replacing the two-valued Boolean model can lead to the point where everything is relevant to some nonzero degree

Main Scenario

Buyer queries the web for a product
He/she realizes that searching all web resources is very time consuming
Consulting a web service that provides overview information is a good choice
He/she either follows the instructions the web service uses to learn his/her preferences, or the system uses implicit learning (learning from the click stream)

Additional background information or references

Efficient algorithms for top-k answering were studied in Fagin et al., ''Optimal aggregation algorithms for middleware''. Learning user preferences from explicit information is described in ''Ordinal Classification with Monotonicity Constraints''. In ''EL description logic with aggregation of user preference concepts'' it is shown that these models can be described in description logic and are hence compatible with web modeling standards.
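To make the idea of retrieving top-k answers without computing all answers concrete, the following is a simplified sketch in the spirit of the threshold algorithm from the Fagin et al. paper. It assumes one score list per attribute, that every list contains every item, that scores lie in [0, 1], and that the aggregation function is monotone (here the minimum). The function name threshold_top_k and the sample data are hypothetical.

```python
import heapq


def threshold_top_k(lists, aggregate, k):
    """Return (top-k (item, score) pairs, number of sorted-access rounds).

    lists: one dict per attribute, mapping item -> score.
    aggregate: monotone function taking a list of scores.
    """
    # Sorted access: each attribute list ordered by descending score.
    ordered = [
        sorted(scores.items(), key=lambda pair: pair[1], reverse=True)
        for scores in lists
    ]
    seen = {}  # item -> aggregated score
    depth = len(ordered[0])
    for position in range(depth):
        last_scores = []
        for ranking in ordered:
            item, score = ranking[position]
            last_scores.append(score)
            if item not in seen:
                # Random access: look up the item in every attribute list.
                seen[item] = aggregate([scores[item] for scores in lists])
        # No unseen item can score higher than this threshold.
        threshold = aggregate(last_scores)
        best = heapq.nlargest(k, seen.items(), key=lambda pair: pair[1])
        if len(best) == k and best[-1][1] >= threshold:
            return best, position + 1
    return heapq.nlargest(k, seen.items(), key=lambda pair: pair[1]), depth


# Hypothetical per-attribute preference degrees for four items.
price_scores = {"a": 0.9, "b": 0.8, "c": 0.3, "d": 0.1}
quality_scores = {"a": 0.8, "b": 0.9, "c": 0.2, "d": 0.4}

result, rounds = threshold_top_k([price_scores, quality_scores], min, 2)
print(result, rounds)
```

In this example the search stops after two rounds of sorted access, so items c and d are never scored. The early stop is only sound for monotone aggregation functions, which is the assumption under which the cited paper works.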

Variations

Open Issues