ABSTRACT

Queries informing policy management and enforcement must address trust issues. The RDF query language SPARQL provides access to provenance information and a reasonably rich set of constraints. This document describes how a policy management system can use SPARQL to reliably investigate and enforce policies.

Introduction

RDF was designed as a description language for web resources. As such, it is useful for describing policies associated with resources. The RDF Data Access Working Group is standardizing the SPARQL Query Language for RDF [SPARQL]. The SPARQL language, used to access simple triple stores or inferred triples, is useful for expressing/testing many practical policies.

It is essential that any agent enforcing policies trust its information. In a heterogeneous trust environment such as the semantic web, the chain of custody of policy data must be rigorously examined. Many RDF stores maintain the provenance of RDF data and SPARQL provides access to that information. Queries may interrogate the provenance of query solutions or specify that the solutions come from particular sources. This capability meets the reasonable requirements of semantic web policy agents.

Some policy languages, such as KAoS [KAoS] or XML Advanced Electronic Signatures (XAdES) [XAdES], include expiries or durations. SPARQL expresses numeric, string pattern, and datetime value constraints, which can be used to determine whether a given policy is applicable or to select solutions from only the relevant policies.

SPARQL also provides a set of logical expressions, including disjunction and optionally bound patterns. Used in conjunction with an operator that tests whether a variable was left unbound by a pattern, these give SPARQL a limited form of negation as failure (NAF). This feature is useful for practical reasons, limiting the amount of unwanted data the client consumes, and it also enables the client to avoid receiving information that would violate some policy.

SPARQL is not the only RDF query language, nor is it the most expressive. It is, however, the product of standardization; developers may count on reasonable conformance and vendor independence. It is beyond the scope of this document to compare the RDF query languages.

Provenance Constraints

A simple example of a policy is an access control list that associates a group of principals with a set of operations. The W3C site uses a simple ontology for expressing different people's right to perform HTTP operations on resources. For example, the ACLs for this document are expressed as:

@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix : <http://www.w3.org/2001/02/acls/ns#>.
[ a :resourceAccessRule;
  :access :racl, :head, :get, :options, :trace;
  :accessor <http://www.w3.org/Systems/db/webId?all=all>;
  :hasAccessTo <http://www.w3.org/2005/02/14-PMQuery/>
] .
[ a :resourceAccessRule;
  :access :chacl, :racl, :head, :get, :put, :delete, :connect, :options, :trace;
  :accessor <http://www.w3.org/Systems/db/webId?group=w3t_passwords>;
  :hasAccessTo <http://www.w3.org/2005/02/14-PMQuery/>
] .

The first ResourceAccessRule grants the group http://www.w3.org/Systems/db/webId?all=all the privileges to perform the HTTP operations HEAD, GET, OPTIONS, and TRACE on the resource http://www.w3.org/2005/02/14-PMQuery/. (The "racl" privilege is not an HTTP operation, but instead the meta-operation of reading the ACLs for that resource.) The second ResourceAccessRule grants some additional HTTP operations, including PUT and DELETE, to the group http://www.w3.org/Systems/db/webId?group=w3t_passwords. This group may also chacl, that is, change the ACLs for the resource.

Elsewhere the membership of the group http://www.w3.org/Systems/db/webId?group=w3t_passwords is enumerated, along with various credentials.

<http://www.w3.org/Systems/db/webId?group=w3t_passwords>
    :includes <http://www.w3.org/Systems/db/webId?user=eric> .
<http://www.w3.org/Systems/db/webId?user=eric>
    a :user ;
    :publicKey "30 82 01 0a 02 82 01 01 00..." .

Thus, the principal http://www.w3.org/Systems/db/webId?user=eric is a member of a group that has the ability to PUT this document, which is fortunate because PUT is the HTTP operation for updating a resource and ...eric is the author of this document.

The author has appropriate credentials to prove that he is the ...eric in the above list, and thus has permission to change the document. The group http://www.w3.org/Systems/db/webId?all=all is a special group understood by the W3C web servers to mean everybody, regardless of credentials or lack thereof.

When the user eric attempts to perform a PUT operation on this document on the W3C site, the site machinery verifies the credentials, verifies that the request action is within those allowed for this user for this resource, and grants access. So far, we haven't gone beyond what ordinary HTTP and DAV servers do every day. We have, however, made it expressible in a language that transcends server implementations and sites.
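
The authorization step described above can be sketched in a few lines of Python over an in-memory set of triples. This is only an illustration, not the W3C site machinery: the short names (acl:put, group:w3t_passwords, user:eric, rule1, rule2) are assumed stand-ins for the full URIs, only a subset of the operations is listed, and credential verification is left out entirely.

```python
# Sketch of the ACL check over (subject, predicate, object) triples.
# All short names below are illustrative stand-ins for the full URIs.

DOC = "http://www.w3.org/2005/02/14-PMQuery/"
EVERYBODY = "group:all"  # assumed to stand for webId?all=all

TRIPLES = {
    ("rule1", "acl:access", "acl:get"),
    ("rule1", "acl:access", "acl:head"),
    ("rule1", "acl:accessor", EVERYBODY),
    ("rule1", "acl:hasAccessTo", DOC),
    ("rule2", "acl:access", "acl:get"),
    ("rule2", "acl:access", "acl:put"),
    ("rule2", "acl:accessor", "group:w3t_passwords"),
    ("rule2", "acl:hasAccessTo", DOC),
    ("group:w3t_passwords", "acl:includes", "user:eric"),
}


def objects(triples, subject, predicate):
    """Return every object o such that (subject, predicate, o) is asserted."""
    return {o for (s, p, o) in triples if s == subject and p == predicate}


def is_allowed(triples, user, operation, resource):
    """True if some rule grants `operation` on `resource` to a group holding `user`."""
    rules = {s for (s, p, o) in triples if p == "acl:hasAccessTo" and o == resource}
    for rule in rules:
        if operation not in objects(triples, rule, "acl:access"):
            continue
        for group in objects(triples, rule, "acl:accessor"):
            if group == EVERYBODY or user in objects(triples, group, "acl:includes"):
                return True
    return False  # no matching rule: deny by default
```

With this data, is_allowed(TRIPLES, "user:eric", "acl:put", DOC) holds, while the same request from a user outside the group is refused.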

If a proxy site were to cache some or all of the W3C web site with the agreement that it would enforce the appropriate ACL policies, it could use publicly available W3C ACL policy information. If it were to query a public RDF aggregator, a semantic search engine, it would need to query for the provenance information associated with the policies:

PREFIX s: <http://www.w3.org/2001/02/acls/ns#>
ASK
 WHERE { GRAPH <http://www.w3.org/2005/02/14-PMQuery/,access?w3c_display=13>
         { ?policy s:access s:put .
           ?policy s:accessor ?group .
           ?policy s:hasAccessTo <http://www.w3.org/2005/02/14-PMQuery/> .
           ?group s:includes ?user .
           ?user s:publicKey "30 82 01 0a 02 82 01 01 00..." } }

This query simply specifies that everything in the access recipe must come from a source known to be authoritative for that resource. To make the scenario much more interesting, we can abstract the query, introducing multiple trust domains:

PREFIX s: <http://www.w3.org/2001/02/acls/ns#>
PREFIX meta: <http://www.w3.org/2002/xx#>
ASK
 WHERE {{ ?resource meta:keywords ?keywords .
          ?resource meta:abstract ?abstract .
         FILTER (regex(?keywords, "SPARQL") &&
                 regex(?abstract, "policy management")) } .
  GRAPH <http://policies.example/knownSites.rdf>
        { ?resource s:policyAuthority ?policyAuth } .
  GRAPH ?policyAuth
        { ?policy s:access s:put .
          ?policy s:accessor ?group .
          ?policy s:hasAccessTo ?resource .
          ?group s:includes ?user .
          ?user s:publicKey "30 82 01 0a 02 82 01 01 00..." }}

Here we have asked the web for a document with a keyword "SPARQL" and the phrase "policy management" in the abstract. (This document has meta tags for the keywords and abstract.) Next we asked a trusted resource knownSites.rdf for the corresponding policy authority. Finally, we checked that authority to see what access privileges are extended to the user holding a particular public key. This query takes the appropriately conservative approach of failing to provide any access if the chain of trust cannot be established.
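
The fail-closed behaviour of this trust chain can be illustrated with a small Python sketch in which named graphs are modelled as a dictionary from graph name to a set of triples. The keyword and abstract match is omitted, and the authority graph names, the short predicate names and the sample data are assumptions made for the example.

```python
# Sketch of the trust chain over named graphs: {graph name: set of triples}.
# Graph names other than knownSites.rdf and all sample data are hypothetical.

KNOWN_SITES = "http://policies.example/knownSites.rdf"
RESOURCE = "http://www.w3.org/2005/02/14-PMQuery/"
AUTHORITY = "http://authority.example/acls"  # hypothetical authoritative graph
ROGUE = "http://rogue.example/acls"          # hypothetical untrusted graph
KEY = "30 82 01 0a 02 82 01 01 00..."


def acl_graph(key):
    """Build a graph granting PUT on RESOURCE to the holder of `key`."""
    return {
        ("p1", "s:access", "s:put"),
        ("p1", "s:accessor", "g1"),
        ("p1", "s:hasAccessTo", RESOURCE),
        ("g1", "s:includes", "u1"),
        ("u1", "s:publicKey", key),
    }


DATASET = {
    KNOWN_SITES: {(RESOURCE, "s:policyAuthority", AUTHORITY)},
    AUTHORITY: acl_graph(KEY),
    ROGUE: acl_graph("rogue-key"),  # present in the store, but never trusted
}


def policy_grants_put(dataset, resource, public_key):
    """Return True only if every link of the trust chain can be established."""
    # Ask the trusted graph which graph is authoritative for the resource.
    trusted = dataset.get(KNOWN_SITES, set())
    authorities = {o for (s, p, o) in trusted
                   if s == resource and p == "s:policyAuthority"}
    # Consult only the graphs named as authorities; ignore everything else.
    for authority in authorities:
        graph = dataset.get(authority, set())
        policies = {s for (s, p, o) in graph
                    if p == "s:hasAccessTo" and o == resource}
        for policy in policies:
            if (policy, "s:access", "s:put") not in graph:
                continue
            groups = {o for (s, p, o) in graph
                      if s == policy and p == "s:accessor"}
            for group in groups:
                users = {o for (s, p, o) in graph
                         if s == group and p == "s:includes"}
                if any((u, "s:publicKey", public_key) in graph for u in users):
                    return True
    return False  # any missing link denies access
```

Statements in the rogue graph never influence the answer, because no trusted statement names that graph as a policy authority.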

Expressing our policy in RDF allows us to develop arbitrarily complex trust chains. Evolving the authenticating software is as easy as mirroring new models in the SPARQL query, minimizing vulnerability to implementation errors and reducing deployment time and costs.

Mixing SPARQL and Policy Rules

Some policy languages exceed the expressivity of SPARQL, or are impractical to enumerate in SPARQL. Since SPARQL may operate over a graph created by inference, it can be used to access the inferences of any policy language that produces triples. A policy protocol using SPARQL may rely on it simply for a standard query interface, or for matching some or all of the rule conditions. The policy protocol may trade off between expressing the policy conditions in SPARQL vs. a rule language. For example, a REI [REI] policy expression can predicate a Permission on some conditions:

...
<action:Delegation rdf:ID="TimToCSMembers"> 
  <action:sender rdf:resource="&inst;TimFinin"/> 
  <action:receiver rdf:resource="#PersonVar"/> 
  <action:content> 
    <deontic:Permission> 
      <deontic:actor rdf:resource="#PersonVar"/> 
      <deontic:action rdf:resource="#ObjectVar"/> 
    </deontic:Permission> 
  </action:content> 
  <action:condition> 
    <constraint:And> 
      <constraint:first rdf:resource="#IsMemberOfCS"/> 
      <constraint:second rdf:resource="#IsFacultyPrinting"/> 
    </constraint:And> 
  </action:condition> 
</action:Delegation>
<constraint:SimpleConstraint rdf:ID="IsMemberOfCS">
  <constraint:subject rdf:resource="#PersonVar"/>
  <constraint:predicate rdf:resource="&univ;affiliation"/>
  <constraint:object rdf:resource="&univ;CSDept"/>
</constraint:SimpleConstraint>
<constraint:SimpleConstraint rdf:ID="IsFacultyPrinting">
  <constraint:subject rdf:resource="#ObjectVar"/>
  <constraint:predicate rdf:resource="&rdf;type"/>
  <constraint:object rdf:resource="#FacultyPrinting"/>
</constraint:SimpleConstraint>
...

SPARQL's terse syntax provides a very short expression of the above policy:

...
SELECT ?sender ?receiver
 WHERE { ?permit rei:sender ?sender .
         ?permit rei:receiver ?receiver .
         ?permit rei:actor ?person .
         ?permit rei:action ?object .
         ?person univ:affiliation univ:CSDept.
         ?object rdf:type p:FacultyPrinting }

Languages with rule heads that are chained to further rules will not be well-represented in SPARQL as SPARQL is not a rules language. In such cases, it would only be useful to express as queries the questions that the application ultimately needs resolved, such as, "is the client in a class that has access to a given resource?"
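
As a rough illustration of this division of labour, the Python sketch below lets a separate inference step produce triples and then asks the single question the application cares about. The RDFS-style subclass rule stands in for whatever policy language produces the triples, and the ex: names and sample data are hypothetical.

```python
# Sketch: inference produces triples; the final question is a simple lookup.
# The ex: vocabulary and the sample data are hypothetical.

ASSERTED = {
    ("ex:alice", "rdf:type", "ex:Professor"),
    ("ex:Professor", "rdfs:subClassOf", "ex:Faculty"),
    ("ex:Faculty", "ex:mayUse", "ex:printer"),
}


def infer_types(triples):
    """Apply the subclass rule to rdf:type triples until nothing new appears."""
    inferred = set(triples)
    changed = True
    while changed:
        changed = False
        subclass = {(s, o) for (s, p, o) in inferred if p == "rdfs:subClassOf"}
        for (s, p, o) in list(inferred):
            if p != "rdf:type":
                continue
            for (child, parent) in subclass:
                if child == o and (s, "rdf:type", parent) not in inferred:
                    inferred.add((s, "rdf:type", parent))
                    changed = True
    return inferred


def has_access(triples, client, resource):
    """Is the client in a class that has access to the given resource?"""
    classes = {o for (s, p, o) in triples if s == client and p == "rdf:type"}
    return any((c, "ex:mayUse", resource) in triples for c in classes)
```

Against the asserted triples alone the question fails; against the inferred graph it succeeds, which is the sense in which the query only needs to see the conclusions of the rules.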

Value Constraints

Many policy languages express a duration of validity. The following excerpt from KAoS states a policy update time stamp:

<policy:PosAuthorizationPolicy>
  <policy:controls rdf:resource="#GET" />
  <policy:hasSiteOfEnforcement rdf:resource="#w3site" />
  <policy:hasPriority>10</policy:hasPriority>
  <policy:hasUpdateTimeStamp>2006-01-01T00:00:00Z</policy:hasUpdateTimeStamp>
</policy:PosAuthorizationPolicy>

A SPARQL query looking for a current policy would rely on built-in dateTime comparison functions:

...
 WHERE { ?pol policy:hasUpdateTimeStamp ?update .
       FILTER (?update > "2005-04-17T13:34:52Z"^^xsd:dateTime) }

SPARQL also provides numeric comparison and string regular expression operators.
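
The effect of such a dateTime constraint can be sketched in Python. The sketch assumes that time stamps use the UTC "Z" form shown above; the policy names and the second time stamp are invented for the example.

```python
from datetime import datetime, timezone

# Hypothetical (policy name, update time stamp) pairs.
POLICIES = [
    ("policyA", "2006-01-01T00:00:00Z"),
    ("policyB", "2004-12-31T23:59:59Z"),
]


def parse_xsd_datetime(text):
    """Parse an xsd:dateTime in the UTC 'Z' form, e.g. 2006-01-01T00:00:00Z."""
    parsed = datetime.strptime(text, "%Y-%m-%dT%H:%M:%SZ")
    return parsed.replace(tzinfo=timezone.utc)


def current_policies(policies, cutoff):
    """Keep the policies whose update time stamp is later than `cutoff`."""
    limit = parse_xsd_datetime(cutoff)
    return [name for name, stamp in policies
            if parse_xsd_datetime(stamp) > limit]
```

Comparing parsed values rather than raw strings is the safer design, since the ordering remains correct if a later version accepts time stamps in other time zones.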

Negation

The PRIME project concerns itself with privacy and identity management in Europe. The design of SPARQL is informed by the needs of PRIME. In particular, a participant in the project, Thomas Roessler, submitted this use case [PRIME] to the RDF Data Access Working Group:

A mobile phone provider offers location and contact information to third parties which, in turn, offer location-based advertising by mobile phone short message. An airline operates an airport restaurant as a subsidiary, which wants to advertise a special gourmet meal based on pork to members of the airline's frequent flier program who are nearby the restaurant, unless these have indicated halal, kosher, or vegetarian meal preferences.

(Note that meal preferences give hints about religious convictions and health conditions, and should as such not be processed by a restaurant's advertising department.)

This use case demonstrates not a strict policy established by the traveler, but instead a sensitivity on the part of the airline toward the traveler's implicit policy. The SPARQL expressivity that this query leverages is the ability to filter solutions that do not include a specified statement (specifically, whether the person has expressed a preference for any of a set of special meals).

PREFIX flt: <http://someAirline.org/ns#>
SELECT ?smsAddr
 WHERE {<http://someAirline.org/flt217/20050215#> flt:traveler ?traveler .
        ?traveler flt:smsAddr ?smsAddr .
        OPTIONAL { ?traveler flt:mealReq ?mealReq } .
        FILTER (!BOUND(?mealReq)) }

The expressivity for NAF may seem awkward, but it is effective, and specifically addressed in the specification. By restricting the results to not include any solutions where the traveler had a flt:mealReq, the restaurant avoided moving more data than necessary and avoided having to filter the data locally. These are obvious motivations for including NAF in the language. More importantly, the restaurant was able to avoid learning travelers' religious convictions by filtering out results which implied a religious conviction.
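
The optional-then-unbound idiom can be sketched in Python as follows. This is an illustration over an in-memory set of triples with invented travelers and addresses, not a description of any SPARQL implementation.

```python
# Sketch of negation as failure: match the optional pattern, then keep only
# the solutions in which it found nothing. All sample data is hypothetical.

FLIGHT = "http://someAirline.org/flt217/20050215#"

TRIPLES = {
    (FLIGHT, "flt:traveler", "traveler1"),
    (FLIGHT, "flt:traveler", "traveler2"),
    ("traveler1", "flt:smsAddr", "sms:+15550001"),
    ("traveler1", "flt:mealReq", "flt:kosher"),
    ("traveler2", "flt:smsAddr", "sms:+15550002"),
}


def sms_targets(triples, flight):
    """Return SMS addresses of travelers with no recorded meal request."""
    travelers = {o for (s, p, o) in triples
                 if s == flight and p == "flt:traveler"}
    targets = []
    for traveler in travelers:
        # OPTIONAL part: the meal request may or may not be present.
        meal_reqs = [o for (s, p, o) in triples
                     if s == traveler and p == "flt:mealReq"]
        # FILTER !BOUND part: discard the solution if anything was matched.
        if meal_reqs:
            continue
        targets.extend(o for (s, p, o) in triples
                       if s == traveler and p == "flt:smsAddr")
    return sorted(targets)
```

Only addresses are returned; the meal request values themselves never appear in the result, which mirrors the privacy motivation of the use case.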

Designing for a query language that can express negation as failure allows policy ontologists to annotate entities with types of confidentiality. Given appropriate terms, people can indicate that some material is not intended for certain audiences, allowing school library web browsers to perform content selection. These terms can advertise increased privacy preferences; sympathetic agents can voluntarily comply with these preferences.

Decisions based on NAF are non-monotonic and must be regarded as uncertain. Agents working with partial knowledge or potentially incomplete inference can make incorrect decisions. In the above example, the restaurant may advertise a pork meal to a traveler who keeps kosher and who has not filled out his or her meal preference, or a library browser may let a user see content that would have been hidden, had the browser had access to more information.

In many cases, policies have a conservative response to incomplete knowledge. For instance, a principal will be denied access to a resource if it is either not known that the principal should have access or it is known that the principal should not have access. For this reason, SPARQL's use of default negation is a practical alternative to using OWL's complementOf for classical negation.

Additional Expressivity: Disjunction and Optional

Disjunction and Optional graph patterns in SPARQL enable complex policies to be tested efficiently. Any query involving disjunction (called Union in SPARQL) or optional patterns could be expressed as a set of queries with purely conjunctive graph patterns. This seemingly redundant expressivity allows many complex policies to be expressed as a single query, providing an intuitive interface and an efficient protocol.
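
The equivalence between a disjunctive query and a set of conjunctive queries can be checked with a toy pattern matcher. The Python sketch below is a minimal illustration; the matcher, the ex: vocabulary and the sample data are all assumptions, and real engines evaluate patterns far more efficiently.

```python
# Toy matcher for conjunctive triple patterns; terms starting with "?" are
# variables. Vocabulary and data are hypothetical.

TRIPLES = {
    ("ex:doc", "ex:owner", "ex:alice"),
    ("ex:staff", "ex:mayEdit", "ex:doc"),
    ("ex:staff", "ex:includes", "ex:bob"),
    ("ex:guests", "ex:includes", "ex:carol"),
}

OWNER = [("ex:doc", "ex:owner", "?user")]
MEMBER = [("?group", "ex:mayEdit", "ex:doc"),
          ("?group", "ex:includes", "?user")]


def match(triples, patterns, binding=None):
    """Yield variable bindings (dicts) that satisfy every pattern."""
    binding = binding or {}
    if not patterns:
        yield dict(binding)
        return
    first, rest = patterns[0], patterns[1:]
    for triple in triples:
        candidate = dict(binding)
        ok = True
        for term, value in zip(first, triple):
            if term.startswith("?"):
                # Bind the variable, or check it against its earlier binding.
                if candidate.setdefault(term, value) != value:
                    ok = False
                    break
            elif term != value:
                ok = False
                break
        if ok:
            yield from match(triples, rest, candidate)


def union(triples, branches):
    """Evaluate a disjunction: solutions of any branch are solutions."""
    for patterns in branches:
        yield from match(triples, patterns)


# One disjunctive query versus two separate conjunctive queries.
single = {b["?user"] for b in union(TRIPLES, [OWNER, MEMBER])}
separate = ({b["?user"] for b in match(TRIPLES, OWNER)}
            | {b["?user"] for b in match(TRIPLES, MEMBER)})
```

Both approaches find the same editors; the single query simply saves the client from issuing and merging several requests.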

SPARQL also has extensible value restrictions. The expressivity of the restrictions could be extended to calculations such as geographical radius, repeated temporal intervals or arbitrary mathematical functions of any set of parameters derived from the graph. These extension functions will not be available across all SPARQL implementations, but they can be used to express more complex policy tests within the SPARQL syntax.
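
A geographical radius test of the kind mentioned above might look like the following Python sketch. It uses the haversine great-circle formula with a mean Earth radius, which is an approximation; the function names and the shape of the solution records are assumptions, not part of any SPARQL implementation.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; an approximation


def distance_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points given in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def within_radius(solutions, centre, radius_km):
    """Keep the solutions whose position lies within radius_km of the centre."""
    lat0, lon0 = centre
    return [s for s in solutions
            if distance_km(s["lat"], s["lon"], lat0, lon0) <= radius_km]


# Hypothetical query solutions carrying a position for each traveler.
SOLUTIONS = [
    {"who": "a", "lat": 0.0, "lon": 0.0},
    {"who": "b", "lat": 1.0, "lon": 0.0},
]
```

A policy such as "only contact travelers within 50 km" then reduces to applying within_radius to the solutions, in the same way a value constraint would restrict them inside the query.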

Conclusions

This document has described two use cases where SPARQL's provenance queries support a rigorous enforcement of policies. The provenance information in the ACLs query allowed the suspicious agent to establish a chain of trust between a principal requesting access and a policy for that resource. The KAoS example showed how value constraints usefully enforce policies with expiries, and the PRIME example demonstrated how query support for negation allows policy data to be added to increase privacy.

Designers of policy protocols will need to provide a query mechanism of some sort to allow applications to act on policies. Using SPARQL provides the advantages of a standard query language, as well as the opportunity to leverage SPARQL implementations to provide some or all of the calculations. While every language makes trade-offs between complexity and expressivity, it is the author's opinion that deployment of SPARQL agents will provide a strong foundation for a policy-aware Web.

RDF Query for Policy Management

Eric Prud'hommeaux

W3C

eric@w3.org