The SPECIAL Policy Log Vocabulary
A vocabulary for privacy-aware logs, transparency and compliance
Javier D. Fernández, Vienna University of Economics and Business
Piero Bonatti, Università di Napoli Federico II
Wouter Dullaert, Tenforce
Sabrina Kirrane, Vienna University of Economics and Business
Uros Milosevic, Tenforce
Axel Polleres, Vienna University of Economics and Business
The upcoming European General Data Protection Regulation (GDPR) brings new challenges for companies, which must provide transparency with respect to personal data processing and sharing within their organisations. Additionally, companies must be able to demonstrate that their systems and business processes comply with the usage constraints specified by data subjects.
At the core of any transparency and compliance architecture is the logging of data processing and sharing events in a manner that can be used to verify compliance with relevant policies.
This position statement outlines splog, a Linked Data vocabulary developed within the SPECIAL EU H2020 project, which can be used to log data processing and sharing events, including the consent provided by the data subject and subsequent changes to or revocation of said consent.
The goal is to use our initial model and vocabulary, known as splog, as a basis for discussion with workshop participants on relevant vocabularies, best practices and potential standardisation activities.
The proposed model and vocabulary are framed by use cases arising from the SPECIAL (Scalable Policy-aware Linked Data Architecture For Privacy, Transparency and Compliance) project, a Research and Innovation Action funded under the H2020-ICT-2016-1 Big Data PPP call (Privacy-preserving Big Data technologies, ICT-18-2016).
The SPECIAL platform is fully rooted in Semantic Web technologies and Linked Data principles, namely: (i) supporting the acquisition of user consent at collection time and the recording of both data and metadata (consent, usage constraints, event data, and contextual information) according to legislative and user-specified policies; and (ii) catering for privacy-aware, secure workflows that include usage control, transparency and compliance verification.
In splog we use Linked Data to represent processing and sharing events, as well as consent-related activities (acquisition and revocation). However, there are a number of avenues for improvement, which need to be discussed in a wider forum: (i) there is a lack of specific and standard vocabularies for representing privacy-related events; (ii) it should be possible to describe the event content at different granularities, from categorizing the content according to a simple taxonomy (stating the type of data, processing, etc.) to the most fine-grained description of the actual data associated with the event; and (iii) given that events could potentially involve different systems within a company and/or different companies (e.g. sharing personal data between companies), interoperability is a crucial requirement.
Links to related supporting resources, activities and working groups
Our current work adapts/extends the following:
(i) The SPECIAL policy language [SPECIAL D2.1], which specifies basic usage policies represented using OWL-EL.
In the following we briefly outline the main concepts of our splog vocabulary.
Log. A Log is a collection of data that records data processing and sharing events as well as consent-related activities (acquisition and revocation). Besides general metadata, the Log contains information relating to the:
Log entries. A log entry contains information about processing and sharing events that are associated with data subjects, as well as actions related to the consent provided (or revoked) by a data subject. Besides general metadata, a log entry contains:
Log entry Content. The actual content is represented by the splog:LogEntryContent class, which is an rdfs:subClassOf the SPECIAL spl:Authorization [SPECIAL D2.1]. It specifies the data involved in the event, how the data is processed, the purpose of the data processing, where and for how long the data is stored, and potential disclosures to third parties. Because event content and data policy authorizations share this structure, the former can be checked for compliance against the latter.
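To illustrate why modelling event content as an authorization makes compliance checking straightforward, the following is a minimal Python sketch, not the SPECIAL implementation. It assumes that each of the five dimensions named above holds a single term from a small taxonomy and that an event complies when every term is equal to, or more specific than, the corresponding term in the consented authorization. The taxonomy terms, the Authorization record and the functions is_subsumed and complies are hypothetical names introduced for illustration only; in SPECIAL the check would be performed over OWL class expressions, and storage would also cover duration, which is omitted here.

```python
from dataclasses import dataclass, fields

# Hypothetical taxonomy (child term -> parent term); not taken from SPECIAL D2.1.
PARENT = {
    "GPSCoordinates": "Location",
    "Location": "AnyData",
    "Aggregate": "AnyProcessing",
    "Recommendation": "Marketing",
    "Marketing": "AnyPurpose",
    "EUServers": "AnyStorage",
    "NoRecipient": "AnyRecipient",
}


def is_subsumed(term, ancestor):
    """Return True if term equals ancestor or lies below it in PARENT."""
    while term is not None:
        if term == ancestor:
            return True
        term = PARENT.get(term)
    return False


@dataclass(frozen=True)
class Authorization:
    """Simplified stand-in for the five dimensions of an authorization."""
    data: str
    processing: str
    purpose: str
    storage: str
    recipients: str


def complies(event, policy):
    """An event complies if every dimension is subsumed by the policy's."""
    return all(
        is_subsumed(getattr(event, f.name), getattr(policy, f.name))
        for f in fields(Authorization)
    )


# Assumed consent: location data, any processing, marketing purposes,
# stored on EU servers, no disclosure to third parties.
consent = Authorization("Location", "AnyProcessing", "Marketing",
                        "EUServers", "NoRecipient")

# Event content with the same shape as the authorization.
event_ok = Authorization("GPSCoordinates", "Aggregate", "Recommendation",
                         "EUServers", "NoRecipient")
event_bad = Authorization("GPSCoordinates", "Aggregate", "Recommendation",
                          "EUServers", "AnyRecipient")

print(complies(event_ok, consent))
print(complies(event_bad, consent))
```

In this sketch the second event fails only on the recipients dimension, since it is broader than what the consent allows. The same taxonomy-based comparison also suggests how event content could be described at a coarse granularity without recording the actual data values.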
At the core of any transparency and compliance architecture is the logging of events in a manner that can be used to check that data processors abide by policies that are associated with the data. Such policies could be based on the usage policies, business rules and/or relevant regulations.
We look forward to presenting our initial approach and to discussing mutual efforts to standardize a vocabulary and/or best practices that can be used to describe and manage data processing and sharing events.