RDF Messages Task Force/Meeting 2026-04-16

From RDF Stream Processing Community Group

Attendees

  • Piotr Sowiński (NeverBlink)
  • Anastasiya Danilenka (NeverBlink)
  • Tobias Schwarzinger
  • Pieter Colpaert (UGent)
  • Anh Le Tuan (TUB)
  • Pieter Bonte (KU Leuven)
  • Pedro (NICT)
  • Jean-Paul Calbimonte
  • Abraham Bernstein
  • Riccardo Tommasini

Agenda

  • Fill in attendance list
  • Agenda overview
  • Pick a scribe
  • https://w3c-cg.github.io/rsp/spec/messages
  • General: Anyone attending ESWC? Want to meet up?
    • Pieter C (UGent) is! Piotr and I have a poster about RDF Messages accepted
  • RDF Message & RDF Message Stream definitions
  • RDF Message Log formats for Turtle, TriG, N-Triples, N-Quads
  • RDF Message Stream Profiles
    • Review of the vague idea for what profiles should be: "govern the structure of messages, their interpretation scope, and the required metadata", "A profile may define for example: how to give an identifier and type to a message, how timestamps are expressed, or what shape constraints apply to messages."
    • Current candidates: LDES, SOSA/SSN, SAREF, ActivityStreams
    • What should be within the scope of profiles?
      • Ordering guarantees, causality?
      • Specifying the timestamp property, if present
      • Message shapes (using SHACL) – required, optional?
      • Procedure for enveloping an RDF dataset into a compliant message, if given also a timestamp – required, optional? Note that enveloping is required to “lift” streams of raw RDF data into something importable into a triple store without losing information.
      • Whether or not the first, second, etc. messages have any special meaning.
      • Scope to which messages should be asserted (e.g., for message N, we always assert it together with message 1, which provides the context).
    • Technical choices
      • How should the profiles be advertised, i.e., how do we know which profile is used in a stream? Define something for HTTP and leave this open for other implementations?
      • How should the timestamp property be specified? Leave it open? SHACL shape? SPARQL fragment?
    • -> gather feedback to write the first draft of the section on profiles
  • Other matters:

Action points

  • Next TF meeting:
    • Should we define a vocabulary for RDF Message Stream Profiles?
    • How deep should we go with specifying profiles? Discuss some alternatives.
    • PieterC: I opened a discussion in the RDF-JS github group on adding RDF Messages support in JavaScript: https://github.com/rdfjs/data-model-spec/issues/183

Minutes

Scribe: Anastasiya Danilenka

RDF Message & RDF Message Stream definitions

  • Piotr: Merged PR about scope of RDF messages (presented on monthly call). Another PR clarified what an RDF stream is. When you request a stream, it might not be the same stream someone else requests, because for technical reasons it can change between requests (e.g., MQTT).
  • Piotr: any urgent matters with these definitions?
  • PieterC: definitions leave a lot of room to define types of streams more formally. We say “stream” in the broadest possible sense, and stick to the essence.
  • Riccardo: is this definition associated with a manifest when a person makes a call to the streaming endpoint? Is the definition meant to be a web resource or not?
  • PieterC: it is not a resource, it is on the instance level.
  • Piotr: yes. We can apply it not only to http, but MQTT, file, Kafka
  • Riccardo: We were talking about making a specialized protocol for RDF streams. HTTP was an example. But is it a problem?
  • PieterC: we want to make it out of scope by limiting it to being consumed in the moment. There can be a discovery spec developed later.
  • Riccardo: VoCaLS was a description ontology, not a protocol. You mean to have a negotiation step. For different people this means different things: socket, file, etc. We want a stream to be a resource. In VoCaLS it is like “give me a stream version”.
  • PieterC: we are not creating resources. An RDF message is a set of quads that happen to be sent together and need to be processed together. Adding identifiers and protocols is a subject for other specs.
  • Riccardo: alternative – using content-type. Using it is much easier. Everything is HTTP on paper. Negotiation happens on the content-type level.
  • Piotr: keep content-type out of this spec. Some protocols do not support content-types. When you transfer binary data over a wire and just interpret that as RDF – what is the content type here then?
  • Riccardo: a stream is dual.
  • Piotr: there is one comment from Tobias on the last PR that was not resolved.
  • Tobias: the comment was more for clarification.
  • Piotr: the question was what it means that an RDF message stream can be made ad hoc.
  • Piotr: I will then close this issue and open a new one for this clarification.
  • Tobias: agreed.

RDF Message Log formats for Turtle, TriG, N-Triples, N-Quads

  • Piotr: We had comments as delimiters of messages. Downsides: making comments a semantic part of the document. Parsers can struggle with that. Approach: directives. I tried doing that based on RDF 1.2 mechanisms.
  • Version label from RDF 1.2.
  • Piotr: looking at N-Quads, at the beginning of the document you have a version announcement, or the “@version” directive. We extend that and propose new version labels: a “-messages” version label suffix to support RDF messages. Same taxonomic relations. There are quirks in this taxonomy of version labels, so comments are welcome.
  • Piotr: serializing/parsing examples done. Use the same mechanisms from RDF 1.2. Message delimiters: in 1.2-messages we add directives that split the messages. See spec. Message directive at the end of the file – no more messages. We allow empty messages. Examples are in the spec. In Turtle and TriG we have analogous situations. And more examples.
  • Piotr: repeated prefix directives. Stating the same prefix directive more than once does not change anything, and a prefix can be overwritten.
  • PieterC: So we also allow empty messages, I think for the first time? However, this means that different stream profiles can attach semantics to an empty message as well, e.g., an empty message means the data stayed the same.
  • Piotr: we have multiple version labels, we have 3 types. We propose “-messages” and we attach it to every existing version label in RDF 1.2.
  • PieterC: alternatives – profile negotiation. RDF 1.2 introduces that, and it would be nice to adopt it.
  • Piotr: content-type: RDF 1.2 says to use text/turtle and add version parameter. We use the same mechanism instead of introducing a new content-type parameter.
  • JP: the previous proposal with comments was more in line with backward compatibility. We put semantics into comments. How does this proposal fare in terms of backward compatibility?
  • Piotr: you cannot parse that as a regular RDF document. But, the proposed extension of the format is simple. So compatibility is simple too. But the drawback is no direct compatibility.
  • PieterC: you propose MESSAGE or “@message” directives.
  • Piotr: Yes, there are two variants. We mirror what RDF 1.2 did with VERSION and @version.
  • PieterC: if message stream log documents are parsed as RDF documents, they will lose the semantics of messages. If you create a stream log document, you want to be fully aware that this is a log and parse it correctly without ignoring messages.
  • Piotr: parsing as a whole sometimes does not make sense. See Ex. 14. If you ignore “message”, it looks like the person said two things at the same time.
  • PieterC: ignoring messages might make sense but it can make things slower. Triples will still work universally.
  • Tobias: blank nodes can be in multiple datasets. Ignoring message directive would merge BNs.
  • Piotr: same BN identifier in two messages – we do not define semantics. We assume they are in different worlds. You probably need to skolemize them. Any thoughts?
  • PieterC: We resolved that? Identical BN – we do not change the scope of BN. We are going to see that as the same thing.
  • Piotr: this is probably not clear.
  • PieterC: from the same file – yes, from different files – no. What if you split the file?
  • Piotr: you want to (ex 13). Load two messages with the same BN as persons. It will merge.
  • PieterC: scope of BNs. Do we want to allow sharing BN identifiers across messages? Look at all possible streams you might have. If any exist – we need to allow it too. RDF-STAX types?
  • Piotr: triples and quads from that taxonomy (RDF-STAX) are not the same as what we are talking about.
  • PieterC: you could put a triple per message – this is the triple RDF stream from RDF-STAX.
  • Piotr: yes. Suppose we say BNs must not be shared. Parsers have BN transformations; they transform BNs to the triple store space. They will need to reset this mapping structure every time they encounter a directive. From a performance standpoint, if the stream is big, this is good, I think. If we say these BN ids are not shared, then the parser is concerned only with the message it currently parses and then forgets about it, which is attractive from the performance point of view.
  • PieterC: I was against directives for message delimiters initially, but now resetting BN ids after you encounter a message directive seems nice. You still can have a use case of sharing BN identifiers if you use skolemization.
  • Piotr: will make an issue on that. And make a separate PR. And hopefully merge in a week or so.
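The directive-based splitting and the per-message blank node scope discussed above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the grammar from the spec draft: the directive spellings `MESSAGE` and `@message .`, the reading that each directive closes the message collected so far, and the helper names `split_messages` and `scope_blank_nodes` are all hypothetical. The blank node relabeling is deliberately naive and works on N-Quads-like lines only.

```python
import re
from typing import Iterable, Iterator, List

# ASSUMPTION: directive spellings mirror VERSION / @version from RDF 1.2.
# The real grammar is whatever the spec draft defines.
MESSAGE_DIRECTIVE = re.compile(r"^\s*(?:MESSAGE|@message\s*\.)\s*$")
VERSION_DIRECTIVE = re.compile(r"^\s*(?:VERSION|@version)\b")
# Naive blank node matcher; it does not skip "_:" inside literals or IRIs.
BLANK_NODE = re.compile(r"_:([A-Za-z0-9_]+)")


def split_messages(log: str) -> Iterator[List[str]]:
    """Yield one list of statement lines per message.

    ASSUMPTION: a directive closes the message collected so far, so two
    directives in a row produce an empty message, and a directive at the
    very end of the log does not open a further message.
    """
    current: List[str] = []
    for line in log.splitlines():
        stripped = line.strip()
        if not stripped or stripped.startswith("#") or VERSION_DIRECTIVE.match(stripped):
            continue  # skip blanks, comments, and the version announcement
        if MESSAGE_DIRECTIVE.match(stripped):
            yield current
            current = []
        else:
            current.append(stripped)
    if current:  # statements after the last directive form a final message
        yield current


def scope_blank_nodes(messages: Iterable[List[str]]) -> Iterator[List[str]]:
    """Relabel blank nodes so that labels are never shared across messages."""
    for index, statements in enumerate(messages):
        yield [
            BLANK_NODE.sub(lambda m: f"_:m{index}_{m.group(1)}", line)
            for line in statements
        ]


LOG = '''VERSION "1.2-messages"
_:p <http://example.org/name> "Alice" .
MESSAGE
_:p <http://example.org/name> "Bob" .
MESSAGE
MESSAGE
'''

messages = list(split_messages(LOG))
scoped = list(scope_blank_nodes(messages))
for number, statements in enumerate(scoped):
    print(number, statements)
```

With this sample log the sketch yields three messages, the last one empty, and the two `_:p` labels end up as distinct blank nodes. Because the relabeling depends only on the current message index, a streaming parser could drop its label mapping at every directive, which is the performance argument made in the discussion.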

RDF Message Stream Profiles

  • Piotr: general idea of profiles – govern the interpretation, metadata, structure, timestamps, shape constraints, etc. Examples: Linked Data Event Streams, ActivityStreams, etc.
  • Piotr: What is the biggest thing you want profiles to cover?
  • PieterC: what will we do in this group? Should we propose something to SOSA? ActivityStreams?
  • Piotr: we create a framework for creating profiles, create examples, and base further work on that. Standardization – later, for now we work ourselves.
  • Piotr: what should be in profiles?
  • JP: time metadata and properties.
  • PieterC: it depends. Profile can be NOT about time.
  • JP: it can also be sequencing and ordering semantics.
  • Riccardo: What decisions can clients make based on the profile? We use SPARQL for some reason. What can or can't I do within a profile?
  • Piotr: depends on what you want to know about the stream. Sometimes you do not care about the profile, e.g., in a stream broker. You may be interested in how to extract timestamps.
  • Riccardo: So it is a schema + metadata. Let people express metadata how they want. Then the best practice will appear. Do not formalize profiles too soon.
  • JP: We are talking about profile examples, so it can be a SAREF, etc.
  • JP: SAREF strongly relates to ontology but it does not have to.
  • Riccardo: VoCaLS did something similar. Automate the consumption by providing metadata. But this is hard. You want to generate a client on the spot.
  • Piotr: We want to do that for good problems. But it is an ongoing struggle. We do not want to describe specific endpoints. We want to specify the contents of the stream and semantics. E.g., SHACL shape of SAREF or how to find the timestamp. Regarding ordering semantics: what do we want to include?
  • JP: there was a lot of work by Emanuele Della Valle. Message happened before this message. Can be done w/o timestamp. If there is a seq of events in this order (a, b, c), so temporal semantics.
  • Piotr: We could specify relations. What message was when.
  • JP: or this event is contained in this event. Various types of relations between messages.
  • Piotr: this is very broad in terms of possibilities.
  • JP: you can do, e.g., linear temporal logic
  • Riccardo: We can use time ontology. We can represent data in traces. In Prometheus there is a data model like this. Lighter profile like sequences.
  • Piotr: temporal-aware shape constraints?
  • Riccardo: queries are simple. They look at SHACL constraints. But you can do queries. Shallow semantics, simple. If the door closes before opening – the data is invalid, discard it.
  • Riccardo: We need to think about what type of queries people might want to express.
  • PieterC: allow profile creators to build functionality they have in mind to the extent they want. We need to give them this possibility. If you want to attach a SHACL shape → look at this template profile.
  • Riccardo: VoCaLS has something similar. Could work for 80% of metadata.
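One way to picture the "schema + metadata" reading of a profile is a small descriptor that tells a client where to find the timestamp, if there is one. The sketch below is purely illustrative: the `Profile` class, its fields, and `extract_timestamp` are hypothetical names, nothing here was agreed in the meeting, and `sosa:resultTime` is used only as an example of a property a SOSA-flavoured profile might point at.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

# A message is modelled as a plain list of (subject, predicate, object) strings.
Triple = Tuple[str, str, str]


@dataclass(frozen=True)
class Profile:
    """Hypothetical profile descriptor; the field names are assumptions."""
    name: str
    timestamp_property: Optional[str] = None  # a profile need not be about time


def extract_timestamp(profile: Profile, message: List[Triple]) -> Optional[str]:
    """Return the first value of the profile's timestamp property, if any."""
    if profile.timestamp_property is None:
        return None
    for _subject, predicate, obj in message:
        if predicate == profile.timestamp_property:
            return obj
    return None


sosa_like = Profile(
    name="sosa-example",
    timestamp_property="http://www.w3.org/ns/sosa/resultTime",
)
untimed = Profile(name="untimed-example")

message = [
    ("_:obs", "http://www.w3.org/ns/sosa/hasSimpleResult", "21.5"),
    ("_:obs", "http://www.w3.org/ns/sosa/resultTime", "2026-04-16T10:00:00Z"),
]

print(extract_timestamp(sosa_like, message))
print(extract_timestamp(untimed, message))
```

A consumer such as a stream broker could ignore the descriptor entirely, while a windowing engine would only need the `timestamp_property` field, which matches the point that what a client needs from a profile depends on what it wants to know about the stream.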

Aggregated notes regarding profile scope

  • What is the timestamp property – optional
  • What are the ordering semantics
  • E.g., in a Kafka partition – total order; in a topic – partial order; or no order
  • Look into prior work of Emanuele – which message happened before which message. Important for CEP applications. Ordering without timestamps.
  • Specifying temporal relationships between messages, for example a profile for linear temporal logic
  • Schema + metadata
  • Let people express metadata in any way? Or make it narrower?
  • https://semiceu.github.io/LinkedDataEventStreams/#context-information ==> LDES does that here
  • LDES has support for configuring:
    • Versioning of representations
    • Indicating the order (timestamp or sequence)
    • Referring to CRUD semantics
    • Indicating a transaction: multiple events to be processed together
    • Retention policies of events in a log
    • Shapes of the events
  • Temporal-aware shape constraints
    • E.g., if door was opened, it cannot be opened again
    • Riccardo did some work on temporal SHACL
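The door example can be read as a constraint over the order of messages rather than over their timestamps. A minimal sketch, assuming each message has already been reduced to a hypothetical `(door, action)` pair and that stream position is the only ordering information available:

```python
from typing import Dict, Iterable, List, Tuple

# Each event is a hypothetical (door id, action) pair taken from one message.
Event = Tuple[str, str]


def invalid_positions(events: Iterable[Event]) -> List[int]:
    """Return stream positions of events that repeat a door's current state.

    Ordering comes only from position in the stream, not from timestamps.
    ASSUMPTION: the initial state of every door is unknown, so the first
    event seen for a door is always accepted.
    """
    state: Dict[str, str] = {}
    bad: List[int] = []
    for position, (door, action) in enumerate(events):
        if state.get(door) == action:
            bad.append(position)  # e.g. "open" directly after "open"
        else:
            state[door] = action
    return bad


events = [
    ("door1", "open"),
    ("door1", "open"),   # invalid: the door is already open
    ("door1", "close"),
    ("door1", "close"),  # invalid: the door is already closed
]

print(invalid_positions(events))  # prints [1, 3]
```

A consumer could discard the flagged messages, along the lines of "the data is invalid, discard it" from the discussion. Stricter rules, such as rejecting a close that arrives before any open, would need a defined initial state, which is one of the things a profile could specify.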

Vocabulary for profiles

  • Piotr: Do we want to have a vocab for the profiles? For people to describe the profiles:
  • PieterC: no. No added value compared to Linked Data Event Streams.
  • Riccardo: yes
  • Pedro: agree with Pieter, so no.
  • Piotr: LDES allows specifying that (section 4). Disadvantage: no interoperability for these profiles.
  • PieterC: you will implement a lot of different clients, do we really need interoperability across different worlds?
  • Riccardo: very simple actionable vocab would be a great contribution.