Linked Data Security -- 18 Sep 2019

scribe+

manu: Just a handful of slides. (link forthcoming)

<manu> slides: https://docs.google.com/presentation/d/1dn4uotAHXgKIwrPW3dlPArB19NZzxUf_cOSQcSlxb2Y/edit

manu: We’ll discuss LD security and whether to push forward at W3C.
... There have been a number of initiatives floating around for 7+ years (jcarroll 2003)
... Why now?
... C14N, proofs, signatures
... I’m co-inventor of VC, JSON-LD, Digital Bazaar, …
... The VC spec will be a REC in about a month, and specifies proofs using JWT and LD Signatures.
... The concern is that there’s no recommended specification of LD Signatures.
... The US Fed Govt through DHS has mandated the use of JSON-LD and VC and DIDs. They want an official standard for LD Sigs.
... (no surprise, but now it’s important).
... Also banks, healthcare, etc.
... DID camp needs signatures too.
... Other groups have provenance use cases, graph equality, etc.
... Also WoT needs signatures.
... (picture of stack)
... At the bottom is RDF Dataset C14N, to ensure that different expressions result in the same hash.
... Above, LD Proofs. A digital signature is just one type of proof. (proof of work, stake, elapsed time, …)
... Above that is LD Signatures, and above that Cryptography standards

An example proof mechanism is Equihash which is not a digital proof.

scribe: RDF C14N transforms N-Quads into a canonical serialization
... There are two mathematical proofs, one by Aidan Hogan, and the most recent by Rachel Arnold and David Longley, which is undergoing peer review. (Aidan’s has been peer reviewed).
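
Editorial illustration (not part of the meeting record): a minimal Python sketch of why canonicalization matters for hashing, covering only the easiest case of N-Quads with no blank nodes, where sorting the statements is enough. The function name `naive_canonical_hash` and the example IRIs are assumptions made for illustration. This is not the RDF Dataset C14N algorithm discussed here, which also has to relabel blank nodes.

```python
import hashlib


def naive_canonical_hash(nquads: str) -> str:
    """Hash N-Quads after sorting. Only meaningful when no blank nodes occur."""
    # An RDF dataset is a set of statements, so drop blank lines and duplicates.
    lines = sorted({line.strip() for line in nquads.splitlines() if line.strip()})
    # Crude blank-node check: relabeling blank nodes is the hard part
    # and is deliberately out of scope for this sketch.
    if any(term.startswith("_:") for line in lines for term in line.split()):
        raise ValueError("blank nodes need a real canonicalization algorithm")
    canonical = "".join(line + "\n" for line in lines)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Two serializations of the same statements in a different order.
doc_a = (
    '<http://example.org/alice> <http://xmlns.com/foaf/0.1/name> "Alice" .\n'
    "<http://example.org/alice> <http://xmlns.com/foaf/0.1/knows> <http://example.org/bob> .\n"
)
doc_b = (
    "<http://example.org/alice> <http://xmlns.com/foaf/0.1/knows> <http://example.org/bob> .\n"
    "\n"
    '<http://example.org/alice> <http://xmlns.com/foaf/0.1/name> "Alice" .\n'
)

same_hash = naive_canonical_hash(doc_a) == naive_canonical_hash(doc_b)
```

Here `same_hash` is true because both inputs reduce to the same sorted byte sequence before hashing.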

ivan: The C14N of an RDF graph has been an open issue. jcarroll and Pat Hayes had an algorithm in 2002, but indicated that it was not complete (there were graphs for which it wouldn’t work).
... For a long time, nothing happened, because there was no proof.
... Aidan published a paper which was complete, and had an implementation (in Java). I implemented it in JavaScript, but the paper is not REC quality.
... In parallel, dlongley came out with their algorithm, but it had no proof. manu and I have been discussing for a while.
... We expect a peer review of the associated paper, and we’ve discussed how to put it into a REC. Most RECs aren’t mathematical, and W3C is not equipped to judge mathematics. But, with peer review, we feel we can publish such a REC.

arnod: two papers, are they the same?

ivan: I understand that the two papers are very close. Simple cases plus some esoteric cases. A WG would need to choose between them.

sander: you mentioned that W3C doesn’t have mathematical capability, but many members do have the expertise.

ivan: You need different worlds to work together, SemWeb/RDF is one thing, and Crypto people looking at RDF would be difficult to make happen. They’d have to deeply understand things like BNodes.
... Aidan is a mathematician with a SemWeb background, and the other is a combination of mathematician and engineer.

manu: We’re trying to find good review from other universities.

ivan: If we get close to the point where we need to get a charter, we will have to call out university members whose opinion would be important.

tobias: W3C has said they would incubate?

ivan: For now, there’s an understanding in W3C that if there is an “official” proof that says the algorithms are okay, W3C would accept that as input, and we would not have to review.
... We would accept that the community of Crypto people have accepted it.

<deiu> ack

Scribe notes that discussion is that it’s not simply JSON-LD.

manu: Is there support to charter a group to handle these specs?

ivan: Need is not just for VC, there is a broad need for signed RDF data.
... We need to understand boundaries of the WG. Obviously, C14N is necessary. What else do we need?
... If I compare to XML Sig, it has C14N, a vocabulary, and a further XML serialization.
... We may have to have a vocabulary to describe how to put the data back into LD.

manu: Shows JSON-LD to C14N in N-Quads. (the _:c14nxxx is part of the serialization).
... That gives you a cross-syntax signature.
... The key is that it is syntax/serialization independent.
... LD Proofs are used to express digital proofs. (There are other types of proofs).
... LD Proofs are a way of attaching a proof to an RDF document.
... (illustrates a proof)
... A CuckooCycleProofOfWork2019 could be used to show proof of work.
... comes with domain, proofValue, nonce, and other annotations that are included in the verification of the signature.
... Above proof is LD Signatures. It adds a verification method (e.g., pub key). What matters is the graph, not where the graph is located.
... This is the vocabulary part.
... Proof requires C14N to get hash.
... There aren’t many developer options when picking a signature method.
... LD Cryptosuites provide pre-packaged suites that bundle the various pieces together in an easy-to-use type.
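
Editorial illustration (not part of the meeting record): a toy Python sketch of a non-signature proof computed over the hash of a canonicalized document. It swaps in a simple hash-based proof of work, not Cuckoo Cycle or Equihash, and the type name `ToyHashProofOfWork`, the function names, and the difficulty parameter are hypothetical. Only the `nonce` and `proofValue` field names come from the discussion above.

```python
import hashlib
from itertools import count


def find_proof(canonical_doc: str, difficulty_bits: int = 12) -> dict:
    """Search for a nonce whose digest, combined with the document hash, is below a target."""
    doc_hash = hashlib.sha256(canonical_doc.encode("utf-8")).digest()
    target = 1 << (256 - difficulty_bits)
    for nonce in count():
        digest = hashlib.sha256(doc_hash + str(nonce).encode("ascii")).digest()
        if int.from_bytes(digest, "big") < target:
            # Hypothetical proof object; not a registered proof type.
            return {
                "type": "ToyHashProofOfWork",
                "nonce": str(nonce),
                "proofValue": digest.hex(),
            }


def verify_proof(canonical_doc: str, proof: dict, difficulty_bits: int = 12) -> bool:
    """Recompute the digest from the document and nonce, then check value and difficulty."""
    doc_hash = hashlib.sha256(canonical_doc.encode("utf-8")).digest()
    digest = hashlib.sha256(doc_hash + proof["nonce"].encode("ascii")).digest()
    below_target = int.from_bytes(digest, "big") < (1 << (256 - difficulty_bits))
    return below_target and digest.hex() == proof["proofValue"]


# Assumed to be already canonicalized N-Quads.
doc = '<http://example.org/alice> <http://xmlns.com/foaf/0.1/name> "Alice" .\n'
proof = find_proof(doc)
valid = verify_proof(doc, proof)
tampered_valid = verify_proof(doc + "#", proof)
```

Because the proof is bound to the hash of the canonical form rather than to a particular serialization, any syntax that canonicalizes to the same N-Quads would verify against the same proof.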

<Zakim> azaroth, you wanted to discuss dependent canonicalizations and to note dependent canonicalizations

azaroth: isn’t there also a suite of other C14N bits?

manu: There could be, we have a universal RDF Dataset canonicalization algorithm.

azaroth: If we have a JSON literal and the encoding doesn’t canonicalize that, there would be a different hash generated by an algorithm which uses different white space, for example.

ivan: for datatypes, there are C14N issues. There is one for XML, probably not for HTML.

azaroth: The WG would not consider JSON canonicalization as being in scope.

manu: We don’t go into literals.

<bigbluehat> JSON Literals in JSON-LD 1.1 (for the curious) https://w3c.github.io/json-ld-syntax/#json-literals
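
Editorial illustration (not part of the meeting record): a small Python sketch of azaroth's point that the same JSON value serialized with different white space or key order hashes differently unless it is normalized first. The helper names and sample values are assumptions, and the normalization shown (sorted keys, compact separators) is a stand-in, not a standardized JSON canonicalization scheme.

```python
import hashlib
import json


def sha256_hex(text: str) -> str:
    """Return the hex SHA-256 digest of a string's UTF-8 bytes."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def normalize_json(text: str) -> str:
    """Re-serialize JSON with sorted keys and no optional white space."""
    return json.dumps(json.loads(text), sort_keys=True, separators=(",", ":"))


# The same JSON value, written two ways.
compact = '{"a":1,"b":[2,3]}'
spaced = '{ "b": [2, 3], "a": 1 }'

raw_hashes_differ = sha256_hex(compact) != sha256_hex(spaced)
normalized_hashes_match = (
    sha256_hex(normalize_json(compact)) == sha256_hex(normalize_json(spaced))
)
```

The raw strings hash differently, while the normalized forms hash identically. This is the kind of literal-level canonicalization that, per the discussion, the dataset algorithm does not go into.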

ivan: The signature algorithms shown exist? Who defines them?
... My feeling is that we standardize the vocabulary, and C14N, but not the specific methods.

manu: Customers need the algorithms to be standardized.
... the specs don’t define the encodings, just the vocabulary.

<bigbluehat> scribe+ bigbluehat

<bigbluehat> gkellogg: the RDF canonicalization does define 3 mechanisms

<bigbluehat> manu: other things (signature algo, etc) would be out of scope

manu: hashing and signature algorithms are out of scope, but signatures are in scope.

azaroth: We are not defining, but we are selecting. This plays into registries and such, which may need to be updated.

<Zakim> azaroth, you wanted to ask about registries?

manu: The CCG is currently in charge of the registry, but could be handed off.
... next steps. We still need peer review, then we need to seek a charter

ivan: Speaking for myself, if we have the reviews for the algorithms (reconciled), creating a charter using the algorithms as input is doable.

arnod: Why isn’t one peer reviewed algorithm sufficient?

manu: We need two independent proofs.

ivan: We have an implementation of an unproved algorithm, and no production-ready implementation of the one which is reviewed.
... At some point, we will make the bridge to reconcile the two different algorithms.

manu: Expectation is that the algorithms converge.

ivan: Aidan’s algorithm is IP free. DB’s has been published to the public domain.

ken: just to clarify that the box at the top defines how to call out to an external crypto library and how to apply it (protocol?)

manu: What remains is if anyone sees issues around formal objections or organizations that may object.

ivan: There will be “the usual” objections.

tony: I tried to implement this, but couldn’t. Spec is incomplete.

ivan: that’s why the mathematical paper needs to be done, but not that the spec is complete.

tony: I worked on XML C14N and it was a disaster.
... The processing time required was a problem, required sender vs receiver C14N.
... Is it going to be fast enough?

ivan: I know Aidan ran his implementation through large LD sets and showed performance.

manu: Depending on the type of graph (poison graphs) it can take 50-100ms to detect an attack.
... In the easiest case, all you’re doing is sorting, which takes about 5-25ms.
... If we put JSON Schema in, it takes 10x longer to do it vs C14N.
... A Base64 encode takes about 1/2 the time.
... It’s on the order of Base64 encode time.

sander: would it be possible for graph stores to just use canonicalized bnode names?

ivan: In theory, but any change throws it off.

manu: You can also canonicalize to a template and reuse at very fast speed.

tony: can I use some XPath like thing?

manu: you could, but it’s likely unnecessary, as we don’t have the same problems. E.g., there’s no nesting.
... You could use framing and JSON pointer.

<manu> rrsagent bye
