Meeting minutes
phila: Any new attendees?
Today: comparing the algorithms, deciding editors,
<phila> https://
phila: How do we compare the two algorithms
… described as being similar
… as a group we need to decide how to proceed
… various ideas - need to be open and fair.
<Zakim> manu, you wanted to suggest one approach to compare algorithms.
phila: if the graph is simple then it is a simple algorithm, as bnode structures increase it gets harder.
manu: Aiden presented his algorithm. We could compare how algorithms A and B work, as the stages are similar.
… run both in parallel
… guided tour of the process
gkellogg: if we are able to pick some examples to run through, that would be useful
… complex category of graphs - set of problematic use cases - performance testing
<Zakim> dlongley, you wanted to ask about criteria for making choices if we knew what the differences were to understand what differences to look for
dlongley: if we figure out the differences - need to keep in mind criteria.
… how formal do we need to get in understanding the differences
… worried about the amount of work
manu: add an input:
… formal analysis
<manu> Technical Report on the Universal RDF Dataset Normalization Algorithm: https://
manu: we might consider bringing in that group
… introduction to the formal analysis
AndyS: You mentioned that not all graphs might be covered. I think we ought to target all graphs, as you don't know what you'll encounter in the real world
<Zakim> manu, you wanted to comment on the "all graphs" thing
dlongley: solve for all graphs (within resource limits)
… by default solve all "normal graphs"
… special flag for all graphs
<Zakim> manu, you wanted to comment on the "all graphs" thing -- concerns around "as big as the web"
AndyS: I'd push back a little on that as it means deciding what is and is not normal
<dlongley> "don't (try to) canonicalize the Web"
manu: at a higher level ... potential formal objections on charter ... e.g. very very large graphs
… working on documents that are bounded
… general algorithm ... state caveats e.g. not unbounded graphs
… "poison graphs" as an attack vector.
… we can eat up a lot of time on this.
… scoping of graph needed
phila: we have to limit the scope
… we could create an algorithm for all graphs, but that is not the requirement
phila: UCR says what we are trying to solve
… (editors needed)
<manu> +1 to explainer document to set the boundary of what we're trying to do.
<gkellogg> SHACL doesn't do datasets, only graphs.
phila: is there a condition we can check as a preprocessing step on the graph?
AndyS: If you take a FOAF graph built up from bnodes - can become complex in a small file
… I'd rather an approach that recognizes that sometimes you can't execute, rather than defining upfront what you can't compute
<Zakim> dlongley, you wanted to say i think you'd have to formally prove a preprocessing step would protect you if there will be no false safe constraints in the processing algorithm
dlongley: a preprocessing step would need proving
<Zakim> manu, you wanted to speak about "multiple phased solutions" not THE algorithm.
manu: we are not generating one algorithm. There exists today some impls in the field.
… we might look at whether it is good enough
… then consider next version
… not all or nothing
AndyS: What are the limitations? Assumption?
<Zakim> dlongley, you wanted to say we also know that RDF-star is coming -- and we'll need another algorithm for that
dlongley: current limitations/assumptions of URDNA2015 - any bounded dataset
… bail out at cost points.
AndyS: I'm happy with bailing out. But you can go further and say it doesn't handle all graphs. I'm happy with all graphs, with a bail out if it takes too much computing
AndyS: Defining a shape beforehand is not something we should do
<manu> +1 to what AndyS is saying -- sounds like we're agreeing :)
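The "accept all graphs, bail out at a cost point" position agreed above can be illustrated with a small sketch. This is not URDNA2015 or any implementation's actual logic: `WorkBudget`, `BudgetExceeded`, `max_steps` and `smallest_labelling` are hypothetical names, and the expensive step is a stand-in loop over blank node orderings rather than a real canonicalization step.

```python
from itertools import permutations


class BudgetExceeded(Exception):
    """Raised when processing costs more than the caller allowed."""


class WorkBudget:
    """Hypothetical step counter; not part of any real canonicalization library."""

    def __init__(self, max_steps):
        self.max_steps = max_steps
        self.used = 0

    def spend(self, steps=1):
        # Count the work done so far and give up once the limit is passed.
        self.used += steps
        if self.used > self.max_steps:
            raise BudgetExceeded(f"gave up after {self.max_steps} steps")


def smallest_labelling(bnodes, budget):
    """Stand-in for an expensive step: try every ordering of the blank nodes.

    Any input is accepted; nothing is rejected up front by shape.
    The only failure mode is running out of budget.
    """
    best = None
    for candidate in permutations(sorted(bnodes)):
        budget.spend()
        if best is None or candidate < best:
            best = candidate
    return best
```

With a small input the call returns normally; with eight blank nodes and a budget of 100 steps it raises `BudgetExceeded` instead of running through all 40320 orderings. The caller decides the limit, so no definition of a "normal" graph is needed in advance.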
phila: Others?
Kazue: thinking external criteria hard to decide
phila: and it is political
yamdan: also important to be clear about processing.
… A difference of the two algorithms is scope - dataset vs graph.
<manu> +1 to yamadan's points.
dlongley: criteria important. Formally defining the differences is itself difficult.
<Zakim> gkellogg, you wanted to suggest identifying specific categories of graphs in our hypothetical dataset that are known to create computational problems.
phila: please think of two criteria
gkellogg: want a collection of cases beyond test cases e.g. known expensive.
<dlongley> 1. ease of implementation, 2. existing incubation / use in the marketplace, 3. time / resource complexity in solving common datasets, 4. time / resource complexity in solving complex (or poison?) datasets
dlongley: not an ordered list
<manu> 5. Existence of formal proofs for the algorithms
<manu> 6. Demonstration of review of formal proofs for the algorithms
phila: ease of implementation - yes.
… incubation - yes
… resource complexity - yes
… formal proofs - yes
AndyS: Ease of implementation and complexity of algorithm can be in opposition
<dlongley> yes, there is a tension between ease of implementation and time complexity (sometimes)
<manu> +1 to create an issue to track this.
<dlongley> 7. reusing existing primitives that are available on various platforms
<Kazue> coverage of target RDF?
dlongley: reuse primitives e.g. hashing algorithms.
… existing RDF serialization.
Kazue: cover real life examples
phila: need to note that only unusual graphs trigger the failsafes.
<Zakim> manu, you wanted to note "hashed data" as the output... for BBS.
manu: BBS signatures do a statement-by-statement signature
<dlongley> 8. allow signatures on individual statements and components of statements
manu: criteria: has to support selective disclosure. Hashing alternatives.
<Zakim> AndyS, you wanted to give criteria
<yamdan> +1 to BBS-friendly hash
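One way to picture the "statement by statement" requirement is to hash each canonical statement on its own instead of hashing the whole document. The sketch below is only an illustration under assumptions: it is not a BBS implementation, `hash_statements` is a hypothetical helper, SHA-256 is an arbitrary choice (the minutes do not name a hash function), and the input is assumed to be canonical N-Quads with one statement per line.

```python
import hashlib


def hash_statements(canonical_nquads: str) -> list[str]:
    """Hash each non-empty line (one statement) of the input separately.

    Assumes the input is already canonicalized; this function does no
    blank node relabelling or sorting of its own.
    """
    return [
        hashlib.sha256(line.encode("utf-8")).hexdigest()
        for line in canonical_nquads.splitlines()
        if line.strip()
    ]


# Hypothetical canonical output with two statements.
doc = (
    '_:c14n0 <http://example.org/knows> _:c14n1 .\n'
    '_:c14n0 <http://example.org/name> "Alice" .\n'
)
digests = hash_statements(doc)
```

Each digest corresponds to one statement, so a per-statement signature scheme could in principle work on the list rather than on a single document-level hash.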
AndyS: Dataset, not graph, no shape excluded, cover RDF-star
<gkellogg> +1 to AndyS
AndyS: Translates as do stuff with the longest life
<manu> I was with AndyS all the way up to "cover RDF-star" :)
<gkellogg> Also, Generalized RDF (bnode predicates, literal subjects)
+1 to gkellogg.
dlongley: RDF-star. Do existing use cases.
phila: rdf-star is a nice to have but should not fail because of rdf-star
URDNA2015 FPWD
phila: URDNA2015 as FPWD.
… likes explanatory examples.
Editors
phila: need to do a test suite and an explainer.
<Zakim> gkellogg, you wanted to volunteer to edit one or both of the documents and help with the test suites.
gkellogg: have been active in CG
… hat in the ring
For the C14N spec: ...
… For the C14N spec ...
<manu> Thank you, Gregg for Editor-ing the canonicalization spec! :)
dlongley: can contribute as backup editor
phila: would anyone like to be an editor or contribute in some way?
… hash doc
… "RDH"
<Zakim> manu, you wanted to note they might be the same doc?
manu: might be the same doc.
… hashing is C14N input, hash it. -- one page?
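The "take the C14N output, hash it" step can be sketched in a few lines. This is a minimal illustration under assumptions: `hash_canonical_nquads` is a hypothetical helper, SHA-256 is an arbitrary choice (the minutes do not name a hash function), and the input string is assumed to be the already-canonicalized N-Quads serialization.

```python
import hashlib


def hash_canonical_nquads(canonical_nquads: str) -> str:
    """Return the hex SHA-256 digest of an already-canonicalized N-Quads string."""
    return hashlib.sha256(canonical_nquads.encode("utf-8")).hexdigest()


# Hypothetical canonical output: one statement per line, blank nodes relabelled.
doc = (
    '_:c14n0 <http://example.org/knows> _:c14n1 .\n'
    '_:c14n0 <http://example.org/name> "Alice" .\n'
)
digest = hash_canonical_nquads(doc)
```

All of the hard work sits in producing the canonical string; once that exists, the hashing step is a single call. That is consistent with the suggestion that the hashing part might fit on one page.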
<manu> Woo! Thanks Tobias for volunteering to be an Editor!
Tobias_: happy to help edit esp hashing
<dlongley> +1
<manu> (for the second part)
<pchampin> +1
<Zakim> gkellogg, you wanted to discuss testing implications.
gkellogg: testing may be easier as 2 docs
<pchampin> a contrario, the C14N itself may be a complex document. That could justify keeping the hashing part out.
yamdan: interested in hashing part
<dlongley> +1 to Phil
phila: end meeting
<manu> Note: Ahmad Alobaid volunteered to be a first-time Editor in this group.