Meeting minutes
Complex Data Types
https://
erich: Complex Data Types (CDT) is implemented in Jena
… And RDF lists can be turned into CDT arrays.
… CDT is a proposal
… We also need a standard binary representation of RDF.
… RDF HDT allows queries to run directly on it.
… They treat all literals as strings, which doesn't work well for a lot of numeric data, but it's very fast.
… Great for huge amounts of data.
… I used it with Apache Arrow that I made.
… HDF5 is used in the scientific world, like an intelligent zip file. Chunks can be compressed, but you can still pull out parts.
… I'm putting the guts of HDT into HDF5.
… OpenLink Software won't touch it if it isn't a standard.
… I harvested all of the imaging commons data and it was 90 TB of data, and pulled out all the metadata in a week.
… I'm working on a SOLID project -- a Java library that wraps read/write storage for linked web storage.
… Want to use it for image annotations and displays.
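Scribe note: the remark above that RDF lists can be turned into CDT arrays can be illustrated with a minimal sketch. It assumes triples are plain Python tuples rather than objects from a real RDF library, and the function name `rdf_list_to_array` is hypothetical. It only shows the walk over the rdf:first/rdf:rest chain; it does not produce the CDT literal syntax and is not the Jena implementation.

```python
# Illustrative sketch only: triples are plain Python tuples, not a real RDF library.
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RDF_FIRST = RDF + "first"
RDF_REST = RDF + "rest"
RDF_NIL = RDF + "nil"


def rdf_list_to_array(triples, head):
    """Walk an rdf:first/rdf:rest chain starting at `head` and return its items."""
    # Index the two list predicates by subject for constant-time lookup.
    first = {s: o for s, p, o in triples if p == RDF_FIRST}
    rest = {s: o for s, p, o in triples if p == RDF_REST}

    items = []
    seen = set()
    node = head
    while node != RDF_NIL:
        if node in seen:
            raise ValueError("cycle in RDF list")
        if node not in first or node not in rest:
            raise ValueError(f"malformed RDF list at node {node!r}")
        seen.add(node)
        items.append(first[node])
        node = rest[node]
    return items


# A three-element list takes six triples in plain RDF (hypothetical blank node labels).
triples = [
    ("_:b1", RDF_FIRST, 1),
    ("_:b1", RDF_REST, "_:b2"),
    ("_:b2", RDF_FIRST, 2),
    ("_:b2", RDF_REST, "_:b3"),
    ("_:b3", RDF_FIRST, 3),
    ("_:b3", RDF_REST, RDF_NIL),
]
array = rdf_list_to_array(triples, "_:b1")
print(array)
```

In this sketch the six list triples collapse into one array value, which is the kind of compaction a single array-valued literal would give.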
dbooth: Any SPARQL implementations of CDT yet?
erich: Only the Jena SPARQL store.
erich: CDT paper: https://
… also this: https://
… I also suggested allowing JSON paths
dbooth: A community group is a good step toward an official W3C working group
erich: Need to remove redundancy in the file
detlef: From a theoretical standpoint, the hierarchical approach in DICOM is the right way to do it
… You have to connect to the UIDs of the entities
https://
erich: The HDT source code is GPL. I'd rather have it MIT or Apache 2 license.
… but it squeezes the data down a lot.
… It doesn't need to be serialized/deserialized. Like a direct memory copy.
… HDF5 is used for a lot of learning models.
… The ordering of the HDT data is not currently usable.
… I'm stuffing HDT data into HDF5.
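Scribe note: the earlier point that treating all literals as strings does not work well for numeric data can be shown with a small sketch. The sample values and variable names are made up for illustration, and nothing here reflects HDT's actual internals; it only contrasts string comparison with numeric comparison.

```python
# Hypothetical numeric literals stored by their lexical (string) form.
values = ["9", "10", "100", "25"]

# Sorting the stored strings gives lexicographic order, not numeric order.
as_strings = sorted(values)
as_numbers = sorted(values, key=int)

# A range filter "between 5 and 50" evaluated on strings versus on parsed numbers.
string_hits = [v for v in values if "5" <= v <= "50"]
numeric_hits = [v for v in values if 5 <= int(v) <= 50]

print("sorted as strings:", as_strings)
print("sorted as numbers:", as_numbers)
print("range filter on strings:", string_hits)
print("range filter on numbers:", numeric_hits)
```

The string-based filter misses every value that the numeric filter finds, so numeric queries need the values parsed or stored in a typed form.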
ADJOURNED