13:32:34 RRSAgent has joined #lws
13:32:38 logging to https://www.w3.org/2025/10/27-lws-irc
13:32:38 Zakim has joined #lws
13:32:54 meeting: Linked Web Storage WG
13:32:54 agenda: https://www.w3.org/events/meetings/a19ab7dc-1753-433d-bac5-64e3ad8c0a43/20251027T100000/
13:32:54 clear agenda
13:32:54 agenda+ Introductions & announcements
13:32:54 agenda+ Categories of metadata
13:32:54 agenda+ Relationship between metadata & resource headers
13:32:54 agenda+ Server-managed vs user-managed metadata
13:32:57 agenda+ Metadata format(s)
13:32:59 agenda+ HTTP management of metadata
13:59:29 gibsonf1 has joined #lws
13:59:34 eBremer has joined #lws
13:59:39 present+
13:59:55 acoburn has joined #lws
13:59:57 present+
14:00:20 present+
14:00:30 present+
14:00:35 I have made the request to generate https://www.w3.org/2025/10/27-lws-minutes.html TallTed
14:01:31 previous meeting: https://www.w3.org/2025/10/20-lws-minutes.html
14:01:31 next meeting: https://www.w3.org/2025/11/03-lws-minutes.html
14:04:21 ericP has joined #lws
14:04:30 chair: ericP
14:04:40 jeswr has joined #lws
14:04:47 present+ ericP, jeswr
14:04:51 scribe: jeswr
14:05:17 bartb has joined #lws
14:05:54 present+
14:05:55 present+
14:06:04 present+
14:06:18 Zakim, take up next agendum
14:06:18 agendum 1 -- Introductions & announcements -- taken up [from agendabot]
14:07:08 acoburn: This week is daylight savings time in Europe. Next week we will be back to normal. Please use W3C calendar to get the canonical time of meetings.
14:07:25 take up next agendum
14:08:56 present+ dmitri
14:08:58 acoburn: We plan to spend the next 2 weeks discussing resource metadata, such that we can then draft spec text.
14:09:37 ... We will then do 1 week of storage metadata, and then containment.
14:09:55 ... By January we hope to be able to draft language for metadata and containment
14:11:04 q+ to ask about definition of metadata
14:11:18 ... Before we can specify format and content of metadata, we need to agree on the categories of data we want to describe
14:11:39 gibsonf1: How do we define metadata as compared to data?
14:12:08 acoburn: We have data resources, these could be in any format (RDF, JSON, XML). When we have binary resources e.g. JPEG, we cannot have self-description in that.
14:12:22 ... a metadata resource is a resource attached to a data resource which describes it
14:12:41 gibsonf1: So then the definition of a resource is a file
14:12:43 q+
14:12:52 q+ to propose thinking of metadata as a combo of managed data and stuff that the user has sequestered as "metadata"
14:12:53 acoburn: Yes
14:13:21 gibsonf1: I am confused because we can have resources that aren't files, e.g. request/response
14:13:55 TallTed: My analogy is - if you have a book the stuff on the pages is the data. The title, copyright, version etc. are metadata.
14:14:38 ... this is messy. My advice is not to worry about it too much until things are concrete.
14:15:18 ... Thinking about databases. The query is not part of the results set - but the query can be put beside it, and that would be metadata.
14:15:57 ack next
14:15:58 gibsonf, you wanted to ask about definition of metadata
14:15:59 ... as acoburn was saying - with a JPEG, the data is what lets you display the image. The rest is the side card, e.g. where the picture was taken, time, what device etc.
14:16:01 ack next
14:16:13 ack next
14:16:14 ericP, you wanted to propose thinking of metadata as a combo of managed data and stuff that the user has sequestered as "metadata"
14:16:59 ericP: Do we also consider some metadata to be server managed, and other metadata to be user managed?
14:17:04 acoburn: Yes
14:18:27 acoburn: I have categories of metadata here ???. There are also types of metadata that we may also want to allow for, but not require; e.g. memento versioning.
14:18:50 ... I will go through them quickly, then we can have a discussion about whether these categories need to be modified.
14:19:58 ... First should server managed data be in a container or resource? We have both Solid and Fedora as input documents and ???
14:20:01 "Categories" are "types"!
14:20:01 This list is confusing, because there are multiple layers mushed together, and same/VERY-similar line items repeat.
14:20:18 ... Second is user-managed types in addition to server managed types. E.g. for types-indexes
14:21:31 q+ to ask about metadata location
14:21:41 ... Third is storage description resources. In Solid it doesn't tell you what should be in this resource - just that there must be only one. Fedora does not specifically refer to a storage resource, it does have a describedBy relation which comes from LDP.
14:22:13 q+
14:22:36 ack next
14:22:37 gibsonf, you wanted to ask about metadata location
14:22:38 gibsonf1: Why are we adding locations to find information about a request, rather than allowing the server to manage that - and the client going and saying "give me RDF about this resource" to the server?
14:22:44 ack next
14:22:53 acoburn: This is fine unless you want metadata in the same format as the base resource
14:24:41 TallTed: This is where having a good understanding of HTTP is important. In HTTP the client asks for a resource, which is the URL they identify. They can also include the media types they want to get back; with different levels of preference. The server then returns some representation in a form that the client also requested - but not
14:24:41 necessarily. The server is free to do anything that it wants. Servers can also have an internal quality rating which mixes the client's quality ratings with the server's own quality ratings of resources.
14:25:09 s/necessarily. The/... necessarily. The/
14:26:01 ... It may be a text, a PNG rather than a JPEG. Again we are in a place of "don't worry about this until it comes up". It is not really easy to answer or understand until we have some real examples to discuss. There are so many possible examples that it doesn't work out to have discussions of them on this call, because everyone has their own pet
14:26:01 examples.
14:26:27 gibsonf1: So if the request somehow says "I want to request this resource BUT I want metadata", then it would resolve the issue
14:26:56 TallTed: Yes, but everyone has their own definition of "what is metadata for this resource". So it makes more sense to ask for a description of the resource.
14:27:32 ... e.g. if you ask for a text type on an image, then you might get a text description of the image. These are all quote/unquote metadata but these are also descriptions of the resource
14:27:51 ... We just can't cover this all in hypotheticals - it is untenable.
14:28:08 laurens has joined #lws
14:28:40 acoburn: This metadata discussion comes from LDP. Here there are 2 resources a and b, where a is describedBy b. The implication is that they are different resources where one is described by the other.
14:29:15 ... Another category coming from both Solid and Fedora is having reference to an ACL which may be on the same or different server.
14:29:52 (I'm guessing that this "Fedora" is not the same as "Fedora Linux"?)
14:29:53 ... Another category is reference to a container in which a resource is currently existing.
14:30:29 ... A few from HTTP mediatype etc. which would probably be included in standard HTTP headers
14:31:01 ... Fedora is NOT Fedora Linux. Fedora is https://fedora.info/spec/ which is one of the input documents in our charter.
14:31:13 -> Fedora https://fedora.info/
14:31:29 q+
14:31:38 dmitriz has joined #lws
14:31:41 ... are there any categories which are missing or shouldn't be here?
14:31:44 q+
14:31:47 ack next
14:32:59 could s/Categories/Sources/
14:33:24 TallTed: The current list is confusing. The two first categories are Types, other line items appear to intersect with each other. I would break this into sub-lists starting with server-managed and user-managed. These will be special cases in and of themselves. For example in the case of size, server-managed might be down to the type; user managed
14:33:24 might be big and small.
14:33:38 ack next
14:33:53 ... In the case of creation time - is it when a file was actually created, or is it when it was copied onto the linked web server
14:34:22 ... in the case of author is it client, legal entity that created it etc.
14:34:58 acoburn: Yes I will put into server-managed vs. user-managed categories. Then there is ambiguity, e.g. size is often a function of the payload that the user has created. So they don't control it; but they quasi do.
14:35:29 dmitri: The other categories I would love to see are (1) replication of auxiliary document (2) for encryption (3) linking to an access log
14:35:38 q+ to ask for replication clarification
14:35:44 s/dmitri/dmitriz/
14:36:10 acoburn: Do you want this on all implementations - or something a particular implementation would add in a specific way?
14:36:23 dmitriz: The latter - so in this document's parlance they would be extensions
14:36:58 ack next
14:36:59 ericP, you wanted to ask for replication clarification
14:37:00 s/present+ dmitri/present+ dmitriz
14:37:29 ericP: Is chain of custody something that would be replicated place to place?
14:37:57 dmitriz: The provenance of authorship would be one thing. Here I am more interested in where copies of this file exist, e.g. on my home server
14:38:21 ... we haven't dealt with replication in Solid much before, but it has always been a missing piece.
14:38:42 acoburn: Another point I wanted to discuss is the relationship between metadata resources and headers.
14:38:58 ... e.g. for size, should that come back as content-size in the HTTP headers
14:39:12 here's an example of Replication related settings for a replicating db like PouchDB: https://pouchdb.com/guides/replication.html
14:39:34 ... for these, often we have used link-headers in Solid and the Fedora project.
14:39:38 think of it, at base level, as a git-like list of "remotes"
14:40:19 +1 to simplicity of single source of truth
14:40:31 ... a minimal approach is that anything in metadata resources is not included in HTTP headers, you just link to the external resource; and all metadata is included in there. Both Solid and the Fedora project do that. Neither Solid nor Fedora have much to say about the format of it; how do you modify it etc.
14:40:53 possibly surprising weaknesses in HTTP -- content negotiation lets the client request any media type, and the server respond with any media type, but content size is always handled in bytes, notably including for paging, which is very unhelpful when working with RDF or CSV or TSV or various other "data" media types
14:41:11 ... a unified approach is that anything you put in a metadata resource is added into the HTTP request. The major con is that you are then constrained on the values that you can put in metadata resources
14:41:44 ... the split approach is that one type of metadata resource is used for content that will go into HTTP headers; and another metadata resource is for content that will _not_ go into HTTP headers
14:41:45 q+
14:41:46 q+ on location
14:42:50 TallTed: Another challenge comes from a weakness in HTTP. The client can request any media type, and servers can respond with any media type. Size is only in bytes; therefore you cannot say "give me record 16, or row 16".
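Editorial sketch, not part of the meeting record: the content negotiation TallTed describes (client preferences mixed with the server's own quality ratings) could look roughly like the following. The function names, the idea of multiplying the two q-values, and the example media types are assumptions made for illustration; real servers may weigh things differently and, as noted in the discussion, remain free to return something else entirely.

```python
from __future__ import annotations


def parse_accept(header: str) -> list[tuple[str, float]]:
    """Parse an Accept header into (media_range, q) pairs. Simplified sketch."""
    prefs = []
    for part in header.split(","):
        fields = [f.strip() for f in part.split(";")]
        media_range = fields[0].lower()
        if not media_range:
            continue
        q = 1.0  # q defaults to 1 when the client does not state it
        for param in fields[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value.strip())
                except ValueError:
                    q = 0.0
        prefs.append((media_range, q))
    return prefs


def client_quality(candidate: str, prefs: list[tuple[str, float]]) -> float:
    """Return the q of the most specific media range matching the candidate."""
    best_specificity, quality = -1, 0.0
    for media_range, q in prefs:
        if media_range == candidate:
            specificity = 2
        elif media_range == "*/*":
            specificity = 0
        elif media_range.endswith("/*") and candidate.startswith(media_range[:-1]):
            specificity = 1
        else:
            continue
        if specificity > best_specificity:
            best_specificity, quality = specificity, q
    return quality


def choose_representation(accept: str, available: dict[str, float]) -> str | None:
    """Pick the available type with the best client_q * server_q score.

    `available` maps media types the server can produce to the server's own
    (hypothetical) quality rating. Returns None when nothing is acceptable.
    """
    prefs = parse_accept(accept)
    best, best_score = None, 0.0
    for candidate, server_q in available.items():
        score = client_quality(candidate, prefs) * server_q
        if score > best_score:
            best, best_score = candidate, score
    return best


if __name__ == "__main__":
    # The client prefers JPEG, but the server only has PNG and plain text.
    accept = "image/jpeg;q=0.9, image/png;q=0.8, text/plain;q=0.1"
    print(choose_representation(accept, {"image/png": 1.0, "text/plain": 1.0}))
```

Note that nothing in this exchange addresses the size problem raised above: the negotiation is about media types only, and any length reported for the chosen representation is still a byte count.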
14:42:58 ack next
14:42:59 yeah, but HTTP's content-length is a transfer parameter
14:43:07 ack next
14:43:08 gibsonf, you wanted to comment on location
14:43:46 q+ to ask if that means only one link header
14:43:54 gibsonf1: One option is to specify a location. Give me metadata on this thing and then ???
14:45:08 TallTed: The deployment needs to know what to give you back when you ask for metadata on a resource; but you are not going to know what the deployment's set-up was. The server will then go and say "sure, here is what the metadata is", you will go "that is a bunch of gobbledygook and I just wanted size, media type and something else".
14:45:29 gibsonf1: But here (in the LWS group) are we not defining exactly what metadata is?
14:46:01 TallTed: Here we have a shortlist of what metadata is. But someone else will then say "no, that causes limitations - here, here and here."
14:46:20 ... We need to not be so boxed in that other people cannot extend them within the bounds of LWS
14:46:53 q+ to talk about the requirement for self-descriptive and discoverable APIs
14:46:59 ... There is a lot of challenging stuff here without concrete examples. These concrete examples would take a lot of time to pull together. Some are in the UC&R's, but we have not gotten very granular on many of those.
14:47:02 q-
14:47:04 ack next
14:47:05 acoburn, you wanted to talk about the requirement for self-descriptive and discoverable APIs
14:47:24 ... We would probably see resource size. Likely in bytes because that is what HTTP supports, but not much beyond that.
14:47:38 acoburn: I want to bring us back to self-descriptive and discoverable APIs
14:48:40 ... If you specified a particular location; then someone needs to read the specification to know what the location is. This goes against the principle of having self-descriptive APIs within the specification.
14:48:45 +1
14:49:00 ... I think we should be really careful about specifying locations
14:49:04 +1 on having metadata location in the header
14:49:31 https://datatracker.ietf.org/doc/html/draft-ietf-appsawg-uri-get-off-my-lawn-05
14:49:44 acoburn: There is an IETF specification about being very careful when specifying resource URIs and instead using discovery
14:49:55 speaking of IETF specifications, there's a decent one on Linksets (which are basically a format to express all these auxiliary resources)
14:49:57 https://www.rfc-editor.org/rfc/rfc9264.html
14:50:07 "URI Design and Ownership"
14:50:27 acoburn: Are there thoughts on how rich this data should be and how we should link to it?
14:50:52 ... one option is that we have a link-set resource which contains metadata that is going to influence what comes back on a get request
14:51:07 ... there could be a link-set which links to the location where that link-set is managed
14:51:23 ... in addition you could have an RDF or JSON resource which is pointed to by describedBy
14:51:43 ... the Solid specification currently prevents the situation where there are two describedBy resources
14:51:47 q+ to say for link header example, its super complex
14:51:58 ... what does this group think about this kind of a structure?
14:52:50 (and fwiw, the Linkset RFC is json-ld enabled)
14:53:55 dmitriz - yes see: https://www.rfc-editor.org/rfc/rfc9264.html#name-the-linkset-relation-type-f
14:53:55 https://www.w3.org/TR/json-ld/#interpreting-json-as-json-ld
14:54:18 ack next
14:54:19 gibsonf, you wanted to say for link header example, its super complex
14:54:20 ... To explain Link-Sets [RFC9264]: they enable you to go between HTTP headers and a JSON document.
14:54:56 gibsonf1: It seems to me that if we are talking about metadata, and assume the metadata is in RDF and you can do anything you want with it, then we would be solving this in the simplest and most interoperable way.
14:55:17 ... whereas if we pick one of these things like RFC9264 then it gets complex
14:55:33 q+
14:56:01 ericP: It is challenging to have a document that is both server and user managed when you run into conflicts. Keeping those both separate is useful. If you want to point to the user-managed document; then we are back in the space of how we point to it.
14:56:30 gibsonf1: Here you assume that is a file, I am not sure why you make that assumption. It could be coming from a database etc.
14:56:46 ericP: If it is user-managed metadata, then they have rights on that as well.
14:57:09 gibsonf1: I don't see that as an issue either. Wouldn't the user just do a write request with whatever predicates we define, and the server knows how to handle it.
14:57:36 ericP: If a subset is server managed, does this then change the notion of the data that the server has got?
14:57:55 q-
14:58:05 gibsonf1: From my implementor perspective; someone requests the metadata on a resource - and the server responds with both server and user managed metadata
14:58:24 ... there is no issue from the implementation side. You just send all of the metadata.
14:59:14 acoburn: Let's wrap it here for the week.
14:59:33 ADJOURNED
15:00:24 acoburn has left #lws
16:00:30 Zakim, bye
16:00:30 leaving. As of this point the attendees have been gibsonf, TallTed, eBremer, acoburn, ericP, jeswr, bartb, dmitri
16:00:30 Zakim has left #lws
16:00:33 RRSAgent, bye
16:00:33 I see no action items
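Editorial sketch, not part of the meeting record: the discussion mentions that Linksets [RFC9264] let you go between HTTP Link headers and a JSON document. A much-simplified illustration of that mapping follows. The function name, the `storage.example` URLs and the `.meta` / `.acl` suffixes are hypothetical, and the parser only handles plain `rel` and simple string attributes such as `type`; it ignores details the RFC covers (an `anchor` parameter, `hreflang`, internationalized `title*`, commas inside quoted strings).

```python
import json
import re

# One link-value: <target> followed by zero or more ;name=value parameters.
LINK_RE = re.compile(r'<([^>]*)>((?:\s*;\s*[\w*-]+\s*=\s*(?:"[^"]*"|[^;,\s]+))*)')
PARAM_RE = re.compile(r';\s*([\w*-]+)\s*=\s*(?:"([^"]*)"|([^;,\s]+))')


def link_header_to_linkset(anchor: str, header: str) -> dict:
    """Convert a Link header value into a linkset-style JSON structure.

    `anchor` is the URL of the resource the header was returned for.
    """
    context = {"anchor": anchor}
    for target, raw_params in LINK_RE.findall(header):
        params = {}
        for name, quoted, bare in PARAM_RE.findall(raw_params):
            params[name.lower()] = quoted or bare
        # A single link may carry several space-separated relation types.
        rels = params.pop("rel", "").split()
        target_object = {"href": target, **params}
        for rel in rels:
            context.setdefault(rel, []).append(dict(target_object))
    return {"linkset": [context]}


if __name__ == "__main__":
    # Hypothetical headers for a photo with a description and an ACL resource.
    header = (
        '<https://storage.example/photo.jpg.meta>; rel="describedby"; '
        'type="text/turtle", '
        '<https://storage.example/photo.jpg.acl>; rel="acl"'
    )
    linkset = link_header_to_linkset("https://storage.example/photo.jpg", header)
    print(json.dumps(linkset, indent=2))
```

In this shape each relation type (`describedby`, `acl`) becomes a key whose value is a list of target objects, which is what makes the same information expressible either as headers or as a separate JSON resource.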