Tim Berners-Lee
Date: 2022-12-21, last change: $Date: 2022/12/26 20:24:26 $
Status: personal view only. Editing status: first draft.

Up to Design Issues


Stuff in your Pod


What should be in your Solid Pod? Anything, of course. Any data in your life, across the data spectrum from public to private and everything shared with communities in between. The activities in your life you currently do on your phone, on a web app, or on any device at all. And activities which currently you can't really do online but will be able to in future. This data will vary hugely in size and shape, and immutability, and its needs for speed and its needs for security may be very different. But still it must be organized in a clear way, an extensible way, and a way which allows interoperability between different applications. To be extensible across these dimensions, we must give applications the power to configure this data in the user's pod in a way which meets these requirements, and also works well for that particular app. We expect different apps to often share common design patterns for similar types of data. We do this principally by initially keying data by its RDFS Class, not by the particular app which wrote it. But then different apps need to be able to create very specific structures including many different types. We expect the classes of data used by different apps to overlap a lot. We track and use its provenance carefully and carefully control its access, to be able to model the way we really trust data, people and bots in this very rich, varied, and valuable system. Keeping stuff related to the same activity together in a pod, even though it is of many different types, is valuable, because the same access and trust conditions apply and it is much simpler to get those right.


The Variety of Stuff

When we think in general about the structures for organizing data in a pod, we need to think about the huge variety of data which there will be. It should be able to include any data in your life, across the data spectrum from public to private and everything shared with communities in between. The activities in your life you currently do on your phone, on a web app, on a laptop, or on any device at all. Gaming systems, public kiosks, Automatic Teller Machines. This will include activities which currently you can't really do online but will be able to in future. Maybe sing in a choir together, say, or share yourself in real-time ultrasound scan form. This data will vary hugely in a number of dimensions which we already see to a limited extent:

Size
It will vary in size, from single-byte information ("The Doctor is [In]") through medium-sized things like events and photos, through to huge things like genome data, and scientific data from climate change experiments.
Shape
It will vary in shape, from simple table data to complex graphs of many different connected objects of different classes, such as a compilation of all of the data that a journalist has relating somehow to one interesting incident. Some shapes, like events and contacts, may be ubiquitous, and some, like births and deaths, will be rare.
Immutability
It will vary in whether it is, once written, unchangeable. Many documents, like birth certificates, driver's licenses and passports, and the digitally signed certificates that one has taken a course, are just issued once and never change. The same is also true of the archive of chat messages and social actions of past days. Immutability is a useful property. It means that you can keep cached copies of these things forever. You can store them on write-once devices, like optical disks. You can refer to them by their cryptographic hash as another name.
Speed
It will vary in the sort of speed of access a person needs. You may be happy for your calendar events to synchronize every few minutes or every few hours. You expect messages in a chat to be as instant as they can be. In general, when you are collaboratively editing anything, you want the changes each person makes to be propagated to all people and devices involved as fast as possible. Making that work well, and automatically, in the Solid platform, is a huge selling point for developers and users.
Security
Across the expanse of the activities of someone's life, the level of security needed varies. You may decide that birth certificates and the documents proving ownership of your house, plus a few old photos, are the things you really don't want to lose, and you put them in a box at the bank. But you may not worry so much about a movie you bought and will watch tonight. This splits into three different branches: how likely it is to be lost, to be faked, or to be read by the wrong person. As these aspects of security depend on things including the underlying pod service, we may want our system to limit which pods (and devices) different data is in fact stored on, and to use client-side encryption.
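
The point made under Immutability, that unchanging content can be referred to by its cryptographic hash, can be sketched roughly as follows. This is only an illustration: the "sha256-" prefix, the function names, and the sample certificate bytes are assumptions made for the example, not part of Solid.

```python
import hashlib


def content_name(data: bytes) -> str:
    """Name a piece of immutable content by the SHA-256 hash of its bytes.

    The "sha256-" prefix is a hypothetical naming convention for this sketch.
    """
    return "sha256-" + hashlib.sha256(data).hexdigest()


def verify(data: bytes, expected_name: str) -> bool:
    """Check that a cached copy still matches the name it was stored under."""
    return content_name(data) == expected_name


# Hypothetical example document: a certificate that is issued once.
certificate = b"Certificate: course completed"
name = content_name(certificate)
```

Because the name is derived from the bytes, a copy cached forever or kept on a write-once device can be checked against its name at any later time.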

Interlinking and interconnection

Despite this range of vastly different types of data, it must still be organized in a clear way, an extensible way, and a way which allows interoperability between different applications. In the Solid project, we use Linked Data, i.e. RDF as links, and we have decided to dispatch the functions that handle things based on the RDF Class. But it is more complex than just lining up an app for each Class, in the Solid world, where a mantra is that you can do anything with anything. Whatever you put on your pod, I can bookmark it or like it on my pod, using my favorite app. People -- developers and in fact users -- will be thinking up new sorts of social action, new forms of organization, and so on, all the time.
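
Dispatch by RDF Class, where several apps may handle the same class, might look something like the sketch below. The registry class, the app names, and the choice of class IRIs are all assumptions made for illustration; this is not an actual Solid API.

```python
from collections import defaultdict
from typing import Dict, List

# Class IRIs picked for illustration only.
PERSON = "http://www.w3.org/2006/vcard/ns#Individual"
EVENT = "http://schema.org/Event"


class ClassDispatcher:
    """Registry from RDF class IRI to the apps that can handle it."""

    def __init__(self) -> None:
        self._apps_by_class: Dict[str, List[str]] = defaultdict(list)

    def register(self, app: str, *class_iris: str) -> None:
        """Record that an app can handle each of the given classes."""
        for iri in class_iris:
            self._apps_by_class[iri].append(app)

    def apps_for(self, class_iri: str) -> List[str]:
        """Return the apps registered for a class, in registration order."""
        # .get avoids creating an empty entry for unknown classes
        return list(self._apps_by_class.get(class_iri, []))


dispatcher = ClassDispatcher()
dispatcher.register("AddressBook", PERSON)
dispatcher.register("Calendar", EVENT, PERSON)
dispatcher.register("Bookmarks", PERSON, EVENT)
```

The mapping is deliberately many-to-many: one class can be handled by several apps, and one app can handle several classes.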

So classes and apps are related but not 1:1, as we are reminded by the old Semantic Web Metro diagram.

Apps are like stations on the metro where several RDF classes (metro lines) meet.
Data about people, the yellow line, is used in an event, in the calendar, in the CRM system, and in the Address Book app, where it connects to the geospatial (green) line. And so on.

So all those apps should use the same shape for contact information.

We expect the classes of data used by different apps to overlap a lot. So those designing new shapes in their app should be conscious of that.

To be extensible across these dimensions, we must give applications the power to configure this data in the user's pod in a way which meets these requirements, and also works well for that particular app.

For example, take an app for running a meeting. A meeting typically involves, say, a poll among the group as to when they can meet, then a time and place, an agenda, groups of people invited and attending, a video call and/or chat during the meeting, the minutes recorded, and action items taken by people. It is often part of a series of similar meetings.

An extensible way to do this is to have the functionality of the poll be a generic one usable for other surveys, and to share the UX for dealing with groups with Contacts apps and anything else on the yellow metro line.

Because the Solid Pod has a folder structure (strictly, LDP containers), the meeting can be managed in a main folder for the meeting, and then subfolders can be made within that for the poll, for groups (if they are new groups not just existing groups in the user's contacts), for the video call link, the agenda, the minutes, the action items.

Data within different parts of the app can be saved in subfolders of subfolders, like meeting, action item, chat, message.
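
The folder layout described above can be sketched as a function that lists the containers an app might create for one meeting. The folder names, the pod URL, and the helper functions are hypothetical, chosen only to mirror the parts of a meeting named in the text.

```python
from typing import List

# Parts of a meeting named in the text; the folder names are hypothetical.
MEETING_PARTS = ["poll", "groups", "video-call", "agenda", "minutes", "action-items"]


def container(parent: str, name: str) -> str:
    """Return the URL of a child container; containers end with a slash."""
    return parent.rstrip("/") + "/" + name + "/"


def meeting_layout(series_root: str, meeting_id: str) -> List[str]:
    """List the container URLs an app might create for one meeting."""
    meeting = container(series_root, meeting_id)
    return [meeting] + [container(meeting, part) for part in MEETING_PARTS]


layout = meeting_layout("https://alice.example/meetings/weekly/", "2022-12-21")

# Subfolders of subfolders: meeting / action item / chat
item_chat = container(container(layout[-1], "item-1"), "chat")
```

Here `layout[0]` is the main folder for the meeting and the remaining entries are its subfolders, while `item_chat` shows how a chat about one action item nests further down.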

Role based

Systems like this typically use concepts of role, such as, say, Owner, Administrator, Invitee, Participant, and Observer. Look at GitHub organizations, repos and Gitter rooms, for example. The app may just define, or the owner configure, which roles can do what. Who can edit the minutes? Who can read the minutes? Who can add someone to the invitees? In the action item tracker, who can comment on the task? If a little chat is started about the task of fulfilling an action item, then who can read, and who can write, that chat? Who can deem the item done? In this model, answering these questions is the job of the app.

These things may be recursive: starting a new issue tracker around a chat, or starting a new chat around an issue, and so on.

The app, then, lays out a folder for the meeting series and for each meeting, and the parts of the meeting. It then splits the data into resources where specific roles get specific access to specific resources. The owner can edit the config, and participants can read it. This makes it possible for the app to deliver the role-based functionality by mapping it onto actual Solid groups and individuals in the access control system.
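
A minimal sketch of that mapping, from app-level roles to the access that individual agents get on each resource, is given below. The policy table, the role membership, the WebIDs, and the mode names "Read" and "Write" are assumptions made for the example; a real app would write the result into the pod's access control system, typically by granting access to groups rather than by expanding to individuals as done here.

```python
from typing import Dict, Set

# Which role may do what to which resource (app-level policy, hypothetical).
ROLE_ACCESS: Dict[str, Dict[str, Set[str]]] = {
    "config": {"Owner": {"Read", "Write"}, "Participant": {"Read"}},
    "minutes": {"Owner": {"Read", "Write"}, "Participant": {"Read"}, "Observer": {"Read"}},
}

# Who holds each role (hypothetical WebIDs).
ROLE_MEMBERS: Dict[str, Set[str]] = {
    "Owner": {"https://alice.example/profile/card#me"},
    "Participant": {"https://bob.example/profile/card#me"},
    "Observer": {"https://carol.example/profile/card#me"},
}


def agent_access(resource: str) -> Dict[str, Set[str]]:
    """Expand role-level rules into per-agent access modes for one resource."""
    result: Dict[str, Set[str]] = {}
    for role, modes in ROLE_ACCESS[resource].items():
        for agent in ROLE_MEMBERS.get(role, set()):
            # An agent holding several roles gets the union of their modes.
            result.setdefault(agent, set()).update(modes)
    return result


config_access = agent_access("config")
```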

When things are created within things, like a to-do list within a chat, then typically roles are handed down, though a chat about a task within a meeting could be further restricted by only allowing people interested in the task to comment, for example. Maybe a meeting allows the spinning off of 1:1 coffee sessions, say, for the sake of an example. It is obviously crucial that the user interface make it very clear who is in fact involved and has access. Keeping the systems simple makes things easier. Deviating from how things work in real life, or in existing systems, is a risk.
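
Handing access down from a parent thing to a child, with an optional restriction on who may comment, could be sketched like this. The function names, the short agent names, and the use of "Write" to stand for commenting are assumptions made for the example.

```python
from typing import Dict, List, Optional, Set


def hand_down(parent: Dict[str, Set[str]],
              commenters: Optional[Set[str]] = None) -> Dict[str, Set[str]]:
    """Copy access from a parent thing to a child thing.

    If `commenters` is given, only those agents keep "Write" on the child;
    everyone else keeps whatever other modes they had.
    """
    child: Dict[str, Set[str]] = {}
    for agent, modes in parent.items():
        kept = set(modes)  # copy, so the parent's access is left untouched
        if commenters is not None and agent not in commenters:
            kept.discard("Write")
        child[agent] = kept
    return child


def who_has_access(access: Dict[str, Set[str]]) -> List[str]:
    """Lines a user interface could show so it is clear who can do what."""
    return [f"{agent}: {', '.join(sorted(modes))}"
            for agent, modes in sorted(access.items()) if modes]


meeting = {"alice": {"Read", "Write"}, "bob": {"Read", "Write"}, "carol": {"Read"}}
task_chat = hand_down(meeting, commenters={"alice"})
```

Calling `who_has_access(task_chat)` gives a plain listing of agents and modes, which is the kind of summary the user interface needs to make visible.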

Other Considerations

Granularity

In desktop systems, users are allowed to see the file structure, but it is mainly managed by the apps. On the Desktop or in the Downloads folder, things are just dropped in a heap, and left in chronological order (Downloads) or in a 2D space (Desktop). Other things are arranged in more powerful structures, like Photo Libraries, Address Books, Music & Media Libraries, and libraries of financial data. The folder structure within something like a Photo Library is hidden from the user, who instead can use tools like Photo Album structure, query-based views, and so on. Things like thumbnail generation and local caching of data from public databases are all hidden from the user. So these systems work well when the type index is used for large-granularity things: Photo Libraries rather than individual photos, recipe collections rather than individual recipes.
Library                  Index by                           e.g. Item
Address Book             name, email                        Contact Card
Calendar                 time                               Event
Photo Library            time, author, subject, location    Photo
Media Library            artist, composer, album, genre     Song
Medical Records Library  date, source, type                 Observation, Test, Scan, etc.
Fitness Library          date, sport                        Run/Ride, etc.

and so on.

This is important, partly because the top-level index scales better when filled with a few large things than when filled with a lot of small things. But it is also important because within a library type, things are indexed internally more richly than just by name and type: music tracks are typically indexed by artist, album, and genre. While the index lookup in future may all be done by SPARQL (etc.) queries on the pod, for now apps have to make their own indexes and organize stuff in a logical way for the domain.
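
As a rough illustration of an app making its own indexes within a library, the sketch below builds one lookup table per indexed field. The song records, the field names, and the `build_indexes` helper are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, List


def build_indexes(items: Iterable[dict], fields: List[str]) -> Dict[str, Dict[str, List[str]]]:
    """Build one lookup table per field: field value -> list of item ids."""
    indexes: Dict[str, Dict[str, List[str]]] = {f: defaultdict(list) for f in fields}
    for item in items:
        for field in fields:
            if field in item:  # items missing a field are simply not indexed by it
                indexes[field][item[field]].append(item["id"])
    return indexes


# Hypothetical media library entries.
songs = [
    {"id": "track-1", "artist": "Artist A", "album": "Album X", "genre": "Jazz"},
    {"id": "track-2", "artist": "Artist A", "album": "Album Y", "genre": "Blues"},
    {"id": "track-3", "artist": "Artist B", "album": "Album Z", "genre": "Jazz"},
]
media_index = build_indexes(songs, ["artist", "album", "genre"])
```

The top-level type index would then need only one entry for the whole media library, while lookups by artist, album, or genre are answered inside the library.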

Incoming links and type indexes

The structure of different data in nested folders described above is independent of things like the type indexes (My Stuff), bookmarks, and so on, where anyone can bookmark anything. So a user can decide that some activity deep inside in fact deserves to be listed on their public profile. So there can be incoming links into things in the structures.

Common patterns

Data about different things will have different shapes, typically using properties in different ontologies. However, those shapes might have things in common, so it is worth looking out for patterns like summary indexes, inboxes, time series stored by day in dated folders, local caches of remote pod data such as the names of remote things, and so on.
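
One of these patterns, a time series kept in dated folders, can be sketched as a small path helper. The year/month/day layout and the pod URL are assumptions made for the example.

```python
from datetime import date


def dated_folder(root: str, day: date) -> str:
    """Return the container for one day's items, laid out as year/month/day."""
    return f"{root.rstrip('/')}/{day.year:04d}/{day.month:02d}/{day.day:02d}/"


# Hypothetical chat archive: each day's messages go in that day's folder.
folder = dated_folder("https://alice.example/chat/", date(2022, 12, 21))
```

Past days' folders are then natural candidates for caching, since an archive of messages from earlier days no longer changes.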

Trust

We should track and use the provenance of data carefully and carefully control its access, to be able to model the way we really trust data, people and bots in this very rich, varied, and valuable system. A sophisticated way of tracking provenance is to have the data digitally signed by keys of agents deemed to be trusted. A more lightweight way is to know that they wrote that data because they are the only ones that have write access to a resource.
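
The lightweight approach can be sketched as a check on a resource's access settings: if exactly one agent can write, that agent is taken to be the author. The data structure, the mode name "Write", and the agent names are assumptions made for the example, and the inference only holds if the access settings themselves can be trusted and have not changed since the data was written.

```python
from typing import Dict, Optional, Set


def sole_writer(access: Dict[str, Set[str]]) -> Optional[str]:
    """Return the only agent with write access, or None if there is not exactly one."""
    writers = [agent for agent, modes in access.items() if "Write" in modes]
    return writers[0] if len(writers) == 1 else None


# Hypothetical access settings for two resources.
minutes_access = {"alice": {"Read", "Write"}, "bob": {"Read"}}
shared_notes_access = {"alice": {"Read", "Write"}, "bob": {"Read", "Write"}}
```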

Conclusion

Using the folder structure of Solid pods to arrange different resources involved in collaborative applications, combined with the folder-oriented features of the access control system, and a mapping of app-level role based permissions to different things within an activity, allows a user's pod to be used for all kinds of different activities that we can imagine now, and hopefully those we cannot currently imagine, though others will, in the future. In summary, we find keeping stuff related to the same activity together in a pod -- even though it is of many different types -- is valuable, because the same access and trust conditions apply and it is much simpler to get those right.

