We describe the development of “Charlie,” an AI assistant that works for the user by leveraging personal data securely stored in a Solid Pod. Building on a 2017 proposal for a trustworthy and user-centric AI, the work highlights progress made by engineers at Inrupt in late 2024. Using a simulated dataset for a fictional user, “Bob,” the team integrated an advanced large language model (LLM) with Bob’s personal data to demonstrate the transformative potential of combining AI and Solid Pods.
By accessing rich, structured, and personalized data, Charlie provided responses far superior to generic AI systems, as shown in a use case involving running shoe recommendations tailored to Bob’s fitness and lifestyle data. This personalization exemplifies the next level of AI capability, offering unprecedented usefulness while maintaining user trust.
We underscore the critical role of robust data infrastructure, including Solid Pods and the semantic web, in driving AI systems, and the dual role of AI in both populating and leveraging linked data stores, paving the way for a dynamic ecosystem where data graphs mediate AI interactions. We paint an exciting vision for integrating personal data, semantic web principles, and advanced AI to create tools that truly serve users.
In 2017 I wrote a piece about how I wanted an AI that worked for me. An AI which I could trust, and with which I could share all my personal data, which would therefore be much more effective than an AI which I did not trust, and which I could trust more than an AI which would try to sell me things.
Toward the end of 2024, engineers at Inrupt had a go at building Charlie. Crucial in this endeavor, they made a fake Bob, the user who uses Charlie in the original blog. They created a simulated data set for Bob's pod. They then took Claude [CL], the Large Language Model from Anthropic, and designed a system through which Bob could run the LLM and pass it his data wallet (Solid Pod).
The Charlie app has a button which allows Bob to grant or withhold consent. Here it is with consent not granted.
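The consent switch described above can be sketched in code. This is a minimal illustration of the idea, not Inrupt's actual implementation: the function names (`build_prompt`) and the shape of the pod data are assumptions made for the example.

```python
# Hypothetical sketch of a consent switch gating access to pod data.
# The function and data names are illustrative, not Inrupt's actual API.

def build_prompt(question: str, consent_granted: bool, pod_data: dict) -> str:
    """Assemble the LLM prompt, including pod data only when consent is granted."""
    if not consent_granted:
        return question  # the model sees only the bare question
    context = "\n".join(f"{key}: {value}" for key, value in pod_data.items())
    return f"User data (shared with consent):\n{context}\n\nQuestion: {question}"

# A simulated slice of Bob's pod, in the spirit of the demo's fake data set
bob_pod = {
    "weekly_mileage_km": 40,
    "preferred_terrain": "road",
    "recent_shoe_purchase": "Brooks Ghost 14",
}

question = "I need new running shoes. What should I buy?"
print(build_prompt(question, consent_granted=False, pod_data=bob_pod))
print(build_prompt(question, consent_granted=True, pod_data=bob_pod))
```

With consent off, the LLM receives only the question; with consent on, the prompt carries the personal context that makes a tailored answer possible.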
I need new running shoes. What should I buy?
Readers may be familiar with the sort of response given when there is no access to private data. It shows a wide knowledge of things public, and good logic in laying out the options in general. Claude explicitly says it needs to know more about Bob.
I'll help you find the right running shoes, but I'll need some key information to make a better recommendation. Could you tell me:
This information will help me suggest shoes that best match your specific needs.
Also, it's worth noting that while I can provide recommendations based on generally well-reviewed shoes and common running needs, the absolute best way to get the right running shoes is to visit a specialty running store where they can analyze your gait and let you try different options. They often provide free gait analysis and can watch how you run to suggest the most appropriate shoes for your specific running style.
Would you mind sharing those details so I can provide more targeted recommendations?
When Bob switches the consent switch to the other position and asks the same question, Charlie now has access to his personal data, and the results are very different:
Based on your profile, I can make a personalized recommendation for running shoes. Here's what I know about your running habits:
You're a regular runner with:
Relevant physical stats:
Given your profile, I'd recommend:
Top Picks:
Nike Vaporfly 3 or Alphafly 3
Saucony Endorphin Speed 3
Brooks Ghost 15
Additional Considerations:
Charlie bases the answer on its knowledge of Bob's running from his fitness data, but also other buying preferences presumably evident from financial data.
Clearly, Charlie in the mode with access to Bob's personal data is dramatically more useful than it is without. Even though a couple of years ago people were amazed at how useful ChatGPT and followers like Claude, Pi, and so on were, this is up another level. This is what I want.
There are disclaimers of course around the fact this is all based on simulated data. The data wallet, the pod which Bob shares with the LLM as context, is not that of a real person. It has simulated data from banks, from Strava, and so on, and building a real Charlie which anyone can use requires a lot more work. Extracting data from the world of APIs into a pod is a lot of work. But of course it is work which AI can help with. AI can build apps to import data of all kinds into your pod.
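The import work mentioned above can be pictured as a small pipeline: pull records from some external API and write each one into the pod as a resource. The sketch below is purely illustrative, under stated assumptions: `fetch_activities` stands in for a real API call (e.g. to a fitness service), and `Pod` is an in-memory stand-in, not a real Solid client library.

```python
# Hypothetical sketch of an importer pulling records from an external API
# into a pod. fetch_activities and Pod are stand-ins, not a real Strava
# client or Solid library.

def fetch_activities() -> list[dict]:
    """Stand-in for an external API call; returns fake run records."""
    return [
        {"date": "2024-11-01", "distance_km": 8.2},
        {"date": "2024-11-03", "distance_km": 10.5},
    ]

class Pod:
    """Minimal in-memory stand-in for a Solid Pod's resource store."""
    def __init__(self):
        self.resources: dict[str, dict] = {}

    def write(self, path: str, data: dict) -> None:
        """Store one record under a path, as a pod would store a resource."""
        self.resources[path] = data

# Import each fetched activity into the pod under a fitness container
pod = Pod()
for i, activity in enumerate(fetch_activities()):
    pod.write(f"/fitness/runs/{i}", activity)

print(f"{len(pod.resources)} activities imported")
```

A real importer would of course handle authentication, pagination, and mapping API fields onto linked-data vocabularies, which is exactly the kind of glue code an AI could help generate.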
From the system design point of view some things are very clear. Exciting times.