Interview: Ora Lassila on Web Apps Powered by Linked Data

Ora Lassila

Ora Lassila (Nokia) is a long-time W3C participant and member of the W3C Advisory Board. He also is one of the early proponents of the Semantic Web. I spoke with him about getting the best that Web apps and linked data have to offer — together.

IJ: I read your CIDOC 2012 keynote “Love Thy Data (or: Apps Considered Harmful).” I would paraphrase the thesis as “Data have longevity; the life of code is short.” Why do you portray data and apps as being in conflict?

OL: I am not hostile to software per se. But you get the feeling that everything today is about apps! It’s like a curse. To me, it borders on the profane to say you need a specific app for some narrow activity.

IJ: What do you mean?

OL: I see two categories of apps. The first lets you use features of your device, such as your camera. These apps are necessary.

The second kind packages and insulates content from the rest of the Web. Every organization seems to feel it has content that warrants being packaged as an app. But when they insulate their content from the rest of the Web, bad things happen. We lose all the web goodness. Many of the original principles that underlie the web are really good ones, and give us linking, bookmarking, sharing bookmarks, and so on. We lose that when people create islands of content. This is a step backwards to life before the Web.

IJ: Why do you think people do this?

OL: One reason is that some people think that apps are the predominant way to monetize content in the mobile world. A lot of people talk about mobile ads. I’m not entirely convinced that monetization through mobile ads in native apps is happening at a large scale. On the other hand, I do see that advertising on the Web is highly lucrative for some companies.

IJ: And yet native apps have done well.

OL: Apps don’t serve users, they serve publishers of software. They provide a way to package and deploy content. What’s more, I don’t think users care. Users care about good content. And good content is that which integrates with the larger web. Installation and delivery are side issues. In a sense, the whole notion of an app is an antiquated idea.

IJ: Do you see particular obstacles to HTML5 becoming dominant?

OL: I think early on there was a need to create native apps for various reasons such as performance or to achieve a particular required user experience. But things have changed —very rapidly— and now we can provide users with that functionality using web technology. Performance may still be an issue if we are talking about accessing a particular hardware capability. But it’s certainly not an issue in the case of delivering a newspaper article.

The Financial Times has a Web app that is more popular than their native app. Amazon has been building their Kindle cloud reader, their app for consuming Kindle content not on a Kindle. It’s built with HTML5 to replace the native app they had built earlier. That means the content is delivered through the web and not through a native app that has to be installed through an app store.

When you are just talking about delivering content, we have crossed the threshold and it’s now possible to do so using Web technology on mobile devices. As a bonus, you get to integrate your content with other parts of the Web. And I think it’s great that W3C is getting more involved in games. I think that when we truly get to the point where games can be delivered as web apps we’ll have made huge progress.

IJ: Web APIs are proliferating. How does this relate to your ideas for data integration?

OL: I see the Open Web Platform and Linked Data together offering a solution. Good APIs are a step in the right direction. They give apps access to the same functionality and data. But eventually things will become very complex because you have a large number of APIs that you need to take into account. The challenge in API-based integration is: what do you do about use cases you did not anticipate?

In the longer term we must provide integration between silos. I want to have data that can move freely, that has been engineered to integrate easily with other data. Linked data standards are a very good candidate solution for achieving this. Other options may come along, but today linked data is the top contender.

IJ: How would that improve integration?

OL: There are 3 parts to an app: data, logic, presentation. In many apps, there is little logic, and it is possible to use Web technology for both the data and presentation. But the way apps are built today in general, all three are insulated from the rest of the world. This makes it harder to learn about and reuse the data. Native platforms do not really support (let alone encourage) integration with other applications. Where you do see native app integration, it is because OS vendors have packaged them.

IJ: So why aren’t the world’s app developers already slurping up masses of linked data to reap the benefits?

OL: Good question. I think the biggest stumbling block for Linked Open Data (LOD) is: what’s the business model? People understand the indirect benefits of linked data, but few people have a handle on how to monetize it directly. For example, because data is often consumed by software agents, attaching advertising to the data is unlikely to be effective. It is also the case that people hesitate to give up their data. For instance, museums may be interested in opening up catalogs and in people being able to combine slices of catalogs. But they may be worried that by opening their catalog they are giving away something.

There’s also a cost issue. Publishing data is great, but not free. For example, there is a cost to running servers capable of handling complex SPARQL queries.

IJ: But people sell data all the time. Why is this more of a challenge with linked data? Do you think the openness of the format plays a role?

OL: Most definitely. But you are correct, people do sell data. This should be possible, but it’s a harder question than it first seems. I think we need to do more work in this area.

IJ: You said that people hesitate to share data. Is there a place for digital rights management here?

OL: Digital rights management with linked data can be tricky because linked data lets you make use of fine-grained data. Current DRM technology is well-suited for much larger chunks. Semantic Web technology allows us to track the provenance (and potentially ownership) of small pieces of data, even after those pieces have been combined with data from many other sources into something bigger. This is absolutely one of the most attractive things about the technologies we’ve defined: fine-grained integration is possible, and you can take apart the data later with reliable provenance information. Provenance information is important in a world of aggregation since we may prefer some sources more than others for various social reasons.
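The provenance point can be made concrete with a small sketch. It is not from the interview: it assumes each statement is stored as a quad whose fourth element names its source, roughly the role a named graph plays in an RDF dataset. The museum identifiers, the `ex:` names, and the helper functions are all hypothetical.

```python
from collections import defaultdict

# Each statement is a quad: (subject, predicate, object, source).
# The source field is an assumption of this sketch; all names are made up.
MUSEUM_A = "http://example.org/museum-a"
MUSEUM_B = "http://example.org/museum-b"

museum_a = [
    ("ex:painting42", "ex:title", "Harbour at Dusk", MUSEUM_A),
    ("ex:painting42", "ex:creator", "ex:artist7", MUSEUM_A),
]
museum_b = [
    ("ex:artist7", "ex:name", "A. Painter", MUSEUM_B),
    ("ex:artist7", "ex:born", "1871", MUSEUM_B),
]


def merge(*datasets):
    """Combine datasets into one set of quads; exact duplicates collapse."""
    merged = set()
    for dataset in datasets:
        merged.update(dataset)
    return merged


def describe(quads, subject):
    """Return every (predicate, object, source) recorded for a subject."""
    return sorted((p, o, g) for s, p, o, g in quads if s == subject)


def split_by_source(quads):
    """Take the merged data apart again, grouped by where it came from."""
    by_source = defaultdict(set)
    for s, p, o, g in quads:
        by_source[g].add((s, p, o))
    return dict(by_source)


merged = merge(museum_a, museum_b)
parts = split_by_source(merged)
```

Because every statement carries its source, `describe(merged, "ex:artist7")` can say which museum asserted each fact, and `split_by_source` recovers each contributor's slice after aggregation. A consumer that trusts some sources more than others could filter on the fourth field.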

IJ: What else do you think makes linked data technology attractive?

OL: I work in a big data analytics team. We’ve noticed that having descriptions of data —of what the data means, what you can do with it, and so on— adds tremendous value to the data. We want to describe the bits so that others can make use of them. That’s where Semantic Web technology is so attractive. Data silos arise even within a single organization when people don’t describe their data. The better the description, the greater the chances you or anyone else will reuse your data. Ontologies make data semantics accessible.

IJ: What can we do to promote this vision of linked-data driven apps?

OL: W3C has done a good job creating a set of specifications that can be used to build good things. Now I think we need more education to help people make the best use of what we’ve got. For instance, we’ve just started the Linked Data Platform Working Group on the application of these linked data technologies. We also need to help developers reap these benefits through talks and sharing good practices. I’m pleased to see more and more conferences geared to the industrial use of linked data technology. We need to hear the success stories.

IJ: Does JSON give you what you need for data-driven apps?

OL: JSON is easy to use for some situations. Developers like it, the data are easy to use. But JSON doesn’t solve the difficult problems we set out to solve with the Semantic Web.

IJ: Such as?

OL: While there are discussions about improving JSON, the ones I have seen are focused on syntax —like whether JSON parsers should allow the trailing comma. Please let’s elevate the discussion above syntax. All problems with syntax have already been solved. There is no need to define another syntax, ever. Minimally, I am interested in questions like namespaces to facilitate global sharing.

Schema.org is a commendable effort on the data model front, and will help people reuse models and avoid incompatibilities. However, schema.org does not clearly state how to extend their models. When people see data they think they know what it means. They have a hard time discerning what they know from what is actually embodied in the data itself. People end up writing software that encapsulates some of the data semantics, keeping them implicit, and thus limiting reuse. We solved this problem in the early days of the semantic web.

I fear that with schema.org we are in the same situation we were in with microformats, where the lack of extensibility of microformats became an issue. With each new format, you had to write new software, which doesn’t scale. The more software you write the more complex your system becomes. There are questions we should be able to solve once and then move up the stack to solve more interesting problems.
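As an illustration of the namespace question raised above, here is a minimal sketch, not from the interview. It assumes two publishers who both use the short key `title` with different meanings, and a per-publisher context that maps short keys to global names, loosely in the spirit of what a JSON-LD context does. The records, the contexts, and the `example.org` IRIs are hypothetical.

```python
# Two publishers use the same short key with different meanings.
book_record = {"title": "Moby-Dick"}
person_record = {"title": "Dr."}

# Hypothetical contexts: each maps a local key to a globally unique name.
book_context = {"title": "http://example.org/book#title"}
person_context = {"title": "http://example.org/person#honorific"}


def expand(record, context):
    """Replace each short key with the global name its context assigns."""
    return {context[key]: value for key, value in record.items()}


# Merging the raw records silently loses the book title.
naive = {**book_record, **person_record}

# Merging after expansion keeps both values, each under an unambiguous name.
qualified = {
    **expand(book_record, book_context),
    **expand(person_record, person_context),
}
```

With plain keys the meaning of `title` lives only in each consumer's code. With global names the distinction travels with the data, so the two sources can be combined without a purpose-built adapter.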

IJ: What do you think is missing from the developer toolkit to build linked data powered apps?

OL: There’s plenty of software already. The obstacle is more mindset than tools. We have some exciting technology to solve these problems. We’ve got some remaining issues but for me the real obstacle remains the business model. I encourage readers to think about solving that problem.

IJ: What mindset would you encourage?

OL: People need to think about engineering for serendipity. It no longer suffices to design for a well-defined and finite set of requirements. You need to design for change and unanticipated reuse.

IJ: What about designing for mobile?

OL: Several years ago Tim Berners-Lee made the case at Nokia for “One Web,” including on mobile. Tim argued that if you separate mobile devices from the web, you are not creating a utopia for mobile devices, but rather a mobile ghetto. That remark resonated within the company. Since then, Nokia has become a big proponent for making standard Web technologies applicable on mobile devices. This is an area where W3C had a strong impact on Nokia’s activities. When I joined Nokia in 1996, nobody knew what W3C was or what we should do about the Web. I would argue that W3C is now one of the most important standards organizations that we work with. And has been for a while.

IJ: Any other notes on W3C given your experience?

OL: W3C has come a long way with respect to how the standards-making machinery works. In the early years, we discovered ways of working together (both within W3C and between Nokia and W3C). Over the years —thanks significantly to Art Barstow’s contribution— the W3C/Nokia relation has become more clearly defined. The W3C Patent Policy also had a big impact.

IJ: Ora, thanks for your time!

Photo credit: Grace Lassila

3 thoughts on “Interview: Ora Lassila on Web Apps Powered by Linked Data”

  1. This interview makes some good points. But I disagree with the following two sentences:

    1. All problems with syntax have already been solved.
    2. There is no need to define another syntax, ever.

    I agree that the syntactic issues for developing efficient compilers for mapping one notation to another were mostly solved in the 1960s. But there are still a huge number of human factors issues and efficiency issues about developing good notations for different purposes on different platforms and devices. By that, I would include graphic notations as well as linear notations. UML, for example, has proved to be much more readable and humanly usable than any of the Semantic Web notations. Much more R & D is required to develop notations and tools that the average programmer can use (and the average teacher can teach).

    I agree that JSON is primarily a syntactic notation. I call it “LISP with curly braces”. I also agree that focusing on syntactic tweaks for JSON is a trivial matter. For human factors, however, JSON is an immense improvement over the XML-based notations.

    I would recommend JSON as the primary notation for publishing RDF resources and OWL ontologies. If you adopt JSON as primary, there is no need for auxiliary notations such as N3 and Turtle. The XML-based notation should be limited to applications that require short XML snippets intermixed with text.

    As for the second sentence, I would say that there is very little need to define a new logic-based semantics. Instead, all the logics used for the Semantic Web should be based on a very general, highly expressive common semantics. RDF, RDFS, OWL, SPARQL, RIF, and many other useful logics could then be defined as restricted syntaxes for expressing different subsets of the common semantics.

    I suggest UML as an example of a family of different graphic syntaxes that are specialized for different purposes. Unfortunately, the UML designers made the same mistake as the W3C: They did not define a common logic-based semantics for all the notations.

    Recommendation: The W3C should focus on a common semantics for all logic-based components. The semantics for each notation, graphic or linear, would be specified by its mapping to the common semantics. Notations such as JSON and UML can be incorporated into the Semantic Web toolkit just by defining a mapping to the common semantics.

    The Object Management Group (OMG) has a great deal of experience in integrating UML and other notations with software development. I suggest that the W3C should collaborate with the OMG in developing a common semantics for both groups. That effort would be of immense value to both — and to the wider software development community.

  2. John: I fear I may not have been clear enough, but I was trying to be a bit provocative with those statements. This was fueled by my long-time frustration about the fact that most people are focused on syntax at the expense of (almost) anything else. Some of the things you say in your comment are evidence of this phenomenon.

  3. Ora, I agree that syntax is the least important part of any semantic system. The vision for the Semantic Web that I found the most compelling is the one that Tim Berners-Lee wrote in 2000: http://www.w3.org/2000/01/sw/DevelopmentProposal

    In that report, he proposed a Semantic Web Logic Language, SWeLL, as a “unifying language for classical logic”. In an early version of the SW layer cake, he showed that SWeLL would include propositional logic, first-order logic, and higher-order logic. Above and below the SWeLL bus, he showed an open-ended variety of heterogeneous components.

    In that report, Tim was not clear about the syntax for SWeLL. The most general approach would be to define SWeLL with an abstract syntax and declare that any concrete notation that had a formal mapping to the abstract syntax would have equal status. That would immediately end all syntax wars.

    That vision was broad enough to accommodate everything that has been implemented so far. But it also makes room for an open-ended variety of innovations. It even makes room for nonclassical logics that could use the SWeLL bus for the classical subsets.
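The idea of one abstract syntax with many equal-status concrete notations can be sketched briefly. This is an illustration only, not SWeLL or any W3C notation: it assumes the abstract syntax is simply a set of (subject, predicate, object) triples, and both concrete notations and their parsers are hypothetical.

```python
import json


def parse_json_notation(text):
    """Concrete notation 1: a JSON array of [subject, predicate, object]."""
    return frozenset(tuple(item) for item in json.loads(text))


def parse_line_notation(text):
    """Concrete notation 2: one 'subject predicate object .' per line."""
    triples = set()
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        # Drop the trailing dot, then split into exactly three tokens.
        subject, predicate, obj = line.rstrip(".").split()
        triples.add((subject, predicate, obj))
    return frozenset(triples)


as_json = '[["ex:a", "ex:knows", "ex:b"], ["ex:b", "ex:knows", "ex:c"]]'
as_lines = "ex:b ex:knows ex:c .\nex:a ex:knows ex:b .\n"

# Different surface forms, identical abstract content.
same = parse_json_notation(as_json) == parse_line_notation(as_lines)
```

Anything written against the abstract form, such as a query or a merge, then works unchanged whichever notation the data arrived in. That is the sense in which a formal mapping gives every notation equal status.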
