
W3C Workshop on Data Models for Transportation

Thursday 12th to Friday 13th September 2019, California

Workshop Report

Executive summary

Transportation has been seeing considerable disruption and rapid innovation in recent years, creating a multitude of new business opportunities, increasing safety, efficiency and flexibility. The emerging technical trends are being driven by information generated across a wide range of sources spanning all modes of transportation. Parallel advances in sensors, communications, cloud computing, analytics, geophysical mapping, machine learning, mobile devices, user interfaces and related areas have created a rich foundation. The main obstacle to achieving the full potential is a lack of compatibility and interoperability.

This wealth of information can be leveraged to improve the passenger experience, provide entertainment, improve efficiency, maintenance, safety, and convenience. It can reduce traffic congestion, harmful emissions and substantially raise the standard of living for our cities. All of this is possible if we can move away from silo approaches that inhibit information exchanges.

The workshop attracted a broad range of participants from Tech companies, Telematics Service Providers, Tier One suppliers, Automotive Manufacturers, Researchers, Regulators, Open Source Initiatives, Geographical Information System Solution Providers, Standards Development Organizations and Startups.

Specifically we had attendees from:

Apple, Autonomics, Baker McKenzie, BMW, Continental Automotive, Deutsche Telekom AG, Drive Sweden, ETSI, EURECOM, Federal Highway Administration, Fujitsu Limited, GENIVI Alliance, Geotab, Ghent University, HERE, iFOSSF, Linux Foundation, Microsoft, Mondeca, NIO USA, NMFTA, Open Geospatial Consortium, Ridar Systems, New York University, SRI International, Toyota, Uber, University of Southampton, University of Toronto, US Department of Transportation, Volvo, W3C/MIT

We shared experiences, use case studies, new directions and insights on what's needed for the next generation of transportation data standards.

The goal of increasing awareness of efforts and problems was achieved. The workshop clearly established the need for better coordination and identified concrete next steps to enact as a result.


The catalyst for holding this workshop was twofold.

First, the W3C Automotive Working Group (WG) is starting to see implementations of its Vehicle Information Service Specification (VISS) being deployed into production vehicles. VISS exposes GENIVI's Vehicle Signal Specification (VSS), a data model that provides a harmonized standard for the wealth of information available within modern vehicles. Other auto manufacturers currently each handle data differently and expose it through proprietary interfaces. While several automotive manufacturers in addition to those participating in the W3C Auto WG have expressed a need for this standard, they remain hesitant to adopt it, as rearchitecting their vehicles is a significant investment.

The W3C Auto WG sees the need for harmonizing this information not only within vehicles but in the cloud and with other standards efforts. We want to ensure the data we are exposing in-vehicle and bringing to the cloud fits into a broader transportation data marketplace.

The second motivator for holding this workshop was to address the limited awareness not just of our standards but of related ones, as well as of all the innovative prototypes, experiments and proprietary, commercial solutions. These efforts need to be better aligned in order to achieve the sort of widespread interoperability desirable for modernizing transportation.

Related Standards Efforts

The workshop had attendees from several Standards Development Organizations (SDO) actively creating standards around transportation:

A total of twenty-four SDOs were identified as working in the transportation space, the majority of which were not present.

Intelligent Transport Systems standards work has been going on since 1992, yet surprisingly few people attending the workshop were aware of its scope. This was particularly notable given that the audience was largely composed of technologists working in the transportation space. The same could be said of the other standards efforts, which points to a significant, wider problem that results in parallel instead of coordinated efforts.

In the workshop we learned of the need to better coordinate efforts to harmonize data architectures even within individual companies. Too often initiatives create their own architectures instead of leveraging existing solutions, creating islands of information (silos) that often lack the semantics, metadata and other data design choices that would significantly increase their future usefulness.

Nation-centric initiatives to harmonize data are compounding the problem, coming at the expense of international standard solutions, forcing manufacturers and solution providers to address regional regulatory requirements.

"Standards are the first step towards the holy grail of an interoperable, plug-and-play world where cities can mix and match solutions from different vendors without fear of lock-in or obsolescence or dead-end initiatives."
- Jesse Berst (Chairman, Smart Cities Council)

It starts with the data! Shared information relies most fundamentally on a common description of the underlying data itself. A major message throughout the workshop was that, setting aside the protocols, serialization, authentication and other aspects of the various standard and proprietary efforts underway, we should endeavor to speak the same language in the form of common data definitions when possible, at least at the cloud ontological level. A functional data marketplace would then flourish, broadly enabling varied partnerships and collaborations.

The lack of coordination is inhibiting the widespread interoperability potential, corresponding business opportunities and cascade of benefits to society.


The W3C Transportation Data Workshop covered a wide range of topics, summarized here and in minutes.


Not surprisingly, numerous data topics featured prominently at a workshop on transportation data. The term data can be flung around casually; it is not uncommon to hear a company representative say they have it without quantifying what it is they have or its range of potential uses beyond the original intent.

Not all data is created equal. As mentioned, too often data is produced for a single or narrow set of purposes, making it less useful if not outright useless in other contexts. What is available was acquired to solve a specific problem and probably lacks metadata on the sampling methodology employed, degree of precision and relationships. Additional data points might have been available and were simply not collected.

The answer isn't to try to attain and store all information either, as burden, bandwidth, storage and other costs will be incurred without necessarily adding value. There should be thoughtful curation of what is brought to the cloud; at times, evaluating and acting upon information at the edge is what is needed.

"The Data is permanent and enduring, and applications can come and go."
- Adnan Bekan (BMW)

The ability to evolve a solution is directly related to the information available. A survey of all feasible information that can be attained should be taken into consideration when data structures are designed even if some is not to be collected initially.

Information is not knowledge but required in order to attain it. Lack of knowledge is ignorance.

Ignorance \Ig"no*rance\, n. [F., fr. L. ignorantia.]

1. The condition of being ignorant; the lack of knowledge in general, or in relation to a particular subject; the state of being uneducated or uninformed.
[1913 Webster]

2. (Theol.) A willful neglect or refusal to acquire knowledge which one may acquire and it is his duty to have. --Book of Common Prayer.
[1913 Webster]

Information is a commodity but without pertinent information, high quality, consistency and proper assessment it lacks value. There needs to be enough data points and metadata to perform meaningful analysis. The ability to act meaningfully has far greater worth than raw data itself. Sound reasoned assessments can also provide actionable information while stripping away personal identifying or otherwise sensitive information.

Conversely, the absence of pertinent information tends to do more harm than good and results in flawed reasoning. For example, a hard braking event in a vehicle is typically viewed by insurance carrier algorithms as bad driving behavior, where the individual was potentially distracted and needed to make a dramatic correction to avoid a collision. The driver may instead have been reacting to something sudden, unforeseeable and outside their control, and should be rewarded instead of penalized on their insurance score. Without proper context and full situational understanding, this sort of algorithmic bias produces false assumptions and poor business decisions.

A sound approach being advocated for data architecture, and one that should be applied to the transportation space, is Extensible Entity Data Modeling. Core data models should be used wherever possible, resisting the tendency to reinvent the wheel with a custom, monolithic, single use data design. This allows for a common understanding of the information across vendors and domains without the need to do custom translation or mapping. Custom mapping can bridge paradigms but is brittle: it requires ongoing maintenance as definitions change, is further challenged by different deployment timelines and is prone to periods of incompatibility. Needs for a given application or service beyond the core model can be added as logical extensions.

These modular ontology extensions can remain unique or, better, be shared among mutual stakeholders and become widely used conventions or standards themselves over time. They represent the special needs beyond the core ontologies used. The additional data they carry on relationship edges enables artificial intelligence and data analysts to make more ready use of the information.
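
The core-plus-extensions idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `CoreVehicle` class, its fields and the `vendorA:`/`vendorB:` extension names are hypothetical and are not taken from any of the standards discussed at the workshop.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class CoreVehicle:
    """Hypothetical core entity that every vendor agrees on."""
    vehicle_id: str
    speed_kmh: float
    # Vendor- or application-specific needs live in namespaced extensions
    # (hypothetical names), so the core fields never have to change.
    extensions: Dict[str, Dict[str, Any]] = field(default_factory=dict)


def average_speed(vehicles: List[CoreVehicle]) -> float:
    """A consumer that only understands the core model."""
    return sum(v.speed_kmh for v in vehicles) / len(vehicles)


fleet = [
    CoreVehicle("a-1", 50.0, {"vendorA:battery": {"state_of_charge_pct": 80}}),
    CoreVehicle("b-7", 70.0, {"vendorB:cargo": {"load_kg": 1200}}),
]

# Works across both vendors without any custom mapping.
print(average_speed(fleet))  # 60.0
```

A consumer that does understand a given extension can read it from `extensions`, while everyone else can ignore it safely.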

Route planning algorithms are built on underlying data models. Different models make comparison or use of information in different algorithms difficult if not impossible. This becomes more important when trying to create multi-modal transportation solutions across different vendors and sources. Adhering to core models with logical extensions as warranted for a given mode or vendor allows for more ready inclusion in multi-modal travel planning.

Defining the Physical World

As this digital transformation of transportation starts to take shape, it is paramount we have shared understanding of the dimensions of time and space. It is all the more important as autonomous vehicles are being added to the mix. When giving coordinates for a vehicle are you talking about its center, left front edge, other? We simply cannot have variation by manufacturer on these fundamental concepts and operate safely.
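
To illustrate how much the choice of reference point matters, the following sketch converts a reported front-left corner position into the vehicle's center. It assumes a local planar coordinate system in metres, a heading measured counterclockwise from the x axis, and a rectangular vehicle outline; the function name and the dimensions are hypothetical.

```python
import math


def center_from_front_left(x, y, heading_rad, length_m, width_m):
    """Return the vehicle center given the position of its front-left corner.

    In the vehicle's own frame (x forward, y to the left) the front-left
    corner sits at (+length/2, +width/2) from the center. That offset is
    rotated into the world frame and subtracted from the reported point.
    """
    dx, dy = length_m / 2, width_m / 2
    off_x = dx * math.cos(heading_rad) - dy * math.sin(heading_rad)
    off_y = dx * math.sin(heading_rad) + dy * math.cos(heading_rad)
    return x - off_x, y - off_y


# A 5 m x 2 m vehicle heading along the x axis, reported at (2.5, 1.0).
cx, cy = center_from_front_left(2.5, 1.0, 0.0, 5.0, 2.0)
print(cx, cy)  # 0.0 0.0

# Distance between the two interpretations of the same coordinates.
print(round(math.hypot(2.5 - cx, 1.0 - cy), 2))  # 2.69
```

For this hypothetical vehicle, two parties reading the same coordinates with different reference points disagree by about 2.7 metres, more than the vehicle's own width.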

Open Geospatial Consortium (OGC) and W3C have a strong, existing liaison in the form of the Spatial Data on the Web Interest Group which maintains a number of standards on time and sensor data ontologies along with Best Practices for Spatial Data on the Web. Semantic Sensor Network Ontology specification is aligned with OGC/ISO Observations and Measurements standards.

There are considerable challenges in defining physical space to guide the trajectories of moving objects traveling around this planet. Spatial object relationships are constantly changing. The physical world itself is far from static, rife with changing features from tectonic plate movement, erosion and the shifting of magnetic north. There are multiple representations of the physical geometries with different strengths, purposes and intended audiences.

Privacy and data ownership

To make efficient, tailored multi-modal travel recommendations a system needs to know an individual's whereabouts, destination, preferred means of travel, what specific transportation services the user already has subscription accounts with and any special requirements such as accessibility, additional passenger, baggage or cargo needs.

Information is both an asset for service offerings and a liability to the individuals it is about and the companies entrusted with its safe handling. Significant concerns about possible misuse are well warranted, as it presents a huge risk to privacy and subsequently to physical safety.

Of the various solutions discussed, there were some recurring themes regarding privacy, such as empowering users to protect, temporarily entrust, own and migrate their data. In some solutions access control has advanced beyond the record level and is regularly applied down to field granularity. This control applies whether the company holding the information is acting on it directly or acting as an entrusted provider sharing it with third parties for specific, approved uses.
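
Field-level access control can be sketched as a per-consumer allow-list applied to each record before it is shared. This is a minimal illustration, not any presenter's implementation; the record fields, consumer names and `filter_record` helper are all hypothetical.

```python
from typing import Any, Dict, Set

# Hypothetical traveler record held by an entrusted provider.
record: Dict[str, Any] = {
    "user_id": "u-42",
    "home_address": "221B Example Street",
    "destination": "Central Station",
    "needs_wheelchair_access": True,
}

# Hypothetical policy: which fields the user has approved for each consumer.
policy: Dict[str, Set[str]] = {
    "rideshare": {"destination", "needs_wheelchair_access"},
    "transit_planner": {"destination"},
}


def filter_record(rec: Dict[str, Any], pol: Dict[str, Set[str]], consumer: str) -> Dict[str, Any]:
    """Return only the fields this consumer is allowed to see."""
    allowed = pol.get(consumer, set())  # unknown consumers get nothing
    return {name: value for name, value in rec.items() if name in allowed}


print(filter_record(record, policy, "rideshare"))
# {'destination': 'Central Station', 'needs_wheelchair_access': True}
```

Defaulting to an empty allow-list means a consumer that is missing from the policy receives no fields at all.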

Researchers at Ghent are using MIT's SOLID (Social Linked Data) project to handle user data in their mobility prototype. The SOLID project, led by W3C Director and Web inventor Tim Berners-Lee, empowers individuals to own their data and to transport it between data providers. It takes a decentralized rather than the typical siloed approach, allowing individuals to choose which data end points to share with which services based on preferences, transportation service subscriptions and requirements. As with other efforts discussed, it endeavors to align with at least several concepts of GDPR and other privacy legislation and conventions.

Another presenter promoted a solution utilizing proxy re-encryption with policy specific keys for safer handling and storage of personal identifying information with the ability to provide user authorized access when warranted.

The EU SPECIAL (Scalable Policy-awarE linked data arChitecture for prIvacy, trAnsparency and compLiance) project is working to solve the nuanced complexities with policy language, policies themselves and deployment of engines enforcing them based on purpose and clearly captured user consent.

Regulations have consequences. In the EU and North America all heavy vehicles are required to report certain data to regulators to ensure compliance with various laws. Fleet managers use this wealth of information not only to comply with regulations but also to improve operations. A telematics service provider shuttering its business would immediately ground all of its fleet customers, demonstrating the clear need for data portability to allow migration across providers.

We covered only some of the wide range of complexities surrounding protecting personal data that needs to be acted on in order to provide appropriate, personalized transportation solutions. Many of the privacy problems are not unique to transportation, only intensified.

Cars on the information superhighway

We delved further into W3C's Automotive specification Vehicle Information Service Specification (VISS) which provides an API for exposing GENIVI's Vehicle Signal Specification (VSS) to applications running in the vehicle or potentially indirectly over the cloud. There are many use cases for needing to operate local logic for sampling data or interacting with the vehicle. Applications running locally with a VISS instance can reside on the In-Vehicle Infotainment (IVI, aka head unit which is the stereo/screen in the center of the dashboard of modern vehicles), a Telematic Control Unit (TCU), other embedded device or a trusted paired one such as a smartphone.
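
As a rough sketch of what an application might send to a VISS server, the snippet below builds a JSON "get" request. It assumes the request shape (`action`, `path`, `requestId`) used in the examples of the first version of VISS, and the signal path is illustrative only; both should be checked against the specification and the VSS version actually deployed. The helper name `viss_get_request` is hypothetical.

```python
import json
import uuid
from typing import Optional


def viss_get_request(path: str, request_id: Optional[str] = None) -> str:
    """Build a JSON 'get' request for a single vehicle signal.

    Assumption: field names follow the first-version VISS examples.
    """
    message = {
        "action": "get",
        "path": path,
        # The client chooses the id so it can match the response later.
        "requestId": request_id or str(uuid.uuid4()),
    }
    return json.dumps(message)


# Illustrative signal path; real paths depend on the VSS version in use.
print(viss_get_request("Signal.Drivetrain.InternalCombustionEngine.RPM", "1"))
```

The resulting string would typically be sent over whatever transport the in-vehicle server exposes, which this sketch deliberately leaves out.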

The information coming from connected vehicles is a fundamental building block in the digitized transportation space. Outreach to various potentially interested sectors such as regulators, insurance and tech companies is gaining momentum now that this standard is in some production vehicles. The W3C Auto group wants to ensure the data it is producing not only finds consumers in new and existing businesses but can also fit as an important piece in other, broader transportation standards' puzzles.


We discussed the need to include Accessibility use cases and requirements in initial designs instead of as an afterthought. Incorporating them later will be more cumbersome for the solution provider, and the results are likely to be rife with inadequacies.

This market share is not negligible, with an estimated 15% of Americans having some form of accessibility concern. Depending on individuals' needs, their reliance on various mixed modes of public and private transportation is likely higher. Besides being the right thing to do, designing for Accessibility will reduce the potential for lawsuits under the Americans with Disabilities Act (ADA) and similar legislation in other parts of the world.

The special needs of a passenger need to be conveyed while retaining privacy, for instance alerting a rideshare driver that their passenger is visually impaired and will be unable to recognize the vehicle based on make, model, color and license plate. How intelligent would a SmartCity solution look if it left a passenger who uses a wheelchair on a subway platform without elevator egress? We delved briefly into real world examples of current transportation offerings.

Data marketplace

We covered many of the requirements for creating healthy data marketplaces in the data topic, adding usefulness and value to information. Assessments instead of raw data can also be more readily actionable, plus safer to distribute with fewer privacy concerns. The types of commodities and derivatives that can be exchanged in a transportation data marketplace can enable new businesses to form.
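
The difference between raw data and an assessment can be sketched as follows: individual events carry vehicle identifiers, while the derived assessment only states which locations saw repeated hard braking. The event structure, the threshold and the `hazard_assessment` helper are hypothetical and do not describe any participant's product.

```python
from collections import Counter
from typing import Dict, List

# Hypothetical raw events; each one identifies a specific vehicle.
raw_events: List[Dict[str, str]] = [
    {"vehicle_id": "v1", "intersection": "Main St & 1st Ave", "event": "hard_braking"},
    {"vehicle_id": "v2", "intersection": "Main St & 1st Ave", "event": "hard_braking"},
    {"vehicle_id": "v1", "intersection": "Oak St & 3rd Ave", "event": "hard_braking"},
    {"vehicle_id": "v3", "intersection": "Oak St & 3rd Ave", "event": "door_open"},
]


def hazard_assessment(events: List[Dict[str, str]], min_count: int = 2) -> Dict[str, int]:
    """Count hard-braking events per intersection, dropping vehicle ids.

    Only locations reaching min_count are reported, so a single vehicle's
    isolated event does not appear in the shared result.
    """
    counts = Counter(
        e["intersection"] for e in events if e["event"] == "hard_braking"
    )
    return {place: n for place, n in counts.items() if n >= min_count}


print(hazard_assessment(raw_events))  # {'Main St & 1st Ave': 2}
```

The output is directly actionable for a road authority and contains nothing that identifies an individual vehicle or driver.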

Geotab's presentation included the claim that they could practically map all the roads in the US and Canada within a day, based on the number and geographic spread of connected vehicles they collect information from. Among the information available, they can also identify road hazards, problem intersections, traffic and localized micro-weather.

We touched upon some of the many use cases for information coming from connected vehicles:


IoT is fragmented as badly as, if not worse than, automotive, lacking interoperability between thing manufacturers. Of the many standards efforts in the IoT space, we discussed SAREF and W3C's Web of Things (WoT).

Smart Appliances REFerence ontology (SAREF) is an "interoperability language" enabling semantic interoperability between different systems, again illustrating how well defined data is central to interoperability.

BMW presented VSSo (VSS ontology), an ontology built on top of GENIVI's Vehicle Signal Specification (VSS) and developed by them in collaboration with EURECOM. It defines a taxonomy for attributes, sensors and actuators of vehicles. In addition to bringing data to the cloud for analytics and providing a solid offering in a data marketplace, VSSo can be leveraged to create a thing description in W3C's Web of Things. A proof of concept demo was shared in which this thing description allowed for defining which arbitrary thing capabilities can be readily opened up to IoT interoperability.

VSSo leverages the sensor and time ontologies mentioned previously, which were created as an OGC/W3C collaboration, and follows extensible design principles. EURECOM is to bring VSSo to W3C for standardization.

We were joined by the chief data architects from the University of Toronto behind the SmartCities standards effort. They shared with us their ontological work, arranged at service, city and foundation levels, and how it can be leveraged for transportation planning, future infrastructure needs, predicting impact of road construction and making ... They also shared how the approach of silos and proprietary tools restricts information access for these broader needs, imposing limitations and additional work in trying to bridge understanding.


Improve Coordination

The main recommendation is to increase coordination on data models across standards efforts, starting with the ones present. Since the workshop we have established formal and informal liaisons and have set up a Transportation Ontology Coordination Committee (TOCC). It is initially comprised of participants from ISO SmartCities, ISO Intelligent Transportation Systems, Open Geospatial Consortium (OGC) and W3C. We will seek to involve additional standards bodies involved in transportation as work gets more established.

Data design principles

We agreed on fundamental data architecture design, using core ontologies that will reduce duplication of efforts and are extensible for different, purpose specific representations.

Core ontologies

We came up with a prioritized list of core ontologies needed in the transportation space, starting with routing: a generic core that is extensible for segments using different modes of transportation. OGC has since come up with a Routing API and corresponding ontology, which are in discussion in TOCC. Observations are next in order and also have a start within OGC.

Relaunch Business Group

W3C relaunched its Automotive Webplatform Business Group as the Automotive and Transportation Business Group to continue coordination within the broader transportation space. TOCC resides as a joint task force within this Business Group and the W3C Spatial Data on the Web Interest Group (SDWIG) which is an OGC/W3C collaboration.

Automotive and Transportation Business Group Areas of work: [Charter]

The in-vehicle application Best Practices will shape expectations of third party behavior in interacting with and collecting information on the vehicle from safety, security and privacy perspectives, allowing policies to be defined and enforced. The vehicle edge is where data will originate, and where the operator can be informed of collection and their consent potentially captured.


Strive to ensure accessibility is taken into consideration at the outset of transportation data modeling, creating structure to hold specific capabilities and information useful in facilitating travel. We are encouraging solution designs to specifically seek accessibility review. As part of our continued outreach after the workshop, we have found individuals focused on this particular area and have encouraged the formation of the Linked Data for Accessibility Community Group, which will collaborate with the Automotive and Transportation Business Group, Spatial Data on the Web Interest Group and Linked Building Data Community Group.

Increase Awareness

To help stave off duplicate efforts and instead rally them around shared data models, we are encouraging attendees and all other interested parties to act as ambassadors for this shared message. We set up a publicly archived mailing list (@@) for sharing ideas and fostering ongoing collaboration.

W3C staff are working with various trade associations, standards bodies and think tanks to spread the word in aftermarket, manufacturer, fleet, insurance, regulator, transportation services and other sectors. Do please make us aware of opportunities to present.