<scribe> Meeting: Automotive Data task force
<scribe> scribenick: ted
<scribe> Scribe: Ted
Benjamin: largely the same as I presented at the F2F at Munich
<Karen> Ted: use q+ to ask a question on irc
https://www.w3.org/2018/04/AMM-VSSo.pptx
Benjamin: for those not at the
F2F, my PhD topic is semantic web technology on top of vehicle
data
... formally define domain knowledge and its interconnections
with web technologies
... in order to enable this we wanted a vehicle ontology. with
it I can make some powerful queries
... I started with VSS data model, it is quite open with
branches of different logical sections of signals
... leaves are attributes, the signal itself. there are some
descriptive aspects to the data, units etc
... VSS is not so small in comparison to other ontologies with
about 1k signals
... it is made to be extended with private/unique signals for a
specific vehicle or to be overwritten
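[scribe note: a rough sketch of the branch/leaf structure described above. The tree shape mirrors the VSS idea (branches for logical sections, leaves for signals), but the keys and signal names here are illustrative, not the exact VSS spec syntax]

```python
# Illustrative sketch of the VSS branch/leaf idea.
# Keys ("type", "children", etc.) and signal names are assumptions,
# not the actual VSS specification format.
vss = {
    "Vehicle": {
        "type": "branch",
        "children": {
            "Drivetrain": {
                "type": "branch",
                "children": {
                    "Transmission": {
                        "type": "branch",
                        "children": {
                            "Speed": {
                                "type": "sensor",
                                "datatype": "uint16",
                                "unit": "km/h",
                                "description": "Vehicle speed",
                            }
                        },
                    }
                },
            }
        },
    }
}

def leaves(tree, prefix=""):
    """Yield (dotted path, node) for every leaf signal under the tree."""
    for name, node in tree.items():
        path = f"{prefix}.{name}" if prefix else name
        if node.get("type") == "branch":
            yield from leaves(node.get("children", {}), path)
        else:
            yield path, node

print(sorted(dict(leaves(vss))))
```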
... SOSA based
... it is pretty small and self contained. example on next
slide (6)
... I used VSS information to fill in gaps
... vehicle speed is an observable signal part of the
transmission branch
... branches are well defined in VSS
... I made several manual annotations. sometimes there are
names that are not unique such as speed
... it can be either from transmission, infotainment system or
other
... for other names that are the same in VSS but not referring
to the same attribute (eg engine speed is rpm)
... I made them observable, actionable or both
... this means all signals have an actuator or a sensor, and
all are part of the top branch
... vss[namespace] being the top branch
... left, right, front and rear are branches and it doesn't
make sense to have them as different classes
... seat for example should be defined once and instantiated in
different positions
... having data property of branch makes it much shorter
... it also makes the ontology more compact than the original
VSS
... I am working with a SPARQL endpoint and playing with
connecting it to other devices
... what about OEM specific concepts?
... the question is not about defining a data model that meets
all needs but an extensible one
... you can make a private branch
Ted: what are the planned next steps? as I understand you identified a number of inconsistencies but do not believe you have a pull request pending against GENIVI VSS
Benjamin: it will take some time,
it will never be perfect from a modeling point of view
... we have some consistency policies applied
... try to never depend on a unit but use a percentage
Ted: with imperial still being used by US and UK I understand why you use percentage but that doesn't work for some signals
Benjamin: some automotive
standard units don't make sense but that is what we need
... for these cases I have two solutions, one is an abstraction
of the unit system. I am talking about a concept like liters
per kilometer
... there is one unit class that I can instantiate as a
datatype
... what I am using right now is a pretty nice
implementation
... when a value is within a set of potential units it maps to
the abstraction (eg km/h to mph)
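[scribe note: a minimal sketch of the unit-abstraction idea Benjamin describes, assuming one canonical unit per dimension with conversion factors to the member units; the names and the speed example are illustrative]

```python
# One canonical unit (km/h here) per dimension; every member unit
# carries a factor into the canonical unit. Names are illustrative.
KMH_PER_UNIT = {"km/h": 1.0, "mph": 1.609344}

def to_canonical(value, unit):
    """Convert a value in any known speed unit into km/h."""
    return value * KMH_PER_UNIT[unit]

def convert(value, src, dst):
    """Convert between any two units via the canonical one."""
    return to_canonical(value, src) / KMH_PER_UNIT[dst]

print(convert(100, "km/h", "mph"))  # ≈ 62.14
```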
Ted: do you have a fork somewhere since it isn't a pull request yet to VSS repo
Benjamin: yes, I'll drop a link
https://github.com/klotzbenjamin/vss-ontology
Benjamin: I have an application
and use case on analysis and annotation
... also working on private ontology extension and related
health check
Ted: that answers my second question, whether there is a tangible SPARQL PoC people could play with
Benjamin: with only a few lines
of querying in sparql format you can get concrete results
... I will be further along in a couple weeks
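[scribe note: a hypothetical SPARQL query in the spirit of the demo Benjamin describes. Only the sosa: terms are real SOSA vocabulary; the vsso: namespace URI and the VehicleSpeed class are guesses for illustration, not confirmed VSSo terms]

```python
# Sketch of "a few lines of SPARQL" over SOSA-modeled vehicle data.
# vsso: names below are illustrative assumptions.
query = """
PREFIX sosa: <http://www.w3.org/ns/sosa/>
PREFIX vsso: <http://example.org/vsso#>

SELECT ?obs ?value ?time WHERE {
  ?obs a sosa:Observation ;
       sosa:observedProperty vsso:VehicleSpeed ;
       sosa:hasSimpleResult ?value ;
       sosa:resultTime ?time .
  FILTER (?value > 100)
}
ORDER BY DESC(?time)
"""
print(query.strip().splitlines()[0])
```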
<Karen> +1
<Karen> Ted: from F2F we talked about merging the approaches
<Karen> ...rearchitect tree a bit to align with both
<Karen> ...I meant to formally ask the WG
<Karen> ...since it's taking a different approach on convergence
<Karen> ...Get Rudi and Paul's reactions
<Karen> ...I don't see any reason why it would be objectionable to look at the data model in the BG and come back to the Working Group
<Karen> ...Rudi is the gatekeeper for the GENIVI VSS repository
<Karen> ...use the group mailing list for responses?
<Karen> @: I don't have an opinion now, and don't want to express a group opinion
PatrickL: I don't really have an opinion nor would speak on behalf of the group
<Karen> scribenick: Ted
PatrickL: a flattened view would make sense. it is not directly comparable; you can only compare the structure
Glenn: Geotab is relatively new
to W3C so will give a quick context about who we are
... we are a global telematics service provider, an engineering
house. we cover light and heavy vehicles plus specialized
vehicles
... we have N billion data points collected daily
... Darren will talk more about the data contract. what the
developer does between the vehicle and data center
... he can explain how we sample data and the importance of
conveying metadata on the means
Darren: Geotab was originally
only interested in GPS data and we had a pretty elaborate way
of saving that data, we even patented it
... it handled changes in speed and heading, and certain
trigger events
... it still had challenges. you could potentially miss the top
speed on a speeding event for example
... you risk either having too much data or cutting corners and
missing key information
... we switched to a curve-based approach built on the
Douglas-Peucker algorithm
... we used this curve-based approach and it provided an
accurate profile, capturing peaks, troughs and cornering events
... we added to this curve logic. you collect a set of data and
run the curve on it. we wanted to integrate time sensitive
aspects
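[scribe note: the textbook Ramer-Douglas-Peucker simplification underlying the curve logging Darren describes; Geotab's production version adds the time-sensitive extensions mentioned above, so this is only the classic core, with an illustrative trace]

```python
import math

def perpendicular_distance(p, a, b):
    """Distance from point p to the line through a and b."""
    (x, y), (x1, y1), (x2, y2) = p, a, b
    dx, dy = x2 - x1, y2 - y1
    if dx == 0 and dy == 0:
        return math.hypot(x - x1, y - y1)
    return abs(dy * x - dx * y + x2 * y1 - y2 * x1) / math.hypot(dx, dy)

def rdp(points, epsilon):
    """Ramer-Douglas-Peucker: keep only points that deviate more
    than epsilon from the line between the endpoints."""
    if len(points) < 3:
        return list(points)
    dists = [perpendicular_distance(p, points[0], points[-1])
             for p in points[1:-1]]
    i = max(range(len(dists)), key=dists.__getitem__) + 1
    if dists[i - 1] > epsilon:
        # split at the worst point and simplify each half
        return rdp(points[:i + 1], epsilon)[:-1] + rdp(points[i:], epsilon)
    return [points[0], points[-1]]

# e.g. a speed-vs-time trace: the peak survives, straight runs collapse
trace = [(0, 0), (1, 1), (2, 2), (3, 10), (4, 2), (5, 1), (6, 0)]
print(rdp(trace, epsilon=1.0))
```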
... we do some estimate forecasting on where the data will
be
... at a minimum the client app would send two sets of data to
the server and it could extrapolate from there
... the server would predict where the vehicle would be and if
it deviated substantially it would run the curve algorithm
again so as to send a more accurate update to the server
... this way we could reasonably accurately show location in
realtime
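[scribe note: a minimal sketch of the predict-and-compare step just described, assuming straight-line extrapolation from the last two timestamped fixes; function names and the threshold value are illustrative, not Geotab's actual implementation]

```python
import math

def extrapolate(p1, p2, t):
    """Linearly extrapolate position at time t from two
    timestamped fixes given as (t, x, y)."""
    t1, x1, y1 = p1
    t2, x2, y2 = p2
    f = (t - t1) / (t2 - t1)
    return (x1 + f * (x2 - x1), y1 + f * (y2 - y1))

def needs_update(last_two, actual, t, threshold):
    """True if the actual fix deviates from the predicted position
    by more than threshold, i.e. a fresh curve run / upload is due."""
    px, py = extrapolate(*last_two, t)
    ax, ay = actual
    return math.hypot(ax - px, ay - py) > threshold

# vehicle moving east at 10 units/s; at t=20 we predict (200, 0)
print(needs_update(((0, 0, 0), (10, 100, 0)), (200.0, 5.0), 20, 3.0))
```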
... this curve based approach is applicable to other types of
data from the vehicle
... our server needs to know how the data was collected
... when we implement we identify how we save data on the
device. we have configuration on the server for specific data
points to be able to check for accuracy
... if it uses curve logging the server can interpolate. if
concept of estimate error is available it can make
predictions
... in essence that is the data contract we have between client
and server
... we allow other devices on the vehicle to relay information
through our on-board devices eg fuel level sensor
... if that third party device was collecting information in a
different manner we wanted to know how, since it was breaking
the contract with the server
... we had to come up with a different dataset for these
devices to avoid confusion with information collected via
different means
... we solved it in a way that works in our framework and
architecture. when it comes from a different device we have to
save in a different manner
... we need to know the type of data and how it is being saved.
it would be ideal to express and send how the sampling was
done
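[scribe note: one possible shape for the client/server data contract Darren describes. All names here are hypothetical; the 2% fuel-level and 100 rpm error values are the examples given in this discussion]

```python
from dataclasses import dataclass
from enum import Enum

class Sampling(Enum):
    CURVE_LOGGED = "curve"    # safe to interpolate between points
    PERIODIC = "periodic"     # fixed-interval samples
    UNKNOWN = "unknown"       # e.g. third-party device, flag it

@dataclass(frozen=True)
class DataContract:
    signal: str            # hypothetical signal name
    sampling: Sampling     # how the client collected the data
    allowed_error: float   # curve threshold in the signal's own unit
    unit: str

# server-side registry, predefined per signal as Darren describes
contracts = {
    "FuelLevel": DataContract("FuelLevel", Sampling.CURVE_LOGGED, 2.0, "%"),
    "EngineRPM": DataContract("EngineRPM", Sampling.CURVE_LOGGED, 100.0, "rpm"),
}

def can_interpolate(signal: str) -> bool:
    """The server may fill in between points only for curve-logged data."""
    c = contracts.get(signal)
    return c is not None and c.sampling is Sampling.CURVE_LOGGED

print(can_interpolate("FuelLevel"))
```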
PatrickL: [summarizes as clarification]
... it is more describing what you need in the data in order to
assess value
Ulrich: how and why do you save
this data?
... do you pass which algorithm is used, specifically where in
infrastructure is it sent?
Darren: when we implement saving
of the data in the firmware (client device) we identify how it
is collected
... it is predefined and understood by the server. every time
we add new data we have to configure on the server how to treat
it
... if it comes in some unknown manner we flag it as such
... the ability to add this metadata is useful for us
Ulrich: understood. at what granularity do you do this: per device, data element or source?
Darren: the error value dictates
the granularity of the data
... we do not pass that either
... each piece of data has a separate error value for the
curve. they may have similar values
... fuel level may have an error value of 2% whereas rpm could
be 100
... each one has different error values and would depend on the
nature of data itself and deemed need for accuracy
... the beauty of the curve logic is that between any two
points you can fill in any of the in-between values within
that margin of error
... that piece of information in our mind should also be
included with the data. I have a given dataset, know how it was
collected
... those kinds of pieces of information we envisage
including
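[scribe note: the fill-in-between property Darren describes, sketched as linear interpolation between two curve-logged points; the result is accurate only to within that signal's curve error threshold, and the fuel-level figures are illustrative]

```python
def interpolate(p1, p2, t):
    """Between two curve-logged samples given as (t, value), estimate
    the value at an intermediate time t. By the curve-logging
    guarantee, the estimate is within the signal's error threshold."""
    t1, v1 = p1
    t2, v2 = p2
    return v1 + (v2 - v1) * (t - t1) / (t2 - t1)

# fuel level logged with a 2% threshold: 60% at t=0s, 40% at t=600s
print(interpolate((0, 60.0), (600, 40.0), 300))  # → 50.0
```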
<Karen> Ted: Going up a higher level, people see need for this
<Karen> ...if pulling data for speed at intervals, you could miss a hard braking event
<Karen> ...or the top speed
<Karen> ...if you're an insurance company, you want to know that
<Karen> ...if the driver had an accident
<Karen> ...I have some people in mind to reach out to; get this conversation minuted
<Karen> ...and see in Semantic Web world, other data scientists; see how this is communicated
<Karen> ...get a myriad of different ones
<Karen> ...ways to have a namespace for a given collection of different methodologies used
<Karen> ...I will reach out to some folks, but wonder if anyone on the call is aware of efforts already?
<Karen> ...would be worth looking at
<Karen> ...Not to put Geotab on the spot, and we are running out of time
<Karen> ...Geotab patented their algorithm
<Karen> ...and they may be open to sharing more about that with the group
<Karen> ...and have it be an identified methodology
<Karen> ...will be important; mix of OEMs, mix in cloud
<Karen> ...very important topic
<Karen> ...want to do more homework but wonder if others have existing knowledge in that space?
<Karen> ...Take silence as no, or anyone unmuting?
<Karen> Glenn: Just add we have a short YouTube video that further explains what Darren talked about with the curve logic
<Karen> ...you can share this with the group
<Karen> Ted: not seeing it
<Karen> ...in webex chat
<Karen> Darren: I will paste into the chat
<PatrickLue> https://github.com/klotzbenjamin/vss-ontology
https://www.geotab.com/blog/gps-logging-curve-algorithm/
<Karen> Darren: That is a blog post that Neil, our CEO, did on the curve log algorithm
Darren: that gives some more detail on what we mean and how we extend the Peucker algorithm
<Karen> ...in our mind, potentially using this would be good way to get effective data
<Karen> scribenick: Ted
Darren: we want to get effective data without overwhelming the server (or use more bandwidth than needed)
<Karen> Ted: That touches on another of main benefits
<Karen> ...cars will have limited connectivity and bandwidth
<Karen> ...look at edge computing
<Karen> ...whatever heartbeat to do something more intelligent like this
<Karen> ...I think on both of these, we can follow up on the mailing list
<Karen> ...next call is in two weeks
<Karen> ...I will work with chairs on any other topics to delve into
<Karen> ...and do more homework on this data contract
<Karen> ...see if there is some pre-existing work we can leverage, in SW world or elsewhere
<Karen> ...I also have another call on the 14th
<Karen> ...I have a conflict
<Karen> ...would of course give preference to this meeting when possible
<Karen> ...if those on call today would be amenable to shift an hour later, that would be appreciated
<Karen> ...if not, I can sacrifice the other meeting; don't want to lose critical mass
<Karen> ...is 12:00pm EDT on 14th June ok?
<Karen> Magnus: ok
<Karen> @: Ok
<Karen> Glenn: ok
<Karen> ...if anyone has a problem, speak up
<Karen> Ted: you can also reach out privately and I will email group for call one hour later on 14 June
<Karen> ...top of hour
<Karen> ...thank you Benjamin, Glenn, and Darren
<Karen> ...if there are additional materials in advance on either of these topics, please relay them
<Karen> ...Propose we adjourn if ok with chairs
<Karen> [adjourned]