<scribe> Meeting: Automotive Data task force
<scribe> scribenick: ted
<scribe> Scribe: Ted
Benjamin: largely the same as I presented at the F2F at Munich
<Karen> Ted: use q+ to ask a question on irc
https://www.w3.org/2018/04/AMM-VSSo.pptx
Benjamin: for those not at the
F2F, my PhD topic is semantic web technology on top of vehicle
data
... formally define domain knowledge and its interconnections
with web technologies
... in order to enable this we wanted a vehicle ontology. with
it I can make some powerful queries
... I started with VSS data model, it is quite open with
branches of different logical sections of signals
... leaves are attributes, the signal itself. there are some
descriptive aspects to the data, units etc
... VSS is not so small in comparison to other ontologies with
about 1k signals
... it is made to be extended with private/unique signals for a
specific vehicle or to be overwritten
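[scribe note: a rough sketch of the branch/leaf structure described above. The tree shape mirrors the VSS idea (branches for logical sections, leaves for signals), but the keys and signal names here are illustrative, not the exact VSS spec syntax]

```python
# Illustrative sketch of the VSS branch/leaf idea.
# Keys ("type", "children", etc.) and signal names are assumptions,
# not the actual VSS specification format.
vss = {
    "Vehicle": {
        "type": "branch",
        "children": {
            "Drivetrain": {
                "type": "branch",
                "children": {
                    "Transmission": {
                        "type": "branch",
                        "children": {
                            "Speed": {
                                "type": "sensor",
                                "datatype": "uint16",
                                "unit": "km/h",
                                "description": "Vehicle speed",
                            }
                        },
                    }
                },
            }
        },
    }
}

def leaves(tree, prefix=""):
    """Yield (dotted path, node) for every leaf signal under the tree."""
    for name, node in tree.items():
        path = f"{prefix}.{name}" if prefix else name
        if node.get("type") == "branch":
            yield from leaves(node.get("children", {}), path)
        else:
            yield path, node

print(sorted(dict(leaves(vss))))
```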
... SOSA based
... it is pretty small and self contained. example on next
slide (6)
... I used VSS information to fill in gaps
... vehicle speed is an observable signal part of the
transmission branch
... branches are well defined in VSS
... I made several manual annotations. sometimes there are
names that are not unique such as speed
... it can be either from transmission, infotainment system or
other
... for other names that are the same in VSS but not referring
to the same attribute (eg engine speed is rpm)
... I made them observable, actionable or both
... this means all signals have an actuator or a sensor, and
all are part of the top branch
... vss[namespace] being the top branch
... left, right, front and rear are branches and it doesn't
make sense to have them as different classes
... seat for example should be defined once and instantiated in
different positions
... having data property of branch makes it much shorter
... it also makes the ontology more compact than the original
VSS
... I am working with a SPARQL endpoint and playing with
connecting it to other devices
... what about OEM specific concepts?
... the question is not about defining a data model that meets
all needs but an extensible one
... you can make a private branch
Ted: what are the planned next steps? as I understand you identified a number of inconsistencies but do not believe you have a pull request pending against GENIVI VSS
Benjamin: it will take some time,
it will never be perfect from a modeling point of view
... we have some consistency policies applied
... try to never depend on a unit but use a percentage
Ted: with imperial still being used by US and UK I understand why you use percentage but that doesn't work for some signals
Benjamin: some automotive
standard units don't make sense but that is what we need
... for these cases I have two solutions, one is an abstraction
of the unit system. I am talking about a concept like liters
per kilometer
... there is one unit class that I can instantiate as a
datatype
... what I am using right now is a pretty nice
implementation
... when a value is within a set of potential units it maps to
the abstraction (eg km/h to mph)
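[scribe note: a minimal sketch of the unit-abstraction idea Benjamin describes, assuming one canonical unit per dimension with conversion factors to the member units; the names and the speed example are illustrative]

```python
# One canonical unit (km/h here) per dimension; every member unit
# carries a factor into the canonical unit. Names are illustrative.
KMH_PER_UNIT = {"km/h": 1.0, "mph": 1.609344}

def to_canonical(value, unit):
    """Convert a value in any known speed unit into km/h."""
    return value * KMH_PER_UNIT[unit]

def convert(value, src, dst):
    """Convert between any two units via the canonical one."""
    return to_canonical(value, src) / KMH_PER_UNIT[dst]

print(convert(100, "km/h", "mph"))  # ≈ 62.14
```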
Ted: do you have a fork somewhere since it isn't a pull request yet to VSS repo
Benjamin: yes, I'll drop a link
https://github.com/klotzbenjamin/vss-ontology
Benjamin: I have an application
and use case on analysis and annotation
... also working on private ontology extension and related
health check
Ted: that answers my second question, whether there is a tangible SPARQL PoC people could play with
Benjamin: with only a few lines
of querying in sparql format you can get concrete results
... I will be further along in a couple weeks
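[scribe note: a hypothetical SPARQL query in the spirit of the demo Benjamin describes. Only the sosa: terms are real SOSA vocabulary; the vsso: namespace URI and the VehicleSpeed class are guesses for illustration, not confirmed VSSo terms]

```python
# Sketch of "a few lines of SPARQL" over SOSA-modeled vehicle data.
# vsso: names below are illustrative assumptions.
query = """
PREFIX sosa: <http://www.w3.org/ns/sosa/>
PREFIX vsso: <http://example.org/vsso#>

SELECT ?obs ?value ?time WHERE {
  ?obs a sosa:Observation ;
       sosa:observedProperty vsso:VehicleSpeed ;
       sosa:hasSimpleResult ?value ;
       sosa:resultTime ?time .
  FILTER (?value > 100)
}
ORDER BY DESC(?time)
"""
print(query.strip().splitlines()[0])
```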
<Karen> +1
<Karen> Ted: from F2F we talked about merging the approaches
<Karen> ...rearchitect tree a bit to align with both
<Karen> ...I meant to formally ask the WG
<Karen> ...since it's taking a different approach on convergence
<Karen> ...Get Rudi and Paul's reactions
<Karen> ...I don't see any reason why it would be objectionable to look at the data model in the BG and come back to the Working Group
<Karen> ...Rudi is the gatekeeper for the GENIVI VSS repository
<Karen> ...use the group mailing list for responses?
<Karen> @: I don't have an opinion now, and don't want to express a group opinion
PatrickL: I don't really have an opinion nor would speak on behalf of the group
<Karen> scribenick: Ted
PatrickL: a flattened view would make sense. it is not directly comparable; you can only compare the structure
Glenn: Geotab is relatively new
to W3C so will give a quick context about who we are
... we are a global telematics service provider, an engineering
house. we cover light and heavy vehicles plus specialized
vehicles
... we have N billion data points collected daily
... Darren will talk more about the data contract. what the
developer does between the vehicle and data center
... he can explain how we sample data and the importance of
conveying metadata on the means
Darren: Geotab was originally
only interested in GPS data and we had a pretty elaborate way
of saving that data, we even patented it
... it handled changes in speed and heading, and certain
trigger events
... it still had challenges. you could potentially miss the top
speed on a speeding event for example
... you risk either having too much data or cutting corners and
missing key information
... we switched to a curve-based approach built on the
Douglas-Peucker algorithm
... we used this curve-based approach and it provided an
accurate profile, capturing peaks, troughs and cornering events
... we added to this curve logic. you collect a set of data and
run the curve on it. we wanted to integrate time sensitive
aspects
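[scribe note: the textbook Ramer-Douglas-Peucker simplification underlying the curve logging Darren describes; Geotab's production version adds the time-sensitive extensions mentioned above, so this is only the classic core, with an illustrative trace]

```python
import math

def perpendicular_distance(p, a, b):
    """Distance from point p to the line through a and b."""
    (x, y), (x1, y1), (x2, y2) = p, a, b
    dx, dy = x2 - x1, y2 - y1
    if dx == 0 and dy == 0:
        return math.hypot(x - x1, y - y1)
    return abs(dy * x - dx * y + x2 * y1 - y2 * x1) / math.hypot(dx, dy)

def rdp(points, epsilon):
    """Ramer-Douglas-Peucker: keep only points that deviate more
    than epsilon from the line between the endpoints."""
    if len(points) < 3:
        return list(points)
    dists = [perpendicular_distance(p, points[0], points[-1])
             for p in points[1:-1]]
    i = max(range(len(dists)), key=dists.__getitem__) + 1
    if dists[i - 1] > epsilon:
        # split at the worst point and simplify each half
        return rdp(points[:i + 1], epsilon)[:-1] + rdp(points[i:], epsilon)
    return [points[0], points[-1]]

# e.g. a speed-vs-time trace: the peak survives, straight runs collapse
trace = [(0, 0), (1, 1), (2, 2), (3, 10), (4, 2), (5, 1), (6, 0)]
print(rdp(trace, epsilon=1.0))
```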
... we do some estimate forecasting on where the data will
be
... at a minimum the client app would send two sets of data to
the server and it could extrapolate from there
... the server would predict where the vehicle would be and if
it deviated substantially it would run the curve algorithm
again so as to send a more accurate update to the server
... this way we could reasonably accurately show location in
realtime
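[scribe note: a minimal sketch of the predict-and-compare step just described, assuming straight-line extrapolation from the last two timestamped fixes; function names and the threshold value are illustrative, not Geotab's actual implementation]

```python
import math

def extrapolate(p1, p2, t):
    """Linearly extrapolate position at time t from two
    timestamped fixes given as (t, x, y)."""
    t1, x1, y1 = p1
    t2, x2, y2 = p2
    f = (t - t1) / (t2 - t1)
    return (x1 + f * (x2 - x1), y1 + f * (y2 - y1))

def needs_update(last_two, actual, t, threshold):
    """True if the actual fix deviates from the predicted position
    by more than threshold, i.e. a fresh curve run / upload is due."""
    px, py = extrapolate(*last_two, t)
    ax, ay = actual
    return math.hypot(ax - px, ay - py) > threshold

# vehicle moving east at 10 units/s; at t=20 we predict (200, 0)
print(needs_update(((0, 0, 0), (10, 100, 0)), (200.0, 5.0), 20, 3.0))
```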
... this curve based approach is applicable to other types of
data from the vehicle
... our server needs to know how the data was collected
... when we implement we identify how we save data on the
device. we have configuration on the server for specific data
points to be able to check for accuracy
... if it uses curve logging the server can interpolate. if
concept of estimate error is available it can make
predictions
... in essence that is the data contract we have between client
and server
... we allow other devices on the vehicle to relay information
through our on-board devices eg fuel level sensor
... if that third party device was collecting information in a
different manner we wanted to know how, since it was breaking
the contract with the server
... we had to come up with a different dataset for these
devices to avoid confusion with information collected via
different means
... we solved it in a way that works in our framework and
architecture. when it comes from a different device we have to
save in a different manner
... we need to know the type of data and how it is being saved.
it would be ideal to express and send how the sampling was
done
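[scribe note: one possible shape for the client/server data contract Darren describes. All names here are hypothetical; the 2% fuel-level and 100 rpm error values are the examples given in this discussion]

```python
from dataclasses import dataclass
from enum import Enum

class Sampling(Enum):
    CURVE_LOGGED = "curve"    # safe to interpolate between points
    PERIODIC = "periodic"     # fixed-interval samples
    UNKNOWN = "unknown"       # e.g. third-party device, flag it

@dataclass(frozen=True)
class DataContract:
    signal: str            # hypothetical signal name
    sampling: Sampling     # how the client collected the data
    allowed_error: float   # curve threshold in the signal's own unit
    unit: str

# server-side registry, predefined per signal as Darren describes
contracts = {
    "FuelLevel": DataContract("FuelLevel", Sampling.CURVE_LOGGED, 2.0, "%"),
    "EngineRPM": DataContract("EngineRPM", Sampling.CURVE_LOGGED, 100.0, "rpm"),
}

def can_interpolate(signal: str) -> bool:
    """The server may fill in between points only for curve-logged data."""
    c = contracts.get(signal)
    return c is not None and c.sampling is Sampling.CURVE_LOGGED

print(can_interpolate("FuelLevel"))
```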
PatrickL: [summarizes as clarification]
... it is more describing what you need in the data in order to
assess value
Ulrich: how and why do you save
this data?
... do you pass which algorithm is used, specifically where in
infrastructure is it sent?
Darren: when we implement saving
of the data in the firmware (client device) we identify how it
is collected
... it is predefined and understood by the server. every time
we add new data we have to configure on the server how to treat
it
... if it comes in some unknown manner we flag it as such
... the ability to add this metadata is useful for us
Ulrich: understood. at what granularity do you do this: per device, data element or source?
Darren: the error value dictates
the granularity of the data
... we do not pass that either
... each piece of data has a separate error value for the
curve. they may have similar values
... fuel level may have an error value of 2% whereas rpm could
be 100
... each one has different error values and would depend on the
nature of data itself and deemed need for accuracy
... the beauty of the curve logic is that between any two
points you can fill in any of the in-between values within
that margin of error
... that piece of information in our mind should also be
included with the data. I have a given dataset, know how it was
collected
... those kinds of pieces of information we envisage
including
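[scribe note: the fill-in-between property Darren describes, sketched as linear interpolation between two curve-logged points; the result is accurate only to within that signal's curve error threshold, and the fuel-level figures are illustrative]

```python
def interpolate(p1, p2, t):
    """Between two curve-logged samples given as (t, value), estimate
    the value at an intermediate time t. By the curve-logging
    guarantee, the estimate is within the signal's error threshold."""
    t1, v1 = p1
    t2, v2 = p2
    return v1 + (v2 - v1) * (t - t1) / (t2 - t1)

# fuel level logged with a 2% threshold: 60% at t=0s, 40% at t=600s
print(interpolate((0, 60.0), (600, 40.0), 300))  # → 50.0
```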
<Karen> Ted: Going up a higher level, people see need for this
<Karen> ...if pulling data for speed at intervals, you could miss a hard braking event
<Karen> ...or the top speed
<Karen> ...if you're an insurance company, you want to know that
<Karen> ...if the driver had an accident
<Karen> ...I have some people in mind to reach out to; get this conversation minuted
<Karen> ...and see in Semantic Web world, other data scientists; see how this is communicated
<Karen> ...get a myriad of different ones
<Karen> ...ways to have a namespace for a given collection of different methodologies used
<Karen> ...I will reach out to some folks, but wonder if anyone on the call is aware of efforts already?
<Karen> ...would be worth looking at
<Karen> ...Not to put Geotab on the spot, and we are running out of time
<Karen> ...Geotab patented their algorithm
<Karen> ...and they may be open to sharing more about that with the group
<Karen> ...and have it be an identified methodology
<Karen> ...will be important; mix of OEMs, mix in cloud
<Karen> ...very important topic
<Karen> ...want to do more homework but wonder if others have existing knowledge in that space?
<Karen> ...Take silence as no, or anyone unmuting?
<Karen> Glenn: Just add we have a short YouTube video that further explains what Darren talked about with the curve logic
<Karen> ...you can share this with the group
<Karen> Ted: not seeing it
<Karen> ...in webex chat
<Karen> Darren: I will paste into the chat
<PatrickLue> https://github.com/klotzbenjamin/vss-ontology
https://www.geotab.com/blog/gps-logging-curve-algorithm/
<Karen> Darren: That is a blog post that Neil, our CEO, did on the curve log algorithm
Darren: that gives some more detail on what we mean and how we extend the Peucker algorithm
<Karen> ...in our mind, potentially using this would be good way to get effective data
<Karen> scribenick: Ted
Darren: we want to get effective data without overwhelming the server (or use more bandwidth than needed)
<Karen> Ted: That touches on another of main benefits
<Karen> ...cars will have limited connectivity and bandwidth
<Karen> ...look at edge computing
<Karen> ...whatever heartbeat to do something more intelligent like this
<Karen> ...I think on both of these, we can follow up on the mailing list
<Karen> ...next call is in two weeks
<Karen> ...I will work with chairs on any other topics to delve into
<Karen> ...and do more homework on this data contract
<Karen> ...see if there is some pre-existing work we can leverage, in SW world or elsewhere
<Karen> ...I also have another call on the 14th
<Karen> ...I have a conflict
<Karen> ...would of course give preference to this meeting when possible
<Karen> ...if those on call today would be amenable to shift an hour later, that would be appreciated
<Karen> ...if not, I can sacrifice the other meeting; don't want to lose critical mass
<Karen> ...is 12:00pm EDT on 14th June ok?
<Karen> Magnus: ok
<Karen> @: Ok
<Karen> Glenn: ok
<Karen> ...if anyone has a problem, speak up
<Karen> Ted: you can also reach out privately and I will email group for call one hour later on 14 June
<Karen> ...top of hour
<Karen> ...thank you Benjamin, Glenn, and Darren
<Karen> ...if there are additional materials in advance on either of these topics, please relay them
<Karen> ...Propose we adjourn if ok with chairs
<Karen> [adjourned]