See also: IRC log
<phila> Jacco and Phil made general opening welcomes
<phila> Keith: Follows slides which are self describing
<phila> scribe: phila
<scribe> scribeNick: phila
Keith: Talks about context of a
    citation, which includes many facets
    ... Gets in a slight dig about 'the later Dublin Core'
[Not making many notes here as Keith's slides are comprehensive]
AndreaPerego: You said you have mapping, Keith. Temporal dimension seems to be missing?
Keith: Yes, it's missing and we know that. Metadata often ignores the temporal dimension
AndreaPerego: Do you plan to add, maybe using PROV?
Keith: We're working on PROV-O with Kerry Taylor, and there's an ENVRI+ project
PeterW: Acronym hell. RDA means something else (Resource Description and Access)
Keith: Sorry, yes. Resource Description
PWinstanley: So RDA is the body to use for this?
Keith: I did point out the acronym clash
ThomasDH: You talked about
    locations and persons etc. In the EU context we have the Core
    Vocs. I hope we can merge? Across Govt and science?
    ... Don't want different standards on different levels.
Keith: Yes, the VRE4EIC project
    has that in its sights. CERIF has concept of declared
    semantics.
    ... Doesn't say you must use these semantics, but provides
    containers for semantics
AG: Talks about The HCLS
    Community Profile: Describing Datasets, Versions, and
    Distributions
    ... Talks about origin in OpenPHACTS project
    ... Highlights ChEMBL versioning issues
    ... ChEMBL was at version 13, but that number wasn't in our
    data
    ... Still don't know if we used version 8 or 13 in
    OpenPHACTS
    ... Includes provenance feature so you can see where data items
    came from
    ... Now using ChEMBL 20
[Slides are self describing]
AG: contrasts DC and VOiD as opposite ends of spectrum, neither met requirements for HCLS
-> https://www.w3.org/TR/hcls-dataset/ Dataset Descriptions: HCLS Community Profile
AG: Talks about mandatory and
    optional properties
    ... This requires tooling
    ... Developed the validata tool, more on that tomorrow.
    ... Several implementations of HCLS profile
    ... Emphasises that we need to know about versions
Q: Adopted beyond your community?
AG: Not aware of it but it is generic and could be
Q: The latest version of DCAT-AP covers some of what you say
AG: isVersionOf didn't exist when we were doing this 4 years ago, glad it's in DCAT-AP
Jacco: Debate about whether version is in the URL?
AG: We don't say that, just that there should be different URLs for summary description, etc.
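A minimal sketch of the three description levels AG mentions (summary, version, distribution), each with its own URL. The example.org IRIs and the helper name are hypothetical; the property names (dct:isVersionOf, pav:version, dcat:distribution, dct:format) follow a reading of the HCLS profile and are shown here as plain dictionary keys, not as a validated RDF serialisation.

```python
# Hypothetical IRIs; only the three-level split comes from the discussion.
SUMMARY = "http://example.org/dataset/chembl"
VERSION = "http://example.org/dataset/chembl/20"
DISTRIBUTION = "http://example.org/dataset/chembl/20/rdf"

# One description per level, each identified by a different URL.
descriptions = {
    SUMMARY: {"dct:title": "ChEMBL"},
    VERSION: {
        "dct:isVersionOf": SUMMARY,      # version level points back to the summary
        "pav:version": "20",             # the version number travels with the data
        "dcat:distribution": DISTRIBUTION,
    },
    DISTRIBUTION: {"dct:format": "text/turtle"},
}


def version_of(distribution_iri):
    """Return the pav:version of the version-level description that
    lists this distribution, or None if no description lists it."""
    for desc in descriptions.values():
        if desc.get("dcat:distribution") == distribution_iri:
            return desc.get("pav:version")
    return None
```

With descriptions like these, a consumer holding only a distribution URL can recover which version it was built from, which is the gap AG describes for OpenPHACTS.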
AndreaPerego: Introduces DCAT-AP
-> https://joinup.ec.europa.eu/asset/dcat_application_profile/description DCAT-AP
AndreaPerego: Introduces JRC
[Slide on JRC is self explanatory]
AndreaPerego: Talks about wide
    variety of methods and standards. Some people asking what
    metadata is
    ... Talking about citations. Some people don't care about their
    data being cited.
    ... Prov used for complex/complete info
    ... On Data Citation
    ... Data reproducibility is important for policy as well as
    science
    ... Did a mapping exercise between DCAT-AP and DataCite
    ... Mostly good matches
    ... Agent Roles seems particularly hard
    ... May need a registry of roles to use across standards
    ... Skips to Publishing metadata on the Web
    ... Talks about mapping to schema.org
    ... Identified some gaps. But do we need to fill those in
    schema.org?
    ... Do we need to publish all our metadata, or just what
    improves visibility?
AxelPolleres: Is there any effort
    to endorse identifiers like ORCID?
    ... The link with STORK etc. would be interesting, but there's
    no initiative AFAIK
Keith: Often IDs are associated with a role, like ORCID and Driving Licence info
Ivan: Force11 had their general principles. Did you match against those?
AndreaPerego: Yes, we have looked
    at that, and FAIR
    ... Trying to address practical issues
Ivan: Sure they're at a higher level
Markus: I'm release manager of
    DBpedia
    ... We have a lot of data in our releases
[Slides include text]
<AxelPolleres> FWIW, further to my question… there seem to have been some efforts to e.g. link STORK (national eIDs) to ECAS, cf. https://www.eid-stork.eu/index.php?option=com_content&task=view&id=253&Itemid=83 … the reason why I had asked about links to ORCID is that many of the information you have to provide to the EU for ECAS overlap with info covered in ORCID, e.g. publications, grants, etc.
[Slides still self-explanatory]
Markus: Talks about core and extensions in DataID for things like statistics, you need extra fields
phila: You used ODRL a little, but not a lot. Is it lacking?
Markus: Nothing fixed yet, open to change
phila: Good - ODRL on Rec Track now so speak up!
CM: Work with Soeren Auer
    ... Introduces Smart Services and Industry 4.0
    ... Talks about needing to be aware of privacy and some control
    over data
    ... Want to build reference architecture for secure data
    infrastructure, retaining sovereignty
    ... IDS = Industrial Data Spaces
    ... Industrial Data Space vocab as glue to capture
    domain-specific semantics
[Slide self explanatory]
CM: IDS defining own protocol
-> http://ids.semantic-interoperability.org/ The Industrial Data Space Metadata Vocabulary
Q: How specific is this to industrial data? Can it work in other domains?
CM: It's about requirements, like security. Things like which vocabs to use for different tasks, not domains
PeterW: Have you thought of entity resolution? What metadata to associate with their data? They may not know.
CM: No, we've not looked at that. We want to partner with data publishers and help them make their data more easily found on the Web.
nandana: 2 use cases we have
    problems with
    ... Discovery, I don't want to spend a lot of time
    searching.
    ... data for training machine learning is hard to find
    automatically
    ... Another use case - if I'm an ontology engineer, I'd like to
    see how my vocab has been used in a dataset, to see if my
    conceptualisation matches reality
    ... e.g. where is the SSN Ontology used?
    ... This can be done using LOD stats but if you want to know
    how they were used, ranges etc. that's harder
[slides descriptive]
nandana: Wraps up brief presentation and invites questions
AG: Can you give a statistical report about which properties and classes are linked
Keith: Big range of topics. Expressivity etc.
makx: I was hearing things like
    there is this standard, but it didn't work for me.
    ... You need to be aware - we need to try and solve a problem.
    DC tries to look at common problems, CERIF tries to go into
    depth
    ... We spent most time trying to solve common problems. DC and
    DCAT start simple and then people complain that things are
    missing
    ... You can extend
    ... You come up with different requirements and you soon come
    up with 50 properties that are never used
MF: I agree. The general approach
    of DCAT has its benefits
    ... But there is important data missing when we handle
    datasets. For e.g. more specific prov info
    ... Basic pattern of catalogue, dataset and distribution has
    prevailed
    ... But we should look at how to improve DCAT and that's why
    we're here
AG: In HCLS we didn't want to come up with a standard, just a profile that used existing ones
PeterW: I find lots of people
    talking about different metadata frameworks but less about the
    data that goes into them
    ... Some illustrations of marked up stuff. If I have a dataset,
    what are the frameworks that match the pattern of the data that
    I have
    ... Maybe ML techniques can be used.
Keith: The papers have more. The RDA has a metadata standards group that is making a list of the available metadata schemes. Lots of work coming from Digital Curation Centre (see Kevin Ashley)
Q: Automatic machine readable to access the data itself, not just a URL
MF: Yes, this is a task for
    us
    ... This problem came up a lot
    ... I'm hoping for insights from other directions
Q: Accessing satellite data for e.g. you need to restrict the access to specific subsets.
scribe: There are REST APIs like Swagger, but there's no predefined method
CM: Lots of approaches for
    describing services on the Web, but haven't had a lot of impact
    for some reason
    ... Maybe because they introduce complexity
nandana: Hydra CG is in that direction
Q: But that's very restricted to REST.
AndreaPerego: We have a bar camp on this specific topic :-)
MF: It's a big issue. We're dealing with datasets usually, not endpoints
phila: Talks about subsetting issue Open Search etc.
Keith: You can't get into the data because it's too big so you don't know what to ask for
AndreaPerego: Talks about different levels that can be addressed. Need to include users
Keith: Geonetwork allows you to peek into the data to see if you're in the right area
CM: Working with industrial partners - tooling is very important. If you have a schema, you need the partners to tell you the detail you need
AG: We developed a very specific
    tool that was user-driven
    ... focussed on user-friendliness so it's not easily
    transferrable
<danbri> :)
<PWinstanley> could panel members please talk to the room rather than just among themselves - it's not easy to hear them without PA systems
<danbri> there's a microphone on the smaller desk, is it not wired up?
<PWinstanley> @danbri: it is needed at the larger table for the group discussion
Call for greater clarity in some of the DCAT definitions. Also guidance, perhaps a primer. Take various national APs as input
<PWinstanley> @phila: https://lists.w3.org/Archives/Public/public-dwbp-wg/2015Jul/att-0010/DCAT-APimplementationguide.pdf needs to be updated
<danbri> can I respond to the google/schema question?
<danbri> ok will respond later
Dee is in charge, not me :-)
Discussion around data that is rarely, if ever, published in versions
Need to handle data that changes all the time (real time data etc.)
<AxelPolleres> +1 volatile/dynamic datasets probably need different metadata than “slower changing” datasets… where more versioning vocab is an issue.
<PWinstanley> mutable vs immutable datasets is relevant information
<AxelPolleres> there are different forms of “mutable”, e.g. (monotone) growing vs. actually changing… is that reflected in any of the existing vocabs?
<PWinstanley> @AxelPolleres: yes, that's my point
<AxelPolleres> for us (use case crawling and tracking changes/evolution) it would be very(!) useful if these were advertised.
<jrvosse> Daniele Bailo: what are the boundaries of DCAT?
<jrvosse> Andreas Kuckartz: DCAT seems less useful for describing binary programs
<jrvosse> Makx: Some people in the WG see DCAT as very general that can describe many things
<newton> PWinstanley and AxelPolleres it's a real issue, we had some discussions about it during DWBP meetings
<newton> I would like to see this addressed on the charter of a new WG
<AxelPolleres> newton, are you aware of any vocabs that actually define this difference? i.e., monotone growth vs. arbitrary changes, changeFrequency, growthRate, etc.?
<antoine> newton, Axel: there used to be a vocabulary for 'accrual policies' at DC. Maybe not the right granularity though.
<jrvosse> Linda van den Brink from Geonovum on geospatial data
<jrvosse> ... a key problem is that people from outside the geo domain do not understand the standards we use
<jrvosse> * I'm scribing but feel free to add
<scribe> scribe: Jacco
<scribe> scribeNick: jrvosse
Linda is discussing a testbed testing use of mappings in the context of geoDCAT (see https://joinup.ec.europa.eu/node/154143/) and schema.org
see slides for testbed report
Q: Jacco: what do you think the key mission of a new WG be?
A: Linda: Small core of a standard; for SDI, coverage is really key; quality is also very important
Q: Phil: is something like the dcterms spatial concept core?
A: Linda: yes
Q: Daniele Bailo: Is the loss of data in the mappings really an issue for end users on the web?
A: Linda: Maybe not, for discovery it may not be a problem. There are levels of importance
Andrea Perego on GeoDCAT-AP
GeoDCAT-AP not replacing existing standards such as INSPIRE or ISO 19115 metadata for spatial, but providing extra interoperability by providing RDF-binding
scribe: need for http conneg on
    profiles/schemas not just on format
    ... need to model dataset distributions, distinguish data sets
    from data APIs
    ... need for best practices for quality-related descriptions,
    there are too many patterns/standards
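A minimal sketch of what negotiating on profile as well as format could look like on the server side. Everything here is an assumption for illustration: the registry, the example.org profile IRIs, the file names and the `negotiate` helper are hypothetical, and no particular HTTP header name is implied.

```python
# Hypothetical registry of available representations:
# (media type, profile IRI) -> file to serve.
AVAILABLE = {
    ("text/turtle", "http://example.org/profile/dcat-ap"): "dataset.dcat-ap.ttl",
    ("text/turtle", "http://example.org/profile/geodcat-ap"): "dataset.geodcat-ap.ttl",
    ("application/xml", "http://example.org/profile/iso-19115"): "dataset.iso19115.xml",
}


def negotiate(accept, accept_profile):
    """Pick a representation matching both preference lists.

    `accept` lists media types and `accept_profile` lists profile IRIs,
    each ordered by client preference. In this sketch the profile
    preference wins over the format preference; that is a design choice,
    not something taken from the discussion.
    """
    for profile in accept_profile:
        for media_type in accept:
            key = (media_type, profile)
            if key in AVAILABLE:
                return AVAILABLE[key]
    return None  # nothing acceptable; a server might answer 406 here
```

The point of the sketch is that "text/turtle" alone does not tell the client whether it gets DCAT-AP or GeoDCAT-AP; the profile has to be part of the request.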
Q: Keith Jeffrey: need spatial coordinates both for what is observed and from where, what are your thoughts?
A: Yes, this is a difficult problem, also in crowdsource context and other contexts, but is not addressed at the metadata level, more at the level of the features
Q: Herbert: New NISO spec "ResourceSync" from those that made PMH, but more "webby"
Herbert: I'm involved in signposting.org which is also relevant
Otakar Čerba joins panel
Otakar Čerba I'm here because we are developing a smart points of interest RDF dataset with 120M POI published, incl via a SPARQL endpoint
Daniele Bailo joins the panel
Daniele Bailo represents the EPOS geo ESFRI with lots of geospatial data
scribe: with many different types
    of data, this needs to be reflected in the metadata
    ... need to think about who the audience is: general web users
    vs scientists from specific domains?
Q Bart: is just getting the metadata currently not too complicated already?
<AxelPolleres> remark Re: dataset vs service description - this is also to some extent related to the issue we mentioned before some time up in the chat about fast-changing/highly dynamic data (which may be rather seen as a service than a dataset)
A Andrea: Yes, for the general public ISO may be too much, especially if it is just for discovery purposes
scribe: for us, a dataset is what you decided to call a dataset
Linda: I see DCAT as something for portals to find and reuse each other's data sets, not necessarily as something for the end user
Daniele: I know the scientific user relatively well; they typically do not want general web search. The "web user" could be a software agent or a human user.
Otakar: same experience, users often do not use metadata. We have many Czech data portals but few real users
Andrea: my students use Google also because they do not know where the data is, this also makes it important to publish data on the Web
Daniele: I agree, but I'm trying to understand the requirements for doing so. In my community people tend not to use persistent IDs or even URLs. This is a challenge.
Bart: high level conclusion could be that there is too much info from the data in the metadata
Otakar: we also need feature metadata in the geo spatial domain
Andrea: data quality is more general than just spatial, and solutions can be reused for other domains
Linda: spatial coverage is key for first discovery step, use of all other quality and prov metadata is part of a second step
Daniele: what is needed on the scientific side is a huge effort on data and metadata harmonisation
<deirdrelee> scribe: deirdrelee
Show Me The Way session
Searching for data session
Dmytro Potiekhin
CivicOS: Governance & Campaigning Data Standard
dmytro: worked in ukraine
    ... working with civil society and citizens is very
    important when there is a danger of falsification at
    elections
    ... integrating data is an important issue to protect
    democracy
    ... it is obvious w/out a proper vocabulary describing needs of
    civil society, this is impossible
    ... secondly, it is impossible to create such a vocabulary from
    top-down approach
    ... e.g. even with vocabularies that the european commission
    are working on
    ... this vocab or set of interoperability vocabs must be demand
    driven
    ... something that is accepted by the citizens
    ... this is CivicOS
    ... if we can unite efforts around development of such
    vocabularies, I am glad to help and this is what I am trying to
    do with colleagues
    ... e.g. i am collaborating with the Stanford ??
    Institute
    ... this is not just a problem for Ukrainians, but it is a
    global problem
    ... my final request would be, not to just give everything to
    the governments.
    ... in democratic societies, it is okay for governments to have
    all this technology, etc.
    ... but in countries still fighting for democracy, this can be
    a problem
    ... for an example, there are often petitions to put pressure
    on governments. but if the petition is done by the governments,
    it is bureaucratic. and it is also giving them a contact list
    of people that disagree with them
    ... undermining civil society and what they are trying to
    achieve
    ... i encourage to keep developing vocabularies, but also to
    retain the activation and development of civil society
kevin: questions?
    ... the situation at the moment isn't ideal for discovery and
    interoperability of data, for the use-case you are talking
    about - empowering citizens
dmytro: for the commercial part
    it is working great, e.g. flight information automatically
    added to Google Calendar
    ... so standards are already working, but this needs to be
    brought to our community
    ... in egovernment, we see this too. but we see a trend to
    focus on egovernment, and not on egovernance or
    ecivilsociety
    ... these platforms should be controlled by civil society, not
    by dictators
    ... even if personal identity issues are resolved, there will
    be interoperability issues
    ... and this integration should not be less successful than
    government or commercial sectors
    ... we need to apply these commercial standards in the
    government and civil society sectors
phila: you mentioned you wanted to integrate with schema.org
dmytro: we are experienced in the
    structures of what makes civil society work
    ... but these are not described in schema.org
    ... e.g. we have a list of 200 different types of non-violent
    actions
    ... the leading vocabularies only document about 5
    ... we would like to collaborate on the development of
    vocabularies and how to incorporate into schema
danbri: you can just go ahead and
    develop a vocabulary. we have built extensions that facilitate
    that
    ... there are some generic descriptions that could potentially
    go in schema.org core, and for more detailed terms, an extension
    might be best
    ... but happy to chat
attendee: there is some similar
    work being done in the US, by beth novack, called ???
    ... this can have an impact and is similar to what you were
    talking about
<danbri> cf huridocs for human rights documentation
danbri: we were actually
    discussing schema.org and the documentation of hate crimes last
    week
    ... happy to continue discussions
Raf Buyle
Raf: representing the Flemish
    Government
    ... we believe public services should be centered around
    citizens and businesses
    ... today if you ask for info online about opening times and
    location of a public building you get it
    ... we think this should go further, e.g. providing info on
    using services
    ... need to link to base registries
    ... Flemish government is working on a strategy to add markup to
    government portals
    ... we have seen success with schema.org, etc. this can be a
    bridge between public and private sectors
    ... the citizen wants to find the info on the public service
    they want, regardless of public body providing it
    ... we are looking at using and extending open standards, e.g.
    from W3C, ISA, OGC, etc.
    ... the European Interoperability Framework states that you
    should look at all layers of interoperability, semantic,
    technical, etc
    ... base registries are fundamental, but it is very difficult
    to get this data on the web, to integrate it with the private
    sector
    ... imagine if we could ask private company, like google, about
    public services. where you could make an appointment, all the
    information at a user's fingertips
    ... bridging between public and private sectors. schema.org is
    working very well. it has been widely adopted
    ... this could be a strategy to get public services information
    out there
    ... schema.org was first to discover data, but it is also used
    for new data services, e.g. bing and google knowledge
    graph
    ... we have a pilot [slide with architecture diagram]
    ... we would like to combine schema.org with ISA core
    vocabularies
    ... rdfs:seeAlso pointing from a schema.org resource to an ISA
    core voc resource shows that more info is available
    ... we are waiting to roll this out on local and regional
    level
    ... on the one hand, we are saying it is not difficult to
    annotate data in this way
    ... we also want to see if these annotations are picked up by
    major search engines
    ... and also interested in seeing if the search engines will
    pick up the extra ISA core voc info and display that as
    well
    ... i have some questions [FEEDBACK slide with questions]
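A minimal sketch of the linking pattern Raf describes: a schema.org description of a public service carrying an rdfs:seeAlso link to a richer ISA Core Vocabulary description. All example.org IRIs and the service name are hypothetical, and the sketch does not reproduce the Flemish pilot itself; it only builds a JSON-LD document that could be embedded in a portal page.

```python
import json

# Hypothetical IRIs throughout; only the rdfs:seeAlso pattern comes from the talk.
service = {
    "@context": {
        "@vocab": "http://schema.org/",
        "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
        # Treat the value of seeAlso as an IRI, not as a plain string.
        "seeAlso": {"@id": "rdfs:seeAlso", "@type": "@id"},
    },
    "@id": "http://example.org/service/parking-permit",
    "@type": "GovernmentService",
    "name": "Parking permit",
    # Pointer to the fuller description expressed with the ISA core vocabularies.
    "seeAlso": "http://example.org/core-voc/parking-permit",
}

# Serialised markup, e.g. for a <script type="application/ld+json"> element.
markup = json.dumps(service, indent=2)
```

A search engine that only understands schema.org can still use the name and type, while a consumer that follows the seeAlso link can fetch the more detailed description.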
kevin: questions?
PWinstanley: what kind of mechanisms can we use to avoid false information getting into system?
Raf: i talked about a feedback loop. perhaps there could be a validation check comparing the original data and data being presented
Luis-Daniel Ibáñez
Luis-Daniel: For better data
    search
    ... we carried out an analysis of data searches by talking to
    data professionals and analysing logs from data portals
    ... [reads feedback from interviews - quotes from data
    professionals]
    ... we found a lot of things that we discussed in previous
    talks
    ... something maybe to highlight is users asking for a
    summary/preview of data
    ... with quantitative results, mainly desktop devices,
    etc....
    ... 68% of queries came from web search engines, suggesting
    that data search is a work-related activity and people are
    relying on general-purpose search engines
    ... is this because people use what they know or data portals
    are not doing their job properly? still open question for
    us
    ... query characteristics show exploratory search, e.g. 'crime'
    - show me all crime data, not specific query
Artemis Lavasa
Artemis: our aim is to capture,
    analyse and preserve data
    ... we need to preserve the tools, processing steps, etc. we
    capture everything
    ... we want to have as much context as possible so that we can
    recreate that in future
    ... we capture all that information via our forms
    ... we describe our information using a json-based schema
    ... it can handle complex metadata, which we have
    ... the data capture forms are rich, so can be very long and
    vary a lot from experiment to experiment
    ... [showing slide of example metadata]
    ... we work closely with physicists and calibrate them
    according to their needs
    ... in order to facilitate search, we need this metadata. e.g.
    a physicist might want to look at a particular particle, so
    looking at the title of the metadata is not sufficient
    ... we need intelligent search, very precise
    ... we played around with schema.org and json-ld. we could
    describe the high-level information, but not specialised
    fields.
    ... we would like to use a standardised approach
    ... i have tried to harmonise the schemas we have, but 80% of
    fields are something unique to what a physicist wanted
Alejandra Gonzalez-Beltran
Alejandra: project funded by NIH
    in the US
    ... DATS DatA Tag Suite is used to index data sources in
    datamed
    ... [slide with online links to work]
    ... we focus on the findability and accessibility of
    datasets
    ... we rely on adoption by data providers
    ... we started by collecting lots of use-cases from the
    community and by looking at existing schemas
    ... we considered multiple existing models, e.g. schema.org,
    datacite, rif-cs, hcls, dcat, etc
    ... these models are lacking some elements in use-cases
    ... we also looked at domain-specific models from biomed
    domain
    ... the DATS model is a combination of elements we needed
    ... we split the model into core entities (adopted elements
    from datacite and Force) and extended entities
    ... we did a mapping to schema.org and looking at elixir
    ... there are adopters of DATS, implementing it in their
    systems
    ... i would like to thank groups that were involved
Richard Nagelmaeker
Richard: I would like to pitch an
    idea to you
    ... when i started with Linked Data, the idea was to put all
    data in one triple store
    ... the internet has DNS
    ... [shows slide with diagram]
    ... there is data that as an organisation you have control
    over, and can have IRIs part of big picture
    ... but there is also information that as an organisation you
    want to know, but is external to the organisation
    ... e.g. customers, suppliers, etc. but as they are external
    they will have different IRIs
the issue is that the data behind a SPARQL endpoint will always contain the IRIs of the domain of the endpoint
scribe: DNS cannot help us
    ... but the problem is similar to what DNS solves, so could it
    potentially help us find Linked Data IRIs?
    ... there are a number of building blocks, e.g. triple stores,
    sparql endpoints, VoID
    ... results slide..... resolves the discrepancy between dataset
    IRIs and IRIs of a SPARQL endpoint
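A minimal sketch of the DNS-like lookup idea, assuming each dataset publishes a VoID-style record with the IRI prefix it covers (void:uriSpace) and where it can be queried (void:sparqlEndpoint). The records, hostnames and the `find_endpoint` helper are hypothetical; this is not Richard's implementation, only one way to picture resolving a dataset IRI to an endpoint hosted elsewhere.

```python
# Hypothetical VoID-style records: the IRIs a dataset uses may live on a
# different domain from the SPARQL endpoint that serves them.
VOID_RECORDS = [
    {
        "void:uriSpace": "http://supplier.example.com/id/",
        "void:sparqlEndpoint": "http://data.example.org/supplier/sparql",
    },
    {
        "void:uriSpace": "http://supplier.example.com/id/product/",
        "void:sparqlEndpoint": "http://data.example.org/products/sparql",
    },
    {
        "void:uriSpace": "http://customer.example.net/",
        "void:sparqlEndpoint": "http://data.example.org/customer/sparql",
    },
]


def find_endpoint(iri):
    """Return the endpoint whose uriSpace is the longest prefix of `iri`,
    or None if no record covers it (longest match, as in a routing table)."""
    best = None
    for record in VOID_RECORDS:
        space = record["void:uriSpace"]
        if iri.startswith(space):
            if best is None or len(space) > len(best["void:uriSpace"]):
                best = record
    return best["void:sparqlEndpoint"] if best else None
```

Longest-prefix matching is used so that a more specific dataset (products) can override a general one (everything under the supplier's id space).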
kevin: what evidence do we have
    that any of the efforts we've been talking about today will
    help people find the data they want?
    ... if we don't have evidence, how can we get it?
richard: just do it!
kevin: instead of 'let's build it and see what happens': with the web, once something is implemented, how can you measure before?
PWinstanley: there was a project in 2004 on bioinformatics [Cancer Bioinformatics Grid -- caBIG] that disappeared. Were the lessons learned from that picked up by alejandra's project?
alejandra: there was a heavy
    load, you had to build uml model, tag with ontologies, etc. I
    think the lesson learned is that there is a more 'webby'
    approach, lighter, easier
    ... at least in biomedical databases, there is a lot of effort
    in curation. many databases already have ways to find
    data
    ... hopefully we will help people search across databases
kevin: luis, you have looked at how users are actually behaving on open data portals
luis: one observation, for the qualitative part we were with data experts, but with the quantitative part, it was open to all users. those that just wanted an answer, not necessarily 'data'
danbri: there are two very
    different paths, one making data available to billions of
    people, and making data available to the tiny minority of
    people who want to analyse specific data
    ... both are important and can have huge impact, but very
    different.
    ... ultimately, we want computer/google that knows the
    information, not just the data file
attendee: for luis' presentations, the one-word searches might be more related to people just finding an answer, not that there is structured data behind them
luis: we also know what people actually click on, not just search
Andreas Kuckartz: will technologies like sparql still play a role in ten years?
richard: it depends how you look
    at IT. I think IT is a tool to help people. in this way i think
    sparql will be there
    ... the way i look at neural networks, they are trying to do
    something by themselves, this is a different kind of IT
Raf: if you can look at rdf and sparql, i think these are approaches, moreso than technology
Artemis: i think in one way or another we all use rdf, so if not in this form it will survive in some form
luis: i think neural networks
    will learn how to use rdf
    ... but the big question...will neural networks replace us
    all!
danbri: sparql is a very practical technology, and practical technologies tend to stick around. I'm sure it'll be seen as a tool for using data, like sql and purl. but AI might increase more and more
kevin: will it be difficult to enrich data?
luis: it is important to know what has been done to the data, for example with crime data if the data was anonymised on purpose, should there be an effort to uncover the data that was removed?
alejandra: whatever the data is,
    what we care about is finding patterns in the data ...
    ... [question to luis] because you were looking at user search,
    were you constrained by keyword
luis: we wanted to see if people asked questions or used keyword search
phila: In Raf's case, data is
    relevant to everyone in Flanders (public) and Artemis' case is
    relevant to very specialised physicists
    ... danbri said that csv data can be incorporated into google's
    knowledge graph using csv on the web
Artemis: there is a CERN data
    portal, with huge data releases - TBs and PBs of data
    ... there is also private data, meant for collaborations
    ... the analysis is for specific purposes, people wanted to
    preserve this, but it is very sensitive, it won't be opened.
    most people also won't be interested in this data
    ... aim was to help physicists preserve their analysis. that
    was the demand
Raf: why are public services data
    important?
    ... 1. if I want to move to flanders
    ... and want to set up a business
2. for business intelligence - e.g. you can compare how flanders compares to other regions, e.g. for a place to live
scribe: if this data is on the Web, more people can use these public services, lowering the barriers
danbri: a good thing from private orgs is that even if they can't release the data, you can release software
kevin: and also release info about the data that is there, so that people can follow up on potential data access
BartvanLeeuwen: Raf, you asked is
    'annotated data the new dataset'?
    ... so danbri is this something that will be possible
danbri: you can do
    ?productname?
    ... there will also be dataset search, there is a page online
    already, will distribute
    ... we are looking at research data, data portals, we'll see
    what we can build
Raf: a lot of portals have feedback channels. should schema.org incorporate that?
danbri: maybe. schema.org is a dictionary, you have to build things with it. we have reviews, rating, etc
Attendee: Describing datasets properly is a problem. also the problem is mapping the questions from natural language to sparql for example. A lot of problems around data discovery relates to where the data is and what kind of data there is. Any comments on natural language to formal queries?
Alejandra: there is one pilot project who are looking into this question of how the user can find datasets
danbri: we did put
    question/answer in schema.org, e.g. stack overflow. and
    researchers are starting to pick up on that
    ... i would hope there would be more focus on social aspects of
    open data portals, which could in turn help discoverability
attendee: What mechanisms can
    address general search but also very focused search?
    ... how does this affect reproducibility
Raf: we combine schema.org at a general level for discoverability, which we combine with the core vocabularies which help with the specifics
alejandra: in the curation
    practices, generically it is very important to consider this.
    it is very relevant to know higher level terms and more
    specific terms
    ... for reproducibility it goes much further, you not only need
    the discoverability metadata, but also how the data was
    prepared, etc.
danbri: we recently added a field in schema.org variable
luis: to me there is the dataset
    search levels in metadata
    ... but to answer a more detailed question, you have to go
    deeper
    ... what is the effort involved in adding this to metadata
AndreaPerego: another piece of
    information on helping to find the data is how the data is
    being used
    ... e.g. feedback from users, this is important data
    ... datasets have been used for purposes other than their
    original purpose
    ... people can see how other people have used the data and it
    might help them decide if it's useful for them
Raf: if you knew information
    about when people physically go to public services, this could
    help advise when people should go
    ... info on how public services are used could help improve the
    service provided
luis: i agree, it's important to know how data is being used, but it's difficult to convince users of this
alejandra: something very important is data citations. it is great to have it, but it has limitations
danbri: when datasets are used
    and discovered, they can go to their funders and justify the
    availability of data
    ... data citation in the scholarly sector is done, but it is
    not common for example in media
    ... this might be a turning point
alejandra: it is also important to have contact information
<newton> wonders if W3C WebMention spec could help with this issue of citation and data usage [ https://www.w3.org/TR/webmention/ ]
kevin: it is difficult to measure
    if we had implemented something differently, how would impact
    be different
    ... guidelines like the w3c dwbp and csv on the web have been
    referenced a lot today, they're obviously very useful for the
    community
Time for wine and canapes!!!
<newton> +1 Dee
Scribes: phila, Jacco, deirdrelee
ScribeNicks: phila, jrvosse, deirdrelee
Present: PWinstanley AndreaPerego nandana phila newton Ivan BartvanLeeuwen Caroline_ brandon damires LarsG Caroline
Agenda: https://www.w3.org/2016/11/sdsvoc/agenda
Date: 30 Nov 2016
Minutes: http://www.w3.org/2016/11/30-sdsvoc-minutes.html