Government Linked Data Working Group Teleconference

31 May 2012

See also: IRC log


Bernadette, Sandro, Michael_H., John_E, George, Mike_P., Boris, John_Barker_(Wolters_Kluwer), Dave, Richard_C, Deirdre, Martin, gatemezi, MacTed, Tina_G.


<trackbot> Date: 31 May 2012

<olyerickson> Joining shortly...

<bhyland> Today's agenda: http://www.w3.org/2011/gld/wiki/Meetings:Telecon20120531

<bhyland> Guest speaker today is John Barker, see bio http://solutions.wolterskluwer.com/blog/author/john/

<olyerickson> mute me.

<olyerickson> mute me

<boris> http://www.kit.edu/index.php

<olyerickson> @bhyland: How's it going?

<olyerickson> 12 people and a robot named Zakim...

<sandro> [ sorry -- last-minute regrets from me. ]

<sandro> [ I think. ]

<bhyland> Propose to accept: http://www.w3.org/2011/gld/wiki/Meetings:Telecon20120524

<mhausenblas> +1

Propose to accept minutes?

<George> +1

<boris> +1

<gatemezi> +1

Minutes accepted

<MartinKaltenb> +1

<scribe> Scribe: Mike_Pendleton

<olyerickson> No. the student we have doing the PROV work is too busy doing the PROV work to talk to me :(

Bernadette introduced John Barker, Wolters Kluwer

<gatemezi> URL for Wolters Kluwer: http://solutions.wolterskluwer.com/blog/author/john/

<olyerickson> John Barker's related blog post: http://bit.ly/JPaUDW

John leading interactive conversation; open to separate dialog later

<olyerickson> Wolters Kluwer intro

John B: WK is a Dutch company, operations in 35 countries; provide content, now software combined with content; take public domain data, classify, enrich and publish it for customers.

scribe: supplier of data to government and consumer of government data
... helped US publish IRS code 100 years ago
... topical classification scheme used by multiple governments

<olyerickson> Q: John is saying these silos would be "ideally" linked, but are not necessarily linked (today)

<olyerickson> Q: Does Wolters Kluwer have its own identification scheme? Does it leverage existing LEIs?

scribe: legislation is comprehensively linked to exec branch regulations; WK collects data from government; glad US is offering free content and investing in more metadata - leads to better outcomes for WK customers. Would like government branches to agree on standards on how different sources of law are referenced.

<bhyland> WK = Wolters Kluwer

scribe: WK's desire for standards extends beyond the U.S. and includes China. WK wants a time-based element in metadata. Research effort is spent on determining what version of a statute is being used.

<MartinKaltenb> the mentioned project with WKD (Wolters Kluwer Germany) is LOD2: http://lod2.eu

<olyerickson> Q: What does "working with" mean, esp what additional metadata is being suggested?

<mhausenblas> Michael: FYI - we're working with Christian Dirschl in the LOD2 project http://lod2.eu/

<MartinKaltenb> Hi Michael - parallel thinking...

<olyerickson> Q: How is the described work different than e.g. DataCube?

scribe: Working with Open Knowledge Foundation; trying to help governments ID metadata that should be included with public domain data; how best to semantically enable data for consumption; moving increasingly to making knowledge structures available in the cloud

<bhyland> … Legislation, e.g., HIPAA and Dodd-Frank, gets amended. The provisions retain the same citation; however, we don't know which *timeslice* the agency or court was referring to.

<bhyland> … WK can provide more value to customers. WK can help define what metadata on public domain data to semantic model for consumption, in decision trees, intelligent charts, etc.

<bhyland> … WK is moving to cloud. Looking to make available topical indices, thesauri, and other classification resources.
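
[Editorial illustration] The *timeslice* problem described above can be sketched as a point-in-time lookup: a citation stays fixed while its wording changes, so each version carries an effective date. The citation, dates, wording, and the names `ProvisionVersion` and `version_in_force` are hypothetical, chosen only for illustration; this is not WK's data model.

```python
from bisect import bisect_right
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class ProvisionVersion:
    effective_from: date  # first day this wording was in force
    text: str


def version_in_force(versions, on):
    """Return the version in force on date `on`, or None if none had taken effect.

    `versions` must be sorted by effective_from, oldest first.
    """
    starts = [v.effective_from for v in versions]
    index = bisect_right(starts, on)
    return versions[index - 1] if index else None


# Hypothetical citation and wording, for illustration only.
history = {
    "Example Act s. 10": [
        ProvisionVersion(date(2000, 1, 1), "original wording"),
        ProvisionVersion(date(2010, 7, 1), "amended wording"),
    ]
}

# A ruling dated between the two versions refers to the earlier wording.
ruling_date = date(2005, 3, 15)
cited = version_in_force(history["Example Act s. 10"], ruling_date)
print(cited.text)
```

Without an effective date in the published metadata, the lookup above has nothing to key on, which is the gap John describes.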

<gatemezi> Q: which ontologies have they developed in WK? Are they public?

<tinagheen> +1 to question from gatemezi

<olyerickson> unmute me.

<bhyland> JohnBarker: "We would like to see gov'ts globally standardize metadata. Make everything comprehensively linked. WK would then rededicate their editors to higher value activities."

scribe: Question - are your ontologies public?

<bhyland> JohnBarker: "If that doesn't happen, we'll have Gov't agencies continue to purchase content from WK because it is more searchable."

WK Germany and WK Netherlands make some available; Christian Dirschl is the person to talk to. U.S. domain ontologies are not available, need partners for that.

<MartinKaltenb> WKD will publish 2 SKOS thesauri on places of trial and employment law as open data in about 2-4 weeks as far as I know

<MartinKaltenb> Christian Dirschl

<bhyland> Christian Dirschl (WK) is point of contact working with Open Knowledge Foundation on publishing ontologies as LOD.

<bhyland> Christian Dirschl

<bhyland> Content Architect, Head of Content Strategy

<bhyland> Wolters Kluwer Germany

<olyerickson> http://solutions.wolterskluwer.com/blog/author/christian-dirschl/

Bernadette question - how to motivate gov't; how do you make your data available to gov't agencies?

<bhyland> JohnBarker on why gov't agencies use WK content?

<bhyland> * Gov has discarded their content, they haven't archived.

<bhyland> * Gov agency may request content via email or facsimile. Say an agency lawyer.

John: Gov't agencies in some cases have discarded archives, lawyers and regulatory staff sometimes need it; request it from WK
... Gov't agencies sometimes aggregate statutes and make them available on their website (e.g., SEC) ; others don't
... there is no comprehensive aggregation of cases by gov't; for example, courts don't always follow a standard when referencing regulations. WK invests effort to address this. We arrange sources of law by relative authority by topic and explain how they inter-relate.
... If there were links created between content by gov't, that would help; this is why gov'ts rely on WK;

<bhyland> … If gov't would LINK them, then WK could just focus on commentary & explanations.

<bhyland> WK has lawyers & CPAs spending their time FINDING content rather than using their expertise to write commentary and explanations.

John: for example, the fed might want to know about legal/regulatory innovations in California or another state. Gov'ts want to know about what other gov'ts are doing in how things are regulated. WK aggregates a cross jurisdictional view - 'smart charts' that help answer questions
... the answers to specific questions are organized by jurisdiction, which gives instant answers to questions like 'what are the privacy regulations' - they differ by jurisdiction.
... if governments would put more effort in the metadata, many others would save $ and greater value would be derived.

John E Question: I work at RPI on LOD gov't work, several things you touched on are of interest - linking together datasets - legal entity IDs for different datasets; and problem of regs that specify regulatory requirements; and access to other data silos

<olyerickson> olyerickson also mentioned the OrgPedia project (Beth Noveck, Jim Hendler, et.al.)

<olyerickson> mute me

<bhyland> JohnBarker: "What is the law on privacy, e.g., health records, information on a customer? They differ in every jurisdiction. There is a lot of open government data available, however it has to be classified, linked, annotated and distributed in a reusable manner."

John: There are business rules that go with metadata entry. If a piece of content is labeled as a regulation, there is a certain context that goes with that regulation. Would like a common vocab across all data, jurisdictions and governments, plus business rules, so that links convey the relative authority of each regulation.

<bhyland> … Everyone benefits if the gov't spends tax dollars to make data available using standard vocabularies and best practices, especially around provenance, URI policy & services, standard vocabularies for tax, legal & regulation. Ideally, a common vocabulary across all jurisdictions ... Also, business rules that are linked expressing relative authority of a piece of data (e.g., case law, regulation, etc)

<olyerickson> unmute me

John: some users are bound by a case, others not, based on relative authority

<olyerickson> unmute me

<olyerickson> never mind

John E Question: Orgpedia is operating to help aggregate open source corporate data to help gov't to get access to data which is not normally used by their agency but is by others. Hinges on exposing identifiers and linking them.

<bhyland> olyerickson: "First use cases are directed at government, people in agencies needing to use inter-government agency content. Requires exposing identifiers and linking them ... none is commonly available. You have to troll to find out what is available.

scribe: none of this is commonly available; in the open space, it is an arduous task. There are a lot of inconsistencies between gov't agencies as to how companies are identified; disambiguation is hard, and is what SW was created for.

<olyerickson> mute me

<olyerickson> bhyland: No more questions for this witness ;)

John: Standards are important; USC is inconsistent; some titles change the order in which parts appear. WK would like standardization; but challenge is how standardization is maintained and persistence assured. WK used query transformation to address this. Makes the query match the data that exists. WK also adds metadata, markup.
... a third thing is when they are unable to change content in a meaningful way, we have human editors create connections.

<olyerickson> JohnBarker: creating "hybrid content business solutions..." (did I hear that right?)

John: WK creates hybrid content software solutions - drive workflow steps in a software application. Requires accurate info - human layer needed; can't rely on content logic yet

<bhyland> … WK addresses content as follows:

<bhyland> 1) Query transformation, synonyms, root extensions, to make the query match the data that exists.

<bhyland> 2) Enrich content with metadata, fix errors, add markup

<bhyland> 3) If we cannot delegate to the search engine, we have human editors manually create a connection. There are examples where administrative rulings don't reference Code.
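
[Editorial illustration] Step 1 above (query transformation with synonyms and root extensions) can be sketched roughly as follows. The synonym table, the naive plural stripping used in place of a real stemmer, and the names `root` and `expand_query` are assumptions made for illustration; they do not describe WK's actual system.

```python
import re

# Hypothetical synonym table; a real one would come from an editorial thesaurus.
SYNONYMS = {
    "statute": {"act", "law"},
    "rule": {"regulation"},
}


def root(term):
    """Naive stand-in for stemming: drop a single trailing plural 's'."""
    if term.endswith("s") and not term.endswith("ss") and len(term) > 3:
        return term[:-1]
    return term


def expand_query(query):
    """Map each query term to the set of alternative terms to search for."""
    expanded = {}
    for term in re.findall(r"[a-z]+", query.lower()):
        base = root(term)
        # Search for the term as typed, its root, and any known synonyms.
        expanded[term] = {term, base} | SYNONYMS.get(base, set())
    return expanded


print(expand_query("privacy rules"))
```

The point of the sketch is that the query is bent to fit the data that exists, rather than the data being rewritten; steps 2 and 3 cover the cases where that is not enough.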

<bhyland> … Hybrid content software solutions, where content generates business rules. That drives workflow steps in a software application. We have to have highly accurate content to codify it. A human today must interpret the content that in turn a s/w developer has to put into code. We cannot semantically model the content yet.

<olyerickson> Very concerned about the proprietary nature of e.g. their ontologies (the schemae that make it all sensible)

<olyerickson> we need not only linked open govt data, but linked open govt ontologies...if the vendors want $$$ to understand the data, then it's not open

Bernadette: Clear need for coordination! Web Masters need clear markup instructions. WK and others have a lot to gain with government doing this. What about WK sharing ontologies and helping to drive this?

<gatemezi> Thanks for the presentation, John!

<olyerickson> Yes, thanks JohnBarker

<mhausenblas> Indeed, thanks a lot for the presentation, John

<MartinKaltenb> many thanks John also from my side!

<bhyland> Agreed: To continue the dialog between the international professional publishing sector & gov't & int'l standards group.

<MartinKaltenb> thanks - bye

Summary of Action Items

[End of minutes]

Minutes formatted by David Booth's scribe.perl version 1.136 (CVS log)
$Date: 2012/05/31 15:04:11 $
