
Best Practices/Commercial Considerations in Open Data Portal Design

From Share-PSI EC Project


Name of the Share-PSI workshop: Lisbon Workshop

Title of the Best Practice: Commercial Considerations in Open Data Portal Design

Outline of the best practice

Data Marketplaces

While data markets certainly existed long before the advent of the internet (matching information supply and demand has driven many media revolutions, from newspapers to the telegraph), the wholesale exchange of large structured datasets is relatively new.

The Deloitte study "Market assessment of public sector information" for the UK Department for Business, Innovation and Skills identifies data marketplaces and data enrichment as important business models utilizing open data. Many early entrants have been forced to shift their focus from "statistics on speed" (timetrics) to more cautious approaches, or they have left the market altogether.


What do data marketplaces do?


  • They function as search engines for (public or exclusive) datasets, data services and API access,
  • they provide quality indication for data (often crowdsourced),
  • they facilitate the comparison of datasets,
  • they allow the download of data.


How do they do it?


They build an enormous collection of structured data through

  • automated methods,
  • editorial work and
  • crowdsourced commits and edits.


How do they earn money?


  • They charge (buyers, sellers or both) a commission on any data that is sold on the site or after referral,
  • they charge sellers for listing (akin to the yellow pages),
  • they distribute premium data for a commission or sell value-added customer services (the so-called freemium model),
  • they monetize traffic with targeted advertising (rarely),
  • they earn money through other cloud services, which become more attractive to developers through the ready availability of third party data,
  • they trade data for data or

Data enrichment providers

Data enrichers, often specialists in a narrow business area (like geomarketing), seem to fare better, but here the public data sources are often not spelled out in detail. All this may be because these business models, and the open data policies geared towards them, face a couple of dilemmas that are often overlooked.

What do data enrichment providers do?

  • They collect data necessary for a certain task
  • They enhance, refine or otherwise improve this data
  • They sell datasets or services specifically designed for the needs of a certain business


How do they do it?

They enhance large datasets through e.g.

  • Extrapolation
  • Error correction
  • Matching with (proprietary) data


How do they earn money?

  • Sale of datasets
  • Service charges
  • Sale of software that uses the data


Challenges and solutions

  1. What X is the thing that should be done to publish or reuse PSI?
  2. Why does X facilitate the publication or reuse of PSI?
  3. How can one achieve X and how can you measure or test it?


Chicken-and-egg vs. market foreclosure

Challenge: Possible market foreclosure for data brokers through public data portals.

What? Distinguish between the task of a data broker and that of an open data portal.

Why? Because open data portals are not designed for data brokerage. Rather, they should help government to streamline its own processes, provide a single, reliable point of access for government data and help manage the open data progress (e.g. prevent departments or agencies from prematurely developing their own apps).

Also because concentrating solely on optimizing and enriching today's portals with brokerage functionalities might distract from advancing the services that are technologically possible (e.g. machine-based discovery, standard access to DCAT repositories, content delivery networks).

How? Withstand pressure to come up with brokerage solutions; instead, provide services and APIs that commercial data brokers can use.
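
To illustrate what machine-based discovery against a DCAT repository could look like from a broker's side, here is a minimal sketch. It assumes the portal exposes its catalogue as compact JSON-LD using the dcat: and dct: prefixes; the sample catalogue, the example.org URLs and the function name list_distributions are hypothetical, and real portals may use other serializations or nesting.

```python
import json

# Hypothetical catalogue in a compact DCAT JSON-LD style (assumption: the
# portal uses the "dcat:" and "dct:" prefixes and nests distributions inline).
SAMPLE_CATALOG = """
{
  "@type": "dcat:Catalog",
  "dcat:dataset": [
    {
      "dct:title": "Air quality measurements",
      "dcat:distribution": [
        {"dct:format": "CSV", "dcat:downloadURL": "https://data.example.org/air.csv"},
        {"dct:format": "JSON", "dcat:downloadURL": "https://data.example.org/air.json"}
      ]
    },
    {
      "dct:title": "Public transport stops",
      "dcat:distribution": [
        {"dct:format": "CSV", "dcat:downloadURL": "https://data.example.org/stops.csv"}
      ]
    }
  ]
}
"""


def list_distributions(catalog_json, wanted_format=None):
    """Return (title, format, url) tuples for the distributions in a catalogue.

    If wanted_format is given, only distributions with that format are kept.
    """
    catalog = json.loads(catalog_json)
    found = []
    for dataset in catalog.get("dcat:dataset", []):
        title = dataset.get("dct:title", "")
        for dist in dataset.get("dcat:distribution", []):
            fmt = dist.get("dct:format", "")
            if wanted_format is None or fmt == wanted_format:
                found.append((title, fmt, dist.get("dcat:downloadURL", "")))
    return found


if __name__ == "__main__":
    for title, fmt, url in list_distributions(SAMPLE_CATALOG, wanted_format="CSV"):
        print(f"{title} [{fmt}]: {url}")
```

The point of the sketch is the division of labour: the portal only has to publish a predictable, machine-readable catalogue, while search, comparison and quality ranking can be built on top of it by commercial brokers.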


Moving the data vs. moving the computation of data

Challenge: Bandwidth issues for large datasets.

What? Start thinking about accepting computational queries and delivering results instead of data (as we already do with our search engines, only much more elaborate).

Why? Because it is much more efficient when it comes to large datasets and might also be a solution to privacy concerns as it allows more control.

How? Don't be content with today's solutions (metadata catalogues), but keep an eye on data industry developments already underway.
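
A minimal sketch of the idea of moving the computation instead of the data: the portal accepts a small declarative query and returns only the aggregated result. The rows, field names, the run_query function and the whitelist of aggregates are all hypothetical; a production service would sit behind an API and a proper query engine.

```python
from collections import defaultdict

# Hypothetical rows standing in for a dataset too large to ship in bulk.
ROWS = [
    {"district": "North", "year": 2014, "permits": 120},
    {"district": "North", "year": 2015, "permits": 135},
    {"district": "South", "year": 2014, "permits": 80},
    {"district": "South", "year": 2015, "permits": 95},
]

# Only whitelisted aggregates can be requested, which keeps the portal in
# control of what leaves the system.
ALLOWED_AGGREGATES = {
    "sum": sum,
    "min": min,
    "max": max,
    "count": len,
}


def run_query(rows, group_by, value, aggregate):
    """Group rows by one field and aggregate another; return only the result."""
    if aggregate not in ALLOWED_AGGREGATES:
        raise ValueError(f"unsupported aggregate: {aggregate}")
    groups = defaultdict(list)
    for row in rows:
        groups[row[group_by]].append(row[value])
    func = ALLOWED_AGGREGATES[aggregate]
    return {key: func(values) for key, values in groups.items()}


if __name__ == "__main__":
    # The client receives two numbers instead of the full table.
    print(run_query(ROWS, group_by="district", value="permits", aggregate="sum"))
```

Because the raw rows never leave the portal, the response size depends on the question asked rather than on the size of the dataset, and the whitelist is one place where access control could be enforced.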

Privacy vs. information density

Challenge: Overly restrictive application of privacy rules.

What? Treat privacy issues as a risk to be managed, not a yes/no dichotomy.

Why? Because privacy concerns can easily be used as a smokescreen for other motivations not to publish PSI, when privacy is really a matter of the granularity of the data.

How? Keep in mind cultural differences. Not everything that is considered possible in one country is feasible in another. Include a prohibition of re-identification in the conditions of use, or even make re-identification a criminal offence.
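
The link between privacy risk and granularity can be made concrete with small-cell suppression, one common way of managing disclosure risk. The sketch below is illustrative only: the records, the field names, the publish_counts function and the threshold of 3 are assumptions, and an appropriate threshold depends on the dataset and the legal context.

```python
from collections import Counter

# Hypothetical records; each one stands for a person-level entry.
RECORDS = [
    {"postcode": "01129", "district": "A"},
    {"postcode": "01129", "district": "A"},
    {"postcode": "01129", "district": "A"},
    {"postcode": "01129", "district": "A"},
    {"postcode": "01127", "district": "A"},
    {"postcode": "01127", "district": "A"},
    {"postcode": "01069", "district": "B"},
    {"postcode": "01069", "district": "B"},
    {"postcode": "01069", "district": "B"},
]


def publish_counts(records, key, min_cell_size=3):
    """Count records per value of `key`; suppress cells below min_cell_size.

    Suppressed cells are returned as None instead of their true count.
    """
    counts = Counter(record[key] for record in records)
    return {
        value: (n if n >= min_cell_size else None)
        for value, n in counts.items()
    }


if __name__ == "__main__":
    # Fine granularity: the small postcode cell has to be suppressed.
    print(publish_counts(RECORDS, "postcode"))
    # Coarser granularity: every cell is large enough to publish.
    print(publish_counts(RECORDS, "district"))
```

The same records are publishable at district level and partly suppressed at postcode level, which is the sense in which the decision is a graded risk rather than a yes/no question.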

TRP and share alike vs. monetization of added value

Challenge: Obstacles for commercial re-use through technical restriction prohibition (TRP) and share alike clauses.

What? Do not demand share alike for open data and don't prohibit technical restrictions for sharing the data.

Why? Both clauses prevent any commercial usage that relies on selling the identical dataset to many customers.

How? Choose appropriate licensing terms.


  1. Management summary


Challenge

Early commercial data marketplaces and data enrichers have faced a number of challenges (sometimes forcing them to abandon their business models) that were aggravated because the possible commercial re-use of PSI was not adequately addressed in the design of public sector data portals or in licensing policies. Among the challenges are:


  • Possible market foreclosure for data brokers through public data portals
  • Bandwidth issues for large datasets.
  • Overly restrictive application of privacy rules.
  • Obstacles for commercial re-use through technical restriction prohibition and share alike clauses.


Solution

  • Distinguish between the task of a data broker and that of an open data portal.
  • Start thinking about accepting computational queries and delivering results instead of data.
  • Treat privacy issues as a risk to be managed, not a yes/no dichotomy.
  • Do not demand share alike for open data and don't prohibit technical restrictions for sharing the data.

  1. Best Practice Identification


Why is this a Best Practice? What's the impact of the Best Practice?

Commercial re-use of PSI is an important driver for Open Data. The suggested measures remove common hurdles to commercial re-use and thus strengthen the demand side of the Open Data process.


Link to the PSI Directive

  • Open Data platform(s) / Publication and deployment of information/data and metadata
  • Licensing of information/data and metadata



Why is there a need for this Best Practice?

Not addressing these issues will hinder important areas of commercial re-use of PSI.

  1. What do you need for this Best Practice?

Freedom in the design of Open Data portals (no out-of-the-box solutions) and of license regimes.

  1. Applicability by other member states?

The best practice can be applied in any member state on all levels of government. It should be especially useful for projects with at least moderate budgets that can afford a custom design of their open data portal.

  1. Contact info - record of the person to be contacted for additional information or advice.


Dietmar Gattwinkel

Projektleiter Open Government Data | Project Manager Open Government Data

STAATSBETRIEB SÄCHSISCHE INFORMATIK DIENSTE | SAXON IT SERVICES

Fachbereich 3.1 | E-Government- und Querschnittverfahren

Riesaer Straße 7 Haus D | 01129 Dresden

Tel.: +49 351 20545259 | Fax: +49 351 451 3264 310

dietmar.gattwinkel@sid.sachsen.de | www.sid.sachsen.de