Lisbon/Paper Reviews

From Share-PSI EC Project
Title Review 1 Review 2 Link
The LAPSI Sessions

I looked at the descriptions of the sessions:

  1. What should licenses for re-use should look like?
  2. Steps to a suitable redress mechanism
  3. Access and Accessibility for Data
  4. The most known challenges of PSI Access and Re-USE: Intellectual Property (and Data Protection)

Overall, these seem to be sensible topics. One suggestion that I want to make is that the LAPSI project should try to make these issues digestible for an audience that might not appreciate or understand the finer details of legal discourse.

Specific comments:

  1. What should licenses for re-use should look like?
    • Title contains 'should' twice (fixed - PA).
    • The description of the session does not address licensing issues for metadata.
  2. Steps to a suitable redress mechanism
    • The description of the session mentions the 'PSI framework' without explaining what this is. A link would be helpful.
  3. Access and Accessibility for Data
    • The [paper] link does not work, so I have not reviewed the paper.
    • I wonder whether it is correct that the session will discuss "the relationship between Directive 2003/98/EC (PSI Directive) and national access regimes", or whether this should be the 2013 Directive.
  4. The most known challenges of PSI Access and Re-USE: Intellectual Property (and Data Protection)
    • No comments

My feedback on the four LAPSI sessions is as follows:

  1. What should licenses for re-use should look like?
    • Title: is this "what should licenses for reuse look like" *or* "what licenses for reuse should look like"? The emphasis makes for quite a different session. Either way, the extra 'should' needs removing.
    • The focus seems to be solely on the publishers' perspective. Will there be a discussion around the re-users' perspective?
    • Could be clearer on the outcomes they have in mind. Will the discussion around the challenges then feed into the LAPSI licensing guidelines? How do we best train publishers on understanding this? Does it feed into current offerings (like the ODI's UK course)?
  2. Steps to a suitable redress mechanism
    • Really like this, clearly set out with good outcomes. Agree that setting out good/bad examples from across the EU would be good.
    • They may want to look at the ODI's consultation response on this.
    • Would also be helpful to have an idea as to the audience for this?
  3. Access and accessibility for data
    • This session should focus on charging mechanisms (no other mention of this in LAPSI sessions, seems the best place for this) and how damaging this could be to effective publication of data/reuse of data
  4. The most known challenges for PSI access and re-use (IP and data protection)
    • It should be clearer whether this session is looking at rights issues within datasets that public bodies want to license, or whether there are other issues to focus on. That is, the point of open licenses is that they move past the IP rights issues; otherwise there's a slight paradox between access/reuse and IP, and the rules for rights ownership don't come into play.

I think all sessions should be included within the workshop and will have good, challenging and detailed discussions that come out from the content.

Parallel A
Finodex

I can see this being a very popular session - free money everyone! It will be interesting to see what kind of questions come from the participants. The danger is, I think, that the session will dwell too long on the Finodex process. This is currently scheduled to take up a quarter of the session together with the intro to the whole thing - it's a lot. More relevant to the wider audience will be how the funders see the entrepreneurs and vice versa. What are the common concerns, and are they speaking the same language? Please ensure that there is sufficient time to discuss the wider points, no doubt with conversations in the coffee break(s) to follow up on Finodex specifics.

Example Parallel A
Open Laws

Since the overall theme is 'Encouraging open data usage by commercial developers', the topic and the discussion would fit perfectly. The emphasis on concrete business model concepts and frameworks sounds promising and inspiring; however, no specific details are given in the short description of the agenda. The success of the discussion very much depends on whether the discussion leader is able to provide accurate and specific examples of the business models mentioned. The facilitator should focus on that.

Additional review

The abstract for this session is very brief but hints at a very interesting discussion. Business models are, of course, crucial to the success of innovative and sustainable use of open data. It would be good to have a little more detail about what attendees might get out of this, especially in comparison with the other parallel sessions. Is the crossover with, say, the Open Data Startups session such that they could usefully be merged, or are the topics different?


The session proposal is very interesting and relevant to the topics targeted by the workshop. The session description would benefit from further detail regarding the business models to be discussed, for instance through examples that allow attendees to determine how innovative and appealing the concepts to be discussed in the session will be to them.

Parallel B
Location session Example Example Parallel D
LIDER

While the topic of multilingual data is interesting, the proposal does not clearly state how it is related to the conference theme. This proposal should either be reformulated to clarify the specifics of commercial use of data or saved for a later conference.

Additional Review

  • the session description should stress the assumption that multilingual and interoperable PSI data is more attractive to commercial developers
  • maybe having a commercial developer in the session who describes his/her problems with PSI data only available in a certain language would make the possible benefits easier to grasp
  • try to have a structure showing interactive elements of your session, e.g. mixed break out groups with commercial developers and data publishers discussing requirements for creating multilingual PSI data

The session proposal addresses an interesting topic. The proposers suggest practical solutions can be presented and discussed. The session could result in a summary of the issues related to multilingual PSI and existing tools to help address them. The session can be of great interest to the audience. However, the relation to the specific topic of the workshop on commercial reuse of data should be highlighted. The session would benefit from addressing the multilingualism topic not only from the perspective of data providers but also from the perspective of service designers. The expected audience of the session should be specified.

Additional Review

Sounds like an important topic for discussion. There is an obvious relation to the fundamental questions (and challenges) related to open data quality, metadata compliance and general interoperability, which are still relevant challenges given the current state of affairs. The multilingual argument is almost self-explanatory (we live in a global economy, etc.) but the biggest challenge lies in convincing stakeholders that this is something they should look at. Case examples could also be useful, where applicable.

Parallel A
Open Market Dilemmas

This looks like an interesting session as it raises questions that don't generally come up. The idea of 'attribution leakage' sounds like a call for something like an open data version of the Chatham House Rule - although I think that would raise implementation challenges. And triangulation leading to personal info being identified is a risk that we need to tackle - somehow. So this should be a good session.

This proposal looks very interesting and relates clearly to the conference theme. Specifically the third item in the proposal (Technical restriction prohibition and share alike vs. monetization of added value) may have severe impact on the practicality of commercial re-use. It may be the case that data providers choose a license without thinking about the consequences for commercial re-use as both attribution and share alike may create severe obstacles. Looking forward to this discussion.

Parallel D
Open Data Economy: from ‘Wow’ to ‘How’

Excellent and well thought through paper, a workshop which will without any doubt explore the right issues with regards to encouraging the re-use of Open Data.

A clear overview of background and the topics to be discussed plus a well-defined but too ambitious time frame.

The first part of the workshop looks almost like a lecture/presentation. The facts and figures are known by most people attending. If this is a necessary part of the workshop, 10 minutes is perhaps too much, keeping in mind that the rest of the workshop is going to be the focus.

Looking at the demand and supply sides of the open data economy is an excellent idea and a necessary part of the approach taken. A highly interactive slot with cases presented by participants takes time. To do both in 25 minutes is perhaps asking too much.

The content and the aims of the envisaged discussions are very ambitious. In its present format, the workshop could very well take a full day, let alone one hour. What about focusing on the enablers, barriers and policy guidelines?

Data supply, demand, enablers, barriers etc. - this seems too wide a range of topics to be discussed in one hour. Most of these "general" issues regarding the open data cycle have already been talked about many times before. Taking a helicopter view will barely help solve any specific problems, bearing in mind that there are so many different countries and so many different issues in the EU. It is hard to believe that a substantial 5p. essay proposing any new solutions could be delivered from a discussion that is less than an hour long with such a wide scope.

Would suggest focusing on one of these topics instead of all at once and narrowing the scope of the session to address some specific problem of Open Data implementation.

Parallel C
Roadblocks in Commercial Open Data Usage

Very short session proposal with very little detail.

Title(s) is/are confusing. "Create your Open Data Business Plan" has a different purpose from "Roadblocks in Commercial Open Data Usage" and takes a different angle in approaching the challenges facing commercial re-use of Open Data. What is it the authors of this paper would like the participants to walk away with after the workshop?

The aim of the sessions in Lisbon is to define ways and methods of encouraging commercial use and re-use of PSI and data. Pointing out the differences in mindset of civil servants vs commercial users could certainly facilitate understanding of the challenges all stakeholders face. The planned workshop in its present format is perhaps too high-level and conceptual. The paper doesn't give enough detail to elicit a worthwhile debate. Explaining what a business plan is, or should look like, is of little value. The focus should rather be on the last part, detecting roadblocks.

Even though short, this session proposal might provide a very interesting discussion since it is an attempt to simulate a real business case. Some examples of existing business plans of this kind would be interesting, if any exist. Also, more details on the industries and business sectors that would better fit this simulation could be discussed. Overall - it's a short yet comprehensive proposal with a very clear goal.

Parallel C
COMSODE

I don't find it appropriate to use the name of the project as the session title; one should use a specific title for the proposed topics. Otherwise, Share-PSI participants may confuse this with a COMSODE project meeting, which is not the case here.

The title of the paper itself is "COMSODE FP7 Project - position paper for LAPSI workshop"; it should be "(...) for Share-PSI 2.0 workshop".

After reading the session description on the Web page and the paper itself, I'm feeling a bit confused: unless I got it all wrong, the session's goal is to discuss the main barriers hindering commercial usage (missing clear message for investors, critical mass of quality open data still not available, etc.), plus the lessons learned during COMSODE project execution in the past year; this is not going to be a project presentation session, am I right? Please emphasise the right perspective on the Web page, otherwise readers might get it wrong (starting with the title).

The paper also presents the commercial products developed inside the COMSODE project that enable end-users to use/transform (open) datasets, but I think that we should focus our discussion on the experiences COMSODE partners have had while investing in the development of commercial products for open data reuse (why have you gone this way? how is your commercial approach perceived by dataset owners? how successful are your experiments with open datasets?) and address in our discussions the barriers outlined in the abstract on the Web page.

Don't forget that the upcoming Share-PSI workshop will be focused on 'Identifying datasets for publication', hence in-depth analysis of this topic is not in the scope of this workshop.

Overall, the experiences of the COMSODE project partners are more than welcome in this second Share-PSI workshop, and presenting and discussing them in a session is appropriate. Nevertheless, *this* should be emphasized in the Web page description over the current project description. Try to focus the discussion around the lessons learned, use cases and implementations, instead of plain/flat tool presentations.

Example Parallel D
Open Data Startups - Catalyzing open data demand for commercial usage

This session covers a lot of the key issues of the event. The Alvarez paper refers to the ODI Startup Programme so the two go well together. As with other project-centred sessions, the temptation will be to spend the time talking about how great the existing work is rather than eliciting ideas from others about what needs to be put in place to support incubators or how best to run them.


What I really like about the paper of Martin Alvarez-Espinar is that he keeps the discussion broad without anticipating any conclusions. He combines intriguing fundamental questions with specific examples, without pushing his own view. That encourages the reader to follow and participate in the discussion, and therefore the paper seems to be inspiring and promising. The only problem is that, due to the broad positioning of the discussion, there could be a lack of structure. This should be avoided.

The ODI paper seems to be very promising, inspiring and well-structured. There isn't much to criticise about it. However, the discussion and the presentation seem to be very detailed, so there is a danger that people will just listen instead of participating actively in the discussion. This very much depends on how the discussion is led. The facilitator should keep an eye on that. The scaling of the successful products and their opportunities for collaboration across the EU seem very promising.

Parallel B
Events, hackathons and challenge series - stimulating open data reuse

The session proposal "Events, hackathons and challenge series - stimulating open data reuse" tackles the interesting movement of public competitions that raise awareness about a targeted domain and contribute to the development of communities at the same time. There is no doubt that the topic is challenging, but a little deeper insight would contribute to the discussion. This remark mainly concerns numbers that would relate public events to the development of the open data market. We might find the surprising result that market development is not significantly affected by them, because agile enterprises seize early opportunities, and further promotion is good but does not reveal innovative business models. Such opinions might be confirmed or rejected by a little deeper analysis of the topic, and the discussion would be more focused.

The session proposal is very appealing and matches the topic of the workshop very well. A specification of the intended audience would be useful to potential attendees. Finally, if a concrete outcome is expected to be produced, such as the definition of strategies to improve the startup incubation process, this should be presented.

Additional Comment

Not much to add here. Only one thought: what about all the businesses that exist already and "just" add open data to improve their existing operations? There is definitely more than just ONE best app for a set of data. And business success is not always directly related to the initial data set.

Parallel A
Model-Driven Engineering for Data Harvesters

I thought the paper provided a strong technical description of the issue and the approach taken. I liked the way that the issue was developed by way of a practical example, and there were some good learning points made. I expect that a discussion of the approaches that other people have taken when tackling the same issue will be very productive in Lisbon.

There is a lot of content in the paper and the proposers will need to think carefully how the key learning points can be conveyed in 5-10 minutes, especially as the main purpose of these sessions is to lead a discussion and not to deliver a presentation.

There are a few areas where I found the description difficult to follow. Examples included:

  • The text speaks about a CKAN Registry as well as a CKAN Repository, whereas the diagram shows neither of these but does have the CKAN Catalog. Are these all the same thing?
  • In the text it is the metadata that is being harvested. In the diagram, "datasets, documents and apps" are referred to.

Being able to relate the textual description to the diagram will help participants understand the explanation a lot more easily and I think the description could be re-written so as to directly refer to the diagram.

I thought the comparison between Python and Java could have been more methodical - is there a table with the pros and cons of the two languages that could be produced? If the proposers want to invite any discussion on that choice in Lisbon, providing one would make it easier for participants to join in.

I was expecting to read something in the paper about the relationship between the MDE approach and DCAT. CKAN includes it in its list of Open Standards - http://ckan.org/open-standards/ - and given the likely audience for this session in Lisbon I suggest the proposers are prepared to discuss this.

I liked that the examples of commercial developers were primarily the providers of tools to assist in the harvesting of data. I think that participants in the workshop will find that of particular interest in Lisbon.

This is a very relevant and interesting topic. Standardization regarding harvesting, the role of meta-portals and the quality of data are all important (and recurrent) challenges, both for nationwide entities and for data consumers. The paper's focus on the particular problem of metadata harvesting is very relevant.

There are only two particular points that I think would benefit from being expanded a little, especially in the live discussion. The first is about the challenges of implementing the MDE tools. Within government / public administration there is often a need to clear up who assumes the costs of implementation (not only money, but also time, staff, etc.) for this kind of investment. Is this something that rests exclusively with the meta-portal? Or not really? Some suggestions / alternatives could be provided to further the discussion.

The second point is how MDE relates to encouraging data use by commercial developers. I understand how the model would work, but maybe some clarification is needed on why better metadata means better data (in the case that you can get your data from different sources)?

I would also like some clarification on why / how MDE could leverage "additional taxes" from companies and industry; is this in the sense of increasing the marginal costs of releasing data?

Parallel B
COOLTURA: scalable services for cultural engagement through the cloud

A very well defined paper with clear goals and envisaged results.

This workshop fits in with the goals of the PSI workshops. COOLTURA is an excellent example of a platform which will encourage open data use and re-use by the development community.

It would be nice to have an overview of the results/conclusions of the user consultation, i.e. to know what triggered the scenario framework. How did you capture the insights and expectations of the visitors at the 3 different sites? The target audience should not be just commercial developers though, since development can't be seen in isolation from the other stakeholders. We can expect an interactive discussion which will hopefully also focus on the supply / demand side, with enough attention to feedback from users.

An ambitious timeframe though.

Many definitions of "personalized" appear in the paper without real examples. For personalization, a platform would need user data that could hardly be generated by the Europeana project. A discussion of integration with already existing popular platforms or apps (like Facebook or Twitter) would therefore be interesting, e.g. the possibility of logging in to a cultural profile using a Twitter or FB login.

Also the commercial value of such cultural data is very unclear - maybe some more focus could be set on the GLAM (galleries, libraries, archives, museums) market sector analysis.

From a technical perspective it would be interesting to hear if any tools that are already known in the market of semantic technologies are used for building this platform.

Parallel C
Do current licensing practices hinder the commercial reuse of open data?
  • Writing down the discussion points the author is aiming for is good. What is missing, though, are suggestions for the structure of the session that make clear how the session participants are going to be involved.
  • One idea for the interactive part of the session could be to have break out groups actually look at a few different datasets from different countries/data publishers and see what obstacles they can find in the licensing texts to re-using them
  • Another idea for the interactive part of the session could be to have some real world examples where licensing conditions made the re-use of data more cumbersome. Let the participants look at the examples in small groups and reflect on their own experiences and possible solutions and report these back to the whole session
  • The paper title makes the reader expect answers to the question posed: whether licensing practices hinder commercial re-use. Instead, results from a workshop series are outlined without a clear final outcome. Maybe adding a subtitle like "Pros and cons from an ongoing evaluation" would help clarify the perspective
  • one addition for the references could be this publication on the disadvantages of non-commercial use only restrictions.
This is a good topic to include for our workshop in Lisbon when we will be having parallel sessions with LAPSI.

It's good to see the results of the analysis that PWC have completed on licenses applied to open data platform content. I'd like to see some more development of where custom licenses are and aren't open - this could be a fruitful discussion in Lisbon.

Some of the reasons given by administrations for writing custom licenses would appear to be covered by open licenses already. In particular, I am thinking of CC-BY, which specifies attribution. Did licensors apply custom licenses because they were unaware of a suitable open license, because they did not trust an open license, or for another reason?

I am pleased to see that the paper has been written from the point of view of creating a discussion in Lisbon. It gives delegates some really good discussion points to get the session going.

Parallel C
Open Data Business Model Patterns and Value Disciplines Example Example Paper
Realising an Open Data Marketplace in Greece

This paper is relevant to the workshop's aim of encouraging commercial developers to work with open data. It is clearly written and I found the arguments well developed and easy to follow. I think they could possibly have been made stronger with some more examples added.

There is a lot of content in the paper and the proposers will need to think carefully how the key learning points can be conveyed in 5-10 minutes, especially as the main purpose of these sessions is to lead a discussion and not to deliver a presentation.

I felt the reference to social media in the introduction was unexplored. The connection between the two was asserted, or alluded to, without explanation. Social media integration is referred to later, but the benefits (or otherwise) are not covered.

I agree with the proposers' statement that the focus to date has been more on publishing open data than encouraging its use by others, but maybe there are some examples that could have been considered for inclusion in the paper. It could be an open question for delegates who attend the session: what other examples do you know of?

I'd have liked to see some expansion on the revenue model that Gov4All has developed. This could be a really interesting discussion for the session.

Overall, I thought it was a strong paper that provides much for delegates to discuss.

Example Paper
Spanish Infomediary Sector Characteristics

The panel proposal SPANISH INFOMEDIARY SECTOR CHARACTERISTICS is very well elaborated and can serve as an excellent basis for discussion. The abstract itself is entirely sufficient for presenting the situation in Spain. For the quality of the discussion, it would be very helpful if the authors could ask some of their colleagues in other countries whether they have any similar data (at least some of the indicators). This would really involve the audience in the discussion and start some comparative analysis.

The topic of the paper, the Spanish infomediary sector, fits very well with the goal of the SharePSI workshop in Lisbon. It presents multiple perspectives (company related, PSI related, business related, etc.). Please find below some suggestions for the talk, or for discussions during/after the session:
  • if you have any figures on the percentage of PSI-related business out of a company's total turnover/revenue (a distribution of companies based on this criterion?), they would be interesting to show
  • if your study has any evidence on the reasons behind a company's strategy to engage in a PSI-related business, it would be interesting to present; can we find good practices to better involve companies in reusing PSI based on the Spanish experience?
  • if you are aware of similar studies at EU level, it could be interesting to contrast the Spanish figures with those; e.g. http://www.epsiplatform.eu/content/review-recent-psi-re-use-studies-published
  • focus on key insights that can be extracted from the study
  • if there were some surprising outcomes, do highlight them
  • include some recommendations for SMEs / entrepreneurs possibly emerging from study

Below are some specific issues I found misleading while reading the paper:

  • most of the figures are percentages, but it's not clear how many companies were surveyed?
  • the table with company age (first page) is unclear; how should I read it? The horizontal and vertical axes refer to the same domain (I know it's copy/paste from the original report, but please explain during the presentation how we should read it)
  • the figure with number of jobs is not clear for year 2011, where it shows some MEUR?
  • on top of the figure with PSI distribution per type, it is stated that "85% of reused information is related to: Business, ..." (page 1), a fact that I am able neither to read nor to infer from the figure itself
  • another tricky thing is that the graph with (Internet, Email, Courier, Mail, Telephone, ...) on page 2 is suggested to be a zoom-in into the Electronic format channel; is that true, or am I misreading the graphs? If true, then how is Courier an electronic format?
paper
A Scenario for Business Benefit from Public Data
  • Don't use "information" and "data" as synonyms; they are not.
  • Chaos rules: if all available data / information is provided in the same format, where is the diversity? Data should be made available in open formats. Interoperability wins, not monopolies.
  • Concerning the Open Public Sector Data Business Scenario it would be interesting to discuss whether this can be applied to SMEs or even micro-businesses as well.

The context this paper sets out, a perceived lack of business interest in open data sets and how it can lead to unfulfilled potential, raises a question that lingers over every open data provider. As such it is very much on-topic and related to the topics under discussion.

The paper argues for a possible solution to this problem, relying on an enterprise architecture methodology and drawing up a business scenario. It is not entirely clear from the paper how this would work in practice, or what level of maturity the technique has reached. As such, it would be interesting to refer to some real-life scenarios (related to open data or not) where something similar has been used, or even how it is already being used now.

paper
An OSGi-based Model-driven Data Management Module for Robust Open Data Harvesting

Elements of this were presented in Samos by Johann Höchtl and Peter Parycek, where the reaction was the same as mine is now: why has this project not used DCAT and the EC's DCAT Application Profile? But that's a side issue here. My main problem with this paper is that it is focussed on data publishing/metadata and not on the workshop theme of encouraging use of the data by commercial developers. Therefore I do not think it would be appropriate to include a plenary talk based on this work at *this* workshop.

This is an interesting paper from a technical point of view, but I don't see much in it that relates to encouraging commercial development. It proposes a particular technical architecture for harvesting PSI. Clearly, a good technical architecture will facilitate commercial takeup, but that is not the main focus of the paper.

If we are looking for a commercial takeup topic that this paper would contribute to, we could consider something like "Data Harvesting Architectures to Encourage Commercial Re-Use." From this perspective, the authors should think about whether the claimed advantages, such as scalability and modularity, would benefit commercial re-users, or whether they would just benefit the public sector bodies that make the data available.

There are probably other things that would be important to re-users. I would find the implications of the MDE approach on the re-use APIs, and the ability of re-users to take advantage of shared metadata, a particularly interesting aspect. Even this is rather technical, however. We should look at whether it would improve discovery of relevant data for re-use, speed up re-use application development, reduce the cost of this development, and so on.

Overall, I think that there are commercial aspects that could be brought out using this paper as a starting point, but that the paper itself is about the technology rather than the business aspects of open data usage by commercial developers.

File:Prakash Rajaguru Tcholtchev.pdf