Best Practices/Human Readability and Machine Processing

From Share-PSI EC Project

Overview

Published information may be intended for human readers, for machine processing, or for both.

Why

Human readership and machine processing are both important but require different publication formats.

Intended Outcome

Published information can be understood by human readers and effectively processed by machines as appropriate.

Possible Approach

Publishers MUST consider whether the published information will be read by humans, interpreted by computer programs, or both.

Information that is to be read by humans SHOULD be prepared so that its intended audience can read and understand it easily. It CAN be made available in parallel versions in different languages. It MUST follow the grammatical rules of the languages concerned. It MUST use open standard data formats that can be interpreted by reader programs that are widely available from multiple sources and are likely to remain so for the lifetime of the information. It SHOULD use open standard data formats that can be interpreted by reader programs that are available free of charge.

Information that is to be interpreted by computer programs MUST use open standard data formats. The formats concerned MUST be identified, and their application to the information MUST be explained.
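As a rough illustration of the two requirements above (identify the format, and explain how it is applied), a publisher could run a check like the following over its dataset descriptions. This is only a sketch in Python. The record fields `media_type` and `format_documentation`, the function name `check_distribution`, and the short allow-list of media types are all assumptions made for this example; none of them comes from the best practice itself.

```python
from typing import Dict, List

# Illustrative allow-list only (an assumption for this sketch).
# A publisher would maintain its own list of accepted open formats.
OPEN_MEDIA_TYPES = {
    "text/csv",
    "application/json",
    "application/xml",
    "text/turtle",
}


def check_distribution(dist: Dict[str, str]) -> List[str]:
    """Return a list of problems found in one dataset description.

    `dist` is a hypothetical record with two optional keys:
      - "media_type": the declared format of the published file
      - "format_documentation": where the use of the format is explained
    An empty list means the record passed both checks.
    """
    problems = []

    media_type = dist.get("media_type")
    if not media_type:
        problems.append("format is not identified")
    elif media_type not in OPEN_MEDIA_TYPES:
        problems.append(f"{media_type} is not on the open-format allow-list")

    if not dist.get("format_documentation"):
        problems.append("no explanation of how the format is applied")

    return problems


if __name__ == "__main__":
    record = {
        "media_type": "text/csv",
        "format_documentation": "https://example.org/datasets/schema",
    }
    print(check_distribution(record))
```

A check of this kind only confirms that the format has been declared and documented. Whether the explanation is adequate still needs a human reviewer.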

Access mechanisms, for example for authorization of the accessing human or computer system, MUST be clearly documented.

How to Test

Readability of information to be read by humans SHOULD be tested by reviewing it. The effort expended on review CAN vary to reflect the importance of the information.

Suitability of data formats used for information that is to be read by humans or interpreted by computer programs CAN be tested by design reviews.

Ability to access information SHOULD be tested by making trial accesses.
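A trial access of this kind can be scripted. The sketch below uses only the Python standard library and assumes, for simplicity, that the information is published at a plain HTTP(S) URL. The names `trial_access` and `AccessResult`, and the `opener` parameter that lets the network call be replaced, are choices made for this example and are not prescribed anywhere in the best practice.

```python
import urllib.error
import urllib.request
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class AccessResult:
    url: str
    ok: bool                      # True only for a 2xx response
    status: Optional[int]         # HTTP status code, if one was received
    content_type: Optional[str]   # declared format of the response, if any
    error: Optional[str] = None   # description of the failure, if any


def trial_access(
    url: str,
    opener: Callable = urllib.request.urlopen,
    timeout: float = 10.0,
) -> AccessResult:
    """Make one trial access to `url` and record what came back."""
    try:
        with opener(url, timeout=timeout) as response:
            return AccessResult(
                url=url,
                ok=200 <= response.status < 300,
                status=response.status,
                content_type=response.headers.get("Content-Type"),
            )
    except urllib.error.HTTPError as exc:
        # The server answered, but with an error status (e.g. 401, 404).
        return AccessResult(url, False, exc.code, None, str(exc))
    except OSError as exc:
        # Covers URLError, timeouts and other connection failures.
        return AccessResult(url, False, None, None, str(exc))


if __name__ == "__main__":
    # example.org is a placeholder; substitute the published URL.
    print(trial_access("https://example.org/"))
```

Recording the returned content type alongside the status makes it possible to compare what is actually served with the format the publisher has documented. Access mechanisms that require authorization would need the relevant credentials added to the request, which this sketch leaves out.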

Evidence

Mistakes and confusion caused by information that is hard to understand are a common experience. Similarly, the additional cost of developing systems that use badly documented interfaces is commonplace in information technology. There are many examples where the use of proprietary data formats has been to the commercial benefit of the companies that own them but not to the benefit of society as a whole.

Human Readability and Machine Processing is an identified requirement of the Open Public Sector Data Business Scenario.

Machine-processable datasets are also advised by the EU PSI Directive (art. 5 c.1) and by the European Commission notice 2014/C 240/01, paragraph 3.2, point d.

Lifecycle Stage

Planning (with testing at publication time).

Audience

Everyone responsible for the creation or publication of public-sector information.

Related Best Practices

(To be added once the list of best practices is complete.)