The World Wide Web is now an essential infrastructure for the global economy. Increasingly it is being used for a new generation of applications built around access to a Web of machine interpretable information. The next step will be to apply this to meet the demand for faster and more accurate reporting, analysis and integration of business and financial information. This is being spurred on by the demand for greater transparency by regulatory authorities, investors and businesses. This arises in part from the global financial crisis of 2008-2009, but also from the opportunity to boost investment in all sizes of businesses, through greater awareness of the investment opportunities, and the potential for using capital more effectively within businesses.
To begin to address these challenges, two leading standards organizations, XBRL International, Inc. and the World Wide Web Consortium (W3C) came together to organize a workshop of diverse stakeholders. XBRL International develops XBRL, an electronic format for communication of business and financial data, which is being adopted worldwide for various reporting applications. W3C has developed a series of Specifications for a Semantic Web that allows for deeper analysis and combination of large data sets. XBRL International and W3C believe there is a significant opportunity to engage business, government, technical and user communities, and to work collaboratively to create new solutions to priority problems. We invite public comment on this report and encourage interested parties to reach out to the leadership of XBRL International and W3C.
This page summarizes the discussions and outcomes from the Workshop on Improving Access to Financial Data on the Web. The Workshop brought together people from a wide range of backgrounds, and gave a glimpse of the broad potential for applying Semantic technologies for improved access to financial information. XBRL is leading the way for financial reporting, but we are still at a very early stage when it comes to what could realized for analysing and combining different sources of information both public and inside the enterprise.
One of the main outcomes was a realization of the need for encouraging the use of best practices for entity naming schemes. This could potentially be followed up with an Incubator Group. The workshop break-out sessions also called for work on enabling improved search for financial information through richer metadata, and for outreach on success stories for financial transparency, and the development of clear and compelling value proposition for all audiences, as the means to resource further work.
W3C is investigating further activities with the financial community. We would expect such activities to involve at least one full time equivalent of additional staff effort, which could be funded through a fellowship or through additional membership contributions or grants. Organizations that are interested in helping to enable this work should contact Dave Raggett <firstname.lastname@example.org>. Additionally, interested W3C member companies can charter Incubator Groups to explore specific technical work items, this could be an appropriate route for following up on the entity naming issue.
The Semantic Web and XBRL, a brief overview
The Semantic Web makes it easier to work with information sources in many formats and data models. The key is to develop filters to map such information into binary relations that are independent of the syntax. By building bridges between communities, and sharing best practices for data models, shared vocabularies can be developed, that facilitate mashing information from different sources, and encourage the development of innovative applications.
XBRL originated in efforts to standardize the business reporting supply chain. The Semantic Web, by contrast, emerged from work on formal representations of knowledge for a machine interpretable Web of data. Bringing the two communities together would be of value to both and could point the way to the future for work on modelling financial and business information as part of a broader ecosystem.
A starting point would be to build upon existing work on the relationship between XBRL and OWL, and defining a standard for business and financial data that is syntax independent. This would bring opportunities for using richer semantics in XBRL taxonomies, and for introducing more compact and easier to process formats for filing reports. These could be phased in gradually since they would share the same underlying abstract data model. There are also opportunities for an exchange of ideas between the two communities, e.g. on versioning and integrity constraints.
XBRL offers great flexibility, but necessitates lots of queries to create the data structures needed by applications for rendering charts and tables etc. There is an opportunity for defining intermediate data models that will make applications easier to develop and more efficient to run. Such intermediate models could alleviate the complexities involved in combining data from different taxonomies.
There were around 100 participants including about 8 remote, although on the day there were a few no shows. The participants came from international (Central Bank of Brazil) and US government agencies (SEC, FDIC, FRB, EPA, FSTC, NIEM), banks, stock exchanges and other other businesses (Singapore Stock Exchange, Deloitte & Touche, IBM, Oracle) and academia. There were lots of thank you's e.g. from David Blaszkowsky, Director, Office of Interactive Disclosure, Securities and Exchange Commission who was very encouraging as well as offering to host a follow on meeting.
The workshop proved a good opportunity for people from different government agencies to exchange ideas, and there was a shared interest in sorting out the naming issue for financial entities, something that is essential for combining data from different sources. Workshop participants expressed a desire for work on best practices for harmonizing names across agencies. Each agency has slightly different requirements, as a result, a single naming authority is unlikely, nonetheless, it is important to be able to relate names across agencies.
As a joint W3C/XBRL International event, participants were able to swap ideas on the XBRL and the Semantic Web. Dave Raggett gave a short introduction to the Semantic Web. The idea that the Semantic Web is an abstraction layer over existing data sources went down well, along with being able to express richer semantics than is currently possible in XBRL. Walter Hamscher (SEC) suggested that class-based inheritance of semantics would be of value.
XBRL is just one part of the universe of financial information, and Hatsu Kim (Thomson Reuters) noted that analysts can't rely on annual and quarterly filings alone and need to combine this with information on markets and exchange rates etc. when it comes to valuing companies. Daniel Bennett (eCitizen Foundation) talked to us about LegislativeXML and inserting XBRL into the appropriations supply chain as well as emphasising the power of URLs for combining data. Linda Powell (Federal Reserve) talked about the need for harmonization on metadata standards and referred to work on Micro Data Reference Manual (MDRM). This followed a talk on the National Information Exchange Model (NIEM), which is a joint DOJ / DHS / S&L program, started in 2005 to promote the standardization of XML information exchanges.
Cate Long (Multiple Markets) gave us a presentation of how credit rating agencies this area works, and the opportunities for using XBRL as part of the rating process. Within the US Congress a bipartisan bill H.R. 2392 aims to make XBRL the standard for disclosure to the U.S. government. It has been approved in committee and reported to the full House of Representatives for consideration.
David Watson reported work for the Singapore stock exchange on a web-based tool for financial analysis and benchmarking of companies that submit XBRL corporate financial statement filings to the Accounting Corporate Regulatory Authority of Singapore (ACRA). The goal of Open Analytics is to distill large volumes of standardised financial data into simple, visual business intelligence. This showed the kind of end-user experience that can only be dreamed about elsewhere in the world. Singapore has a tightly controlled reporting model that simplifies comparisons, but nonetheless, it provides a glimpse of the future.
Herm Fischer (UBmatrix) presented some work on combining data across XBRL filings from Japanese companies. Loading these into memory as XML proved to be very memory intensive, and he suggested the use of relational databases as a more efficient solution. He says he will now take a look at the Semantic Web as a more embracing approach for analysing and combining data.
Brand Niemann (EPA, SAIC) gave a short talk where he highlighted the federal segment architecture methodology. He had a nice slide positioning different approaches to data representation. This provided workshop attendees with some of the reasons on why to pick Semantic Web technologies instead of inventing yet another markup language.
There were several talks on academic work on XBRL. Edward Curry (DERI) talked about the challenges for combining different sources of financial data. Sean O'Riain (DERI) described the potential benefits for apply the Social Web to financial data (blogs, wikis and tweets) and the connection to linked data. Matheus Silqueira talked to us about multidimensional queries on financial data, and the relationship with OLAP and the LMDQL query language, Christian Leibold presented work in the MUSING project on combining XBRL and Semantic Web data. Finally Roberto García (Universitat de Lleida) talked to us about lessons from the Rhizomik Initiative, and some of the ontological challenges that have emerged for financial data.
The big question is how to move forward. Some of the topics raised include:
Business and Financial Transparency, and Analysis
The Web enables groups of people to collaborate and this opens the way to collaborative analytics where communities create analyses, mash different sources of information together including financial reports, and news stories on companies and markets, and comment on it in various ways including tweets and blogs. What is needed to facilitate this?
Transparency is important within enterprises for effective management of corporate resources. The Web made it easy to present information that hitherto had been isolated within the silos defined by different information systems. This involved the creation of middleware that could access the existing information system, render it to HTML and expose it with HTTP for human interpretation via a Web browser.
The Semantic Web goes a step further and exposes the semantics of information so that it can be interpreted by computers. The Semantic Web has the potential to make it easier to collate business and financial data and metadata right across the enterprise via the corporate intranet. XBRL and regulatory compliance have paved the way for a new wave of middleware that unlocks corporate information systems.
Community Education and Outreach
Perhaps W3C could play a role in defining best practices in specific areas and at an international level, depending on obtaining the resources needed to support such work. Tim McNamar (E-Certus) has offered some suggestions in a break-out that could be followed up in a further exploration for funding.
Goverment agencies are typically large and relatively slow moving, and progress is dependent on continued involvement.
The work on transparency relating to financial reporting is related to a limited extent to the W3C work on eGovernment, and we need a better way to talk about this. Financial and business information over the Web is sufficiently different in nature to the current focus of the W3C eGov work that simply merging the two areas may be counter productive.
Realizing the goal of transparency is generally embraced, however, it was recognized that it also has a disruptive impact by making it much easier to pull together information that was previously possible.
W3C has an opportunity to facilitate realization of the potential of business and financial information on the Web, and it could become a key application area for the Semantic Web, but the challenge will be how to fund a W3C initiative on linked business and financial information, and to draw in new members to work on it. The first step is this workshop report. We also plan to conduct an online poll to determine the level of support for a possible Interest Group, including suggestions for how to fund the associated W3C staffing costs.
Proposed New Technical Work
Practices for naming business and financial entities and the associated metadata as a basis for comparing and combining different sources of information. Patrick Slattery provided a summary of the entity naming break out session. After a review of the workshop report Walter Hamscher provided additional comments on naming.
Best practices for harmonizing vocabularies, and the need for a continuing dialog across government agencies.
The openess of the Web can lead to abuse, e.g. attempts to manipulate stock prices. This makes it important for there to be a robust treatment of provenance. What does this imply for business and financial information in the context of the social Web?
There is a need for further technical work, e.g. on extending OWL to support richer integrity constraints (e.g. to mirror the capabilities of XBRL Formula), and intermediate data models aimed at simplifying the task of Web application developers (reflecting the difficulties of working with fully normalized data).
In conclusion, There is broad interest in realizing the potential for increased financial transparency of government and business organizations. This would be facilitated through bringing together people from a range of communities. A number of topics were identified during the course of the Workshop, including entity naming, harmonization of vocabularies, outreach, and technical work on XBRL and the Semantic Web. A significant challenge will be to obtain the necessary funding for coordinating such work.
The Workshop was a joint W3C and XBRL International affair, and our host Richard Campbell provided a truly excellent facility at the FDIC training center. He also arranged to video the proceedings which helped with reviewing the minutes as taken in IRC, but we lacked the funds to arrange for a transcription. We also want to thank JustSystems, Inc. for their organizational sponsorship, funding Workshop co-chairs Diane Mueller and Dave Raggett. Karen Myers from W3C did a stirling job taking minutes on IRC and coordinating participant attendance. We would also like to express our thanks to Ralph Swick for helping to sort out problems with the teleconference bridge for remote participants. We wish to thank Tony Fragnito, XBRL International CEO, Mike Willis, XBRL International Board, and Mark Bolgiano, XBRL US CEO for their active participation in framing issues and challenges from an XBRL perspective. Finally, we want to thank all of the attendees for their participation, presentations and discussions, and special acknowledgement of Patrick Slattery and Walter Hamscher for their additions to this report.