By Mike Polyakov, California Common Sense, Research Director
Like any conference on a topic that has yet to coalesce into a discipline or concrete program, the 2012 International Open Government Data Conference was at times chaotic, vague, inspiring, vacuous, amazingly profound and tenuously groping for meaning. Organized jointly by the World Bank and Data.gov, it brought together government officials from across the globe, data analysis and visualization companies, and civil society groups to discuss the promotion and expansion of “Open Data.”
Open Data is largely a philosophy of unrestricted accessibility of data, be it governmental, scientific or user-generated. Though it builds on the success of the older Open Source community, where software is deemed "open" if there are no restrictions on its use and distribution, Open Data itself is younger, only coming into its own with the maturation of the internet.
After the 2009 launch of the US's data.gov, nearly two dozen other national governments followed suit by setting up consolidated data portals at the national level, not to mention dozens of governments doing so at the city and state levels in the US. Most recently, San Francisco and New York City have launched substantial data portals, on top of which a variety of mobile applications have already been built. The trend is spawning new concepts like "data journalism" that would have made little sense just five years ago. Open Data might be contrasted with commercial Big Data, such as the massive sets of user behavior statistics that companies like Facebook and Google collect for operational and commercial purposes.
However, if my experience at the three-day conference made one thing clear, it is that making data easily available is just the first step on a long road to real impact. What is to be done with this sudden flood of available information? Presentations by the governments of Mexico, Brazil, Moldova, and others demonstrated that the concept has yet to fully emerge from its buzzword stage. Panelists visibly struggled to inject meaning into their flow charts that arranged ‘citizens’, ‘business’, ‘academia’, ‘government’ in various sequences, joined with colorful arrows labeled ‘collaboration’, ‘tool’, ‘skills’, ‘initiatives.’ That is not to say that Open Data is a vacuous concept any more than Big Data. But unlike Open Source, where software (typically) has direct value, Open Data is still a means searching for an end.
A number of insights that emerged from the conference offer hope that revolutionary progress is just around the corner:
- There is a very explicit recognition that the burgeoning field of data needs better organization and labeling. One of the participants compared asking people to use the data to build businesses to dropping off someone at Costco. We'd drop them off without recipes, without knowledge of the products, without knowing how they taste, and then say, “now cook something.” Making data useful for citizens is a process that is just beginning; as evidenced by the preponderance of government officials participating at the conference, Open Data civil society groups are only starting to coalesce and engage.
- Open Data will always need a translation step before it becomes actually useful to anyone; data portals are never going to be used directly by the general public. Recognizing this, governments are tapping into the entrepreneurial sphere. New York, Buenos Aires, San Diego, and other cities have recently held competitions or hackathons to create innovative and useful applications of their data.
- Open Data is not about creating one giant dataset. Decentralization allows small datasets to be just as effective, provided they are properly categorized and linked. Crowd-sourcing is in fact becoming a way of augmenting centrally collected data.
- Vital to the success of the movement is discovering effective ways to measure the positive impact of the data being served. It is not enough, for example, to count visits or downloads directly from the portal. A way of measuring the impact directly is required: how many times people read the NYT article based on your data, how many apps build on it, whether it has affected decisions made. Standardizing and implementing these measures is a challenge whose solution still evades us.
Aside from the technical obstacles and uncertainties, conference participants noted some obvious and not so obvious challenges. A Mexican official shared the story of a project to put all government spatial data in one place: after two years, it had gone nowhere due to institutional resistance and the messiness of the data. Even where powerful actors in a government back Open Data initiatives, institutional inertia and a culture resistant to transparency will need to be overcome.
Of even greater concern is the question: is Open Data always a good thing? Will it be used to amplify the digital divide? A conference participant told the story of an African country in which a release of property survey information allowed the upper classes to manipulate the system to wrest land from poorer landowners. There is a very real possibility that increased information empowers those already best positioned to use it.
Among these questions, though, navigating the growing ocean of data remains the most pressing technical challenge. The second part of this post will explore the formidable technical side of Open Data.