Democratizing Our Data
A Significant Contribution to an Important Issue
In Democratizing Our Data, Julia Lane argues that good data are essential for democracy. She believes that public policy choices can only be made intelligently when the people making the decisions have accurate and objective statistical information to inform them of the choices they face and the results of choices they make.
“We must rethink ways to democratize data. There are successful models to follow and new legislation that can help effect change. The private sector's Data Revolution—where new types of data are collected and new measurements created by the private sector to build machine learning and artificial intelligence algorithms—can be mirrored by a public sector Data Revolution, one that is characterized by attention to counting all who should be counted, measuring what should be measured, and protecting privacy and confidentiality. Just as US private sector companies—Google, Amazon, Microsoft, Apple, and Facebook—have led the world in the use of data for profit, the US can show the world how to produce data for the public good.”
Lane’s book really only covers the US. Chapter 3 focuses closely on the institutional problems there, and chapter 4 features a couple of good case studies on developing useful data sets from disparate sources. While the problems of collecting and managing data for national statistics in the US are unique, broader issues around the extent and quality of data are not. Chapter 2 addresses those issues, looking at why measurement is difficult and why it is hard for agencies to innovate (no incentive) and develop new measures (no funding). It’s very much an insider’s account. I thought it a big improvement on recent books on GDP etc. that tend to highlight the problems, and it’s a good read (and quick, at 120 pages).
There is discussion of new data sources and how the private sector finds ways to use them. However, because public data require confidentiality, agencies need new tools and skills to be able to use those sources. That is chapter 5; chapter 6 proposes a new organizational model. Lane makes a compelling argument for building a new public data system in order to safeguard privacy and improve the US government's ability to implement policy initiatives.
I suspect National Statistical Agencies everywhere are under pressure.
Lane emphasises the increasing costs and diminishing returns for surveys, the traditional source of data. However, bureaucratic inertia and vested interests, lack of funding for pilot projects, and privacy and confidentiality issues combine to make developing new sources and products difficult. How difficult this is in different countries, I don’t know. I’d like to think most would have something based on administrative data by now, but I am probably being way too optimistic.
As the cover says, this is “A Manifesto”. For those who care about data and the statistics used for policy decisions on the economy, health, education, transport, community and social assistance and so on, this book is a must-read.