Data Transparency via Data.gov
Government data permeates our lives. The atomic clock at the National Institute of Standards and Technology (NIST) standardizes our time, dictating when we arrive at meetings and take our children to soccer practice. The Centers for Disease Control and Prevention (CDC) provides our doctors and media outlets with information about how to keep our families healthy when there is a new public health concern, such as the H1N1 (swine flu) virus.
Data is powerful. It informs and it creates opportunities. It promotes transparency and it helps to ensure accountability. Yet, it is a challenge to collect, organize, and communicate the vast stores of data maintained across the government.
The Administration is committed to moving past these barriers and providing the American public with unprecedented access to useful, unfiltered government data. An important part of that effort is Data.gov, a platform for free access to data generated across all federal agencies. Through Data.gov, we aim to provide an open architecture and to make data available in multiple formats. The goal of Data.gov is to enable better decision-making, drive transparency, and help to power innovation for a stronger America. If you haven’t yet checked it out, I encourage you to do so. Whether you are working on a school research project, developing a new application, or evaluating a business opportunity, you might just be surprised by what you find.
What We Learned From Phase I
The Brainstorm phase of our open government outreach yielded a number of suggestions about how to improve access to data and metadata. Some general themes included:
- Set clear agency targets for bringing agency data online;
- Maintain a transparency dashboard to show progress towards releasing data;
- Find new, standardized ways to inventory and prioritize agency data for publication in open, downloadable formats;
- Collaborate with third parties to continually improve Data.gov;
- Make Data.gov as comprehensive as possible for non-classified information; and
- Adopt data dictionaries to ensure that terms have the same meaning across agencies.
We also received some more technical suggestions, such as:
- Adopt the latest innovative technologies for disseminating data, including RSS data feeds;
- Create permalinks on the paragraph level to make documents easier to cite;
- Standardize discovery and method calls to data sets;
- Adopt better software for comparing relevance and meaning of documents to make government information more searchable;
- Allow citizens to build their own applications on top of government online services, for example by using a "Service-Oriented Architecture" approach;
- Convert Depository Libraries around the country into Regional Data Centers;
- Make the National Archives and Records Administration (NARA) the off-site electronic backup data center for all agency e-record systems;
- Make contributed data subject to a waiver of copyright and database rights using the "CC0" scheme from Creative Commons; and
- Digitize all government research reports and make them available free via the National Technical Information Service (NTIS).
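To make one of these suggestions concrete, here is a minimal sketch of how paragraph-level permalinks might work. It is only an illustration, not a description of anything Data.gov does: the function names, the "p-" anchor prefix, the choice to derive the anchor from a hash of the paragraph text, and the example.gov address are all assumptions made for this example.

```python
import hashlib
import re


def paragraph_anchor(text: str, length: int = 8) -> str:
    """Return a short anchor ID derived from the paragraph's text (hypothetical scheme)."""
    # Normalize whitespace and case so trivial reformatting does not change the ID.
    normalized = re.sub(r"\s+", " ", text).strip().lower()
    digest = hashlib.sha256(normalized.encode("utf-8")).hexdigest()
    return "p-" + digest[:length]


def permalinks(document: str, base_url: str) -> list[tuple[str, str]]:
    """Pair each blank-line-separated paragraph with a citable URL."""
    paragraphs = [p.strip() for p in re.split(r"\n\s*\n", document) if p.strip()]
    return [(f"{base_url}#{paragraph_anchor(p)}", p) for p in paragraphs]


if __name__ == "__main__":
    # The base URL is a placeholder, not a real government address.
    sample = "Data is powerful.\n\nIt informs and it creates opportunities."
    for url, paragraph in permalinks(sample, "https://example.gov/report"):
        print(url, "->", paragraph)
```

One trade-off in this sketch: because the anchor comes from the paragraph's content, a link keeps working when paragraphs are reordered, but it changes whenever the paragraph's wording is edited. A publisher that expects frequent revisions might prefer to assign fixed identifiers instead.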
How You Can Help
Now that we’ve moved from brainstorming to this discussion phase, there are three specific ways you can help:
- Our goal is to improve collection, storage, and dissemination of data government-wide. We’d appreciate your feedback on how to improve and grow Data.gov over time: How should we ask agencies to contribute data sets to Data.gov? Should we have them inventory and prioritize all their data? Or set a fixed number of data sets that must be published each year? Or set a voluntary target?
- While our focus here is on developing government-wide policy for data transparency, we are also interested in hearing what new data you’d like to see on Data.gov and why. We also encourage you to make suggestions directly to Data.gov.
- Finally, tell us what types of applications you’d like to see built to leverage all this data. Share with us a little about why you think those applications might be compelling. Better yet, if you are a software developer, we encourage you to start using Data.gov to build applications useful to businesses, government, and the American people!
Thank you for your dedication to this effort. We look forward to hearing your thoughts over at the OSTP blog.
Vivek Kundra is the U.S. Chief Information Officer.