In March 2012, the Obama Administration launched a $200 million Big Data Research and Development Initiative. By improving our ability to extract knowledge and insights from large and complex collections of digital data, the initiative promises to help accelerate the pace of discovery in science and engineering, strengthen our national security, transform teaching and learning, and improve health outcomes while lowering costs.
Earlier this month, the National Institutes of Health announced that the President’s FY14 budget proposal will provide at least $40 million to launch a new Big Data to Knowledge (BD2K) program, significantly expanding NIH’s participation in the Administration’s initiative. This program will:
- Facilitate the broad use and sharing of large, complex biomedical data sets through the development of policies, resources and standards;
- Develop and disseminate new analytical methods and software;
- Enhance training of data scientists, computer engineers, and bioinformaticians; and
- Establish Centers of Excellence to develop generalizable approaches that address important problems in biomedical analytics, computational biology, and medical informatics.
The case for this investment is clear. Biomedical researchers, health care professionals and patients are generating huge amounts of data from an array of devices such as genomic sequencing machines, high-resolution medical imagers, electronic health records, and smartphone applications that monitor patient health. The ability to visualize, manipulate, and mine Big Data provides opportunities to enhance our understanding of disease onset and progression, identify new therapeutic avenues, and speed the translation of new discoveries into improved health and health care.
In addition to the BD2K program, NIH is supporting a variety of other initiatives to accelerate the pace of discovery through the use of Big Data. For example:
- The Human Connectome Project and the BRAIN Initiative (announced by President Obama earlier this month) are efforts to map neural pathways that underlie human brain function. These will set the stage to discover abnormal brain circuits that contribute to neurological and psychiatric disorders.
- The Cancer Genome Atlas project applies large-scale genome sequencing to accelerate our understanding of the molecular basis of cancer.
- PhysioNet offers online access to large collections of complex physiologic signals, such as cardiac rhythms and gait dynamics, and related open-source data analysis software to catalyze research advances in the underlying mechanisms of health, disease, and aging.
Because the Big Data challenges in biomedical research are shared by other areas of scientific research, BD2K will also require effective collaboration and coordination with other government agencies tackling similar challenges, including the National Science Foundation and the Department of Energy, as well as privately funded efforts. Big Data can accelerate the translation of data to bedside applications that advance the detection, diagnosis, treatment and prevention of disease. With proper investments and coordination with other government agencies and private sector stakeholders, NIH can help realize the health benefits of the Big Data revolution.
Tom Kalil is Deputy Director for Technology and Innovation at OSTP.
Eric Green is Director of the National Human Genome Research Institute.