GenBank and PubMed Central: Creating the Tools for Scientific Discovery

David Lipman

David Lipman is being honored as a Champion of Change for the vision he has demonstrated and for his commitment to open science.

The National Institutes of Health (NIH) has a long history of supporting open access to the research it funds. That approach, I believe, recognizes the fact that science is cumulative, and that the greatest benefit to public health will be achieved if scientists can rapidly and easily access the research that has come before them. As NIH explains in a 2003 policy statement, “data sharing is essential for expedited translation of research results into knowledge, products, and procedures to improve human health.” I wholeheartedly agree.

At the National Center for Biotechnology Information (NCBI), the division of NIH’s National Library of Medicine (NLM) that I direct, we produce and make available more than 40 online databases. All of our databases are freely available to the public. However, two that people often think of when considering open access are GenBank, our database of all publicly available DNA sequences (including the sequences from the Human Genome Project), and PubMed Central, our online archive of peer-reviewed biomedical sciences literature.

PubMed Central, or PMC as we commonly call it, is also the repository for articles submitted in compliance with the NIH Public Access Policy. The policy, which was implemented in 2008 as a result of legislation, requires that papers arising from NIH-funded research be made publicly available in PMC within 12 months of publication. As a result of the legislation and policy, hundreds of thousands of research papers have been made available to researchers, medical professionals, educators, students, and the general public.

PMC, however, is more than just a repository for scientific articles: it is an integral part of a larger information infrastructure that aims to accelerate biomedical discovery. One of the key concepts we focus on here at NCBI is trying to surface information that is relevant to a user’s query, but that they may not have thought to look for. That is, we try to help them look under the rock that they might otherwise have passed by. We are able to do this because of the underlying integration of the data and information in our databases. That integration also allows users to easily move between different types of related data and information: for example, going from a genetic sequence to a published article that cites that sequence, and then to the structure of a protein related to that same sequence.

Our hope, and my belief, is that these efforts further enhance the ability to make discoveries and bring added value to open-access information. I appreciate receiving this Champions of Change award, but I'd like to emphasize that it is the talented and hard-working folks at NCBI who have made our databases and services so well-received.

David Lipman, Founding Director of the National Center for Biotechnology Information (NCBI) at the National Institutes of Health’s National Library of Medicine (NLM).
