Michael Price

New Federal Big Data Initiative to Drive Computational Training

The Obama administration, in partnership with several federal agencies including the National Institutes of Health (NIH) and the National Science Foundation (NSF), today announced the creation of the Big Data Research and Development Initiative to improve the government’s, academia’s, and private industry’s ability to collect and make sense of the vast amounts of data pouring in from health records, national labs, consumer-based reporting, and other sources.

Representatives from the federal agencies involved gathered for a
briefing this afternoon at the American Association for the Advancement
of Science (which publishes Science Careers) in Washington, DC.
John Holdren, director of the White House Office of Science and
Technology Policy, said there is a critical need in the United States
for an increased “ability to move from data to knowledge to action.”

“While the private sector will clearly take the lead in developing
big-data-related products and services, the government can play an
important role by supporting long-term R&D, investing in the
big-data workforce, using big-data approaches to make progress on key
national challenges, and increasing access to the government’s own
data,” Holdren said.

So far, the federal government has
under-invested in this capacity, he said, so it has
assembled a number of new programs to help build the infrastructure
necessary for big-data collection and analysis, and to train the
scientists needed to analyze the data. Together, the programs total more
than $200 million in new investments.

As part
of that effort, NSF Director Subra Suresh revealed that a $2 million
award will be given to a research training group to design an
undergraduate curriculum that teaches students how to use complex
graphical and visualization tools for giant data sets. An unspecified
amount will be spent encouraging and providing support for institutions
to develop interdisciplinary graduate programs dedicated to training
scientists and engineers to work with such data.
Suresh added that NSF will use its existing Integrative Graduate Education and Research Traineeship Program to support training and education for researchers who work with very large data sets.
“Data increasingly serve as the primary driver for discovery and decision-making,” he said.

Of course, once you train this next generation of data scientists,
theoretically there should be jobs available, ready to take advantage of
their skills. Will there be companies ready and willing to hire these
scientists in a few years?

James Manyika, director of the McKinsey Global Institute (MGI), a business and
economics research firm, cited a report his organization published last year
that estimates that within a few years there will be a shortage of between
150,000 and 190,000 people in the United States with “deep analytical
skills” who can work with very large data sets; some 300,000 to 400,000
people will be needed for skilled technician and support staff positions; and
1.5 million people will be needed as “data-savvy managers and
decision-makers.” Most of these jobs, the report estimates, will be
available in health care, drug discovery and development, software
engineering, retail, and manufacturing.

Such predictions, though, are difficult to match up with reality. They’re
based on analyses of data from the U.S. Bureau of Labor Statistics, the
U.S. Census, and MGI’s own interviews with companies. Whether companies
will actually offer that many jobs to people with big-data skills will
depend on a number of factors that can’t yet be determined. The pace of
the economic recovery will surely play a role in whether MGI’s numbers
are borne out. For a briefing dedicated to gigantic amounts of data, the data
supporting the existence of future jobs for big-data trainees were
surprisingly sparse.

As for the immediate
demand, it’s highly regional. Companies in the Silicon Valley region of
California can’t hire enough people who
can work with large data sets, said Daphne Koller, a computer scientist
at Stanford University. But that’s much less true in parts of the
country that lack Silicon Valley’s richness of biotech and start-up
companies.

Nevertheless, NIH Director Francis
Collins was bullish about future trainees’ job prospects. “If I were a
college senior or a first-year graduate student interested in biology, I
would migrate as fast as I could into computational biology,” he said.
“It is a very appealing career path.”

Collins also mentioned that NIH is dedicated to providing training programs for
current scientists to acquire the kinds of big-data skills that would
allow them to compete for these jobs, though he did not say what these
programs would look like.