The Science of Science and Innovation Policy (SciSIP) program at the National Science Foundation (NSF) supports research designed to advance the scientific basis of science and innovation policy. The program was established at NSF in 2005 in response to a call from Dr. John Marburger III, then science advisor to the U.S. President, for a “science” of science policy. As of January 2011, it has co-funded 162 awards that aim to develop, improve, and expand data, analytical tools, and models that can be directly applied in the science policy decision-making process. The long-term goals of the SciSIP program are to provide a scientifically rigorous and quantitative basis for science policy and to establish an international community of practice. The program has an active listserv that, as of January 2011, has almost 700 members from academia, government, and industry. This study analyzed all SciSIP awards (through January 2011) to identify existing collaboration networks and co-funding relations between SciSIP and other areas of science. In addition, listserv data was downloaded and analyzed to derive complementary discourse information. Key results include evidence of rich diversity in communication and funding networks and effective strategies for interlinking researchers and science policy makers, prompting discussion, and sharing resources.
Authors: Hanning Guo, Scott Weingart, and Katy Börner
This study presents a mixed model that combines different indicators to describe and predict key structural and dynamic features of emerging research areas. Three indicators are combined: sudden increases in the frequency of specific words; the number of new authors attracted to an emerging research area and the speed at which they arrive; and changes in the interdisciplinarity of cited references. The mixed model is applied to four emerging research areas: RNAi, Nano, h-Index, and Impact Factor research, using papers published in the Proceedings of the National Academy of Sciences of the United States of America (1982–2009) and in Scientometrics (1978–2009). Results are compared in terms of strengths and temporal dynamics. The results show that the indicators signal emerging areas and exhibit interesting temporal correlations: new authors enter the area first, then the interdisciplinarity of paper references increases, then word bursts occur. All workflows are reported in a manner that supports replication and extension by others.
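As a rough illustration of the second indicator (new authors attracted to an area), the following sketch counts, for each year, the authors who appear in the area for the first time. The record format `(year, [author names])` and the function name are assumptions made for this example; the abstract does not describe how the study itself computes the indicator.

```python
from collections import Counter

def new_authors_per_year(papers):
    """Count authors whose earliest paper in the area falls in each year.

    `papers` is a list of (year, [author names]) tuples; this record
    format is a hypothetical one chosen for illustration.
    """
    first_year = {}
    for year, authors in papers:
        for author in authors:
            # Keep the earliest year in which each author appears.
            if author not in first_year or year < first_year[author]:
                first_year[author] = year
    # Tally how many authors entered in each year, ordered by year.
    return dict(sorted(Counter(first_year.values()).items()))

# Toy data, not taken from the study.
papers = [
    (2001, ["A", "B"]),
    (2002, ["B", "C"]),
    (2002, ["D"]),
    (2003, ["A"]),
]
entries = new_authors_per_year(papers)
```

A rising count over successive years would be one possible sign that an area is attracting newcomers; how quickly it rises corresponds to the "speed" part of the indicator.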
Authors: Kevin Boyack, Katy Börner, and Richard Klavans
How does our collective scholarly knowledge grow over time? What major areas of science exist and how are they interlinked?
Which areas are major knowledge producers; which ones are consumers? Computational scientometrics — the application of bibliometric/scientometric
methods to large-scale scholarly datasets — and the communication of results via maps of science might help us answer these
questions. This paper presents the results of a prototype study that aims to map the structure and evolution of chemistry
research over a 30-year time frame. Information from the combined Science (SCIE) and Social Science (SSCI) Citation Indexes
from 2002 was used to generate a disciplinary map of 7,227 journals and 671 journal clusters. Clusters relevant to studying the
structure and evolution of chemistry were identified using JCR categories and were further clustered into 14 disciplines.
The changing scientific composition of these 14 disciplines and their knowledge exchange via citation linkages were computed.
Major changes in the dominance, influence, and role of Chemistry, Biology, Biochemistry, and Bioengineering over these 30
years are discussed. The paper concludes with suggestions for future work.
Authors: Kevin W. Boyack, Richard Klavans, and Katy Börner
This paper presents a new map representing the structure of all of science, based on journal articles, including both the natural and social sciences. Similar to cartographic maps of our world, the map of science provides a bird’s eye view of today’s scientific landscape. It can be used to visually identify major areas of science, their size, similarity, and interconnectedness. To be useful, the map needs to be accurate on a local and on a global scale. While our recent work has focused on the former aspect [1], this paper summarizes results on how to achieve structural accuracy. Eight alternative measures of journal similarity were applied to a data set of 7,121 journals covering over 1 million documents in the combined Science Citation and Social Science Citation Indexes. For each journal similarity measure we generated two-dimensional spatial layouts using the force-directed graph layout tool, VxOrd. Next, mutual information values were calculated for each graph at different clustering levels to give a measure of structural accuracy for each map. The best co-citation and inter-citation maps according to local and structural accuracy were selected and are presented and characterized. These two maps are compared to establish robustness. The inter-citation map is then used to examine linkages between disciplines. Biochemistry appears as the most interdisciplinary discipline in science.
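The abstract does not say exactly how the mutual information values were computed. As a generic illustration only, the sketch below computes the standard mutual information (in bits) between two labelings of the same items, for example a cluster assignment and a reference categorization of journals. The function name and toy labels are assumptions, not details from the paper.

```python
import math
from collections import Counter

def mutual_information(labels_a, labels_b):
    """Mutual information in bits between two labelings of the same items."""
    if len(labels_a) != len(labels_b):
        raise ValueError("labelings must cover the same items")
    n = len(labels_a)
    count_a = Counter(labels_a)
    count_b = Counter(labels_b)
    count_ab = Counter(zip(labels_a, labels_b))
    mi = 0.0
    for (a, b), n_ab in count_ab.items():
        p_ab = n_ab / n
        p_a = count_a[a] / n
        p_b = count_b[b] / n
        # Each observed label pair contributes p(a,b) * log2(p(a,b) / (p(a) p(b))).
        mi += p_ab * math.log2(p_ab / (p_a * p_b))
    return mi

# Toy labelings, not data from the study.
clusters = [0, 0, 1, 1]
matching_categories = ["x", "x", "y", "y"]     # agrees perfectly with clusters
unrelated_categories = ["x", "y", "x", "y"]    # independent of clusters
mi_match = mutual_information(clusters, matching_categories)
mi_unrelated = mutual_information(clusters, unrelated_categories)
```

In this toy case the matching labeling shares one full bit of information with the clusters and the unrelated labeling shares none, which is the sense in which higher values indicate closer structural agreement.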
Authors: Katy Börner, Shashikant Penumarthy, Mark Meiss, and Weimao Ke
This paper reports the results of a large-scale data analysis that aims to identify the production, diffusion, and consumption
of scholarly knowledge among top research institutions in the United States. A 20-year publication data set was analyzed to
identify the 500 most cited research institutions and spatio-temporal changes in their inter-citation patterns. A novel approach
to analyzing the dual role of institutions as producers and consumers of scholarly knowledge and to studying the diffusion of
knowledge among them is introduced. A geographic visualization metaphor is used to visually depict the production and consumption
of knowledge. The highest producers and their consumers as well as the highest consumers and their producers are identified
and mapped. Surprisingly, the introduction of the Internet does not seem to affect the distance over which scholarly knowledge
diffuses as manifested by citation links. The citation linkages between institutions fall off with the distance between them,
and there is a strong linear relationship between the log of the citation counts and the log of the distance. The paper concludes
with a discussion of these results and future work.
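A linear relationship between the log of the citation counts and the log of the distance corresponds to a power law, count ≈ c · distance^b, where b is the slope of the line in log-log space. The sketch below fits such a line by ordinary least squares. The data are synthetic and the function name is hypothetical; neither reflects the paper's actual data or fitting procedure.

```python
import math

def fit_log_log(distances, counts):
    """Least-squares fit of log(count) = intercept + slope * log(distance)."""
    xs = [math.log(d) for d in distances]
    ys = [math.log(c) for c in counts]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

# Synthetic illustration only: counts follow count = 1000 * distance ** -0.5 exactly.
distances = [10.0, 100.0, 1000.0, 10000.0]
counts = [1000.0 * d ** -0.5 for d in distances]
intercept, slope = fit_log_log(distances, counts)
```

A negative slope means citation counts fall off with distance, and `math.exp(intercept)` recovers the constant c of the power law.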
Authors: Gavin LaRowe, Sumeet Ambre, John Burgoon, Weimao Ke, and Katy Börner
The Scholarly Database aims to serve researchers and practitioners interested in the analysis, modelling, and visualization
of large-scale data sets. A specific focus of this database is to support macro-evolutionary studies of science and to communicate
findings via knowledge-domain visualizations. Currently, the database provides access to about 18 million publications, patents,
and grants. About 90% of the publications are available in full text. Except for some datasets with restricted access conditions,
the data can be retrieved in raw or pre-processed formats using either a web-based or a relational database client. This paper
motivates the need for the database from the perspective of bibliometric/scientometric research. It explains the database
design, setup, etc., and reports the temporal, geographical, and topic coverage of data sets currently served via the database.
Planned work and the potential for this database to become a global testbed for information science research are discussed
at the end of the paper.
Authors: Katy Börner, Weixia Huang, Micah Linnemeier, Russell Duhon, Patrick Phillips, Nianli Ma, Angela Zoss, Hanning Guo, and Mark Price
The enormous increase in digital scholarly data and computing power combined with recent advances in text mining, linguistics,
network science, and scientometrics makes it possible to scientifically study the structure and evolution of science on a large
scale. This paper discusses the challenges of this ‘BIG science of science’—also called ‘computational scientometrics’ research—in
terms of data access, algorithm scalability, repeatability, as well as result communication and interpretation. It then introduces
two infrastructures: (1) the Scholarly Database (SDB) (http://sdb.slis.indiana.edu), which provides free online access to 22 million scholarly records (papers, patents, and funding awards) that can be cross-searched
and downloaded as dumps, and (2) scientometrics-relevant plug-ins of the open-source Network Workbench (NWB) Tool (http://nwb.slis.indiana.edu). The utility of these infrastructures is then illustrated in three studies: a comparison of the funding portfolios
and co-investigator networks of different universities, an examination of paper-citation and co-author networks of major network
science researchers, and an analysis of topic bursts in streams of text. The article concludes with a discussion of related
work that aims to provide practically useful and theoretically grounded cyberinfrastructure in support of computational scientometrics
research, education, and practice.