Background: Citation analysis for evaluative purposes typically requires normalization against a control group of similar papers. Selection of this control group is an open question. Objectives: Gain a better understanding of control group requirements for credible normalization. Approach: Performed citation analysis on prior publications of two proposing research units to help estimate team research quality. Compared citations of each unit's publications to citations received by thematically and temporally similar papers. Results: Identification of thematically similar papers was very complex and labor intensive, even with relatively few control papers selected. Conclusions: A credible citation analysis for determining performer or team quality should have the following components:
– Multiple technical experts, to average out individual bias and subjectivity;
– A process for comparing performer or team output papers with a normalization base of similar papers;
– A process for retrieving a substantial fraction of candidate normalization base papers;
– Manual evaluation of many candidate normalization base papers, to obtain high thematic similarity and statistical representation.
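The normalization described above reduces to comparing a paper's citation count against a statistic of its control group. A minimal sketch of such a metric, assuming the median of the control group is the normalization base (function name and numbers are illustrative, not the study's actual procedure):

```python
from statistics import median

def normalized_impact(paper_citations, control_citations):
    """Ratio of a paper's citation count to the median citation count
    of its control group (thematically and temporally similar papers).

    Values near 1.0 indicate typical impact for the field and period;
    values well above 1.0 indicate above-par impact.
    """
    base = median(control_citations)
    if base == 0:
        raise ValueError("control group median is zero; cannot normalize")
    return paper_citations / base

# Example: a paper with 24 citations against a small control group
print(normalized_impact(24, [5, 8, 12, 15, 20]))  # 24 / 12 = 2.0
```

The hard part, as the abstract notes, is not this arithmetic but assembling a control group similar enough for the ratio to be meaningful.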
Characteristics of highly and poorly cited research articles (with Abstracts) published in The Lancet over a three-year period were examined. These characteristics included numerical (numbers of authors, references, citations,
Abstract words, journal pages), organizational (first author country, institution type, institution name), and medical (medical
condition, study approach, study type, sample size, study outcome). Compared to the least cited articles, the most cited have
three to five times the median number of authors per article, a 50% to 600% greater median number of references
per article, 110 to 490 times the median number of citations per article, 2.5 to almost seven times the median number of Abstract
words per article, and 2.5 to 3.5 times the median number of pages per article.
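Each of the comparisons above is a ratio of group medians. A minimal sketch of how such a ratio is computed (the author counts below are hypothetical, for illustration only):

```python
from statistics import median

def median_ratio(most_cited, least_cited):
    """Ratio of the median of one group's values to another's,
    e.g. authors per article for most- vs least-cited papers."""
    return median(most_cited) / median(least_cited)

# Hypothetical authors-per-article counts for the two groups
most = [12, 15, 18, 20, 25]
least = [3, 4, 5, 5, 6]
print(median_ratio(most, least))  # 18 / 5 = 3.6
```

Medians are preferred over means here because citation and authorship distributions are highly skewed, so a few outliers would dominate a mean-based ratio.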
The most cited articles’ medical themes emphasize breast cancer, diabetes, coronary circulation, and HIV immune system problems,
focusing on large-scale clinical trials of drugs. The least cited articles’ themes essentially do not address the above medical
issues, especially from a clinical trials perspective, cover a much broader range of topics, and have much more emphasis on
social and reproductive health issues. Finally, for sample sizes of clinical trials specifically, those of the most cited
articles ranged from a median of about 1500 to 2500, whereas those of the least cited articles ranged from 30 to 40.
Text mining was used to extract technical intelligence from the open source global SARS research literature. A SARS-focused query was applied to the Science Citation Index (SCI) (SCI 2008) database for the period 1998–early 2008. The SARS research literature infrastructure (prolific authors, key journals/institutions/countries, most cited authors/journals/documents) was obtained using bibliometrics, and the SARS research literature technical structure (hierarchical taxonomy) was obtained using computational linguistics/document clustering.
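The "infrastructure" side of such a study amounts to tallying entities across retrieved records. A minimal sketch of that bibliometric step, assuming records with author, journal, and country fields (the records and field names below are hypothetical, not the SCI schema):

```python
from collections import Counter

# Hypothetical bibliographic records for illustration
records = [
    {"authors": ["A. Smith", "B. Lee"], "journal": "J Virol", "country": "USA"},
    {"authors": ["B. Lee"], "journal": "Lancet", "country": "China"},
    {"authors": ["B. Lee", "C. Wu"], "journal": "J Virol", "country": "China"},
]

def infrastructure(records, top_n=2):
    """Tally prolific authors, key journals, and key countries
    across a set of retrieved bibliographic records."""
    authors = Counter(a for r in records for a in r["authors"])
    journals = Counter(r["journal"] for r in records)
    countries = Counter(r["country"] for r in records)
    return {
        "authors": authors.most_common(top_n),
        "journals": journals.most_common(top_n),
        "countries": countries.most_common(top_n),
    }

print(infrastructure(records))
```

The technical-structure side (the hierarchical taxonomy) requires document clustering over the full text, which is a substantially larger computation than these counts.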
Authors: Ronald Kostoff, Raymond Koytcheff, and Clifford Lau
Text mining was used to extract technical intelligence from the open source global nanotechnology and nanoscience research
literature. An extensive nanotechnology/nanoscience-focused query was applied to the Science Citation Index/Social Science
Citation Index (SCI/SSCI) databases. The nanotechnology/nanoscience research literature infrastructure (prolific authors,
key journals/institutions/countries, most cited authors/journals/documents) was obtained using bibliometrics. A novel addition
was the use of institution and country auto-correlation maps to show co-publishing networks among institutions and among countries,
and the use of institution-phrase and country-phrase cross-correlation maps to show institution networks and country networks
based on use of common terminology (proxy for common interests). The use of factor matrices quantified further the strength
of the linkages among institutions and among countries, and validated the co-publishing networks shown graphically on the
auto-correlation maps.
Authors: Ronald Kostoff, Ryan Barth, and Clifford Lau
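The raw input behind an institution auto-correlation map is a co-occurrence matrix: how often each pair of institutions appears on the same paper. A minimal sketch under that assumption (institution names and paper assignments below are hypothetical):

```python
from itertools import combinations
from collections import Counter

# Hypothetical institution lists, one per paper, for illustration
papers = [
    ["MIT", "Tsinghua"],
    ["MIT", "Tsinghua", "KAIST"],
    ["KAIST"],
    ["MIT", "KAIST"],
]

def co_publication_counts(papers):
    """Count how often each pair of institutions co-authors a paper.

    The result is a symmetric pair-count table; strong pairs become
    the links drawn in a co-publishing network map.
    """
    pairs = Counter()
    for institutions in papers:
        for a, b in combinations(sorted(set(institutions)), 2):
            pairs[(a, b)] += 1
    return pairs

print(co_publication_counts(papers))
```

Cross-correlation maps extend the same idea by pairing institutions with the phrases they publish on, so that shared terminology, rather than shared bylines, defines the link.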
This study evaluates trends in quality of nanotechnology and nanoscience papers produced by South Korean authors. The metric
used to gauge quality is the ratio of highly cited nanotechnology papers to total nanotechnology papers produced in sequential
time frames. In the first part of this paper, citations (and publications) for nanotechnology documents published by major
producing nations and major producing global institutions in four uneven time frames are examined. All nanotechnology documents
in the Science Citation Index [SCI, 2006] for 1998, 1999–2000, 2001–2002, 2003 were retrieved and analyzed in March 2007.
In the second part of this paper, all the nanotechnology documents produced by South Korean institutions were retrieved and
examined. All nanotechnology documents produced in South Korea (each document had at least one author with a South Korean address)
in each of the above time frames were retrieved and analyzed. The South Korean institutions were extracted, and their fraction
of total highly cited documents was compared to their fraction of total published documents. Non-Korean institutions that
co-authored papers were included as well, to offer some perspective on the value of collaboration.
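The comparison described above, an institution's share of highly cited documents versus its share of all documents, can be sketched as a simple ratio of fractions (institution names and counts below are hypothetical, not the study's data):

```python
def quality_ratio(highly_cited, total):
    """For each institution, divide its fraction of all highly cited
    documents by its fraction of all published documents.

    A ratio above 1.0 means the institution contributes more than its
    proportional share of highly cited papers; below 1.0, less.
    """
    hc_all = sum(highly_cited.values())
    tot_all = sum(total.values())
    return {
        inst: (highly_cited.get(inst, 0) / hc_all) / (total[inst] / tot_all)
        for inst in total
    }

# Hypothetical document counts for illustration
highly_cited = {"Seoul Natl Univ": 12, "KAIST": 6, "Other": 2}
total = {"Seoul Natl Univ": 100, "KAIST": 80, "Other": 120}
ratios = quality_ratio(highly_cited, total)
print(ratios)
```

Tracking this ratio across the study's sequential time frames is what turns a static snapshot into a trend in paper quality.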