Statistical hypothesis testing has many limitations and problems, and its extensive use in the social sciences is often criticized. For an introduction to the literature on this issue, see, for example, Kline (2004).
All confidence intervals that we report were calculated using a bootstrapping approach (e.g., Efron and Tibshirani 1993).
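The source does not say which bootstrap variant was used. As a minimal sketch, assuming a simple percentile bootstrap, a confidence interval for an arbitrary statistic could be computed as follows. The function name `bootstrap_ci`, the number of resamples, and the sample values are illustrative assumptions, not taken from the paper.

```python
import random

def bootstrap_ci(data, statistic, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for statistic(data).

    Assumption: plain percentile method; the paper may have used another variant.
    """
    rng = random.Random(seed)
    n = len(data)
    estimates = []
    for _ in range(n_boot):
        # Draw n observations with replacement from the original sample.
        sample = [data[rng.randrange(n)] for _ in range(n)]
        estimates.append(statistic(sample))
    estimates.sort()
    # Cut off the same number of estimates in each tail.
    tail = int(n_boot * alpha / 2)
    return estimates[tail], estimates[n_boot - 1 - tail]

def mean(values):
    return sum(values) / len(values)

# Hypothetical indicator values, for illustration only.
scores = [0.6, 0.9, 1.1, 1.2, 1.4, 1.5, 1.8, 2.0, 2.3, 3.1]
low, high = bootstrap_ci(scores, mean)
print(f"95% conf. int. for the mean: {low:.2f} to {high:.2f}")
```

For a correlation coefficient, `data` would be a list of (quality score, indicator value) pairs and `statistic` a function that computes the correlation from such a list.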
For comparison, suppose the 147 research groups were sorted in increasing order of their CPP/FCSm score, and suppose the first 30 groups were given a quality score of 3, the next 78 groups a quality score of 4, and the final 39 groups a quality score of 5. The mean CPP/FCSm scores of the groups with a quality score of 3, 4, and 5 would then be 0.75, 1.37, and 2.55, respectively. Hence, for groups with a quality score of 5 and groups with a quality score of 4, the difference would be 1.18 (rather than 0.44). For groups with a quality score of 4 and groups with a quality score of 3, the difference would be 0.62 (rather than 0.53).
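The hypothetical assignment described above can be sketched in a few lines. The indicator values below are made up for illustration (the underlying data are not reproduced here), and the helper names are assumptions. The last two lines only recheck the differences between the means reported in the text.

```python
def assign_scores_by_rank(values, group_sizes, labels):
    """Sort values in increasing order and label consecutive blocks of the given sizes."""
    if sum(group_sizes) != len(values):
        raise ValueError("group sizes must add up to the number of values")
    ordered = sorted(values)
    groups = {}
    start = 0
    for size, label in zip(group_sizes, labels):
        groups[label] = ordered[start:start + size]
        start += size
    return groups

def group_means(groups):
    return {label: sum(vals) / len(vals) for label, vals in groups.items()}

# Hypothetical indicator values for 147 groups (not the paper's data).
values = [0.2 + 0.02 * i for i in range(147)]
groups = assign_scores_by_rank(values, group_sizes=(30, 78, 39), labels=(3, 4, 5))
means = group_means(groups)

# Differences between adjacent quality levels for the hypothetical data.
diff_5_vs_4 = means[5] - means[4]
diff_4_vs_3 = means[4] - means[3]

# Recheck of the differences implied by the means reported in the text.
reported_5_vs_4 = round(2.55 - 1.37, 2)
reported_4_vs_3 = round(1.37 - 0.75, 2)
```

Because the quality scores are assigned purely by rank on the indicator, this construction gives the largest separation between the group means that is possible for these group sizes.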
The correlation of 0.45 is somewhat lower than the correlations reported by Moed (2005, p. 241) for a number of similar data sets. It should be noted that because of the many ties in the quality scores it is impossible to obtain a Spearman rank correlation of one. A more appropriate correlation measure would be the variant of the Kendall rank correlation discussed by Adler (1957). Using this measure, it is always possible to obtain a correlation of one. We obtain a correlation of 0.46 (95% conf. int.: 0.32–0.59) using this measure.
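The effect of ties can be illustrated with a small sketch. The source does not give the formula of Adler's variant, so the code below uses the Goodman and Kruskal gamma statistic, (C − D)/(C + D), as a stand-in: it ignores tied pairs and can therefore reach one. Whether it coincides with the measure used in the paper is an assumption. The data are hypothetical.

```python
from itertools import combinations

def average_ranks(values):
    """Ranks starting at 1; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks

def pearson(x, y):
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5

def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the average ranks."""
    return pearson(average_ranks(x), average_ranks(y))

def gamma(x, y):
    """(C - D) / (C + D); pairs tied on either variable are ignored.

    Stand-in for a tie-adjusted Kendall measure, not necessarily Adler's formula.
    """
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        product = (x1 - x2) * (y1 - y2)
        if product > 0:
            concordant += 1
        elif product < 0:
            discordant += 1
    return (concordant - discordant) / (concordant + discordant)

# Hypothetical data: the indicator never contradicts the tied quality scores.
quality = [3, 3, 4, 4, 4, 5, 5]
indicator = [0.5, 0.8, 1.0, 1.3, 1.6, 2.0, 2.9]
rho = spearman(quality, indicator)
g = gamma(quality, indicator)
```

In this example the indicator is perfectly consistent with the quality scores, yet the Spearman correlation stays below one (about 0.94) because of the ties, while the tie-ignoring measure equals one.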
Adler, LM 1957 A modification of Kendall's tau for the case of arbitrary ties in both rankings. Journal of the American Statistical Association 52 277 33–35.
Hirsch, JE 2005 An index to quantify an individual's scientific research output. Proceedings of the National Academy of Sciences 102 46 16569–16572.
Kline, RB 2004 Beyond significance testing: reforming data analysis methods in behavioral research. American Psychological Association, Washington.
Opthof, T, Leydesdorff, L 2011 A comment to the paper by Waltman et al., Scientometrics, 87, 467–481, 2011. Scientometrics.
Rinia, EJ, Van Leeuwen, ThN, Van Vuren, HG, Van Raan, AFJ 1998 Comparative analysis of a set of bibliometric indicators and central peer review criteria: evaluation of condensed matter physics in the Netherlands. Research Policy 27 1 95–107. doi:10.1016/S0048-7333(98)00026-2.
Van Raan, AFJ 2006 Comparison of the Hirsch-index with standard bibliometric indicators and with peer judgment for 147 chemistry research groups. Scientometrics 67 3 491–502.
Waltman, L, Van Eck, NJ, Van Leeuwen, TN, Visser, MS, Van Raan, AFJ 2011 Towards a new crown indicator: an empirical analysis. Scientometrics 87 3 467–481.