In this paper, machine learning tools were used to identify the key features influencing citation impact. Both external and quality-related information about the papers was considered in constructing the feature space. Based on this feature space, the soft fuzzy rough set was used to generate a series of associated feature subsets. The KNN classifier was then used to find the feature subset with the best classification performance. The results show that citation impact can be predicted from objectively assessed factors. Both the papers’ quality features and their external features, the latter mainly represented by the reputation of the first author, contribute to future citation impact.
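The subset-selection step summarized above can be illustrated with a minimal sketch. It is not the authors' implementation: the candidate subsets are simply given as input (the soft fuzzy rough set step is not reproduced here), the toy data and column meanings are hypothetical, and leave-one-out accuracy is an assumed stand-in for whatever evaluation protocol the study actually used.

```python
import math
from collections import Counter


def knn_predict(train_X, train_y, query, k):
    """Return the majority label among the k nearest training points."""
    dists = sorted(
        (math.dist(x, query), label) for x, label in zip(train_X, train_y)
    )
    votes = Counter(label for _, label in dists[:k])
    return votes.most_common(1)[0][0]


def project(rows, subset):
    """Keep only the feature columns listed in `subset`."""
    return [[row[i] for i in subset] for row in rows]


def loo_accuracy(X, y, subset, k=3):
    """Leave-one-out accuracy of KNN restricted to one feature subset."""
    Xs = project(X, subset)
    hits = 0
    for i in range(len(Xs)):
        train_X = Xs[:i] + Xs[i + 1:]
        train_y = y[:i] + y[i + 1:]
        hits += knn_predict(train_X, train_y, Xs[i], k) == y[i]
    return hits / len(Xs)


def best_subset(X, y, candidates, k=3):
    """Score every candidate subset and return (accuracy, subset) of the best.

    On ties, the earliest candidate in the list wins.
    """
    scored = [(loo_accuracy(X, y, s, k), s) for s in candidates]
    return max(scored, key=lambda pair: pair[0])


# Hypothetical toy data. Columns: 0 = first-author reputation,
# 1 = paper quality score, 2 = an uninformative feature.
X = [
    [0.90, 0.80, 0.1],
    [0.80, 0.90, 0.9],
    [0.85, 0.70, 0.5],
    [0.95, 0.75, 0.3],
    [0.10, 0.20, 0.2],
    [0.20, 0.10, 0.8],
    [0.15, 0.30, 0.6],
    [0.05, 0.25, 0.4],
]
y = [1, 1, 1, 1, 0, 0, 0, 0]  # 1 = highly cited, 0 = not (made-up labels)

# Stand-in for the subsets a feature-reduction step would produce.
candidates = [(2,), (0,), (0, 1)]

print(best_subset(X, y, candidates))
```

On this toy data the uninformative column alone classifies poorly, while the subsets containing the reputation column separate the two classes, which mirrors the kind of comparison the KNN step performs across candidate subsets.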