Author: Dalibor Fiala, University of West Bohemia, Univerzitní 8, 30614, Plzeň, Czech Republic


Abstract

The CiteSeer digital library is a useful source of bibliographic information. It allows the retrieval of citations, co-authorships, addresses, and affiliations of authors and publications. Despite this, it has relatively rarely been used for automated citation analyses. This article describes our findings from extensive mining of the CiteSeer data. We explored citations between authors and determined rankings of influential scientists using various evaluation methods, including citation and in-degree counts, HITS, PageRank, and variations of PageRank based on both the citation and collaboration graphs. We compare the resulting rankings with lists of computer science award winners and find that award recipients are almost always ranked high. We conclude that CiteSeer is a valuable, yet not fully appreciated, repository of citation data and is appropriate for testing novel bibliometric methods.
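
To make the ranking methods named in the abstract more concrete, here is a minimal sketch of three of them (citation counts, in-degree counts, and plain PageRank) on a toy author citation graph. It is not the article's implementation: the author labels, the edge list, the damping factor of 0.85, and the reading of "in-degree" as the number of distinct citing authors are all assumptions made for illustration, and the variations based on the collaboration graph are not covered.

```python
from collections import defaultdict

# Hypothetical toy data: (citing_author, cited_author, number_of_citations).
CITATIONS = [
    ("A", "B", 3),
    ("A", "C", 1),
    ("B", "C", 2),
    ("C", "B", 1),
    ("D", "B", 1),
]


def citation_counts(edges):
    """Total citations received per author (edge weights summed)."""
    counts = defaultdict(int)
    for _, cited, weight in edges:
        counts[cited] += weight
    return dict(counts)


def in_degrees(edges):
    """Number of distinct citing authors per cited author (assumed definition)."""
    citers = defaultdict(set)
    for citing, cited, _ in edges:
        citers[cited].add(citing)
    return {author: len(sources) for author, sources in citers.items()}


def pagerank(edges, damping=0.85, iterations=100):
    """Plain PageRank by power iteration; edge weights are ignored."""
    nodes = sorted({author for edge in edges for author in edge[:2]})
    out_links = defaultdict(set)
    for citing, cited, _ in edges:
        out_links[citing].add(cited)
    n = len(nodes)
    rank = {v: 1.0 / n for v in nodes}
    for _ in range(iterations):
        # Rank held by authors who cite nobody is spread evenly over all authors.
        dangling = sum(rank[v] for v in nodes if not out_links[v])
        new_rank = {v: (1 - damping) / n + damping * dangling / n for v in nodes}
        for v in nodes:
            for target in out_links[v]:
                new_rank[target] += damping * rank[v] / len(out_links[v])
        rank = new_rank
    return rank


def ranking(scores):
    """Authors sorted from highest to lowest score, ties broken by name."""
    return sorted(scores, key=lambda author: (-scores[author], author))


if __name__ == "__main__":
    print("citations:", ranking(citation_counts(CITATIONS)))
    print("in-degree:", ranking(in_degrees(CITATIONS)))
    print("PageRank: ", ranking(pagerank(CITATIONS)))
```

Authors who are never cited are absent from the first two rankings but still receive a baseline PageRank score, which is one reason the methods can order the same authors differently.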



Scientometrics
Language: English
Size: B5
Year of Foundation: 1978
Volumes per Year: 1
Issues per Year: 12
Founder: Akadémiai Kiadó
Founder's Address: H-1117 Budapest, Hungary; 1516 Budapest, PO Box 245
Publisher: Akadémiai Kiadó; Springer Nature Switzerland AG
Publisher's Address: H-1117 Budapest, Hungary; 1516 Budapest, PO Box 245; CH-6330 Cham, Switzerland, Gewerbestrasse 11
Responsible Publisher: Chief Executive Officer, Akadémiai Kiadó
ISSN: 0138-9130 (Print), 1588-2861 (Online)