  • 1 Observatoire des sciences et des techniques (OST) 93 rue de Vaugirard 75006 Paris France
  • 2 INRA SAE2 LERECO, Unit 1134 Nantes France

Abstract  

Among advanced methods for delineating and mapping scientific fields, hybrid methods offer a promising way to combine the advantages of word-based and citation-based approaches. One way to validate hybrid approaches is to work in cooperation with experts of the fields under scrutiny. We report here an experiment in the field of genomics, where a corpus of documents was built by a hybrid citation-lexical method and then clustered into research themes. Experts of the field were involved in the various stages of the process: lexical queries for building the initial set of documents (the seed); citation-based extension aimed at reducing silence (relevant documents that were missed); and final clustering to identify noise and allow discussion of border areas. The analysis of the experts’ advice shows a high level of validation of the process, which combines a high-precision, low-recall seed, obtained by journal and lexical queries, with a citation-based extension that enhances recall. These findings on the genomics field suggest that hybrid methods can efficiently retrieve a corpus of relevant literature, even in complex and emerging fields.
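
The seed-then-extension logic described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the toy documents, the query term `"genom"`, the `min_links` threshold, and the set of expert-validated documents are all hypothetical, chosen only to show how a citation-based extension might raise recall over a lexical seed.

```python
def build_seed(docs, terms):
    """Lexical seed: keep documents whose title contains any query term."""
    terms = [t.lower() for t in terms]
    return {
        doc_id
        for doc_id, doc in docs.items()
        if any(t in doc["title"].lower() for t in terms)
    }


def extend_by_citation(docs, seed, min_links=2):
    """Add non-seed documents citing at least `min_links` seed documents."""
    added = {
        doc_id
        for doc_id, doc in docs.items()
        if doc_id not in seed and len(set(doc["refs"]) & seed) >= min_links
    }
    return seed | added


def precision_recall(retrieved, relevant):
    """Precision and recall of `retrieved` against an expert-validated set."""
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical toy corpus: titles and outgoing references are invented.
docs = {
    "d1": {"title": "Genome sequencing of wheat", "refs": []},
    "d2": {"title": "Comparative genomics of yeast", "refs": ["d1"]},
    "d3": {"title": "Expression profiling in maize", "refs": ["d1", "d2"]},
    "d4": {"title": "Soil chemistry survey", "refs": ["d1"]},
    "d5": {"title": "Market prices for grain", "refs": []},
}
relevant = {"d1", "d2", "d3"}  # assumed expert judgement, for illustration

seed = build_seed(docs, ["genom"])
corpus = extend_by_citation(docs, seed, min_links=2)

print("seed:", sorted(seed), precision_recall(seed, relevant))
print("extended:", sorted(corpus), precision_recall(corpus, relevant))
```

In this toy case the seed misses `d3`, whose title contains no query term (silence), while the extension recovers it through its two citations to seed documents. The threshold keeps out `d4`, which cites the seed only once and would otherwise add noise.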