Authors:L. Quoniam, F. Balme, H. Rostaing, E. Giraud, and J. Dou
Zipf's law was used to qualify all the keywords of the documents in a data set. This qualification was used to build a graphical
representation of the resulting indicator for each document. The graphical resolution distributes the documents in a three-
dimensional space. This graphical representation was used as an information retrieval tool that requires no keyword. A
case study is available on the Internet. The graph is drawn in Virtual Reality Markup Language (VRML), allowing
a dynamic picture which is linked to a Database Management System (FreeWais). The experiment was designed to give a first
impression of a document data set by querying without any keyword.
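The abstract does not state how Zipf's law was applied to the keywords, so the following is only a minimal sketch of the underlying rank-frequency idea, not the authors' method. It ranks keywords by corpus frequency and reports the product rank × frequency, which Zipf's law expects to be roughly constant in large corpora. The function name `zipf_table` and the toy keyword lists are hypothetical.

```python
from collections import Counter

def zipf_table(documents):
    """Rank keywords by corpus frequency.

    Returns a list of (rank, keyword, frequency, rank * frequency) tuples.
    """
    # Count every keyword occurrence across all documents.
    counts = Counter(kw for doc in documents for kw in doc)
    # Sort by descending frequency, then alphabetically for a stable order.
    ordered = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [(rank, kw, freq, rank * freq)
            for rank, (kw, freq) in enumerate(ordered, start=1)]

# Toy data (assumption): each document is a list of its keywords.
docs = [
    ["zipf", "retrieval", "vrml"],
    ["zipf", "retrieval"],
    ["zipf", "bibliometrics"],
]

for row in zipf_table(docs):
    print(row)
```

On a corpus this small the product column is not informative; the sketch only shows how the rank and frequency of each keyword could be obtained before any per-document indicator is built.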
Authors:Ying Ding, Gobinda Chowdhury, and Schubert Foo
A journal co-citation analysis of fifty journals and other publications in the information retrieval (IR) discipline was conducted over three periods spanning 1987 to 1997. Relevant data retrieved from the Science Citation Index (SCI) and Social Science Citation Index (SSCI) are analysed according to the highly cited journals in various disciplines, especially in the Library & Information Science area. The results are compared with previous research that covered data only from the Social Science Citation Index (SSCI). The analysis reveals that there is no distinct difference between these two sets of results. The results of the current study show that the IR speciality is multi-disciplinary, with broad relations to other specialities. The field of IR is a mature field, as the journals used for research communication remained quite stable during the study period.
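As an illustration of the counting step behind a journal co-citation analysis, the sketch below tallies how often two journals appear together in the reference list of the same citing paper. It is a minimal example under stated assumptions, not the procedure used in the study: the function name `cocitation_counts`, the placeholder journal labels, and the toy reference lists are all hypothetical.

```python
from collections import Counter
from itertools import combinations

def cocitation_counts(reference_lists):
    """Count, for each pair of journals, the number of citing papers
    whose reference list contains both journals."""
    counts = Counter()
    for refs in reference_lists:
        # Deduplicate so a journal cited twice in one paper counts once,
        # and sort so each pair has a single canonical order.
        for a, b in combinations(sorted(set(refs)), 2):
            counts[(a, b)] += 1
    return counts

# Toy data (assumption): one list of cited journals per citing paper.
papers = [
    ["Journal A", "Journal B", "Journal C"],
    ["Journal A", "Journal B"],
    ["Journal A", "Journal C", "Journal C"],
]

counts = cocitation_counts(papers)
for pair, n in sorted(counts.items()):
    print(pair, n)
```

A real analysis would typically normalise these raw counts and feed the resulting matrix into clustering or multidimensional scaling; those steps are left out here.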
The most popular method for judging the impact of biomedical articles is citation count, which is the number of citations an article receives.
The most significant limitation of citation count is that it cannot evaluate articles at the time of publication since citations
accumulate over time. This work presents computer models that accurately predict citation counts of biomedical publications
within a deep horizon of 10 years using only predictive information available at publication time. Our experiments show that
it is indeed feasible to accurately predict future citation counts with a mixture of content-based and bibliometric features
using machine learning methods. The models pave the way for practical prediction of the long-term impact of publications, and
their statistical analysis provides greater insight into citation behavior.
Authors:Patricia Laurens, Michel Zitt, and Elise Bassecoulard
In advanced methods of delineation and mapping of scientific fields, hybrid methods open a promising path to the capitalisation
of advantages of approaches based on words and citations. One way to validate the hybrid approaches is to work in cooperation
with experts of the fields under scrutiny. We report here an experiment in the field of genomics, where a corpus of documents
has been built by a hybrid citation-lexical method, and then clustered into research themes. Experts in the field were involved
in the various stages of the process: lexical queries for building the initial set of documents, the seed; citation-based
extension aiming at reducing silence (missed relevant documents); final clustering to identify noise and allow discussion on border areas. The analysis
of the experts' advice shows a high level of validation of the process, which combines a high-precision, low-recall seed, obtained
by journal and lexical queries, with a citation-based extension enhancing the recall. These findings on the genomics field suggest
that hybrid methods can efficiently retrieve a corpus of relevant literature, even in complex and emerging fields.
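The trade-off described above, a high-precision and low-recall seed followed by an extension that raises recall, can be made concrete with the standard definitions of precision and recall. The sketch below is only an illustration: the document identifiers, the set of relevant documents, and the function name `precision_recall` are hypothetical, and the numbers are not results from the study.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) of a retrieved set against a relevant set."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    # Guard against empty sets to avoid division by zero.
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall

# Toy data (assumption): documents judged relevant by experts.
relevant = {"d1", "d2", "d3", "d4", "d5"}

# Seed from journal and lexical queries: few documents, all relevant.
seed = {"d1", "d2"}

# Citation-based extension: recovers missed documents (less silence)
# at the cost of one irrelevant document (some noise).
extended = seed | {"d3", "d4", "n1"}

print("seed    ", precision_recall(seed, relevant))
print("extended", precision_recall(extended, relevant))
```

In this toy case the extension doubles recall while precision drops slightly, which is the pattern the final clustering and expert review are meant to keep under control.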
Authors:Peter Mutschke, Philipp Mayr, Philipp Schaer, and York Sure
focuses on the applicability of science models in scholarly Information Retrieval (IR) with regard to the improvement of search strategies in growing scientific information spaces. Introducing an IR perspective in science modeling is motivated by the fact
-citation analysis (ACA) is a bibliometric technique that provides an understanding of the intellectual structure of disciplines (White and Griffith 1982). It has been applied to understand intellectual structure in many fields, such as information retrieval (Ding
In this paper, the internal law governing delay in the secondary literature publishing process is presented. The process is shown
to obey the partial differential equation of the periodical literature publishing process. A definite solution of the publishing
delay process is derived, and from this particular solution an expression for the average publication delay indicator
is deduced. The paper then studies the problem that some primary literature is missed in information retrieval, and
establishes the relationship between the average delay indicator and the miss ratio of primary literature in index
periodicals or databases. It also proposes that the primary literature be used as a supplemental tool in information
retrieval to guarantee the recall ratio.
Authors:Marie-Angèle De Looze and Juliette Lemarié
Different corpora are analysed by means of co-word analysis, in the framework of a technological watch on the industrial valorization
of plant proteins. The comparison of keyword clusters reveals unequal results, raising the question of the relevance of information
retrieval. The corpora compiled do not provide all the important signals that can be expected from this type of study. Searches
across five databases provide increasingly detailed images, which allow rapid progress, with the experts, towards
critical points of information.