Authors: Thomas Gurney, Edwin Horlings, and Peter van den Besselaar
)—where variance in names is substantially lower—reinforce the need for an automated approach to author disambiguation.
There is a need for algorithms designed to extract patterns of similarity from different variables, patterns that can set one
Authors: Hyunseok Park, Janghyeok Yoon, and Kwangsoo Kim
industries require more technologies for individual products (Carree et al. 2000). Thus, an automated method is necessary to support experts.
In fact, various factors such as patentability, technological similarity, and scope of claims should be
In several scientific areas—like ecology, information retrieval, machine learning, psychology and scientometrics—the measurement of similarity between objects plays a role. In scientometrics, examples of considered
Authors: Qiuju Zhou, Ronald Rousseau, Liying Yang, Ting Yue, and Guoliang Yang
science, technology and society.
The field of ecology has a long tradition of studies related to diversity. Consequently, there exists an extensive literature on measures of diversity within populations/communities and dissimilarity or similarity
reliable. However, the analysis of patents on the basis of combined concepts gives rise to three questions for research:
How can combined concepts be extracted and built from patents? How can textual similarities between patents be assessed on the basis of
asking researchers to rate the papers they read, and so on. Finally, inter-researcher similarity measures are calculated on these data to identify the scientists sharing most interests with the user of the SLRS.
Although mature and appealing in
Journals covered by the 2006 Science Citation Index Journal Citation Reports database have been subjected to a clustering
procedure utilizing h-similarity as the underlying similarity measure. Clustering complemented with a prototyping routine
yielded readily interpretable results that are compatible with existing taxonomies of science and refine them further.
Hirsch’s concept of h-index was used to define a similarity measure for journals. The h-similarity is easy to calculate from
the publicly available data of the Journal Citation Reports, and allows for plausible interpretation. On the basis of h-similarity,
a relative eminence indicator of journals was determined: the ratio of a journal's JCR impact factor to the weighted average
of the impact factors of similar journals. This standardization allows journals from disciplines with lower average citation
levels (mathematics, engineering, etc.) to get into the top lists.
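The ratio described in this abstract can be illustrated with a minimal sketch. The abstract does not say how the weighted average is weighted, so the sketch assumes that each similar journal is weighted by its h-similarity to the journal being rated. The function name `relative_eminence` and all numbers are hypothetical and do not come from the paper.

```python
def relative_eminence(impact_factor, similar_journals):
    """Ratio of a journal's impact factor to the weighted average
    impact factor of its similar journals.

    similar_journals: list of (weight, impact_factor) pairs.
    ASSUMPTION: the weight is the h-similarity to the rated journal;
    the abstract does not specify the weighting scheme.
    """
    total_weight = sum(weight for weight, _ in similar_journals)
    if total_weight <= 0:
        raise ValueError("similar journals must carry positive total weight")
    weighted_average = (
        sum(weight * factor for weight, factor in similar_journals) / total_weight
    )
    return impact_factor / weighted_average


# Hypothetical journal with impact factor 2.5 and two similar journals.
neighbours = [(3, 1.0), (1, 2.0)]  # (assumed h-similarity weight, impact factor)
score = relative_eminence(2.5, neighbours)
print(score)
```

A value above 1 would mean that the journal is cited more than its similar journals on average, regardless of the absolute citation level of its field.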
The measurement of textual patent similarities is crucial for important tasks in patent management, be it prior art analysis,
infringement analysis, or patent mapping. In this paper the common theory of similarity measurement is applied to the field
of patents, using solitary concepts as basic textual elements of patents. After unpacking the term ‘similarity’ at a
content-oriented and a form-oriented level and presenting a basic model of understanding, the paper outlines a segmented
approach to measuring the underlying variables, the similarity coefficients, and the criteria-related profiles of their
combinations. The result is a guided route to applying textual patent similarities, of interest for both theory and practice.
This paper investigates the utility of the Inclusion Index, the Jaccard Index and the Cosine Index for calculating similarities
of documents, as used for mapping science and technology. It is shown that, provided that the same content is searched across
various documents, the Inclusion Index generally delivers more exact results, in particular when computing the degree of similarity
based on citation data. In addition, various methodologies such as co-word analysis, Subject-Action-Object (SAO) structures,
bibliographic coupling, co-citation analysis, and self-citation links are compared. We find that the first two tend
to capture semantic similarities, which differ from the knowledge flows expressed by the citation-based methodologies.
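For orientation, the three indices named in this abstract can be sketched in their common set-based forms. This assumes each document is represented as a set of items (for example cited references or index terms), and that the Inclusion Index divides the overlap by the size of the smaller set. The abstract itself gives no formulas, so these definitions, the function name `overlap_measures`, and the sample data are assumptions.

```python
from math import sqrt


def overlap_measures(items_a, items_b):
    """Inclusion, Jaccard and Cosine indices for two documents,
    each given as a collection of items (ASSUMED set representation)."""
    a, b = set(items_a), set(items_b)
    if not a or not b:
        raise ValueError("both documents need at least one item")
    shared = len(a & b)
    return {
        # overlap relative to the smaller document
        "inclusion": shared / min(len(a), len(b)),
        # overlap relative to the union of both documents
        "jaccard": shared / len(a | b),
        # overlap relative to the geometric mean of the two sizes
        "cosine": shared / sqrt(len(a) * len(b)),
    }


# Hypothetical reference lists of two documents.
doc_a = {"ref1", "ref2", "ref3", "ref4"}
doc_b = {"ref3", "ref4", "ref5", "ref6", "ref7", "ref8"}
print(overlap_measures(doc_a, doc_b))
```

With these definitions the Inclusion Index is never smaller than the Cosine Index, which in turn is never smaller than the Jaccard Index, so the choice of index changes how strongly a short document is linked to a long one.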