and text mining methods. Garfield, after establishing the Science Citation Index (SCI) in the 1950s with the assistance of his colleagues at the Institute for Scientific Information (ISI, now Thomson Reuters), initiated new methods for creating science maps with
Authors: Tom Magerman, Bart Van Looy, and Xiaoyan Song
In this study, we examine and validate the use of existing text mining techniques (based on the vector space model and latent
semantic indexing) to detect similarities between patent documents and scientific publications. Clearly, experts involved
in domain studies would benefit from techniques that allow similarity to be detected—and hence facilitate mapping, categorization
and classification efforts. In addition, given current debates on the relevance and appropriateness of academic patenting,
the ability to assess content-relatedness between sets of documents—in this case, patents and publications—might become relevant
and useful. We list several options available to arrive at content-based similarity measures. Different variants of the vector space model and latent semantic indexing approach have been selected and applied to the publications and patents of a sample
of academic inventors (n = 6). We validated the outcomes using independently obtained similarity scores from human raters. While we conclude
that text mining techniques can be valuable for detecting similarities between patents and publications, our findings also
indicate that the various options available to arrive at similarity measures vary considerably in terms of accuracy: some
generally accepted text mining options, like dimensionality reduction and LSA, do not yield the best results when working
with smaller document sets. Implications and directions for further research are discussed.
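The vector space model underlying the similarity measures discussed above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation; the patent and publication texts below are invented for the example:

```python
import math
from collections import Counter

def cosine_similarity(doc_a, doc_b):
    """Cosine similarity between two token lists under a plain
    term-frequency vector space model (no TF-IDF weighting, no LSI)."""
    va, vb = Counter(doc_a), Counter(doc_b)
    dot = sum(va[t] * vb[t] for t in set(va) & set(vb))
    norm = (math.sqrt(sum(c * c for c in va.values()))
            * math.sqrt(sum(c * c for c in vb.values())))
    return dot / norm if norm else 0.0

# Illustrative stand-ins for a patent abstract and a publication abstract.
patent = "semantic indexing of patent text".split()
paper = "latent semantic indexing for text similarity".split()
print(round(cosine_similarity(patent, paper), 3))  # → 0.548
```

In practice the abstract's pipeline would add TF-IDF weighting and, for the LSI variants, a truncated SVD of the term-document matrix before computing cosine similarities.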
Authors: Byungun Yoon, Sungjoo Lee, and Gwanghee Lee
With the growing recognition of the importance of knowledge creation, knowledge maps are being regarded as a critical tool
for successful knowledge management. However, the various methods of developing knowledge maps mostly depend on unsystematic
processes and the judgment of domain experts with a wide range of untapped information. Thus, this research aims to propose
a new approach to generate knowledge maps by mining document databases that have hardly been examined, thereby enabling an
automatic development process and the extraction of significant implications from the maps. To this end, the accepted-research-proposal
database of the Korea Research Foundation (KRF), a huge repository of research knowledge, is investigated
to induce a keyword-based knowledge map. During the development process, text mining plays an important role in extracting
meaningful information from documents, and network analysis is applied to visualize the relations between research categories
and measure the value of network indices. Five types of knowledge maps (core R&D map, R&D trend map, R&D concentration map,
R&D relation map, and R&D cluster map) are developed to explore the main research themes, monitor research trends, discover
relations between R&D areas, regions, and universities, and derive clusters of research categories. The results can be used
to establish a policy to support promising R&D areas and devise a long-term research plan.
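The keyword-network step described above can be illustrated with a small co-occurrence sketch. The proposal records and keywords are invented, and the degree measure is only a simple proxy for the network indices the abstract mentions:

```python
from collections import Counter
from itertools import combinations

# Hypothetical keyword lists extracted from research-proposal records.
proposals = [
    ["text mining", "knowledge map", "network analysis"],
    ["text mining", "patent analysis"],
    ["knowledge map", "network analysis", "R&D policy"],
]

# Edge weight = number of proposals in which two keywords co-occur.
edges = Counter()
for kws in proposals:
    for a, b in combinations(sorted(set(kws)), 2):
        edges[(a, b)] += 1

# Weighted-degree proxy: total co-occurrence weight per keyword.
degree = Counter()
for (a, b), w in edges.items():
    degree[a] += w
    degree[b] += w

print(degree.most_common(2))
```

High-degree keywords would mark core R&D themes; clustering the same network would yield the R&D cluster map.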
Authors: Xinhai Liu, Wolfgang Glänzel, and Bart De Moor
In this section, we apply our algorithms to a real application: the analysis of the Web of Science (WoS) journal set. Our objective is to map these journals onto different subjects using clustering algorithms. Recently, many researchers have applied text mining
of text mining for this purpose is not far to seek. The easiest way of monitoring the emergence of research topics is certainly to consider the growing frequency of specific terms within a given research area. However, textual similarity based on
both at least 50 papers and more than 30 citations. After pre-processing, we obtain 8,305 journals as the data set used in this research.
As a first step, we retrieved lexical and citation
is using TF-IDF terms or keywords. Because of the above-mentioned problems, more sophisticated solutions have been developed, but most are based on text mining (e.g., Lamirel and Attik 2008). In the present study we propose an alternative method to
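The TF-IDF weighting mentioned in this excerpt can be sketched briefly. This is a generic illustration with an invented corpus, not the study's actual weighting scheme:

```python
import math
from collections import Counter

def tfidf(corpus):
    """Per-document TF-IDF weights; corpus is a list of token lists.
    Uses raw term frequency and idf = log(N / df)."""
    n = len(corpus)
    df = Counter(t for doc in corpus for t in set(doc))
    return [{t: tf * math.log(n / df[t]) for t, tf in Counter(doc).items()}
            for doc in corpus]

# Toy corpus standing in for journal term profiles.
docs = [["journal", "clustering"],
        ["journal", "citation"],
        ["citation", "analysis"]]
w = tfidf(docs)
```

Terms occurring in every document get weight zero, which is why rare, topic-specific terms dominate the resulting journal profiles.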
naturally occurring agents.
To demonstrate the use of text mining to gain a better understanding of SARS and SARS-CoV, and how this information could be used to inform preparedness and response activities, a three-part examination of the global SARS
: University of Illinois Press. (Topics in the Digital Humanities).
Jockers, Matthew L. - Underwood, Ted 2016. Text Mining and the Humanities. In Schreibman, Susan - Siemens, Raymond George - Unsworth, John (eds.) A New Companion to
The paper examines the applicability of informetric methods to trace the pattern of debate about the three main critical issues of the modern Welfare State in Denmark: economic aspects, legitimacy, and functionality. The methodology of issue tracking is used to follow the development of these issues over successive periods through national databases of various types, covering the research, implementation, press, and legislation aspects. The approach taken is novel in that it implements and tests issue tracking in this area of the social sciences, and tries to reduce subjectivity in the analysis of trends influencing social policy and public opinion. The study aims to show how emerging data and text mining techniques can be applied to integrate downloaded bibliographic data with other types of information in a strategic mix.