In this paper we investigate — at a country level — the relationship between the science intensity of patents and technological
productivity, taking into account differences in terms of scientific productivity. The number of non-patent references in
patents is considered as an approximation of the science intensity of technology whereas a country’s technological and scientific
performance is measured in terms of productivity (i.e., number of patents and publications per capita). We use USPTO patent data
pertaining to biotechnology for 20 countries covering the time period 1992–1999. Our findings reveal mutual positive relationships
between scientific and technological productivity for the countries involved. At the same time, technological productivity
is positively associated with the science intensity of patents. These results are confirmed when introducing time effects.
These observations corroborate the construct validity of science intensity as a distinctive indicator and suggest its usefulness
for assessing science and technology dynamics.
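
The indicators described above can be illustrated with a minimal sketch. It assumes that productivity is expressed per million inhabitants and that science intensity is operationalized as the average number of non-patent references per patent; both choices, along with the `CountryYear` record, the `indicators` function and the numbers used, are hypothetical and serve only to make the definitions concrete.

```python
from dataclasses import dataclass


@dataclass
class CountryYear:
    """One country-year observation (hypothetical record layout)."""
    country: str
    year: int
    patents: int                # patents granted in the field
    publications: int           # scientific publications in the field
    non_patent_refs: int        # non-patent references cited across those patents
    population_millions: float  # assumption: per-capita means per million inhabitants


def indicators(row: CountryYear) -> dict:
    """Compute the three indicators for a single country-year."""
    tech = row.patents / row.population_millions
    sci = row.publications / row.population_millions
    # Assumption: science intensity = average non-patent references per patent.
    intensity = row.non_patent_refs / row.patents if row.patents else 0.0
    return {
        "tech_productivity": tech,
        "sci_productivity": sci,
        "science_intensity": intensity,
    }


# Made-up numbers for illustration only; not data from the study.
example = CountryYear("A", 1995, patents=50, publications=200,
                      non_patent_refs=400, population_millions=10.0)
result = indicators(example)
```

With such values computed for each country and year, the relationships reported above could then be examined with a regression that includes time effects.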

In this study, we examine and validate the use of existing text mining techniques (based on the vector space model and latent
semantic indexing) to detect similarities between patent documents and scientific publications. Clearly, experts involved
in domain studies would benefit from techniques that allow similarity to be detected—and hence facilitate mapping, categorization
and classification efforts. In addition, given current debates on the relevance and appropriateness of academic patenting,
the ability to assess content-relatedness between sets of documents—in this case, patents and publications—might become relevant
and useful. We list several options available to arrive at content-based similarity measures. Several variants of the vector
space model and of latent semantic indexing were selected and applied to the publications and patents of a sample
of academic inventors (n = 6). We validated the outcomes against independently obtained scores from human raters. While we conclude
that text mining techniques can be valuable for detecting similarities between patents and publications, our findings also
indicate that the various options available to arrive at similarity measures vary considerably in terms of accuracy: some
generally accepted text mining options, like dimensionality reduction and LSA, do not yield the best results when working
with smaller document sets. Implications and directions for further research are discussed.
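
To make the vector space approach concrete, the following is a minimal sketch of one possible variant: TF-IDF weighting with cosine similarity, written in plain Python. The tokenizer, the smoothed IDF formula, the function names and the three toy documents are all assumptions made for illustration; they are not the configurations or data evaluated in the study, and the latent semantic indexing step (a truncated singular value decomposition of the term-document matrix) is not shown.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens (a simplifying assumption)."""
    return re.findall(r"[a-z]+", text.lower())


def tfidf_vectors(docs):
    """Return one sparse TF-IDF vector (a dict of term -> weight) per document."""
    counts = [Counter(tokenize(d)) for d in docs]
    n = len(docs)
    df = Counter()
    for c in counts:
        df.update(c.keys())  # document frequency: each term counted once per document
    # Smoothed IDF (an assumed weighting choice); terms found in every document keep weight 1.
    idf = {t: math.log((1 + n) / (1 + df[t])) + 1.0 for t in df}
    return [{t: tf * idf[t] for t, tf in c.items()} for c in counts]


def cosine(u, v):
    """Cosine similarity between two sparse vectors."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


# Toy documents, invented for illustration only.
patent = "method for producing recombinant protein in yeast cells"
related_pub = "recombinant protein expression in yeast"
unrelated_pub = "survey of labor market dynamics in europe"

vec_patent, vec_related, vec_unrelated = tfidf_vectors(
    [patent, related_pub, unrelated_pub]
)
sim_related = cosine(vec_patent, vec_related)
sim_unrelated = cosine(vec_patent, vec_unrelated)
```

In this toy setting the patent scores higher against the related publication than against the unrelated one. Scores of this kind are what would be compared with the human raters' judgments when assessing the accuracy of a given option.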