Search Results

Abstract  

The most popular method for judging the impact of biomedical articles is citation count, the number of citations an article receives. Its most significant limitation is that it cannot evaluate articles at the time of publication, since citations accumulate over time. This work presents computer models that accurately predict citation counts of biomedical publications within a horizon of 10 years using only information available at publication time. Our experiments show that it is feasible to accurately predict future citation counts from a mixture of content-based and bibliometric features using machine learning methods. The models pave the way for practical prediction of the long-term impact of publications, and their statistical analysis provides greater insight into citation behavior.

Restricted access

Abstract  

Patents represent technological or inventive activity and output across different fields, regions, and time. The analysis of information from patents could help focus efforts in research and the economy; however, the roles of the factors that can be extracted from patent records are still not entirely understood. To better understand the impact of these factors on patent value, machine learning techniques such as feature selection and classification are used to analyze patents in a sample industry, nanotechnology. Each nanotechnology patent was represented by a comprehensive set of numerical features that describe inventors, assignees, patent classification, and outgoing references. After a careful design that included selecting the most relevant features and selecting and optimizing the accuracy of classification models aimed at finding the most valuable (top-performing) patents, we used the generated models to analyze which factors differentiate the top-performing patents from the remaining nanotechnology patents. Several factors emerge as important, such as the past performance of inventors and assignees and the count of referenced patents.

Then, the KNN classifier is used to cross-validate the classification accuracy of the feature subsets. Classification by KNN classifier: The KNN algorithm is amongst the simplest of all machine learning algorithms for

compare the obtained indicators in terms of occurrence and contingencies. Overall, our observations reveal non-trivial differences for both indicators. Methodology for characterizing NPRs A supervised machine learning approach

, Ch. M., Pattern recognition and machine learning, Springer Verlag 2006. [4] Fisher, R., The use of multiple measurements in taxonomic problems, Annals of Eugenics, 7 (1936), 179–188. [5] Fukunaga, K., Introduction to

Abstract  

Recently, philosophers of science have argued that the epistemological requirements of different scientific fields lead necessarily to differences in scientific method. In this paper, we examine possible variation in how language is used in peer-reviewed journal articles from various fields to see if features of such variation may help to elucidate and support claims of methodological variation among the sciences. We hypothesize that significant methodological differences will be reflected in related differences in scientists’ language style. This paper reports a corpus-based study of peer-reviewed articles from twelve separate journals in six fields of experimental and historical sciences. Machine learning methods were applied to compare the discourse styles of articles in different fields, based on easily extracted linguistic features of the text. Features included function word frequencies, as often used in computational stylistics, as well as lexical features based on systemic functional linguistics, which affords rich resources for comparative textual analysis. We found that the style of writing in the historical sciences is indeed readily distinguishable from that of the experimental sciences. Furthermore, the most significant linguistic features of these distinctive styles are directly related to the methodological differences posited by philosophers of science between historical and experimental sciences, lending empirical weight to their contentions.

Introduction. Data mining. Data mining is an interdisciplinary field that combines artificial intelligence, database management, data visualization, machine learning, mathematical algorithms, and statistics

CiteSeer data which we will briefly mention. Zhou et al. (2007) have investigated documents from CiteSeer to discover temporal social network communities in the domains of databases and machine learning. On the other hand, Hopcroft et al. (2004

Academic Publishers, Dordrecht, The Netherlands. Quinlan, JR 1993 C4.5: Programs for machine learning, Morgan Kaufmann, San Francisco, CA, USA

work quickly often prefer to publish in conference and workshop proceedings. Many computer science subareas have their own top conferences and journals. For example, in machine learning, the two conferences Intl. Conference on Machine
