Search Results

You are looking at 1 - 2 of 2 items for:

  • Author or Editor: Bálint Sass
  • Arts and Humanities

Abstract

Nowadays, it is quite common in linguistics to base research on data instead of introspection. Countless corpora (both raw and linguistically annotated) are available to us and provide the essential data we need. Most corpora are large, ranging from several million to several billion words in size, clearly too large to investigate word by word through close reading. Basically, there are two ways to retrieve data from them: (1) through a query interface or (2) directly by automatic text processing. Here we present principles for collecting linguistic data from corpora soundly and effectively by querying, i.e. without the programming knowledge needed to manipulate the data directly. We discuss what is worth thinking about, which tools to use, what to do by default, and how to solve problematic cases. In sum, we show how to obtain correct and complete data from corpora for linguistic research.

Open access

Abstract

The paper gives a detailed description of the “A egy N” construction in Hungarian based on a thorough investigation of carefully collected corpus data. Utterances containing this construction express a speaker-related (mostly derogatory, but sometimes appreciative) value judgement. The morphological, syntactic, and pragmatic characteristics of the construction are presented. Furthermore, some formally and pragmatically similar constructions are discussed, and some misleading claims in the earlier literature are debunked.

Open access