Foundations of generative linguistics

Abstract: This paper gives a broad overview of the ideas underlying the Chomskyan approach to linguistics. It identifies the main innovation of generative grammar in linguistics and clarifies some recurring misunderstandings about the language faculty, recursion and language universals. The paper also discusses some of the main empirical results of generative syntax.


Introduction
This paper aims to give a broad overview of the foundations underlying the Chomskyan approach to linguistics. Instead of focusing on the technical apparatus and analytical details of either Government and Binding theory (Chomsky 1981) or the Minimalist Program (Chomsky 1993), I put the spotlight on how Chomsky changed the research questions and research agenda of linguistics (section 2). In section 3 I clarify some stubborn misunderstandings about Chomskyan linguistics which regularly appear in the literature of competing frameworks. Section 4 is dedicated to the fields of application and data collection methods of generative grammar. Some of the main empirical results of generative syntax are discussed in section 5.

Generative grammar and its relation to transformations
The linguistic theory that grew out of Chomsky's work is called generative grammar. In this context, 'grammar' refers to all levels of language with a structure, that is, it includes phonology, morphology and syntax as well as (compositional, non-lexical) semantics. The foundations of this theory were laid down in Syntactic structures (Chomsky 1957). (The newest edition of this book, Hornstein et al. 2018, supplemented with several state-of-the-art chapters reflecting on the original book's impact on the field, is reviewed in this issue in Den Dikken 2019.)

Transformations
'Generative grammar' is also often called 'transformational generative grammar'. This is because a substantial amount of work in generative grammar assumes that the directly observable data constitute a so-called surface structure, which is derived from a more abstract, not directly observable deep structure via a conversion mechanism (or mechanisms). 'Transformation' is the technical term for this linguistic conversion apparatus.
Transformations in phonology include insertion, deletion, assimilation, dissimilation and metathesis. These are illustrated in (1) with examples from Hungarian. The best-known syntactic transformation is movement, a conversion that operates on the word order of phrases and clauses. Let us consider an example with English PPs. In the basic word order the adposition precedes its nominal complement.
(2) Kate was talking [PP to the boy]
If we would like to use Kate was talking to X as a predicate of a relative clause, then the prepositional object must be replaced by a wh-item.

But in a well-formed relative clause, the wh-constituent must also appear at the beginning of the clause, either as in (4a) or as in (4b). Of particular interest is (4b), in which the P and the prepositional object are not in the same constituent, and within the clause, the prepositional object in fact precedes the adposition. In transformational generative grammar such sentences start out as (3), with the wh-item inserted after the preposition, and then the wh-item is fronted via a movement transformation. The view that at deep structure the wh-item is inside the PP explains why to, rather exceptionally for this P, appears to be used intransitively (at deep structure, it does have a complement).
Generative grammar is often identified as a grammar that uses transformations. This, however, is mistaken for two reasons. Firstly, transformations were not introduced by Chomsky: they were already part of the work of Zellig Harris, Chomsky's linguistics professor (albeit in the form of static statements rather than dynamic conversions, cf. Harris 1951, vi). For instance, Harris derived constituent questions from the corresponding declarative (kernel) sentences similarly to our relative pronoun preposing examples above (Harris 1957, 317).
Secondly, there are generative grammars that do not use transformations at all. Lexical Functional Grammar (Bresnan 1982) and Head-driven Phrase Structure Grammar (Pollard & Sag 1987) are cases in point: these are generative frameworks that offer alternatives to Chomsky's particular way of thinking about language. Moreover, even within Chomskyan grammar there are so-called representational approaches which do not involve transformations (see e.g., Brody 1995; 2000; 2002). In these analyses the information encoded in 'deep structure' and 'surface structure' is part of the same syntactic representation. In the case of constituent questions, for instance, the wh-constituent is inserted at the beginning of the sentence, and the position that a corresponding non-wh-item would occupy in a declarative clause is filled by an unpronounced pronominal element that is co-indexed with the interrogative phrase. All transformational analyses can be rendered with such representations, as derivations and representations are essentially just notational variants of each other. As Hale (1999, 1) put it, they are 'formally equivalent, mutually intertranslatable ways of characterizing sets of linguistic expressions'.

To summarize, transformations were not introduced by generative grammar, and they are not necessarily part of a generative grammar either, so they cannot be the real innovation brought by generative grammar.

The real novelty of generative grammar
A generative grammar is a grammar which models language with formal and explicit rules. It can generate (or predict the grammaticality of) all grammatical expressions in the language but does not over-generate, that is, its rules do not output any ungrammatical expressions. In addition, it can also explain the logical problem of language acquisition.
The logical problem of language acquisition (also called the poverty of stimulus or Plato's problem) means that any child with a normal development can learn any language with equal ease, without explicit instruction, while the linguistic input is poorer in both quantity and quality than the grammar reached at the end of the acquisition process. Differently put, the input under-determines the final acquired grammar; speakers know rules of their language that they could not have deduced just on the basis of the input. 2 Rules which cannot be acquired only on the basis of the input fall into two types: those concerning the grammaticality of sentences and those concerning the interpretation of sentences. Let us illustrate both types of rules.
As for rules affecting grammaticality, let us start with the observation that English is not a pro-drop language: as a rule, neither subject nor object pronouns can be silent. Thus while (5c) is grammatical, it is an elliptical sentence.
(5) a. I know this/that.
b. I know that you saw him.
c. I know.
2 That speakers make use of hierarchy-based generalizations in contexts for which data are not available has been particularly clearly shown by the artificial language learning experiments of Culbertson & Adger (2014), in which critical syntactic evidence for the tested construction was deliberately withheld from the participants. The poverty of stimulus, of course, does not mean that the input is not critical for language acquisition; without input, there cannot be any language acquisition. It is not the case either that no rule of a language can be learned on the basis of only the input. The basic word order or the case-alignment system of a language, for instance, can be learned by relying only on the input. The poverty of the stimulus means that not all linguistic rules are like this.
While the language acquirer may conclude from the contrast between (6a) and (6b) that a constituent question in the matrix clause licenses object drop in the embedded clause, the rule is not so simple: in (6b) a subject wh-question in the matrix does not allow for object drop in the downstairs clause.
Moreover, (8a) and (8b) are also grammatical, but they contain no constituent question. In (6b), the object of the superordinate clause is a wh-element, in (8a) it is a relative pronoun, and in (8b) it is focused. What is common to the distribution of interrogative phrases, relative pronouns and focussed phrases is that they are logical operators, and they have to be preposed to a left-peripheral position in the clause. As the basic word order is SVO, in the superordinate sentences in question the object is not in its default post-verbal position. That position is occupied by a 'gap' (in derivational approaches, the trace of the moved phrase, while in representational approaches a null pronominal element). This post-verbal 'gap' is what licenses the pro-drop of object pronouns in the relevant sentences. Gaps in embedded clauses which depend on an operator-related gap in the higher clause (such as pro-drop of English object pronouns) are called parasitic gaps. The rules governing the distribution of parasitic gaps would be extremely difficult to acquire based on the input alone. The language learner would have to hear very many examples to arrive at the conclusion that object drop is licensed by an operator-related object gap in the higher clause, and that an operator-related subject gap, for instance, is not sufficient for this purpose (9). It is, however, extremely unlikely that the child would hear enough relevant examples (or even one) to be able to distill this rule from the data. Yet at the end of the language acquisition process every native speaker knows whether parasitic gaps are allowed in their grammar in general and in individual sentences in particular. Parasitic gaps thus constitute an important argument for the poverty of stimulus (cf. Adger 2015b and Allott & Rey 2017, among others).
Let us now turn to rules affecting the interpretation of sentences. Consider (10), which contains the wh-pronoun where in the matrix clause.
(10) Where did John say that Mary should get off?
(10) is ambiguous: 'where' can be understood to belong to the matrix clause (asking about the place of saying) or to the embedded clause (asking about the place of getting off).
(11) is largely parallel to (10) on the surface, but this time Mary is a structural focus in the embedded clause.
(11) Where did John say that it is Mary that should get off?
In contrast to (10), however, (11) is not ambiguous. 'Where' must be understood to belong to the matrix clause ('where did John say'); asking about the embedded event ('where should Mary get off') is not an available interpretation. Based on the input alone, it would be very difficult for the language learner to figure out that the latter reading is categorically excluded, rather than that the sentence is ambiguous and all similar sentences that s/he has encountered so far merely happened to have the 'upstairs interrogative' reading. Minimal pairs like (10) and (11) thus constitute another type of strong argument for the poverty of stimulus (see also Hoekstra & Kooij 1988 and Newmeyer 2013, among others).
Generative grammar holds that the reason why language acquisition can be successful in spite of the poverty of stimulus is that humans have an innate capacity for language. This capacity is species-specific, i.e., unique to humans, and it is domain-specific, that is, it is specialized for language and is not involved in other cognitive tasks. This innate language capacity, called the language faculty, codes properties which characterize all possible human languages. As these properties do not have to be learned, the child has a head-start in the language acquisition process. When the data underdetermine a particular rule, the language faculty helps to narrow down the hypothesis-space. The human language faculty is often referred to as Universal Grammar.
The radical novelty of Chomsky's generative grammar is that it gives center stage to the child's language acquisition ability, and in doing so, it uses formal and explicit rules. The core question of generative grammar is how children can learn language; therefore, when the linguist formulates grammatical rules, it is important to consider whether it is realistic for children to learn the rule in that form. If it is not, then the rule has to be re-formulated or replaced even if it is empirically perfectly adequate in its original form. This makes for a sharp contrast with American Structuralism, the immediate predecessor of generative grammar. In the structuralist approach the central problem was what rules the linguist had to formulate to correctly capture the data. Whether the rules could realistically be part of speakers' competence in the form given by linguists was not an issue.
To summarize, the aim of generative grammar is to understand the human linguistic capacity and the mind that has this capacity. The object of inquiry is thus linguistic competence (Internal-language) rather than linguistic performance (External-language). The study of individual languages is ultimately just a tool to discover the properties of the human mind. The shift in focus from language as used in society to language in the individual's mind also significantly contributed to the cognitive turn in psychology, detailed in Pléh's (2019) contribution to this issue.
To achieve this goal, generative grammar studies individual languages and the language faculty with the scientific method that also characterizes the natural sciences. After examining the data, the initial hypotheses (linguistic rules) are set up. A good hypothesis makes predictions, which are tested on a wider range of data, and if necessary, the hypothesis is fine-tuned or completely replaced with a new one. The predictions of the refined hypothesis are subject to testing again. The process continues until the rule can generate all the relevant grammatical examples but at the same time does not over-generate. This process allows the researcher to make a lot of new empirical discoveries which would not have been possible with the methods of traditional descriptive linguistics. Thus in addition to proposing abstract formal models of language, generative grammar also enriches our empirical knowledge of language. 7

Recurring misunderstandings
Several central tenets of generative grammar are routinely misunderstood or misrepresented by critics of the framework. In this section we will discuss three such misunderstandings in detail, and clarify what the actual claims and predictions are.

The language faculty
In the previous section we have seen that according to generative grammar, humans have an inborn, species- and domain-specific capacity for language. This hypothesis makes several predictions, discussed in e.g., Allott & Rey (2017). Firstly, innateness predicts that any child with a normal development can learn any human language with equal ease. However, not all artificial languages could be learned: those which do not conform to the properties coded in the language faculty would be unlearnable. A case in point would be languages with rules that make reference to linear order rather than structural hierarchy. Artificial languages that fit this description have indeed been shown to be unlearnable qua language proper (Smith & Tsimpli 1995).
Secondly, the species-specific qualification predicts that animals, including primates, cannot learn language (where language is understood as the use of a finite set of basic building blocks to produce a potentially infinite set of complex structures). To date, no compelling evidence has emerged that primates or other animals have the capacity for the discrete infinity seen in human language.
Thirdly, the domain-specific qualification predicts the possibility of specific language impairment, that is, cases in which a child has difficulties with acquiring language without having auditory or speech-production problems or any other physical or cognitive disorders or delays. Some children indeed have such problems which affect only language (cf. Gillam & Kamhi 2010 for an overview).
Fourthly, having a specific capacity for language is also a necessary (though not sufficient) condition for having a critical period for language acquisition, the existence of which is well known (see Hartshorne et al. 2018 on some of the latest results). There are also predictions regarding specific linguistic phenomena: for instance, all non-local (i.e., long-distance) linguistic dependencies are predicted to have identical properties regardless of whether they occur in wh-dependencies, relative clauses, etc. (see section 5).
There are, however, several predictions which are wrongly attributed to generative grammar. For instance, an innate capacity for language does not mean that this capacity must be localized in a specific part of the brain: there exist innate abilities which are not localized (see Bates 1994). Complex information processing, such as language, generally involves distinct areas of the brain working together (Kandel & Hudspeth 2013, 16). Kandel and Hudspeth (ibid., 17), in fact, state that 'we now think that all cognitive abilities result from the interaction of many processing mechanisms distributed in several regions of the brain. Specific brain regions are not responsible for specific mental faculties but instead are elementary processing units' (original emphasis). Nor does an innate capacity for language mean that there is a 'language-gene' (Mendívil-Giró 2018).
Having a language faculty means that some cognitive functions are specialized for language. This, however, does not mean that they must be unrelated to more general cognitive principles. Adger and Svenonius (2015, 11), for instance, raise the possibility that linguistic principles such as Merge, the silence of copies or periodical transfer of structure to the interfaces may be 'language-specialized versions of very general cognitive and computational factors' rather than entirely language-specific principles with no analogues elsewhere in cognition.
An inborn language faculty predicts that certain very basic properties will be shared by all natural languages. Crucially, this does not mean that languages will not have important differences, too, or that 'all languages are like English'. The aim of generative grammar is to understand what is common to all human languages and what the points of parametric variation are. These objectives are also reflected in the name Principles and Parameters Theory (used for the Chomskyan framework in the 60's through the 80's): the principles are the invariant, shared properties and the parameters are the properties subject to cross-linguistic variation. Given this dual goal, generative grammar is necessarily a comparative enterprise rather than English-centric (as is often claimed).
As already mentioned in section 2, the language faculty is often referred to as Universal Grammar. In everyday use, a 'grammar' lists the inventory of the (phonological, morphological and syntactic) categories of a particular language together with the rules that govern their distribution. This use of 'grammar' is thus tightly related to surface phoneme, morpheme and word order. Universal Grammar, however, is not a 'grammar' in this sense: it encodes the general organizational principles of human language and makes no reference to surface order. One of these general linguistic principles is that syntactic rules cannot make reference to phonological features (e.g., rounded, voiced), but phonological rules (e.g., English wanna contraction) can make reference to syntactic structures. Another such principle is Endocentricity: a syntagm comprising categories A and B will itself belong to category A or B rather than a third, unrelated category, e.g., C or D. For instance, the verb run and the nominal phrase the race can be combined into the phrase run the race. This syntagm shares its category and external distribution with run, one of its sub-components. A verb and a nominal phrase could not combine into a syntagm with an exocentric AP or PP label. What we can conclude from this is that Universal Grammar is not a 'grammar' in the usual sense: it codes structural, hierarchical properties of language rather than surface order. We shall return to this point in section 3.3.
Another misunderstanding relating to the language faculty is that generativists consider language to be similar to the instincts found in the animal kingdom. In the title of his widely read popular science book, Steven Pinker indeed refers to language as an 'instinct' (Pinker 1994). This is a metaphor, however, which should not be taken literally. Instincts are involuntary reactions to stimuli; they need not be learned and remain unchanged throughout the individual's lifespan. Language, on the other hand, is acquired via learning (even if this is a special type of learning), and the child's grammar constantly changes until the end of the acquisition process. The language faculty is thus not an instinct in generative theory either (see Adger 2015a).

Recursion
The term 'recursion' has (at least) two different uses in the generative literature. Category recursion means that a grammatical category (α) can have among its sub-constituents a category of the same type (that is, one with the label α). This type of recursion is also known as self-embedding. One of the best-known examples of category recursion is the ability of finite clauses to contain finite clauses within themselves. Category recursion may not occur in every language, and within a language, its availability depends on the properties of individual lexical items (for instance, some verbs take a clausal complement while others do not).
Grammatical recursion is a completely different notion, entirely unrelated to categories. It means that humans can use a finite set of basic building blocks (morphemes, words) to build a potentially infinite set of structures. The tool for this is called Merge. Merge takes two linguistic units, A and B, and combines them into a larger linguistic unit, C. In the next step Merge can combine C with a new unit, D, to create an even larger unit. That is, Merge can use its output in the previous round of application as an input in the next round of application. For instance, gray and cat can be combined into gray cat, and this output can be Merged with this to create this gray cat, which can again be Merged with meows to produce this gray cat meows. This means that grammatical recursion characterizes the linguistic computational system as such.
Hauser et al. (2002) claim that recursion is part of Universal Grammar, that is, the human language faculty. Several critics of generative grammar have argued that recursion cannot be a universal property of human language because there are languages which apparently have no clausal embedding (e.g., Everett 2005; Evans 2014, cf. also Christiansen & Chater 2015). 10 This criticism is misplaced for two reasons. Firstly, self-embedding is not restricted to finite clauses: infinitives may also embed infinitives, NPs may embed NPs, PPs may embed PPs, etc. Lack of clausal embedding thus does not automatically mean that the language in question lacks self-embedding as such. Secondly, even if languages without any self-embedding exist, they are irrelevant to Hauser et al.'s claims. Hauser et al. talk about grammatical recursion: the proposal is that all languages are such that they have a finite set of building blocks which can be used to produce a potentially infinite set of structures. 11 They make no claim about category recursion, thus languages without it cannot be brought to bear on the existence or nature of the language faculty (cf. Arsenijević & Hinzen 2012; Chomsky 2014; Chomsky et al. to appear, fn. 4; Mendívil-Giró 2018).
10 The most famous of these is the Brazilian indigenous language Pirahã, which, according to Everett (2005), lacks embedded clauses. With detailed analysis of the data, Nevins et al. (2009) have shown that this view cannot be upheld. Interestingly, in his earlier work Everett also described Pirahã as a language with embedded clauses (Everett 1983, chapter 14), but Everett (2005) makes no reference to this fact and does not explain why that earlier analysis was wrong; the lack of clausal embedding is stated rather than argued for.
11 Cf. the following quote from p. 1571: 'All approaches agree that a core property of FLN
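The difference between the two notions of recursion discussed in this section can be made concrete with a small Python sketch. It is only an illustration under simplifying assumptions, not an implementation of any particular analysis: the names `Phrase`, `merge`, `words` and `has_category_recursion` are invented here, the category labels (NP, DP, VP, CP, S) are hypothetical annotations added by hand, and the merged pair is stored in left-to-right order purely so that the string can be read off.

```python
from dataclasses import dataclass
from typing import List, Optional, Union


@dataclass(frozen=True)
class Phrase:
    """The output of one application of Merge (illustrative only)."""
    left: "Unit"
    right: "Unit"
    label: Optional[str] = None  # hypothetical category label, assigned by hand


Unit = Union[str, Phrase]  # a unit is either a word or an earlier Merge output


def merge(a: Unit, b: Unit, label: Optional[str] = None) -> Phrase:
    """Combine two units into a larger unit; the result can be merged again."""
    return Phrase(a, b, label)


def words(unit: Unit) -> List[str]:
    """Read off the words of a unit from left to right."""
    if isinstance(unit, str):
        return [unit]
    return words(unit.left) + words(unit.right)


def has_category_recursion(unit: Unit, ancestors: frozenset = frozenset()) -> bool:
    """True if some labelled phrase contains a phrase bearing the same label."""
    if isinstance(unit, str):
        return False
    if unit.label is not None and unit.label in ancestors:
        return True
    below = (ancestors | {unit.label}) if unit.label is not None else ancestors
    return (has_category_recursion(unit.left, below)
            or has_category_recursion(unit.right, below))


# Grammatical recursion: each output of merge is the input of the next step.
gray_cat = merge("gray", "cat", "NP")
this_gray_cat = merge("this", gray_cat, "DP")
sentence = merge(this_gray_cat, "meows", "S")

# Category recursion: a clause (S) that contains another clause (S).
inner_clause = merge("Mary", "left", "S")
that_clause = merge("that", inner_clause, "CP")
embedded = merge("John", merge("said", that_clause, "VP"), "S")

print(" ".join(words(sentence)))          # this gray cat meows
print(has_category_recursion(sentence))   # False
print(" ".join(words(embedded)))          # John said that Mary left
print(has_category_recursion(embedded))   # True
```

In this sketch both structures are built by feeding the output of `merge` back into `merge`, so both display grammatical recursion, yet only the second contains a category inside a category of the same type. A language whose sentences all resembled the first structure would therefore still be built by a recursive procedure in the relevant sense.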

Linguistic universals
The term 'universal' is also used in two different senses in the literature. Typological universals (also called Greenberg universals) are generalizations concerning surface word order which appear to hold in every language. One of the best known examples is Greenberg's Universal 20 (Greenberg 1963), which states the possible orders of demonstratives, numerals and adjectives with respect to each other and the noun in the Noun Phrase. Greenberg's generalization is that prenominally only the Dem-Num-A-N order is found cross-linguistically as a neutral order, while post-nominally both N-Dem-Num-A and N-A-Num-Dem occur. 13 Such universals can be established on the basis of descriptive linguistics.
The universals of Universal Grammar (also known as Chomsky universals) are not concerned with word order: they characterize the human language faculty and the underlying hierarchical structure of language. Examples of such universals are the Principle of Endocentricity already mentioned above, Principles A, B and C of the Binding Theory (which regulate the hierarchical relationship between a Noun Phrase and an anaphor, pronoun or referential expression co-referent with it) and the Empty Category Principle (which places well-formedness constraints on traces). The fine details of these principles need not concern us here; what is important is that they are about structural relationships rather than word order, and they can only be identified via structural analysis of linguistic data.
Typological universals are usually borne out as a tendency; they are hardly ever without any exceptions. This fact has led some to the conclusion that there cannot be a Universal Grammar either (Tomasello 2005; Evans 2014, chap. 3, among others). This line of thinking confuses typological universals with Chomsky universals, however (see also Hornstein 2013, fn. 4, Adger 2015a). As explained above, Chomsky universals are about structural relationships. The idea that languages share hierarchical, organizational traits does not predict shared surface word order properties (just like the shared structural property of a backbone does not predict similarities between the appearance of fish, amphibians, reptiles, birds and mammals).
[...] thus the Author still fails to make a distinction between grammatical recursion and category recursion.
13 Later work has uncovered that the post-nominal part of this generalization does not hold in this form: Hawkins (1983) suggests that there are no constraints on postnominal order, while according to Cinque (2005) more post-nominal orders occur, yet there are logically possible but unattested neutral orders post-nominally, too. The pre-nominal part of the original generalization remains uncontested, however.

Fields of application and methods
In its early days generative grammar was mostly applied in the synchronic study of syntax and phonology. Since then, the theory has been adopted in morphological and diachronic analyses as well as in the study of sign languages, heritage languages, dialectal micro-variation, (monolingual, bilingual and L2) language acquisition and language deficits (e.g., aphasia and SLI). Data from these different fields constantly inform the theory and improve its hypotheses. The data themselves have a variety of sources. In addition to utilizing introspection and grammaticality judgments from native speakers, data also come from corpora, from (picture or video prompt based) directed production tasks and from psycho-linguistic experimental methods (including eye-tracking). All of these sources have usefulness and validity as well as limitations.
Grammaticality judgments as a source of data have long been criticized on two major fronts by proponents of alternative frameworks. Firstly, such judgments are often regarded as unstable and non-replicable. However, introspective judgments have been shown to be stable, reliable and replicable (Cowart 1997; Sprouse & Almeida 2012). To be sure, there are better and worse ways to collect such judgments. Instead of asking for a binary grammatical versus ungrammatical decision, a seven-point Likert scale, the magnitude estimation methodology or continuous sliders can be applied; instructions should be carefully formulated, and both subject-related and task-related factors should be controlled for (Schütze 1996; Sprouse & Almeida 2017; Marty et al. 2019, among others). While these issues should be heeded, there is no reason to abandon grammaticality judgments (and as we shall see below, abandoning them is not even possible if we want to pursue the goals of generative grammar).
The second major point of criticism is that sentences which speakers are asked to judge are often not natural and do not occur in spontaneous communication (Tomasello 1998; Miller & Weinert 1998). This claim is factually wrong: Newmeyer (2010) shows that sentences with a highly complex structure do occur in natural spontaneous communication; we just have to use a large enough corpus to find them. 16 Furthermore, the structures appearing in spontaneous conversations and written registers are, in essence, identical; the detectable differences are quantitative rather than qualitative (Biber 1988). The claim about the absence of complex constructions in spontaneous language use is not only factually wrong but also irrelevant. Generative grammar aims to model the language faculty: its object of inquiry is what is possible rather than what actually occurs.
This brings us back to the issue already mentioned above: grammaticality judgments are essential in order to tap into the language faculty of speakers. As discussed in Schütze (1996), work relying exclusively on corpora cannot reliably filter out noise in the data, cannot identify constructions which are grammatical but happen not to be represented in the corpus by accident, and lacks crucial negative information about grammar (i.e., cannot identify with certainty what is ungrammatical).
Constructed examples which speakers assess for acceptability can provide negative information and have the advantage that they contain only the target construction, thus superfluous structural components which potentially have an adverse effect on processing but at the same time are not relevant to the studied issue are eliminated.
Constructed examples should be thought of as controlled experiments: with them we create the conditions under which a particular object of study can occur, such that we can observe, study and understand it, and check the predictions of the theory. In the natural sciences it is self-evident that such experiments are necessary. For instance, nobody would chide physicists for building the Large Hadron Collider in order to create a controlled environment in which they can bring about and observe the phenomena of interest to them. Little to no progress could be expected if they just waited for those phenomena to occur spontaneously instead. Constructed examples work analogously to the experiments of the natural sciences: they help us establish the limits of what is possible in human language. It is hard to imagine how any theory of language could be seriously pursued without knowledge of this.
16 Newmeyer uses the Fisher English Training Transcripts for his study, which contain transcripts of telephone conversations (more than 6,700,000 words). He shows that among the complex constructions occurring in natural speech are cross-clausal wh-dependencies (discussed in the next section), deeply embedded gaps in relative clauses, backward anaphora, gerunds with possessive subjects, gapping (ellipsis of the verb), sluicing (ellipsis after a wh-phrase), etc.

Some empirical results of generative linguistics
As already discussed in section 2, generative grammar has made both theoretical and empirical contributions to our understanding of language. In footnote 7 we mentioned the Comprehensive Grammar Resources series as an example of an empirical contribution to the description of particular languages. In this section we will look at some of the significant mid-level syntactic generalizations which came out of work in generative syntax (including Government and Binding Theory, LFG as well as HPSG) and which could not have been discovered without the methods and goals of generative linguistics. The full list of the 53 generalizations, which can be found in the unpublished notes of Svenonius (2016) and D'Alessandro (to appear), was drawn up collectively by the participants of the Generative Syntax in the 21st Century: The Road Ahead conference in 2015 in Athens.

Unified constraints on preposings to the left-periphery
We have seen in (4a) and (4b) that relative pronouns in English appear at the left periphery of the clause and are associated with a gap further down in the sentence. In many languages contrastive foci, topics and interrogative phrases are also preposed. While traditional grammars have treated these types of preposings as entirely unrelated types of dependencies, generative grammar has uncovered that they are governed by the same types of rules. It is true for all of them, for instance, that if the preposed phrase is doubled by an element clause-internally, then that element is always a pronoun rather than a particle specialized for doubling (Ross 1967). 17 (13) illustrates this with topic-doubling in English.
(13) John, I like him very much.
Wh-phrases, contrastive foci, topics and relative phrases can also be preposed into structurally higher clauses. The resulting cross-clausal dependencies are also similar in nature (they are subject to the same licensing conditions, cf. section 5.2, and are blocked in the same types of environments, see section 5.3). Preposed wh-phrases, contrastive foci, topics and relative phrases thus form a natural class (Chomsky 1977).

Cross-clausal dependencies
We have seen that constituents may be syntactically and semantically related to positions other than their surface position. A special instantiation of this possibility is when a constituent appearing in the matrix clause is syntactically and interpretationally related to the embedded clause. 18 This is the case with the interrogative phrase in (14b), for instance. Cross-clausal dependencies must always be licensed by special matrix predicates (the so-called bridge verbs of Ross 1967). Furthermore, if cross-clausal dependencies are allowed between a matrix clause and its complement clause, then they are also allowed between a matrix clause and the complement clause of its complement clause, and so on. In other words, cross-clausal dependencies are unbounded; grammar does not count. (15) illustrates this with constituent questions. Unboundedness characterizes long-distance interrogative, topic, focus, and relative dependencies alike. Importantly, similar cross-clausal dependencies are unattested with finite verbs. While finite verbs can leave the verb phrase, as illustrated in (16b)

Limits on cross-clausal dependencies: syntactic islands
While long-distance dependencies are potentially unbounded, they can only be established across particular types of syntactic domains (Ross 1967).
Syntactic domains which disallow extraction are called islands, and a crucial discovery of generative syntax is that they also have a detectable effect in languages which do not prepose their wh-phrases. As shown in (20)
In spite of the lack of preposing, a wh-phrase in an embedded clause can establish an interpretational relationship with a superordinate clause: in the first translation of (22), for instance, the wh-phrase is interpreted as being in the matrix clause. Yet this is possible only as long as the wh-item is in a complement clause and not in an adjunct clause, a relative clause, a subject clause or a coordination, that is, in a syntactic domain that disallows extraction in English (Ross 1967; Huang et al. 2009, 263).
Cross-clausal dependencies thus cannot cross certain syntactic domains, and this has reflexes both in wh-ex-situ and wh-in-situ languages. 20

Particles, adverbs and suffixes: isomorphy between syntax and morphology
It has been known for quite some time from typology that if a language expresses the categories of tense, aspect, mood and modality (TMA) with pre-verbal particles, then they appear in the order 'mood-tense-modality-aspect' (Bickerton 1974).
(24) Pyè te deja ap dòmi.
     Pyè PST already PROG sleep
     'Pyè was already sleeping.' (Haitian Creole; Cinque 1999, 63)
If two or more particles within the same category can co-occur, then they also do so in a fixed order. For instance, if the particles of the habitual and continuous aspect co-occur, then the habitual always precedes the continuous (Cinque 1999).
(25) Àsíbá ná nɔ̀tò kpikpon vi lé go.
     Àsíbá FUT HAB PROG take care of the children
     'Àsíbá will frequently be taking care of the children.' (ibid., 65)
Generative grammar (esp. Cinque 1999) has uncovered that tense-, aspect-, modality- and mood-related adverbs in languages without TMA particles also conform to this order, and that this represents the underlying hierarchy of sentences. In (26) we see that in English the adverb of evaluative mood (unfortunately) precedes the adverb of alethic possibility modality (possibly), which in turn precedes the adverb of completive aspect (completely).
(26) Unfortunately John possibly completely forgot to water the plants.
Furthermore, if TMA categories are expressed by verbal suffixes, then they appear in the reverse order 'aspect-modality-tense-mood'; that is, morphology mirrors syntax. 21
island sensitive and island non-sensitive wh-phrases have structural differences (Tsai 1994; Murphy 2017). This does not diminish the importance of the discovery that islands are also in effect in wh-in-situ grammars.
21 This is seen particularly clearly in languages in which pre-verbal particles and verbal suffixes co-occur (i).
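The ordering generalizations of this section can be made concrete with a small check. The following is only a minimal sketch in Python: the hierarchy 'mood-tense-modality-aspect' is taken from the text, whereas the function names and the category labels assigned to individual words are assumptions introduced for illustration and are not part of Cinque's (1999) analysis. Ordering among elements of the same category (e.g., habitual before continuous) is not modeled.

```python
# Hierarchy as stated in the text; everything else here is hypothetical.
HIERARCHY = ["mood", "tense", "modality", "aspect"]


def is_particle_order(categories):
    """Return True if pre-verbal particles or adverbs respect the
    order mood > tense > modality > aspect (categories may be skipped)."""
    ranks = [HIERARCHY.index(c) for c in categories]
    return ranks == sorted(ranks)


def is_suffix_order(categories):
    """Return True if verbal suffixes appear in the mirror-image order,
    i.e. the reversed sequence respects the particle order."""
    return is_particle_order(list(reversed(categories)))


# (24): te (tense) precedes ap (aspect)
print(is_particle_order(["tense", "aspect"]))
# (26): unfortunately (mood) > possibly (modality) > completely (aspect)
print(is_particle_order(["mood", "modality", "aspect"]))
# Suffixes: aspect-modality-tense-mood
print(is_suffix_order(["aspect", "modality", "tense", "mood"]))
# An order that violates the hierarchy
print(is_particle_order(["aspect", "tense"]))
```

Under these assumptions the first three checks succeed and the last one fails, which is the pattern the mirror generalization predicts: a sequence is a possible suffix order exactly when its reverse is a possible particle order.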

Summary
The most important innovation of generative grammar is shifting the research focus to the child's language learning ability, and trying to capture this ability with formal, explicit rules. Generative grammar assumes that the reason why children can acquire language with remarkable success in spite of the poverty of the stimulus is that humans have a species-specific and domain-specific language faculty. The language faculty imposes structural restrictions on possible human languages, and thus restricts the hypothesis space regarding what the grammar of particular languages may look like. This, in turn, aids the language learner in the discovery of the underlying rules of his or her language. Frameworks which reject the idea of a capacity specialized for language assume that only domain-independent, more general cognitive functions have a role to play in the acquisition and use of language. These approaches must show how general cognitive functions and abilities can explain, for example, the unified constraints on constituent questions, contrastive foci and relativization, island effects in wh-ex-situ and wh-in-situ languages, the syntax-morphology isomorphy mentioned in section 5, or the many other linguistic generalizations in Svenonius (2016) and D'Alessandro (to appear). Different theories of language can only be compared on the basis of their explanations of specific linguistic data, the most successful being the one that can capture a wider range of empirical material with a smaller theoretical apparatus.