Abstract
This paper investigates the historical development during the past three centuries of the English suffix -en, used to create denominal adjectives (e.g. golden, silken), focusing on words that have remained in the language until the present day. We specify a way of calculating the rate of loss of the suffix and apply this to different lexical items involved in this process. Finally, we explore the roles of word frequency and collocations, in order to shed some light on how these factors relate to the loss of a linguistic form.
1 Introduction
In the course of its history, the English language has undergone radical morphological changes, which have left their mark not only on the morphology but on the entire grammar of the language. Compared to Old English, the case system of contemporary English has all but disappeared, and suffixes such as those marking the plural or the past tense have become segmentally reduced, e.g. by vowel deletion. On a scale of morphological complexity, English has therefore developed, over the course of the last one and a half millennia, from a (relatively) suffix-rich to a (relatively) suffix-poor language.
In the present paper, we illustrate the (ongoing) process of suffix loss of one particular derivational suffix: the -en suffix that is used to create adjectives from ‘material’ nouns. Examples are given in (1):
noun | adjective |
gold | golden |
silk | silken |
wool | woolen/woollen |
According to traditional grammars, this morphological process (N+en → Adj) is available for material nouns but not for other nouns; see e.g. Dalton-Puffer (1996, 166f.), who includes other relevant examples and discussion. That is, the process is not allowed for non-material nouns like house or spoon. However, first of all, it should be noted that not all material nouns allow this process, as illustrated by the forms in (2): 1
noun | adjective |
silver | *silveren |
cheese | *cheesen |
water | *wateren |
plastic | *plasticen |
stone | *stonen |
In other words, for some words (e.g. gold) there is a corresponding adjective (golden), while for other, similar words (e.g. silver) there is not. 2 Such a lexically skewed situation is expected to be subject to analogical pressure. Languages will either create words like *silveren and *stonen, following the model of golden and woollen, or words like golden will increasingly be eradicated from the language. English has adopted the latter path (note that Dutch does have stenen ‘made of stone’, ijzeren ‘made of iron’, etc., while English used to have some of these forms, see below), so that now both a golden ring and a gold ring are acceptable (see e.g. Zandvoort 1975, § 918 (p. 312) and below). That is, variation occurs, which is a sure sign that linguistic change is under way. 3 As we will make explicit, a few hundred years ago the former form was almost always used, whereas nowadays the latter predominates. Thus, the tendency is toward a regular system in which gold and silver are treated on a par and the suffix -en has ceased to function. The observation that a change is in progress here is not new (see e.g. Zandvoort 1975, § 918 (p. 312)), but so far it has not been possible to establish specific details.
In this article we investigate how this process has proceeded, from around 1700 up to the present day. We ask how fast the -en suffix is being lost from the language, proposing a rate of loss for different adjectives, based on incidence of these adjectives in Google Ngrams (see below). We address the question whether there is a relation between the lexical frequency of these adjectives (or their corresponding nouns) and their rate of loss. Are frequent adjectives ‘protected’ from this regularizing change, similar to the way frequent irregular verb forms (e.g. auxiliaries) and frequent irregular plurals resist regularization? Secondly, we raise the question of whether some of these adjectives are ‘protected’ from being regularized because they appear in fixed expressions, i.e. collocations, or whether perhaps some collocations help to speed up the change. We identify a number of collocations that could play a role like this.
This article is organized as follows: the next section presents the function and historical background of the suffix -en. Section 3 illustrates the methodology we followed and provides a rough description of how the suffix developed in British English, and by way of comparison, American English. Section 4 proposes a formal way of quantifying rate of change and identifies different periods of change in the history of the suffix. It also identifies the role of word frequency in the process. Section 5 investigates the most frequent collocations with -en adjectives and discusses their role in the process of analogical levelling. In section 6 we discuss our findings and point out issues and similar cases for future research.
2 Background
The -en suffix goes back to Proto-Indo-European and cognates can be found in all Indo-European languages (see e.g. Klein 1971, 636ff; Harper 2001–2020). It was originally a general adjectival suffix which could be attached to a much wider range of nouns than just material ones. For English, Harper gives the examples of fyren ‘on fire; made of fire’, hunden ‘of dogs, canine’, beanen ‘of beans’, baken ‘baked’, breaden ‘of bread’ which were common in Old or Middle English. 4 There are several examples in Shakespeare of such usage (Blake 2004, 95f; Bartlett 2015):
hempen | What hempen home-spuns haue we swaggering here (MN 3.1.71) |
lenten ‘threadbare’ (because Lent is a season of fasting) | What lenten entertainment the players shall receive from you (H 2.2.329) |
threaden ‘made with linen thread’ | the threaden Sayles, (H5 3.0.10) |
wheaten ‘made of wheat’ | Your wheaten wreathe (TK 1.1.64) |
(MN = A Midsummer Night’s Dream [1595/96]; H = Hamlet [1599/1601]; H5 = Henry V [1599]; TK = Two Noble Kinsmen [1634]) |
In Shakespeare’s plays, gold was never used in adjectival position, as in a gold ring. Instead, he always used golden (e.g. golden age, golden touch, golden bed, golden rings, golden crown, etc.; see Bartlett (2015, 636ff)). For silver, there was no form in -en, so silver was used prenominally; both forms occur side by side in the line “Spread o’er the silver waves thy golden hair” (Comedy of Errors, 3.2.48).
In present-day English, both golden ring and gold ring are acceptable (see specific data below). 5 What happened in between Shakespeare’s time and the modern day is that the former form in -en (henceforth: EN form) gradually yielded to the latter bare form (henceforth: ∅ form). In the next section we will map out the way this happened in more detail, to ascertain when these changes took place specifically, if they affected some EN forms more than others, and how quickly the process went. After that we turn to the role of collocations with these adjectives, some of which are more fixed than others (e.g. golden opportunity, golden retriever), which may dampen the loss of this suffix.
3 Methodology and interpretation
We used Google Ngrams (Michel et al. 2011) to quantify the use of adjectives like golden and gold throughout the history of English since 1600. Google Ngrams allows searching for the incidence of words as a proportion of the total number of words in the relevant subsection of the corpus (e.g. the year), which makes it a useful instrument to trace the rise and fall of specific words (or combinations of words) in written sources through the history of a language. As an illustration, consider Fig. 1, which shows that golden was more frequent before nouns than gold, throughout most of the history of English, but that this pattern was reversed sometime between 1850 and 1900. In this article we investigate the exact mechanism of this change for all material nouns that have a corresponding -en adjective.
Frequency of occurrence of ‘gold’ and ‘golden’ followed by any noun, 1600–2000, corpus English, case-insensitive, from Google Ngrams
Citation: Acta Linguistica Academica 69, 3; 10.1556/2062.2021.00474
To look for words with the EN suffix, we used a reverse word list available online 6 and selected the words that started with ne- (i.e. reversed -en). We re-reversed these words, sifted them manually for adjectives, and added the nouns to which these adjectives are related. In the modern language, there are only thirteen pairs of such words in which there is a clear noun–adjective relation. 7 All of these word pairs are given in Table 1.
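The reverse-list step can be sketched in a few lines of Python. This is only an illustration of the idea, not the procedure actually used: the function name `en_candidates` and the sample word list are assumptions, and the manual sifting for adjectives described above is not automated here.

```python
def en_candidates(words):
    """Return words ending in -en, found via a reversed, sorted word list."""
    # Reversing each word turns a shared suffix into a shared prefix,
    # so all -en words end up next to each other after sorting.
    reversed_sorted = sorted(w[::-1] for w in words)
    # Words ending in "-en" now start with "ne"; re-reverse them.
    return [r[::-1] for r in reversed_sorted if r.startswith("ne")]


# Hypothetical sample list, for illustration only.
sample = ["golden", "gold", "silken", "silver", "wooden", "table"]
candidates = en_candidates(sample)
```

The output still contains every word in -en (verbs, participles, plurals and so on in a real list), which is why the manual selection of denominal adjectives remains necessary.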
Pairs of words with ∅ (noun) – EN (adjective)
wax – waxen | flax – flaxen | hemp – hempen |
wool – woo(l)len | silk – silken | lead – leaden |
oak – oaken | earth – earthen | ash – ashen |
birch – birchen | beech – beechen | wood – wooden |
gold – golden |
Note that some words that were used in earlier times (see for instance the examples from Shakespeare in (3)) are no longer in use today and are therefore not included in Table 1. These older forms have already been lost in the course of history, which also suggests that the suffix is gradually disappearing. This paper will therefore focus on the residue of EN words in English and estimate how fast they are being lost and what factors play a role in this process.
Google Ngrams allows specifying word class in searches as well as certain restrictions on words before or after the target word. 8 Clearly, golden and the other words that end in -en in Table 1 are adjectives. To make sure we only searched for occurrence of adjectives (and excluded personal names, for instance), we specified the search term as “golden_ADJ”. 9 Words like ‘gold’ in a gold ring can either be analysed as a noun or as an adjective in prenominal position. Since it is not necessary to choose between these two possibilities here, we specified that gold (word class not specified) should appear before any noun (for which _NOUN_ can be used in Google Ngrams, as was done in Fig. 1). The exact search parameters are given in (4): 10
Search parameters: |
https://books.google.com/ngrams |
golden_ADJ vs. gold _NOUN_ 11 |
data points collected for 1600–2019 |
selected corpora: British English; 12 American English |
smoothing set at 0 (=raw values) |
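For readers who want to repeat the searches for all word pairs, the two search terms in (4) can be generated mechanically. The sketch below only builds the query strings; the helper name `ngram_queries` is an assumption, and the strings still have to be entered in the Ngram Viewer with the corpus, year range and smoothing settings listed above.

```python
def ngram_queries(noun, adjective):
    """Return the two search terms for one word pair, following (4)."""
    en_query = f"{adjective}_ADJ"   # EN form, restricted to adjectives
    zero_query = f"{noun} _NOUN_"   # bare form followed by any noun
    return en_query, zero_query


# Two of the pairs from Table 1, used here as examples.
for noun, adjective in [("gold", "golden"), ("oak", "oaken")]:
    print(ngram_queries(noun, adjective))
```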
We decided to search for data for American English (AmE; corpus of 155 billion words 13 ) and British English (BrE; corpus of 34 billion words) separately. We had no particular expectation as to whether these two varieties would differ, but we also wanted to check whether using two different subcorpora would yield different results.
To illustrate the way in which we proceeded, a subset of the data that we found is presented in Table 2. 14 This table gives the incidences of the ∅ forms (e.g. oak + NOUN, column 0) and that of the EN forms (e.g. oaken, column EN) in five specific years (i.e. 1600, 1700, etc.). These numbers were all multiplied by 10,000 for readability in the table. Besides the incidences, the proportion of words with -en as part of the total number of forms is also given (column ‘prop.’). For instance, it presents the number of forms oaken as a proportion of the number of forms oaken plus oak followed by a noun. If the proportion is 1, the form oaken is exclusively used. If the proportion is 0.5 both forms (EN and ∅) are equally used. If the proportion is 0, no oaken forms are used. An empty cell indicates that neither EN nor ∅ forms were found in that particular year.
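The proportion in the ‘prop.’ column can be expressed as a small function. This is a minimal sketch; the function name and the choice of `None` to represent an empty cell are assumptions made for illustration.

```python
def en_proportion(en_freq, zero_freq):
    """Proportion of EN forms among EN + zero forms for one word and year.

    Returns None when neither form occurs, which corresponds to an
    empty cell in the table.
    """
    total = en_freq + zero_freq
    if total == 0:
        return None
    return en_freq / total
```

With this definition a value of 1 means that only the EN form occurs, 0.5 that both forms are equally frequent, and 0 that only the ∅ form occurs.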
Ngram relative frequencies (multiplied by 10,000 for readability) and proportions of forms with EN, 1600–2000, British (top panel) and American English (bottom panel). Shaded cells indicate a proportion of less than 0.5, i.e. when the relative frequency of the EN form is lower than that of the ∅ form. (Note: ‘EN’ is used for forms with -en)
Interpretation: The tendency is toward increased shading, i.e. proportions of EN forms are decreasing
The table shows that the proportion of EN forms decreases for these five specific years, reflected by the shading pattern. Both British English and American English show the same tendency, although for the latter many data are missing for the earlier centuries. In 1800, the proportions of six out of thirteen forms in BrE are below 0.5, and the same goes for the same number of forms (although not exactly the same forms) in AmE, while in 1900 the numbers are nine for BrE and ten for AmE. In 2000, for both varieties, only one form, wooden, is more frequent than its corresponding bare form wood (used before a noun). All the other EN forms have become less frequent than their corresponding ∅ form.
If we add up the frequencies of all thirteen words in each variety, we can calculate the proportions for the whole set. These are the numbers below the panels for BrE and AmE, which go down from 0.7 (1800) to 0.6 (1900) to 0.4 (2000) for BrE and from 0.6 to 0.5 to 0.4 for AmE, respectively. Both varieties thus show the same tendency. Individual words show different paths of loss, but the general rates of loss are roughly the same (we will make this more precise below). The tendency could be compared to cases where AmE has shown a greater inclination to morphological regularization, e.g. in past tenses like learnt, spelt and sped (BrE) vs. learned, spelled and speeded (AmE), etc.; see e.g. Algeo (2006, 12ff) and Trudgill & Hannah (2017, 61). More importantly, the two different subcorpora (BrE and AmE) yield very similar results for this pattern.
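The pooled proportion for the whole set can be sketched as follows. The function name and the input values are made up for illustration and are not taken from Table 2.

```python
def pooled_proportion(pairs):
    """Proportion of EN forms over all words combined.

    pairs: (en_freq, zero_freq) tuples for one year and one variety.
    """
    pairs = list(pairs)
    en_total = sum(en for en, _ in pairs)
    zero_total = sum(zero for _, zero in pairs)
    grand_total = en_total + zero_total
    return en_total / grand_total if grand_total else None


# Invented frequencies for two word pairs.
toy = [(6.0, 2.0), (1.0, 1.0)]
pooled = pooled_proportion(toy)
```

Because the frequencies are summed before dividing, frequent pairs weigh more heavily than rare ones. In the toy example the pooled value is 0.7, whereas the unweighted mean of the two per-word proportions (0.75 and 0.5) would be 0.625.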
Recall that Table 2 only presents data for five specific years and was intended to illustrate our method and draw some general conclusions. We also collected the data for all years between 1600 and 2019, i.e. 419 data points, for 13 pairs of adjectives, for two varieties. This more fine-grained dataset allows us to study the development of the suffix in more detail, e.g. with respect to the question whether different periods of loss could be distinguished, whether loss of specific adjectives was related to their word frequency and whether the trend has ever been reversed (i.e. a rise of specific EN forms during specific periods). We turn to this in the next section.
4 Rate of change and word frequency
This section examines the individual words in Table 2 more closely and investigates if there is a relation between the rate of loss of the EN forms and word frequency. For instance, is it the case that infrequent adjectives are lost more quickly than frequent ones? This might be expected since infrequent adjectives are less familiar to speakers so that their formation might be less lexicalized. On the other hand, frequent adjectives might be more subject to loss due to “wear and tear” of linguistic usage, making frequent forms shorter in general than infrequent ones (see e.g. Bybee 2006).
To investigate this, we calculated rates of change for each of the EN forms in Table 2, based on their proportions for all years for which data is available (1600 up to the present day).
To determine the rate of change of a given word per ten-year period over 200 years (1800–2000), we compare the proportion of the EN form in one period with its proportion in the period before. Within the expression, the denominator [Ay/(Ay+By)] represents the proportion of a certain EN word, with the total sample narrowed down to just the EN form and its ∅ form. The numerator is the difference between that proportion in the following period and in the current one. We then divide this difference by the current proportion to obtain the rate of change. The rate of change of the EN form for a certain year (y) can thus be calculated by the following formula:
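$$f(y) = \frac{\dfrac{A_{y+1}}{A_{y+1}+B_{y+1}} - \dfrac{A_{y}}{A_{y}+B_{y}}}{\dfrac{A_{y}}{A_{y}+B_{y}}}$$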
– where A and B are the lexical frequency of the EN form and the ∅ form, respectively, and ‘y’ and ‘y+1’ indicate two consecutive years.
– A negative f(y) (below zero) indicates a rate of loss; the smaller the number, the higher the rate.
– A positive f(y) (above zero) indicates a rate of rise; the bigger the number, the higher the rate.
– When f(y) is zero, there is no change in the proportion of the EN form compared to the total occurrences of the EN form and the ∅ form.
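The rate of change described above can be sketched in Python. The sketch assumes that f(y) is the relative change in the EN proportion, as the description of numerator and denominator suggests; the function name and the handling of undefined cases are assumptions made for illustration.

```python
def rate_of_change(a_y, b_y, a_next, b_next):
    """Relative change in the EN proportion between period y and y+1.

    a_* are EN-form frequencies, b_* are zero-form frequencies.
    Returns None when a proportion is undefined or the baseline is zero.
    """
    if a_y + b_y == 0 or a_next + b_next == 0 or a_y == 0:
        return None
    p_y = a_y / (a_y + b_y)                # proportion in period y
    p_next = a_next / (a_next + b_next)    # proportion in period y+1
    return (p_next - p_y) / p_y


# Invented frequencies: the EN proportion falls from 0.8 to 0.6.
f = rate_of_change(8.0, 2.0, 6.0, 4.0)
```

In this invented example the result is approximately -0.25, i.e. the EN form loses a quarter of its share from one period to the next.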
Because the EN form may not occur at all in a given year, we first calculated and averaged the yearly figures in groups of ten years; this eliminates the effect of years with zero occurrences. After this we performed the calculation stated above with the ten-year groups as the smallest unit. For example, Ay represents the incidence for the ten-year span from 1950 to 1960 and Ay+1 represents the same for the span from 1960 to 1970.
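One possible reading of this grouping step is sketched below: yearly relative frequencies are averaged per decade, and the decade means then serve as the units for the calculation. The function name, the decade boundaries (1950–1959 rather than 1950–1960) and the input values are assumptions; the procedure actually used may differ in detail.

```python
from collections import defaultdict


def decade_means(yearly):
    """Average yearly relative frequencies within each decade.

    yearly: dict mapping year -> relative frequency.
    Returns a dict mapping the first year of each decade -> mean frequency.
    """
    buckets = defaultdict(list)
    for year, freq in yearly.items():
        buckets[year - year % 10].append(freq)   # 1957 -> 1950
    return {decade: sum(v) / len(v) for decade, v in sorted(buckets.items())}


# Invented values; the zero in 1951 is absorbed by the decade average.
toy_years = {1950: 2.0, 1951: 0.0, 1959: 4.0, 1960: 1.0}
means = decade_means(toy_years)
```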
To illustrate this: in the first decade of the seventeenth century, during Shakespeare’s active period, golden was, as observed above, predominantly used as an adjective before a noun, rather than gold. This is confirmed by the score of 0.77 for 1600–1609 in the British English dataset. The number rises to a maximum of 0.91 for 1680–1689 but drops to 0.39 three hundred years later before climbing back a little (highlighted in yellow in Table 3 below). Recall that if the proportion is below 0.5, the ∅ form is more common. For instance, in the 1950s wool(l)en accounts for only 14% in British English and 32% in American English of all tokens of both modifier forms combined, as shown in Tables 3 and 4.
Proportion of EN forms in total incidences of EN and zero forms in BrE for different decades; the last column was revised manually when values were missing. Numbers highlighted in yellow are discussed in the text
Proportion of EN forms in total incidences of EN and zero forms in AmE for different decades; the last column was revised manually when values were missing. Numbers highlighted in yellow are discussed in the text
Based on the proportions in Tables 3 and 4, the following graph (Fig. 2) shows the general downward trend, by decade, in the proportion of EN forms relative to the total occurrences of the modifiers with and without the suffix. It is noticeable that in both British and American English the downward trend halts around 2000 and is replaced by a mild rise in the most recent twenty years.
Overall proportions of –EN forms, in BrE and AmE, with trend lines, data for 1600–2019
Fig. 2 shows that the general development of the EN forms can be divided into three stages: (i) a first stage, from 1600 to 1800, with a general downward trend, but with rather large fluctuations, both between words and between BrE and AmE; (ii) a clear downward trend from 1800 to 2000, with much less fluctuation across all words and for both varieties; (iii) an upward trend after 2000, i.e. a resurgence of EN words.
It is tempting but risky to relate these different trends to other developments, e.g. in society or with respect to other linguistic changes. The large fluctuations during the first period identified above might be related to the fact that spelling and/or language use itself were less standardized during that time than in later centuries. This may have led to more varied use in books and other publications. Probably the fact that the corpora (both BrE and AmE) are smaller for the first period than for the later period, simply because fewer books were available then, also leads to larger variation. The period of loss between 1800 and 2000 is remarkably homogeneous and regular, and confirms the general trend identified in the previous section, of EN forms gradually giving way and falling into disuse. Finally, the upward trend after 2000 calls for comment. Why should there be a rebound of EN forms? One possibility is that “irregular social correction” might play a role, noted by Labov (1994, 518), who observes that in late stages of sound changes (e.g. mergers) speakers become aware of impending mergers and may try to “rescue” the threatened form by using it more often. This might conceivably play a role, although here the merger is between two morphosyntactic constructions and not between sounds, while public awareness probably does not play a large role. It is also possible that the books included in the corpora that we used might reflect different language use than books from the years before, e.g. by reflecting spoken language more than before (Brysbaert et al. 2011), so that, in fact, books from the period between 1800 and 2000 underestimate the use of EN forms in actual (spoken) usage. Note, finally, that the relatively most common form, wooden, is almost exclusively used in a non-metaphorical sense, while golden is used in a non-material sense much more frequently (e.g. golden age, golden opportunity, etc; see also the next section). 
A general shift from material to metaphorical uses of EN forms may have taken place (recall also brazen, fn. 7), but this cannot explain the latter difference.
Another question we can address on the basis of Tables 3 and 4 is whether word frequency is related to rate of change. Word frequency can be deduced from these tables, because the proportions they show are based on the incidences of these particular words in the Ngrams corpus. A higher incidence corresponds to a higher word frequency.
First, we can make a general observation. Compare the rates of change of a high-frequency word, golden, with those of a low-frequency word, beechen, for BrE and AmE, presented in Figs 3 and 4. In these graphs, loss (or rise) is defined as the difference in proportion between two decades, which can straightforwardly be calculated from Tables 3 and 4. If the rate of change is negative, loss takes place; if it is positive, EN gains ground relative to ∅ forms.
Rate of loss/rise for golden (high-frequency) in both AmE and BrE
Rate of loss/rise of beechen (low-frequency) in both AmE and BrE.
For both graphs, most data points are below zero, showing that generally loss was taking place. But the graphs also clearly show that the pattern of variation for golden is very stable and differs little between BrE and AmE. For beechen, on the other hand, variation is much larger (even going beyond the chart at some points). We interpret this as an effect of word frequency: higher-frequency words will show a more stable pattern than low-frequency words, because low-frequency words are less securely entrenched in speakers’ grammatical system, so that speakers will be less secure (and hence more variable) in their usage of such forms.
From the data underlying Tables 3 and 4 the average incidence of a word in the Google Ngram corpora can be calculated, which can be regarded as an objective measure of its word frequency during the decades when it was used.
Now consider Table 5. On the basis of this table, the EN forms can be divided into three tiers which we believe best reflect the different frequencies of the word pairs. If either the AmE or BrE proportion exceeds 0.5, we categorized the pair into the first tier. The same logic applies to the remaining two tiers, the threshold value being 0.15. Here are the three resulting tiers, which correspond to the high-frequency, medium-frequency and low-frequency words:
Tier 1: high-frequency words: golden, wooden, wool(l)en |
Tier 2: medium-frequency words: waxen, flaxen, hempen, silken, leaden, earthen, ashen |
Tier 3: low-frequency words: oaken, birchen, beechen (the three adjectives related to trees) |
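The tier assignment described above amounts to a simple threshold rule, sketched here in Python. The function name and the example values are assumptions made for illustration; the thresholds 0.5 and 0.15 are the ones stated in the text.

```python
def frequency_tier(bre_value, ame_value, high=0.5, mid=0.15):
    """Assign a word pair to tier 1, 2 or 3.

    A pair enters a tier as soon as either its BrE or its AmE value
    exceeds that tier's threshold.
    """
    top = max(bre_value, ame_value)
    if top > high:
        return 1   # high-frequency
    if top > mid:
        return 2   # medium-frequency
    return 3       # low-frequency


# Invented values, not taken from Table 5.
tiers = [frequency_tier(0.9, 0.4), frequency_tier(0.2, 0.1), frequency_tier(0.05, 0.1)]
```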
In the process of sorting, we also found that frequency is related to fluctuation to some extent: the lower the incidence, the higher the fluctuation (which is reflected by the standard deviation).
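Fluctuation in this sense can be measured as the standard deviation of a word's decade-to-decade rates of change. The sketch below uses the sample standard deviation from Python's `statistics` module; whether the sample or the population variant was used in the original calculation is not stated, so this choice is an assumption, as are the function name and the input values.

```python
import statistics


def fluctuation(rates):
    """Sample standard deviation of a series of rates of change.

    Requires at least two values.
    """
    return statistics.stdev(rates)


# Invented series: a stable word and a strongly fluctuating word.
stable = [-0.02, -0.03, -0.01]
volatile = [-0.4, 0.5, -0.2]
```

A larger value of `fluctuation(volatile)` than of `fluctuation(stable)` corresponds to the pattern described here for low-frequency as opposed to high-frequency words.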
Frequency and loss-rise fluctuation of the EN forms in AmE and BrE
The data show that the two high-frequency forms, golden and wooden, are best preserved, and that the low-frequency forms birchen and beechen are almost obsolete. We conclude that a high lexical frequency acts as a defence against change, as has of course been observed in many cases before (e.g. Phillips 1984, 2001). It should also be mentioned that, as in the case of golden, semantic change has preserved some of these forms: for instance, ashen and leaden are nowadays almost exclusively used in the metaphorical sense of colours or facial expressions, i.e. not as material adjectives (‘made of ash-tree’, ‘made of lead’). Perhaps birchen and beechen are now moribund since no similar salvage was possible for these forms. 15
In the next section, we look at specific collocations with EN words, to see whether these promote or slow down change in the suffix.
5 The role of collocations
In this section we present a short investigation of the role of collocations in the loss of the -en suffix. Collocations are here defined as two-word combinations which occur together particularly frequently, and limited to adjective–noun combinations in the context of this paper (e.g. golden ring, oak table, wooden box). Collocations exist with and without the -en suffix. As we saw in the previous section, the -en suffix is being lost from English in favour of ∅ forms, with the exception of wooden and to a lesser extent, golden, which are the most frequent forms. Here we focus on the role of collocations in this process: are collocations with the -en suffix less likely to be lost from the language than the suffix in general? Or do collocations without the -en suffix speed up this change?
To answer this question, we collected collocations, separately for BrE and AmE, from Google Ngrams in the same way as for individual forms, and checked their frequency for all years between 1600 and 2000. The searches we specified in Google Ngrams are given in (7):
golden *_NOUN |
gold *_NOUN |
Other parameters: identical to (4) |
This search results in the ten most frequent combinations of ‘golden + NOUN’, ‘gold + NOUN’, etc. for each year. We selected the ten most frequent collocations in 2019, for both the EN form and the ∅ form, and determined which of these occurred with both forms. For instance, oaken table and oak table are both among the ten most frequent collocations (both in AmE and BrE) in the year 2019. These forms are of particular interest, because in such collocations the -en form and the ∅ form are in direct competition, and we can compare their rate of change to the rate of change of the individual words, oaken vs. oak (before a noun). Consider first the results.
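The selection of competing collocations can be sketched as an intersection of the two top-ten lists on the noun they contain. The function name and the example lists are invented for illustration; only the pair oaken table / oak table is taken from the text.

```python
def competing_collocations(en_top, zero_top):
    """Nouns that occur in both lists of most frequent collocations.

    en_top:   two-word collocations with the EN form, e.g. "oaken table"
    zero_top: two-word collocations with the bare form, e.g. "oak table"
    """
    en_nouns = {phrase.split()[1] for phrase in en_top}
    zero_nouns = {phrase.split()[1] for phrase in zero_top}
    return sorted(en_nouns & zero_nouns)


# Hypothetical lists, shortened to three items each.
en_list = ["oaken table", "oaken door", "oaken chest"]
zero_list = ["oak table", "oak tree", "oak door"]
shared = competing_collocations(en_list, zero_list)
```

The nouns returned are those for which the EN form and the ∅ form compete directly within one collocation.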
First of all, in these particular combinations of words the EN forms also represent a minority, compared to ∅ forms (recall that shading in these tables signifies a proportion of EN forms of less than half), so they also bear out the fact that EN forms are less widely used than ∅ forms, even in collocations.
Second, note that, in Table 6 (BrE) at least, the proportion is completely stable if we compare the years 1800, 1900 and 2000. This is not the case in Table 7 (AmE), where an (expected) downward trend is visible in the proportions of EN forms. We have no explanation for this difference, but since the number of forms (both of the selected collocations and the collocations themselves in the corpora) is very small, we will not pursue this issue here. Further investigation is needed.
Competing EN and ∅ collocations, British English; the column 0 form gives the incidence of the collocation with the ∅ form of the adjective (e.g. oak table), the column EN form gives the incidence of the collocation with the EN form (e.g. oaken table) and the column ‘prop.’ gives the proportion of EN form as part of the total of ∅ and EN forms; incidences multiplied by 10,000 for readability
6 Conclusion
It is clear that forms with EN are fading fast in English. All forms with -en lost ground to bare adjective forms during the past 100 years, both in British and American English (Fig. 2). Some of the forms may be considered to be lost already (beechen, birchen), while others have largely changed their meaning from a material adjective to a metaphorical sense (e.g. ashen). Loss has gone slowest for golden and wooden, where it is clear that there is a relation between loss and word frequency: these forms have the highest word frequency.
Finally, we saw that collocations may have a slowing effect on regularization: collocations, i.e. combinations of words that are frequent, tend to remain relatively stable, whether they have an -en form or a ∅ form. More investigation is necessary here, though.
Similar cases may be found and investigated in a similar way in English and other languages with the aid of the Google Books Ngram corpus. For instance, the inchoative suffix in English (which coincidentally is also -en), as in blacken, redden, is also highly asymmetrical (since other combinations like *greenen, *purplen, *orangen are ill-formed), so that analogical change might be expected to affect such forms, e.g. by increasing analytical forms like “to become black” at the expense of blacken, where word frequency might again play a role. The same goes for the two ways of forming comparatives in English, cf. sad–sadder vs. beautiful–more beautiful, which is asymmetrical and might be expected to be regularized to forms with more across the board (i.e. more sad at the expense of sadder). We leave such cases for future research.
Funding information
The research reported on here was partially sponsored by a grant from the Ministry of Education, Guangdong Province (grant no.: 2019WCXTD010, PI Jeroen van de Weijer).
Acknowledgements
We thank two anonymous Acta Linguistica Academica reviewers for their helpful comments on an earlier version. We also thank Wei Gao, Frans Hinskens and Joost van de Weijer for comments on earlier versions and valuable discussion on data statistics. We thank our BA students Minhua Chen and Baoxian Lin for assistance with some of the data collection and working with us to develop these ideas. All errors are our own.
References
Algeo, John. 2006. British or American English? A handbook of word and grammar patterns (Studies in English Language). Cambridge: Cambridge University Press.
Bartlett, John. 2015. A complete concordance or verbal index to words, phrases and passages in the dramatic works of Shakespeare with a supplementary concordance to the poems. London: Palgrave Macmillan.
Bauer, Laurie, Salvador Valera and Ana Díaz-Negrillo. 2010. Affixation vs. conversion: The resolution of conflicting patterns. In F. Rainer, W.U. Dressler, D. Kastovsky and H.C. Luschützky (eds.) Variation and change in morphology (Current Issues in Linguistic Theory 310). Amsterdam & Philadelphia, PA: John Benjamins. 15–32.
Blake, Norman F. 2004. Shakespeare’s non-standard English: A dictionary of his informal language (Athlone Shakespeare Dictionary Series). London: Continuum.
Brysbaert, Marc, Emmanuel Keuleers and Boris New. 2011. Assessing the usefulness of Google Books’ word frequencies for psycholinguistic research on word processing. Frontiers in Psychology 2. 27.
Bybee, Joan L. 2006. From usage to grammar: The mind’s response to repetition. Language 82(4). 711–733.
Dalton-Puffer, Christiane. 1996. The French influence on Middle English morphology: A corpus-based study of derivation (Topics in English Linguistics 20). Berlin & New York, NY: Mouton de Gruyter.
Harper, D. 2001–2020. Etymonline. http://www.etymonline.com.
Klein, Ernest. 1971. A comprehensive etymological dictionary of the English language. Amsterdam: Elsevier.
Kruisinga, Etsko. 1931. A handbook of present-day English, 5th edn. Groningen: Noordhoff.
Labov, William. 1994. Principles of linguistic change, Vol. 2: Social factors. Oxford: Wiley-Blackwell.
Michel, J.-B., Y.K. Shen, A.P. Aiden, A. Veres, M.K. Gray, The Google Books Team, Joseph P. Pickett, Dale Hoiberg, Dan Clancy, Peter Norvig, Jon Orwant, Steven Pinker, Martin A. Nowak and Erez Lieberman Aiden. 2011. Quantitative analysis of culture using millions of digitized books. Science 331(6014). 176–182.
Phillips, Betty S. 1984. Word frequency and the actuation of sound change. Language 60(2). 320–342.
Phillips, Betty S. 2001. Lexical diffusion, lexical frequency, and lexical analysis. In J.L. Bybee and P. Hopper (eds.) Frequency and the emergence of linguistic structure (Typological Studies in Language 45). Amsterdam: John Benjamins. 123–136.
Plag, Ingo. 2003. Word-formation in English. Cambridge: Cambridge University Press.
Trudgill, Peter and Jean Hannah. 2017. International English: A guide to varieties of English around the world, 6th edn. Milton Park, Abingdon: Routledge.
Ungerer, Friedrich and Hans-Jörg Schmidt. 1996. An introduction to cognitive linguistics. London: Longman.
Wright, Joseph. 1898–1905. The English dialect dictionary, 6 vols. Oxford: Henry Frowde.
Zandvoort, Reinard W. 1975. A handbook of English grammar, 7th edn. London: Longman.
Perhaps phonological restrictions (also) play a role here: like its inchoative counterpart -en (as in whiten, blacken), adjectival -en seems to prefer to attach to monosyllabic stems that end in an obstruent (thanks to a reviewer for reminding us of this; see also Plag (2003, § 3.5.2)), except in woollen. However, this does not bear on the loss of -en in phrases like a golden ring.
Interestingly, compounds with -wood have no corresponding adjective either, e.g. *sandalwooden; see Kruisinga (1931, § 1868).
Cf. Bauer et al. (2010) for discussion of similar competition between Adj.-V. conversion and affixation with the inchoative -en. As a side note, with the loss of the adjectival -en the original noun phrase (golden ring) is reduced to an N-N compound (gold ring).
In other words, this suffix was much more productive at that time. We will not dwell on the definition of productivity here. It would be very interesting, as a reviewer notes, to include such now-obsolete forms in our investigation (the reviewer adds examples like bricken, clayen, husken and inken), but here we limit ourselves to words that still occur in the current language. One reason for this is that obsolete forms like fyren, breaden and husken had already died out by around 1800, and that the data on their incidence before that time are not as reliable as the later data.
We are not claiming here that gold and golden always mean the same. On the contrary, diversification of meaning has almost certainly played a role in the retention of both forms. See also below.
https://github.com/dwyl/english-words (last accessed June 1, 2020).
We excluded the pair brass–brazen from consideration, as the relation between the noun and the adjective is obscured by vowel change (due to Old English umlaut, which accompanied this suffix during the OE period) and voicing (also due to an OE voicing rule). It should also be noted that the meaning of the adjective has become completely metaphorical, i.e. removed from its original material sense (e.g. brazen-faced, a brazen lie). In its material sense, brazen has been completely replaced by brass (e.g. a brass plate, brass buttons). Other forms which are now obsolete or dialectal (see e.g. Wright 1898–1905, Vol. II) are also not considered here.
For another interface for searching Google Books, with slightly different search possibilities, see https://www.english-corpora.org/googlebooks/compare-googleBooks.asp (last accessed September 16, 2021). Thanks to a reviewer for reminding us of this.
We also consistently searched case-insensitively. This means that a few family names like Beechen may be included, but a case-sensitive search for lower-case forms only would omit cases where e.g. Golden appeared at the beginning of a sentence.
Smoothing was set to zero (default is 3, which results in smoother graphs (like Fig. 1)), to get the raw data for each year. All raw data used in this paper are available in the OSF data repository, https://osf.io/ew3jv/.
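To make the effect of the smoothing setting concrete, the following minimal sketch assumes that a smoothing value of n replaces each yearly value by the average of that value and up to n neighbouring values on either side. The function name, the truncated window at the edges of the series, and the example figures are illustrative assumptions, not taken from our data or from the Ngram Viewer's implementation.

```python
def smooth(values, smoothing):
    """Centered moving average over up to 2 * smoothing + 1 yearly values.

    Assumption: at the start and end of the series the window is simply
    truncated, so fewer values are averaged there.
    """
    if smoothing < 0:
        raise ValueError("smoothing must be zero or a positive integer")
    smoothed = []
    for i in range(len(values)):
        lo = max(0, i - smoothing)                # first index in the window
        hi = min(len(values), i + smoothing + 1)  # one past the last index
        window = values[lo:hi]
        smoothed.append(sum(window) / len(window))
    return smoothed


# Hypothetical yearly relative frequencies (made up for illustration).
raw = [0.0, 6.0, 0.0, 0.0, 6.0, 0.0, 0.0]

print(smooth(raw, 0))  # smoothing 0 leaves the raw yearly values unchanged
print(smooth(raw, 3))  # smoothing 3 flattens the isolated peaks
```

With smoothing set to zero the output is identical to the input, which is why this setting yields the raw value for each year.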
Note the space between gold and _NOUN_. See https://books.google.com/ngrams/info# (last accessed June 18, 2020) for the query syntax and examples.
Defined as “books predominantly in the English language that were published in Great Britain”, https://books.google.com/ngrams/info# (last accessed June 18, 2020).
The sizes of these corpora are given at https://www.english-corpora.org/googlebooks/x.asp (last accessed August 15, 2020).
For wool(l)en, we searched for both spellings in both corpora. For lead we specified that it should be an adjective before a noun (not a verb).
Note also that birchen and beechen are subordinate in meaning to wooden, so that survival of the latter term (which is of course also more frequent and more culturally ‘salient’ (Ungerer & Schmidt 1996)) might obviate the need for the survival of the former pair of words.