Abstract
It is commonly assumed that basic cardinal numerals such as English three are simplex expressions whose primary function is to quantify over entities denoted by the modified NP (e.g., Kennedy 2015; Rothstein 2017; Ionin & Matushansky 2018). In this paper, we explore cross-linguistic marking patterns suggesting that cardinals in fact lexicalize complex syntactic and semantic structures derived from the primitive notion of the number scale. The evidence comes from the morphological shapes that cardinal numerals take when they are used to count objects and when they are used for abstract arithmetical counting.
1 Introduction
Though cardinals can be used in various ways (for an overview, see e.g., Bultinck 2005), their most widely studied function is to enumerate entities designated by the noun. For instance, the numeral three in sentences such as (1) is used as a prenominal modifier which quantifies over individuals in the denotation of the modified NP. In (1a), it specifies the number of apples that fell from the table, whereas in (1b) the total number of the relevant musketeers. We will refer to this function as object counting.
(1)  a.  Three apples fell from the table.
     b.  The three musketeers fought bravely.
The second function of cardinal numerals we will investigate here concerns reference to an abstract number concept (Bultinck 2005; Rothstein 2017). In sentences such as those in (2), the numeral three does not enumerate any objects but rather refers to an abstract mathematical entity which can enter arithmetical relations, see (2a), and has properties such as parity, as in (2b). We will call this second function abstract counting.
(2)  a.  Five minus three is two.
     b.  Three is an odd number.
Conceptually, the distinction between object and abstract counting is evident. However, given that the very same morphological form is used in both (1) and (2), it is not immediately clear whether the distinction is also relevant for the grammar of natural languages. In this paper, we argue that it is indeed relevant for the grammar and we put forth a specific relationship between object-counting and abstract-counting uses of numerals.
In order to determine the underlying meaning components (features) of numerals and the morphosyntactic structures they give rise to, we explore six types of meaning/form correspondences attested across languages that pertain to the distinction between object counting and abstract counting. Based on this cross-linguistic evidence, we argue that all (apparently simplex) cardinals are in fact morphosyntactically complex expressions. We shall further propose that object-counting numerals are both syntactically and semantically derived from abstract-counting numerals.
The paper is structured as follows. In Section 2, we discuss evidence for the grammatical relevance of abstract and object counting in natural language along with some existing approaches to this distinction. In Section 3, we examine morphologically simplex abstract-counting numerals. We show that such numerals exhibit three distinct types of morphological relations to their object-counting counterparts (syncretism, containment, suppletion). Section 4 provides evidence that the same patterns of relationship also apply to numerals that are morphologically complex in their abstract-counting function. In Section 6, we propose a compositional analysis of the meaning of abstract- and object-counting numerals which builds on the idea of invariant semantic primitives. In Section 7, we demonstrate how the proposed system can derive various marking patterns discussed in Sections 3 and 4. Section 8 summarizes our conclusions.
2 Two functions of cardinals
From the fact that there is a conceptual distinction between abstract counting and object counting, it does not necessarily follow that this distinction is reflected in the grammar of natural languages. However, there is evidence that distinguishing between abstract- and object-counting uses of cardinals is linguistically relevant. The crucial contrasts amount to the fact that object-counting numerals denote counting devices, i.e., operations allowing for numeric quantification over objects, whereas abstract-counting numerals simply refer to arithmetical entities.
2.1 Linguistic relevance
It has been observed in the literature that abstract number concepts can have different properties than pluralities of individuals (Rothstein 2017). Some examples include being prime, odd or a natural number, see (3). On the other hand, such properties cannot be predicated of pluralities. For instance, (4a) and (4b) have very different truth conditions than the corresponding assertions in (3a) and (3b), whereas (4c) is simply infelicitous.
(3)  a.  Three is prime.
     b.  Three is odd.
     c.  Three is a natural number.
(4)  a.  #Three things are prime.
     b.  #Three things are odd.
     c.  #Three things are a natural number.
Furthermore, mathematical statements such as those in (5) require abstract-counting numerals. As evidenced by the infelicity of (6), even numeral phrases including the virtually vacuous noun thing are incompatible with mathematical environments, which call for numeric values and not pluralities of objects.
(5)  a.  Three times two equals six.
     b.  Six divided by three equals two.
(6)  a.  #Three things times two things equals six things.
     b.  #Six things divided by three things equals two things.
In addition, certain grammatical constructions require arguments referring to number concepts rather than to pluralities. For instance, only abstract-counting numerals can be complements of count (up) to, as witnessed by the contrast between (7a) and (8a). Moreover, the dimension of comparison in (7b) and (8b) is different. While in the case of arithmetical entities, what is relevant is their relative ordering on the number scale, the truth conditions in (8b) concern the size of individual objects making up a plurality. For instance, the sentence would be false when comparing three onions and two pumpkins. Notice also that number agreement in (7b) and (8b) is different.
(7)  a.  Hanna can count (up) to three.
     b.  Three is bigger than two.
(8)  a.  #Hanna can count (up) to three things.
     b.  #Three things are bigger than two things.
Yet another property of abstract-counting numerals is that they do not give rise to scalar implicatures (Bultinck 2005). While the most natural way to understand (9) is that the addressee needs to give the speaker at least three things to pay their debt, (10) lacks an analogous lower-bounded reading and can only get an exact interpretation.
(9)   You must give me three things to pay your debt.  ✓at least
(10)  You must divide six by three to get two.  #at least
Finally, only object-counting numerals allow for modification by comparative and superlative modifiers as well as so-called prequantifiers, e.g., the English universal quantifier all (cf. Corbett 1978). While the sentences in (11) are regular felicitous expressions, the examples with abstract-counting numerals in (12) are odd.
(11)  a.  More than three cats live in the barn.
      b.  At least three cats from the barn are crazy.
      c.  All three cats who live in the barn are cute.
(12)  a.  #More than three is a natural number.
      b.  #At least three times two equals six.
      c.  #All three is prime.
So far, we have seen that there is robust evidence that the conceptual distinction between abstract and object counting is encoded in natural language semantics and grammar. Given that all the examples discussed heretofore include the very same morphological form, an obvious question regarding the relationship between abstract and object counting arises.^{ 1 }
2.2 Relating abstract and object counting
The contrasts discussed in the previous section call for an explanation of the dual life of cardinals such as English three. In order to account for the relationship between number concepts and pluralities, most mainstream approaches take the object-counting function of numerals as the basic one from which the abstract-counting meaning can be derived. For instance, Rothstein (2017), following Landman (2003), takes cardinal numerals to denote cardinal properties. As such they are intersective modifiers of type 〈e, t〉.^{ 2 } The abstract-counting meaning is then derived by a special shifting operation which applies to a cardinal property and yields a number-denoting expression of type n.
Another account assuming an essentially object-counting semantics for numerals is proposed by Ionin & Matushansky (2018). They posit that all cardinals are predicate modifiers (type 〈〈e, t〉, 〈e, t〉〉) which provide the cardinality of a partition of a plural individual. In order to derive the abstract-counting meaning, they propose a null nominalizing suffix. Its function is to yield the number corresponding to the object-counting denotation of a given cardinal.
Finally, Kennedy (2015) also argues that the basic meaning of numerals is object-counting. On his account, cardinals denote quantifiers over degrees (type 〈〈d, t〉, t〉), e.g., three denotes a set of degree properties whose maximal value is 3. The abstract-counting meaning is then obtained through the successive application of the standard type-shifting operations be and iota to the degree-quantifier object-counting semantics.
Though in recent years treating abstract counting as derived seems to have become prevalent, there are also proposals deriving the object-counting meaning of numerals from a number-denoting core. Two examples of such accounts are Krifka (1995) and Hackl (2000). In Krifka's system, the underlying meaning of cardinals amounts to reference to a number concept. In addition, depending on the language, this core meaning can be accompanied by an overt or covert object-counting element, e.g., a classifier. Likewise, Hackl starts with the abstract-counting meaning, which can then be shifted to a GQ-style object-counting determiner.
What both families of theories briefly described above have in common is that they presume a relationship between the abstract- and object-counting components of numerals, an assumption justified by the identical form of numerals like three in both types of uses. How they account for the relationship in question is represented schematically in (13a) and (13b), where abs and obj stand for ‘abstract-counting’ and ‘object-counting’, respectively, and α and β are operations responsible for the shifts.
(13)  a.  ⟦three_{abs}⟧ = ⟦[α [three_{obj}]]⟧
      b.  ⟦three_{obj}⟧ = ⟦[β [three_{abs}]]⟧
On the assumption that morphological marking reflects meaning composition, the two approaches make different predictions regarding the morphological makeup of cardinal numerals. In particular, (13a) predicts that across languages one should find numerals expressing α overtly. This means that cross-linguistically, abstract-counting numerals should exhibit a tendency to be morphologically more complex than object-counting numerals. On the other hand, (13b) predicts the opposite, i.e., object-counting numerals should be morphologically more complex than abstract-counting numerals.
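The two directions of derivation in (13) can be pictured with a toy model. The sketch below is an illustrative assumption and not the analysis of any of the cited authors: abstract-counting meanings are modelled as plain integers, object-counting meanings as predicates over collections, and the names `Number`, `CardinalProperty`, `alpha` and `beta` are hypothetical.

```python
from typing import Callable, Collection

# Hypothetical types for illustration only: an abstract-counting numeral is
# modelled as an integer, an object-counting numeral as a predicate that is
# true of collections with exactly that many members.
Number = int
CardinalProperty = Callable[[Collection], bool]


def beta(n: Number) -> CardinalProperty:
    """Direction (13b): build an object-counting meaning from a number."""
    return lambda plurality: len(plurality) == n


def alpha(prop: CardinalProperty, limit: int = 100) -> Number:
    """Direction (13a): recover a number from an object-counting meaning.

    Searching sample pluralities of size 0..limit is only a stand-in for
    whatever shifting operation a given theory assumes.
    """
    for size in range(limit + 1):
        if prop(range(size)):
            return size
    raise ValueError("no cardinality up to limit satisfies the property")


three_abs = 3
three_obj = beta(three_abs)
```

In this toy setting, `three_obj` holds of a three-membered collection and fails for a two-membered one, while `alpha(three_obj)` returns the number again. Which of the two operations, if either, is spelled out by overt morphology is the empirical question pursued in the text.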
In the next two sections, we will explore cross-linguistic morphological marking patterns, focusing on the marking of the two types of numerals. In Section 3, we present data that call into question the idea that the object-counting meaning of cardinals is basic, as in (13a), while supporting the view offered in (13b). In Section 4, we discuss examples indicating that even though the view in (13b) is essentially correct, it is still too simplistic in that even abstract-counting numerals are not grammatically atomic, and break down into at least two smaller meaning components.
3 Simplex numerals
Though the English data discussed so far do not show this, languages do in fact make various morphological distinctions between abstract and object counting (Hurford 1998, 2001). In this section, we start exploring these patterns by looking at numerals that are monomorphemic in the abstract-counting function. We shall call them simplex numerals. Comparing them to their object-counting counterparts, we have identified three marking patterns, as provided in Table 1. In the table, A and B are variables representing morphemes.
Table 1: Simplex numerals
                   syncretism  stacking  suppletion
abstract counting  A           A         A
object counting    A           A+B       B
In the simplex syncretic pattern, the abstract- and object-counting forms are identical and both consist of a single morpheme. On the other hand, the simplex stacking pattern is characterized by object-counting numerals being morphologically more complex. In addition to the abstract-counting core, they contain an additional morpheme. Finally, in the simplex suppletive pattern, abstract- and object-counting numerals have different and formally unrelated monomorphemic shapes.
It is important to emphasize that the patterns as described in Table 1 above are used to classify individual numerals rather than entire languages. That is because various patterns can coexist within a single language (cf. Bale & Coon 2014).
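The three patterns of Table 1 can be stated as a small decision procedure over individual numerals. The following is a minimal sketch under stated assumptions: each form is represented as a tuple of morpheme strings, the function name `classify_simplex` is hypothetical, and formal identity of morphemes is approximated by string equality (so tonal or other stem alternations are ignored).

```python
def classify_simplex(abstract: str, object_form: tuple[str, ...]) -> str:
    """Classify one numeral according to the patterns of Table 1.

    `abstract` is the single morpheme used in abstract counting;
    `object_form` lists the morphemes of the object-counting form.
    """
    if object_form == (abstract,):
        return "syncretism"      # A vs. A
    if abstract in object_form and len(object_form) > 1:
        return "stacking"        # A vs. A+B (in either linear order)
    if len(object_form) == 1:
        return "suppletion"      # A vs. B
    return "unclassified"        # e.g. a new root plus an extra morpheme


# Forms taken from the examples discussed in this section.
english_three = classify_simplex("three", ("three",))
mandarin_three = classify_simplex("sān", ("sān", "gè"))
maltese_two = classify_simplex("tnejn", ("żewġ",))
```

Because the function takes one numeral at a time, a single language can yield different labels for different numerals, in line with the remark that the patterns classify numerals rather than languages.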
Notice that the patterns represented in Table 1 do not exhaust the logical possibilities of variation. In particular, what is missing are abstract-counting numerals that would be more complex than their object-counting counterparts. While such examples have also been reported, uncontroversial examples are very rare.^{ 3 } In this paper we limit our focus to the patterns in Table 1, among which the stacking pattern and the syncretic pattern are cross-linguistically the two most common ways of marking the two numeral types.
3.1 Syncretism
The first pattern we will investigate makes no overt distinction between abstract- and object-counting numerals. In other words, both functions are expressed by the same formal exponent. For instance, in English the prenominal modifier three is used to quantify over individuals in (14a). In this use, it has the same shape as the name of a number concept, see (14b). We call such numerals syncretic.
(14)  a.  three apples
      b.  Two times three equals six.
From a typological perspective simplex syncretic numerals seem to be the most widespread type of numerals across languages. In (15)–(17), we illustrate this type by examples from Czech, Turkish and Basque, respectively.^{ 4 }
(15)  a.  pět   koček
          five  cats
          ‘five cats’
      b.  Tři    plus  pět   je  osm.
          three  plus  five  is  eight
          ‘Three plus five is eight.’  (Czech)
(16)  a.  üç     kız
          three  girl
          ‘three girls’
      b.  Yedi   eksi   üç     eşittir  dört.
          seven  minus  three  equals   four
          ‘Seven minus three equals four.’  (Turkish)
(17)  a.  bi   lili
          two  flower
          ‘two flowers’
      b.  Hiru   bider  bi   sei  dira.
          three  time   two  six  are
          ‘Three times two is six.’  (Basque)
The syncretic pattern is not very revealing with respect to the relationship in question. In the next section, we examine cases where the abstract- and object-counting numerals differ formally.
3.2 Stacking
The second pattern involves a formal asymmetry between the two types of numerals in question. In particular, object-counting numerals are morphologically more complex, i.e., include an additional morpheme compared to their abstract-counting counterparts. We shall refer to the formal relationship as stacking and numerals that arise as a result of stacking will be called augmented.
Typically, the asymmetry is characteristic of classifier languages. In these languages, the additional morpheme used with object-counting numerals is often referred to as a classifier. For instance, in Mandarin, the numeral cannot appear bare when modifying an NP, see (18).
(18)  a.  *sān   qiú
          three  ball
          Intended: ‘three balls’
      b.  sān    gè   qiú
          three  clf  ball
          ‘three balls’  (Mandarin)
On the other hand, the numeral has to be bare in an abstract-counting environment, see (19).^{ 5 }
(19)  a.  jiǔ   chúyǐ      sān    shì  sān.
          nine  divide.by  three  cop  three
          ‘Nine divided by three is three.’
      b.  #jiǔ  gè   chúyǐ      sān    gè   shì  sān    gè.
          nine  clf  divide.by  three  clf  cop  three  clf
          Intended: ‘Nine divided by three is three.’  (Mandarin)
It should be noted that there is usually more to the semantics of classifiers than just allowing for numerical quantification. Classifiers often have other functions as well, namely they introduce certain requirements regarding the kind, shape, size, etc. of the referents of the modified noun (e.g., Aikhenvald 2000). However, gè belongs to the class of so-called general or default classifiers, which can be used to count virtually any type of object. Thus, rather than conveying some specific characteristic of the counted object, we understand Mandarin gè to be at its core a pure object-counting marker.^{ 6 }
The same pattern as in (18)–(19) is also attested in Vietnamese.^{ 7 } As witnessed by the contrast in (20), the numeral hai ‘two’ must combine with an additional morpheme in an object-counting construction, where cái is a general classifier for inanimate objects.^{ 8 }
(20)  a.  *hai  bát
          two   bowl
          Intended: ‘two bowls’
      b.  hai  cái  bát
          two  clf  bowl
          ‘two bowls’  (Vietnamese)
On the other hand, in an abstract-counting environment like (21), the cardinal has to appear bare.
(21)  a.  Bốn   nhân   hai  bằng      tám.
          four  times  two  equal.to  eight
          ‘Four times two equals eight.’
      b.  #Bốn  cái  nhân   hai  cái  bằng      tám    cái.
          four  clf  times  two  clf  equal.to  eight  clf
          Intended: ‘Four times two equals eight.’  (Vietnamese)
Yet another example comes from Adang (Alor-Pantar), a Papuan language with a developed classifier system. Similar to Mandarin and Vietnamese, Adang object-counting numerals are bimorphemic, as demonstrated in (22), where paʔ is a general classifier (Robinson & Haan 2014, 250). On the other hand, abstract-counting cardinals are monomorphemic, see (23) (Klamer et al. 2014, 354).
(22)  ʔaburiŋ  paʔ  ut
      arrow    clf  four
      ‘four arrows’  (Adang)
(23)  air  nu   ‘aba’ang  ʔoalu    iwihing.
      ten  one  divide    posstwo  five
      ‘Ten divided by two is five.’  (Adang)
Augmented numerals also occur in Indo-European. For instance, in Bhojpuri (Indo-Aryan) the morphemes go and ṭho function as general classifiers that can be used to quantify over an object of any type, size or shape. An example of an object-counting use is given in (24), where the bimorphemic form consisting of the numeral root and go is used as a prenominal modifier (Barz & Diller 1985, 166; adapted). However, in an abstract-counting context neither go nor ṭho is employed, see (25) (Barz & Diller 1985, 165).
(24)  sāt    go   nokar
      seven  clf  servants
      ‘seven servants’  (Bhojpuri)
(25)  das  ā    pā̃c   hai  panarā,  nā?
      ten  and  five  are  fifteen  no
      ‘Ten and five make fifteen, don't they?’  (Bhojpuri)
The final example to be discussed here comes from Yoruba (Niger-Congo), which is a non-classifier language. At least in some varieties of Yoruba, the distinction between abstract-counting and object-counting numerals is marked by the prefix m-.^{ 9 } As demonstrated in (26) and (27), only prefixed forms can be used as nominal modifiers whereas only bare numerals can express abstract counting.^{ 10 }
(26)  a.  *ìwé  ẹ̀ta
          book  three
          Intended: ‘three books’
      b.  ìwé   m-ẹ́ta
          book  prfx-three
          ‘three books’  (Yoruba)
(27)  a.  èjì  pelu  ẹ̀ta    je    àrún.
          two  with  three  earn  five
          ‘Two plus three equals five.’
      b.  #m-éjì    pelu  m-ẹ́ta       je    m-árún.
          prfx-two  with  prfx-three  earn  prfx-five
          Intended: ‘Two plus three equals five.’  (Yoruba)
Cross-linguistically, the asymmetry discussed in this section is a relatively frequent phenomenon. Other languages with simplex numerals exhibiting the stacking pattern reported in the literature include, e.g., Thai (Hundius & Kölver 1983), Japanese (Sudo 2016), Assamese (Indo-Aryan) (Kalita 2011, 184) and Teiwa (Alor-Pantar) (Klamer 2010, 144–145; Klamer et al. 2014, 351). We interpret this phenomenon as evidence for the view that the abstract-counting meaning is basic and the object-counting meaning is derived from it, recall (13b) in Section 2.2.
3.3 Suppletion
Finally, the third marking pattern to be discussed here concerns suppletive numerals. In such a case, for a given number there are two simplex, morphologically unrelated forms, one of which expresses abstract counting whereas the other is used in the object-counting function. Admittedly, this pattern is very rare and it is challenging to provide an uncontroversial instance of a suppletive abstract/object-counting pair. Nonetheless, Hurford (1998) gives the example of numeral suppletion in Maltese, in which there are two morphologically unrelated forms corresponding to the number 2, namely tnejn and żewġ (both ‘two’). As evidenced by the contrast in (28), only żewġ can convey the object-counting meaning as a nominal modifier (Borg 1974, 293). On the other hand, only tnejn is possible in an abstract-counting environment such as (29) (Borg 1987, 62).^{ 11 }
(28)  a.  *tnejn     nisa
          two_{abs}  women
          Intended: ‘two women’
      b.  żewġ       nisa
          two_{obj}  women
          ‘two women’  (Maltese)
(29)  a.  Tnejn      u    tnejn      jagħmlu    erbgħa.
          two_{abs}  and  two_{abs}  they.make  four
          ‘Two and two make four.’
      b.  *Żewġ      u    żewġ       jagħmlu    erbgħa.
          two_{obj}  and  two_{obj}  they.make  four
          Intended: ‘Two and two make four.’  (Maltese)
As already mentioned above, the suppletive pattern is very rare and if identified, it typically applies to only one numeral in a language. However, according to the literature it is not restricted to Maltese ‘two’. Other examples of simplex suppletive numerals reported so far include, e.g., the cardinals ū and hlɛh (both ‘one’) in Palaung (Austroasiatic) (Greenberg 1978) as well as the Ibani (Eastern Ijaw) numerals gbẹrẹ and ng
The final note on suppletion we need to make here is that stacking and suppletion may in fact combine. As a case in point, consider the Mandarin numeral ‘two’. The numeral has the form èr for abstract counting, see (30a) (from Po-Ching & Rimmington 2015, 26). In (30b), we see that the object-counting use features a suppletive root liǎng ‘two’, crucially accompanied by the general classifier gè (He 2015, 198). It is ungrammatical to use the abstract-counting èr in such phrases, see (30c) (He 2015, 198). In sum, stacking and suppletion combine in (30b).
(30)  a.  yī   jiā  yī   děngyú  èr.
          one  add  one  equals  two
          ‘One plus one is two.’
      b.  liǎng  gè   xuéshēng
          two    clf  student
          ‘two students’
      c.  *èr  gè   xuéshēng
          two  clf  student
          Intended: ‘two students’  (Mandarin)
3.4 A side note on compound numerals
In our investigation, we often found a relation (suggestive, even though possibly an imperfect one) between the shape of abstract-counting numerals and the shape of numerals within compound numerals. We use the term compound numerals to refer to numerals like two hundred five that arise by combining two or more basic numeral roots.
In order to see the parallel between compound numerals and abstract counting, recall the Mandarin example (19) repeated in (31). The example shows that classifiers are absent in mathematical statements.
(31)  a.  jiǔ   chúyǐ      sān    shì  sān.
          nine  divide.by  three  cop  three
          ‘Nine divided by three is three.’
      b.  #jiǔ  gè   chúyǐ      sān    gè   shì  sān    gè.
          nine  clf  divide.by  three  clf  cop  three  clf
          Intended: ‘Nine divided by three is three.’  (Mandarin)
As pointed out by He (2015, 195), the classifier is also (unsurprisingly) missing in arithmetical statements like (32), where the property of being a prime number (which, recall, is a property of abstract numbers) is predicated of a compound numeral composed of three other numerals (‘two’, ‘ten’ and ‘three’).
(32)  èr   shí  sān    shì  sùshù.
      two  ten  three  be   prime.number
      ‘Twenty three is a prime number.’  (Mandarin)
These three numerals jointly specify (by multiplication and addition) the number 23, which is, indeed, a prime number. What is interesting for us is that in order to arrive at the meaning of the compound numeral, arithmetical operations such as multiplication and addition must be postulated among the numeral roots, namely [23 = 2 × 10 + 3]. Such operations are, in turn, defined over abstract-counting numerals, and we are therefore not surprised that there is no classifier in between any two numerals in (32).
Interestingly, the absence of classifiers in between numerals is also observed in object-counting contexts, see (33) from He (2015, 190).
(33)  [[èr  shí  sān]   wàn]          gè   xuéshēng
      two   ten  three  ten.thousand  clf  student
      ‘two hundred thirty thousand students’  (Mandarin)
He (2015) gives multiple arguments in favor of the proposal that the four basic numerals in (33) (i.e., èr, shí, sān and wàn) form a constituent to the exclusion of the classifier. According to He, the numerals are first joined together compositionally by a series of arithmetical operations [(2 × 10 + 3) × 10,000] and they jointly form a compound cardinal ‘two hundred thirty thousand.’ This internally complex compound cardinal is then used to quantify over the denotation of the noun, at which point the classifier becomes indispensable, turning the abstract-counting numeral into an object-counting numeral (cf. Cinque 2020). In sum, the absence of classifiers inside compound numerals suggests that individual numerals inside compounds are similar (both semantically and morphologically) to abstract-counting numerals.
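The order of operations described here (arithmetic among the roots first, a single classifier afterwards) can be made concrete with a short sketch. It is an illustration under stated assumptions rather than He's formalization: the nested-tuple encoding and the names `ROOT_VALUES`, `value`, `roots` and `object_counting` are hypothetical, and only the root values mentioned in the examples are listed.

```python
# Values of the basic Mandarin roots that appear in the examples.
ROOT_VALUES = {"èr": 2, "sān": 3, "wǔ": 5, "shí": 10, "wàn": 10_000}


def value(node) -> int:
    """Evaluate a compound numeral given as a root or (op, left, right)."""
    if isinstance(node, str):
        return ROOT_VALUES[node]
    op, left, right = node
    if op == "times":
        return value(left) * value(right)
    if op == "plus":
        return value(left) + value(right)
    raise ValueError(f"unknown operation: {op}")


def roots(node) -> list[str]:
    """List the numeral roots of a compound in their linear order."""
    if isinstance(node, str):
        return [node]
    _, left, right = node
    return roots(left) + roots(right)


def object_counting(node, classifier: str = "gè") -> list[str]:
    """Attach one classifier to the whole compound, never between roots."""
    return roots(node) + [classifier]


TWENTY_THREE = ("plus", ("times", "èr", "shí"), "sān")   # 2 x 10 + 3
EXAMPLE_33 = ("times", TWENTY_THREE, "wàn")              # (2 x 10 + 3) x 10,000
```

Evaluating `TWENTY_THREE` gives 23 and `EXAMPLE_33` gives 230,000, while `object_counting(EXAMPLE_33)` yields the four roots followed by a single `gè`, mirroring the constituency proposed for (33).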
This view is supported by the fact that the form of the initial numeral ‘two’ in (33) is èr. Recall from (30), repeated in (34), that èr ‘two’ is a dedicated abstract-counting form of ‘two’; see (34a). Therefore, the appearance of èr in the object-counting compound numeral in (33) indicates that inside such numerals we find the abstract-counting structure.
(34)  a.  yī   jiā  yī   děngyú  èr.
          one  add  one  equals  two
          ‘One plus one is two.’
      b.  *èr  gè   xuéshēng
          two  clf  student
          Intended: ‘two students’
      c.  liǎng  gè   xuéshēng
          two    clf  student
          ‘two students’  (Mandarin)
He (2015, 198–199) points out that the abstract-counting form èr is also found in additive numerals. This is interesting, because the sequence èr+clf is generally ungrammatical, recall (34b). However, in an additive numeral like ‘twelve’, the form èr must be used in object-counting contexts, see (35a). The form liǎng ‘two’, which is found before the classifier in (34c), is impossible in compound numerals, see (35b) (He 2015, 198). Once again, the fact that èr is used in compound numerals suggests that the internal structure of such numerals is similar to abstract-counting structures.
(35)  a.  [wǔ   shí  èr]  gè   xuéshēng
          five  ten  two  clf  student
          ‘fifty two students’
      b.  *[wǔ  shí  liǎng]  gè   xuéshēng
          five  ten  two     clf  student
          Intended: ‘fifty two students’  (Mandarin)
These facts are worth noting because abstract-counting contexts (arithmetical examples) are often missing in grammars. This limitation may be overcome by investigating compound numerals, which (in a number of cases) seem to mirror the structure of the Mandarin example in (35a).
Consider, for instance, the examples found in Upper Necaxa Totonac, as discussed in Beck (2004, 26–27). In Upper Necaxa Totonac, numerals in object-counting functions precede the noun and they are themselves preceded by a classifier, see (36a) for the numeral ‘one’ with a classifier for bunches. In (36b), we see the numeral ‘ten’ with a general classifier preceding it. This numeral is given here in order to see that the numeral ‘eleven’ in (36c) is composed by juxtaposing the roots of the two numerals in (36a–b).
(36)  a.  kiɬmaktín  más–ní  séʔnṵ
          clfone     rotdvb  banana
          ‘a bunch of [rotten] bananas’
      b.  aʔkáux
          clften
          ‘ten’
      c.  aʔkáuxtín
          clftenone
          ‘eleven’  (Upper Necaxa Totonac)
The classifier is found only once in the compound numeral, suggesting that (on analogy with Mandarin) the numeral ‘eleven’ is formed first as an abstract-counting numeral [10+1]. Only after the compound numeral is formed does the classifier attach to it, yielding an object-counting numeral. If correct, this suggests that Upper Necaxa Totonac provides us with yet another example of the stacking pattern.
The final example of this type we shall mention here comes from the Nilotic language Luwo (Storch 2014, 271). In this language, the numerals ‘one’ to ‘five’ are basic (not compound) and we list them in the first column in Table 2. It is interesting to note that they all start with an initial á, which is placed in italics and separated as an independent morpheme. The residue of the numeral is bold. The same strategy is applied to the numeral ‘ten’ at the bottom of the second column, which is also basic (not compound).
Table 2: Luwo numerals
number  numeral   number  numeral          number  numeral
1       ácíɛlɔ    6       ábiic bí cíɛl    20      dháànhɔ àdʊʊǹɔ
2       áríɔẁ     7       ábiic bí ríɔẁ    40      jéríɔẁ
3       ádák      8       ábiic bí dák     60      jédák
4       áŋwɛ ɛn   9       ábiic bí ŋwɛ ɛn  80      jéŋwɛ ɛn
5       ábiic     10      ápààr            100     jébiic
Let us now turn to the numerals ‘six’ to ‘nine’. These are contained in the second column. We can see that they are always composed of the numeral ‘five’, followed by bí (glossed as ‘plus’ in Storch 2014) and an appropriate numeral from the basic set. What is most relevant for our purpose is that the second numeral (after bí) lacks the initial á. This suggests a structure where the á is a separate morpheme that attaches to the compound numeral as in: á[5+1], á[5+2], etc. This again suggests that the compound numeral is formed first (through addition), and only after the compound numeral is created do we attach the ‘general classifier’ á in order to turn the abstract numeral into an object-counting numeral. As in Upper Necaxa Totonac, this suggests that Luwo numerals belong to the stacking type, since the general classifier is missing in compound numerals, which resemble abstract-counting structures.
The same conclusion is suggested by looking at the multiplicative numerals ‘forty’, ‘sixty’, ‘eighty’ and ‘hundred’ in the last column. These numerals contain the base jé glossed as ‘person’ in Storch (2014), but obviously standing in for ‘twenty’. This invariant base is never preceded by á and it is therefore classified as a syncretic numeral in our typology. What is relevant is that when jé is multiplied, it is followed by the á-less numeral roots of the numerals ‘two’ to ‘five’. The absence of the general classifier á once again suggests that the numerals 1–5 represent a stacking pattern that obtains between á-less abstract-counting numerals and object-counting numerals (the latter include the á).^{ 12 }
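The proposed bracketing, with á attached once outside the additive compound and absent after jé, can be checked against the forms in Table 2 with a small generator. This is a hedged sketch: the function names are hypothetical, only roots whose compound forms are fully regular in the table are included, and treating the á-less string as the bare (abstract-like) form is the text's hypothesis rather than an attested fact.

```python
# Roots from Table 2 with the initial á removed (a subset, for illustration).
ROOTS = {2: "ríɔẁ", 3: "dák", 5: "biic"}
OBJECT_MARKER = "á"   # the morpheme the text analyses as a 'general classifier'
PLUS = "bí"           # glossed as 'plus' in the source grammar
TWENTY = "jé"         # the base standing in for 'twenty'


def bare_form(n: int) -> str:
    """Build the á-less form: a bare root or an additive compound on 'five'."""
    if n in ROOTS:
        return ROOTS[n]
    if 5 < n < 10 and n - 5 in ROOTS:
        return f"{ROOTS[5]} {PLUS} {ROOTS[n - 5]}"
    raise ValueError(f"numeral {n} is not covered by this sketch")


def object_form(n: int) -> str:
    """Attach á once, outside the whole numeral, as in á[5+2]."""
    return OBJECT_MARKER + bare_form(n)


def multiple_of_twenty(k: int) -> str:
    """Multiply the base jé by an á-less root, as in 'forty' = jé + 'two'."""
    return TWENTY + ROOTS[k]
```

Under these assumptions, `object_form(7)` and `object_form(8)` reproduce the table entries for ‘seven’ and ‘eight’, and `multiple_of_twenty(2)` reproduces the entry for ‘forty’, without á ever appearing inside the compound.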
3.5 Summary
The data explored in this section (the stacking pattern discussed in subsection 3.2 in particular) suggest that across languages, the abstract-counting component of numerals is more primitive than the object-counting one. This is because in the stacking cases, the object-counting numeral morphologically contains the abstract-counting numeral. In the next section, we explore cross-linguistic evidence suggesting that even abstract-counting cardinals involve more structure than typically assumed.
4 Complex numerals
In this section, we turn our attention to languages in which abstract-counting numerals are morphologically complex in that they contain two morphemes. We shall call such abstract-counting numerals complex numerals. The patterns we found in such languages are summarized in Table 3 (A, B and C again represent separate morphemes).
Table 3: Complex numerals
                   syncretism  stacking  suppletion
abstract counting  A+B         A+B       A+B
object counting    A+B         A+B+C     A+C
The types of relations in Table 3 are identical to what we have observed in simplex cardinals, recall Table 1; the only difference is that the first row of Table 3 contains a numeral that is bimorphemic (rather than non-decomposable). For example, in the complex syncretic pattern, there is (again) no formal distinction between abstract- and object-counting numerals. The only difference from what we have already seen is that the syncretic numerals in Table 3 consist of two separate morphemes. Similarly, the complex stacking pattern again reveals an asymmetry between abstract- and object-counting numerals, only this time, the former are already bimorphemic. As a result, the object-counting form contains an additional third morpheme. Finally, according to the complex suppletive pattern, both abstract- and object-counting cardinals share the same core but differ in that each employs a different (morphologically unrelated) affix.
The data thus support the basic analytic claim (namely that object-counting numerals are derived from abstract-counting numerals), but lead us to revise the hypothesis that abstract-counting numerals are structurally simplex. Specifically, if each morpheme in the abstract-counting pattern expresses an independent component of meaning, we have to decompose abstract-counting numerals into at least two ingredients.
We also add here again that the patterns to be discussed pertain to individual numerals rather than whole languages, since a given language can have more than one type of cardinals. A description of an entire numeral system of a language will result from providing a full classification of its numerals into various marking patterns.
4.1 Syncretism
Let us begin the empirical discussion with complex syncretic numerals. A good example is Hawaiian, where the cardinals 1–9 always consist of a root and the prefix ‘e.^{ 13 } As witnessed in Table 4 (based on Elbert & Pukui 1979, 158–160), the interpretation of ‘e as an affix is corroborated by the fact that it is replaced by another morpheme in distributive numerals.^{ 14 }
Table 4: Numerals in Hawaiian
number  cardinal  distributive 
2  ‘elua  pālua 
3  ‘ekolu  pākolu 
4  ‘ehā  pāhā 
5  ‘elima  pālima 
6  ‘eono  pāono 
An example of an objectcounting use of the numeral ‘elua ‘two’ in Hawaiian is provided in (37) where the prefixed form precedes the noun (Elbert & Pukui 1979, 159).
‘elua  i‘a  
prfxtwo  fish  
‘two fish’  Hawaiian 
Interestingly, the very same shape is also used in mathematical discourse to express the abstractcounting function, as demonstrated in (38) (Pukui & Elbert 1986, 501; adapted). These data show that unlike, e.g., in Mandarin, Hawaiian cardinals are morphologically complex in both functions under consideration.
‘elua  ā  me  ‘elua,  ‘ehā.  
prfxtwo  and  with  prfxtwo  prfxfour  
‘Two plus two is four.’  Hawaiian 
Another example of complex syncretic numerals comes from Sanzhi Dargwa (Northeast Caucasian). In this language, all basic cardinals except for ca ‘one’ are morphologically complex. They consist of a root and a special derivational suffix (j)al, see Table 5 (based on Forker 2020, 130–135).^{ 15 } The affixal status of (j)al is proved by the structure of ordinal and multiplicative numerals in which (j)al is replaced by another element.
Table 5: Numerals in Sanzhi Dargwa
number  cardinal  ordinal  multiplicative
2  k’^{w}el  k’^{w}iɁibil  k’^{w}ijna
3  Ɂa^{ʕ}bal  Ɂa^{ʕ}bɁibil  Ɂa^{ʕ}bjna
4  a b ^{ w }al  a b ʔubil  a b ^{ w }na 
5  xujal  xuʔibil  xujna 
6  urekːal  urekʔibil  urekna 
The bimorphemic forms in the left compartment of Table 5 are used as both object and abstractcounting numerals. For instance, in (39), the cardinal xujal ‘five’ functions as a prenominal modifier (Forker 2020, 131; adapted).
xujal  rursːi  
fivesfx  girl  
‘five daughters’  Sanzhi Dargwa 
The very same form is also used in abstractcounting environments such as (40) (Forker 2020, 137; adapted).
k’^{w}elle  Ɂa^{ʕ}bal  čikabixar  birχ^{w}u  argu  xujal 
twoloc  threesfx  sprdownnthrowcond  nbecomeprs  go.ipfvprs  fivesfx 
‘Two plus three equals five.’  Sanzhi Dargwa 
Having introduced the pattern of complex syncretic cardinals, we now turn to a language with complex numerals that exhibit the stacking pattern.^{ 16 } This language behaves similarly to, e.g., Mandarin or Yoruba, but features abstractcounting numerals that are morphologically complex.
4.2 Stacking
The relevant pattern is found in Vera’a (Vanuatu). In this language, the cardinals 1–5 are morphologically complex, containing a root preceded by the prefix vō/ve, see Table 6 (based on Schnell 2011, 73–74).^{ 17 }
Table 6: Numerals in Vera’a
number  cardinal  multiplicative 
1  vōwal  vagwal 
2  vōru(ō)  vagru(ō) 
3  vō’ōl  vag’ōl 
4  vōve’  vagve’ 
5  velimē  vaglimē 
The status of vō/ve as a separate morpheme is evidenced by the fact that it is replaced by the prefix vag in multiplicative numerals. Analogous prefixes on basic cardinals are attested in other Vanuatu languages as well, e.g., in Vurës (Malau 2016, 126).
In abstractcounting contexts like the one exemplified in (41) (from Schnell 2011, 83), we find the numeral in the same form as given in the table. This pattern is similar to the ones seen in the previous section, where abstractcounting numerals were morphologically complex.
vēvēgi  ne  lukun  ēn  naw,  din  ēn  vō’ōl…  
mother3sg  tam  count  art  wave  reach  art  nbrthree  
‘Then his mother counted the waves reaching (the number) three…’  Vera’a 
What is different about Vera’a is that in NPs, i.e., in the objectcounting function, the numerals appear with an additional morpheme ne referred to as a ‘ligature’ by Schnell (2011, 74), see (42).
ēn  woqe’enge  ne  vōru  
art  tree  lig  nbrtwo  
‘two trees’  Vera’a 
It is interesting to note that according to Schnell’s description, the ligature is specific to numerals, i.e., it is not a general linking morpheme. In sum, Vera’a instantiates a pattern similar to Mandarin, because its objectcounting numerals stack an additional marker on top of the abstractcounting numeral. The new thing about Vera’a is that the abstract numeral is already complex, so we have a sequence of three morphemes in Vera’a.^{ 18 }
The Vera’a data suggest that the abstract and objectcounting functions consist of two and three semantic components, respectively, each of which can be exponed by a separate morpheme. Although instances of such strongly agglutinative marking are scarce, the evidence from Vera’a indicates two things. First, morphologically complex abstractcounting cardinals suggest that we need a complex structure for them. Second, the data show that even morphologically complex abstract numerals may be further augmented in their objectcounting function.
4.3 Suppletion
The final pattern that we would like to discuss is attested in a subset of languages that would be classified as obligatoryclassifier languages. In such languages, the numeral in the objectcounting function is always accompanied by one of a potentially large set of classifiers. We depict this schematically in (43a). This is similar to Mandarin and other languages discussed in Section 3.2.
A mixed pattern  
a.  Objectcounting: numeral + clf1/clf2/clf3/… 
b.  Abstract counting: numeral + clf1 
However, unlike in Mandarin, the languages that we shall discuss in this section retain the classifier in the abstractcounting function, as depicted in (43b). The particular classifier tends to be one of the set of objectcounting classifiers. Following the tradition, we will call it the general classifier.
If we analyze the pattern in (43), we realize that such languages represent a mixture of two ‘pure’ systems. On the one hand, such numerals have something in common with the complex syncretic numerals discussed in Section 4.1. The similarity consists in that the particular form consisting of a numeral + clf1 (the general classifier) has both an objectcounting use and an abstractcounting use.
A pure complex syncretic pattern  
a.  Objectcounting: numeral + clf1 
b.  Abstract counting: numeral + clf1 
However, the language type in (43) is different from the pure complex syncretic pattern (as exemplified by the languages discussed in Section 4.1) in that the abstractcounting classifier is not always found in the objectcounting function and may be replaced by other (more specific) classifiers. If we only focused on these other classifiers, we would get a pattern where the abstractcounting classifier is replaced by an unrelated (i.e., suppletive) objectcounting classifier, as in (45).
A pure complex suppletive pattern  
a.  Objectcounting: numeral + clf2/clf3/… 
b.  Abstract counting: numeral + clf1 
Based on these considerations, we shall treat here the mixed type separately from the syncretic type, and, in fact, as closely related to the suppletive type. However, we admit that these languages do not represent the pure suppletive type depicted in (45). Such a pure suppletive type of language was not found among the languages we looked at.^{ 19 }
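The three schemas in (43)–(45) differ only in how the abstractcounting classifier relates to the set of objectcounting classifiers. A minimal sketch of that comparison follows; the function name and the string labels are hypothetical and serve only to make the set relations explicit.

```python
def classify_classifier_system(object_clfs, abstract_clf):
    """Compare the object-counting classifiers with the abstract-counting one.

    object_clfs: iterable of classifier labels used in object counting.
    abstract_clf: the single classifier used in abstract counting.
    The labels ("clf1", "clf2", ...) follow the schemas in (43)-(45).
    """
    object_clfs = set(object_clfs)
    if object_clfs == {abstract_clf}:
        # (44): the same classifier in both functions, and no other.
        return "pure complex syncretic"
    if abstract_clf not in object_clfs:
        # (45): the abstract-counting classifier never counts objects.
        return "pure complex suppletive"
    # (43): the general classifier counts objects, but so do others.
    return "mixed"


print(classify_classifier_system({"clf1", "clf2", "clf3"}, "clf1"))
print(classify_classifier_system({"clf1"}, "clf1"))
print(classify_classifier_system({"clf2", "clf3"}, "clf1"))
```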
As the first example of the mixed/suppletive type, consider Shuhi (Qiangic, Sino-Tibetan). This language has a relatively rich inventory of classifiers. They are obligatory in numeral phrases and their choice depends on the modified noun. One of them is ko^{35}, which serves as the general classifier, see (46a) (Qi & He 2019, 65).^{ 20 } The example in (46b) contains a different classifier, namely ƫshu^{55}.
a.  zɐ^{31}mi^{31}  ȵe^{33}ko^{35}
child  twoclf
‘two children’
b.  lɑ^{33}re^{55}  ȡʑi^{33}ƫshu^{55}
towel  oneclf
‘one towel’  Shuhi
Importantly, unlike languages such as Mandarin or Yoruba, Shuhi numerals require the general classifier ko^{35} also in abstractcounting contexts such as (47) (Qi & He 2019, 69).
ȡʑi^{33}ko^{35}re^{33}  ȡʑi^{33}ko^{35}ɦõ^{33}  me^{33}ba^{33}le^{55}  ȵe^{33}ko^{35}  le^{33}ʑiʔ^{33}ȡʑõ^{33}.
oneclfabl  oneclfloc  diraddaux  twoclf  dirbecomedur 
‘One plus one is two.’  Shuhi 
A similar set of facts is found in Mokilese (Austronesian). The examples in (48) illustrate two classifiers (Harrison 2019, 91). The general classifier w is seen in (48b). The pair nicely illustrates the general nature of w.
a.  adroau  riahkij
egg  twoclf
‘two pieces of egg’
b.  adroau  riaw
egg  twoclf
‘two eggs’  Mokilese
The example in (49) again contrasts the general classifier w in (49a) and the classifier pas for long thin objects, see (49b). In (49c), we see the animate classifier men.
a.  wus  riaw
banana  twoclf
‘two bananas’
b.  wus  rahpas
banana  twoclf
‘two banana trees’
c.  jeri  roahmen
children  twoclf
‘two children’  Mokilese
The abstractcounting context is shown in (50). We can see that the numeral walu ‘eight’ is followed by the general classifier w (Harrison 2019, 94).
Riapak  pahw  waluw.  
twotimes  fourclf  eightclf  
‘Two times four is eight.’  Mokilese 
A similar system is found in Abkhaz (Northwest Caucasian). In the morphological system of this language, nouns are grammatically classified as human or nonhuman, with human nouns including the classes of masculine and feminine nouns (Chirikba 2003, 24–25). Furthermore, the cardinals 2–10 can take two forms, as demonstrated in Table 7 (based on Chirikba 2003, 34–35 and Hewitt 2010, 33), each of which is clearly morphologically complex.^{ 21 }
Table 7: Cardinals in Abkhaz
number  abstract/nonhuman  object human 
4  pš’ba  pš’j°ə(k’) 
5  x°ba  x°j°ə(k’) 
6  fba  fj°ə(k’) 
7  bəž’ba  bəž’j°ə(k’)
8  aabá  aaj°ə(k’) 
Forms with the suffix ba are objectcounting cardinals used to quantify over referents of nonhuman nouns, see (51a). On the other hand, numerals with the suffix j°ə(k’) are dedicated to counting human individuals, see (51b) (Hewitt 2010, 35; transliterated and adapted).
a.  ac’°ak°a  x°ba
artapplepl  fivenh
‘(the) five apples’
b.  ač”k’°ənc°a  x°j°ə
artboypl  fiveh
‘(the) five boys’  Abkhaz
Both types are morphologically marked, i.e., there is no structural asymmetry between the two forms. However, despite this fact, only nonhuman numerals can express the abstractcounting meaning, as evidenced by the contrast in arithmetical statements such as (52).^{ 22 }
a.  j°ba  jacəwc’ar  xpa  jəq’alojt’  x°ba
twonh  if.you.add  threenh  it.becomes  fivenh
‘Two plus three equals five.’
b.  #j°əʒ’a  jacəwc’ar  xj°ə  jəq’alojt’  x°j°ə
twoh  if.you.add  threeh  it.becomes  fiveh
Intended: ‘Two plus three equals five.’  Abkhaz
Admittedly, none of the patterns discussed here is an unequivocal instance of abstract/objectcounting suppletivism. Nevertheless, based on the reasoning presented at the start of this section, we treat this mixed type as a relevant case of suppletivism, because this is what one part of the system amounts to. It is important to capture the fact that the general classifier is replaced by other classifiers, because this replacement is not a logical necessity. In the Mayan language Chuj, for instance, a more specific classifier (called a nominal classifier) may optionally appear in addition to the numeral classifier. In (53), the animate numeral classifier wanh is followed by the nominal classifier nok’ ‘animal’ (Royer 2017, 33).
Ay  chab’wanh  nok’  tz’i’.  
exist  twonum.cl  nom.cl  dog  
‘There are two dogs.’  Chuj 
While we do not wish to suggest that Chuj represents a stacking pattern (there is much to be understood about the second, so-called nominal, classifier), we do think that the possibility of a system like the one in Chuj highlights the need to explain why the general classifier does not combine with other classifiers. This is the essence of what we label the complex suppletive pattern.
Before we propose how to account for the morphosemantic variation in cardinal numerals discussed so far, let us summarize our findings.
5 Data summary
In the two previous sections, we have examined six types of meaning/form correspondences across numerals in various genetically and typologically distinct languages. As one can see in Table 8, the crosslinguistic variation we have explored reduces to the interplay of two factors. The first one is the morphological relationship between forms expressing the abstract and objectcounting meaning, i.e., syncretism, stacking and suppletion. The second one concerns the shape of the abstractcounting form, i.e., simplex vs. complex. Importantly, the patterns in Table 8 are numeralspecific rather than languagespecific as different numerals in a single language can fall into various types.
Table 8: Data Summary
type  pattern  language  number  abstract  object
simplex  syncretism  English  3  three  three
simplex  stacking  Mandarin  3  sān  sān gè
simplex  suppletion  Maltese  2  tnejn  żewġ
complex  syncretism  Sanzhi Dargwa  5  xujal  xujal
complex  stacking  Vera’a  2  vōru(ō)  ne vōru(ō)
complex  suppletion  Mokilese  4  pahw  pahmen
In the simplex patterns, abstractcounting forms are monomorphemic, whereas in the complex patterns they are bimorphemic. Syncretic numerals do not differentiate morphologically between the abstract and objectcounting form. On the other hand, in the stacked and suppletive pattern, the two functions are expressed by different exponents. In the former case, objectcounting numerals employ an additional morpheme compared to their abstractcounting counterparts, whereas suppletive numerals are not distinguished by their structural complexity but rather by the fact that they consist of morphologically unrelated forms.
The patterns summarized in Table 8 on the one hand suggest that the abstractcounting meaning of numerals is more basic than the objectcounting meaning. On the other hand, the existence of complex abstract numerals calls into question a widespread assumption that basic cardinal numerals are simplex expressions. If we adopt the assumption that morphology expresses some meaning, it follows from the existence of the complex stacking pattern that abstractcounting and objectcounting numerals encode at least two and three semantic components, respectively.
What we perceive to be a single thread running through the patterns is that whenever there is a difference in morphological complexity between object and abstractcounting numerals, objectcounting numerals are more complex. This apparently close-to-universal asymmetry suggests to us that it makes sense to propose a unified theory where all numerals are characterized by the same ingredients, such that at the level of meaning, objectcounting numerals are more complex than abstractcounting numerals. The remaining points of variation can be accounted for within a system of morphological realization.
6 Structures
In order to explain the asymmetry in marking patterns summarized in Table 8, we postulate a set of primitive semantic features that abstract and objectcounting numerals are composed of. We assume that those semantic primitives are crosslinguistically invariant and can feed into a unified morphosemantic system. As a result, we will be able to compositionally derive the abstract/object counting distinction in a crosslinguistically uniform fashion, with variation limited to morphosyntactic realization. However, before we turn to the technical details of our analysis, let us spell out our assumptions regarding the nature of cardinals and the role of classifiers.
6.1 Key intuition
The core intuition behind our proposal is that numerals are at their core scalar entities. Specifically, we assume that each numeral is associated with an interval on the number scale. Assuming that the number scale has a nonarbitrary starting point corresponding to 0 and that each numeral designates a different upper bound, such intervals can be viewed as ordered in terms of their length.
It seems to us that there are at least three other strands of research that share a similar intuition. The first one concerns an attempt to explain the meaning of spatial and directional numeral modifiers as in above three and up to three (Nouwen 2016). Furthermore, in a number of approaches to degree semantics, degrees are taken to refer to intervals rather than to points on a scale (e.g., Seuren 1984; Kennedy 2001; Schwarzschild & Wilkinson 2002). Finally, the recent analysis of morphological marking patterns in comparative formation suggests a similar direction (Vanden Wyngaerd et al. 2020).^{ 23 }
6.2 The role of classifiers
The standard assumption regarding the role of numeral classifiers in languages such as Mandarin concerns the semantics of nouns (e.g., Chierchia 1998; Borer 2005; Rothstein 2010; Scontras 2013). According to the received view, the denotations of nouns in classifier languages are mass-like, which makes them incompatible with the meaning of numerals. Thus, classifiers are required in order to compensate for this semantic ‘deficit’ and turn the denotations of nouns into countable ones.
In this paper, however, we embrace an alternative view on the role of classifiers in numeral phrases (Krifka 1995; Bale & Coon 2014; Sudo 2016). One of the problems the received view encounters concerns languages such as Mi’gmaq (Algonquian) and Ch’ol (Mayan) in which certain numerals require classifiers whereas others appear only bare (Bale & Coon 2014). Furthermore, even in well-studied obligatory classifier languages such as Japanese, certain quantifiers do not take classifiers (Sudo 2016). If the occurrence of the classifier were due to the semantics of nouns, those facts would be unaccounted for, since there is no evidence that nouns change their meaning depending on a quantifier that modifies them. Consequently, according to the alternative explanation of the role of classifiers, they are required to compensate for semantic ‘deficits’ of numerals rather than nouns. The idea is that in classifier languages numerals lack a semantic component that would allow them to quantify over entities, recall (13b) in Section 2.2. This role is, thus, assigned to classifiers.
This semantic view is also confirmed by a comparative study of the ordering of classifiers, nouns and numerals by Cinque (2020). What Cinque observes (building on previous work) is that classifiers never occur on the opposite side of the noun from the numeral; orders such as *numeral > N > clf or *clf > N > numeral are not found. This receives a natural explanation under the hypothesis that [numeral + clf] forms a unit/constituent, which is what we are proposing here as well. Furthermore, as long as the classifier is adjacent to the numeral, it may in fact be separated from the noun by the numeral, yielding the order clf > numeral > N. Such examples are attested, see, e.g., the data from Upper Necaxa Totonac in (36) or Hawaiian in (37). Both of these facts suggest that the syntactic structure contains a constituent [clf numeral], which does not split and is (as a unit) positioned with respect to the noun.
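The ordering generalization can be checked mechanically. The sketch below enumerates all linear orders of numeral, classifier and noun and keeps those in which numeral and classifier are adjacent. It encodes only the adjacency requirement that follows from the constituency hypothesis; it does not claim that every order it admits is attested. The helper name `adjacent` is ours.

```python
from itertools import permutations


def adjacent(order, a="numeral", b="clf"):
    """True if a and b are next to each other in the linear order."""
    return abs(order.index(a) - order.index(b)) == 1


orders = list(permutations(("numeral", "clf", "N")))

# Orders compatible with [numeral + clf] forming an unsplit constituent.
allowed = [o for o in orders if adjacent(o)]
# Orders in which the noun intervenes between numeral and classifier.
excluded = [o for o in orders if not adjacent(o)]

for o in excluded:
    print("*" + " > ".join(o))
```

The two orders printed with an asterisk are exactly the ones reported as unattested, namely numeral > N > clf and clf > N > numeral.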
With the crucial assumptions introduced, let us now move to the details of the proposal.
6.3 Semantic features
In order to capture the morphological complexity of basic cardinals discussed in Sections 3 and 4 and summarized in Table 8 in Section 5, we propose three semantic primitives: Scale, Num (for ‘number’) and Cl (for ‘classifier’), which we take to be crosslinguistically stable ingredients of the meaning of numerals.
The first feature, namely Scale, denotes a closed interval, i.e., a set of natural numbers (type 〈n, t〉), see (54a). As indicated by the subscript, the semantic content is lexically encoded, and thus specific for each distinct cardinal numeral. In other words, there is Scale _{1} for ‘one’, Scale _{2} for ‘two’ etc. What all Scale elements have in common is that their lower bound is always 0. They differ in what the upper bound is. For instance, in the case of the cardinal three, Scale _{3} denotes the set of integers between 0 and 3, as in (54b).
a.  ⟦Scale _{m}⟧_{〈n,t〉}
b.  ⟦Scale _{3}⟧ = [0, 3] 
The second primitive is the syntactic head Num. As such it is an invariant functional element shared by all cardinal numerals. The meaning of Num is a function from intervals to numbers. In (55a), the maximization operator max takes a set of integers denoted by Scale and returns the greatest value from that set. As a result, it creates a name of a number concept (type n) compatible with abstractcounting contexts calling for numeric arguments. For instance, when applied to the interval [0, 3] denoted by Scale _{3}, it will yield the number 3, see (55b).
a.  ⟦Num⟧_{〈〈n,t〉,n〉} = λP _{〈n,t〉}[max(P)] 
b.  ⟦Num⟧(⟦Scale _{3}⟧) = 3 
The final component we propose is the Cl head, which introduces objectcounting semantics. Its role is to turn a number into a counting expression corresponding to that number. In particular, we postulate that the meaning of Cl is a function from an integer to a predicate modifier supplied with the pluralization operation * (Link 1983) and the measure function #(P), which maps a plurality of individuals to a numeric value corresponding to the number of individuals making up that plurality, see (56a) (Krifka 1989). In other words, Cl is an expression of type 〈n, 〈〈e, t〉, 〈e, t〉〉〉 that creates an objectcounting numeral.^{ 24 } ^{,} ^{ 25 }
a.  ⟦Cl⟧_{〈n,〈〈e,t〉,〈e,t〉〉〉} = λn _{ n } λP _{〈e,t〉} λx _{ e }[*P(x) ∧ #(P)(x) = n] 
b.  ⟦Cl⟧(⟦Num⟧(⟦Scale _{3}⟧)) = λP _{〈e,t〉} λx _{ e }[*P(x) ∧ #(P)(x) = 3] 
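The composition in (54)–(56) can be traced step by step in a toy model. The sketch below rests on assumptions that are ours and not part of the proposal: pluralities are modelled as non-empty frozensets of atomic individuals, *P holds of a plurality when every atom satisfies P, and #(P)(x) is the number of atoms in x. The names `scale`, `num`, `cl` and the toy domain are hypothetical.

```python
def scale(m):
    """(54): the closed interval [0, m] as a set of natural numbers."""
    return frozenset(range(m + 1))


def num(interval):
    """(55a): map an interval to its greatest element."""
    return max(interval)


def cl(n):
    """(56a): map a number n to a predicate modifier.

    Assumption: a plurality x is a non-empty frozenset of atoms.
    """
    def modifier(P):
        def predicate(x):
            starred = len(x) > 0 and all(P(atom) for atom in x)  # *P(x)
            return starred and len(x) == n                        # #(P)(x) = n
        return predicate
    return modifier


# Toy domain (hypothetical): four apples and one pear.
APPLES = {"a1", "a2", "a3", "a4"}


def is_apple(atom):
    return atom in APPLES


three_abstract = num(scale(3))            # (55b): the number 3
three_apples = cl(three_abstract)(is_apple)  # cf. (56b) applied to a noun

print(three_abstract)
print(three_apples(frozenset({"a1", "a2", "a3"})))
print(three_apples(frozenset({"a1", "a2"})))
print(three_apples(frozenset({"a1", "a2", "pear"})))
```

Only the plurality consisting of exactly three apples satisfies the derived predicate, which is the behaviour the denotation in (56b) is meant to capture.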
Combining the components in (54) and (55) in a compositional manner gives us the abstractcounting structure in (57). As one can see in the tree, the entire NumP refers to a number concept. Such a denotation is thus compatible with expressions calling for numerical arguments such as be a prime number or plus which we take to be of type 〈n, t〉 and 〈n, n〉, respectively.
Adding the Cl head as defined in (56) on top of (57) leads in turn to (58), which represents the structure of the objectcounting numeral. The semantic contribution of Cl gives us a predicate modifier which, when combined with a noun, returns the set of pluralities in the denotation of that noun whose cardinality is the number referred to by the NumP, e.g., 3 in the case of (58).
Our model is broadly based on the idea that the meaning components postulated above are uniformly structured across languages and they must all be pronounced, though languages differ in how they pronounce them. So, for instance, the English phrase three apples and the Mandarin classifier construction sān gè qiú ‘three balls’ get the corresponding denotations in (59a) and (59b), respectively (cf. Krifka 1995).
a.  ⟦three apples⟧ = λx[*apple(x) ∧ #(apple)(x) = 3] 
b.  ⟦sān gè qiú⟧ = λx[*ball(x) ∧ #(ball)(x) = 3] 
In the next section, we will show that it is possible to use the crosslinguistically uniform structures given in (57) and (58) and still account for language variation in the domain at hand, summarized in Section 5.
7 Spellout
The main idea of our analysis is that the semantic structures proposed in the preceding section are universal, and variation among languages reduces to how these ingredients are pronounced in individual languages. This idea is depicted in Table 9. In the table, we have six rows, one for each pattern. The table is further divided into two columns labeled abstract and object, which designate the relevant types of numerals. The abstractcounting column has two components of meaning, each in its own subcolumn. The objectcounting column has three ingredients.
Table 9: Analysis Overview
pattern  abstract (Scale, Num)  language  number  object (Scale, Num, Cl)
simplex syncretic  three  English  3  three
simplex suppletive  tnejn  Maltese  2  żewġ
simplex stacking  sān  Mandarin  3  sān + gè
complex syncretic  xu + jal  Sanzhi Dargwa  5  xu + jal
complex suppletive  pah + w  Mokilese  4  pah + men
complex stacking  ru(ō) + vō  Vera’a  2  ru(ō) + vō + ne
The basic idea is that all of these ingredients must be lexicalized, i.e., linked to a specific morpheme (or more generally, to a lexical item). Individual numeral roots in particular languages are then analyzed as lexicalizing a variable number of these components. When the numeral is able to lexicalize both the Scale and Num heads, we see just a single marker in the abstract counting column. These are the simplex patterns above the double horizontal line. When the numeral lexicalizes only the Scale head, we get the complex numerals, because we need an additional marker to lexicalize Num.
The simplex and complex pattern further separate into three different types depending on how the extra Cl component is realized in the objectcounting column. If we need an extra morpheme for it, we get the stacking pattern. If it is realized along with other meaning components, we get either the syncretic pattern or the suppletive pattern depending on details that we shall look more into as the discussion unfolds. At this point, the main message is that we can lexically specify individual morphemes for the ingredients they spell out, and by doing so, derive the morphological variation from this assumption. Needless to say, the lexical specification of morphemes is the one place in the grammar where evidence for crosslinguistic variation is indisputable.
In the remainder of this section, we formalize this intuition using an independently established spellout algorithm developed within the Nanosyntax framework (Starke 2018). The very same algorithm has been used in the study of degree morphology (De Clercq & Vanden Wyngaerd 2017; Vanden Wyngaerd et al. 2020), relative/wh pronouns (Wiland 2018, Bergsma 2019) and case morphology (Caha 2019). The relevant point is that the account does not represent an ad hoc approach. Rather, we use an independently established model of crosslinguistic variation in order to show that the morphological facts discussed up to now can be derived while assuming the invariant structure in (58).
7.1 Simplex syncretic numerals
Nanosyntax is a realizational post-syntactic model of morphology. This means that syntactic structures are built first from abstract meaningful components, which are then mapped onto their pronunciation during the so-called lexicalization procedure. A crucial element in the lexicalization process is the lexicon, a language-particular list of lexical items. In Nanosyntax, each lexical item is understood as a link between representations belonging to different modules. Concretely, it links a particular syntactic structure to a particular phonology and/or concept.
Due to the linking, lexical items serve as ‘translation’ instructions of sorts. We can read them as follows: if syntax generates the stored representation S, then this representation is linked to a phonological representation P, which is associated to the lexical entry. Under this view, lexical entries have the restricted format < S,P >, where each of S and P is a wellformed representation in the respective module.^{ 26 }
In order to have a concrete lexical item as an example, consider the English numeral three. This numeral can act as an objectcounting numeral. The structure of objectcounting numerals according to our proposal is as given in (60). In order to express the fact that this structure can be pronounced by the numeral three, we link the relevant structure to the phonology /θɹiː/. This linking is shown in (61). This numbered item corresponds to the lexical item for the numeral three.
As already said, the entry for three can be read as a ‘translation’ instruction: when syntax builds ClP, this structure can be spelled out as /θɹiː/.
In Nanosyntax, the decision whether a lexical item can (or cannot) spell out a particular structure is dependent on the notion of matching. The rule is that whenever matching between a lexical entry and the structure obtains, the lexical entry can spell out the structure. Matching is based on identity: whenever the syntactic structure is identical to something stored in the lexicon, the item that contains the relevant tree matches the syntax tree. This is encoded by the socalled Superset Principle, see (62).
The Superset Principle (Starke 2009): 
A lexically stored tree L matches a syntactic node S iff L contains the syntactic tree dominated by S as a subtree. 
The Superset Principle leads to the consequence that a lexical entry matches any structure that is contained inside the lexical entry. This means that when syntax only builds the abstractcounting numeral – with structure as in (63) – the lexical item for the numeral three can be inserted, because it matches the structure. Matching (recall) is defined by the Superset Principle: since the lexical entry for the numeral, repeated in (64), contains the structure in (63), matching obtains and spellout can take place. Spellout is encoded by the circle around NumP in (63). The circle in (64) just highlights the fact that the lexical entry literally contains a part that is identical to the structure spelled out.
As a result, when a language has a numeral with an entry such as (64), we get a syncretism between the object and abstractcounting function.
7.2 Simplex suppletive pattern
Let us now address the question of how the simplex suppletive pattern arises. This pattern is found in languages where a single numeral (e.g., ‘two’) has two lexical entries, each for a different function. As an example, recall the Maltese numeral ‘2’, which has the shape tnejn for abstract counting and żewġ for object counting, recall (28)–(29).
The lexical entries are given below. The entry for the objectcounting numeral is shown in (65), the abstractcounting numeral is in (66).
With these entries in place, let us consider the spellout of the object and abstractcounting forms. These are depicted in (67) and (68) respectively. In the case of the objectcounting numeral, we only have one matching lexical item, namely żewġ. The reason is that the other numeral (tnejn) does not match the relevant constituent (it does not contain it). Therefore, the objectcounting numeral is spelled out as żewġ.
Let us now turn to the abstractcounting structure in (68). This structure is contained in both lexical entries. Therefore, both entries match, and they can both be used to spell out the structure. In such a situation, competition among the entries arises. The competition is resolved according to the so-called Elsewhere Condition (Kiparsky 1973), which demands that ‘the more specific’ entry wins. In a system where insertion is governed by the Superset Principle, the most specific entry is the one that has the fewest superfluous features. In the case of the abstractcounting structure depicted in (68), this is tnejn, whose entry has no superfluous features. On the other hand, żewġ does have one superfluous feature and it therefore loses in the competition. In sum, the pair of lexical entries in (65)–(66) gives rise to the simplex suppletive pattern.
The Elsewhere Condition: When multiple items match, choose the more specific one (the one that has fewer superfluous features). 
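The competition between the two Maltese entries can be replayed in a few lines. This is a sketch under the same hypothetical tree encoding as before (a phrase is a triple of label, head and complement), with żewġ stored as a ClP and tnejn as a NumP, as described for (65) and (66). Superfluous features are counted as the features of the entry minus those of the structure being spelled out.

```python
NUMP = ("NumP", "Num", "Scale")
CLP = ("ClP", "Cl", NUMP)

# Entries as described for (65) and (66); the tuple encoding is hypothetical.
LEXICON = {"żewġ": CLP, "tnejn": NUMP}


def constituents(tree):
    """Yield the tree and every constituent along its complement spine."""
    yield tree
    if isinstance(tree, tuple):
        yield from constituents(tree[2])


def matches(entry, node):
    """Superset Principle: the entry contains the syntactic node."""
    return any(part == node for part in constituents(entry))


def features(tree):
    """List the features (heads) of a tree, top to bottom."""
    if isinstance(tree, str):
        return [tree]
    _, head, complement = tree
    return [head] + features(complement)


def spell_out(node, lexicon):
    """Pick the matching entry with the fewest superfluous features."""
    candidates = {f: e for f, e in lexicon.items() if matches(e, node)}
    if not candidates:
        return None
    size = len(features(node))
    return min(candidates, key=lambda f: len(features(candidates[f])) - size)


print(spell_out(CLP, LEXICON))   # żewġ: the only matching entry
print(spell_out(NUMP, LEXICON))  # tnejn: both match, tnejn has no superfluous feature
```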
7.3 The simplex stacking pattern
In the simplex stacking pattern (which is found, for instance, in Mandarin), we encounter (for the first time) a situation where two morphemes are needed to spell out a particular structure. We shall begin by providing the lexical entries for the numeral ‘three’ (sān) and for the default classifier gè; see (70) and (71) respectively.
We can see that the lexical entry for the numeral in (70) corresponds to the abstract-counting structure. Therefore, when syntax builds such a structure – as in (72) – the numeral can be used to spell it out. As a result, we correctly model the fact that we only see the bare numeral in the abstract-counting function in Mandarin.
Consider now the object-counting structure in (73). Here, the numeral cannot spell out the full structure because its entry does not contain it. (Hence, no match and no spellout.)
The maximal constituent matched by the numeral is NumP, circled in (73). This still leaves the head Cl and its projection ClP without spellout. Leaving projections without spellout is not allowed in Nanosyntax: in order to externalize meaning, all features must be spelled out (cf. Fábregas 2007). The intuition which we shall follow is that the two unlexicalized nodes in (73) are spelled out by the classifier gè. These are also precisely the two nodes that are contained inside its entry in (71).
In order to see how this intuition is technically implemented in Nanosyntax, we must introduce the idea of cyclic spellout and spellout-driven movement. Cyclic spellout in Nanosyntax means that every time a new feature is introduced in the syntax, it must be lexicalized before the derivation is allowed to continue. We state this in (74).
Cyclic Phrasal Spellout. Spellout must successfully apply to the output of every Merge F operation. Since Merge proceeds bottom-up, so does spellout. After successful spellout, the derivation may terminate, or proceed to another round of Merge F. 
The consequence of adopting cyclic spellout is that structures are built in small steps, where each Merge F operation is immediately followed by spellout. For example, in order to build the structure of the object-counting numeral (corresponding to ClP), we must first build – and spell out – the abstract-counting numeral. Therefore, the tree in (72) can be considered the first step in the construction of the object-counting numeral in (73). The tree in (73) (with a circle around the NumP) then represents a stage in the derivation where we have added a new feature to the structure in (72), and we now need to spell out the new structure.
In order to see how exactly spellout operates, let us present in (75) the so-called spellout algorithm. This algorithm implements the idea of Cyclic Spellout, and it also introduces three different spellout options for an FP built by external merge.
The Spellout Algorithm (Starke 2018)  
a.  Merge F and spell out FP. 
b.  If (a) fails, move the Spec of the complement of F and spell out FP. 
c.  If (b) fails, move the complement of F and spell out FP. 
The first option is to find a match for that FP and spell it out. By hypothesis, the Mandarin numeral ‘three’ is stored as a NumP, and it therefore does not match the whole tree in (76). What this means is that other derivational options in the algorithm in (75) must be explored.
The first rescue option that must be tried (since spellout failed in (76)) is to move the Spec of the complement (and then retry spelling out FP). However, since the complement of Cl has no Spec in (76), the option (75b) is undefined. As a consequence, the final derivational option is tried, see (75c). This derivational option requires us to move the complement of F, which is a step we show in (77).
The root node in (77) has the same label as the one on the right branch. The point here is not so much to suggest that this is an adjunction structure; the point is to say that it is the right branch that provides the label to the root node.
After complement movement, the structure in (77) can be simplified as in (78) with the trace of the movement ignored. The assumption that spellout movements do not leave a trace is not crucial, even though it is a standard part of the Nanosyntax toolbox. The reason why we follow this idea here is that ignoring the trace makes it easy to see when lexical entries match and when they don't. See Starke (2018) and Caha (2019) for more discussion.
The spellout algorithm now requires that after we move the complement, we try to spell out FP again. This time, matching is successful, because the general classifier gè, recall (71), can be inserted at the lower ClP, as in (79). As a result, the simplex stacking pattern is derived using the lexical entries in (70)–(71), and we also derive the correct ordering. The structure in (79) also replicates the intuition depicted in Table 9 to the effect that the classifier in Mandarin spells out Cl, while the numeral root spells out Scale and Num.
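The derivation just described can be sketched procedurally. The following Python fragment is an informal illustration under simplifying assumptions of our own: lexical entries are bottom-up feature tuples, lexical entries never contain complex left branches, and moved material is kept as a list of already spelled-out chunks to the left of the remnant. The names `derive`, `best` and `MANDARIN` are hypothetical.

```python
def matches(entry, structure):
    """Simplified Superset Principle: structure is a bottom-up prefix of entry."""
    return entry[:len(structure)] == structure

def best(structure, lexicon):
    """Matching form with the fewest superfluous features, or None."""
    cands = [(len(e) - len(structure), f)
             for f, e in lexicon.items() if matches(e, structure)]
    return min(cands)[1] if cands else None

def derive(features, lexicon):
    """Cyclic spellout: Merge F, then try the options in (75) in order."""
    chunks = []                       # (features, exponent) pairs, left to right
    for f in features:
        options = []
        if len(chunks) <= 1:          # (75a): spell out FP without movement
            spine = (chunks[0][0] if chunks else ()) + (f,)
            options.append(([], spine))
        if len(chunks) >= 2:          # (75b): defined only if there is a Spec
            options.append((chunks[:-1], chunks[-1][0] + (f,)))
        options.append((chunks, (f,)))  # (75c): move the whole complement
        for moved, remnant in options:
            form = best(remnant, lexicon)
            if form:
                chunks = moved + [(remnant, form)]
                break
        else:
            raise ValueError(f"no spellout for {f}")
    return [form for _, form in chunks]

MANDARIN = {"sān": ("Scale", "Num"), "gè": ("Cl",)}   # cf. (70) and (71)

print(derive(("Scale", "Num"), MANDARIN))        # abstract counting
print(derive(("Scale", "Num", "Cl"), MANDARIN))  # object counting
```

In this sketch, the abstract-counting structure is spelled out by the bare numeral, while merging Cl makes direct spellout fail, leaves Spec movement undefined, and ends with complement movement, so that the classifier follows the numeral.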
7.4 On the interaction between suppletion and stacking in colloquial Mandarin
At this point, we have shown how all three simplex patterns arise. The current section turns to a particularly interesting set of facts surrounding the numeral ‘three’ in spoken Mandarin, relying on the description in He (2015, 199) and Qi & He (2019, 76–78). We argue that the facts provide a particularly compelling case for analyzing the patterns at hand by means of phrasal spellout.
As background, recall that Mandarin generally exhibits the stacking pattern illustrated in (80a–b).
a.  jiǔ  chúyǐ  sān  shì  sān. 
    nine  divide.by  three  cop  three 
    ‘Nine divided by three is three.’ 
b.  sān  gè  xuéshēng  
    three  clf  student  
    ‘three students’  Mandarin 
c.  sā  (*gè)  xuéshēng  
    three.clf  clf  student  
    ‘three students’  Colloquial Mandarin 
Interestingly, the spoken language has developed a special form for the object-counting numeral ‘three’, namely sā. This form is in (80c), and He (2015, 199) describes it as a “reduced form” corresponding grammatically to the Standard Mandarin combination sān gè ‘three clf’. The analysis of sā as an opaque combination of a numeral and a classifier is motivated by the fact that sā must combine with the noun directly, i.e., without a classifier; see (80c). In our system, sā is therefore an instance of a suppletive object-counting numeral with sān in (80a) as its abstract-counting counterpart.
The important point is that this description nicely illustrates the gist of our idea as to how morphologically simplex object-counting numerals arise, namely as an opaque realization of a complex structure containing the meaning components corresponding to the combination of an abstract-counting numeral and a classifier.
We will see shortly that the reduction of sān gè ‘three clf’ to sā is not phonological, but morphological in nature. In our proposal, the colloquial example in (80c) therefore requires us to posit a special lexical entry for sā, which only exists in the colloquial variety. The entry is as given in (81).

This lexical entry coexists in the colloquial variety with two other lexical entries. One of them is the lexical entry for the abstract-counting sān ‘three’, which the colloquial variety uses in arithmetical statements just like the standard variety. The second lexical entry is the entry for the general classifier gè, which is still used in the colloquial variety with other numerals. We give the two entries in (82) and (83), repeated from (70) and (71), respectively.
The coexistence of the three lexical entries in the colloquial variety allows us to illustrate the explanatory power of the theoretical tools that we have posited up to now. First of all, note that sā with the lexical entry as in (81) could in principle be used in mathematical statements, because it contains the abstract-counting structure. However, sān with the entry (82) is a perfect match for the abstract-counting structure, and we therefore predict that sā cannot occur in arithmetical environments. This prediction is borne out, see (84) (Qi & He 2019, 77).
*Wǒ  zuì  xǐhuan  de  shù  shì  sā.  
1.sg  most  like  prt  number  cop  three.clf  
Intended: ‘The number I like most is three.’  Colloquial Mandarin 
A more interesting prediction relates to the relative sequence of steps in the spellout algorithm. To see it, suppose that the cyclic derivation has produced the abstract-counting numeral sān ‘three’. Suppose now that the head Cl is added, as in (85), repeated from (76).
The prediction – related to the ordering of steps in the spellout algorithm – is that when confronted with (85), the system first tries to spell out the whole constituent without movement. In the colloquial variety, this attempt leads to a successful match by sā, see (86). As a result, we predict that in colloquial Mandarin, sā (rather than sān gè) is going to be the spellout of (85), which is the case.
On the other hand, if complement movement were ordered before direct spellout, we would derive the form with the classifier (i.e., sān gè), despite the presence of the suppletive portmanteau sā ‘three.clf’. Therefore, if the ordering of the steps were different, we would fail to derive the colloquial form even though it is contained in the lexicon, which is clearly the wrong result. In sum, we found support for the proposed ordering of the two relevant steps of the spellout algorithm.
Most curiously, some additional facts show that even though colloquial Mandarin prefers the monomorphemic spellout sā ‘three.clf’ over sān gè ‘three clf’, this is only so if sān and the Cl head form a constituent. Since this is something that is predicted by the phrasal-spellout account of suppletion, let us discuss the relevant facts in the remainder of this section.
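The role of the ordering can be made concrete with a small Python sketch. It is an informal illustration only: entries are encoded as bottom-up feature tuples (our simplification), the function `spell_out_clp` and its `order` parameter are hypothetical, and Spec movement is left out because the complement of Cl has no Spec here.

```python
def matches(entry, structure):
    """Simplified Superset Principle: structure is a bottom-up prefix of entry."""
    return entry[:len(structure)] == structure

def best(structure, lexicon):
    """Matching form with the fewest superfluous features, or None."""
    cands = [(len(e) - len(structure), f)
             for f, e in lexicon.items() if matches(e, structure)]
    return min(cands)[1] if cands else None

def spell_out_clp(lexicon, order):
    """Spell out [ClP Cl [NumP Num Scale]], trying the options in `order`."""
    numeral = ("Scale", "Num")
    for option in order:
        if option == "direct":            # whole ClP in one lexical item
            form = best(numeral + ("Cl",), lexicon)
            if form:
                return [form]
        elif option == "complement":      # NumP moves, remnant ClP is spelled out
            form = best(("Cl",), lexicon)
            if form:
                return [best(numeral, lexicon), form]
    return None

COLLOQUIAL = {
    "sā": ("Scale", "Num", "Cl"),   # cf. (81)
    "sān": ("Scale", "Num"),        # cf. (82)
    "gè": ("Cl",),                  # cf. (83)
}

print(spell_out_clp(COLLOQUIAL, ("direct", "complement")))  # proposed ordering
print(spell_out_clp(COLLOQUIAL, ("complement", "direct")))  # reversed ordering
print(best(("Scale", "Num"), COLLOQUIAL))                   # arithmetical use
```

With the proposed ordering the sketch returns the portmanteau sā, whereas the reversed ordering returns sān followed by gè even though sā is in the lexicon. The last line reflects the prediction behind (84): for the abstract-counting structure, sān beats sā because it has no superfluous features.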
The relevant configuration arises when the numeral ‘three’ is a part of a transparent compound numeral, such as the Mandarin equivalent of ‘fifty three’. The structure of such numerals argued for in He (2015, 200) is as in (87). In this tree, we only elaborate on the fine structure of sān ‘three’, while the rest of the structure is simplified. The head that contributes the additive semantics is represented as &, following He.
Against this background, consider what our theory predicts when the Cl head is merged on top of the compound-numeral structure, as in (88). The spellout algorithm requires that ClP be spelled out. The first attempt is to spell out ClP directly. This will fail in (88): even though the numeral sā can spell out Cl along with the NumP, the matching principle allows this only if the constituent formed by the Cl head and the NumP is contained in the lexical entry of the numeral; recall the Superset Principle in (62). This is obviously not the case in (88), where Cl and the NumP of ‘three’ do not form a constituent to the exclusion of the rest of the compound; therefore, we predict that sā ‘three.clf’ cannot appear in the compound numeral ‘fifty three’.
The prediction is borne out. As the data in (89a) show, it is indeed impossible to use sā ‘three.clf’ inside the compound numeral. Instead, the analytic combination sān gè is found, as in (89b).
a.  *wǔ  shí  sā  xuéshēng 
    five  ten  three.clf  student 
    ‘fifty three students’ 
b.  [wǔ  shí  sān]  gè  xuéshēng  
    five  ten  three  clf  student  
    ‘fifty three students’  Colloquial Mandarin 
The example in (89b) (where sān is directly followed by gè) shows that the numeral sā does not arise as a result of a simple phonological or morphological rule that operates on the basis of linear adjacency between sān ‘three’ and gè ‘clf’. Rather, it operates on the basis of syntactic constituency, which is predicted by our account where suppletive object-counting numerals like sā arise as a result of phrasal spellout.
Finally, let us turn to the issue of how we can generate the correct form in (89b). Note that when the spellout algorithm fails to directly spell out ClP in (88), it proceeds to movements. The first movement to be tried is Spec-movement, recall (75b). The complement of Cl in (88) has a Spec, namely the phrase ‘fifty’. However, even if that phrase is moved out, the & head (responsible for addition) is in the way of successful spellout of Cl and the NumP. Therefore, the option that ultimately succeeds is the one that moves the complement of the Cl head to the left, as schematically depicted in (90). This gives rise to the structure in (91) (with the trace removed), and the remnant ClP is spelled out by the classifier, see (92).
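The constituency argument can be illustrated with a small Python sketch in which trees are nested tuples of the form `(label, child, ...)`. The shape we give to the compound numeral below is a deliberately simplified stand-in for the structure in (87)–(88), and the placeholder `"FIFTY"` as well as all function and variable names are hypothetical.

```python
def subtrees(tree):
    """Yield every constituent of a tree encoded as (label, child, ...)."""
    yield tree
    if isinstance(tree, tuple):
        for child in tree[1:]:
            yield from subtrees(child)

def matches(entry, node):
    """Superset Principle: the entry contains the node as a constituent."""
    return any(node == sub for sub in subtrees(entry))

NUMP = ("NumP", "Num", ("ScaleP", "Scale"))
SAN = NUMP                      # abstract-counting 'three'
GE = ("ClP", "Cl")              # general classifier
SA = ("ClP", "Cl", NUMP)        # colloquial portmanteau 'three.clf'

simple = ("ClP", "Cl", NUMP)                          # Cl on top of 'three'
compound = ("ClP", "Cl", ("&P", "FIFTY", "&", NUMP))  # Cl on top of 'fifty three'
after_spec = ("ClP", "Cl", ("&P", "&", NUMP))         # 'fifty' moved out
remnant = ("ClP", "Cl")                               # complement moved out

print(matches(SA, simple))      # direct spellout by sā succeeds
print(matches(SA, compound))    # direct spellout fails inside the compound
print(matches(SA, after_spec))  # the & head still intervenes
print(matches(GE, remnant))     # remnant ClP is spelled out by gè
```

In this encoding, sā matches only when Cl and the NumP of ‘three’ form a constituent by themselves. Inside the compound, neither direct spellout nor Spec movement yields a match, and the remnant ClP left by complement movement is matched by gè but not by sā.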
As a final point in this section, we note (following He 2015) that the data discussed above support the idea presented in Section 3.4 that, in at least some cases, compound numerals form a complex constituent composed of abstract-counting numerals. Only after the compound numeral is formed is Cl merged in order to turn the abstract-counting compound numeral into the corresponding object-counting numeral. (This does not exclude the possibility that in other cases, the structure may be more complex, recall footnote 12.)
7.5 Complex syncretic numerals
In this section, we turn to complex numerals. Recall that complex numerals are bimorphemic already at the abstract-counting level. In the current model, this entails that they spell out Scale and Num separately. Consider, for instance, the case of the Sanzhi Dargwa numeral xu-jal ‘five’. This complex numeral can be used both as an abstract-counting numeral and as an object-counting numeral, recall (39)–(40). We have therefore classified it as a complex syncretic numeral.
The lexical entries needed to capture this pattern are given in (93)–(94). The first lexical entry (namely xu ‘five’) in (93) spells out just Scale (reflecting the intuition in Table 9). The remaining features are included in the lexical entry of jal, see (94).^{ 27 }
In order to see how these lexical entries lead to the result xu-jal in both object-counting and abstract-counting uses, consider the derivation below. We start by assembling the lowest Scale₅ node and we spell it out as xu, using the lexical item (93). Once the spellout of Scale₅ succeeds, the abstract-counting structure is derived by the addition of Num, see (95).
The NumP in (95) fails to spell out, because there is no match. (Neither the entry for xu in (93) nor the entry for jal in (94) contains the structure.) Therefore, we must use the other options of the spellout algorithm, specifically complement movement. (96) shows the configuration after Scale₅ undergoes complement movement from below Num (the trace is ignored). In this configuration, the remnant NumP node may be spelled out by jal. This correctly yields the form xu-jal in (97) as the spellout of the abstract-counting structure.
Consider now how the object-counting numeral is derived. Following the algorithm, we first merge the feature Cl on top of the abstract-counting numeral. The abstract-counting numeral is in (97), and merging Cl yields (98). Spellout without movement fails here, since the top node of (98) (ClP) is not contained inside any single entry.
Therefore, as customary by now, other clauses of the spellout algorithm are invoked. The first option to be tried is Specifier movement, recall (75b). The movement is shown in (99).
The result of Spec movement (with the trace ignored) is given in (100). After Spec movement, we have to retry ‘spell out FP’, where FP corresponds to the remnant ClP. This ClP is contained inside the lexical entry of the suffix jal, and the suffix is therefore inserted, see (101). Note that as a part of the process, the ‘old’ spellout of NumP in (100) is replaced by the ‘new’ spellout of ClP in (101). This is called overriding in Nanosyntax. Overriding is a standard part of the framework (see Starke 2009 and other Nanosyntax references); here we only make it explicit.
Before we conclude, let us make a general remark about syncretism. In the current theory, syncretism is always the result of the fact that one and the same lexical item may spell out two different structures. In the simplex syncretic pattern (e.g., three), the numeral either spells out all three meaning components of object-counting structures, or just Num and Scale in the abstract-counting use. In the complex pattern, the situation is very similar. The ambiguity boils down to the fact that the suffix jal either spells out only the Num feature, or Num+Cl. This is also as coded in Table 9.
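The derivation of the Sanzhi Dargwa numeral can be traced step by step in a small Python sketch. As before, this is an informal illustration under our own simplifying assumptions: entries are bottom-up feature tuples, no entry contains a complex left branch, and the subscript on Scale is omitted. The names `derive` and `DARGWA` are hypothetical; the function also records which option of (75) succeeded in each cycle.

```python
def matches(entry, structure):
    """Simplified Superset Principle: structure is a bottom-up prefix of entry."""
    return entry[:len(structure)] == structure

def best(structure, lexicon):
    """Matching form with the fewest superfluous features, or None."""
    cands = [(len(e) - len(structure), f)
             for f, e in lexicon.items() if matches(e, structure)]
    return min(cands)[1] if cands else None

def derive(features, lexicon):
    """Cyclic spellout; returns (exponents, option used in each cycle)."""
    chunks, log = [], []              # chunks: (features, exponent), left to right
    for f in features:
        options = []
        if len(chunks) <= 1:          # (75a) direct spellout
            spine = (chunks[0][0] if chunks else ()) + (f,)
            options.append(("direct", [], spine))
        if len(chunks) >= 2:          # (75b) Spec movement
            options.append(("spec", chunks[:-1], chunks[-1][0] + (f,)))
        options.append(("complement", chunks, (f,)))   # (75c)
        for name, moved, remnant in options:
            form = best(remnant, lexicon)
            if form:
                # the remnant's new spellout overrides its previous one
                chunks = moved + [(remnant, form)]
                log.append(name)
                break
        else:
            raise ValueError(f"no spellout for {f}")
    return [form for _, form in chunks], log

DARGWA = {"xu": ("Scale",), "jal": ("Num", "Cl")}   # cf. (93) and (94)

print(derive(("Scale", "Num"), DARGWA))        # abstract counting
print(derive(("Scale", "Num", "Cl"), DARGWA))  # object counting
```

Both structures come out as xu followed by jal. The abstract-counting form uses complement movement in the Num cycle, and the object-counting form adds a Spec-movement cycle in which jal, now spelling out Num and Cl, overrides its earlier insertion for Num alone.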
7.6 Complex suppletive numerals
Let us now consider complex suppletive numerals. The derivation of this type is very similar to the way complex syncretic numerals work; the only difference is that instead of having an ambiguous suffix (like jal), we will have two different suffixes: one for Num+Cl, and the other one for Num only.
The lexical entries needed for this are given below. They are modeled on the basis of Mokilese, recall (48)–(50). In Mokilese, we find the general classifier w in abstract-counting contexts, and we therefore hypothesize that numerals spell out only Scale, see (102). The general classifier w spells out Num, see (103). In (104), we give the lexical entry for the classifier men, which is suffixed to the numeral in the object-counting function when animates are counted. The crucial point is that men replaces w, and we therefore attribute to it both the feature Cl and Num.
The entries as given above will derive a pure suppletive system – which (as the reader may recall from the discussion in Section 4.3) we have not been able to find yet. For the sake of the discussion, we will nevertheless show how the derivation of such a pattern proceeds, using the entries as given above. Later, we shall also derive the actual attested patterns that mix suppletion and syncretism. (That will require us to say something about the difference between different classifiers.)
The derivation of a pure suppletive type proceeds in exactly the same steps as in the preceding Section 7.5. First, Scale is spelled out by the numeral. The feature Num is added. There is no single morpheme to spell out both, therefore Scale moves to the left and the remnant NumP is spelled out by w, see (105). Note that men can also be inserted here, but it loses in competition with w. Note also that the structure has the same abstract shape as the one used for Sanzhi Dargwa, recall (97).
The tree in (106) represents the object-counting numeral. It is abstractly also the same structure as in Sanzhi Dargwa, recall (101). The only difference between Mokilese and Sanzhi Dargwa is that Mokilese has a dedicated suffix for each of the right branches in (105) and (106), while Sanzhi Dargwa had just a single marker for both. This is then how an idealized complex suppletive pattern arises.^{ 28 }
Now in reality, recall from (49a) that the general classifier w can also appear (as an ‘unmarked’ classifier) in object-counting use. Therefore, we will have to attribute to it a lexical entry capable of spelling out Cl, which makes it – as things stand – identical to Sanzhi Dargwa jal. This is empirically correct, since both morphemes can actually be used in both object-counting and abstract-counting uses.
However, by attributing Cl to w, we also make it indistinguishable from men, which is wrong; men does not appear in abstract-counting contexts. The answer we suggest here emanates from the observation that classifiers themselves may be morphologically complex. To see that, consider the following dataset from Nepali. Nepali is a simplex stacking language. In the abstract-counting use, numerals are bare, see (107) (Turnbull 1982, 45).
dui  ekan  dui.  
two  onetimes  two  
‘Two times one is two.’  Nepali 
When numerals are used in the object-counting function, they must be accompanied by a classifier. Example (108a) shows the general classifier wota, while (108b) exemplifies the human classifier jana.
a.  ek  wota  kaka 
    one  clf.general  uncle 
    ‘one uncle’ 
b.  ek  jana  kaka 
    one  clf.human  uncle 
    ‘one uncle’ 
Interestingly, as pointed out by Allassonnière-Tang & Kilarski (2020, 129–130), the general classifier (though not other classifiers) agrees in gender with the head noun and shows up in two different shapes, namely wota for masculines and woti for feminines.
tin  wota  keto 
three  clf.generalm  boy 
‘three boys’ 
tin  woti  keti 
three  clf.generalf  girl 
‘three girls’ 
What these data show is that the general classifier wot (the realization of a pure Cl node) may be followed by additional (agglutinative) markers that further restrict the applicability of the general classifier to masculine and feminine nouns, respectively.
Extrapolating this observation to Mokilese, we may think of the non-general classifiers (that are opposed to a general classifier) as lexicalizing (by means of phrasal spellout) additional features apart from Cl. For instance, the Mokilese situation could be modeled as a combination of the following two entries, see (110) and (111).
(110) shows the updated entry for the general classifier. Compared to the original version in (103), it can also spell out the Cl head. With this lexical entry, it can appear in both object-counting and abstract-counting environments, cf. the Sanzhi Dargwa jal in (94). The animate classifier men has now been updated to include an additional Anim feature, which (unlike the general classifier w) imposes an additional restriction on the nature of the counted object. Note that since the classifier men also includes the features Num and Cl, it does not stack on top of the general classifier, but replaces it.
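The effect of the successive revisions of the Mokilese entries can be checked with a small Python sketch. It is an informal illustration only: entries are bottom-up feature tuples, and we assume, purely for the sake of the example, that Anim sits on top of Cl both in the entry of men and in the structure built when animates are counted. The names `ORIGINAL`, `NAIVE`, `UPDATED` and `winners` are hypothetical.

```python
def matches(entry, structure):
    """Simplified Superset Principle: structure is a bottom-up prefix of entry."""
    return entry[:len(structure)] == structure

def winners(structure, lexicon):
    """All matching forms that tie for the fewest superfluous features."""
    scored = {f: len(e) - len(structure)
              for f, e in lexicon.items() if matches(e, structure)}
    if not scored:
        return []
    low = min(scored.values())
    return sorted(f for f, s in scored.items() if s == low)

ORIGINAL = {"w": ("Num",), "men": ("Num", "Cl")}               # cf. (103), (104)
NAIVE = {"w": ("Num", "Cl"), "men": ("Num", "Cl")}             # Cl simply added to w
UPDATED = {"w": ("Num", "Cl"), "men": ("Num", "Cl", "Anim")}   # cf. (110), (111)

ABSTRACT = ("Num",)               # remnant NumP
OBJECT = ("Num", "Cl")            # remnant ClP
ANIMATE = ("Num", "Cl", "Anim")   # assumed remnant when animates are counted

for name, lex in [("original", ORIGINAL), ("naive", NAIVE), ("updated", UPDATED)]:
    print(name, winners(ABSTRACT, lex), winners(OBJECT, lex), winners(ANIMATE, lex))
```

With the naive revision, w and men tie in every context they match, which is the indistinguishability problem noted above. With the updated entries, w wins in abstract-counting and plain object-counting contexts, and men is the only match once Anim is present.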
7.7 Complex stacking
The final type to be addressed is the complex stacking type. In this type, exemplified by Vera’a, the abstract-counting numeral is complex, e.g., vō-ru(ō) ‘prfx-two’, and an additional marker stacks on top of the numeral in the object-counting function, e.g., ne vō-ru(ō). This can be modeled by proposing that each marker spells out exactly one feature as specified below.
We shall now run the derivation using the markers as specified above. As a caveat, we point out that if we take the entries to be as in (112)–(114), we shall generate a mirror-image suffixal paradigm (i.e., instead of ne vō-ru(ō) we shall get ne ru(ō)vō ne). This can be fixed by changing the shape of lexical entries into a different format that leads to prefixation (cf. Starke 2018, De Clercq & Vanden Wyngaerd 2018), but we do not take up this issue here, since the derivation of prefixal paradigms in Nanosyntax would take us too far afield.
With this caveat in place, our starting point is the fact that the derivation of the abstract-counting numeral will proceed as in Mokilese and Sanzhi Dargwa, yielding the tree in (115). The reason for this is that the numeral can spell out only Scale; for Num to be spelled out, Scale must move to the left, and the remnant NumP is spelled out by the marker vō.
Now the difference between Vera’a and the other languages is that Cl is spelled out by a separate marker still, see (114). (In Mokilese and Sanzhi Dargwa, Cl was spelled out jointly with Num.) This leads to the fact that the derivation is forced into another step of complement movement, producing the completely analytic structure in (116).
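The two steps of complement movement can be traced in the same informal Python setting used for illustration only. Entries are encoded as bottom-up feature tuples (our simplification), and the names `derive` and `VERAA` are hypothetical. In line with the caveat above, the sketch produces the suffixal mirror image of the attested order, since it does not model prefixation.

```python
def matches(entry, structure):
    """Simplified Superset Principle: structure is a bottom-up prefix of entry."""
    return entry[:len(structure)] == structure

def best(structure, lexicon):
    """Matching form with the fewest superfluous features, or None."""
    cands = [(len(e) - len(structure), f)
             for f, e in lexicon.items() if matches(e, structure)]
    return min(cands)[1] if cands else None

def derive(features, lexicon):
    """Cyclic spellout; returns (exponents, option used in each cycle)."""
    chunks, log = [], []
    for f in features:
        options = []
        if len(chunks) <= 1:          # (75a) direct spellout
            spine = (chunks[0][0] if chunks else ()) + (f,)
            options.append(("direct", [], spine))
        if len(chunks) >= 2:          # (75b) Spec movement
            options.append(("spec", chunks[:-1], chunks[-1][0] + (f,)))
        options.append(("complement", chunks, (f,)))   # (75c)
        for name, moved, remnant in options:
            form = best(remnant, lexicon)
            if form:
                chunks = moved + [(remnant, form)]
                log.append(name)
                break
        else:
            raise ValueError(f"no spellout for {f}")
    return [form for _, form in chunks], log

# One feature per marker, following the description of (112)-(114).
VERAA = {"ru(ō)": ("Scale",), "vō": ("Num",), "ne": ("Cl",)}

print(derive(("Scale", "Num"), VERAA))
print(derive(("Scale", "Num", "Cl"), VERAA))
```

Because no entry spells out Num and Cl together, Spec movement fails in the Cl cycle and complement movement applies a second time, which yields three separate exponents.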
8 Conclusions
In this article, we set out to ask the following question: what is the relationship between object-counting and abstract-counting numerals? Through the investigation of a number of languages, we concluded that object-counting numerals both syntactically and semantically contain abstract-counting numerals.
In order to model all the morphological effects that we encountered while comparing the two types of numerals, we have decomposed the numerals into three components, which we label Scale, Num and Cl. According to our proposal, Cl is missing in abstract-counting numerals, but it is found in object-counting numerals. This approach is in line with some earlier proposals such as Krifka (1995) and Hackl (2000), who derive the object-counting meaning of numerals from a number-denoting core.
We have further proposed that different patterns of marking come about as a result of the fact that numerals across different languages (and sometimes also within individual languages) lexicalize different sets of the three underlying formatives. We have provided an algorithmic implementation of this idea within the Nanosyntax framework.
Acknowledgments
We would like to thank the audiences at TripleA 7, SinFonIJA 13, CLS 57, AFLA 28 and Olinco 5 for their feedback. We also thank Bartosz Wiland, two anonymous reviewers and the guest editor, Irina Burukina, for their helpful suggestions and comments on the previous versions of this paper. Special thanks go to our informants and consultants: Mary Chimaobi Amaechi, Albert J. Borg, Viacheslav Chirikba, Maia Duguine, Nina Haslinger, Chang Liu, Catriona Malau, Stefan Schnell, Yasu Sudo, and Tue Trinh. All errors are, of course, our own responsibility. Marcin Wągiel’s work was supported by a Czech Science Foundation (GAČR) grant to the Department of Linguistics and Baltic Languages at the Masaryk University in Brno (GA2016107S).
References
Aikhenvald, Alexandra Y. 2000. Classifiers: A typology of noun categorization devices. Oxford: Oxford University Press.
Ajiboye, Oladiipo. 2005. Topics on Yorùbá nominal expressions. Doctoral dissertation. The University of British Columbia, Vancouver.
Ajiboye, Oladiipo. 2016. The Yorùbá numeral system. In Ozomekuri Ndimele and Eugene S. L. Chan (eds.) The numeral systems of Nigerian languages. Port Harcourt: M & J Grand Orbit. 1–25.
Allassonnière-Tang, Marc and Marcin Kilarski. 2020. Functions of gender and numeral classifiers in Nepali. Poznan Studies in Contemporary Linguistics 56(1). 113–168.
Bale, Alan and Jessica Coon. 2014. Classifiers are for numerals, not for nouns: Consequences for the mass/count distinction. Linguistic Inquiry 45(4). 695–707.
Barz, Richard K. and Anthony V.N. Diller. 1985. Classifiers and standardization: Some South and South-East Asian comparisons. In David Bradley (ed.) Papers in South-East Asian Linguistics No. 9: Language policy, language planning and sociolinguistics in South-East Asia. Canberra: Australian National University. 155–184.
Beck, David. 2004. A grammatical sketch of Upper Necaxa Totonac. Munich: LINCOM.
Bergsma, Fenna. 2019. Mismatches in free relatives – grafting nanosyntactic trees. Glossa: A Journal of General Linguistics 4(1). 119.
Borer, Hagit. 2005. Structuring sense I: In name only. Oxford: Oxford University Press.
Borg, Alexander. 1974. Maltese numerals. Zeitschrift der Deutschen Morgenländischen Gesellschaft 124(2). 291–305.
Borg, Albert J. 1987. To be or not to be a copula in Maltese? Journal of Maltese Linguistics 17(18). 54–71.
Bultinck, Bert. 2005. Numerous meanings: The meaning of English cardinals and the legacy of Paul Grice. Amsterdam: Elsevier.
Caha, Pavel. 2019. The Nanosyntax of case competition. Ms., Masaryk University. lingbuzz/004875.
Chierchia, Gennaro. 1998. Plurality of mass nouns and the notion of ‘semantic parameter’. In Susan Rothstein (ed.) Events and grammar. Dordrecht: Kluwer. 53–103.
Chirikba, Viacheslav A. 2003. Abkhaz. Munich: LINCOM.
Cinque, Guglielmo. 2020. Word order variation: Towards a restrictive theory. Ms., University of Venice, Venice.
Corbett, Greville G. 1978. Universals in the syntax of cardinal numerals. Lingua 46(4). 355–368.
De Clercq, Karen and Guido Vanden Wyngaerd. 2017. *ABA revisited: Evidence from Czech and Latin degree morphology. Glossa 2(1). 69: 1–32.
De Clercq, Karen and Guido Vanden Wyngaerd. 2018. Unmerging analytic comparatives. Jezikoslovlje 19(3). 341–363.
Elbert, Samuel H. and Mary Kawena Pukui. 1979. Hawaiian grammar. Honolulu: University of Hawaii.
Fábregas, Antonio. 2007. An exhaustive lexicalisation account of directional complements. In Monika Bašić, Marina Pantcheva, Minjeong Son and Peter Svenonius (eds.) Space, motion, and result. Nordlyd: Tromsø University Working Papers on Language & Linguistics 34(2). Tromsø: CASTL, University of Tromsø. 165–199.
Forker, Diana. 2020. A grammar of Sanzhi Dargwa. Berlin: Language Science Press.
Greenberg, Joseph H. 1978. Generalizations about numeral systems. In Joseph H. Greenberg (ed.) Universals of human language, Vol. 3. Stanford, CA: Stanford University Press. 249–295.
Hackl, Martin. 2000. Comparative quantifiers. Doctoral dissertation. Massachusetts Institute of Technology, Cambridge, MA.
Harrison, Sheldon. 2019. Mokilese reference grammar. Honolulu, HI: University of Hawaii Press.
He, Chuansheng. 2015. Complex numerals in Mandarin Chinese are constituents. Lingua 164. 189–214.
Hewitt, George. 2010. Abkhaz: A comprehensive self-tutor. Munich: LINCOM.
Hundius, Harald and Ulrike Kölver. 1983. Syntax and semantics of numeral classifiers in Thai. Studies in Language 7. 165–214.
Hurford, James R. 1998. The interaction between numerals and nouns. In Frans Plank (ed.) Noun phrase structure in the languages of Europe. Berlin: Mouton de Gruyter. 561–620.
Hurford, James R. 2001. Languages treat 1–4 specially. Mind & Language 16(1). 69–75.
Ionin, Tania and Ora Matushansky. 2006. The composition of complex cardinals. Journal of Semantics 23. 315–360.
Ionin, Tania and Ora Matushansky. 2018. Cardinals: The syntax and semantics of cardinal-containing expressions. Cambridge, MA: MIT Press.
Jacques, Guillaume. 2021. A grammar of Japhug. Berlin: Language Science Press.
Kalita, Jagat. 2011. The referring systems and the determinative elements of noun phrases in Assamese. In Gwendolyn Hyslop, Stephen Morey and Mark W. Post (eds.) North East Indian linguistics 3. New Delhi: Cambridge University Press India. 173–196.
Kennedy, Chris. 2001. Polar opposition and the ontology of ‘degrees’. Linguistics and Philosophy 24(1). 33–70.
Kennedy, Christopher. 2015. A ‘de-Fregean’ semantics (and neo-Gricean pragmatics) for modified and unmodified numerals. Semantics & Pragmatics 8. 1–44.
Kiparsky, Paul. 1973. ‘Elsewhere’ in phonology. In Stephen Anderson and Paul Kiparsky (eds.) A festschrift for Morris Halle. New York, NY: Holt, Rinehart & Winston. 93–106.
Klamer, Marian. 2010. A grammar of Teiwa. Berlin: Mouton de Gruyter.
Klamer, Marian, Antoinette Schapper, Greville Corbett, Gary Holton, František Kratochvíl and Laura C. Robinson. 2014. Numeral words and arithmetic operations in the Alor-Pantar languages. In Marian Klamer (ed.) The Alor-Pantar languages: History and typology. Berlin: Language Science Press. 337–373.
Krifka, Manfred. 1989. Nominal reference, temporal constitution and quantification in event semantics. In Renate Bartsch, Johan van Benthem and Peter von Emde Boas (eds.) Semantics and contextual expression. Dordrecht: Foris Publications. 75–115.
Krifka, Manfred. 1995. Common nouns: A contrastive analysis of Chinese and English. In Gregory N. Carlson and Francis Jeffry Pelletier (eds.) The generic book. Chicago, IL: University of Chicago Press. 398–411.
Landman, Fred. 2003. Predicate-argument mismatches and the adjectival theory of indefinites. From NP to DP, Vol. 1: The syntax and semantics of noun phrases. Amsterdam: John Benjamins. 211–237.
Link, Godehard. 1983. The logical analysis of plural and mass nouns: A lattice–theoretical approach. In Rainer Bäuerle, Christoph Schwarze and Arnim von Stechow (eds.) Meaning, use, and interpretation of language. Berlin: Mouton de Gruyter. 302–323.
Malau, Catriona. 2016. A grammar of Vurës, Vanuatu. Boston, MA: Walter de Gruyter.
Nouwen, Rick. 2016. Making sense of the spatial metaphor for number in natural language. Ms., Utrecht University. Available from lingbuzz/003100.
Obikudo, Ebitare F. 2016. Counting: The I ˙ b ˙ a ˙ ni ˙ way. In Ozomekuri Ndimele and Eugene S. L. Chan (eds.) The numeral systems of Nigerian languages. Port Harcourt: M & J Grand Orbit. 217–223.
Pham, Giang and Kathryn Kohnert 2008. A corpusbased analysis of Vietnamese ‘classifiers’ con and cái. MonKhmer Studies 38. 161–171.
PoChing, Yip and Don Rimmington . 2015. Chinese: A comprehensive grammar. London & New York, NY: Routledge.
Pukui, Mary Kawena and Samuel H. Elbert . 1986. Hawaiian dictionary: Hawaiian–English, English–Hawaiian. Honolulu, HI: University of Hawaii.
Qi, Jianjun and Chuansheng He . 2019. The morphosyntax of numerals dʑi33/dʑĩ35 ‘one’ in Shuhi and implications for the semantics of numerals. Lingua 225. 63–80.
Robinson, Laura C. and John W Haan . 2014. Adang. In Antoinette Schapper (ed.) The Papuan languages of Timor, Alor and Pantar, Vol. 1: Sketch Grammars. Berlin: Walter de Gruyter. 221–284.
Rothstein, Susan . 2010. Counting and the mass/count distinction. Journal of Semantics 27(3). 343–397.
Rothstein, Susan . 2017. Semantics for counting and measuring. Cambridge: Cambridge University Press.
Royer, Justin. 2017. Noun and numeral classifiers in Chuj (Mayan). In J. Nee, M. Cychosz, D. Hayes, T. Lau, and E. Remirez (eds.) Proceedings of the 43rd annual meeting of the Berkeley Linguistics Society, Vol. 2. Berkeley, CA: Berkeley Linguistics Society. 29–38.
Schnell, Stefan. 2011. A grammar of Vera’a, an Oceanic language of North Vanuatu. Kiel: Kiel University.
Schwarzschild, Roger and Karina Wilkinson. 2002. Quantifiers in comparatives: A semantics of degree based on intervals. Natural Language Semantics 10(1). 1–41.
Scontras, Gregory. 2013. Accounting for counting: A unified semantics for measure terms and classifiers. In Todd Snider (ed.) Proceedings of Semantics and Linguistic Theory, Vol. 23. Ithaca, NY: CLC Publications. 549–569.
Seuren, Pieter A. M. 1984. The comparative revisited. Journal of Semantics 3(1). 109–141.
Starke, Michal. 2009. Nanosyntax: A short primer to a new approach to language. Nordlyd 36(1). 1–6.
Starke, Michal. 2018. Complex left branches, spellout, and prefixes. In Lena Baunaz, Karen De Clercq, Liliane Haegeman, and Eric Lander (eds.) Exploring Nanosyntax. Oxford: Oxford University Press. 239–249.
Storch, Anne. 2014. Counting chickens in Luwo. In Anne Storch and Gerrit J. Dimmendaal (eds.) Number – constructions and semantics: Case studies from Africa, Amazonia, India and Oceania. Amsterdam: John Benjamins. 265–282.
Sudo, Yasutada. 2016. The semantic role of classifiers in Japanese. Baltic International Yearbook of Cognition, Logic and Communication 11(1). 1–15.
Turnbull, Archibald. 1982. Nepali grammar & vocabulary. New Delhi: Asian Educational Services.
Vanden Wyngaerd, Guido, Michal Starke, Karen De Clercq and Pavel Caha. 2020. How to be positive. Glossa 5(1). 23.
Wągiel, Marcin and Pavel Caha. 2020. Universal semantic features and the typology of cardinal numerals. Catalan Journal of Linguistics 19. 199–229.
Wiland, Bartosz. 2018. A note on lexicalizing ‘what’ and ‘who’ in Russian and in Polish. Poznan Studies in Contemporary Linguistics 54(4). 573–604.
Yue, Anne O. 2017. The Sinitic languages: Grammar. The Sino-Tibetan languages. London: Routledge. 114–163.
An anonymous reviewer points out that, in some cases, one can imagine contexts where the sentences that we mark as infelicitous can be used. While this is an interesting domain of inquiry (namely how to manipulate the context to make such sentences acceptable), this does not challenge our main claim. The claim is that there is a contrast between the two different uses of numerals; specifically, while one type of numerals is natural in the relevant examples, for the second type, one must work hard on the context to make them sound more natural.
Actually, Rothstein (2017) distinguishes in terms of semantic type between lower numerals and multiplicands such as hundred, but since this paper focuses mainly on basic cardinals such as three, we will ignore this intricacy.
One potential example is the German eins ∼ ein (both ‘one’); see Wągiel & Caha (2020) for a discussion and detailed analysis.
We would like to thank Maia Duguine for her judgements and the discussion of the Basque data.
We would like to thank Chang Liu for his judgements and the discussion of the Mandarin data.
Note also that the closely related Dungan language (Sinitic) employs only one generic classifier kə for all nouns and a similar tendency is also present in many Northern Chinese dialects (Yue 2017, 114–115).
We would like to thank Tue Trinh for his judgements and the discussion of the Vietnamese data.
For a corpus-based study of the distribution of cái, see Pham & Kohnert (2008).
The occurrence of m- is also accompanied by a prosodic change, namely the low tone is replaced by the high tone (Ajiboye 2016).
Ajiboye (2005, 235) states that both bare and prefixed forms can occur in some object-counting environments, but Ajiboye (2016, 1, fn 3) notes that nowadays many speakers prefer prefixed forms in all object-counting contexts. In any case, our informant accepts only m-numerals for object counting. We would like to thank Mary Chimaobi Amaechi for her judgements and the discussion of the Yoruba data.
It should be noted that Borg (1974) proposes that the distinction between tnejn and żewġ could be accounted for syntactically. Specifically, he proposes that tnejn is an independent form used in the absence of a noun, whereas żewġ can only be used when preceding an NP. For the sake of space, we will not dive into the details of Borg's and Hurford's accounts and we simply acknowledge that an alternative explanation of the pattern is possible. We would like to thank Albert J. Borg for his judgements and the discussion of the Maltese data.
Interestingly, the addition of the numerals 1–9 to 10 is done in a different way, where both numerals contain the ‘classifier’ á, as in (i).
ápààr  ŋwɔŋ=ɛ  ácíɛlɔ 
clf-ten  increase:tr=3sg  clf-one 
‘eleven’ lit. ‘ten increases one’ 
This suggests that languages probably allow for (at least) two different ways of creating additive numerals. The second option, recently argued for in Ionin and Matushansky (2006), is that additive numerals originate as complex coordinations of the sort shown in (iia), where the first noun is unpronounced (possibly due to Right Node Raising). This structure could be the source for (i), while the examples discussed in the main text would arise by employing the structure in (iib). It must be noted, though, that the structure in (iia) is only available for the formation of object-counting numerals.
a.  [ clf-numeral N ] & [ clf-numeral N ] 
b.  clf-[ numeral & numeral ] N 
The prefix can sometimes be realized as ‘a- and it can be dropped only in a counting sequence as in (‘e)kahi, (‘e)lua, (‘e)kolu… (‘one, two, three…’) (Elbert & Pukui 1979, 158).
We follow here Elbert & Pukui (1979) in using the label ‘distributive’, though at least some of the uses of numerals prefixed with pā- are clearly multiplicative.
The suffix has the shape -al after consonants, -jal after vowels and -el in the numeral ‘two’, apparently as a result of contraction (Forker 2020, 129–130).
The complex syncretic pattern is also found in Japhug (Sino-Tibetan; Jacques 2021, 233–234, 289–290, 425–426).
The allomorph ve appears only in the numeral ‘five’ (Schnell 2011, 73). Cardinals higher than ‘five’ are compound expressions with the structure liviX ‘five plus X’, where X is substituted by the relevant unprefixed numeral; e.g., liviru(ō) ‘seven’. According to our classification these would fall into the simplex category.
Schnell notes that the ligature is not always found with the numeral in the object-counting function, describing it as optional in this context. Concerning the ligature-free uses of the numeral, Schnell (2011, 74) says “that often the combination of noun and bare numeral is a fixed lexicalized expression.” In our account, we focus only on the pattern where the ligature is present inside NPs, and we leave an account of the optionality for future work.
A reviewer raises the possibility of a different kind of complex suppletive pattern, namely one where a morphologically complex abstract-counting form A+B would have a simplex suppletive object-counting form C. We did not find such a pattern. We shall return to this pattern in footnote 28.
Numbers in the superscript represent tones.
The numeral ‘two’ takes the suffix ʒ’a instead of j°ə(k’) and ba has the form bá in the numeral ‘eight’ and pa in ‘three’ as a result of phonetic assimilation (Chirikba 2003, 34). In multiplicatives, ba/j°ə(k’) are replaced by the suffix nt’°.
We would like to thank Viacheslav Chirikba for the discussion of the Abkhaz data.
The idea that numerals are at their core associated with an interval is compatible with their also having other uses, since the interval can be mapped onto various types of objects, e.g., its extreme point. We shall return to this later.
Notice, however, that nothing crucial hinges on the exact denotation of Cl. What is essential is the fact that it shifts an object of type n into a quantifying expression. As long as this is ensured, the proposed system is also compatible with other theories of the object-counting meaning.
An anonymous reviewer points out that our denotation for the Cl head (and its semantic type 〈n, 〈〈e, t〉, 〈e, t〉〉〉) raises the question of whether our system allows for the existence of classifiers that can be used without an accompanying numeral (e.g., grain). The general response is that care must be taken not to confuse the abstract meaning primitives like Cl that we are describing in this section with particular morphemes (like grain). The reason is that we adopt here a realizational approach to morphology (to be described shortly) where phonological strings like grain may, first, realize multiple meaning elements and, second, represent different sets of such features in different uses. Therefore, there is no entailment that classifiers, when they occur without numerals, have exactly the same semantic type as when they occur with a numeral. This is because at the core of any late-insertion approach lies the idea that individual phonological strings (like grain) can be inserted in multiple different environments. As a result, phonological strings (grain) may have no consistent semantic type across their different uses. For instance, the numeral five has a different type when it occurs as an abstract-counting numeral and when it occurs as an object-counting numeral.
Lexical items may of course also be associated with some concept C, but we disregard this here. The semantic information relevant for our purposes is not associated with lexical entries in some arbitrary manner, because it is an inseparable part of the grammatical features the lexical entries spell out.
In order for the spellout procedure to work smoothly, we are led to assume in (93) that the Scale node is actually complex and arises by merging two ingredients together. This complexity is reflected by the triangle under Scale in (93). The reason why we have to assume this is theory-internal. It has to do with the way the spellout algorithm works: it always starts by merging two features together, and only then does it pronounce the resulting FP. Therefore, if a morpheme pronounces Scale, the node must be internally complex. We do have some ideas about what these ingredients could be, but we shall not elaborate on them in any detail; see Vanden Wyngaerd et al. (2020) for a decomposition of adjectival scales that may be relevant here.
Recall from footnote 19 that a complex suppletive pattern could also feature a complex numeral A+B for abstract counting and a simplex suppletive numeral C for object counting. Such a pattern is unattested in the languages we looked at. In the system we explore, this gap could be ruled out in the following way: in order for C to spell out the object-counting function, it must be able to spell out all three meaning components Scale, Num and Cl. However, with such a lexical specification, C can also spell out the abstract-counting structure (characterized by Scale and Num). Now, given that a single morpheme C can spell out the abstract-counting structure without any movement, we do not expect to find a bimorphemic spellout by A+B, since the latter would require using spellout movements. And since non-movement spellout is preferred by the spellout algorithm, this hypothetical pattern is ruled out. See, however, Wągiel & Caha (2020) for a discussion of potential cases where the object-counting numeral appears to be simplex, while the abstract-counting numeral is complex.
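The logic of this argument can be illustrated with a toy model. The Python sketch below is only an illustration under simplifying assumptions and is not the spellout algorithm itself: structures are flattened into bottom-up feature tuples, matching is reduced to a prefix check standing in for superset-style matching, and spellout movement is approximated by splitting the structure into two chunks. The function names and the lexicons with entries A, B and C are hypothetical.

```python
# Toy model (assumption): structures are bottom-up tuples of features.
ABSTRACT = ("Scale", "Num")        # abstract-counting structure
OBJECT = ("Scale", "Num", "Cl")    # object-counting structure

def can_spell_out(entry_features, structure):
    """Simplified matching: an entry matches a structure if the structure
    is a bottom-up prefix of the features stored in the entry."""
    return entry_features[:len(structure)] == structure

def spell_out(lexicon, structure):
    """Return the morphemes pronouncing `structure`, or None.
    A single morpheme (no movement) is tried before any two-way split."""
    for name, feats in lexicon.items():
        if can_spell_out(feats, structure):
            return [name]
    # Stand-in for spellout movement: split into a lower and an upper chunk.
    for i in range(1, len(structure)):
        lower, upper = structure[:i], structure[i:]
        low = next((n for n, f in lexicon.items() if can_spell_out(f, lower)), None)
        up = next((n for n, f in lexicon.items() if can_spell_out(f, upper)), None)
        if low and up:
            return [low, up]
    return None

# Hypothetical lexicons: C stores all three features; A and B store one each.
LEXICON_WITH_C = {"A": ("Scale",), "B": ("Num",), "C": ("Scale", "Num", "Cl")}
LEXICON_WITHOUT_C = {"A": ("Scale",), "B": ("Num",)}

print(spell_out(LEXICON_WITH_C, OBJECT))       # C pronounces object counting
print(spell_out(LEXICON_WITH_C, ABSTRACT))     # C also wins here, so A+B never surfaces
print(spell_out(LEXICON_WITHOUT_C, ABSTRACT))  # A+B surfaces only if no C exists
```

In this toy setting, once C is specified for Scale, Num and Cl, it also pronounces the abstract-counting structure on its own, so the bimorphemic A+B option is never reached, which mirrors the reasoning given above.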