The numeral system(s) in Western Serengeti: Formal, functional, and historical inferences

In this study we offer a detailed synchronic and diachronic account of the hitherto un(der)documented numeral systems of the four closely related (Eastern) Bantu language varieties: Ikoma, Nata, Ishenyi, and Ngoreme – together forming the Western Serengeti subgroup. We describe the essentially identical formation and organization of numerals in these language varieties while also noting the morphosyntactic behaviour of numeral expressions and their extended uses. Based on an extensive quantity of comparative data, we furthermore disentangle the historical background to the numerals and their systematization in Western Serengeti, connecting this specific linguistic domain with the wider genealogical profile of this subgroup.


Introduction
In this article 1 we focus on four closely related and poorly described (Eastern) Bantu language varieties acknowledged as forming a genealogical subgroup, i.e. the Western Serengeti (WS) branch of the Mara subgroup of Great Lakes languages (Gibson & Roth 2019, Schoenbrun 1990). Three of the four language varieties constituting the WS group are Ikoma (ca. 15 000 speakers), Nata (ca. 11 500 speakers), and Ishenyi (ca. 9 500 speakers), jointly classified with the iso-code ntk and Guthrie code JE45 2 ; the final language variety is Ngoreme (ca. 55 000 speakers), classified as nqk and JE401, respectively (see Aunio et al. 2019). They are all spoken in the Serengeti district of the Mara region, North-Western Tanzania, an area situated between Lake Victoria to the west and the Serengeti National Park to the east. The Mara region is dense and diversified from a linguistic perspective. Including the members of the WS group, it consists altogether of roughly 20 Bantu language varieties, along with other languages of Nilotic descent.
This study accounts for the numeral system(s) of the WS group languages. Numeral systems have clear semantic delimitations and morphosyntactic behaviour (Rischel 1997, Hammarström 2010), and even in more rudimentary descriptions of Bantu languages, numerals are typically still included. Numerals in Bantu are a formally distinctive class with a separate type of agreement marking (Stappers 1965). Consequently, there are plenty of early historical-comparative works in Bantuistics which pay attention to the origin of numerals and the developments within numeral systems in Bantu languages (see inter alia Werner 1919: 133-143, Schmidl 1915, Meinhof 1948: 117-124, Meeussen 1967: 96-98, 105, 117, Meeussen 1969, Hoffmann 1953, Polak-Bynon 1965, Stappers 1965). However, numerals are seldom a topic in modern comparative Bantu studies (an exception is Pozdniakov 2018, which, however, is a broader study encompassing the whole (putative) Niger-Congo macro family) 3 . This is surprising, given the fact that the "domain of numerals presents a prime case of using structured groups of lexemes for assessing historical-comparative questions" (Güldemann 2018: 74). Similarly, descriptive works on single Bantu languages or on small subsets of languages usually do not offer reconstructions of the origins of numerals and their further evolution within the numeral system.
1 This research has been funded by the Kone Foundation. We wish to gratefully acknowledge their support. We also wish to thank our Western Serengeti language consultants, the Mara branch of SIL International and Tim Roth. We also thank Mary Chambers for polishing our English.
2 See subsection 3.1 for more information about this Bantu specific referential system.
3 See also Grimm (2019) for a critical review of this work.
Following the appeal by Rankin (2006), this study sets out to show how a historical-comparative approach is a particularly useful tool for the analysis and description of previously un(der)described varieties. Placing the WS numerals in a historical-comparative framework, this study sheds further light on the diachronic forces behind the system, consequently offering a more robust description of it. At the same time, describing the numeral systems of WS also provides extra data for further comparative (and typological) work, facilitating the drawing of more fine-grained generalizations and conclusions about this linguistic notion. To this we may add that the documentation of numeral systems in these Tanzanian language varieties is a particularly pressing matter, insofar as they are increasingly being replaced by numerals borrowed from Swahili, that is, the prominent language which is both the national and co-official medium of communication (see e.g. Legère 2006), and thus are at imminent risk of disappearing. This loss is a situation they share with many numeral systems of the world. Thus, Comrie (2005a) considers numeral systems of the world and their socio-cultural particularities a specifically endangered domain of languages.
In a fashion congenial to the bifocal descriptive-cum-comparative aim of this study, and in accordance with Blažek's (1999) three steps of numeral analysis, the article is organized in the following manner. In section 2 we present the overall numeral system with regard to the organization of ordinal and cardinal numerals, their agreement marking, and other formal and functional traits. In section 3 we focus on the historical-comparative background to this numeral system. In section 4 we offer a summary and some final conclusions.

Presenting the numeral system
In this study we define numerals, following Hammarström (2010: 11, see also Schapper & Klamer 2014), as "spoken normed expressions that are used to denote the exact number of objects for an open class of objects in an open class of social situations with the whole speech community in question" 4 . We furthermore treat these numeral expressions as being systematically arranged into a numeral system. As pointed out by Rischel (1997), numeral systems form a closed and relatively limited functional-semantic domain. That is to say, although an enormous number of different digits may indeed be formed in a language, there is only a limited and closed set of primitive (mostly low-valued) numerals, of which all other digits are merely complex derivatives. The numeral system is typically further subdivided into cardinals and ordinals and we follow that convention here when describing the numeral systems of WS. However, as will be further evidenced in this section, there are also several other functional traits associated with numerals in these language varieties.
4 Notice that by following this definition we exclude "inexact" numerals from this study, like the reflexes of the common Bantu quantifiers (see Zerbian & Krifka 2008) or different fractions ('half', 'quarter' etc.).

Cardinals
The simplex or basic cardinal numbers, as they occur in the WS language varieties, are presented in Table 1 5 . Note that there is no dedicated expression for 'zero', nor does any such expression occur in the formation of higher digits either. This is in line with the typical case in natural languages (see Greenberg 1978, Hurford 1987).
5 The WS examples cited in this paper are from a corpus of transcribed and analyzed recordings made during extensive fieldwork in the Mara region from 2008 to 2019. Some of the data used has been collected by SIL members. Bible quotations referred to also come from the work by SIL. As Ikoma is the only WS language with an approved orthography, the writing system in the WS examples here is phonological (IPA), with the exception of contrastive long vowels, which are written with double vowels. The Ishenyi vowel system appears to be going through a loss of phonemic ATR contrasts, and there is a lot of inter- and intra-speaker variation. To keep the data comparable across the WS varieties, Ishenyi is also transcribed with 7 vowels despite occasional inconsistencies (see Laine 2016). Only surface tones are marked (with an accent), and numeral stems are not marked for tone as tones can also be realized on the numeral prefix and not the stem. It should be noted that tone analysis for these languages is work in progress.
As seen in Table 1, the various numeral systems are more or less identical. The exceptions are, apart from some minor differences in vowel quality associated with more general differences in phonological structure, the extra word-medial nasal in '6' in Ngoreme. Additionally, word-final syllables in Ikoma may not be labialized, which induces the differing shapes of the numerals '1' and '1000' in this variety.
The similarity between the varieties is not surprising, given the fact that they are closely related, with an estimated lexical overlap of up to 85% between Ikoma, Nata, and Ishenyi and 77% between Ngoreme and Ikoma (Roth 2018: 12).
As further indicated in this table, the cardinal numeral system consists of both variable and invariable numerals. Compare example (1), where the numeral '3' agrees with the head noun, with (2), where the numeral '9' remains unaffected.
(1) Ngoreme
Bantu languages are characterized by having an extensive gender-like system, consisting of up to 20 noun classes, of which most are paired based on number (singular/plural). The WS language varieties form no exceptions to this characteristic trait. The variable numerals may take agreement with most of these different noun classes as they occur among the WS members. We use data from Nata to illustrate this fact in Table 2. (See also Table 3 in subsection 3.2).
[Table 2: Enumerative prefixes in Nata. Surviving cells: 5/6 ɾi-/a-, 7/8 ki-/βi-, 9/10 i-/i-]
All the other WS members behave in essentially the same way as Nata when it comes to numeral agreement. The Ikoma and Nata enumerative prefixes have the close back vowel /u/ for the numeral '1', and Ishenyi and Ngoreme the close mid back vowel /o/. Underlyingly, the vowel in Ikoma and Nata may also be the close mid vowel which is dissimilated from the −ATR stem vowel /ε/. In Nata, the −ATR vowel is still present in the stem -mwε, whereas Ikoma (as mentioned above) has lost the final vowel due to a later rule that prohibits labialization from occurring word-finally.
One exceptional feature of Nata is the regular singular-plural shift of the diminutive noun class 12 ka- to class 13 tu-, as in (3). The other WS members derive plurals of class 12 ka- with the prefix of class 19 hi- (another noun class dedicated to diminutives), as in (4).
(3) Nata
The enumerative class prefixes for classes 11 ɾu-, 15 ku-, 16 ha-, and 20 ɣu- are not included in this table as they only agree with the numeral '1'. Curiously, class 14 is also a singular class. However, head nouns in this class can still be modified with plural numerals, as evident in (5). Similarly, all WS members lack enumerative prefixes for the locative noun classes 17 and 18 (as well as for the almost completely obsolete locative noun class 25, cf. Grégoire 1975: 170-175). Consequently, they are not included in the table either. In fact, locative class agreement on numerals is generally rare given two conspiring facts. Firstly, the locative noun classes are overwhelmingly devoid of any inherent nouns. Instead, they are applied onto nouns of other noun classes to mark notions of location. Secondly, there is a general restriction in the WS which stipulates that any modifiers of such a noun derived with an additive locative class prefix do not agree with the locative but with the lexical noun class (Aunio et al. 2019). The only exception to this pattern, with locative enumerative agreement marking, occurs when the class 16 enumerative prefix is employed for marking agreement with either of the two nouns inherently belonging to class 16, that is, ahasé and ahaɣíɾo, both meaning 'place'. This is illustrated in (6).
'The one place which is there is mine.'
As is typical for Bantu languages (see e.g. Schadeberg 2003: 150), the enumerative agreement prefixes form a distinct paradigm of agreement markers. These prefixes, as represented in Table 2, are largely identical to the set of pronominal prefixes used for other nominal modifiers. The main differences are found in classes 4 (e-), 6 (a-), and 10 (i-), which lack an initial consonant in the enumerative form, unlike in the pronominal forms (which are ɣe-, ɣa-, and tʃe-, respectively).
We can illustrate this fact with example (7) from Ishenyi, where the same head noun, derived in noun class 10, triggers agreement in the noun phrase which is realized differently on the possessive pronoun than on the numeral. In (8) from Ikoma, it is the demonstrative and connective which agree differently from the numeral, all agreeing with the noun amaɲémbe 'mangoes' of noun class 6.
Regarding the invariable numerals, they are formed as nouns and are assigned to different noun classes. Only '7' belongs to noun class 3; '9', '10', and '100' instead belong to class 5/6. The highest basic digits of the system, '1000' and '100 000', belong to class 7/8. The noun class prefix of class 5 behaves irregularly with numerals associated with this class, unlike other nominal stems which regularly take a full CV shaped prefix ɾi(i)-. The numeral ɾii-ɣána '100' keeps this form, but for i-kómi '10' this prefix is reduced to i- and for kénde '9' the prefix is further reduced to zero. (But see subsection 2.2 where it is shown that the full prefix re-appears in ordinal constructions in Ishenyi.) Bantu languages are generally considered as having decimal-based numeral systems, given the fact that the base, that is, the numerical value used recursively to form other numerals, is a stem meaning 'ten'. Hence, the numerals 11-19, as well as the decades, are typically synchronically transparent complex constructions derived with 'ten' 6 . This system adheres to the pattern which is also by far the most common from a cross-linguistic perspective. As Comrie (2005b) notes: "We live in a decimal world". Thus, not surprisingly, the members of the WS group also have decimal-based numeral systems. Notice that, besides the decimal-based system, there is also a recurrent pattern across the Bantu speaking area indicative of a quinary or base-five system (for digits below 10). However, although some close relatives/neighbours have such a quinary system (mixed with a decimal system), this is not the case in the WS language varieties (see subsection 3.4 for a more elaborate account).
Greenberg (1978, 2000; see also Comrie 2005a, 2005b, Schapper & Klamer 2014) distinguishes between "additive" and "multiplicative" complex numeral constructions, describing the most common arithmetic operations applied to the base and other numeral components in the formation of numerals. The WS language varieties form their numerals 11-19 through an additive strategy. As apparent in Table 1, the word for the (cardinal) numeral '10' in all the WS languages is ikómi, which also forms the augend (i.e. the count base) in these case(s), the addend being any number in the series from 1-9. The comitative preposition/conjunction na 'and/with' functions as the "link", that is, the additive operator between the augend ikómi and the serialized addend. The full set of these numerals is illustrated in (9) from Ikoma, with class 1 and 2 agreement.
Notice that the final numeral in these examples agrees with the head noun and not with the base, as evidenced in (10). Example (11) from Ngoreme furthermore illustrates the analysability of the building blocks of these numerals. As seen at the end of this sentence, the augend base is used only once to cover a series of numerals and the comitative has been changed to a disjunctive coordinator 'or' (au, borrowed from Swahili) to signal the possible range of variation. Decades, that is, the numerals '20'-'90', are also decimal-based. However, they are formed differently in two respects. Firstly, and crucially, they are not formed with ikómi but with the alternative word miɾɔŋɡɔ (mu-ɾɔŋɡɔ in the singular, hence belonging to noun classes 3/4). Furthermore, they are formed through multiplication, that is, constructions where the decimal base (in this case miɾɔŋɡɔ) serves as a multiplicand and any number from 2-9 may serve as a multiplier. The relationship between the two components is marked via noun class agreement on the multiplier, which agrees with the multiplicand. Example (12) illustrates this with data from Ishenyi.
Numerals within these decades are formed with the combination of the multiplicative strategy and the additive strategy described for numerals 11-19. This is illustrated in (13) and (14).
foc-sp10-cop-loc 10-cow 5-how_many sp10-cop-loc 4-ten 4-two com-10-six
'How many cows are there? There are 26.'
Schadeberg (2003: 150) subdivides the Bantu cardinals into two further subtypes, that is, "referential" and "absolute" numerals, respectively. Whereas referential numerals (constituted by all examples provided up to now) are used for counting individuals or entities ('one/two/three X'), absolute numerals are dedicated to calculations ('one, two, three…'). Absolute numerals are always inflected in class 9/10 in all four varieties, for example, Ngoreme emwé '1', iβéɾe '2', isáto '3'. In contrast with a common trait in other Bantu languages (Vanhoudt 1994), there is no difference in the formal realization of the word stem '1' when used as an absolute numeral relative to its use as a referential. One potential exception is found in Ikoma, where the final vowel in the stem for '1' fluctuates between /u/ and /a/ in the formation of '11' and other additive numeral constructions with '1' as the addend. Compare (9) above with aβáána βaatʃe ikómi na úmwa 'his eleven sons' (Genesis 32:22).
The forms used for expressing '100', '1000', and '100 000' are also included in Table 1 as they are simplex numerals which also serve as bases for other complex numeral derivatives 7 . The use of both '100' and '1000' is illustrated in (15) below.
sp2-nar-kill-fv aug-6-hundred aug-6-hundred until maybe aug-7-thousand 7-one
'The Maasai got drunk and slept, then they killed them, they killed hundreds, hundreds up to maybe a thousand.'
7 We follow the criteria by Schapper & Klamer (2014) that the lowest recursively occurring base designates the system. Hence, as this is '10' in the WS group, the numeral systems of these languages are to be treated as decimal-based.
Notice that the disparate formal realization of '1000' in Ikoma, as seen in this example, is due to the same phonological phenomenon as discussed for '1' above (intrinsically illustrated in this example as well). Thus, what in the other varieties is pronounced with a glide, ɣe-kwé, is pronounced without word-final labialization in Ikoma.
These higher digits may also be multiplied and serialized as bases using a strategy parallel to that described for forming decades.
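The additive and multiplicative strategies described in this subsection can be summarized as a simple arithmetic decomposition. The following is only an illustrative sketch for values 1-99: the function names `decompose` and `gloss` are our own, agreement prefixes are left out, and numeral stems other than ikómi, miɾɔŋɡɔ, and the link na are represented by their quoted values rather than by actual WS forms.

```python
def decompose(n: int) -> dict:
    """Split 1 <= n <= 99 into base, multiplier and addend.

    Hypothetical helper, not taken from any described analysis tool.
    """
    if not 1 <= n <= 99:
        raise ValueError("this sketch only covers 1-99")
    tens, units = divmod(n, 10)
    if tens == 0:
        # Simplex numerals 1-9: no base involved.
        return {"base": None, "multiplier": None, "addend": units}
    if tens == 1:
        # 10-19: augend ikómi, with an optional addend (additive strategy).
        return {"base": "ikómi", "multiplier": None, "addend": units or None}
    # 20-99: multiplicand miɾɔŋɡɔ times a multiplier 2-9, plus optional addend.
    return {"base": "miɾɔŋɡɔ", "multiplier": tens, "addend": units or None}


def gloss(n: int) -> str:
    """Render the structure as 'base multiplier na addend' (values quoted)."""
    parts = decompose(n)
    out = []
    if parts["base"]:
        out.append(parts["base"])
    if parts["multiplier"]:
        out.append(f"'{parts['multiplier']}'")
    if parts["addend"]:
        if out:
            out.append("na")  # comitative link between augend and addend
        out.append(f"'{parts['addend']}'")
    return " ".join(out)


print(gloss(26))  # miɾɔŋɡɔ '2' na '6'
```

The output for 26 mirrors the order of the gloss '4-ten 4-two com-10-six' given above, with the multiplicand first, then the multiplier, then the linked addend.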

Ordinals
As is common across the Bantu speaking area (see Van de Velde 2013), ordinals are formed as "numeral possessives" (Schadeberg 2003: 150), that is, in a complex construction consisting of a head noun and the connective followed by the numeral 8 . This construction is illustrated in (17).
8 The connective marker is commonly used in Bantu languages to connect a head noun with another modifying nominal constituent. The connective is formally different in the WS group from the canonical Bantu reflex -a (see Aunio et al. 2019: 517-518 for further details).
12-four
'Then came another (second) young man […] Then came another (third) young man […] Then came another (fourth) young man.'
Several facts regarding the ordinal construction may be deduced from these examples. Firstly, we may note from example (19) that other modifiers may intervene between the head noun and the connective construction containing the ordinal within the noun phrase. Secondly, in (17), we see that the full CV-shaped noun prefix of class 5, which is otherwise reduced to /i-/ in ikómi '10', re-occurs within ordinal connective constructions in Ishenyi. Thirdly, in examples (18) and (19) we see that, in addition to the connective, the numeral is inflected with a prefix ka-/ɣa-, the variation in consonant realization being conditioned by Dahl's Law 9 . However, if the numeral is the invariable '7', '9', or '10', this prefix may not surface. Thus, compare the different realizations of '5', '6', and '7' in (20). The omission of ka-/ɣa- with invariable numerals is further illustrated in (20) and (21). Similar to what has been pointed out by, for example, Stappers (1965) for several other (Eastern) Bantu languages, the prefix ka-/ɣa- is also used for deriving multiplicatives of numerals in the WS language varieties. This is illustrated in (22) and (23).
9 Dahl's law refers to a type of dissimilation process common in North-Eastern Bantu, where voiceless stops are voiced when the succeeding syllable also contains a voiceless consonant. See Davy & Nurse (1982).
The ordinal 'first' constitutes the only exception to the pattern in the WS group where ordinals are directly derived from cardinals. As is common cross-linguistically and in Africa in general (Stolz & Veselinova 2013), as well as in Bantu in particular (Polak-Bynon 1965), 'first' is instead formed through suppletives, that is, derivationally independent forms.
Whereas Ikoma, Ishenyi, and Nata form the ordinal 'first' exclusively with the adjective -mbεɾε, as evidenced in (24) to (26), Ngoreme forms it with either -mbεɾε (27) or kwánsa (28). Whereas -mbεɾε acts as an adjective and takes regular nominal agreement, kwánsa is formed with the use of the connective marker.
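The Dahl's Law conditioning of the prefix ka-/ɣa- mentioned in this subsection can be sketched as a choice between two allomorphs. This is a minimal illustration under stated assumptions only: the function name `ordinal_prefix` is our own, the list of voiceless consonants is a simplified guess rather than an established WS inventory, the rule is approximated by looking at the stem-initial consonant, and the stem shape -tato is used as a hypothetical voiceless-initial stem.

```python
# Assumption: a simplified set of voiceless consonants; the actual
# inventories of the WS varieties may differ.
VOICELESS = ("tʃ", "p", "t", "k", "s", "h", "f", "ʃ")


def ordinal_prefix(stem: str) -> str:
    """Pick ɣa- before a voiceless stem-initial consonant, ka- elsewhere.

    Approximates Dahl's Law: the voiceless stop of ka- is voiced when
    the following syllable begins with a voiceless consonant.
    """
    bare = stem.lstrip("-")
    return "ɣa-" if bare.startswith(VOICELESS) else "ka-"


print(ordinal_prefix("-βeɾe"))  # ka-  (stem-initial consonant is voiced)
print(ordinal_prefix("-tato"))  # ɣa-  (hypothetical voiceless-initial stem)
```

The sketch deliberately ignores the invariable numerals '7', '9', and '10', with which the prefix may not surface at all.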

Functional traits: Morphosyntactic behaviour and extended uses
Insofar as the numerals have a particular set of agreement prefixes, they may be morphologically defined as constituting a word category of their own (cf. Schadeberg 2003, Stappers 1965, see also Greenberg 2000). However, as pointed out in subsection 2.1 above, some numerals do not inflect for agreement. Moreover, their morphosyntactic behaviour is in many ways identical to that of other adnominal modifiers. Cardinals behave identically to adjectives (see Van de Velde 2019) and ordinals may be subsumed within a more general framework of connective constructions (as touched upon already in subsection 2.2). Thus, in accordance with the general head-driven typological structure characterizing Bantu languages, where modifiers tend to follow the head they modify (Van de Velde 2019), the numeral typically follows the noun it modifies in the WS language varieties (as may be deduced from all the previous examples in this article). With that said, however, a numeral seems to be allowed to occur relatively freely in a clause. As evident in the various realizations of the same proposition in (29), a numeral can even precede the head noun.
Another feature in need of further investigation is the fact that variable numerals generally do not take the augment in WS. The augment is a functionally elusive nominal pre-prefix whose presence is dependent on a number of factors often connected to notions such as specificity, topicality, and definiteness (at least diachronically, see de Blois 1970, Van de Velde 2019). In other Bantu languages, a numeral automatically carries an augment if the governing noun has one (cf. de Blois 1970). Furthermore, just as with adjectives, it is common to add the augment on numerals to make them nominalized and non-restrictive (Van de Velde 2019: 262-263). However, in neither of these two contexts does an augment occur in the WS language varieties: the former is evident in (1) above, just to mention one of several examples in the paper illustrating this fact; and the latter is illustrated in (30) below. Note that the invariable nominalized numerals do not adhere to these restrictions but take the augment like any other noun, as evident for example with a-ma-ɣána 'hundreds' and e-ɣe-kú 'thousand' in (15) above (but where ki-mu '1' modifying 'thousand' occurs without an augment).

b) m-ba-ɣɛɣ-iɾɛ e-βi-kɔ́mbɛ βi-βeɾe βi-βeɾe
foc-sp2-carry-pfv aug-8-cup 8-two 8-two
'They carry 2 cups (each).'
Forming distributive numerals through reduplication is a typologically common strategy, including in the Bantu family, as shown by Gil (2013). According to this author, the motivation behind its ubiquity is iconicity, as the reduplication directly corresponds to the conceptualization of a multiple set of entities.
Another type of numeral derivative, also described for the related language Gusii (Cammenga 2002: 349-351), comes from the use of plural referent agreement with the ordinal '1'. As illustrated in (32)  Besides this meaning, however, this form has additional distinctive functions, namely as a modal adverbial expressing epistemic possibility (34) and as a disjunctive coordinator 'or' (35).

(34) Ishenyi
hamwe ha-ká a-aɾe, ne-ku-ɾóɾ-a i-βásikeli e-etʃe
perhaps 16-home sp1-cop sp1sg-ipfv-see-fv 9-bicycle 9-poss3sg
'Perhaps s/he is at home, I see her/his bicycle.'
Finally, we may note that numerals are found in lexicalized constructions such as in compounds referring to the names of the days of the week, as illustrated in (36). Notice that the counting of days starts from Sunday (unlike in the co-official and national language Swahili, where the first day of the week is Saturday).

Historical-comparative implications
After having described the formal and distributional characteristics of the numerals in the WS language varieties, in this section we attempt to unravel the semasiological background of these forms and to account for the further historical implications that this gives rise to 10 . To facilitate this task, we first have to explain the classificatory and genealogical particularities of the WS group.

Classificatory profile and genealogical background of the WS group
The Bantu languages are most commonly classified through an alpha-numeric referential system first developed by Guthrie (1948, 1967-1971) and later updated by Maho (2003, 2009). In this system, consisting of zones (letters), further divided into groups (decimals) and individual languages (numbers), the Mara languages, to which the WS language varieties belong, were initially classified within zone E40, but were later reclassified together with several other languages spoken around Lake Victoria, previously belonging to zone D and E, into a zone J. Pace Maho (2003, cf. Philippson & Grollemund 2019), these languages are typically referred to with J as their first letter followed by their original classification, hence Mara = JE40. The Guthrie classification system is primarily a geographical and not a genealogically based system. However, with the exception of JE41 Rogoori (which was re-assigned to the Luhya cluster), the Mara group or JE40 is often treated as a valid subgroup on genealogical grounds as well (see e.g. Nurse 1999). From a genealogical standpoint, primarily based on lexico-statistics (see Nurse & Philippson 1980, Schoenbrun 1990, see also Hill et al. 2007) 11 , the Mara branch has been categorized as belonging to the East Nyanza group, which is in turn a subgroup of the Great Lakes (GL) languages. For a visual representation of the genealogical relationship see Roth (2018: 111) (which is a slightly adapted version of that of Schoenbrun 1997: 12-13). The GL (also known as the (Inter)Lacustrine) group corresponds with zone J.
Two facts need to be highlighted that are of importance for the following discussion. Firstly, Nurse (1999: 27-28) points out that the GL languages on the eastern side of Lake Victoria, that is, the East Nyanza and (Greater) Luhya groups, are linguistically similar in a manner which is at the same time different from the other GL languages. This is surprising, as these groups are typically treated as being relatively distantly related. On the other hand, geographically they occupy contiguous areas, renowned for extended contacts. Secondly, as touched upon already in section 1, the area where the Mara languages, including the WS language varieties, are spoken is also characterized by contact with other non-Bantu linguistic communities (Ehret 1971, Shetler 2003: 11-14, 288, Dimmendaal 1995, Nurse 1999). In the present day this contact situation is primarily with South and West Nilotic languages (Datooga and Luo, respectively), but historically there has also been contact between ancestors of the WS group and Cushitic and earlier Nilotic linguistic communities (e.g. East Nilotic Maa).
The GL languages are, in turn, part of an Eastern Bantu group whose ancestors separated from their Western counterpart(s) somewhere in the Congo region roughly 2000 years ago (Grollemund et al. 2015). When referring to Eastern Bantu, we follow the latest phylogenetic classification of such a group, as provided by Grollemund et al. (2015), which, nonetheless, corresponds "fairly well" (Philippson & Grollemund 2019: 346) with previous attempts at such a classification. We will differentiate between reconstructible shared material and patterns within Eastern Bantu and what is reconstructible, or has indeed been reconstructed, for Proto-Bantu, that is, the earliest ancestor of all Bantu languages.
Zooming in from the macro- to the micro-level, the Mara branch itself is divided into a South and North Mara (see Schoenbrun 1990). Gibson & Roth (2019) and Roth (2018: 110-111) argue for a further split of South Mara into a SW Mara subgroup (containing Ikizu and Zanaki) and the Western Serengeti, that is, the group of language varieties under consideration in this study.

On the enumerative prefixes
Before any discussion of the actual numeral forms, a brief comment on the history of the enumerative agreement prefixes should be given. Schadeberg (2003: 150) notes that the reconstructions for these prefixes to Proto-Bantu are "somewhat shaky" and that there is typically interference with the set of pronominal and nominal prefixes. With that said, the WS members chiefly adhere to the reconstructed set of enumerative prefixes (Meeussen 1967: 97, see also Stappers 1965), as is made further evident in Table 3. The shakier reconstructions are indicated by question marks, in analogy with how they are represented in Meeussen's (1967: 97) reconstructions. Classes 17 and 18 are not included in this table as the WS language varieties have lost the reflexes of these enumerative prefixes (see also subsection 2.1). Conversely, the enumerative prefix of class 20, which exists throughout the Mara group (and is thus doubtlessly a shared retention in the WS subgroup), is not included in this table as it is not reconstructible for Proto-Bantu (see Maho 1999: 253).
An interesting exception, however, is the class 3 marker in Ngoreme, which has the velar onset (characteristic of the pronominal prefix) for numerals as well. As Schadeberg (2003: 150) considers the reconstruction of the class 3 enumerative prefix *ú- as "less certain" (compared to that of class 10 for example), the question should be raised whether this is due to "interference" or whether it should be taken as counterevidence against the suggested reconstruction.

Numerals 1-5
For the reconstruction of the numerals, we start with the numerals 1-5, which are all simplex (monomorphemic) and variable stems in the WS language varieties. These numerals can all be straightforwardly connected with the forms reconstructed for Proto-Bantu (see Meeussen 1967: 105, Schadeberg 2003), that is, *moi '1', *bɪli '2', *-tátʊ '3', *-nai '4', and *-táano '5'. In fact, all forms for the numerals 1-5, with the exception of 2, are further reconstructible for an assumed Proto-Niger-Congo (see Pozdniakov 2018: 293, 313). Regarding the expression of the numeral '2', which is split between a western and an eastern form in Bantu, the WS languages pattern with the Eastern Bantu languages (see for example Guthrie 1961-1971) in having a reflex of *bɪli and not *bali. Furthermore, we may note that the WS varieties seem to have levelled out the formally distinctive absolute number *mʊ-oti reconstructed for Proto-Bantu by Vanhoudt (1994), using the reflex of the referential numeral also for calculations. Interestingly, it would seem that the conditioned lenition of the initial consonant of the numerals '3', '4', and '5', when occurring with the high front class 10 enumerative prefix in the WS varieties, is also a retention from Proto-Bantu. Thus, Meeussen (1967: 105) notes for Proto-Bantu (orthography slightly altered, see f.n. 9): "In class 10 the prefix has to be set up as i-[…] with a peculiar representation in at least two stems: icátu 'three', icáano 'five' (and inyai 'four'?)". From a micro-comparative perspective, it is interesting to note the formal variation of the morphophonological realization of this specific feature as found across the East Nyanza languages, namely Mara and Suguti. In the Suguti languages (including Kwaya, which otherwise tends to pattern with the WS group, see e.g. subsection 3.4), the reflexes of '4' and '5' do not alter their basic form when inflected with a class 10 prefix.
On the other hand, the form for '3' has been reanalyzed as satu, that is, with a fricativized stem-initial consonant regardless of which agreement class prefix is in use (cf. Stappers 1965, Dimmendaal 2011). Interestingly, there are two North Mara varieties that also pattern differently from the other Mara language varieties (see Aunio et al. 2019). Kabwa behaves similarly to Suguti, whereas in Simbiti, the class 10 fricativization is only optional with '3', i.e. i-tatɔ ~ i-satɔ.

Numerals 6-9
Compared with the numerals 1-5, which can be linked to Proto-Bantu reconstructions, the semasiological background of the numerals 6-9 is more opaque. Greenberg (1978: 291) suggests, with specific reference to Bantu, that such a "penumbra of the system" has to do with the lower frequency of use of these numerals compared to their lower counterparts. With this said, reflexes of these numerals exist far beyond the limits of the WS group, or even the GL branch. Arguably, some are reconstructible up to a putative Eastern Bantu ancestor (although arguing for such a proto-language is far beyond the scope of this article). At the same time, however, the different stems constituting this set of numerals in the WS language varieties also pattern differently in relation to genealogical and/or geographical parameters. One major feature which differentiates the WS members from some of their closest relatives/neighbours is the overall organization of this set of numerals. Thus, Jita (a Suguti language) and Gusii (classified within the North Mara group) make use of an additive quinary-based system of the form augend-link-addend, using the numerals 1-5 with 5 as a base, namely "5 and/with 1, 2, 3, 4…" (a structure similar to that used for forming 11-19 with a base-ten in WS, see subsection 2.1 above). The synchronic transparency of this system would suggest that it is a relatively recent innovation (see Schapper & Klamer 2014) which has possibly replaced cognates of the numerals found in the WS group. The fact that another Suguti language variety, Kwaya, has a system for forming 6-9 which is identical to that of WS would point towards such a conclusion. Furthermore, Gusii, unlike Jita, does not use the base-five approach to form '9'. Instead kianda is employed, that is, a reflex of the same form found in the WS group.
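The additive quinary pattern described for Jita and Gusii can be illustrated with a minimal sketch. It is a structural gloss only, not a rendering of any attested surface form: the function name `quinary_gloss`, the placeholder `LINK` (standing in for the language-specific "and/with" element), and the `simplex` override table are all hypothetical conveniences introduced here. The only language-specific datum used is the Gusii form kianda for '9' cited above.

```python
def quinary_gloss(n, simplex=None):
    """Return a structural gloss for a numeral 6-9 in a base-five additive system.

    `simplex` is a hypothetical override table mapping a value to a
    monomorphemic form that the language uses instead of the additive pattern.
    """
    simplex = simplex or {}
    if not 6 <= n <= 9:
        raise ValueError("this sketch only covers the numerals 6-9")
    if n in simplex:
        # The language has a dedicated stem for this value.
        return simplex[n]
    # augend-link-addend: the base 5, a linking element, and the remainder 1-4.
    return f"5 LINK {n - 5}"


# A fully additive system (the Jita type described in the text).
jita_type = [quinary_gloss(n) for n in range(6, 10)]

# A system with a simplex '9' (the Gusii type, using kianda as cited in the text).
gusii_type = [quinary_gloss(n, {9: "kianda"}) for n in range(6, 10)]
```

Comparing the two lists shows that the two systems differ only in the last slot, which is the point made above about Gusii retaining a reflex of the WS form for '9'.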
Another difference between the WS languages and their GL relatives, particularly outside the confines of the Mara branch, has simply to do with the lack of cognacy. This specifically concerns the words for '6' and '7', which are also the most problematic numerals to account for in terms of etymology. No obvious candidates emerge, and the references that do discuss these forms do so in quite speculative, and occasionally contradictory, ways.
Starting with the word for '6', -saansaβa ~ -saasaβe: similar to the formal division between Ngoreme and the rest of the WS members, reflexes of this form may be expressed with or without an additional word-medial /n/, e.g. Kuria -sansaβa, but Shashi -sasaβa. The WS language varieties pattern with most varieties of the Mara group in having this form of the numeral, with the exception of Gusii, which uses the five-based system mentioned above and hence lacks the form. However, outside the Mara language varieties, the spread of this form is limited, its cognates being confined to the very northern borders of (Eastern) Bantu and the varieties of the Greater Luhya (JE30) subgroup, such as (the clusters of varieties constituting) Masaba (JE31) 12 and Luhya (JE32). As these varieties neighbour the Mara group to the north without being considered to be directly related (cf. subsection 3.1), this shared numeral cognate is suggestive of an areal trait. In fact, Ehret (1971: 130) claims sa(n)saβa to be a loanword from Proto/Pre-South Nilotic *tɪsap '7' (which he (1971: 111) ultimately links to a stem borrowed from Eastern Cushitic). However, although a plausible account of the phonological adaptation involved is given (Proto-South Nilotic *t > s; *ɪ > a), the semantic motivation for the meaning shift from '7' to '6' is not clear.
Johnston (1919-1922) instead suggests no fewer than three Bantu-inherent etymologies of -sa(n)saβa, none of which seems to work very well with the WS data. Two of these etymologies entertain the idea that -sansaβa is in some way derived from '3', in a fashion similar to °-tanda(tu), a common word for '6' in several Eastern Bantu languages (see Schmidl 1915, Meinhof 1948: 118, Hoffmann 1953, Meeussen 1969). Although it is tempting to reconstruct sa(n)saβa in this way, there are problems with such an endeavour. Specifically, one would have to explain why /t/ spirantized here and not elsewhere. Of course, as we saw in subsection 3.3, it may happen for (lexicalizing) numerals that phonemic change occurs sporadically without adherence to regular sound laws (see also Schapper & Klamer 2014), and the weakening of an already weak /t/ to a fricative would in that case not be a big leap. However, we have failed to find any evidence for this scenario. Moreover, we still would not know how to explain the ending /βa/.
The stem for '7', -huŋɡate, has a different semasiological background from that of '6'. It is also attested for the whole of Mara (minus Gusii, plus Kwaya), and has a much wider distribution in the Eastern Bantu region. What is more, the most northern attestation of a reflex of this form in the GL group comes from Kuria (JE43), that is, a North Mara language and the closest neighbour to the WS members in the north. The Mara languages also seem to constitute the westernmost outpost for this particular numeral, with cognate forms attested only further to the east and the south-east, across much of Southern Kenya and Northern and Central Tanzania (see e.g. Werner 1919: 138, Hoffmann 1953). It does not occur in the rest of the GL group, however, which makes it difficult to account for how this form entered the Mara branch. That is, is -huŋɡate a retention from an early Eastern Bantu stem which disappeared in the rest of the GL, or is it the result of diffusion from the east? As it surfaces in most of the Mara varieties, however, it is still fairly safe to conclude that it was inherited into the WS branch.
Despite the relatively wide distribution of cognates of -huŋɡate '7', the etymology is still opaque. Meeussen (1969: 17), citing Hoffmann (1953: 71-72), rejects Meinhof's (1948: 119) reconstruction (see also Schmidl 1915), which links the form to *-túng- 'tie' + tatu 'three'. Instead, the (more fine-grained) comparative data demonstrate that the stem-initial consonant of the proto-form must have been *p and not *t. Such a reconstruction also makes perfect sense for the realization of the stem-initial consonant in WS, as *p has regularly been debuccalized to /h/ in these language varieties. Meeussen's (1969: 19) own suggested etymon for this numeral is a verb °-punk- 'point, demonstrate' (and derivatives thereof), along the line of reasoning that the index finger would form the seventh finger when counting on the hands. However, as he points out himself, this is a very fragile reconstruction. Apart from some morphophonological problems involved in connecting this verb stem with the numeral, a major difficulty is the fact that this verb stem is only attested in some Luba varieties (L30), and does not seem to occur in any Eastern Bantu languages, including in the WS group.
The word for '8' has often been suggested in the Bantu literature as being derived from the doubling of '4', from counting with four fingers on one hand and four fingers on the other (see, inter alia, Werner 1919: 134, Schmidl 1915, Meinhof 1948: 118, Greenberg 1978, Schadeberg 2003). Such a proposal also holds for -naane '8' in the WS language varieties. In fact, the WS data add valuable strength to such a proposal, as both of the nasals in this word form are palatalized when inflected with the agreement prefix of class 10, that is, -iɲaaɲe (see subsection 2.1). This would indicate that '8' to some extent is still analyzed as a composition of '4' and '4'.
The word for '9', kénda, can be linked to a stem *-kèndá which is widespread across the East African part of Eastern Bantu, and particularly the Guthrie zones E, F, G, and J (= the GL), see Struck (1911: 991), Hoffmann (1953: 75), Guthrie (1967-1971), and Bastin et al. (2002) 13. Reflexes of this stem occur across the entire Mara group, including in the otherwise differing Gusii. Hence, it can be safely assumed that this numeral exists in the WS group through inheritance.

'10' and the decimal base(s)
The word ikómi used for '10' can be directly linked to a Proto-Bantu reconstruction, namely the stem *-kʊ́mì 'ten' (noun class 5/6) 14 (e.g. Bastin et al. 2002). Pozdniakov (2018: 133) traces this stem to an innovation *kum/kam/gham in Bantoid, that is, a higher node in the (putative) Niger-Congo phylum of which the Bantu family is a part. Hence, it is probably even older than Proto-Bantu. The word used for forming multiples of ten, muɾɔŋɡɔ / miɾɔŋɡɔ, has a long history as well. This lexeme is a reflex of the form *-dòngò 'ten (decade)' (noun class 3/4), which is attested
13 Reflexes of *-kèndá are also attested in zone L (Bastin et al. 2002), belonging to (South-)Western Bantu (see Grollemund et al. 2015). There are also attestations of this numeral in Zone M of the "osculant pair" *yenda, which additionally surfaces in some parts of the GL/zone J languages; see Guthrie (1967-1971) and Bastin et al. (2002).
14 The original noun class membership is most obvious in Ishenyi, which retains a reflex of the full prefix *di- in ordinal connective constructions (see example (15) in subsection 2.2 above).
According to Guthrie, miɾɔŋɡɔ arose to form numerals for multiples of '10' in Eastern Bantu, a role previously fulfilled by *-kʊ́mì. This division of labour is in accordance with a general cross-linguistic tendency to have a "suppletive alternant" for the actual word for '10' to form decimals (Greenberg 1978). Accordingly, this pattern stretches through a large part of the Eastern Bantu area, including all Mara language varieties. However, it does not seem to be very widespread in the rest of the GL languages, which seem to prefer the use of pluralized reflexes of *-kʊ́mì or other strategies where the base is not transparent 16. Thus, the situation is similar to that described for the numeral '7' in subsection 3.4.
Notice that the additive link, used to form numerals within decades, is also reconstructed for Proto-Bantu as *nà 'with, and' (Bastin et al. 2002). Meinhof (1948: 121) already notes its extended use across the Bantu family as a link when forming additive numerals. Forming additive numerals with a "comitative link" like this is also typologically common, as pointed out by Greenberg (1978). The ubiquitous spread of *nà with this function across the Bantu family strongly suggests that it was inherited into the ancestor of the WS language varieties.
In conclusion, it is safe to assume that both decimal stems were inherited into the WS. It is also likely that the division of labour between them was inherited, namely with -komi being used for the actual numeral 'ten' and as the base for forming additive numerals (in a construction with a comitative link) and with -roŋɡo as a dedicated base for multiplication.
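The division of labour summarized above can be sketched schematically. The following is a minimal illustration under stated assumptions, not a generator of attested WS forms: the function name `decimal_gloss` is hypothetical, the stem labels `komi` and `roŋɡo` and the link `na` are used as bare labels without noun class prefixes, agreement marking, or tone, and multiplication is written with `×` rather than in any actual word order. Only values from 10 to 99 are covered.

```python
def decimal_gloss(n):
    """Return a structural gloss for a numeral 10-99.

    Assumption: 'komi' serves as the word for ten and as the additive base
    for 11-19, while 'roŋɡo' serves as the multiplicative base for the
    decades from 20 upwards, with 'na' as the comitative link.
    """
    if not 10 <= n <= 99:
        raise ValueError("this sketch only covers the numerals 10-99")
    tens, units = divmod(n, 10)
    # One ten uses the plain decimal stem; several tens use the dedicated base.
    base = "komi" if tens == 1 else f"roŋɡo×{tens}"
    if units == 0:
        return base
    # Additive numerals: base, comitative link, addend.
    return f"{base} na {units}"


examples = {n: decimal_gloss(n) for n in (10, 13, 20, 47)}
```

The two branches of the conditional correspond to the two decimal stems: the same additive construction applies in both cases, and only the base differs.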

3.6. The higher base numbers (100, 1000, 100 000)
The word for '100', ɾiiɣána, in the WS language varieties can be linked straightforwardly with a stem *-gànà (noun class 5/6), which, according to Guthrie (1967-71, vol. III: 206 (=CS 774)) is "probably" a Proto-Eastern Bantu item given its distribution throughout zones D to S (see also Dempwolff 1916-1917, Bastin et al. 2002). Indeed, many other GL languages surveyed for this study, including all Mara languages, have a reflex of this stem, even Gusii 17. Thus, it is definitely a shared retention among the WS language varieties.
The word for '1000', eɣe-kwé (Ikoma eɣe-kú due to the restriction on labialization on the final syllable), is not attested in any general Bantu reconstruction work. However, there are scattered attestations across the Eastern Bantu area. It has been reconstructed for Proto-Sabaki (~G40, E70) by Nurse & Hinnebusch (1993: 292, 663), see also Nicolle (2013: 39-40) on Digo (E73) specifically. It is further attested in Chewa-Nyanja (N31; Werner 1919: 140) and Cuwabo (P34; Guérois 2019). Except for Gusii (which only has a Swahili borrowing attested), all members of the Mara branch, and some of the Suguti, use a cognate of this stem for '1000'. Otherwise, eɣe-kwé, and reflexes thereof, is once again a numeral that does not seem to be used in other parts of the GL area.
The source meaning of the stem is not clear. No further meanings were provided by the consultants. Is it somehow connected to the proto-GL stem *-kwé (noun class 14) 'bride price' (see Schoenbrun 1997: 94-95) and/or the stem -kwe ~ -ku in the WS language varieties, meaning 'firewood' (as in a heap or a pile, a common type of metaphorical extension for large numerals in Bantu, see Schmidl 1915)?
Similarly, -ɾaɾa (noun class 7/8) for '100 000' remains an enigma from an etymological point of view, not least because it is seldom the case that a digit of such high value is mentioned in the comparative literature. However, as this numeral surfaces in all the WS language varieties it is at least a shared retention or innovation within this group.

Ordinals
As already mentioned in subsection 2.2, the connective construction for forming ordinals is common across the Bantu-speaking area and thus is also undoubtedly an inherited pattern in the WS group. To that we may add Polak-Bynon's (1965: 136) statement that connective ordinals accompanied by the preceding element ka- are particularly common in (North-)Eastern Bantu in general and particularly around Lake Victoria (i.e. GL). Polak-Bynon (1965: 136) links this element to the noun class 12 prefix, which has the same form, and which is also commonly used for deriving adverbials from nominal stems (e.g. Meinhof 1948: 124).
Regarding the deviant derivatives of 'first', we may note the following. To begin with, -mbεɾε, used exclusively for 'first' in Ikoma, Nata and Ishenyi, and variably used in Ngoreme, originates from a form meaning '(in) front (of)', as seen in (37) below. (Similarly, ɲuma 'back' is exclusively used to denote 'last' in Ikoma, Nata and Ishenyi, as well as in Ngoreme.)
(37) Ikoma
a-ŋómbe n-e-eɲí á-mbεɾε e mé-te
9-cow foc-sp9-cop 9-front conn9 3-tree
'The cow is in front of the tree.'
To derive the ordinal numeral 'first' from 'front' is an extremely common pattern across Eastern Bantu, and languages using this strategy include members from most subgroups of GL, although its adjectival use with nominal prefixes rather than within a connective construction is innovative (see Polak-Bynon 1965: 150-151, see also Grégoire 1975: 212-215). Taken together with the fact that mbεɾε for 'first' is attested for all WS language varieties, it is most likely a shared retention.
The irregular ordinal kwánsa in Ngoreme is probably a borrowing from Swahili, partly given the variation with mbεɾε but also because a reflex of the source verb *-yànd- 'begin' (Nurse & Hinnebusch 1993: 650) does not seem to exist in the language. Polak-Bynon (1965: 149) finds related forms in other Eastern Bantu varieties such as Pogoro and Sukuma, but mentions that these are likely to be borrowings from Swahili as well. See also Greenberg (1978), who notes that unlike the cardinal '1', the borrowing of the equivalent ordinal 'first' is not uncommon cross-linguistically.

Summary and conclusions
In this study we have accounted for the synchronic and diachronic aspects of the numeral system in four closely related Bantu varieties of the Western Serengeti subgroup, that is, Ikoma, Nata, Ishenyi, and Ngoreme. The study has shown that the systems are more or less identical. Except for the borrowed ordinal kwánsa in Ngoreme, the numerals only differ in terms of the more overarching phonological disparities characterizing the different members of this group.
We can furthermore conclude that the WS language varieties are conservative with regard to this specific linguistic domain, particularly the fact that they have kept the conditioned weakening of the numerals '3'-'5' with the agreement prefix of class 10 suggested for Proto-Bantu. It is possible that this maintenance is connected to the fact that "Bantu spirantization" (Bostoen 2008) has generally not affected these language varieties. However, most other simplex numerals and strategies of forming complex numerals can also be linked to cognates with a further distribution across the (Eastern) Bantu family. This stability in itself can be taken to stand out, not least as other parts of these language varieties are remarkably different from canonical Bantu patterns (such as a highly complex vowel harmony system, inverted auxiliary constructions, and non-inverted existentials).
The fact that the numeral systems of the WS language varieties (and the Mara language varieties more generally) appear to pattern more closely with other Eastern Bantu languages than with their supposedly closest relatives of the GL group is problematic from a genealogical perspective and lends support to Nurse's (1999) scepticism with regard to this grouping. In addition to this, the etymologies of some of the numerals are still left unresolved. These two facts, taken together, serve as an impetus for further comparative work on this subject and in this region.
Finally, we note that the only clearly attested Swahili borrowing in the language varieties is (most likely) kwánsa 'first' in Ngoreme. This is in contrast with other Tanzanian Bantu languages where especially the higher numerals are claimed to have shifted to Swahili to a greater or lesser degree (see e.g. Morrison 2011: 216, Bernander 2017: 32, 79, Wilhelmsen 2019). Hence, it would seem that the WS numeral systems are not in danger of extinction, or at least not at the moment.