Livestock Research for Rural Development 23 (9) 2011


Population structure of the Sahiwal breed in Kenya

T K Muasya*, J N Kariuki** and J M K Muia**

* Animal Genetic Resources Group,
** Kenya Agricultural Research Institute,
NAHRC, P.O. Box 25-20117, Naivasha, Kenya


Abstract

The Sahiwal breeding programme in Kenya has been run as a closed nucleus for over 50 years. The population structure of the breed was studied by pedigree analysis, using pedigree data from 18315 animals born between 1949 and 2008. The ENDOG program was used to compute the pedigree completeness index, average inbreeding coefficient, generation equivalents, effective population size and average relatedness coefficient.

The average inbreeding level for the population was 0.6%, the average relatedness among animals was 1.9% and the effective population size was 219. Inbreeding levels increased by generation while effective population size declined. The maximum number of traced generations was 11, with a mean of 3.71 and an increase in inbreeding of 0.19% per generation. The mean complete generations and mean equivalent generations were 1.74 and 2.48 respectively, with rates of increase in inbreeding of 0.49% and 0.40%. The respective effective population sizes were 265.08, 102.37 and 125.15. Mean generation interval was 6.9 years and was longest for the sire-son pathway. The effective population size for the Sahiwal is still within the levels recommended for a population to maintain its evolutionary potential. Because inbreeding is increasing and effective population size is declining, a breeding programme that integrates management of genetic variability with selection needs to be developed.

Key words: effective population size, diversity, generation interval, pedigree


Introduction

The Sahiwal breed was imported into Kenya from India and Pakistan between 1939 and 1963 in an effort to improve the performance of the local East African zebu. The founding population was composed of 60 bulls and 20 cows, which were placed in livestock improvement centres across the country. The best animals from the centres were selected and used to establish the National Sahiwal Stud at Naivasha in 1963. Meyn and Wilkins (1974) designed a closed nucleus breeding plan for genetic improvement of milk production and growth rate. Since then the herd has been run as a closed nucleus, apart from the introduction of semen from Pakistan in 1992 (Ilatsia et al 2007). The breed has become an economically important dual purpose cattle breed for pastoral and mixed farming communities in the eastern African region (Muhuyi et al 1997). The main breeding work is at KARI Naivasha, where a closed nucleus herd of 1500 animals is maintained. Closed nucleus breeding programmes usually face the risk of increased inbreeding levels and reduced genetic diversity. Optimising genetic response while maintaining genetic variability is important when the aim is to conserve animal genetic diversity (Hill 2000; Torrecillas et al 2002). A reduction in genetic variability leads to unfavourable effects such as inbreeding depression and increased variance of genetic progress due to chance (Falconer and Mackay 1996).

Genetic variability within a population can be assessed by estimating effective population size and inbreeding, or by quantifying founder representation in the population (Lacy 1989, 1995; Boichard et al 1997). These parameters can be managed in breeding programmes to avoid the loss of variability that results from loss of founder alleles in future generations (Faria et al 2009). Gene origin statistics provide a historical overview of changes occurring in a population, whereas effective population size and inbreeding are useful for long term management of genetic variability. Inbreeding and effective population size can therefore be used to monitor trends in genetic variability. However, the inbreeding level of a population can be underestimated where pedigree information is incomplete, while effective population size tends to be overestimated after a number of generations with missing pedigree information (Boichard et al 1997; Faria et al 2009). Exchange of germplasm between populations can also change the population structure, making it difficult to interpret estimates of inbreeding (Boichard et al 1997). Other factors which can change the population structure include periodic reductions in population size, decline in the number of contributing males and unequal contribution of ancestors.

The average relatedness coefficient (AR) is the probability that any two alleles picked at random in a population are identical by descent. AR can be used to predict the long-term inbreeding of a population because it takes into account the percentage of the complete pedigree originating from a founder at population level. In addition, AR can be used to compute the effective size of the founder population as the inverse of the sum of the squared AR coefficients across founder animals. This parameter is therefore an alternative or complement to inbreeding. AR for individuals can be used to increase the representation of the founders in the current population (Gutierrez et al 2003) by preferentially mating animals with low AR values. In this way, genetic variability in a population is maintained and the original founder variability is not lost (Goyache et al 2003).

The objective of this study was to assess the genetic variability of Sahiwal at the National Sahiwal Stud, Kenya by estimating the current levels of inbreeding, average relatedness coefficient, effective population size and generation intervals for the population.

Materials and Methods

Pedigree data were obtained for animals born in the National Sahiwal Stud at KARI Naivasha, Kenya, from 1949 to 2008. Additional information included the date of birth and sex of each animal. A total of 18315 animals were included in the study. The pedigrees of the animals were traced as far back as possible in the birth record book database. All ancestors and relatives of each individual were included in the analyses. Generation intervals for the four pathways (sire-son, sire-daughter, dam-son and dam-daughter) were calculated for each period.

Parameters estimated from the data were pedigree completeness index, average inbreeding coefficient, number of generations, effective population size and average relatedness coefficient.

The parameters were estimated as follows:

Pedigree completeness

The pedigree completeness index (PEC) was estimated to provide information on the quality of the pedigree. It describes the completeness of the ancestry of each individual up to the 5th parental generation, and was calculated for each individual according to MacCluer et al (1983) as follows:

$$PEC = \frac{2\,C_{sire}\,C_{dam}}{C_{sire} + C_{dam}}$$

where $C_{sire}$ and $C_{dam}$ are the paternal and maternal line contributions, respectively. The contributions were calculated as:

$$C = \frac{1}{d}\sum_{i=1}^{d} a_i$$

where $d$ is the total number of generations taken into account, $i = 1, 2, \ldots, d$, and $a_i$ is the proportion of ancestors present in generation $i$.
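The calculation can be illustrated with a minimal Python sketch. It assumes the harmonic-mean form usually attributed to MacCluer et al (1983), PEC = 2 C_sire C_dam / (C_sire + C_dam), with each line contribution taken as the mean proportion of known ancestors over d generations. The function names and the example proportions are hypothetical and are not taken from the Sahiwal data.

```python
def lineage_contribution(proportions):
    """Mean proportion of known ancestors over d generations: C = (1/d) * sum(a_i)."""
    d = len(proportions)
    if d == 0:
        return 0.0
    return sum(proportions) / d


def pedigree_completeness(sire_props, dam_props):
    """PEC as the harmonic mean of the paternal and maternal contributions."""
    c_sire = lineage_contribution(sire_props)
    c_dam = lineage_contribution(dam_props)
    if c_sire + c_dam == 0:
        return 0.0  # no known ancestors on either side
    return 2 * c_sire * c_dam / (c_sire + c_dam)


# Hypothetical animal with d = 5: proportion of ancestors known in
# generations 1..5 on the paternal and maternal sides.
sire_line = [1.0, 1.0, 0.5, 0.25, 0.0]
dam_line = [1.0, 0.5, 0.25, 0.0, 0.0]
pec = pedigree_completeness(sire_line, dam_line)
print(round(pec, 3))
```

Because the index is a harmonic mean, it is zero whenever one parental line is entirely unknown, which is why animals with a single recorded parent contribute nothing to completeness.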

Number of traced generations

The number of traced generations was computed as the number of generations separating an individual from its furthest known ancestor in each path. Ancestors with no known parent were considered as founders and assigned to generation 0.

Number of equivalent generations

The number of equivalent generations was computed as the sum of (1/2)^n over all known ancestors, where n is the number of generations separating the individual from each known ancestor.

Complete generation equivalent

Complete generation equivalent was computed as the farthest generation for which all ancestors are known.
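The three generation measures defined above can be sketched in Python as follows. This is an illustration on a small hypothetical pedigree, not the ENDOG implementation; the dictionary layout and function names are assumptions made for the example.

```python
# Hypothetical pedigree: animal -> (sire, dam); None marks an unknown parent.
PEDIGREE = {
    "A": (None, None),
    "B": (None, None),
    "D": ("A", "B"),
    "E": ("A", None),
    "F": ("D", "E"),
}


def known_parents(animal, pedigree):
    """Return the parents of `animal` that are recorded in the pedigree."""
    return [p for p in pedigree[animal] if p is not None]


def max_generations(animal, pedigree):
    """Generations separating the animal from its furthest known ancestor."""
    parents = known_parents(animal, pedigree)
    if not parents:
        return 0  # founders sit at generation 0
    return 1 + max(max_generations(p, pedigree) for p in parents)


def complete_generations(animal, pedigree):
    """Furthest generation for which all ancestors are known."""
    parents = known_parents(animal, pedigree)
    if len(parents) < 2:
        return 0
    return 1 + min(complete_generations(p, pedigree) for p in parents)


def equivalent_generations(animal, pedigree):
    """Sum of (1/2)**n over every known ancestor, n generations back."""
    return sum(0.5 * (1 + equivalent_generations(p, pedigree))
               for p in known_parents(animal, pedigree))


summary = {
    "max": max_generations("F", PEDIGREE),
    "complete": complete_generations("F", PEDIGREE),
    "equivalent": equivalent_generations("F", PEDIGREE),
}
print(summary)
```

For animal F the known ancestors are D and E (one generation back) and A, B and A again (two generations back), so the equivalent generations are 2 x 1/2 + 3 x 1/4 = 1.75, lying between the complete (1) and maximum (2) generations.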

Inbreeding coefficient (F)

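The inbreeding coefficient (F) of an individual is the probability that it carries two alleles identical by descent, and was computed according to Meuwissen and Luo (1992). The increase in inbreeding per generation (ΔF) was computed as:

$$\Delta F = \frac{F_i - F_{i-1}}{1 - F_{i-1}}$$

where $F_i$ and $F_{i-1}$ are the average inbreeding at generations $i$ and $i-1$, respectively.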
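As an illustration of what the inbreeding coefficient measures, the sketch below computes F on a small hypothetical pedigree with the classical recursive (tabular) relationship rule. This is deliberately not the Meuwissen and Luo (1992) algorithm, which is designed for large pedigrees; for a toy example the simpler rule gives the same F. The helper `delta_f` assumes the classical rate formula (F_i - F_{i-1}) / (1 - F_{i-1}). All names and animals are hypothetical.

```python
from functools import lru_cache

# Hypothetical pedigree listed parents-before-offspring: animal -> (sire, dam).
PEDIGREE = {
    "S": (None, None),
    "D1": (None, None),
    "X": ("S", "D1"),  # daughter of S
    "Y": ("S", "X"),   # S mated back to his own daughter
}
ORDER = {animal: idx for idx, animal in enumerate(PEDIGREE)}


@lru_cache(maxsize=None)
def relationship(a, b):
    """Additive relationship between a and b by the recursive tabular rule."""
    if a is None or b is None:
        return 0.0  # unknown parents are treated as unrelated founders
    if a == b:
        sire, dam = PEDIGREE[a]
        return 1.0 + 0.5 * relationship(sire, dam)
    # Always expand the younger animal through its parents.
    if ORDER[a] > ORDER[b]:
        a, b = b, a
    sire, dam = PEDIGREE[b]
    return 0.5 * (relationship(a, sire) + relationship(a, dam))


def inbreeding(animal):
    """F is the diagonal of the relationship matrix minus one."""
    return relationship(animal, animal) - 1.0


def delta_f(f_prev, f_curr):
    """Rate of inbreeding between two consecutive generations."""
    return (f_curr - f_prev) / (1.0 - f_prev)


f_x = inbreeding("X")
f_y = inbreeding("Y")
print(f_x, f_y, delta_f(f_x, f_y))
```

The sire-daughter mating gives the expected F of 0.25 for Y, while X, whose parents are unrelated founders, has F of 0.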

Average relatedness coefficient (AR)

The average relatedness coefficient (AR) of each individual was defined as the probability that an allele randomly chosen from the whole population in the pedigree belongs to a given animal (Gutierrez and Goyache 2005). AR can therefore be interpreted as the representation of the animal in the whole pedigree, regardless of the knowledge of its own pedigree. The parameter was calculated as follows:

c' = (1/n) 1'A, where A is the numerator relationship matrix of size n x n (Henderson 1976; Quaas 1976), 1' is a row vector of ones (1 x n) and c' is the row vector holding the AR coefficients of the n animals.
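In other words, the AR of an animal is the mean of its column of the numerator relationship matrix. A minimal pure-Python sketch, assuming animals are indexed so that parents precede offspring and building A by the tabular method, is given below; the three-animal pedigree is hypothetical.

```python
def relationship_matrix(pedigree):
    """Build A by the tabular method.

    `pedigree` is a list of (sire_index, dam_index) tuples ordered so that
    parents come before offspring; None marks an unknown parent.
    """
    n = len(pedigree)
    A = [[0.0] * n for _ in range(n)]
    for j, (s, d) in enumerate(pedigree):
        both_known = s is not None and d is not None
        A[j][j] = 1.0 + (0.5 * A[s][d] if both_known else 0.0)
        for i in range(j):
            a_is = A[i][s] if s is not None else 0.0
            a_id = A[i][d] if d is not None else 0.0
            A[i][j] = A[j][i] = 0.5 * (a_is + a_id)
    return A


def average_relatedness(A):
    """c' = (1/n) 1'A: the mean of each column of A."""
    n = len(A)
    return [sum(A[i][j] for i in range(n)) / n for j in range(n)]


# Hypothetical pedigree: two unrelated founders (0, 1) and their offspring (2).
A = relationship_matrix([(None, None), (None, None), (0, 1)])
ar = average_relatedness(A)
print(ar)
```

Note that the column mean includes the animal's relationship with itself, so AR is never zero even for an animal with no relatives in the pedigree.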

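Effective population size (Ne)

Effective population size was computed from the rate of inbreeding per generation as $N_e = 1/(2\Delta F)$.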

Generation interval

Generation interval (GI), defined as the average age of parents when their offspring are born, helps to evaluate future rates of inbreeding and opportunities for annual genetic improvement of a breed. GI was computed for each of the four selection pathways, sire-son (SS), sire-daughter (SD), dam-son (DS) and dam-daughter (DD), after Falconer and Mackay (1996).
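The pathway calculation can be sketched as below. The record layout, animal identifiers and birth dates are hypothetical, and the sketch simply averages parental age at the birth of every recorded offspring with a known parental birth date; it is not the ENDOG routine.

```python
from datetime import date

# Hypothetical records: animal -> (sex, birth date, sire, dam).
RECORDS = {
    "S1": ("M", date(1990, 1, 1), None, None),
    "D1": ("F", date(1991, 1, 1), None, None),
    "B1": ("M", date(1998, 1, 1), "S1", "D1"),
    "C1": ("F", date(1996, 1, 1), "S1", "D1"),
}


def generation_intervals(records):
    """Mean parental age (years) at offspring birth for SS, SD, DS and DD."""
    ages = {"SS": [], "SD": [], "DS": [], "DD": []}
    for sex, born, sire, dam in records.values():
        child = "S" if sex == "M" else "D"  # son or daughter
        for parent, prefix in ((sire, "S"), (dam, "D")):
            if parent in records:  # skips unknown (None) parents
                parent_born = records[parent][1]
                ages[prefix + child].append((born - parent_born).days / 365.25)
    return {path: sum(v) / len(v) for path, v in ages.items() if v}


gi = generation_intervals(RECORDS)
for path in sorted(gi):
    print(path, round(gi[path], 2))
```

In this toy data set the sire-son interval is the longest, at about 8 years, because the son was born later in the sire's life than the daughter.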

All parameters were estimated using the ENDOG v.4.5 program (Gutierrez and Goyache, 2005).


Results

The entire relationship matrix included a total of 18315 animals. Pedigree completeness for all animals in the National Sahiwal Stud at KARI Naivasha is shown in Figure 1. The completeness of the pedigree improved with generation and ranged from 11% in the older generations to 91% in the current generation.

Figure 1. Pedigree completeness levels in the whole pedigree data file for the Sahiwal breed in Kenya

The low estimate of pedigree completeness in the earlier generations was due to the fact that many animals in the earlier generations did not have sire and dam records, and any individual with either of its parents unknown was taken as a founder (Gutierrez and Goyache 2005).  

The distribution of inbreeding (F%) and average relatedness coefficient (AR%) for the pedigree of the population at the National Sahiwal Stud at Naivasha, Kenya is given in Table 1a.

Table 1a. Distribution of Inbreeding coefficient (F%) and average relatedness coefficient (AR%) for the population at the National Sahiwal Stud, Kenya
















*standard deviation

Estimates of F and AR ranged from 0 to 26.6% and 0.005 to 4.8% respectively. The respective median values were 1.6 and 1.9%.

Estimates of inbreeding and average relatedness coefficient (AR) are shown in Table 1b. For the entire population, the average inbreeding level was 0.6%, the average relatedness coefficient was 1.9% and the effective population size was 219.

Table 1b. Mean inbreeding coefficient (F %), effective population size (Ne), mean generation equivalents and average relatedness (AR %) for the Sahiwal pedigree at KARI Naivasha


Whole pedigree














AR %





 Maximum generations





Mean complete generations










Inbreeding coefficient (F) and AR were higher in inbred individuals and males than in females (Table 1). Inbred animals were 24.5% of the population, with an average inbreeding of 2.1%. The proportion of inbred animals increased over time, as did inbreeding and average relatedness coefficient (Figure 2 and Table 3). The trends in F and AR are shown in Figure 2 for the years 1949 to 2008.

There was a general increase in inbreeding at an annual rate of 0.04%. F and AR increased consistently up to 1990 (Figure 2), went down in 1993 and resumed their upward trend thereafter. AR reached 1% 2 years before the genetic relationships existing between individuals could be measured as inbreeding. AR has been consistently above 1% since 1970, while annual mean inbreeding has been above 1% since 1990.

Figure 2. Trends of average relatedness (AR), inbreeding (F) and proportion of inbred animals by year of birth

Effective population size (Ne) was computed by regressing individual inbreeding on mean maximum generation, mean complete generation and mean equivalent generation (Table 2). This was done in order to approximate the lower, upper and actual values of Ne, which is especially useful when genealogical information is scarce (Goyache et al 2003). Complete and equivalent generations were 5.9 and 6.08, respectively. Their means and the respective changes in inbreeding, average relatedness coefficient and effective population size over generations are given in Table 2.
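The regression approach can be sketched as follows, assuming that the slope b of individual inbreeding on a generation measure is taken as the increase in inbreeding per generation, so that Ne = 1/(2b). The data points are hypothetical and the least-squares helper is written out in plain Python rather than taken from any library.

```python
def regression_slope(x, y):
    """Ordinary least-squares slope of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    return sxy / sxx


def ne_from_regression(generations, inbreeding):
    """Effective size from the slope, treated as delta F per generation."""
    b = regression_slope(generations, inbreeding)
    return 1.0 / (2.0 * b)


# Hypothetical animals: equivalent generations traced and individual F.
equiv_generations = [0.0, 1.0, 2.0, 3.0]
individual_f = [0.0, 0.005, 0.010, 0.015]
ne = ne_from_regression(equiv_generations, individual_f)
print(round(ne, 1))
```

Repeating the regression with maximum, complete and equivalent generations as the explanatory variable yields three slopes and therefore three Ne values, which is how a range of estimates arises from a single pedigree.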

Table 2. Generation equivalents, rate of change in inbreeding (ΔF) and their effective population sizes (Ne)





Generation measure            Mean    ΔF (%)    Ne
Mean maximum generation1      3.71    0.19      265.08
Mean complete generation2     1.74    0.49      102.37
Mean equivalent generation3   2.48    0.40      125.15




1Maximum generations traced is the number of generations separating an individual from its furthest ancestor.

2Complete generation equivalent is the farthest generation for which all ancestors are known

3Number of generations separating the individual from each known ancestor

The upper and lower limits of Ne for the pedigreed population were 265 and 102, with an estimate of 125 based on equivalent generations. The respective rates of change in inbreeding were 0.19%, 0.49% and 0.40% (Table 2).
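As a rough arithmetic check, the reported rates of inbreeding can be converted to effective sizes, assuming the standard relationship Ne = 1/(2ΔF). The function name is hypothetical; the three ΔF values are those reported in the text.

```python
def effective_size(delta_f_percent):
    """Ne = 1 / (2 * delta F), with delta F supplied as a percentage."""
    return 1.0 / (2.0 * delta_f_percent / 100.0)


# Rates of inbreeding (%) reported for the three generation measures.
reported = {"maximum": 0.19, "complete": 0.49, "equivalent": 0.40}
ne_values = {name: effective_size(df) for name, df in reported.items()}
for name, value in ne_values.items():
    print(name, round(value, 2))
```

The results (about 263, 102 and 125) agree with the reported 265.08, 102.37 and 125.15 to within the rounding of ΔF to two decimal places.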

Table 3. Mean inbreeding by complete generations for the National Sahiwal Stud


No. of Animals

Mean F%

% inbred

Av. F for inbred






































Inbreeding levels and average relatedness coefficient increased by complete generation, while effective population size decreased (Table 3). The differences found in estimates of effective population size could be due to variations in average inbreeding coefficients across generations, as Ne was calculated as a function of the increase in inbreeding, ΔF (Table 2).

Table 4 shows generation lengths for the four parent-offspring pathways. The average generation interval for the whole pedigree was 6.9 years. Sire pathways were longer than those involving dams, the longest interval being on the sire-son pathway.

Table 4. Generation intervals (GI), standard deviations (SD) and standard error of mean (SEM) for the four parent-offspring pathways for the whole Sahiwal pedigree
































Discussion

The pattern of pedigree completeness is similar to estimates reported elsewhere (Baumung and Solkner 2002; Valera et al 2005). The average number of known ancestors for the Sahiwal breed in Kenya is comparable to results obtained from French (Boichard et al 1997) and Austrian Original Pinzgau (Baumung and Solkner 2002) cattle populations, which ranged from 97 to 0.16%. A similar trend of known ancestors per generation has been reported for Brazilian zebu cattle (Faria et al 2009). Other cattle breeds, such as the Tux-Zillertal and Carinthian Blond in Austria (Baumung and Solkner 2002) and the Maremmana and Mucca Pisana of Italy (Torrecillas et al 2002), had lower proportions of known ancestors per generation.

The average inbreeding level in this study (0.6%) is well below the 2% reported for the same population by Rege and Wakhungu (1992). However, the previous study did not describe the number of generations considered or the depth of the pedigree, making comparison difficult. Both estimates are lower than the 4% reported for the Sahiwal in Pakistan (Dahlin et al 1995). Corrales et al (2010) reported an average inbreeding of 13.0% for the Creole cattle breed in Nicaragua. Other high values of inbreeding have been reported for the Caracu in Brazil (Pereira et al 2005) and Brazilian Gyr cattle (Filho et al 2010). The decrease in inbreeding in 1993 was due to the birth of calves sired by five imported sires from Pakistan (Ilatsia et al 2007); thereafter the upward trend continued. Estimates of inbreeding depend on the quality of the pedigree and are therefore unique to each population (Baumung and Solkner 2002). Although the herd is a closed nucleus, the slow build-up of inbreeding in the Kenyan Sahiwal could be due to lack of any effective selection (Nomura et al 2001; Ilatsia et al 2007). In order to maintain genetic variability in a population, matings can be planned to achieve genetic gain without ignoring the relatedness among animals (Martinez et al 2006).

Inbreeding has been reported to always lag behind AR in pedigreed populations (Goyache et al 2003). As an alternative or complement to inbreeding, AR can be used to predict the long-term inbreeding of a population because it takes into account the percentage of the complete pedigree originating from a founder at population level. In addition, AR can be used to compute the effective size of the founder population as the inverse of the sum of the squared AR coefficients across founder animals. A practical application of AR is the maintenance of genetic variability in a population by mating animals that have low AR values (Goyache et al 2003).

The Ne estimate found in this study is similar to that reported for the Chianina breed in Italy (Torrecillas et al 2002) but higher than in most other previous studies (Solkner et al 1998; Gutierrez et al 2002; Torrecillas et al 2002; Baumung and Solkner 2002). In Nicaragua, Corrales et al (2010) reported an effective population size of 50 for the Reyna Creole breed. Missing pedigree information can lead to overestimation of effective population size (Baumung and Solkner 2002; Faria et al 2009), especially over many generations (Boichard et al 1997). In this study, the number of founders, i.e. individuals with either parent unknown, was 223. This could have contributed partly to the large effective population size reported (Baumung and Solkner 2002), as could the lack of effective selection (Nomura et al 2001; Ilatsia et al 2007). Assuming that natural selection counteracts inbreeding depression, it has been recommended that the effective population size be maintained at 50 to 100, or that the annual rate of inbreeding be kept below 1% (FAO 1998), in order to maintain fitness in a population. Further, if genetic variability within a population is expected not to increase as a result of mutation, an effective population size of 500 should be maintained (Franklin and Frankham 1998). The Ne of 219 reported in the current study lies between the limits recommended by FAO (1998) and Franklin and Frankham (1998), suggesting that the population is viable and has adequate genetic variability.

Estimates of maximum traced generations (11), mean equivalent generations (2.48) and mean complete generations (1.74) were within the range reported for various breeds of cattle (Baumung and Solkner 2002). In Italy, maximum generations varied from 13 in the Chianina to 7 in the Mucca Pisana cattle breeds (Torrecillas et al 2002). In the Gir, Nelore and Guzerat cattle in Brazil the maximum generations traced were 10, 13 and 12 respectively (Faria et al 2009). The Sahiwal breed in Kenya had complete generation equivalents similar to those of Italian beef breeds, which had values ranging from 2.26 to 3.13 (Torrecillas et al 2002), and Brazilian Zebu cattle (Faria et al 2009). The number of known generations and the number of equivalent generations characterize the shallowness of the available pedigree information. In the current study the two parameters were approximately 4 and 2.5 respectively, indicating a somewhat deep pedigree structure.

The generation interval obtained in this study is comparable to the 6.9 years reported for the Reyna Creole cattle (Corrales et al 2010). Similar generation intervals have been reported for the Boran in Ethiopia (Mekonnen and Philipsson 1994) and the Sahiwal breed in Pakistan (Dahlin et al 1995). Longer generation intervals (>8 years) have been reported in Gir, Nelore and Guzerat cattle breeds in Brazil (Malhado et al 2008; Faria et al 2009; Filho et al 2010). The relatively long generation interval, especially for the sire-son pathway, could be due to the time taken in progeny testing bulls and to the longevity of animals in the Sahiwal population, where cows can be retained for up to 15 years of age. Long generation intervals slow the rate of genetic progress and consequently lower the economic returns of a breeding programme.

The Sahiwal population in Kenya was established from a small number of founders (Muhuyi et al 1997), which limited its genetic variability from the onset. Further, the Sahiwal herd has been genetically closed since establishment, and therefore subsequent loss in genetic diversity is expected. The increasing rate of inbreeding and declining effective population size threaten the sustainable utilization of this important resource. The Sahiwal cattle breed has been found to maintain production and reproduction under dry, tropical conditions where other cattle breeds cannot survive and remain productive (Muhuyi et al 1997). Using AR and inbreeding, genetic variability and genetic improvement can be managed together, through mating plans which pair parents of the next generation on the basis of AR. There is therefore a need to develop a sustainable conservation programme which includes all Sahiwal herds in Kenya, India and Pakistan.

Conclusions and recommendation


Acknowledgements

The authors acknowledge the Director KARI for financial support and the National Sahiwal Stud, Naivasha, Kenya for provision of data and computing facilities.


References

Baumung R and Solkner J 2002 Analysis of pedigrees of Tux-Zillertal, Carinthian Blond and original Pinzgau cattle population in Austria. Journal of Animal Breeding and Genetics 119, 175–181.


Boichard D, Maignel L and Verrier E 1997 The value of using probabilities of gene origin to measure genetic variability in a population. Genetics Selection and Evolution 29, 5–23.


Corrales R, Näsholm A, Malmfors B and Philipsson J 2010 Population structure of Reyna Creole cattle in Nicaragua. Tropical Animal Health and Production 42, 1427–1434.


Dahlin A, Khan U N, Zafar A H, Saleem M, Chaundhry M A and Philipsson J 1995 Population structure of the Sahiwal breed in Pakistan. Animal Science 60, 163–168.


Falconer D S and Mackay T F C 1996 Introduction to quantitative genetics. 4th edition. Longman Scientific and Technical, Harlow, UK.


Faria F J C, Filho A E V, Madalena F E and Josahkian L A 2009 Pedigree analysis in the Brazilian Zebu breeds. Journal of Animal Breeding and Genetics 126, 148-153.


Food and Agriculture Organization 1998 Secondary Guidelines for Development of National Farm Animal Genetic Resources Management Plans: Management of Small Populations at Risk. FAO, Rome, Italy.


Filho J C R, Lopes P S, Verneque R S, Torres R A, Teodoro R L and Carneiro P L S 2010 Population structure of Brazilian Gyr dairy cattle. Revista Brasileira de Zootecnia 39 (12), 2640-2645.


Franklin I R and Frankham R 1998 How large must populations be to retain evolutionary potential? Animal Conservation 1, 69–70.


Gutierrez J P, Altarriba J, Díaz C, Quintanilla R, Canon J and Piedrafita J 2003 Genetic analysis of eight Spanish beef cattle breeds. Genetics Selection and Evolution 35, 43-63.


Gutierrez J P and Goyache F 2005 A note on ENDOG: a computer program for analysing pedigree information. Journal of Animal Breeding and Genetics 122, 172-176.


Henderson C R 1976 A simple method for computing the inverse of a numerator relationship matrix used in prediction of breeding values. Biometrics 32, 69-83.


Hill W G 2000 Maintenance of quantitative genetic variation in animal breeding programmes. Livestock Production Science 63, 99–109.


Ilatsia E D, Muasya T K, Muhuyi W B and Kahi A K 2007 Genetic and phenotypic parameters and annual trends for milk production and fertility traits of the Sahiwal cattle in semi arid Kenya. Tropical Animal Health and Production 39, 37–48.


Lacy R C 1989 Analysis of founder representations in pedigrees: Founder equivalents and founder genome equivalents. Zoo Biology 8, 111–123.


Lacy R C 1995 Clarification of genetic terms and their use in the management of captive populations. Zoo Biology 14, 565–578.


MacCluer J W, Boyce A J, Dike B, Weitkamp L R, Pfenning D W and Parsons C J 1983 Inbreeding and pedigree structure in Standardbred horses. Journal of Heredity 74, 394–399.


Malhado C H M, Ramos A A and Carneiro P L S 2008 Progresso genético e estrutura populacional do rebanho Nelore no Estado da Bahia. Pesquisa Agropecuária Brasileira 43, 1163-1169.


Martinez M L, Verneque R S and Teodoro R L 2006 Programa Nacional de Melhoramento do Gir Leiteiro. Sumário Brasileiro de Touros – Maio 2006. Juiz de Fora: Embrapa Gado de Leite, 2006. 54p. (Documentos, 108).


Mekonnen H M and Philipsson J 1994 Estimates of genetic and environmental trends of growth traits in Boran cattle. In: Genetic Analysis of Boran, Friesian and Crossbred Cattle in Ethiopia, Ph.D. Thesis, Swedish University of Agriculture Sciences.


Meuwissen T I and Luo Z 1992 Computing inbreeding coefficients in large populations. Genetics Selection and Evolution 24, 305–313.


Meyn K and Wilkins J V 1974 Breeding for milk in Kenya with particular reference to the Sahiwal stud. World Animal Review 11, 24-30.


Muhuyi W B 1997 A comparison of the productivity of the Kenya Sahiwal cattle and their crossbreds in large scale dairy-dual purpose and beef production systems. PhD Thesis, University of Nairobi, Kenya. 149 pp.


Nomura T, Honda T, and Mukai F 2001 Inbreeding and effective population size of Japanese Black cattle. Journal of Animal Science 79, 366–370.


Pereira M C, Zerlotti M M E, Galvao de Albuquerque L and Razook A G 2005 Estimativa de Ganho Genético a Partir de Diferenciais de Seleção e Parâmetros Populacionais em um Rebanho Caracu. Revista Brasileira de Zootecnia 4(6) (Suplemento), 2245–2252.


Quaas R L 1976 Computing the diagonal elements of a large numerator relationship matrix. Biometrics 32, 949-953.


Rege J E O and Wakhungu J W 1992 An evaluation of a long-term breeding programme in a closed Sahiwal herd in Kenya. I. Genetic and phenotypic trends and levels of inbreeding. Journal of Animal Breeding and Genetics 109, 374–384.


Solkner J, Filipcic L and Hampshire N 1998 Genetic variability of populations and similarity of subpopulations in Austrian cattle breeds determined by analysis of pedigrees. Animal Science 67, 249–256.


Torrecillas P C, Bozzi R, Negrini R, Filippini F and Giorgetti A 2002 Genetic variability of three Italian cattle breeds determined by parameters based on probabilities of gene origin. Journal of Animal Breeding and Genetics 119, 274–279.


Valera M, Molina A, Gutierrez J P, Gomez J, Goyache F 2005 Pedigree analysis in the Andalusian horse: population structure, genetic variability and influence of the Carthusian strain. Livestock Production Science 95, 57–66.

Received 28 June 2011; Accepted 1 August 2011; Published 1 September 2011
