Non – invasive genetic approaches for estimation of ungulate population size : a study on roe deer ( Capreolus capreolus ) based on faeces

Non–invasive genetic approaches for estimation of ungulate population size: a study on roe deer (Capreolus capreolus) based on faeces.— Estimating population size is particularly difficult for animal species living in concealing habitats with dense vegetation. This is the case for roe deer as for many other ungulates. Our objective was to develop a non–invasive genetic capture–mark–recapture approach based on roe deer faeces collected along transects. In a pilot study, we collected 1,790 roe deer faeces during five sampling days in a forested study area in south western Germany. We extracted DNA from 410 of these samples and carried out microsatellite analysis using seven dinucleotide markers. The analyses resulted in 328 useable consensus genotypes which were assigned to 174 individuals. The population size estimated using a Bayesian approach was 94 (82–111) male and 136 (121–156) female roe deer. Our study shows that non–invasive genetic methods are a valuable management tool for roe deer.


Introduction
Assessing population size is of prime importance for the management of large herbivores such as the different deer species (Gordon et al., 2004;Williams et al., 2002).However, for elusive animal species living in habitats with dense vegetation, estimating population size is particularly difficult.This is the case for roe deer (Capreolus capreolus), and other ungulates living in concealing habitats (Smart et al., 2004;Hewison et al., 2007).For these species, indirect methods such as pellet counts seem more appropriate than direct counts, even though the latter have been applied rather successfully in deer populations in open areas (Smart et al., 2004;Tsaparis et al., 2009).Nevertheless, most traditional indirect approaches to estimate abundance yield estimates of low accuracy and low precision (Garel et al., 2010), and thus have limited power for certain management aspects.Direct methods such as traditional capture-mark-recapture have also been applied for roe deer, but mostly in enclosed study areas or in combination with radiotelemetry (Gaillard et al., 1986;Pegel & Thor, 2000;Hewison et al., 2007).In this context, non-invasive genetic methods based on hair or faeces represent a promising alternative to traditional methods because they can combine the advantages of indirect methods with the accuracy and precision of CMR approaches (McKelvey & Schwartz, 2004;Beja-Pereira et al., 2009).In the early 1990s, this approach was mainly used for carnivores such as coyotes and bears (e.g.Kohn et al., 1999;Woods et al., 1999), but it was not long before it was also used for studies on other species (reviewed in Waits & Paetkau, 2005;Beja-Pereira et al., 2008;Ebert et al., 2010).However, it is only in recent years that non-invasive genetic population size estimation methods -mostly based on faeces as a DNA source-have been carried out for ungulate species, such as Argali sheep (Ovis ammon, Harris et al., 2010), Sitka black-tailed deer (Odocoileus hemionus sitkensis; Brinkman et al., 2011), and wild boar (Sus scrofa; Ebert et al., 2009, in press).
The roe deer is a widespread species in many parts of Europe and when it occurs in high densities it can have a significant impact on the vegetation in its habitat through browsing (Ellenberg, 1978;Gordon et al., 2004).In the Palatinate Forest, where this study was carried out, roe deer are managed mainly by hunting.However, in our study area, as in most regions, reliable data on roe deer abundance that would facilitate roe deer management (e.g. for setting appropriate harvest quotas) are lacking.We therefore aimed to establish a non-invasive genetic population estimation method for roe deer and to test its feasibility for this species.In this article, we present a pilot study based on faeces samples collected along transects.

Study area and faeces sampling
Roe deer faeces sampling was carried out in a 2,400 ha segment of a wildlife research area situated in the Pa-latinate Forest in Rhineland Palatinate, south western Germany (49° 12′ N, 7° 45′ E).This wildlife research area (approx.100 km²) was established in 2005.It is situated in the German part of the transfrontier biosphere reserve 'Vosges du Nord-Pfälzerwald', located along the French -German border.Forest cover in the study area is 93%, with nutrient-poor sandstone as the prevalent soil.The dominant tree species are Fagus sylvatica (33%), Pinus sylvestris (30%), Quercus spec.(16%), Picea abies (10%) and Pseudotsuga menziesii (8%) (Forestry of Rhineland-Palatinate, pers. comm.;Hohmann & Huckschlag, 2010).Several small settlements with surrounding open areas lie at the periphery of the study area.Altitude ranges between 220 and 611 m a.s.l.Annual average temperature is 8-9°C (Weiß, 1993), and annual precipitation approximates 600-1,000 mm.Besides roe deer, red deer (Cervus elaphus) and wild boar (Sus scrofa) are common ungulate species in the study area.
The state forestry authorities in Rhineland-Palatinate manage the state-owned forests and are responsible for the hunting in these areas.The roe deer hunting bag in the forestry of Hinterweidenthal, in which our study area is located, is on average 2.0 animals/1 km 2 (average between 1999 and 2009).The hunting season for roe deer is from May until January, with different age classes being allowed to be hunted at different times of the season.Hunting is performed by drive hunts (≥ 1,000 ha) between October and January and by single hunt throughout the rest of the season.
We implemented 20 transects in a study area of approximately 4,000 ha (fig.1).We aimed to cover the area as systematically as possible and to include every habitat type in the sampling scheme.Transect lengths varied between 4.9 and 7.2 km with an average length of 5.7 km.Distances between transects ranged between 231 m and 500 m with an average of 322 m.
Each transect was searched by one person on each of five days (14-18 III 2011).The field workers followed transect routes using maps and compasses.The locations of all detected roe deer faeces were recorded using GPS loggers (Mobile Action Inc., I-gotU GT 120, http://www.i-gotu.com).Approximately one hand full of pellets of each detected pellet group was collected using an inverted freezer bag which was then reversed and closed.Samples were stored frozen (-19ºC) in the sealed freezer bags until analysis.The remaining pellets from each sampled pellet group were removed to avoid double sampling of the same faeces.
Because of limited lab resources we were not able to analyze all collected faecal samples and thus had to select a subsample among the collected faeces.We selected samples proportional to the number of detected faeces in each transect.We selected approximately 20% of all samples in each of the 20 transects.The number of detected samples was unequally distributed over the transect grid.To guarantee a maximum chance of representative animal detection along the whole transect, we divided each transect in six subtransects of equal length.Next we selected at least one sample in each subtransect if possible.The residual samples were selected in proportion to the number of findings within the subtransects.When choosing among sam-ples on a subtransect, we used freshness as selection criterion in order to optimize the genotyping success.

DNA extraction and genotyping
DNA was extracted from the faeces samples within four weeks after collection using the NucleoSpin soil kit (Macherey-Nagel, Düren, Germany) following the manufacturer's instructions, but with one modification: for the first step, three whole faecal pellets were vortexed together with one ml of lysis buffer for 15'.Seven dinucleotide microsatellites were analyzed to identify individual roe deer (table 1).To determine the sex of the sampled animals, we additionally amplified the Amelogenin gene according to Gurgul et al. (2010).A sample was classified as male when the Y-allele was present at least once.All samples showing only the X-allele in all three repeats were classified as female.For microsatellite analysis, all markers were combined in one multiplex reaction and a GeneAmp ® PCR System 9700 Cycler (Applied Biosystems, Darmstadt, Germany) was used at the following PCR conditions: initial denaturation at 95°C for 15 min followed by 45 cycles of 30 s at 94°C, 30 s of 57°C and 60 s at 72°C, and a terminal elongation step at 60°C for 30 min.Amplification reactions were performed in triplicate, each in a total volume of 12 μl using the Qiagen Multiplex PCR kit (Qiagen, Hilden, Germany).We used the primers at concentrations of 0.2 μM to 0.4 μM.Amplification products were run on an ABI3730 Sequencer using the ABI GS500LIZ size ladder and analyzed using the software GeneMapper v3.7 to determine allele lengths (Applied Biosystems, Darmstadt, Germany).
We deduced consensus genotypes from the triplicate results.Samples were typed as heterozygous at one locus if both alleles appeared at least twice, and as homozygous when all replicates showed the same result.We repeated the genotyping another three times in cases when results were ambiguous after the first three replicates.All samples which failed to amplify or to produce unambiguous results (i.e. both alleles present at least two times for heterozygotes and only one allele in all six repeats for homozygotes) for more than two loci were discarded.We scrutinized genotypes differing by one (1-MM) or two (2-MM) alleles to detect genotyping errors.For all 1-or 2-MM pairs, raw data were re-checked to resolve the mismatches.Genotype pairs with only one mismatch were regarded as originating from the same individual (Ruell et al., 2009), whereas 2-MM pairs were considered as originating from different individuals, if re-checking of raw data and an additional three PCR repeats did not alter the results and if both samples matched with other samples in the data set (Paetkau, 2003).
Determination of matching genotypes and construction of capture histories were carried out using GENECAP (Wilberg & Dreher, 2004).To confirm the power of the used loci, we calculated the probability of identity (PID) and, being more conservative, PID

Wildlife Research Area
for siblings (PID sibs ; Waits et al., 2001) using GIMLET (Valière, 2002).Expected and observed heterozygosity as well as Hardy-Weinberg tests were performed using CERVUS 3.0 (Kalinowski et al., 2007).We calculated genotyping error rates (allelic dropout [ADO] and false alleles [FA]) as recommended in Broquet & Petit (2004).To illustrate the discriminating power of the loci used, we furthermore computed a match probability by multiplying the PID over all loci by an assumed approximate population size of 300 roe deer.As a blind test for genotyping reliability, we reanalysed 18 faeces samples without the lab staff knowing to which of the already analysed samples these further 18 samples should match.Furthermore, DNA was extracted and genotyping was carried out from tissue samples of all roe deer hunted in the study area after our sampling.The resulting genotypes were compared to those obtained via faeces sampling.

Population estimation
We calculated population size estimates using maximum likelihood mixture models for closed captures with heterogeneity (Pledger, 2000) implemented in Program MARK (White & Burnham, 1999).These models are widely applied in non-invasive genetic studies and we used them to facilitate comparison.Each of the five sampling days was considered as a separate 'capture' session.We defined a set of plausible candidate models with varying assumptions concerning capture probability (p) and recapture probability (c): (1) model p (.) = c (.) with constant capture-and recapture probability; (2) model p (.) , c (.) accounting for behavioural response to sampling; (3) model p t = c t with captureand recapture probabilities varying over time; and (4) model p 1 = c 1 , p 2 = c 2 with two different mixtures for capture and recapture probabilities accounting for individual heterogeneity For each of the four basic models, we considered two different cases: 'basic model only' and 'basic model including sex', in which the sex of the animals is included as a grouping factor (table 3).For population estimation, only those samples of each individual which were detected on different days were treated as 'recaptures', so that only one recapture per day was taken into account in order to fit into the traditional CMR framework (Otis et al., 1978).The different models were compared and ranked using Akaike's Information Criterion with an additional bias correction term (AICc; Burnham & Anderson, 2002).
Additionally, we calculated model averages, i.e. weighted averages over all models according to their support in the data as indexed by the AICc weights, in order to account for model selection uncertainty (Burnham & Anderson, 2002).Furthermore, we calculated CI using the unconditional standard error (SE) and the equation reported in Rexstad & Burnham (1992, page 19), because confidence intervals (CI) in Program MARK for model averages do not account for the minimum number of observed individuals.We additionally aimed to obtain CI for the total population (male + female).When sex is included as a grouping variable, as in our analysis, population sizes and CI are estimated separately for both sexes.We therefore calculated the sum of a random number of the female and male probability distribution, iterated this 10,000 times and calculated mean (total population size) and standard error from the resulting distribution.We used mean and standard error to calculate 95% CI for the estimated total population size based on the corresponding Rexstadt & Burnham (1992) equations.
In addition to the closed captures estimates based on mixture models, we calculated population estimates using a Bayesian model (Gazey & Staley, 1986;Petit & Valière, 2006) This single session approach is especially suitable for non-invasive data because it can make full use of all detections for each individual in the data set.We examined the assumption of capture homogeneity by carrying out a test in which the sampling process is simulated under the assumption of homogeneity and the expected number of captures is compared with the observed number of captures per individual (Puechmaille & Petit, 2007).
The Bayesian estimate and the test were performed using the R package (Ihaka & Gentleman, 1996) and a script provided by E. Petit (pers. comm.).

Faeces sampling and genotyping
During the five sampling days, 1790 roe deer faeces were collected of which we selected 410 samples in the subsampling process for DNA extraction and microsatellite analysis.Of these, 328 samples (i.e.80%) yielded a useable consensus genotype consisting of at least 5 markers.For fifteen samples, one marker failed to amplify and for four samples two markers were missing.The proportion of positive PCR varied among loci between 0.93 and 0.98 (table 2).Mean expected heterozygosity (H exp ) was 0.75 and mean observed heterozygosity (H obs ) was 0.73; no significant deviations from Hardy-Weinberg-equilibrium were detected (table 2).The observed PID was 2.16 x 10 -8 , and PID sibs was estimated as 0.0012.The ADO rate varied between markers with a minimum of 0.028 (Roe8), a maximum of 0.077 (BMC1009) and an average of 0.048 (table 2).No false alleles were detected in the data set.The match probability was calculated as 6.48 x 10 -6 .After re-checking the genotypes, no 1-MM pairs and seven 2-MM pairs remained in the data set.All samples of the 2-MM pairs matched with at least one other sample and were thus considered to belong to different individuals.
The 328 genotypes were assigned to 174 different individuals, 71 males and 103 females.Of the 174 individuals, 89 were sampled once, 46 were sampled twice, 16 were sampled three times, 16 were sampled four times, 6 were sampled five times, and one was sampled six times.
All 18 samples reanalyzed in the blind test matched the genotype of the original sample.Tissue samples from 31 hunted roe deer were genotyped and resulted in 18 matches with genotypes already known from the faeces sampling.

Population estimation
Four of the maximum likelihood models received considerable support with ∆AICc < 2 (table 3).The most supported model incorporated time and sex dependent capture probabilities, whereas the time dependent model, the heterogeneity model and the model with constant p and c were less supported.Mean p was 0.234 for male and 0.256 for female roe deer.Estimated p over the five sampling days were 0.25, 0.22, 0.17, 0.28 and 0.25 for males and 0.32, 0.29, 0.28, 0.25 and 0.14 for females (fig.2).The estimated population size ranged from 89 to 113 male roe deer and from 118 to 164 female roe deer with confidence interval widths varying between 25% and 90% of the estimated population sizes for the four most supported models (table 3).The model averaged population size was 99 (80-155) male and 139 (114-218) female roe  [ADO and FA] were calculated after Broquet & Petit, 2004).

Marker
H The Bayesian population estimate was 94 (82-111) for male roe deer and 136 (121-156) for female roe deer.No heterogeneity in capture probability was found for the female part of the population, whereas for the males, the heterogeneity test revealed that the observed capture frequencies differed significantly from the expected capture frequencies, thus indicating heterogeneity.

Discussion
Considering that 80% of the samples yielded a useable consensus genotype after three to six PCR repeats, the DNA extraction and genotyping protocol can be regarded as efficient and the PCR success rate as rather high (Broquet et al., 2007).Although several genotyping errors occurred, mainly in the form of ADO, we believe the overall misidentification rate to be low because ambiguous samples were repeated up to six times and error rate was below 5% (Lukacs & Burnham, 2005).Furthermore, all samples which were analysed twice in the blind test resulted in matching genotypes, and 18 of 31 roe deer which were harvested in the study area after our sampling trial had genotypes that matched individuals which were detected via faeces sampling.This indicates that the genotyping protocol is repeatable and consistent.As the PID, the match probability and, particularly, the PID sibs (being well below 0.01) were low, the set of markers seems sufficient to reliably discriminate even closely related individuals from each other (Woods et al., 1999).Mean capture probabilities (p) were consistently above 0.2 with a somewhat higher p for the female part of the population than for males (0.256 and 0.234, respectively).The sex ratios of the sample and the estimated population of between 1.4 and 1.45 can be considered realistic, since in many ungulate species populations tend to be female-biased (Clutton-Brock & McLonergan, 1994) and in a study in southern Germany based on direct counts the mean roe deer sex ratio was estimated as 1.5 (Pegel & Thor, 2000).The fact that the most supported maximum Table 3. Roe deer population estimates derived from non-invasive genetic sampling data based on faeces.The maximum likelihood mixture models are ranked according to Akaike's Information Criterion (AICc).Capture probability is symbolized by p, recapture probability by c.A dot after a parameter denotes that the parameter is held constant in the respective model, a 't' stands for a parameter varying over time.Further parameters are K (number of parameters of each model), w i (model weight), N (estimated population size) for each sex incl.95% confidence intervals and standard error (SE).'Sex' models include the sex of the animals as a grouping factor.Tabla 3. Estimaciones de la población de corzo derivadas del muestreo de datos genéticos no invasivo, basado en las heces.Los modelos mixtos de máxima probabilidad se ordenaron según el Criterio de Información de Akaike (AICc).La probabilidad de captura se simboliza mediante una p, la probabilidad de recaptura, mediante una c.Un punto tras un parámetro significa que dicho parámetro se mantiene constante en el modelo respectivo, una ''t'' se refiere a un parámetro que varía con el tiempo.Otros parámetros son K (el número de parámetros de cada modelo), w i (peso del modelo), N (tamaño de la población estimado) para cada sexo incl.95% de intervalos de confianza y error estándar (SE).Los modelos ''sex'' incluyen el sexo de los animales como factor de agrupamiento.likelihood model includes differences between the sexes in p suggests that males were slightly less prone to detection than females.This observation could perhaps be induced by differences in space or habitat use between the sexes (Mysterud, 1999).Both sexes showed a certain variation of p over time.As p did not decrease significantly over time for males a reaction of the animals to a disturbance by the field workers is not probable.For the females, however, p declined slightly over the five sampling days with a more distinct drop at day five (fig.2).

Model
In the first sampling days the sample size is often higher because faeces have accumulated in the days before the beginning of the trial, and a decline after the first sampling days is therefore to be expected.However, several studies suggest that female ungulates are more wary towards human disturbance than males, which would explain the sex-related differences in p (Stankowich, 2008) Garel et al. (2010) showed that changing observation conditions (e.g.weather) can also have a significant impact on observation results.In our case, this is unlikely as an explanation for the decrease in p over time or for individual heterogeneity, because weather and overall sampling conditions were good and did not change markedly over the five days of sampling.
The Bayesian population estimates for both sexes are very similar to the maximum likelihood estimates but show higher precision.The estimated population sizes of the Bayesian approach and the supported maximum likelihood models are relatively consistent and show moderate precision (table 3), which together with reasonably high capture probabilities suggests that a representative coverage of the population has been achieved.However, in order to allow a quantitative evaluation of management measures, population densities will be needed in addition to population sizes.Therefore, the topics of population closure and density calculation should be addressed in future studies.
The assumption of demographic closure can be met by choosing a short time span for sampling like we did in our study.In contrast, geographic closure can be very difficult to achieve and closure violation can severely bias density estimates (Obbard et al., 2010).This should be taken into account when calculating population densities.An alternative to traditional methods for population density calculation (e.g. using buffers to determine an effectively sampled area by mean maximum recapture distance, Parmenter et al., 2003) are spatially explicit capture-recapture appro-aches (SECR), which hold the potential to mitigate the problem of closure violation in density estimates (Borchers & Efford, 2008).
Furthermore, it would be an interesting topic for further research to evaluate the performance of the non-invasive genetic approach in comparison with traditional methods such as pellet counts or distance sampling.Even though the precision of our roe deer estimates seems to be good compared to many traditional estimates for woodland deer (as discussed e.g. in Cederlund et al., 1998;Smart et al., 2004), a detailed comparison between approaches remains very difficult.Different approaches need to be applied in the same study area and ideally in a population of known size in this context.
Our study shows that non-invasive genetic sampling is a promising tool for roe deer management and suggests it could play a useful role in calibrating measures such as harvest quotas.

Fig. 1
Fig. 1 Location of the study area in south western Germany and transect design.The size of the area covered by the transects within the 100 km² wildlife research area is approximately 40 km².

Fig. 2 .
Fig. 2. Estimated capture probabilities (p) for male and female roe deer during a non-invasive genetic faeces sampling study.Solid rhombi represent male roe deer and open circles represent females; error bars represent the standard error.

Table 1 .
Microsatellite markers used for individual identification of roe deer: conc.Concentration; N. Number of alleles.

Table 2 .
Characteristics of the microsatellite markers used for individual identification of roe deer (H exp .Expected heterozygosity; H obs .Observed heterozygosity; p-value.Bonferroni-corrected p-values for deviations from Hardy-Weinberg expected genotype frequencies; PCR+.Percent positive PCR.(Mean error rates Rudnick et al., 2008))ss, tend to have rather short flight distances when disturbed by humans walking in their habitat (in mean between 39 and 85 m; DeBoer et al., 2004), and thus it is unlikely that animals left the sampled area in reaction to the field workers.Brinkman et al. (2011)observed similar variations in capture probability for Sitka black-tailed deer and suggested variation in activity and habitat use during winter as a reason.The heterogeneity model (p 1 = c 1 , p 2 = c 2 ) also received considerable support, suggesting the incidence of a certain individual heterogeneity (which in fact is almost ubiquitous in non-invasive as well as in traditional CMR studies;Rudnick et al., 2008).