Applying IUCN Red List criteria to birds at different geographical scales: similarities and differences

Extinction risk and conservation status of species are assessed at the global scale by the International Union for Conservation of Nature (IUCN). To ensure objectivity, repeatability and traceability, assessments follow a standardized process that uses reliable and verifiable information. Assessments are synthesized according to guidelines, which have recently been adjusted for application at sub-global scales. Nevertheless, a species may have several different or overlapping conservation statuses. To quantitatively compare assessments from global to sub-national scales, in this study we analyzed 15 assessment lists for 66 game bird species in France. Assessments were made following IUCN guidelines. Overall, our results reveal that (1) assessments at large spatial scales tend to give lower threat status than small-scale assessments; (2) large-scale assessments made it possible to formally verify information whereas smaller-scale assessments usually did not; (3) large-scale assessments are more likely to be based on standardized evidence of reduction in population size and are less exposed to 'scale-effects' and 'edge-effects'; (4) large-scale assessments are also more often based on scientific literature sensu stricto; and (5) sources are more accurately synthesized in large-scale assessments than in Red Lists at small spatial scales. Our results suggest that small-scale Red Lists do not fully match IUCN guidelines and differ significantly in their assessment processes when compared to global standards. The use of subjective and unreliable data in small-scale Red Lists (above all in national and sub-national lists) may jeopardise the original aim of IUCN Red Lists to provide comprehensive and scientifically rigorous information, and could thus compromise the credibility and prestige of IUCN Red Lists in the eyes of researchers, the general public, and other stakeholders.

Keywords: Biodiversity assessment, Game bird species, Conservation status, Information-based management, IUCN Red Lists, Regional assessment

Introduction
Evidence-based wildlife management (Sutherland et al., 2004) requires reliable information on, above all, the conservation status and the extinction risk of species. The most widely recognized assessment of the conservation status of species is the Red List of Threatened Species, established by the International Union for Conservation of Nature (IUCN) (de Grammont and Cuarón, 2006; Rodrigues et al., 2006; Szabo et al., 2012; Maes et al., 2015). The great value of the IUCN Red Lists is derived from their original aim to represent a comprehensive source of scientifically rigorous information (Rodrigues et al., 2006). IUCN assessments have to be objective, transparent, repeatable and traceable (Fitzpatrick et al., 2007; Miller et al., 2007). To this end, Red Lists are derived from assessments that use data published in a searchable format (Rodrigues et al., 2006). Nevertheless, assessments also routinely use expert knowledge (McBride et al., 2012), and guidelines for the reliable integration of such knowledge are under development (McCarthy et al., 2004; McBride et al., 2012; Drescher et al., 2013; Drolet et al., 2015). Publication of both data and assessments is now consolidated at the global scale through the online searchable IUCN databases, accessible at http://www.iucnredlist.org. Assessments are constructed using explicitly defined categories and quantitative criteria that are applicable and valid at global scales (Akçakaya et al., 2000; IUCN, 2001). Over the past two decades, these criteria and categories have been revised, thresholds have been adjusted and new categories created (Gärdenfors, 2001; IUCN, 2013).
The robustness of assessments has been consolidated through the standardization of data-driven procedures and the use of objective criteria that no longer depend on approaches that entail risks of subjectivity (e.g. threat categorizations based directly on expert opinions) (Mace and Lande, 1991; Rodrigues et al., 2006). Nevertheless, some issues still need to be resolved, as underlined, for instance, by the application of IUCN criteria at sub-global scales (Gärdenfors, 2001; Mace et al., 2008). Hereafter we use 'sub-global' as a synonym of 'regional' sensu lato to avoid potential confusion with political districts (regions sensu stricto), which in several countries such as France correspond to sub-national administrative territories.
Given that the majority of conservation actions take place at sub-global scales, and that the most influential institutions working on conservation legislation and action are national and regional governments, concern for the spatial sub-structuring of the threat status of species is increasing (Gärdenfors, 2001; Gärdenfors et al., 2001; Miller et al., 2007). The regional concept (sensu lato) implies a geographically defined sub-global area, which could be a continent, a country, a state or a province (IUCN, 2012). Guidelines for the application of IUCN Red List criteria at regional levels were published (Gärdenfors, 2001; IUCN, 2012) and assessments of the conservation status of species at sub-global scales were developed (Miller et al., 2007; Azam et al., 2016). These updated guidelines represent the standardized processes that must be applied (without deviation or modification) if regional Red List authorities wish to state that their assessments follow the IUCN system (IUCN, 2012: p. 3). Nevertheless, a risk of subjectivity in the regional adjustment process has been identified, along with the need for more complementary information for identifying national priorities and responsibilities with regard to species' conservation (Keller and Bollmann, 2004; Rodríguez et al., 2004; Keller et al., 2005).
Despite efforts to create objective processes for assessing species' extinction risks at sub-global scales, some problems still persist (Gärdenfors, 2001; Martín, 2009; Seoane et al., 2011). Natural scarcity or rarity at local scales may result in the overestimation of threat levels and so the ecological bases of their rarity should be taken much more into account (Martín, 2009; Seoane et al., 2011). Moreover, the second step of the IUCN regional guidelines, which consists of adapting categories according to vaguely formulated terms such as the level of contact with neighbouring populations (Keller et al., 2005: p. 1828), leaves room for interpretation by assessors and a degree of subjectivity (Eaton et al., 2005; Keller et al., 2005). This step is particularly important when assessing very mobile species, such as birds at small landlocked sites (Keller and Bollmann, 2004; Keller et al., 2005), since it requires a lot of accurate data and knowledge on species that is not always available and can be difficult to obtain (Keller and Bollmann, 2004; Eaton et al., 2005; Keller et al., 2005). Indeed, available data for regional assessments are sometimes limited to local populations, and obtaining data for cross-boundary populations is often difficult (Keller et al., 2005). Conversely, in some cases available data may be accurate for large-scale assessments but unsuitable or unreliable for local scales due to limits imposed by their resolution (Hurlbert and Jetz, 2007) and may be negatively affected by confounding methodological factors such as missing data or low number of counts per site (Atkinson et al., 2006).
In addition to these as yet unresolved issues, recent concerns about threatened species have led to an increase in the number of regional and global Red Lists, which sometimes give different conservation statuses for the same species at different scales. France is a particularly interesting example (Azam et al., 2016), as the conservation status of its birds has been characterized at the global level by IUCN and BirdLife International (IUCN, 2015b), at European level by BirdLife International (BirdLife International, 2015), at the national level by the French Committee for IUCN (UICN France et al., 2011), and at the sub-national level (French regions) by local organizations (Flitti and Vincent-Martin, 2013; LPO Alsace, 2014). As a result, each bird species may be classified under five different conservation statuses in France (detailed below). Thus, in light of the increasing number of Red Lists referring to the same species, the simple question 'what is the conservation status of the considered bird species?' has become far more complex for all concerned. The choice of one or another of these classifications in a particular situation should rest on an understanding of their respective characteristics and limits. A quantitative comparative analysis of their characteristics is thus required, but to our knowledge is not yet available.
In this study, we used a set of game bird species as a case study to compare several Red Lists (from sub-national to global scale) that classify the conservation status of birds in France. Previous studies in other countries have compared global and regional Red Lists or analyzed the regional assessment of certain taxa. Their results highlight the fact that regional lists tend to lead to higher threat statuses than the global IUCN Red Lists due, for instance, to the scale-dependent chance of meeting Red List criteria (hereafter referred to as 'scale effect') and to the 'edge effect' of small-scale assessments of the conservation status of species (Keller et al., 2005; Milner-Gulland et al., 2006; Brito et al., 2010). Theoretically, different assessments should agree if they use (for instance, for endemic species) a common methodology, identical species and the same information. Nevertheless, in light of the results from other countries (Milner-Gulland et al., 2006), we expected that in this study the assessments of the status of birds (including numerous non-endemic species) at smaller scales would result in higher threat statuses than larger-scale assessments.
As mentioned above, the variability in conservation status between lists at different scales may be due to 'scale-' or 'edge-effects', in particular in regional assessments, when criterion D of the IUCN is used, which considers small population size (Eaton et al., 2005;Keller et al., 2005). To evaluate whether the variability in conservation statuses between lists at different scales in France is due to the 'population size effect', we compared the proportions of criteria used in assessments. On the basis of the associations reported in previous studies (Eaton et al., 2005;Keller et al., 2005), we expected that a higher proportion of species would be classified as threatened based on criterion D at small-scale assessments compared to large-scale lists.
The differences in status due to the scale of the assessment could also be associated with disparities in the type of information used to evaluate species' conservation status (de Grammont and Cuarón, 2006). One hypothesis suggests that some lists and evaluations may depend primarily on data from grey literature, which conflicts with the comprehensive, scientifically rigorous and transparent nature of Red Lists (Mrosovsky and Godfrey, 2008). To evaluate whether variability in conservation status between different lists is linked to differences in the type of information that was synthesized, the proportions of categories of information used in French Red Lists were compared. On the basis of the reported greater likelihood of subjectivity in small-scale assessments (Eaton et al., 2005; Keller et al., 2005), we predicted that grey literature would play a more important role in small-scale Red Lists than in large-scale assessments.
The distinction between scientific and grey literature has been widely debated (Schöpfel, 2006; Mrosovsky and Godfrey, 2008) and is detailed below in the methods section. Among other characteristics, grey literature is usually less available, less reliable and more incomplete than scientific literature (Conn et al., 2003; Schöpfel, 2006; Mrosovsky and Godfrey, 2008). Moreover, literature unavailability has been proposed as a factor that might be associated with unreliable citations (Todd and Ladle, 2008). Thus, according to this hypothesis, we predicted that grey literature would be associated with unsupported citations more often than scientific literature.

Study species
Highly mobile species may be more affected by the challenges posed by regional adaptations to the IUCN system, particularly when applied at small geographical scales and when data on across-boundary population dynamics are required (Akçakaya et al., 2000;Keller and Bollmann, 2004;Keller et al., 2005;IUCN, 2012). To ensure comparability, analyses should be based on overlapping sets of species; as well, data sets should contain well-studied species in order to make quantitative analyses possible. Many birds are widely distributed mobile species that represent a well-studied taxonomic group (van Jaarsveld et al., 1998;Butchart et al., 2004;Fazey et al., 2005). However, it is a vast taxonomic group and so we chose to analyse a subset for potential heterogeneity between Red Lists. Among birds, game species may be the object of multiple and additive conservation actions, such as monitoring programs conducted by wildlife recreationists with a variety of different motivations (Cooper et al., 2015) and hence these species may be better studied than others. Thus, we focused our analyses on 66 game bird species in France (table 1).

IUCN-type Red Lists
France is a biogeographically diverse country crossed by many migratory flyways, and several Red Lists assess the conservation status of its birds. To ensure comparability, in this study we only analyzed lists based on the IUCN classification system, as these lists are expected to follow the standardized processes detailed in the IUCN guidelines (IUCN, 2001, 2012). The global IUCN Red List was updated during the development of this study at the end of 2015. This allowed us to verify the potential effects of this update on the potential scale-dependent heterogeneity in Red Lists. As a result, we compared 15 lists in this study (table 2): two versions (a pre-update version from October 2015 and an updated version from December 2015) of the IUCN global Red List of Birds (IUCN, 2015a, 2015b); the European Red Lists of Birds (BirdLife International, 2015), available at regional levels for (i) geographical Europe (Europe) and (ii) the member states of the European Union in 2012 (EU27); the national (French) Red List of Birds (UICN France et al., 2011); and 10 sub-national Red Lists of Birds for Île-de-France (IDF) (Birard et al., 2012), Limousin (Roger and Lagarde, 2015), Pays de la Loire (PaysLoire) (Marchadour et al., 2014), Midi-Pyrénées (MidiPyr) (Fremaux, 2015), Provence-Alpes-Côte-d'Azur (PACA) (Flitti and Vincent-Martin, 2013), Languedoc-Roussillon (LangRou) (Meridionalis, 2015), Centre (Nature Centre, 2013), Alsace (LPO Alsace, 2014), Bretagne (Bretagne Environnement, 2015) and Bourgogne (Bourg) (Abel et al., 2015). All these assessments declared that they had followed the IUCN system; the considered sub-national lists were approved and labelled by the UICN French committee, a process designed to guarantee that sub-national Red Lists follow IUCN guidelines (UICN France, 2011; http://uicn.fr/etat-des-lieux-listes-rouges-regionales/).

Compiled data
Conservation status and criteria
For all the species on each Red List, we compiled the conservation status (LC, least concern; NT, near threatened; VU, vulnerable; EN, endangered; CR, critically endangered; RE, regionally extinct) and, whenever possible, the criteria underlying it (A, population reduction; B, geographic range; C, small population size and decline; D, very small or restricted regional population) (IUCN, 2001, 2012). Criterion E (based on quantitative analyses that estimate the probability of extinction; IUCN, 2012) was not present in our sample.
In a few cases (6.56 %), species qualified as threatened under multiple criteria. In such cases, we retained the highest-ranked criterion in the order E > A > B > C > D, because small regional populations or local rarities do not necessarily imply a high risk of extinction (Harnik et al., 2012), whereas criterion A deals with species that are at risk because of a steep rate of decline (Collen et al., 2016).
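The priority rule described above can be sketched in a few lines of Python. This is only an illustration of the stated order E > A > B > C > D; the function name `primary_criterion` and the convention of reading the first letter of a criterion code (for example "A2b" or "D1") are assumptions made for this sketch, not part of the original method description.

```python
# Priority order stated in the text: E > A > B > C > D.
CRITERION_PRIORITY = ["E", "A", "B", "C", "D"]


def primary_criterion(criteria):
    """Return the highest-priority criterion letter among those adduced.

    `criteria` is an iterable of criterion codes such as "A2b" or "D1"
    (hypothetical input format); only the leading letter is used.
    Returns None when no recognised criterion is given.
    """
    adduced = {code.strip().upper()[:1] for code in criteria if code.strip()}
    for letter in CRITERION_PRIORITY:
        if letter in adduced:
            return letter
    return None


if __name__ == "__main__":
    # A species listed under both D1 and A2b is analysed under criterion A.
    print(primary_criterion(["D1", "A2b"]))
```

With this rule, a species listed under both a decline-based and a small-population criterion is counted once, under the decline-based criterion.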

Bibliographical categories
The references cited in the Red Lists were compiled and classified into four categories on the basis of the following definitions. The first category (A) is 'scientific literature' sensu stricto, that is, work published in scientific journals that is indexed in scientific data sources (Björk et al., 2010) and peer reviewed (Steven et al., 2011). Scientific literature meets methodological standards (Conn et al., 2003), is easily available (Pyšek et al., 2008) and represents, notwithstanding certain flaws, a widely accepted strategy for ensuring quality control in scientific research (Ferreira et al., 2015). The second category (B) includes 'referenced books', in particular, books identified by an International Standard Book Number (ISBN) and referenced academic publications, such as PhD theses. These data sources may be accessible yet allow some freedom from the peer-review process, and thus they summarize science from a personal perspective to present ideas in a liberating manner (McWilliams and Bauchinger, 2012). The third category (C) is 'grey literature', which includes publications that are not peer-reviewed (Conn et al., 2003) and articles that appear in non-indexed journals. Such articles are difficult to identify and to access through classical routes and often lack robust methodology and traceability (Corlett, 2011; Friess and Webb, 2011). Finally, the fourth category (D), 'expert opinion', includes estimates based on empirical knowledge or field experience that cannot be found even in grey literature.

Citation categories
We compiled and classified the way in which data, citations and sources were included in the IUCN-based assessments. First, we analysed whether it was possible to link the detailed information in assessments to citations and sources. Next, the reliability of the cited information was classified through consensus between the two authors (MC and MS) into one of four categories as defined in Todd et al. (2010):
(1) 'Clear support', when the cited article provides unequivocal support of the assertion via either statements in the text or the data presented.
(2) 'Ambiguous', when the material (either text or data) in the cited article has been interpreted one way, but could also be interpreted in other ways, including the opposite; when the assertion in the primary article is supported by a portion of the cited article, but that portion runs contrary to the overall thrust of the cited article; or when the assertion includes two or more components, but the cited article only supports one of them.
(3) 'No support', when the cited article does not in any way substantiate the assertion via either statements in the text or the data presented. The cited article may even contradict the assertion in the primary article.
(4) 'Empty citation', when the cited article simply cites other articles that support the assertion made in the primary article. Citing a review article is acceptable if the support for the assertion is, for example, a new insight or opinion offered by the author(s) of the review.
As in Todd et al. (2010), if the cited article was classified as 'empty citation' plus 'no support', 'no support' took precedence. If the cited article was classified as 'empty citation' plus 'ambiguous', 'ambiguous' took precedence. Another citation category ('unverifiable') was created for expert opinions and for cases in which the lack of published or available documents made it impossible to assess the links to the information source.
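The two precedence rules above can be written as a small lookup. The following Python sketch is illustrative only: the function name `resolve_citation_category` is hypothetical, and because the text defines precedence for just two combinations, any other combination of labels is deliberately rejected rather than guessed.

```python
# Category labels as quoted in the text (after Todd et al., 2010).
CLEAR = "clear support"
AMBIGUOUS = "ambiguous"
NO_SUPPORT = "no support"
EMPTY = "empty citation"


def resolve_citation_category(labels):
    """Collapse the labels assigned to one citation into a single category.

    Only the two precedence rules stated in the text are implemented:
    'empty citation' + 'no support' -> 'no support'
    'empty citation' + 'ambiguous'  -> 'ambiguous'
    """
    found = set(labels)
    if len(found) == 1:
        return next(iter(found))
    if found == {EMPTY, NO_SUPPORT}:
        return NO_SUPPORT
    if found == {EMPTY, AMBIGUOUS}:
        return AMBIGUOUS
    # The text gives no rule for other combinations (or for no label at all).
    raise ValueError(f"no precedence rule defined for {sorted(found)}")


if __name__ == "__main__":
    print(resolve_citation_category([EMPTY, NO_SUPPORT]))
```

Raising an error for undefined combinations keeps the sketch from silently inventing a rule that the classification scheme does not state.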

Statistical analysis
We used Fisher's tests (Millot, 2011) to test whether the proportion of threatened categories differed significantly between Red Lists for the overlapping sets of species, taking into account the small sample size in some tests and the need for standardized analyses for comparisons. Additionally, the odds ratio (ranging from 0 to infinity) in two-sided tests on contingency tables was used to analyze the direction of detected differences (Millot, 2011). The further the odds ratio was from 1 towards infinity, the more the first list in the test was characterized by the considered factor. The more the second list in the test was characterized by the considered factor, the closer the odds ratio was to 0. Using Fisher's tests and based on the odds ratio we also examined (i) the proportion of adduced criteria for threatened categories, (ii) the proportion of bibliographical categories and (iii) the proportion of citation categories in assessments among the different scales of Red Lists. Multiple estimation of significance values can increase type I errors (i.e., rejecting the null hypothesis H0 when H0 is true). Thus, in tables 3-6 we also present p-values corrected using a Benjamini-Hochberg procedure (BH; Benjamini and Hochberg, 1995) to control the false discovery rate (FDR), that is, the expected proportion of 'discoveries' (rejected null hypotheses H0) that are false (incorrect rejections). Nevertheless, this type of correction reduces statistical power. Using this kind of procedure for the more detailed studies would have implied a lower probability of finding significant results, increasing the risk of type II errors (not rejecting H0 when H0 is false), sometimes to an unacceptable level (Nakagawa, 2004). Ecological results suggested by BH p-values were generally similar to those indicated by the p-values of Fisher's tests. Thus, in the text we only detail and discuss the results from Fisher's tests.
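As a rough illustration of the tests described above, the following standard-library Python sketch computes a two-sided Fisher's exact test on a 2 x 2 table, the sample odds ratio, and Benjamini-Hochberg adjusted p-values. It is not the authors' analysis code: the function names are hypothetical, the counts in the usage example are invented, and the odds ratio shown is the simple cross-product ratio, whereas statistical packages may report a conditional estimate that differs slightly.

```python
from math import comb


def fisher_exact_2x2(a, b, c, d):
    """Two-sided Fisher's exact test on the table [[a, b], [c, d]].

    Rows could be two Red Lists and columns 'threatened' / 'not threatened'.
    Returns (sample odds ratio, p-value).
    """
    row1, row2, col1 = a + b, c + d, a + c
    total = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell.
        return comb(row1, x) * comb(row2, col1 - x) / total

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum the probabilities of all tables at least as extreme as observed.
    p_value = sum(p for p in map(prob, range(lo, hi + 1))
                  if p <= p_obs * (1 + 1e-7))
    if b * c == 0:
        odds = float("inf") if a * d > 0 else float("nan")
    else:
        odds = (a * d) / (b * c)
    return odds, min(1.0, p_value)


def benjamini_hochberg(p_values):
    """Return BH-adjusted p-values in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


if __name__ == "__main__":
    # Invented counts: list 1 has 12 threatened / 8 not, list 2 has 5 / 15.
    odds, p = fisher_exact_2x2(12, 8, 5, 15)
    print(odds, p)
    print(benjamini_hochberg([p, 0.20, 0.01]))
```

An odds ratio above 1 here means the first list has proportionally more species in the first column than the second list, which matches the reading of the odds ratio given in the text.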

Traceability
Although all 66 species from our sample were included in the global and national Red Lists, only 17-63 of these species were included in the European and sub-national lists (table 1). The classification criteria were clearly presented in almost all studied lists, the exceptions being the lists for Midi-Pyrénées, Provence-Alpes-Côte-d'Azur and Bretagne. Sources and clear links between data in assessments and sources were simultaneously available for global and European Red Lists but not for national and sub-national lists. Nevertheless, sources (but not links) were given for the Bourgogne, Limousin, Île-de-France, Midi-Pyrénées and Provence-Alpes-Côte-d'Azur sub-national lists (table 2). Accessibility and language constrained the literature review. The analysis of the types of information sources was based on 76.1 % of sources (303 identifiable sources out of 398) for the global Red List, 91.2 % of sources (1078/1182) for the European Red List and 100 % of sources for sub-national lists (3-97 sources, depending on the list). The links between information and sources were analyzed on the basis of (1) the traceable and accessible sources that determined the conservation status for the global Red List and (2) the relevant sources for France that were traceable and accessible in the European Red List. In the current global Red List, 48.9 % (65/133) of citations were untraceable or not reported in the bibliography section. Of the 68 traceable sources (51.1 % of sources) we managed to obtain 49 (72.1 %), thereby allowing us to analyze 51 % (177/347) of all links between information and source used in the evaluation of the conservation status of the species from our sample. For the European Red List, the sources cited for France represent 11.2 % (132) of the whole bibliography for the study species. In this sample, 3.8 % (5/132) of citations were untraceable. 
Of the 127 sources of identifiable literature (96.2 % of sources), 111 were accessible and obtained (87.4 %), thereby allowing us to analyze 83.2 % (559/672) of the links between information and source cited for France.
These inequalities in information availability determined which comparisons between lists and scales could be performed.
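The coverage percentages reported in this section can be recomputed from the counts given in the text. The short Python check below is a reader's aid under the assumption that each percentage is simply numerator over denominator, rounded to one decimal; the dictionary keys are descriptive labels introduced here.

```python
# (numerator, denominator, percentage as reported in the text)
REPORTED = {
    "global list, identifiable sources": (303, 398, 76.1),
    "European list, identifiable sources": (1078, 1182, 91.2),
    "global list, untraceable citations": (65, 133, 48.9),
    "global list, traceable sources": (68, 133, 51.1),
    "global list, sources obtained": (49, 68, 72.1),
    "global list, links analysed": (177, 347, 51.0),
    "European list (France), untraceable citations": (5, 132, 3.8),
    "European list (France), identifiable sources": (127, 132, 96.2),
    "European list (France), sources obtained": (111, 127, 87.4),
    "European list (France), links analysed": (559, 672, 83.2),
}


def percentage(numerator, denominator):
    """Return numerator / denominator as a percentage."""
    return 100.0 * numerator / denominator


def mismatches(table, tolerance=0.05):
    """List the entries whose recomputed value differs from the reported one."""
    return [name for name, (num, den, reported) in table.items()
            if abs(percentage(num, den) - reported) > tolerance]


if __name__ == "__main__":
    for name, (num, den, reported) in REPORTED.items():
        print(f"{name}: {percentage(num, den):.1f} % (reported {reported} %)")
    print("mismatches:", mismatches(REPORTED))
```

All ten reported values agree with the counts to within rounding.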

Conservation status
The proportion of conservation statuses attributed varied between the Red Lists at different spatial scales (fig. 1, table 3). Overall, Red Lists at larger scales were associated with more 'least concern' statuses and fewer 'threatened' statuses. Nevertheless, European and national lists did not differ significantly and proportions of statuses between lists at the same spatial scale did not show any significant differences when examined directly.
Nonetheless, the comparisons of the two global lists, before and after the update at the end of 2015, with the sub-global lists did not give identical results. Overall, the pre-update global Red List exhibited fewer threatened statuses than the European and national lists; however, these differences disappeared when we compared them with the post-update global Red List. In addition, the pre-update global Red List showed a few more differences than the post-update global Red List in comparison with the sub-national Red Lists.

Red List criteria
The proportion of the Red List criteria that was applied varied between the lists at different spatial scales (fig. 2, table 4). Criterion 'A' (reduction in population size) was used significantly more in the assessment of species at larger scales (global and European lists) than in national and sub-national lists. On the other hand, criterion 'D' (small regional population) was used more in sub-national lists than in national, European and global Red Lists.

Although some significant differences appeared between Red Lists in adducing criteria B and C, no clear trends emerged when examining spatial scale. Overall, the proportions of adduced Red List criteria did not vary between lists at comparable spatial scales (global and European), although a few exceptions did occur between sub-national lists (table 4).

Bibliographical categories
The Red Lists at different scales used different literature. For instance, Red Lists at the global scale were based more on scientific literature sensu stricto (about 50 %) than European (about 10 %) and sub-national Red Lists (0-6 %), which were, in turn, based more on grey literature (fig. 3, table 5). The Red List for geographical Europe was based more on scientific literature (12 %) and less on grey literature (53 %) than the list for the EU27 (7 % and 60 %, respectively). The European lists were based more on expert opinion (16-18 %) and less on books (15-17 %) than the global lists (0.4-1.6 % and 24.5-26.1 %, respectively). At comparable spatial scales, the updated global Red List was based on a smaller proportion of scientific literature (47 %) than the previous version (56 %) (fig. 3). There were also a few other differences between sub-national Red Lists related to their use of books and grey literature as sources (table 5).

Citation categories
The global and European Red Lists (the only ones in which the links between information and source could be analysed) had different proportions of citation categories (fig. 4, table 6). The results revealed that 'clearly supported' assertions were significantly more common in the global Red List (83 %) than in the European Red List.
In the global Red List, there were no significant differences in citation categories between citations from different bibliographical categories. However, in the European List, 'ambiguous' assertions were more frequently linked to grey literature than to books and 'not supported' assertions were more frequently linked to books than to scientific articles and grey literature.

Discussion
We conducted this study on game bird species in France, and therefore the applicability of the results to other species or other geographical areas remains an open question. Despite the large volume of work that this study required, sample size was on occasion a limiting factor when attempting to unravel some of the less evident differences, for instance in comparisons at equivalent geographic scales. Notwithstanding this limitation, this study highlights clear trends in scale-dependent patterns.

IUCN standards and transparency
We found clear differences in the transparency and the traceability of the assessment processes used for Red Lists at different geographic scales. Although all the Lists considered in this study were certified by logos and labels that are directly linked to the IUCN guidelines (or indirectly through the IUCN French committee), national and sub-national lists in France were the product of data and processes that could not be verified and were not presented in a transparent and accessible way. Thus, the national and sub-national Red Lists in France do not fully comply with the standardized processes 'to be applied without deviation or modification, if regional Red List authorities wish to state that their assessment follows the IUCN system' (IUCN, 2012).

Conservation status
Our results for game birds in France agreed with the predictions derived from previous studies (Keller et al., 2005) and highlight the fact that Red Lists at smaller geographical scales frequently give higher threat statuses than those at larger scales. The reason for this might be local variability in the status of species when compared to conservation status at larger scales. Species may exhibit a threatened status at a local level before exhibiting a threatened status at a global level. Nevertheless, the higher average threat statuses of birds on Red Lists at smaller scales might also be linked to the risk of pessimistic assessments at regional level owing to an overly narrow, scale-dependent geographical focus (Keller et al., 2005; IUCN, 2012). As seen above, a risk of subjectivity in the regional adjustment process has already been identified (Keller and Bollmann, 2004; Rodríguez et al., 2004; Keller et al., 2005), in addition to risks of 'scale-' and 'edge-effect' (Keller et al., 2005; Milner-Gulland et al., 2006; Brito et al., 2010). Thus, these potential risks and the unsupported data observed in Red Lists do not rule out the possibility that the higher threat statuses of Red Lists at smaller scales in France might be due, at least in part, to methodological factors or potential bias. Methodological improvements aimed at reducing the risk of subjectivity in the second step of the IUCN regional guidelines and at avoiding the 'scale-' and 'edge-effect' in assessments are therefore needed to strengthen the robustness of the Red Lists.

Red List criteria
Our results also match the predictions derived from previous articles on regional IUCN Red List assessments (Eaton et al., 2005; Keller et al., 2005), and underscore the fact that assessments of threatened status in lists at smaller scales in France were predominantly based on the criterion 'D' (small regional population), while criteria linked to reductions of population (criterion 'A') were predominant in Red Lists at larger (European and global) scales. This scale-dependent characteristic of the assessment process may highlight and emphasize 'scale-' or 'edge-effects' in regional Red Lists, as has previously been reported (Eaton et al., 2005; Keller et al., 2005). Variation in the most commonly adduced criteria may reflect what data are available to assess species at the scale in question, and so data availability may hamper the feasibility and reliability of assessments at spatial scales that are too small.

Bibliographical categories
Our results concur with the predictions for bibliographical categories in Red Lists at different geographic scales and reveal that Red Lists at smaller scales were in general based on grey literature, while global Red Lists were based more on scientific articles. These results underline the greater risk of subjectivity in Red Lists at smaller geographical scales. This risk is even greater in the European Red List than in global Red Lists due to its higher reliance on expert opinions, even though the revisions of the IUCN assessment criteria in recent decades have been explicitly oriented towards reducing subjectivity (Mace and Lande, 1991; Rodrigues et al., 2006). Thus, these differences may weaken the reliability of regional Red Lists compared to global Red Lists. Furthermore, the greater dependence on scientific literature in the previous global Red List than in the current global Red List highlights the need for greater attention to be paid (1) to ensuring that the IUCN system, designed to provide comprehensive, scientifically rigorous information, is reliably applied at global and regional levels (IUCN, 2012), and (2) to preventing the current uncertain assessment processes at regional level, which use predominantly grey literature and expert opinions, from being increasingly applied at a global scale. If society collectively wants science-based and reliable Red Lists at small spatial scales and if peer-reviewed literature is (as it currently is) the widely accepted strategy for ensuring quality control in scientific research (Ferreira et al., 2015), the publication of small-scale studies in peer-reviewed journals will be necessary despite the inherent difficulties of the publication process. The further integration of these potential needs by the editors of scientific journals might help promote more reliable Red Lists at regional scales in the future.

Citation categories
Finally, our results agree with the predictions on the citation categories and show that 'not supported' assertions were frequently linked to grey literature and books. Nevertheless, our results also highlight the high degree to which the type of Red List affects these results. The global Red List is predominantly based on assertions 'clearly supported' by the cited references. 'Ambiguous' and 'not supported' citations were a minority (less than 20 %) in the global Red List but were a majority (more than 65 %) in our sample from the European Red List. Numerous citations of books in the European Red List were particularly questionable; for instance, old references (e.g. from 1964, 1977 or 1994) were cited as support for assertions regarding recent short-term population trends (e.g. for Alauda arvensis, Aythya ferina, Gallinago gallinago, Limosa limosa, Numenius arquata, Tetrao urogallus). These results underline the fact that numerous assertions that were either ambiguous or not supported by the cited books and grey literature were included in regional assessments. Thus, the studied Red Lists, focused at different geographic scales, exhibit significant heterogeneity in both their foundations and their reliability. This highlights the need for additional reviewing processes for sub-global assessments and, in particular, checks of the accuracy of links between citations and primary sources. Furthermore, the relative lack of data and the difficulties of the publication process may increase the temptation to use grey literature in Red List assessments. Nevertheless, our results reveal that such practices increase the risk of including ambiguous and unsupported data in regional assessments.
Compensating for a lack of data (Butchart and Bird, 2010) by using grey literature may misrepresent the situation of species that would otherwise be regarded as 'data deficient' according to IUCN guidelines (DD conservation status), and thus may reduce the visibility of the ignored information that could justify financial support for additional research and species monitoring. Consequently, conservation decisions may be associated with an 'assessment dilemma': should we report data deficiencies strictly, to highlight the need for further research, and thus have to confront potential delays in evidence-based management? Or should we use all available information to promote reactive management, albeit at the risk of using unsupported and/or unreliable data in assessments, thereby jeopardizing the credibility of Red Lists and reducing the visibility of the need to improve species monitoring? Further studies directly focused on this dilemma could have constructive implications for the monitoring and consensual conservation of species.

Conclusion
This study mainly revealed information about monitoring schemes, data collection and data availability at different spatial scales for birds in Europe, and in France in particular. Sub-national to global Red Lists differed with regard to (i) the reported conservation status, (ii) the transparency and traceability of assessments, (iii) the most commonly adduced criteria, (iv) the categories of the sources synthesized during assessments, and (v) the reliability of assertions compared to cited references. Such variability between lists in terms of both data and transparency confirms that the sources used in the global Red List were cited as reliably as is usual in the ecological sciences (Todd et al., 2007, 2010). However, there were many ambiguous and unsupported citations in the European Red List and unverifiable assessments in the national and sub-national lists. These results thus open the door to further analysis and improvement of the reliability of Red Lists at regional levels and for other taxa, in order to strengthen evidence-based wildlife management and avoid a decrease in the credibility and prestige of IUCN-based Red Lists (Mrosovsky, 1997; Mrosovsky and Godfrey, 2008) in the eyes of researchers, the general public and other stakeholders.