The composition of species communities is changing rapidly through drivers such as habitat loss and climate change, with potentially serious consequences for the resilience of ecosystem functions on which humans depend. To assess such changes in resilience, we analyse trends in the frequency of species in Great Britain that provide key ecosystem functions, specifically decomposition, carbon sequestration, pollination, pest control and cultural values. For 4,424 species over four decades, there have been significant net declines among animal species that provide pollination, pest control and cultural values. Groups providing decomposition and carbon sequestration remain relatively stable, as fewer species are in decline and these are offset by large numbers of new arrivals into Great Britain. While there is general concern about degradation of a wide range of ecosystem functions, our results suggest actions should focus on particular functions for which there is evidence of substantial erosion of their resilience.
The accurate identification of species in images submitted by citizen scientists is currently a bottleneck for many data uses. Machine learning tools offer the potential to provide rapid, objective and scalable species identification for the benefit of many aspects of ecological science. Currently, most approaches only make use of image pixel data for classification. However, an experienced naturalist would also use a wide variety of contextual information such as the location and date of recording.
Here, we examine the automated identification of ladybird (Coccinellidae) records from the British Isles submitted to the UK Ladybird Survey, a volunteer‐led mass participation recording scheme. Each image is associated with metadata: a date, location and recorder ID, which can be cross‐referenced with other data sources to determine local weather at the time of recording, habitat types and the experience of the observer. We built multi‐input neural network models that synthesize metadata and images to identify records to species level.
We show that machine learning models can effectively harness contextual information to improve the interpretation of images. Against an image‐only baseline of 48.2%, we observe a 9.1 percentage‐point improvement in top‐1 accuracy with a multi‐input model compared to only a 3.6% increase when using an ensemble of image and metadata models. This suggests that contextual data are being used to interpret an image, beyond just providing a prior expectation. We show that our neural network models appear to be utilizing similar pieces of evidence as human naturalists to make identifications.
Metadata is a key tool for human naturalists. We show it can also be harnessed by computer vision systems. Contextualization offers considerable extra information, particularly for challenging species, even within small and relatively homogeneous areas such as the British Isles. Although complex relationships between disparate sources of information can be profitably interpreted by simple neural network architectures, there is likely considerable room for further progress. Contextualizing images has the potential to lead to a step change in the accuracy of automated identification tools, with considerable benefits for large‐scale verification of submitted records.
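The distinction between an ensemble and a multi‐input model can be made concrete with a small sketch. The abstract does not state how the image and metadata models were ensembled, so the product rule below is an assumption: it treats the metadata model's output as a prior that reweights the image model's output, which is the "prior expectation" behaviour the multi‐input model is reported to exceed. The species names and probabilities are invented for illustration only.

```python
def combine_as_prior(image_probs, metadata_probs):
    """Multiply per-species probabilities from two models and renormalise.

    Assumed ensemble rule (not taken from the source): the metadata model
    acts as a prior on the image model's output.
    """
    if image_probs.keys() != metadata_probs.keys():
        raise ValueError("both models must score the same species")
    joint = {sp: image_probs[sp] * metadata_probs[sp] for sp in image_probs}
    total = sum(joint.values())
    if total == 0:
        raise ValueError("models assign zero joint probability to every species")
    return {sp: p / total for sp, p in joint.items()}


def top1(probs):
    """Return the species with the highest probability."""
    return max(probs, key=probs.get)


# Hypothetical outputs for one record (illustrative values only).
image_probs = {"7-spot": 0.45, "harlequin": 0.40, "2-spot": 0.15}
metadata_probs = {"7-spot": 0.20, "harlequin": 0.60, "2-spot": 0.20}

combined = combine_as_prior(image_probs, metadata_probs)
print(top1(image_probs), "->", top1(combined))
```

An ensemble of this kind can only rescale the image model's scores after the fact. A multi‐input network, by contrast, sees both inputs jointly and so can in principle let the metadata change how the pixels themselves are interpreted.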
Summary
Policy‐makers increasingly demand robust measures of biodiversity change over short time periods. Long‐term monitoring schemes provide high‐quality data, often on an annual basis, but are taxonomically and geographically restricted. By contrast, opportunistic biological records are relatively unstructured but vast in quantity. Recently, these data have been applied to increasingly elaborate science and policy questions, using a range of methods. At present, we lack a firm understanding of which methods, if any, are capable of delivering unbiased trend estimates on policy‐relevant time‐scales.
We identified a set of candidate methods that employ data filtering criteria and/or correction factors to deal with variation in recorder activity. We designed a computer simulation to compare the statistical properties of these methods under a suite of realistic data collection scenarios. We measured the Type I error rates of each method–scenario combination, as well as the power to detect genuine trends.
We found that simple methods produced biased trend estimates and/or had low power. Most methods were robust to variation in sampling effort, but biases in spatial coverage, sampling effort per visit, and detectability, as well as turnover in community composition, all induced some methods to fail. No method was wholly unaffected by all forms of variation in recorder activity, although some performed well enough to be useful.
We warn against the use of simple methods. Sophisticated methods that model the data collection process offer the greatest potential to estimate timely trends, notably Frescalo and occupancy–detection models.
The potential of these methods and the value of opportunistic data would be further enhanced by assessing the validity of model assumptions and by capturing small amounts of information about sampling intensity at the point of data collection.
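The logic of measuring Type I error and power by simulation can be sketched with a toy example. The two tests below are stand‐ins chosen for brevity; they are not the candidate methods compared in the study (Frescalo and occupancy–detection models are far more elaborate). The scenario, visit numbers and detection probabilities are all assumptions: recorder effort doubles between two periods, and the species either has no trend or halves in detectability.

```python
import math
import random


def two_sided_p(z):
    """Two-sided p-value for a standard normal test statistic."""
    return math.erfc(abs(z) / math.sqrt(2))


def simulate_detections(n_visits, p_detect, rng):
    """Number of visits on which the species is recorded."""
    return sum(rng.random() < p_detect for _ in range(n_visits))


def naive_count_test(d1, d2):
    """Simple method: compare raw record counts, ignoring recorder effort."""
    n = d1 + d2
    if n == 0:
        return 1.0
    return two_sided_p((d2 - n / 2) / math.sqrt(n / 4))


def reporting_rate_test(d1, v1, d2, v2):
    """Effort-corrected method: two-proportion z-test on records per visit."""
    pooled = (d1 + d2) / (v1 + v2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / v1 + 1 / v2))
    if se == 0:
        return 1.0
    return two_sided_p((d2 / v2 - d1 / v1) / se)


def rejection_rates(p1, p2, v1=200, v2=400, reps=500, alpha=0.05, seed=1):
    """Fraction of simulated datasets in which each method reports a trend."""
    rng = random.Random(seed)
    naive = rate = 0
    for _ in range(reps):
        d1 = simulate_detections(v1, p1, rng)
        d2 = simulate_detections(v2, p2, rng)
        naive += naive_count_test(d1, d2) < alpha
        rate += reporting_rate_test(d1, v1, d2, v2) < alpha
    return naive / reps, rate / reps


# No true trend: any rejection is a Type I error.
type1_naive, type1_rate = rejection_rates(0.3, 0.3)
# Genuine decline: the rejection rate is the power.
power_naive, power_rate = rejection_rates(0.3, 0.15)
print(type1_naive, type1_rate, power_naive, power_rate)
```

Under these assumptions the raw‐count test mistakes rising effort for a trend when none exists, and fails to see a real decline that the extra effort masks. The rate‐based test is much less affected by either problem.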
The increasing availability of digital images, coupled with sophisticated artificial intelligence (AI) techniques for image classification, presents an exciting opportunity for biodiversity researchers to create new datasets of species observations. We investigated whether an AI plant species classifier could extract previously unexploited biodiversity data from social media photos (Flickr). We found over 60,000 geolocated images tagged with the keyword “flower” across an urban and rural location in the UK and classified these using AI, reviewing these identifications and assessing the representativeness of images. Images were predominantly biodiversity focused, showing single species. Non-native garden plants dominated, particularly in the urban setting. The AI classifier performed best when photos were focused on single native species in wild situations but also performed well at higher taxonomic levels (genus and family), even when images substantially deviated from this. We present a checklist of questions that should be considered when undertaking a similar analysis.
•AI image classifiers can create biodiversity datasets from social media imagery
•Flickr hosts many images of plants; some can be accurately classified to species by AI
•Images are spatially aggregated around tourist sites and under-represent native species
•Images focused on a single, non-horticultural, plant are most reliably identified
Recent reports of global biodiversity decline make it more important than ever to monitor biodiversity so that we can detect changes and infer their drivers. Online digital media, such as social media images, may be a new source of biodiversity observations, but they are far too numerous for a human to practically review. In this paper we apply an AI image classifier, designed to identify plants from images, to social media imagery to assess this method as a way to generate new biodiversity observations. We find that this approach is able to generate new data on species occurrence but that there are biases in both the social media data and the AI image classifier that need to be considered in analyses. This approach could be applied outside the biodiversity domain, to any phenomena of interest that may be captured in social media imagery. The checklist we provide at the end of this paper should therefore be of interest to anyone considering this approach to generating new data.
We apply newly developed AI image classifiers to large social media image datasets in order to assess whether new datasets of biodiversity observations can be generated in this way. We explore biases in both the dataset of images as well as in the ability of the AI image classifier to make accurate identifications and propose a checklist of questions researchers should ask themselves when considering this approach to data generation.
The structuring of wild animal populations can influence population dynamics, disease spread, and information transfer. Social network analysis potentially offers insights into these processes but is rarely, if ever, used to investigate more than one species in a community. We therefore compared the social, temporal and spatial networks of sympatric Myotis bats (M. nattereri (Natterer's bats) and M. daubentonii (Daubenton's bats)), and asked: (1) are there long-lasting social associations within species? (2) do the ranges occupied by roosting social groups overlap within or between species? (3) are M. daubentonii bachelor colonies excluded from roosting in areas used by maternity groups?
Using data on 490 ringed M. nattereri and 978 M. daubentonii from 379 colonies, we found that both species formed stable social groups encompassing multiple colonies. M. nattereri formed 11 mixed-sex social groups with few (4.3%) inter-group associations. Approximately half of all M. nattereri were associated with the same individuals when recaptured, with many associations being long-term (>100 days). In contrast, M. daubentonii were sexually segregated; only a quarter of pairs were associated at recapture after a few days, and inter-sex associations were not long-lasting. Social groups of M. nattereri and female M. daubentonii had small roost home ranges (mean 0.2 km² in each case). Intra-specific overlap was low, but inter-specific overlap was high, suggesting territoriality within but not between species. M. daubentonii bachelor colonies did not appear to be excluded from roosting areas used by females.
Our data suggest marked species- and sex-specific patterns of disease and information transmission are likely between bats of the same genus despite sharing a common habitat. The clear partitioning of the woodland amongst social groups, and their apparent reliance on small patches of habitat for roosting, means that localised woodland management may be more important to bat conservation than previously recognised.
FreshWater Watch is a global citizen science project that seeks to advance the understanding and stewardship of freshwater ecosystems across the globe through analysis of their physical and chemical properties by volunteers. To date, literature concerning citizen science has mainly focused on its potential to generate unprecedented volumes of data. In this paper, we focus instead on the data relating to the volunteer experience and ask key questions about volunteer engagement with the project. For example, we ask what factors influence: a) volunteer data submission following a training event and b) the number of water quality samples volunteers subsequently submit. We used a binomial model to identify the factors that influence the retention of volunteers after training. In addition, we used a generalized linear model (GLM) to examine the factors that affected the number of samples each citizen scientist submitted. In line with other citizen science projects, most people trained did not submit any data, and 1% of participants contributed 47% of the data. We found that the statistically significant factors associated with submission of data after training were: whether training was given on how to upload data, the number of volunteers that attended the training, whether the volunteer was assigned to a research team, the outside temperature, and the average engagement of others in the training group. The statistically significant factors associated with the quantity of data submitted were: the length of time volunteers were active in the project, whether training took place as part of a paid work day, the difficulty of the sampling procedure, how socially involved volunteers were in the project, average sampling group size, and engagement with online learning modules. Based on our results, we suggest that intrinsic motivation may be important for predicting volunteer retention after training and the number of samples collected subsequently.
We suggest that, to maximize the contribution of citizen science to our understanding of the world around us, there is an urgent need to better understand the factors that drive volunteer retention and engagement.
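The kind of participation summary reported above (the share of trained volunteers who never submit, and the share of data contributed by the most active 1%) can be computed as in the following sketch. The function name and the per‐volunteer counts are hypothetical and are not the FreshWater Watch data.

```python
def share_from_top(sample_counts, top_percent):
    """Fraction of all samples contributed by the most active `top_percent`
    of volunteers (rounded up to at least one volunteer)."""
    if not 0 < top_percent <= 100:
        raise ValueError("top_percent must be in (0, 100]")
    total = sum(sample_counts)
    if total == 0:
        return 0.0
    # Integer ceiling of len * top_percent / 100, avoiding float rounding.
    n_top = max(1, (len(sample_counts) * top_percent + 99) // 100)
    top = sorted(sample_counts, reverse=True)[:n_top]
    return sum(top) / total


# Hypothetical samples submitted by 100 trained volunteers.
counts = [0] * 60 + [1] * 25 + [5] * 10 + [20] * 4 + [95]

never_submitted = sum(c == 0 for c in counts) / len(counts)
top_1_percent_share = share_from_top(counts, 1)
print(never_submitted, top_1_percent_share)
```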
•Opportunistic citizen science is valuable for biodiversity trend analysis.
•Participants in citizen science vary in their recording behaviour.
•Estimated trends in occupancy were robust to differences in recorder behaviour.
•Species occupancy itself was more sensitive to some aspects of recorder behaviour.
Opportunistic species sightings submitted by citizen science volunteers are a valuable source of species data for trends analysis, as used in biodiversity indicators. However, projects collecting these data give people flexibility in where and when to make records, and the recording behaviour of participants varies between individuals. Here we tested the effect of recorder behaviour on outputs of the analysis of temporal biodiversity trends. Using a large (c. 3 million records), 20-year unstructured citizen science dataset of butterfly records in Great Britain, we manipulated recorder behaviour by constructing biased 50% subsamples of the dataset by preferentially including different types of recorders (based on high and low values of four metrics independently describing the temporal, spatial and taxonomic attributes of recorder behaviour). We found that, in general, the three outputs (namely: occupancy trend, precision of the trend, and the estimate of occupancy) showed relatively little deviation from random expectation across most of the different types of recorder behaviour. Occupancy trends showed least deviation, while estimates of occupancy itself showed greatest deviation from the random expectation. Regarding the recorder behaviours, the outputs were most sensitive to variation in ‘recorder potential’, which describes the difference between ‘thorough’ and ‘incidental’ recorders. Importantly, by demonstrating the robustness of occupancy trends to differences in recorder behaviour, this study provides support for the appropriate use of occupancy trend modelling for unstructured citizen science. However, we did not consider change in recorder behaviour over time, so further research is required to assess the impact of this on trend modelling. This study highlights the value of developing solutions to further increase the robustness of biodiversity trend analysis.
These solutions should include both analytical developments and enhancements in project design to engage participants.
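One way to construct a biased 50% subsample of the kind described above is weighted sampling without replacement, sketched below. The abstract does not give the exact procedure, so everything here is an assumption: the function name, the median split into "high" and "low" recorders, the `strength` weighting, and the use of Efraimidis–Spirakis keys to draw the sample.

```python
import random


def biased_half_subsample(records, metric_by_recorder, prefer_high=True,
                          strength=4.0, seed=0):
    """Draw half of the records without replacement, favouring records from
    recorders with high (or low) values of a behaviour metric."""
    rng = random.Random(seed)
    values = sorted(metric_by_recorder.values())
    median = values[len(values) // 2]

    def weight(record):
        is_high = metric_by_recorder[record["recorder"]] >= median
        return strength if is_high == prefer_high else 1.0

    # Efraimidis-Spirakis: give each record the key u ** (1 / w) and keep
    # the records with the largest keys.
    keyed = sorted(records,
                   key=lambda r: rng.random() ** (1.0 / weight(r)),
                   reverse=True)
    return keyed[: len(records) // 2]


# Hypothetical metric (e.g. a recorder-behaviour score) for four recorders.
metric = {"r1": 0.1, "r2": 0.3, "r3": 0.7, "r4": 0.9}
records = [{"recorder": r, "record_id": i} for r in metric for i in range(500)]

sample = biased_half_subsample(records, metric, prefer_high=True)
high_share = sum(metric[r["recorder"]] >= 0.7 for r in sample) / len(sample)
print(len(sample), round(high_share, 2))
```

Running the same trend analysis on such subsamples and on unbiased random 50% subsamples gives the "deviation from random expectation" comparison described in the abstract.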
We investigated the plant‐pollinator interactions of the Mexican grass‐carrying wasp Isodontia mexicana (native to North America and introduced in Europe in the 1960s) through the use of secondary data from citizen science observations. We applied a novel data exchange workflow from two global citizen science platforms, iNaturalist and Pl@ntNet. Images from iNaturalist of the wasp were used to query the Pl@ntNet application to identify possible plant species present in the pictures. Simultaneously, botanists manually identified the plants at family, genus and species levels and additionally documented flower color and biotic interactions. The goals were to calibrate Pl@ntNet's accuracy in relation to this workflow and to update the list of plant species that I. mexicana visits, as well as its flower color preferences, in its native and introduced ranges. In addition, we investigated the types and corresponding frequencies of other biotic interactions incidentally captured on the citizen scientists' images. Although the list of known host plants could be expanded, identifying the flora from images that predominantly show an insect proved difficult for both experts and the Pl@ntNet app. The workflow performs with a 75% probability of correct identification of the plant at the species level from a score of 0.8, and with over 90% chance of correct family and genus identification from a score of 0.5. Although the number of images above these scores may be limited due to the flower parts present on the pictures, our approach can help to give an overview of species interactions and generate more specific research questions. It could be used as a triaging method to select images for further investigation. Additionally, the manual analysis of the images has shown that the information they contain offers great potential for learning more about the ecology of an introduced species in its new range.
This study uses citizen‐scientist photographic records to examine pollinator‐plant interactions of the wasp Isodontia mexicana in its native and introduced range. We use a data sharing method by identifying the plants on iNaturalist images of the wasp using the Pl@ntNet API, a plant species identification application. We validated this method by simultaneously identifying the plants by experts and summarizing other interactions seen in the photographs to support the added value of the image data contributed by the public for ecological research.
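The triaging use suggested above can be sketched as a simple threshold rule on the identification score. The two thresholds (0.8 for species, 0.5 for genus and family) are the values reported in the abstract. The function name, the tier labels and the `(image_id, score)` input are hypothetical and do not reproduce the Pl@ntNet API response format.

```python
SPECIES_MIN_SCORE = 0.8        # ~75% chance of a correct species (from the text)
GENUS_FAMILY_MIN_SCORE = 0.5   # >90% chance of correct genus/family (from the text)


def triage(score):
    """Return the taxonomic level at which an identification is accepted."""
    if not 0.0 <= score <= 1.0:
        raise ValueError("score must lie between 0 and 1")
    if score >= SPECIES_MIN_SCORE:
        return "species"
    if score >= GENUS_FAMILY_MIN_SCORE:
        return "genus_family"
    return "manual_review"


# Hypothetical top scores for three images (illustrative values only).
candidates = [("img_001", 0.91), ("img_002", 0.62), ("img_003", 0.18)]
levels = {image_id: triage(score) for image_id, score in candidates}
print(levels)
```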
Volunteer recorders generate large amounts of biodiversity data through citizen science, which are used in conservation planning and policy decision‐making. Unstructured sampling, where the volunteer can record what they want, where they want, leads to spatial unevenness in these data. While there are many statistical techniques to account for the resulting biases, it may be possible to improve datasets by directing a subset of recorders to sample in the most informative locations, known as adaptive sampling. We investigated the potential for adaptive sampling to improve the performance of species distribution models built on citizen science data using simulated ecological communities.
We simulated ecological assemblages across Great Britain based on current butterfly data and modelled the distributions of each species. We then simulated the sampling of new data based on five adaptive sampling methods (one empirical method based simply on gap‐filling, and four model‐based methods using various measures from the model outputs) and one non‐adaptive method (a method in which recording continued in the current pattern), and re‐ran the species distribution models. In these, we also varied the rate of recording effort that was distributed according to adaptive sampling. The model predictions using the original and adaptively sampled data were compared to true species distributions to evaluate the performance of each method.
We found that all adaptive sampling approaches improved model performance, with greatest improvement for model‐based approaches compared to the empirical sampling method (i.e. simple gap‐filling). All four model‐based adaptive sampling approaches provided similar benefits for model outputs. Improvements in model performance were greatest when the amount of adaptive sampling changed from no uptake to 1% uptake, indicating that only a small amount of change in recorder behaviour is needed to improve model performance.
Directing volunteer recorders to places where records are most needed, based on information from model outputs, can improve species distribution models built on citizen science data, even with minimal uptake of suggested locations. Our results therefore suggest that adaptive sampling by recorders could be beneficial for real‐world citizen science datasets.
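The simplest of the approaches above, empirical gap‐filling with partial uptake, can be sketched as follows. This is a minimal illustration under assumed rules, not the simulation used in the study: a fraction `uptake` of new visits is sent to the least‐visited grid cell, and the rest follow the existing recording pattern in proportion to past visits. The cell labels and visit counts are hypothetical, and the model‐based methods are not represented.

```python
import random


def allocate_visits(visits_per_cell, n_new, uptake, seed=0):
    """Distribute `n_new` visits across cells; a fraction `uptake` fills gaps,
    the remainder is drawn in proportion to existing visits."""
    if not 0.0 <= uptake <= 1.0:
        raise ValueError("uptake must lie between 0 and 1")
    if sum(visits_per_cell.values()) == 0:
        raise ValueError("at least one cell needs existing visits")
    rng = random.Random(seed)
    cells = list(visits_per_cell)
    updated = dict(visits_per_cell)
    n_adaptive = round(n_new * uptake)

    # Non-adaptive recording continues in the current spatial pattern.
    weights = [visits_per_cell[c] for c in cells]
    for cell in rng.choices(cells, weights=weights, k=n_new - n_adaptive):
        updated[cell] += 1

    # Adaptive recording: each visit goes to the currently least-visited cell.
    for _ in range(n_adaptive):
        target = min(cells, key=lambda c: updated[c])
        updated[target] += 1
    return updated


# Hypothetical visit counts for four grid cells.
visits = {"A": 50, "B": 30, "C": 0, "D": 2}
no_uptake = allocate_visits(visits, n_new=20, uptake=0.0)
half_uptake = allocate_visits(visits, n_new=20, uptake=0.5)
print(no_uptake, half_uptake)
```

Without uptake, the unvisited cell stays unvisited, because recording proportional to past visits never reaches it. With uptake, the first adaptive visits go to that cell.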