Deep learning (DL) algorithms are the state of the art in automated classification of wildlife camera trap images. The challenge is that ecologists cannot know in advance how many images per species they need to collect for model training in order to achieve their desired classification accuracy. In fact, there is limited empirical evidence in the context of camera trapping to demonstrate that increasing sample size will lead to improved accuracy.
In this study we explore in depth how deep learning model performance changes with progressively increasing per-class (species) sample sizes. We also provide ecologists with an approximation formula to estimate a priori how many images per animal species they need to reach a given accuracy level. This will help ecologists allocate resources optimally and design efficient studies.
To investigate the effect of the number of training images, training sets with 10, 20, 50, 150, 500, and 1000 images per class were designed. Six deep learning architectures, namely ResNet-18, ResNet-50, ResNet-152, DenseNet-121, DenseNet-161, and DenseNet-201, were trained and tested on a common exclusive testing set of 250 images per class. The whole experiment was repeated on three similar datasets from Australia, Africa, and North America, and the results were compared. Simple regression equations are provided for practitioners to approximate model performance metrics. Generalised additive models (GAMs) are shown to be effective in modelling DL performance metrics based on the number of training images per class, tuning scheme, and dataset.
Overall, our trained models classified images with 0.94 accuracy (ACC), 0.73 precision (PRC), 0.72 true positive rate (TPR), and 0.03 false positive rate (FPR). Variation in model performance metrics among datasets, species, and deep learning architectures exists and is presented in detail in the discussion section. The ordinary least squares regression models explained 57%, 54%, 52%, and 34% of the expected variation in ACC, PRC, TPR, and FPR, respectively, as a function of the number of images available for training. Generalised additive models explained 77%, 69%, 70%, and 53% of deviance for ACC, PRC, TPR, and FPR, respectively.
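The four metrics reported above follow from standard per-class confusion counts. A minimal sketch (the counts in the example are illustrative, not the study's data):

```python
def classification_metrics(tp, fp, tn, fn):
    """Compute ACC, PRC, TPR and FPR from confusion counts for one class."""
    total = tp + fp + tn + fn
    acc = (tp + tn) / total                      # accuracy
    prc = tp / (tp + fp) if (tp + fp) else 0.0   # precision
    tpr = tp / (tp + fn) if (tp + fn) else 0.0   # true positive rate (recall)
    fpr = fp / (fp + tn) if (fp + tn) else 0.0   # false positive rate
    return acc, prc, tpr, fpr

# illustrative example: 230 of 250 test images of one species classified correctly
acc, prc, tpr, fpr = classification_metrics(tp=230, fp=40, tn=2210, fn=20)
```

In a multi-class setting these counts are computed per species and the metrics averaged, which is how a single overall figure such as 0.94 ACC arises.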
Predictive models were developed linking the number of training images per class, model architecture, and dataset to performance metrics. The ordinary least squares regression and generalised additive models developed provide a practical toolbox for estimating model performance with respect to different numbers of training images.
•Deep network classifier sample size requirements investigated for long-term wildlife monitoring sites.
•Training sample size was linked to model performance metrics such as accuracy.
•Logarithmic trends observed between training sample size and accuracy, precision, recall and false positive rate.
•Model performance asymptotes, with 150–500 images per class providing good accuracy.
•The effects of dataset, samples per class, model architecture and tuning strategy were investigated and compared.
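The logarithmic trend noted in the highlights can be captured with an ordinary least squares fit of accuracy against the log of the per-class sample size. A minimal sketch; the accuracy values below are illustrative placeholders, not the study's results:

```python
import math

# illustrative (not the study's) mean accuracies per training-set size
n_images = [10, 20, 50, 150, 500, 1000]
accuracy = [0.62, 0.71, 0.80, 0.88, 0.93, 0.94]

# ordinary least squares fit of the form: accuracy ≈ a + b * ln(n)
x = [math.log(n) for n in n_images]
k = len(x)
mx, my = sum(x) / k, sum(accuracy) / k
b = (sum((xi - mx) * (yi - my) for xi, yi in zip(x, accuracy))
     / sum((xi - mx) ** 2 for xi in x))
a = my - b * mx

def predict_accuracy(n_per_class):
    """Approximate expected accuracy for a given number of training images."""
    return a + b * math.log(n_per_class)
```

A GAM with a smooth term on log sample size plays the same role but relaxes the strict logarithmic shape, which is consistent with the higher deviance explained reported above.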
Camera traps are electrical instruments that emit sounds and light. In recent decades they have become a tool of choice in wildlife research and monitoring. The variability between camera trap models and the methods used are considerable, and little is known about how animals respond to camera trap emissions. It has been reported that some animals show a response to camera traps, and in research this is often undesirable, so it is important to understand why the animals are disturbed. We conducted laboratory-based investigations to test the audio and infrared optical outputs of 12 camera trap models. Camera traps were measured for audio outputs in an anechoic chamber; we also measured ultrasonic (n = 5) and infrared illumination outputs (n = 7) of a subset of the camera trap models. We then compared the perceptive hearing range (n = 21) and assessed the vision ranges (n = 3) of mammal species (where data existed) to determine if animals can see and hear camera traps. We report that camera traps produce sounds that are well within the perceptive range of most mammals' hearing and produce illumination that can be seen by many species.
Camera trapping is widely used in ecological studies. It is often considered nonintrusive simply because animals are not captured or handled. However, the emission of light and sound from camera traps can be intrusive. We evaluated the daytime and nighttime behavioral responses of four mammalian predators to camera traps in road‐based, passive (no bait) surveys, in order to determine how this might affect ecological investigations. Wild dogs, European red foxes, feral cats, and spotted‐tailed quolls all exhibited behaviors indicating they noticed camera traps. Their recognition of camera traps was more likely when animals were approaching the device than if they were walking away from it. Some individuals of each species retreated from camera traps and some moved toward them, with negative behaviors slightly more common during the daytime. There was no consistent response to camera traps within species; both attraction and repulsion were observed. Camera trapping is clearly an intrusive sampling method for some individuals of some species. This may limit the utility of conclusions about animal behavior obtained from camera trapping. Similarly, it is possible that behavioral responses to camera traps could affect detection probabilities, introducing as yet unmeasured biases into camera trapping abundance surveys. These effects demand consideration when utilizing camera traps in ecological research and will ideally prompt further work to quantify associated biases in detection probabilities.
Camera traps are heralded as nonintrusive survey tools. We evaluated animal responses to infrared camera traps, showing that behavioral responses may affect detection and thereby introduce an as yet unmeasured bias into abundance surveys.
•Development of a multi-purpose livestock vocalisation classification algorithm.
•Comparison of MFCC and Wavelet features.
•Machine learning classification using a Support Vector Machine.
•High accuracy obtained (sheep: 99.29%, cattle: 95.78%, dogs: 99.67%).
•Wavelet approach faster to compute (14.81%–15.38% faster).
Livestock vocalisations have been shown to contain information related to animal welfare and behaviour. Automated sound detection has the potential to facilitate a continuous acoustic monitoring system, for use in a range of Precision Livestock Farming (PLF) applications. There are few examples of automated livestock vocalisation classification algorithms, and we have found none capable of being easily adapted and applied to different species' vocalisations. In this work, a multi-purpose livestock vocalisation classification algorithm is presented, utilising audio-specific feature extraction techniques and machine learning models. To test the multi-purpose nature of the algorithm, three separate data sets were created targeting livestock-related vocalisations, namely sheep, cattle, and Maremma sheepdogs. Audio data was extracted from continuous recordings conducted on-site at three different operational farming enterprises, reflecting the conditions of real deployment. A comparison of Mel-Frequency Cepstral Coefficients (MFCCs) and Discrete Wavelet Transform-based (DWT) features was conducted. Classification was determined using a Support Vector Machine (SVM) model. High accuracy was achieved for all data sets (sheep: 99.29%, cattle: 95.78%, dogs: 99.67%). Classification performance alone was insufficient to determine the most suitable feature extraction method for each data set. Computational timing results revealed the DWT-based features to be markedly faster to produce (14.81–15.38% decrease in execution time). The results indicate the development of a highly accurate livestock vocalisation classification algorithm, which forms the foundation for an automated livestock vocalisation detection system.
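DWT-based features of the kind compared above are typically per-band energies from a multi-level wavelet decomposition. A minimal sketch using a plain Haar wavelet (the study's exact wavelet family and feature definition are not given here, so this is an assumed, simplified variant):

```python
import math

def haar_dwt(signal):
    """One level of the Haar discrete wavelet transform (approx + detail bands)."""
    approx = [(signal[i] + signal[i + 1]) / math.sqrt(2)
              for i in range(0, len(signal) - 1, 2)]
    detail = [(signal[i] - signal[i + 1]) / math.sqrt(2)
              for i in range(0, len(signal) - 1, 2)]
    return approx, detail

def dwt_energy_features(signal, levels=3):
    """Log-energy per wavelet band: one feature per detail level, plus the
    final approximation band. These feature vectors would feed an SVM."""
    features = []
    approx = list(signal)
    for _ in range(levels):
        approx, detail = haar_dwt(approx)
        features.append(math.log(sum(d * d for d in detail) + 1e-12))
    features.append(math.log(sum(a * a for a in approx) + 1e-12))
    return features
```

The speed advantage reported for DWT features is plausible given that each decomposition level is a single linear pass, whereas MFCCs require windowed FFTs, mel filtering, and a cosine transform per frame.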
A time‐consuming challenge faced by camera trap practitioners is the extraction of meaningful data from images to inform ecological management. An increasingly popular solution is automated image classification software. However, most solutions are not sufficiently robust to be deployed on a large scale due to a lack of location invariance when transferring models between sites. This prevents optimal use of ecological data, resulting in significant expenditure of time and resources to annotate and retrain deep learning models.
We present a method ecologists can use to develop optimized location invariant camera trap object detectors by (a) evaluating publicly available image datasets characterized by high intradataset variability in training deep learning models for camera trap object detection and (b) using small subsets of camera trap images to optimize models for high accuracy domain‐specific applications.
We collected and annotated three datasets of images of striped hyena, rhinoceros, and pigs, from the image‐sharing websites FlickR and iNaturalist (FiN), to train three object detection models. We compared the performance of these models to that of three models trained on the Wildlife Conservation Society and Camera CATalogue datasets, when tested on out‐of‐sample Snapshot Serengeti datasets. We then increased FiN model robustness by infusing small subsets of camera trap images into training.
In all experiments, the mean Average Precision (mAP) of the FiN trained models was significantly higher (82.33%–88.59%) than that achieved by the models trained only on camera trap datasets (38.5%–66.74%). Infusion further improved mAP by 1.78%–32.08%.
Ecologists can use FiN images for training deep learning object detection solutions for camera trap image processing to develop location invariant, robust, out‐of‐the‐box software. Models can be further optimized by infusion of 5%–10% camera trap images into training data. This would allow AI technologies to be deployed on a large scale in ecological applications. Datasets and code related to this study are open source and available on this repository: https://doi.org/10.5061/dryad.1c59zw3tx.
Achieve location invariant camera trap object detectors by using publicly available image data. Optimize object detectors for domain‐specific application using infusion of small subsets of camera trap images.
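The infusion step described above amounts to mixing a small fraction of camera trap images into the FiN training pool. A minimal sketch; the function name and the choice to express the 5%–10% rate as a share of the final training set are assumptions for illustration, not the study's exact procedure:

```python
import random

def infuse(fin_images, camera_trap_images, infusion_rate=0.10, seed=42):
    """Augment a FlickR/iNaturalist (FiN) training set with a small random
    subset of camera trap images, so camera trap images make up roughly
    `infusion_rate` of the final training set."""
    rng = random.Random(seed)  # fixed seed for a reproducible split
    n_infuse = round(len(fin_images) * infusion_rate / (1 - infusion_rate))
    n_infuse = min(n_infuse, len(camera_trap_images))
    infused = fin_images + rng.sample(camera_trap_images, n_infuse)
    rng.shuffle(infused)
    return infused
```

For example, infusing 900 FiN images at a 10% rate draws 100 camera trap images, giving a 1000-image training set.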
Image data is one of the primary sources of ecological data used in biodiversity conservation and management worldwide. However, classifying and interpreting large numbers of images is time and resource expensive, particularly in the context of camera trapping. Deep learning models have been used to achieve this task but are often not suited to specific applications due to their inability to generalise to new environments and their inconsistent performance. Models need to be developed for specific species cohorts and environments, but the technical skills required to achieve this are a key barrier to the accessibility of this technology to ecologists. Thus, there is a strong need to democratize access to deep learning technologies by providing an easy-to-use software application allowing non-technical users to train custom object detectors. U-Infuse addresses this issue by providing ecologists with the ability to train customised models using publicly available images and/or their own images without specific technical expertise. Auto-annotation and annotation editing functionalities minimize the constraints of manually annotating and pre-processing large numbers of images. U-Infuse is a free and open-source software solution that supports both multiclass and single class training and object detection, allowing ecologists to access deep learning technologies usually only available to computer scientists, on their own device, customised for their application, without sharing intellectual property or sensitive data. It provides ecological practitioners with the ability to (i) easily achieve object detection within a user-friendly GUI, generating a species distribution report and other useful statistics, (ii) custom train deep learning models using publicly available and custom training data, and (iii) achieve supervised auto-annotation of images for further training, with the benefit of editing annotations to ensure quality datasets.
Broad adoption of U-Infuse by ecological practitioners will improve ecological image analysis and processing by allowing significantly more image data to be processed with minimal expenditure of time and resources, particularly for camera trap images. Ease of training and the use of transfer learning mean domain-specific models can be trained rapidly and frequently updated without the need for computer science expertise or data sharing, protecting intellectual property and privacy.
Camera trapping is a relatively new addition to the wildlife survey repertoire in Australia. Its rapid adoption has been unparalleled in ecological science, but objective evaluation of camera traps and their application has not kept pace. With the aim of motivating practitioners to think more about selection and deployment of camera trap models in relation to research goals, we reviewed Australian camera trapping studies to determine how camera traps have been used and how their technological constraints may have affected reported results and conclusions. In the 54 camera trapping articles published between 1991 and 2013, mammals (86%) were studied more than birds (10%) and reptiles (3%), with small to medium-sized mammals being most studied. Australian camera trapping studies, like those elsewhere, have changed from more qualitative to more complex quantitative investigations. However, we found that camera trap constraints and limitations were rarely acknowledged, and we identified eight key issues requiring consideration and further research. These are: camera model, camera detection system, camera placement and orientation, triggering and recovery, camera trap settings, temperature differentials, species identification and behavioural responses of the animals to the cameras. In particular, alterations to animal behaviour by camera traps potentially have enormous influence on data quality, reliability and interpretation. The key issues were not considered in most Australian camera trap papers and require further study to better understand the factors that influence the analysis and interpretation of camera trap data and improve experimental design.
Foot-hold trapping is an important tool used in pest management programs in countries such as Australia, New Zealand and in North America. Research on humane trapping methods, including the addition of sedatives (Tranquilizer Trap Device) and toxins (Lethal Trap Device) to foot-hold traps to improve the welfare of trapped pest animals, is important. Lethal Trap Devices (LTDs) are being tested in Australia to determine if deploying a toxin with a foot-hold trap is effective at delivering a lethal dose of toxin to trapped predators. This study aimed to test whether fitting an LTD to two different foot-hold jaw traps (Victor Soft Catch #3 and Bridger #5) would affect the jaw closure time and as such affect capture rates. We found that two-spring Victor Soft Catch traps were faster (20.91 ms, SD 0.72) than four-spring Bridger #5 traps (26.79 ms, SD 0.48), even when fitted with a Lethal Trap Device. Fitting a Lethal Trap Device to either of these trap models did not affect closure time and as such would not have any effect on capture efficacy.
We present a software tool for the automated identification of animal species from camera trap images. The tool is intended to be used by ecologists both in the field and in the office. Users can download a pre-trained model specific to their location of interest and then upload the images from a camera trap to a laptop or workstation. The software will identify animals and other objects (e.g., vehicles) in images, provide a report file with the most likely species detections, and automatically sort the images into sub-folders corresponding to these species categories. False triggers (no visible object present) will also be filtered and sorted. Importantly, the software operates on the user's local machine (own laptop or workstation), not via an internet connection. This allows users access to state-of-the-art camera trap computer vision software in situ, rather than only in the office. The software also incurs minimal cost on the end-user, as there is no need for expensive data uploads to cloud services. Furthermore, processing the images locally on the user's end-device gives them control of their data and resolves privacy issues surrounding transfer and third-party access to users' datasets.
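The sort-into-subfolders behaviour described above can be sketched with the standard library; the function name and the detection-result format are illustrative assumptions, not the tool's actual API:

```python
import shutil
from pathlib import Path

def sort_detections(image_dir, detections, out_dir):
    """Move images into per-label subfolders based on the classifier's top
    detection. `detections` maps image filename -> most likely species label
    (or a label such as 'false_trigger' for empty frames)."""
    for name, label in detections.items():
        src = Path(image_dir) / name
        dest = Path(out_dir) / label
        dest.mkdir(parents=True, exist_ok=True)     # one subfolder per label
        shutil.move(str(src), str(dest / name))     # move image into it
```

Because everything runs on the local filesystem, no image ever needs to leave the user's machine, which is the privacy property the abstract emphasises.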
The temporal niche has received less attention than the spatial niche in ecological research on free-ranging animals. Most studies that have examined the effect of season on the diel activity patterns of small mammals have been conducted in temperate climates where daily temperatures and day length are important predictors of activity. Extremely seasonal rainfall in northern Australia possibly exerts a strong influence on mammalian activity due to the influx of food resources. Using camera traps set over a 3-year period, we documented the diel activity patterns of 5 species of small mammals co-occurring on Groote Eylandt, in the wet-dry tropics of northern Australia. All species were strictly nocturnal but some responded differently to the effect of season. The northern quoll (Dasyurus hallucatus) displayed a bimodal activity pattern that did not differ between the seasons. The northern brown bandicoot (Isoodon macrourus) displayed bimodal activity in the wet season and unimodal activity in the dry. The more sustained activity of I. macrourus in the dry season may be the result of this species utilizing more cellulose-rich food in times of lower insect abundance, whereas D. hallucatus possibly exhibits lower dietary plasticity. The northern hopping-mouse (Notomys aquilo) was consistently active throughout the night in both seasons. Conversely, the delicate mouse (Pseudomys delicatulus) showed great plasticity in its nocturnal activity, which altered significantly depending on both season and habitat. The disparity in activity pattern between these 2 rodents possibly reflects differences in predation risks. The grassland melomys (Melomys burtoni) was recorded only during the dry season in coastal grassland habitat, when its activity peaked sharply after nightfall.
Our study highlights the interspecific variation in small mammal activity between the wet and dry seasons in northern Australia, which may be explained by differences in diet, habitat use, and predation risk in these species.