Remote sensing data represent one of the most important sources for automated yield prediction. High temporal and spatial resolution, historical record availability, reliability, and low cost are key factors in predicting yields around the world. Yield prediction as a machine learning task is challenging, as reliable ground truth data are difficult to obtain, especially since new data points can only be acquired once a year during harvest. Factors that influence annual yields are plentiful, and data acquisition can be expensive, as crop-related data often need to be captured by experts or specialized sensors. A solution to both problems can be provided by deep transfer learning based on remote sensing data. Satellite images are free of charge, and transfer learning allows yield-related patterns to be recognized within countries where data are plentiful and the knowledge to be transferred to other domains, thus limiting the number of ground truth observations needed. Within this study, we examine the use of transfer learning for yield prediction, where the data preprocessing into histograms is unique. We present a deep transfer learning framework for yield prediction and demonstrate its successful application to transferring knowledge gained from US soybean yield prediction to soybean yield prediction within Argentina. We perform a temporal alignment of the two domains and improve transfer learning by applying several transfer learning techniques, such as L2-SP, BSS, and layer freezing, to overcome catastrophic forgetting and negative transfer problems. Lastly, we exploit spatio-temporal patterns within the data by applying a Gaussian process. We are able to improve the performance of soybean yield prediction in Argentina by a total of 19% in terms of RMSE and 39% in terms of R² compared to predictions without transfer learning and Gaussian processes. This proof of concept for advanced transfer learning techniques for yield prediction and remote sensing data in the form of histograms can enable successful yield prediction, especially in emerging and developing countries, where reliable data are usually limited.
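To make the L2-SP idea concrete, the following minimal sketch (PyTorch) penalizes deviation of the fine-tuned weights from the source-domain starting point, which is what mitigates catastrophic forgetting; the model architecture and hyperparameters are illustrative placeholders, not the study's implementation:

```python
import torch
import torch.nn as nn

# Minimal L2-SP sketch; model and alpha are illustrative assumptions.
model = nn.Sequential(nn.Linear(32, 64), nn.ReLU(), nn.Linear(64, 1))

# Snapshot of the source-trained weights: the "starting point" (SP).
source_params = {n: p.detach().clone() for n, p in model.named_parameters()}

def l2_sp_penalty(model, source_params, alpha=1e-2):
    """Penalize squared deviation of current weights from the source weights."""
    return alpha * sum((p - source_params[n]).pow(2).sum()
                       for n, p in model.named_parameters())

x, y = torch.randn(8, 32), torch.randn(8, 1)       # stand-in target-domain batch
loss = nn.functional.mse_loss(model(x), y) + l2_sp_penalty(model, source_params)
loss.backward()
```

Layer freezing, by contrast, would set `requires_grad = False` for early layers, and BSS would add a penalty on the smallest singular values of the feature batch; both combine with the same training loop.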
Land Surface Temperature (LST) is an important resource for a variety of tasks. The data are mostly free of charge and combine high spatial and temporal resolution with reliable data collection over a historical timeframe. When remote sensing is used to provide LST data, such as the MOD11 product based on the MODIS sensors aboard NASA satellites, data acquisition can be hindered by clouds or cloud shadows occluding the sensors' view of different areas of the world. This makes it difficult to take full advantage of the high resolution of the data. A common solution for interpolating LST data is statistical interpolation methods, such as fitting polynomials or thin plate spline interpolation. These methods have difficulties incorporating additional knowledge about the research area and learning local dependencies that can help with the interpolation process. We propose a novel approach to interpolating remote sensing LST data in a fixed research area that considers local ground-site air temperature measurements. The two-step approach consists of learning the LST from air temperature measurements at the locations of the ground-site weather stations, and interpolating the remaining missing values with partial convolutions within a U-Net deep learning architecture. Our approach improves the interpolation of LST for our research area by 44% in terms of RMSE when compared to state-of-the-art statistical methods. Due to the use of air temperature, we can provide 100% coverage, even when no valid LST measurements are available. The resulting gapless coverage of high-resolution LST data will help unlock the full potential of remote sensing LST data.
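A rough sketch of the partial-convolution building block (PyTorch) may clarify the second step: the convolution only aggregates observed pixels, renormalizes by the local coverage, and shrinks the hole mask with each layer. The layer sizes, mask convention, and renormalization follow the general partial-convolution formulation for inpainting and are assumptions, not the paper's exact implementation:

```python
import torch
import torch.nn.functional as F

class PartialConv2d(torch.nn.Module):
    def __init__(self, c_in, c_out, k=3):
        super().__init__()
        self.conv = torch.nn.Conv2d(c_in, c_out, k, padding=k // 2)
        self.register_buffer("ones", torch.ones(1, 1, k, k))  # counts valid pixels
        self.k2 = k * k

    def forward(self, x, mask):
        # mask: (N,1,H,W), 1 where LST is observed, 0 in cloud gaps.
        valid = F.conv2d(mask, self.ones, padding=self.conv.padding[0])
        out = self.conv(x * mask)                    # convolve observed pixels only
        bias = self.conv.bias.view(1, -1, 1, 1)
        scale = self.k2 / valid.clamp(min=1.0)       # renormalize by coverage
        hole = (valid == 0).float()
        out = ((out - bias) * scale + bias) * (1 - hole)  # zero at remaining holes
        return out, 1.0 - hole                       # updated (shrunken) mask

x = torch.randn(1, 1, 64, 64)                        # LST tile with gaps
m = (torch.rand(1, 1, 64, 64) > 0.3).float()         # 1 = valid observation
y, m_new = PartialConv2d(1, 16)(x, m)
```

Stacking such layers inside a U-Net lets the valid region grow from layer to layer until the whole tile is filled.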
Camera traps, an invaluable tool for biodiversity monitoring, capture wildlife activities day and night. In low-light conditions, near-infrared (NIR) imaging is commonly employed to capture images without disturbing animals. However, the reflection properties of NIR light differ from those of visible light in terms of chrominance and luminance, creating a notable gap in human perception. Thus, the objective is to enrich near-infrared images with colors, thereby bridging this domain gap. Conventional colorization techniques are ineffective due to the difference between NIR and visible light. Moreover, regular supervised learning methods cannot be applied because paired training data are rare. Solutions to such unpaired image-to-image translation problems currently rely mostly on generative adversarial networks (GANs), but diffusion models have recently gained attention for their superior performance in various tasks. In response, we present a novel framework utilizing diffusion models for the colorization of NIR images. This framework allows efficient implementation of various methods for colorizing NIR images. We show that NIR colorization is primarily controlled by the translation of the near-infrared intensities to those of visible light. The experimental evaluation of three implementations with increasing complexity shows that even a simple implementation inspired by visible-near-infrared (VIS-NIR) fusion rivals GANs. Moreover, we show that the third implementation is capable of outperforming GANs. With our study, we introduce an intersection field joining the research areas of diffusion models, NIR colorization, and VIS-NIR fusion.
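The core observation, that colorization hinges on translating NIR intensities into visible-light intensities, can be illustrated with a deliberately simplified sampling loop (PyTorch). Here `denoiser` stands in for a pretrained noise-prediction network on visible images; the noise schedule, the DDIM-style update, and the luminance-replacement guidance are illustrative assumptions, not one of the paper's three implementations:

```python
import torch

T = 1000
betas = torch.linspace(1e-4, 2e-2, T)
abar = torch.cumprod(1.0 - betas, dim=0)             # cumulative alpha products

def rgb_to_luma(x):
    r, g, b = x[:, 0:1], x[:, 1:2], x[:, 2:3]
    return 0.299 * r + 0.587 * g + 0.114 * b         # weights sum to 1

@torch.no_grad()
def colorize(denoiser, nir):
    # nir: (N,1,H,W) near-infrared intensities scaled to [-1, 1].
    x = torch.randn(nir.shape[0], 3, *nir.shape[2:])
    for t in reversed(range(T)):
        eps = denoiser(x, t)                                     # predicted noise
        x0 = (x - (1 - abar[t]).sqrt() * eps) / abar[t].sqrt()   # x0 estimate
        # Guidance: adding the residual to all channels sets luma exactly to NIR.
        x0 = x0 + (nir - rgb_to_luma(x0))
        if t > 0:                                    # deterministic DDIM-style step
            x = abar[t - 1].sqrt() * x0 + (1 - abar[t - 1]).sqrt() * eps
        else:
            x = x0
    return x.clamp(-1, 1)

# Smoke test with a dummy denoiser (a real network would be trained on VIS data):
# out = colorize(lambda z, t: torch.zeros_like(z), torch.rand(1, 1, 64, 64) * 2 - 1)
```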
Automated species identification and delimitation is challenging, particularly in rare and thus often scarcely sampled species, which do not allow sufficient discrimination of infraspecific versus interspecific variation. Typical problems arising from either low or exaggerated interspecific morphological differentiation are best met by automated machine learning methods that learn efficient and effective species identification from training samples. However, limited infraspecific sampling remains a key challenge in machine learning as well. In this study, we assessed whether a data augmentation approach may help to overcome the problem of scarce training data in automated visual species identification. The stepwise augmentation of data comprised image rotation as well as visual and virtual augmentation. The visual data augmentation applies classic approaches of data augmentation and the generation of artificial images using a generative adversarial network (GAN) approach. Descriptive feature vectors are derived from bottleneck features of a VGG-16 convolutional neural network, which are then stepwise reduced in dimensionality using Global Average Pooling and principal component analysis to prevent overfitting. Finally, the data augmentation employs synthetic additional sampling in feature space by an oversampling algorithm in vector space. Applied to four different image data sets, which include scarab beetle genitalia (Pleophylla, Schizonycha) as well as wing patterns of bees (Osmia) and cattleheart butterflies (Parides), our augmentation approach outperformed, in terms of identification accuracy, both a deep learning baseline approach trained on nonaugmented data and a traditional 2D morphometric approach (Procrustes analysis of scarab beetle genitalia).
Keywords: deep learning; image-based species identification; generative adversarial networks; limited infraspecific sampling; synthetic oversampling.
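A condensed sketch of such a feature pipeline, under stated assumptions (SMOTE as a concrete choice of oversampling algorithm, and made-up PCA dimensionality, sample counts, and class balance; the study's exact settings may differ):

```python
import numpy as np
from sklearn.decomposition import PCA
from imblearn.over_sampling import SMOTE
from tensorflow.keras.applications import VGG16

# VGG-16 bottleneck features -> global average pooling -> PCA -> oversampling.
backbone = VGG16(weights="imagenet", include_top=False)

images = np.random.rand(40, 224, 224, 3).astype("float32")  # stand-in images
labels = np.array([0] * 25 + [1] * 15)                       # imbalanced classes

bottleneck = backbone.predict(images)         # (40, 7, 7, 512) feature maps
feats = bottleneck.mean(axis=(1, 2))          # global average pooling -> (40, 512)
feats = PCA(n_components=20).fit_transform(feats)            # curb overfitting
feats_aug, labels_aug = SMOTE(k_neighbors=3).fit_resample(feats, labels)
```

The synthetic samples are created in the reduced feature space rather than in image space, which keeps the oversampling cheap and avoids generating implausible images.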
Monitoring insect populations is vital for estimating the health of ecosystems. Recently, insect population decline has been highlighted both in the scientific world and the media. Investigating such decline requires monitoring that includes adequate sampling and correct identification of the sampled taxa. This task requires extensive manpower and is time-consuming and difficult, even for experts, if the process is not automated. Here we propose DeepABIS, based on the concepts of the successful Automated Bee Identification System (ABIS), which allowed mobile field investigations including species identification of live bees in the field. DeepABIS features three important advancements. First, DeepABIS reduces the effort of training the system significantly by employing automated feature generation using deep convolutional neural networks (CNNs). Second, DeepABIS enables participatory sensing scenarios employing mobile smartphones and a cloud-based platform for data collection and communication. Third, DeepABIS is adaptable and transferable to other taxa beyond Hymenoptera, e.g., butterflies or flies. Current results show an average top-1 accuracy of 93.95% and a top-5 accuracy of 99.61% on data material of the ABIS project. Adapting DeepABIS to a butterfly dataset containing populations of the same butterfly species that are morphologically difficult to separate yields identification results with an average top-1 accuracy of 96.72% and a top-5 accuracy of 99.99%.
• Reducing training efforts by automated feature generation using deep convolutional networks
• Enabling participatory sensing using mobile smartphones and a cloud-based platform
• Transferable to other taxa beyond Hymenoptera using end-to-end learning
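Since the results above are reported as top-1 and top-5 accuracies, a small sketch (PyTorch; the class and sample counts are made up for illustration) shows how such scores are derived from a classifier's logits:

```python
import torch

def topk_accuracy(logits, targets, k=5):
    """Fraction of samples whose true class is among the k highest scores."""
    topk = logits.topk(k, dim=1).indices                # (N, k) predicted classes
    return (topk == targets.unsqueeze(1)).any(dim=1).float().mean().item()

logits = torch.randn(100, 120)                          # e.g., 120 bee species
targets = torch.randint(0, 120, (100,))
print(topk_accuracy(logits, targets, k=1), topk_accuracy(logits, targets, k=5))
```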
The development and application of modern technology are an essential basis for the efficient monitoring of species in natural habitats to assess the change of ecosystems, species communities and populations, and to understand important drivers of change. For estimating wildlife abundance, camera trapping in combination with three-dimensional (3D) measurements of habitats is highly valuable. Additionally, 3D information improves the accuracy of wildlife detection using camera trapping. This study presents a novel approach to 3D camera trapping featuring highly optimized hardware and software. This approach employs stereo vision to infer the 3D information of natural habitats and is designated as StereO CameRA Trap for monitoring of biodivErSity (SOCRATES). A comprehensive evaluation of SOCRATES shows not only a 3.23% improvement in animal detection (bounding box mAP75), but also its superior applicability for estimating animal abundance using camera trap distance sampling. The software and documentation of SOCRATES are openly provided.
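As a hedged illustration of how stereo vision yields the 3D information, the following sketch (OpenCV) computes a disparity map and converts it to metric depth; the file names, focal length, baseline, and matcher settings are assumptions, not the SOCRATES calibration:

```python
import cv2
import numpy as np

# Load a rectified stereo pair (assumed file names).
left = cv2.imread("left.png", cv2.IMREAD_GRAYSCALE)
right = cv2.imread("right.png", cv2.IMREAD_GRAYSCALE)

matcher = cv2.StereoSGBM_create(minDisparity=0, numDisparities=128, blockSize=5)
disparity = matcher.compute(left, right).astype(np.float32) / 16.0  # fixed-point

focal_px, baseline_m = 1200.0, 0.12          # assumed calibration values
valid = disparity > 0
depth_m = np.zeros_like(disparity)
depth_m[valid] = focal_px * baseline_m / disparity[valid]   # Z = f * B / d
```

Per-pixel depth of this kind is what allows observation distances for distance sampling to be read off directly at the detected animal's position.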
Wine growers prefer cultivars with looser bunch architecture because of the decreased risk of bunch rot. As a consequence, grapevine breeders have to select seedlings and new cultivars with regard to appropriate bunch traits. Bunch architecture is a mosaic of different single traits, which makes phenotyping labor-intensive and time-consuming. In the present study, a fast and high-precision phenotyping pipeline was developed. The optical sensor Artec Spider 3D scanner (Artec 3D, L-1466, Luxembourg) was used to generate dense 3D point clouds of grapevine bunches under lab conditions, and automated analysis software called the 3D-Bunch-Tool was developed to extract different single 3D bunch traits, i.e., the number of berries, berry diameter, single berry volume, total volume of berries, convex hull volume of grapes, bunch width, and bunch length. The method was validated on whole bunches of different grapevine cultivars and phenotypically variable breeding material. Reliable phenotypic data were obtained, showing highly significant correlations (up to r² = 0.95 for berry number) with ground truth data. Moreover, it was shown that the Artec Spider can be used directly in the field, where the acquired data show precision comparable to the lab application. This non-invasive and non-contact field application facilitates the first high-precision phenotyping pipeline based on 3D bunch traits in large plant sets.
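One of the listed traits, the convex hull volume of a bunch, can be derived from a point cloud in a few lines; this sketch assumes an illustrative file name and millimeter units, not the 3D-Bunch-Tool's internals:

```python
import numpy as np
from scipy.spatial import ConvexHull

points = np.loadtxt("bunch_scan.xyz")      # (N, 3) point cloud, coordinates in mm
hull = ConvexHull(points)
print(f"convex hull volume: {hull.volume / 1e3:.1f} cm^3")
```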
Rapid changes of the biosphere observed in recent years are caused by both small and large scale drivers, like shifts in temperature, transformations in land use, or changes in the energy budget of systems. While the latter processes are easily quantifiable, documentation of the loss of biodiversity and community structure is more difficult. Changes in organismal abundance and diversity are barely documented. Censuses of species are usually fragmentary and inferred from simple species lists for individual study sites that are often spatially, temporally, and ecologically unsatisfactory. Thus, detrimental global processes and their drivers often remain unrevealed. A major impediment to monitoring species diversity is the lack of human taxonomic expertise that is implicitly required for large-scale and fine-grained assessments. Another is the large amount of personnel and the associated costs needed to cover large scales, or the inaccessibility of remote but nonetheless affected areas.
To overcome these limitations we propose a network of Automated Multisensor stations for Monitoring of species Diversity (AMMODs) to pave the way for a new generation of biodiversity assessment centers. This network combines cutting-edge technologies with biodiversity informatics and expert systems that conserve expert knowledge. Each AMMOD station combines autonomous samplers for insects, pollen and spores, audio recorders for vocalizing animals, sensors for volatile organic compounds emitted by plants (pVOCs), and camera traps for mammals and small invertebrates. AMMODs are largely self-contained and have the ability to pre-process data (e.g., for noise filtering) prior to transmission to receiver stations for storage, integration, and analyses. Installation on sites that are difficult to access requires a sophisticated and challenging system design with an optimum balance between power requirements, bandwidth for data transmission, required service, and operation under all environmental conditions for years. An important prerequisite for automated species identification is the availability of databases of DNA barcodes, animal sounds, pVOCs, and images to serve as training data. AMMOD stations thus become a key component in advancing the field of biodiversity monitoring for research and policy by delivering biodiversity data at an unprecedented spatial and temporal resolution.
Camera traps have become important tools for the monitoring of animal populations. However, the study-specific estimation of animal detection probabilities is key if unbiased abundance estimates of unmarked species are to be obtained. Since this process can be very time-consuming, we developed the first semi-automated workflow for animals of any size and shape to estimate detection probabilities and population densities. In order to obtain observation distances, a deep learning algorithm is used to create relative depth images that are calibrated with a small set of reference photos for each location, with distances then extracted for animals automatically detected by MegaDetector 4.0. Animal detection by MegaDetector was generally independent of the distance to the camera trap for 10 animal species at two different study sites. If an animal was detected both manually and automatically, the difference in the distance estimates was often minimal at a distance of about 4 m from the camera trap. The difference increased approximately linearly for larger distances. Nonetheless, population density estimates based on manual and semi-automated camera trap distance sampling workflows did not differ significantly. Our results show that readily available software for semi-automated distance estimation can reliably be used within a camera trap distance sampling workflow, reducing the time required for data processing by more than 13-fold. This greatly improves the accessibility of camera trap distance sampling for wildlife research and management.
The estimation of observation distances is essential to derive detection probabilities in the context of population density estimation of unmarked species based on camera trapping data. We implemented a semi-automated deep learning approach to speed up this process and evaluated the resulting distance and population density estimates. The latter did not differ significantly from those obtained by a completely manual workflow.
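The per-location calibration step can be sketched as a simple regression from relative depth to metric distance; the numbers, the linear model, and the box-median readout below are assumptions for illustration, not the published workflow's exact procedure:

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Relative (monocular) depth values at reference markers of known distance.
rel_depth = np.array([0.21, 0.34, 0.47, 0.62, 0.81]).reshape(-1, 1)
true_dist_m = np.array([2.0, 4.0, 6.0, 8.0, 10.0])

calib = LinearRegression().fit(rel_depth, true_dist_m)

# Distance of a detected animal: a representative relative depth value
# (e.g., the median inside its MegaDetector bounding box) mapped through
# the per-location calibration.
animal_rel_depth = np.array([[0.55]])
print(f"estimated distance: {calib.predict(animal_rel_depth)[0]:.1f} m")
```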
Behavioral analysis of animals in the wild plays an important role for ecological research and conservation and has mostly been performed manually by researchers. We introduce an action detection approach that automates this process by detecting animals and performing action recognition on the detected animals in camera trap videos. Our action detection approach is based on SWIFT (segmentation with filtering of tracklets), which we have already shown to successfully detect and track animals in wildlife videos, and MAROON (mask-guided action recognition), an action recognition network that we introduce here. The basic ideas of MAROON are the exploitation of the instance masks detected by SWIFT and a triple-stream network. The instance masks enable more accurate action recognition, especially if multiple animals appear in a video at the same time. The triple-stream approach extracts features for both the motion and the appearance of the animal. We evaluate the quality of our action recognition on two self-generated datasets, from an animal enclosure and from the wild. These datasets contain videos of red deer, fallow deer, and roe deer, recorded both during the day and at night. MAROON improves the action recognition accuracy compared to other state-of-the-art approaches by an average of 10 percentage points on all analyzed datasets and achieves an accuracy of 69.16% on the Rolandseck Daylight dataset, in which 11 different action classes occur. Our action detection system makes it possible to drastically reduce the manual work of ecologists and at the same time gain new insights through standardized results.
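A minimal sketch (PyTorch) of the mask-guided idea: instance masks from the tracker zero out the background before frames enter the recognition streams, so co-occurring animals do not contaminate each other's features. The tensor shapes, the frame-difference motion cue, and the fusion comment are illustrative assumptions, not MAROON's actual streams:

```python
import torch

frames = torch.rand(16, 3, 224, 224)                  # T RGB frames of one tracklet
masks = (torch.rand(16, 1, 224, 224) > 0.5).float()   # per-frame instance masks

appearance = frames * masks                   # appearance stream: animal only
motion = (frames[1:] - frames[:-1]).abs()     # crude motion cue (frame differences)
motion = motion * masks[1:]                   # restrict motion to the animal
# A triple-stream network would extract features from the raw frames, the
# masked frames, and the motion cue, and fuse the per-stream embeddings
# (e.g., by concatenation) before the classification head.
```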