Involving members of the public in image classification tasks that can be tricky to automate is increasingly recognized as a way to complete large amounts of these tasks and promote citizen involvement in science. While this labor is usually provided for free, it is still limited, making it important for researchers to use volunteer contributions as efficiently as possible. Using volunteer labor efficiently becomes complicated when individual tasks are assigned to multiple volunteers to increase confidence that the correct classification has been reached. In this paper, we develop a system to decide when enough information has been accumulated to confidently declare an image to be classified and remove it from circulation. We use a Bayesian approach to estimate the posterior distribution of the mean rating in a binary image classification task. Tasks are removed from circulation when user-defined certainty thresholds are reached. We demonstrate this process using a set of over 4.5 million unique classifications by 2783 volunteers of over 190,000 images assessed for the presence/absence of cropland. If the system outlined here had been implemented in the original data collection campaign, it would have eliminated the need for 59.4% of volunteer ratings. Had this effort been applied to new tasks, it would have allowed an estimated 2.46 times as many images to have been classified with the same amount of labor, demonstrating the power of this method to make more efficient use of limited volunteer contributions. To simplify implementation of this method by other investigators, we provide cutoff value combinations for one set of confidence levels.
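The stopping rule described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the uniform Beta(1, 1) prior, the 0.5 decision boundary, the 0.95/0.05 thresholds, and all function names are assumptions chosen for illustration. With a uniform prior and integer counts, the posterior probability that the mean rating exceeds 0.5 reduces to a binomial sum, so only the standard library is needed. The last function shows that the reported 2.46 figure is consistent with 1 / (1 - 0.594), although the abstract does not say how the estimate was derived.

```python
from math import comb


def prob_positive(yes: int, total: int) -> float:
    """Posterior P(mean rating > 0.5) after `yes` positive ratings out of `total`.

    Assumes a uniform Beta(1, 1) prior (an assumption, not from the paper),
    so the posterior is Beta(yes + 1, total - yes + 1). For integer
    parameters the Beta tail probability equals a binomial sum.
    """
    n = total + 1
    return sum(comb(n, j) for j in range(yes + 1)) / 2 ** n


def decide(yes: int, total: int, upper: float = 0.95, lower: float = 0.05) -> str:
    """Retire an image once the posterior crosses a hypothetical threshold."""
    p = prob_positive(yes, total)
    if p >= upper:
        return "cropland"
    if p <= lower:
        return "not cropland"
    return "keep in circulation"


def labor_multiplier(fraction_saved: float) -> float:
    """Images classifiable per unit of labor if a fraction of ratings is unneeded."""
    return 1.0 / (1.0 - fraction_saved)


if __name__ == "__main__":
    for yes, total in [(3, 3), (4, 4), (2, 4), (0, 5)]:
        print(yes, total, round(prob_positive(yes, total), 4), decide(yes, total))
    print(round(labor_multiplier(0.594), 2))
```

Under these assumed thresholds, four unanimous ratings are enough to retire an image, while a split vote keeps it in circulation; stricter thresholds or an informative prior would shift those cutoffs.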
There is increasing evidence that smallholder farms contribute substantially to food production globally, yet spatially explicit data on agricultural field sizes are currently lacking. Automated field size delineation using remote sensing and the estimation of average farm size at subnational level using census data are two approaches that have been used. However, both have limitations: automatic field size delineation using remote sensing has not yet been implemented at a global scale, while the spatial resolution is very coarse when using census data. This paper demonstrates a unique approach to quantifying and mapping agricultural field size globally using crowdsourcing. A campaign was run in June 2017, where participants were asked to visually interpret very high resolution satellite imagery from Google Maps and Bing using the Geo‐Wiki application. During the campaign, participants collected field size data for 130 K unique locations around the globe. Using this sample, we have produced the most accurate global field size map to date and estimated the percentage of different field sizes, ranging from very small to very large, in agricultural areas at global, continental, and national levels. The results show that smallholder farms occupy up to 40% of agricultural areas globally, which means that, potentially, there are many more smallholder farms in comparison with the two different current global estimates of 12% and 24%. The global field size map and the crowdsourced data set are openly available and can be used for integrated assessment modeling, comparative studies of agricultural dynamics across different contexts, training and validation of remote sensing field size delineation, and potential contributions to the Sustainable Development Goal to end hunger, achieve food security and improved nutrition, and promote sustainable agriculture.
The idea that closer things are more related than distant things, known as ‘Tobler’s first law of geography’, is fundamental to understanding many spatial processes. If this concept applies to volunteered geographic information (VGI), it could help to efficiently allocate tasks in citizen science campaigns and help to improve the overall quality of collected data. In this paper, we use classifications of satellite imagery by volunteers from around the world to test whether local familiarity with landscapes helps their performance. Our results show that volunteers identify cropland slightly better within their home country, and do slightly worse as a function of linear distance between their home and the location represented in an image. Volunteers with a professional background in remote sensing or land cover did no better than the general population at this task, but they did not show the decline with distance that was seen among other participants. Even in a landscape where pasture is easily confused with cropland, regional residents demonstrated no advantage. Where we did find evidence for local knowledge aiding classification performance, the realized impact of this effect was tiny. Rather, the inherent difficulty of a task is a much more important predictor of volunteer performance. These findings suggest that, at least for simple tasks, the geographical origin of VGI volunteers has little impact on their ability to complete image classifications.
Abstract
The development of remotely sensed products such as land cover requires large amounts of high-quality reference data, needed to train remote sensing classification algorithms and for validation. However, due to the lack of sharing and the high costs associated with data collection, particularly ground-based information, the amount of reference data available has not kept up with the vast increase in the availability of satellite imagery, e.g. from Landsat, Sentinel and Planet satellites. To fill this gap, the Geo-Wiki platform for the crowdsourcing of reference data was developed, involving visual interpretation of satellite and aerial imagery. Here we provide an overview of the crowdsourcing campaigns that have been run using Geo-Wiki over the last decade, including the amount of data collected, the research questions driving the campaigns and the outputs produced such as new data layers (e.g. a global map of forest management), new global estimates of areas or percentages of land cover/land use (e.g. the amount of extra land available for biofuels) and reference data sets, all openly shared. We demonstrate that the amount of data collected and the scientific advances in the field of land cover and land use would not have been possible without the participation of citizens. A relatively conservative estimate reveals that citizens have contributed more than 5.3 years of the data collection efforts of one person over short, intensive campaigns run over the last decade. We also provide key observations and lessons learned from these campaigns including the need for quality assurance mechanisms linked to incentives to participate, good communication, training and feedback, and appreciating the ingenuity of the participants.
Citizens are increasingly becoming involved in data collection, whether for scientific purposes, to carry out micro-tasks, or as part of a gamified, competitive application. In some cases, volunteered data collection overlaps with that of mapping agencies, e.g., the citizen-based mapping of features in OpenStreetMap. LUCAS (Land Use Cover Area frame Sample) is one source of authoritative in-situ data that are collected every three years across EU member countries by trained personnel at a considerable cost to taxpayers. This paper presents a mobile application called FotoQuest Austria, which involves citizens in the crowdsourcing of in-situ land cover and land use data, including at locations of LUCAS sample points in Austria. The results from a campaign run during the summer of 2015 suggest that land cover and land use can be crowdsourced using a simple protocol based on LUCAS. This has implications for remote sensing as this data stream represents a new source of potentially valuable information for the training and validation of land cover maps as well as for area estimation purposes. Although the most detailed and challenging classes were more difficult for untrained citizens to recognize, the agreement between the crowdsourced data and the LUCAS data for basic high level land cover and land use classes in homogeneous areas (ca. 80%) shows clear potential. Recommendations for how to further improve the quality of the crowdsourced data in the context of LUCAS are provided so that this source of data might one day be accurate enough for land cover mapping purposes.
Very high resolution (VHR) satellite imagery from Google Earth and Microsoft Bing Maps is increasingly being used in a variety of applications from computer sciences to arts and humanities. In the field of remote sensing, one use of this imagery is to create reference data sets through visual interpretation, e.g., to complement existing training data or to aid in the validation of land-cover products. Through new applications such as Collect Earth, this imagery is also being used for monitoring purposes in the form of statistical surveys obtained through visual interpretation. However, little is known about where VHR satellite imagery exists globally or the dates of the imagery. Here we present a global overview of the spatial and temporal distribution of VHR satellite imagery in Google Earth and Microsoft Bing Maps. The results show an uneven availability globally, with biases in certain areas such as the USA, Europe and India, and with clear discontinuities at political borders. We also show that the availability of VHR imagery is currently not adequate for monitoring protected areas and deforestation, but is better suited for monitoring changes in cropland or urban areas using visual interpretation.
Raw observations (carrier-phase and code observations) from the Global Navigation Satellite System (GNSS) can now be accessed from Android mobile phones (Version 7.0 onwards). This paves the way for GNSS data to be utilized for low-cost precise positioning or in ionospheric or tropospheric applications. This paper presents results from data collection campaigns using the CAMALIOT mobile app. In the first campaign, 116.3 billion measurements from 11,828 mobile devices were collected from all continents. Although participation decreased during the second campaign, data are still being collected globally. In this contribution, we demonstrate the potential of volunteered geographic information (VGI) from mobile phones to fill data gaps in geodetic station networks that collect GNSS data, e.g. in Brazil, but also how the data can provide a denser set of observations than current networks in countries across Europe. We also show that mobile phones capable of dual-frequency reception, which is an emerging technology that can provide a richer source of GNSS data, are contributing in a substantial way. Finally, we present the results from a survey of participants to indicate that participation is diverse in terms of backgrounds and geography, where the dominant motivation for participation is to contribute to scientific research.
There are many new land use and land cover (LULC) products emerging, yet there is still a lack of in situ data for training, validation, and change detection purposes. The LUCAS (Land Use Cover Area frame Sample) survey is one of the few authoritative in situ field campaigns, which takes place every three years in European Union member countries. More recently, a study has considered whether citizen science and crowdsourcing could complement LUCAS survey data, e.g., through the FotoQuest Austria mobile app and crowdsourcing campaign. Although the data obtained from the campaign were promising when compared with authoritative LUCAS survey data, there were classes that were not well classified by the citizens. Moreover, the photographs submitted through the app were not always of sufficient quality. For these reasons, in the latest FotoQuest Go Europe 2018 campaign, several improvements were made to the app to facilitate interaction with the citizens contributing and to improve their accuracy in LULC identification. In addition to extending the locations from Austria to Europe, a change detection component (comparing land cover in 2018 to the 2015 LUCAS photographs) was added, as well as an improved LC decision tree. Furthermore, a near real-time quality assurance system was implemented to provide feedback on the distance to the target location, the LULC classes chosen and the quality of the photographs. Another modification was a monetary incentive scheme in which users received between 1 and 3 Euros for each successfully completed quest of sufficient quality. The purpose of this paper is to determine whether citizens can provide high quality in situ data on LULC through crowdsourcing that can complement LUCAS. We compared the results between the FotoQuest campaigns in 2015 and 2018 and found a significant improvement in 2018, i.e., a much higher match of LC between FotoQuest Go Europe and LUCAS. As shown by the cost comparisons with LUCAS, FotoQuest can complement LUCAS surveys by enabling continuous collection of large amounts of high quality, spatially explicit field data at a low cost.
The creation of crop type maps from satellite data has proven challenging and is often impeded by a lack of accurate in situ data. Street-level imagery represents a new potential source of in situ data that may aid crop type mapping, but it requires automated algorithms to recognize the features of interest. This paper aims to demonstrate a method for crop type (i.e., maize, wheat and others) recognition from street-level imagery based on a convolutional neural network using a bottom-up approach. We trained the model with a highly accurate dataset of crowdsourced labelled street-level imagery using the Picture Pile application. The classification results achieved an AUC of 0.87 for wheat, 0.85 for maize and 0.73 for others. Given that wheat and maize are two of the most common food crops grown globally, combined with an ever-increasing amount of available street-level imagery, this approach could help address the need for improved global crop type monitoring. Challenges remain in addressing the noise aspect of street-level imagery (i.e., buildings, hedgerows, automobiles, etc.) and uncertainties due to differences in the time of day and location. Such an approach could also be applied to developing other in situ data sets from street-level imagery, e.g., for land use mapping or socioeconomic indicators.
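Per-class AUC values such as those reported in this abstract are typically computed one class against the rest. The sketch below shows one common way to do so, using the rank interpretation of AUC (the probability that a randomly chosen positive example scores higher than a randomly chosen negative one, with ties counted as one half). The abstract does not describe the evaluation code, so the function names and the toy scores are hypothetical and only illustrate the metric.

```python
def binary_auc(scores, is_positive):
    """AUC as the fraction of positive/negative pairs ranked correctly."""
    pos = [s for s, y in zip(scores, is_positive) if y]
    neg = [s for s, y in zip(scores, is_positive) if not y]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    # Count a win when the positive scores higher, half a win on a tie.
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def one_vs_rest_auc(class_scores, true_labels):
    """Per-class AUC: each class is treated as positive, all others as negative."""
    return {
        cls: binary_auc(scores, [t == cls for t in true_labels])
        for cls, scores in class_scores.items()
    }


if __name__ == "__main__":
    # Toy data (made up): model scores per class for four images.
    true_labels = ["wheat", "maize", "others", "wheat"]
    class_scores = {
        "wheat": [0.9, 0.2, 0.4, 0.3],
        "maize": [0.05, 0.7, 0.3, 0.6],
        "others": [0.05, 0.1, 0.3, 0.1],
    }
    print(one_vs_rest_auc(class_scores, true_labels))
```

The pairwise loop is quadratic in the number of examples, which is fine for illustration; for large evaluation sets a sort-based computation or an established library routine would be the more practical choice.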
Here we present a geographically diverse, temporally consistent, and nationally relevant land cover (LC) reference dataset collected by visual interpretation of very high spatial resolution imagery, in a national-scale crowdsourcing campaign (targeting seven generic LC classes) and a series of expert workshops (targeting seventeen detailed LC classes) in Indonesia. The interpreters were citizen scientists (crowd/non-experts) and local LC visual interpretation experts from different regions in the country. We provide the raw LC reference dataset, as well as a quality-filtered dataset, along with the quality assessment indicators. We envisage that the dataset will be relevant for: (1) the LC mapping community (researchers and practitioners), i.e., as reference data for training machine learning algorithms and map accuracy assessment (with appropriate quality-filters applied), and (2) the citizen science community, i.e., as a sizable empirical dataset to investigate the potential and limitations of contributions from the crowd/non-experts, demonstrated for LC mapping in Indonesia for the first time to our knowledge, within the context of complementing traditional data collection by expert interpreters.