• A dataset for 3D apple location using Structure-from-Motion (SfM) is presented.
• The proposed fruit detection algorithm combines 2D instance segmentation and SfM.
• Mask R-CNN detections were projected onto 3D point clouds based on SfM.
• Fruit detection results in 2D images reported an F1-score of 0.82.
• Fruit location results in 3D point clouds reported an F1-score of 0.88.
The development of remote fruit detection systems able to identify and 3D locate fruits provides opportunities to improve the efficiency of agricultural management. Most current fruit detection systems are based on 2D image analysis. Although the use of 3D sensors is emerging, precise 3D fruit location is still a pending issue. This work presents a new methodology for fruit detection and 3D location consisting of: (1) 2D fruit detection and segmentation using the Mask R-CNN instance segmentation neural network; (2) 3D point cloud generation of detected apples using structure-from-motion (SfM) photogrammetry; (3) projection of 2D image detections onto 3D space; (4) false positive removal using a trained support vector machine. This methodology was tested on 11 Fuji apple trees containing a total of 1455 apples. Results showed that, by combining instance segmentation with SfM, the system performance increased from an F1-score of 0.816 (2D fruit detection) to 0.881 (3D fruit detection and location) with respect to the total amount of fruits. The main advantages of this methodology are the reduced number of false positives and the higher detection rate, while the main disadvantage is the high processing time required for SfM, which makes it presently unsuitable for real-time work. From these results, it can be concluded that the combination of instance segmentation and SfM provides high-performance fruit detection with high 3D data precision. The dataset has been made publicly available and an interactive visualization of fruit detection results is accessible at http://www.grap.udl.cat/documents/photogrammetry_fruit_detection.html.
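The projection step (3), mapping 2D detections onto the SfM point cloud, can be sketched as follows. This is a minimal illustration assuming known camera intrinsics and world-to-camera poses from the SfM reconstruction; the function names and the nearest-pixel mask lookup are ours, not the paper's implementation:

```python
import numpy as np

def project_points(points, K, R, t):
    """Project Nx3 world points into the image plane of one SfM camera.

    K: 3x3 intrinsics; R (3x3) and t (3,): world-to-camera pose, as
    recovered by the SfM pipeline. Returns pixel coordinates (Nx2)
    and camera-frame depths (N,).
    """
    cam = points @ R.T + t          # world -> camera frame
    depth = cam[:, 2]
    uv = cam @ K.T                  # apply intrinsics
    uv = uv[:, :2] / uv[:, 2:3]     # perspective division
    return uv, depth

def points_in_mask(points, mask, K, R, t):
    """Return indices of 3D points whose projection falls inside a
    binary 2D detection mask (H x W) and in front of the camera."""
    uv, depth = project_points(points, K, R, t)
    h, w = mask.shape
    u = np.round(uv[:, 0]).astype(int)
    v = np.round(uv[:, 1]).astype(int)
    valid = (depth > 0) & (u >= 0) & (u < w) & (v >= 0) & (v < h)
    idx = np.flatnonzero(valid)
    return idx[mask[v[idx], u[idx]]]
```

Repeating this lookup per detection mask and per camera yields, for each apple detection, the subset of the SfM point cloud it corresponds to.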
• The range-corrected intensity from RGB-D sensors is used for fruit detection.
• First multi-modal (color, depth, intensity) fruit detection dataset is presented.
• Faster R-CNN object detection network is adapted to be used with 5-channel images.
• An improvement of 4.46% in F1-score is achieved when using all modalities.
• Results show an F1-score of 0.8983 and a mean average precision of 94.8%.
Fruit detection and localization will be essential for future agronomic management of fruit crops, with applications in yield prediction, yield mapping and automated harvesting. RGB-D cameras are promising sensors for fruit detection given that they provide geometrical information along with color data. Some of these sensors work on the principle of time-of-flight (ToF) and, besides color and depth, provide the backscatter signal intensity. However, this radiometric capability has not been exploited for fruit detection applications. This work presents the KFuji RGB-DS database, composed of 967 multi-modal images containing a total of 12,839 Fuji apples. Compilation of the database allowed a study of the usefulness of fusing RGB-D and radiometric information obtained with Kinect v2 for fruit detection. To do so, the signal intensity was range corrected to overcome signal attenuation, obtaining an image that was proportional to the reflectance of the scene. A registration between RGB, depth and intensity images was then carried out. The Faster R-CNN model was adapted for use with five-channel input images: color (RGB), depth (D) and range-corrected intensity signal (S). Results show an improvement of 4.46% in F1-score when adding the depth and range-corrected intensity channels, obtaining an F1-score of 0.898 and an AP of 94.8% when all channels are used. From our experimental results, it can be concluded that the radiometric capabilities of ToF sensors give valuable information for fruit detection.
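The range correction described above can be sketched as follows. The inverse-square scaling is the standard physical model for return-signal attenuation, but the function names, the reference distance `d_ref` and the channel-stacking order are illustrative assumptions, not the paper's code:

```python
import numpy as np

def range_correct(intensity, depth, d_ref=1.0):
    """Compensate ToF backscatter intensity for range attenuation.

    Returned signal power falls off roughly with the square of the
    distance, so scaling by (d / d_ref)^2 yields a value roughly
    proportional to the reflectance of the target. d_ref is an
    arbitrary reference distance; pixels with no depth are left at 0.
    """
    intensity = np.asarray(intensity, dtype=float)
    depth = np.asarray(depth, dtype=float)
    out = np.zeros_like(depth)
    valid = depth > 0
    out[valid] = intensity[valid] * (depth[valid] / d_ref) ** 2
    return out

def stack_rgbds(rgb, depth, intensity):
    """Build a 5-channel RGB-D-S input: colour, depth and
    range-corrected intensity, assuming the images are registered."""
    s = range_correct(intensity, depth)
    return np.dstack([rgb, depth, s])
```

The resulting H x W x 5 array is the kind of input the adapted five-channel Faster R-CNN would consume.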
• Four fruit size estimation algorithms were evaluated and compared.
• A new method to estimate the visibility percentage of apples was proposed.
• Estimated visibility was used to identify and discriminate highly occluded apples.
• Best diameter estimations reported a MAE of 3.7 mm and an R2 of 0.91.
• A dataset for estimating apple size using Structure-from-Motion (SfM) is presented.
In-field fruit monitoring at different growth stages provides important information for farmers. Recent advances have focused on the detection and location of fruits, although the development of accurate fruit size estimation systems is still a challenge that requires further attention. This work proposes a novel methodology for automatic in-field apple size estimation which is based on four main steps: 1) fruit detection; 2) point cloud generation using structure-from-motion (SfM) and multi-view stereo (MVS); 3) fruit size estimation; and 4) fruit visibility estimation. Four techniques were evaluated in the fruit size estimation step. The first consisted of obtaining the fruit diameter by measuring the distance between the two most distant points of an apple detection (largest segment technique). The second and third techniques were based on fitting a sphere to apple points using least squares (LS) and M-estimator sample consensus (MSAC) algorithms, respectively. Finally, template matching (TM) was applied to fit an apple 3D model to apple points. The best results were obtained with the LS, MSAC and TM techniques, which showed mean absolute errors of 4.5 mm, 3.7 mm and 4.2 mm, and coefficients of determination (R2) of 0.88, 0.91 and 0.88, respectively. Besides fruit size, the proposed method also estimated the visibility percentage of the apples detected. This step showed an R2 of 0.92 with respect to the ground truth visibility, allowing automatic identification and discrimination of the measurements of highly occluded apples. The main disadvantage of the method is the high processing time required (in this work, 2760 s for 3D modelling of 6 trees), which limits its direct application in large agricultural areas. The code and the dataset have been made publicly available and a 3D visualization of results is accessible at http://www.grap.udl.cat/en/publications/apple_size_estimation_SfM.
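The least-squares (LS) sphere-fitting technique can be sketched with the standard algebraic formulation below. This is a minimal illustration (MSAC would add robust outlier rejection on top of the same sphere model), not the authors' implementation:

```python
import numpy as np

def fit_sphere_ls(points):
    """Least-squares sphere fit to an Nx3 apple point cluster.

    Uses the algebraic formulation |p|^2 = 2 c.p + (r^2 - |c|^2),
    which is linear in the centre c and the constant term, so a
    single lstsq call solves it. Returns (centre, diameter).
    """
    A = np.c_[2 * points, np.ones(len(points))]   # [2x, 2y, 2z, 1]
    b = (points ** 2).sum(axis=1)                 # x^2 + y^2 + z^2
    sol, *_ = np.linalg.lstsq(A, b, rcond=None)
    centre = sol[:3]
    radius = np.sqrt(sol[3] + centre @ centre)
    return centre, 2 * radius
```

The estimated diameter (twice the fitted radius) is the quantity compared against caliper ground truth in the size evaluation.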
• A system for simultaneous fruit location and canopy characterization is presented.
• The use of forced air flow helped to reduce the number of fruit occlusions.
• Results show a fruit location success of more than 80% of the annotated fruits.
• The system was able to predict the yield with an RMSE lower than 6%.
Yield monitoring and geometric characterization of crops provide information about orchard variability and vigor, enabling the farmer to make faster and better decisions in tasks such as irrigation, fertilization and pruning. When using LiDAR technology for fruit detection, fruit occlusions are likely to occur, leading to an underestimation of the yield. This work focuses on reducing fruit occlusions in LiDAR-based approaches, tackling the problem in two ways: applying forced air flow by means of an air-assisted sprayer, and using multi-view sensing. These approaches were evaluated in fruit detection, yield prediction and geometric crop characterization. Experimental tests were carried out in a commercial Fuji apple (Malus domestica Borkh. cv. Fuji) orchard. The system was able to detect and localize more than 80% of the visible fruits, predict the yield with a root mean square error lower than 6%, and characterize canopy height, width, cross-section area and leaf area. The forced air flow and multi-view approaches helped to reduce the number of fruit occlusions, locating 6.7% and 6.5% more fruits, respectively. Therefore, the proposed system can potentially monitor the yield and characterize the geometry of apple trees. Additionally, combining the forced air flow and multi-view sensing approaches presented significant advantages for fruit detection, as together they further reduced the number of fruit occlusions.
The use of 3D sensors combined with appropriate data processing and analysis has provided tools to optimise agricultural management through the application of precision agriculture. The recent development of low-cost RGB-Depth cameras has presented an opportunity to introduce 3D sensors into the agricultural community. However, due to the sensitivity of these sensors to highly illuminated environments, it is necessary to know under which conditions RGB-D sensors are capable of operating. This work presents a methodology to evaluate the performance of RGB-D sensors under different lighting and distance conditions, considering both geometrical and spectral (colour and NIR) features. The methodology was applied to evaluate the performance of the Microsoft Kinect v2 sensor in an apple orchard. The results show that sensor resolution and precision decreased significantly under middle to high ambient illuminance (>2000 lx). However, this effect was minimised when measurements were conducted closer to the target. In contrast, illuminance levels below 50 lx affected the quality of colour data and may require the use of artificial lighting. The methodology was useful for characterizing sensor performance throughout the full range of ambient conditions in commercial orchards. Although Kinect v2 was originally developed for indoor conditions, it performed well under a range of outdoor conditions.
• An error propagation analysis of GNSS over the laser scan measurements.
• A scan matching approach fused with GNSS measurements.
• Analysis of the crown surface area, crown volume, and crown porosity.
Currently, 3D point clouds are obtained via LiDAR (Light Detection and Ranging) sensors to compute vegetation parameters and enhance agricultural operations. However, such a point cloud is intrinsically dependent on the GNSS (global navigation satellite system) antenna used to obtain the absolute position of the sensor within the grove. Therefore, the error associated with the GNSS receiver is propagated to the LiDAR readings and, thus, to the crown or orchard parameters. In this work, we first describe the error propagation of GNSS over the laser scan measurements. Second, we present our proposal to overcome this effect based only on the LiDAR readings. The proposal uses a scan matching approach to reduce the error associated with the GNSS receiver, fusing the information from the scan matching estimations with the GNSS measurements. In the experiments, we statistically analyze the dependence of the grove parameters extracted from the 3D point cloud (specifically crown surface area, crown volume, and crown porosity) on the localization error. We carried out 150 trials with positioning errors ranging from 0.01 meters (ground truth) to 2 meters. When using only GNSS as the localization system, the results showed that the errors associated with the estimation of vegetation parameters increased by more than 100% when the positioning error was equal to or greater than 1 meter. In contrast, when our proposal was used as the localization system, the estimation of orchard parameters improved by 20% overall for the same 1-meter case. For lower GNSS positioning errors, the estimation of orchard parameters improved by up to 50% overall. These results suggest that our work could lead to better decisions in agricultural operations based on foliar parameter measurements, without the use of external hardware.
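The fusion of relative scan-matching estimates with absolute GNSS fixes can be illustrated with a minimal one-dimensional Kalman-style filter. This is a simplified sketch of the idea only; the paper's estimator and its noise models are more elaborate, and all names and variances here are illustrative:

```python
import numpy as np

def fuse_gnss_scan_matching(gnss, sm_delta, var_gnss, var_sm, x0):
    """Fuse noisy absolute GNSS fixes with accurate-but-relative
    scan-matching displacements along one axis.

    gnss:     absolute position fixes, one per scan (N,)
    sm_delta: scan-matching displacement between consecutive scans (N,)
    Returns the filtered position estimates (N,).
    """
    x, p = x0, var_gnss          # state and its variance
    out = []
    for z, dx in zip(gnss, sm_delta):
        # predict with the (accurate, drift-prone) relative motion
        x, p = x + dx, p + var_sm
        # correct with the (noisy, absolute) GNSS fix
        k = p / (p + var_gnss)
        x, p = x + k * (z - x), (1 - k) * p
        out.append(x)
    return np.array(out)
```

Because scan matching constrains scan-to-scan motion tightly, the fused trajectory jitters far less than the raw GNSS track, which is what reduces the distortion of the crown parameters.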
The development of reliable fruit detection and localization systems provides an opportunity to improve crop value and management by limiting fruit spoilage and optimising harvesting practices. Most proposed systems for fruit detection are based on RGB cameras and thus are affected by intrinsic constraints, such as variable lighting conditions. This work presents a new technique that uses a mobile terrestrial laser scanner (MTLS) to detect and localise Fuji apples. An experimental test focused on Fuji apple trees (Malus domestica Borkh. cv. Fuji) was carried out. A 3D point cloud of the scene was generated using an MTLS composed of a Velodyne VLP-16 LiDAR sensor synchronised with an RTK-GNSS satellite navigation receiver. A reflectance analysis of tree elements was performed, obtaining mean apparent reflectance values of 28.9%, 29.1%, and 44.3% for leaves, branches and trunks, and apples, respectively. These results suggest that the apparent reflectance parameter (at 905 nm wavelength) can be useful for detecting apples. For that purpose, a four-step fruit detection algorithm was developed. By applying this algorithm, a localization success of 87.5%, an identification success of 82.4%, and an F1-score of 0.858 were obtained in relation to the total amount of fruits. These detection rates are similar to those obtained by RGB-based systems, but with the additional advantage of providing direct 3D fruit location information that is not affected by sunlight variations. From the experimental results, it can be concluded that LiDAR-based technology and, particularly, its reflectance information have potential for remote apple detection and 3D location.
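The reflectance analysis suggests a simple first filtering step: thresholding apparent reflectance between the leaf/branch means (about 29%) and the apple mean (about 44%). The sketch below shows only this step with an illustrative threshold; it is not the paper's full four-step detection algorithm:

```python
import numpy as np

def reflectance_candidates(points, reflectance, threshold=0.37):
    """Select apple candidate points from a LiDAR point cloud by
    thresholding apparent reflectance at 905 nm.

    The 0.37 threshold sits between the ~0.29 mean of leaves/branches
    and the ~0.44 mean of apples reported in the paper, but is an
    illustrative value; subsequent steps (e.g. clustering candidate
    points into individual fruits) are omitted here.
    """
    reflectance = np.asarray(reflectance, dtype=float)
    mask = reflectance >= threshold
    return points[mask], mask
```

In a full pipeline, the surviving points would then be clustered so that each cluster can be reported as one 3D-located fruit.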
• Apples present higher IR reflectance than leaves and trunks.
• A new methodology for fruit detection using an MTLS has been developed.
• Results show a success rate of 82.4%, with 10.4% false detections.
• The methodology is not affected by lighting and it provides apple locations in 3D.
Fruit size at harvest is an economically important variable for high-quality table fruit production in orchards and vineyards. In addition, knowing the number and size of the fruit on the tree is essential in the framework of precise production, harvest, and postharvest management. A prerequisite for analysis of fruit in a real-world environment is detection and segmentation from the background signal. In the last five years, deep learning convolutional neural networks have become the standard method for automatic fruit detection, achieving F1-scores higher than 90% as well as real-time processing speeds. At the same time, different methods have been developed for, mainly, fruit size and, more rarely, fruit maturity estimation from 2D images and 3D point clouds. These sizing methods are focused on a few species like grape, apple, citrus, and mango, resulting in mean absolute error values of less than 4 mm in apple fruit. This review provides an overview of the most recent methodologies developed for in-field fruit detection/counting and sizing, as well as a few upcoming examples of maturity estimation. Challenges, such as sensor fusion, highly varying lighting conditions, occlusions in the canopy, shortage of public fruit datasets, and opportunities for research transfer, are discussed.
• Deep learning has meant a breakthrough in fruit detection/counting.
• There is a need for publicly available codes and datasets for fruit detection.
• Advanced fruit sizing methods based on 3D data do not require calibration targets.
• Future fruit sizing methods should provide results at different degrees of occlusion.
• Commercial fruit counting and sizing systems will be facilitated by lightweight CNNs.
• A CNN for simultaneous modal and amodal instance segmentation was implemented.
• Amodal segmentation was applied to predict visible and occluded apple regions.
• Modal and amodal masks were used to estimate the % of visibility of apples.
• The method was robust for detection and sizing of partially occluded fruits.
• Results showed an F1 = 0.86 and a MAE = 2.93 mm for detection and sizing, respectively.
The detection and sizing of fruits with computer vision methods is of interest because it provides relevant information to improve the management of orchard farming. However, the presence of partially occluded fruits limits the performance of existing methods, making reliable fruit sizing a challenging task. While previous fruit segmentation works limit segmentation to the visible region of fruits (known as modal segmentation), in this work we propose an amodal segmentation algorithm to predict the complete shape of each fruit, including its visible and occluded regions. To do so, an end-to-end convolutional neural network (CNN) for simultaneous modal and amodal instance segmentation was implemented. The predicted amodal masks were used to estimate the fruit diameters in pixels. Modal masks were used to identify the visible region and measure the distance between the apples and the camera using the depth image. Finally, the fruit diameters in millimetres (mm) were computed by applying the pinhole camera model. The method was developed with a Fuji apple dataset consisting of 3925 RGB-D images acquired at different growth stages with a total of 15,335 annotated apples, and was subsequently tested in a case study to measure the diameter of Elstar apples at different growth stages. Fruit detection results showed an F1-score of 0.86 and the fruit diameter results reported a mean absolute error (MAE) of 4.5 mm and R2 = 0.80 irrespective of fruit visibility. Besides the diameter estimation, modal and amodal masks were used to automatically determine the percentage of visibility of measured apples. This feature was used as a confidence value, improving the diameter estimation to MAE = 2.93 mm and R2 = 0.91 when limiting the size estimation to fruits detected with a visibility higher than 60%. The main advantages of the present methodology are its robustness for measuring partially occluded fruits and the capability to determine the visibility percentage.
The main limitation is that depth images were generated by means of photogrammetry methods, which limits the efficiency of data acquisition. To overcome this limitation, future works should consider the use of commercial RGB-D sensors. The code and the dataset used to evaluate the method have been made publicly available at https://github.com/GRAP-UdL-AT/Amodal_Fruit_Sizing.
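The pinhole-model sizing described above can be sketched as follows. The bounding-extent pixel diameter and the median-depth distance are simplifying assumptions for illustration, not the paper's exact mask measurements:

```python
import numpy as np

def diameter_mm(amodal_mask, modal_mask, depth_map, focal_px):
    """Estimate fruit diameter with the pinhole camera model.

    The pixel diameter is taken from the amodal mask (full predicted
    shape, here its bounding extent as a simple proxy); the distance
    is the median depth over the modal (visible) mask. focal_px is
    the focal length in pixels; the result is in the same units as
    depth_map (e.g. mm).
    """
    ys, xs = np.nonzero(amodal_mask)
    d_px = max(xs.max() - xs.min(), ys.max() - ys.min()) + 1
    z = np.median(depth_map[modal_mask])
    return d_px * z / focal_px   # similar triangles: size = px * Z / f

def visibility(modal_mask, amodal_mask):
    """Visible fraction of the fruit = modal area / amodal area."""
    return modal_mask.sum() / amodal_mask.sum()
```

The visibility ratio is what the paper uses as a confidence value, e.g. discarding size measurements for fruits with visibility below 60%.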
• Colour and depth images provided by an RGB-D Azure Kinect DK camera are used.
• Automatic algorithms to estimate the size and weight of apples are evaluated.
• Allometric models show a high ability to predict the weight of apples.
• Different sizing and weight prediction algorithms can be used for yield prediction.
Data acquired using an RGB-D Azure Kinect DK camera were used to assess different automatic algorithms to estimate the size and predict the weight of non-occluded and occluded apples. The programming of the algorithms included: (i) the extraction of images of regions of interest (ROI) using manual delimitation of bounding boxes or binary masks; (ii) estimating the lengths of the major and minor geometric axes for the purpose of apple sizing; and (iii) predicting the final weight by allometric modelling. In addition to the use of bounding boxes, the algorithms also allowed other post-mask settings (circles, ellipses and rotated rectangles) to be implemented, and different depth options (distance between the RGB-D camera and the fruits detected) for subsequent sizing through the application of thin lens theory. Both linear and nonlinear allometric models demonstrated the ability to predict apple weight with a high degree of accuracy (R2 greater than 0.942 and RMSE < 16 g). With respect to non-occluded apples, the best weight predictions were achieved using a linear allometric model including both the major and minor axes of the apples as predictors. The mean absolute percentage error (MAPE) ranged from 5.1% to 5.7%, with respective RMSE of 11.09 g and 13.02 g, depending on whether circles, ellipses, or bounding boxes were used to adjust fruit shape. The results were therefore promising and open up the possibility of implementing reliable in-field apple measurements in real time. Importantly, final weight prediction error and intermediate size estimation errors (from sizing algorithms) interact, but in a way that is not easily quantifiable when weight allometric models with implicit prediction error are used. In addition, allometric models should be reviewed when applied to other apple cultivars, fruit development stages or even different fruit growth conditions depending on canopy management.
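The linear allometric modelling step can be sketched with ordinary least squares. The model form (weight as a linear function of the major and minor axes) follows the abstract, but the function names and the coefficients in the usage example are illustrative, not those fitted in the paper:

```python
import numpy as np

def fit_allometric(major_mm, minor_mm, weight_g):
    """Fit a linear allometric model
        weight = b0 + b1 * major + b2 * minor
    by ordinary least squares. Returns (b0, b1, b2).
    """
    X = np.c_[np.ones(len(major_mm)), major_mm, minor_mm]
    beta, *_ = np.linalg.lstsq(X, weight_g, rcond=None)
    return beta

def predict_weight(beta, major_mm, minor_mm):
    """Predict apple weight (g) from fitted coefficients and axes (mm)."""
    return beta[0] + beta[1] * major_mm + beta[2] * minor_mm
```

Once calibrated against reference weighings, the same model can be applied to axis lengths estimated in the field (from circles, ellipses or bounding boxes) to turn sizing output into a weight, and hence a yield, prediction.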