Automatic ship detection from spaceborne systems such as satellites or aircraft has attracted considerable attention in sea surface monitoring because of its numerous applications in the military and civilian fields. In this context, processing satellite images on board would reduce latency, especially in emergency situations. In this paper, a hardware-oriented (HO) ship detection system based on a customized Convolutional Neural Network (CNN), here referred to as HO-ShipNet, is proposed and tested on a revised version of the “Ships in Satellite Imagery” (SSI) Kaggle dataset, reporting a detection accuracy of up to 95%. Furthermore, the explainability of HO-ShipNet is investigated by means of explainable Artificial Intelligence (xAI) techniques, namely Local Interpretable Model-Agnostic Explanations (LIME) and Occlusion Sensitivity Analysis (OSA), in order to understand the reasoning behind the HO-ShipNet decisions by identifying the most important input features and, consequently, to ensure the trustworthiness of the model itself. Finally, HO-ShipNet is also implemented on the heterogeneous Xilinx xc7z045ffg900-2 SoC Field Programmable Gate Array (FPGA), outperforming state-of-the-art FPGA-based accelerators dealing with high-resolution frames. The promising results encourage the potential deployment of the proposed system for on-board applications.
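The Occlusion Sensitivity Analysis (OSA) mentioned above can be sketched generically: slide an occluding patch over the input and record how much the classifier's ship score drops when each region is hidden. This is a minimal illustration of the general technique, not the authors' implementation; the `predict` callable, patch size, and fill value are placeholders.

```python
import numpy as np

def occlusion_sensitivity(image, predict, patch=8, stride=8, fill=0.0):
    """Generic OSA sketch: occlude each patch of `image` and record the
    drop in the model's scalar score. `predict` is any callable mapping
    an (H, W) array to a score (the actual CNN is assumed external).
    Larger drops mark regions the model relies on for its decision."""
    base = predict(image)
    H, W = image.shape
    heat = np.zeros(((H - patch) // stride + 1, (W - patch) // stride + 1))
    for i, r in enumerate(range(0, H - patch + 1, stride)):
        for j, c in enumerate(range(0, W - patch + 1, stride)):
            occluded = image.copy()
            occluded[r:r + patch, c:c + patch] = fill
            heat[i, j] = base - predict(occluded)   # score drop = importance
    return heat
```

The resulting heat map can be upsampled and overlaid on the input image to visualize which pixels drive the classification.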
Clouds and cloud shadows block land surface information in optical satellite images. Accurate detection of clouds and cloud shadows can help exclude these contaminated pixels in further applications. Existing cloud screening methods are challenged by cloudy regions where most satellite images are contaminated by clouds. To solve this problem for landscapes where the typical frequency of cloud-free observations of a pixel is too small for existing methods to mask clouds and shadows, this study presents a new Automatic Time-Series Analysis (ATSA) method to screen clouds and cloud shadows in multi-temporal optical images. ATSA has five main steps: (1) calculate cloud and shadow indices to highlight cloud and cloud shadow information; (2) obtain an initial cloud mask by unsupervised classifiers; (3) refine the initial cloud mask by analyzing the time series of a cloud index; (4) predict the potential shadow mask using geometric relationships; and (5) refine the potential shadow mask by analyzing the time series of a shadow index. Compared with existing methods, ATSA needs fewer predefined parameters, does not require a thermal infrared band, and is more suitable for areas with persistent clouds. The performance of ATSA was tested with Landsat-8 OLI images, Landsat-4 MSS images, and Sentinel-2 images at three sites. The results were compared with a popular method, Function of Mask (Fmask), which has been adopted by the USGS to produce Landsat cloud masks. These tests show that ATSA and Fmask produce comparable cloud and shadow masks in some of the tested images. However, ATSA consistently obtains high accuracy in all images, while Fmask has large omission or commission errors in some images. The quantitative accuracy was assessed using manual cloud masks of 15 images. The average cloud producer's accuracy of these 15 images is as high as 0.959, and the average shadow producer's accuracy reaches 0.901.
Given that it can be applied to old satellite sensors and is effective in cloudy regions, ATSA is a valuable supplement to existing cloud screening methods.
•ATSA screens thick clouds, thin haze and cloud shadows in optical time series.
•ATSA needs fewer parameters and is suitable for areas with persistent clouds.
•Cloud and shadow masks from ATSA are more accurate than those of existing methods.
•ATSA requires few clear observations in time series and no thermal band.
•ATSA can be applied to historical optical images with limited bands.
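Step (3) of ATSA, refining the initial cloud mask by analyzing the time series of a cloud index, can be illustrated with a minimal sketch. The statistical rule below (flag observations whose cloud index exceeds the per-pixel clear-sky mean by k standard deviations) is an assumption for illustration only, not the published algorithm.

```python
import numpy as np

def refine_cloud_mask(cloud_index, initial_mask, k=1.5):
    """Illustrative time-series refinement of a cloud mask.

    cloud_index : (T, H, W) array of a cloud index over T acquisitions.
    initial_mask: (T, H, W) boolean mask from an unsupervised classifier.
    A pixel is confirmed cloudy only if its index clearly exceeds the
    temporal statistics of its presumed-clear observations.
    """
    idx = np.where(initial_mask, np.nan, cloud_index)   # hide cloudy obs
    clear_mean = np.nanmean(idx, axis=0)                # per-pixel clear baseline
    clear_std = np.nanstd(idx, axis=0)
    threshold = clear_mean + k * clear_std              # k controls sensitivity
    refined = cloud_index > threshold                   # re-test every observation
    return refined
```

The same pattern, with a shadow index and a lower-tail test, would correspond to step (5).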
Tropical forests concentrate the largest diversity of species on the planet and play a key role in maintaining environmental processes. Due to the importance of those forests, there is growing interest in mapping their components and obtaining information at the individual tree level to conduct reliable satellite-based forest inventory for biomass and species distribution qualification. Individual tree crown information could be manually gathered from high-resolution satellite images; however, to achieve this task at large scale, an algorithm to identify and delineate each tree crown individually, with high accuracy, is a prerequisite. In this study, we propose the application of a convolutional neural network, the Mask R-CNN algorithm, to perform tree crown detection and delineation. The algorithm uses very high-resolution satellite images of tropical forests. The results obtained are promising: the Recall, Precision, and F1 score values obtained were 0.81, 0.91, and 0.86, respectively. In the study site, a total of 59,062 tree crowns were delineated. These results suggest that this algorithm can be used to assist the planning and conducting of forest inventories. As the algorithm is based on a Deep Learning approach, it can be systematically trained and used for other regions.
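The reported Recall, Precision, and F1 values relate through standard detection counts. A minimal sketch follows; the tp/fp/fn counts in the usage example are illustrative round numbers, not the study's actual counts.

```python
def detection_scores(tp, fp, fn):
    """Recall, Precision and F1 for crown detection, from matched counts.

    tp: predicted crowns matched to a reference crown (e.g. IoU above a cutoff),
    fp: unmatched predictions, fn: unmatched reference crowns.
    """
    recall = tp / (tp + fn)
    precision = tp / (tp + fp)
    f1 = 2 * precision * recall / (precision + recall)
    return recall, precision, f1
```

For example, 81 true positives, 8 false positives and 19 missed crowns yield scores that round to the 0.81 / 0.91 / 0.86 pattern reported above.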
Since landslide detection using the combination of AIRSAR data and GIS-based susceptibility mapping has rarely been conducted in tropical environments, the aim of this study is to compare and validate support vector machine (SVM) and index of entropy (IOE) methods for landslide susceptibility assessment in the Cameron Highlands area, Malaysia. For this purpose, ten conditioning factors and observed landslides were detected from AIRSAR data, WorldView-1, and SPOT 5 satellite images. A spatial database was generated including a total of 92 landslide locations encompassing the same number of observed and detected landslides, which was divided into training (80%; 74 landslide locations) and validation (20%; 18 landslide locations) datasets. Comparing observed and detected landslides using the root mean square error (RMSE) indicated an error of only 16.3%, which is fairly acceptable. The validation process was performed using statistical measures and the area under the receiver operating characteristic (AUROC) curve. The results of the validation process indicated that the SVM model has the highest values of sensitivity (88.9%), specificity (77.8%), accuracy (83.3%), Kappa (0.663), and AUROC (84.5%), followed by the IOE model. Overall, the SVM model applied to detected landslides is considered a promising technique that could be tested and utilized for landslide susceptibility assessment in tropical environments.
Optical remote sensing from satellites such as SPOT, Landsat, Sentinel-2, and Terra MODIS offers exceptional solutions for monitoring the Earth's surface. Indeed, these satellites deliver multi-source, multi-date images. These are big data: voluminous, heterogeneous, and not only complex to process but also time-consuming. Studies have been carried out on the formalization of experimental protocols in satellite image processing, proposing automated processing chains. However, these solutions are case-specific and cannot be reused in other contexts. Our work proposes a new approach for generalizing optical satellite image processing chains. This approach follows knowledge-capitalization standards, with the objective of ensuring interoperability in the use of optical satellite images. Metamodels for the generalization of processing chains, an execution method, generalization rules, and implementation prototypes have been developed. These results are obtained through theoretical analyses as well as experiments on the processing of multi-date, multi-source optical satellite images applied in multisectoral domains. Thus, this research addresses today's interoperability problem in the processing of optical satellite images.
In this paper, we propose a general deep learning-based framework, named Sat-MVSF, to perform three-dimensional (3D) reconstruction of the Earth’s surface from multi-view optical satellite images. The framework is a complete processing pipeline, including pre-processing, a multi-view stereo (MVS) network for satellite imagery (Sat-MVSNet), and post-processing. The pre-processing handles the geometric and radiometric configuration of the multi-view images and their cropping. The cropped multi-view patches are then fed into Sat-MVSNet, which includes deep feature extraction, rational polynomial camera (RPC) warping, pyramid cost volume construction, regularization, and regression, to obtain the height maps. Erroneous matches are then filtered out and a digital surface model (DSM) is generated in the post-processing. Considering the complexity and diversity of real-world scenes, we also introduce a self-refinement strategy that does not require any ground-truth labels to enhance the performance and robustness of the Sat-MVSF framework. We comprehensively compare the proposed framework with popular commercial software and open-source methods to demonstrate the potential of the proposed deep learning framework. On the WHU-TLC dataset, where the images are captured with a three-line camera (TLC), the proposed framework outperforms all the other solutions in terms of reconstruction fineness, and also outperforms most of the other methods in terms of efficiency. On the challenging MVS3D dataset, where the images are captured by the WorldView-3 satellite at different times and in different seasons, the proposed framework also exceeds the existing methods when using the model pretrained on aerial images together with the introduced self-refinement strategy, demonstrating a high generalization ability.
We also note that the lack of training samples hinders research in this field, and the availability of more high-quality open-source training data would greatly accelerate research into deep learning-based MVS satellite image reconstruction. The code will be available at https://gpcv.whu.edu.cn/data.
Optical satellite images are a critical data source; however, cloud cover often compromises their quality, hindering image applications and analysis. Consequently, effectively removing clouds from optical satellite images has emerged as a prominent research direction. Recent advances in deep learning-based cloud removal methods have been significant, but image generation quality still needs improvement. Diffusion models have demonstrated remarkable success in diverse image-generation tasks, showcasing their potential in addressing this challenge. This paper presents a novel framework called DiffCR, which leverages conditional guided diffusion with deep convolutional networks for high-performance cloud removal from optical satellite imagery. Specifically, we introduce a decoupled encoder for conditional image feature extraction, providing a robust color representation to ensure the close similarity of appearance information between the conditional input and the synthesized output. Moreover, we propose a novel and efficient time and condition fusion block within the cloud removal model to accurately simulate the correspondence between the appearance in the conditional image and the target image at a low computational cost. Extensive experimental evaluations on three commonly used benchmark datasets demonstrate that DiffCR consistently achieves state-of-the-art performance on all metrics, with parameter and computational complexities amounting to only 5.1% and 5.4%, respectively, of those of the previous best methods. The source code, pre-trained models, and all the experimental results will be publicly available at https://github.com/XavierJiezou/DiffCR upon acceptance of this work.
Nowadays, the substantially increasing number of optical remote sensing satellites is constantly generating a tremendous amount of images. However, superabundant images can easily lead to unnecessary computation and time costs in photogrammetric mapping product generation; thus, data redundancy should be properly reduced to improve production efficiency. In this study, we propose an optimal selection method for extracting a minimal subset from extremely redundant satellite images, aiming to provide full coverage of the area of interest (AOI) while maintaining the minimally required overlap between adjacent scenes for efficient large-scale mapping applications. We first constructed a novel optimization model by rasterizing the target AOI into regular grids and converting the image selection problem into a grid voting problem. Then, we applied constraints on image quality, which we efficiently quantified using metadata information, and on distribution reasonability, which we designed by penalizing adjacent grids voting for different images, to achieve an optimal selection result. We modeled the optimization problem as a Markov random field. The experimental results on four datasets, which all cover large-scale areas with 165,456, 38,252, 729, and 25,922 scenes of optical satellite images, respectively, demonstrated that the proposed approach can substantially reduce the amount of raw data while maintaining high image quality and sufficient overlap. The quantitative evaluation indicated that the proposed method considerably outperformed state-of-the-art methods in most of the eight evaluation metrics describing data quality, simplicity, and feasibility. Furthermore, additional mapping production experiments on the provincial-level dataset revealed that the proposed image selection method can significantly improve production efficiency with only a marginal decrease in product accuracy.
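The grid-voting idea behind such redundancy reduction can be illustrated with a simplified greedy sketch. The paper itself solves a Markov random field with quality and distribution constraints; the greedy set-cover variant below is only a toy stand-in, and all names and scores are hypothetical.

```python
def select_minimal_subset(aoi_grids, image_footprints, quality):
    """Greedy sketch of image subset selection over a rasterized AOI.

    Repeatedly pick the image covering the most still-uncovered AOI grids,
    breaking ties by a metadata-derived quality score (e.g. cloud cover,
    acquisition angle). Not the paper's MRF formulation.
    """
    uncovered = set(aoi_grids)
    selected = []
    while uncovered:
        best = max(
            (img for img in image_footprints if image_footprints[img] & uncovered),
            key=lambda img: (len(image_footprints[img] & uncovered), quality[img]),
            default=None,
        )
        if best is None:      # remaining grids covered by no image
            break
        selected.append(best)
        uncovered -= image_footprints[best]
    return selected
```

An MRF-based formulation would additionally penalize adjacent grids that vote for different images, favoring spatially coherent selections that a purely greedy cover ignores.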
Tasks such as the monitoring of natural disasters or the detection of change benefit greatly from complementary information about an area or a specific object of interest. The required information is provided by fusing highly accurate coregistered and georeferenced datasets. Aligned high-resolution optical and synthetic aperture radar (SAR) data additionally enable an improvement of the absolute geolocation accuracy of the optical images by extracting accurate and reliable ground control points (GCPs) from the SAR images. In this paper, we investigate the applicability of a deep learning-based matching concept for the generation of precise and accurate GCPs from SAR satellite images by matching optical and SAR images. To this end, conditional generative adversarial networks (cGANs) are trained to generate SAR-like image patches from optical images. For training and testing, optical and SAR image patches are extracted from TerraSAR-X and PRISM image pairs covering greater urban areas spread over Europe. The artificially generated patches are then used to improve the conditions for three known matching approaches based on normalized cross-correlation (NCC), the scale-invariant feature transform (SIFT), and binary robust invariant scalable keypoints (BRISK), which are normally not usable for the matching of optical and SAR images. The results validate that NCC-, SIFT-, and BRISK-based matching benefits greatly, in terms of matching accuracy and precision, from the use of the artificial templates. The comparison with two state-of-the-art optical and SAR matching approaches shows the potential of the proposed method but also reveals some challenges and the necessity for further developments.
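Template matching by normalized cross-correlation, one of the three approaches that the generated SAR-like patches make viable, can be sketched in a brute-force form. This is a minimal illustration, not the paper's implementation; in practice one would use an FFT-based or library routine (e.g. OpenCV's `matchTemplate`) for speed.

```python
import numpy as np

def ncc_match(template, search_img):
    """Locate `template` (e.g. a cGAN-generated SAR-like patch) inside a SAR
    search window by normalized cross-correlation. Returns the (row, col)
    of the best match and its NCC score in [-1, 1]."""
    th, tw = template.shape
    t = template - template.mean()
    t_norm = np.linalg.norm(t)
    best, best_pos = -np.inf, (0, 0)
    H, W = search_img.shape
    for r in range(H - th + 1):
        for c in range(W - tw + 1):
            win = search_img[r:r + th, c:c + tw]
            w = win - win.mean()
            denom = t_norm * np.linalg.norm(w)
            score = float((t * w).sum() / denom) if denom > 0 else -1.0
            if score > best:
                best, best_pos = score, (r, c)
    return best_pos, best
```

Because the cGAN output and the real SAR image share the same modality, the correlation peak becomes sharp enough for reliable GCP extraction, which is exactly why direct optical-to-SAR NCC fails without the translation step.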