Color constancy is an important property of the human visual system: it allows us to perceive the colors of objects independently of the color of the light illuminating them. Modern digital cameras have to recreate this property computationally. This is not a simple task, because the response of each pixel on the camera sensor combines the spectral characteristics of the illumination, the object, and the sensor itself. Many assumptions therefore have to be made to solve the problem approximately. A common one is that the scene contains a single global source of illumination; however, this assumption is often violated in real-world scenes, and multi-illuminant estimation and segmentation remains a largely unsolved problem. In this paper, we address it by proposing a novel framework that estimates the per-pixel illumination of any scene lit by two sources of illumination. The framework consists of a deep-learning model that segments an image into regions of uniform illumination and models that perform single-illuminant estimation. First, a global illumination estimate is produced and fed, together with the original image, into the segmentation model, which segments the image into regions where that illuminant is dominant. The segmentation output is used to mask the input, and the masked images are passed to the estimation models, which produce the final illumination estimates. The models comprising the framework are first trained separately, then combined and fine-tuned jointly. This allows us to utilize well-researched single-illuminant estimation models in a multi-illuminant scenario. We show that such an approach improves both segmentation and estimation capabilities. We tested different configurations of the proposed framework against other single- and multi-illuminant estimation and segmentation models on a large dataset of multi-illuminant images.
On this dataset, the proposed framework achieves the best results on both the multi-illuminant estimation and segmentation problems. Furthermore, the generalization properties of the framework were tested on commonly used single-illuminant datasets, where it achieved performance comparable to state-of-the-art single-illuminant models even though it was trained only on multi-illuminant images.
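The final per-pixel illumination described above can be pictured as a mask-weighted blend of the two per-region estimates. The following is a minimal sketch of that combination step only; the function name and the soft-mask interface are illustrative assumptions, not the paper's actual code.

```python
import numpy as np

def per_pixel_illumination(est_a, est_b, mask):
    """Combine two global illuminant estimates into a per-pixel map.

    est_a, est_b : (3,) RGB illuminant estimates for the two light sources
                   (hypothetical interface; produced here by any
                   single-illuminant estimator)
    mask         : (H, W) soft segmentation map, 1.0 where illuminant A
                   dominates, 0.0 where illuminant B dominates
    Returns an (H, W, 3) per-pixel illumination image.
    """
    m = mask[..., None]                   # broadcast the mask over RGB channels
    return m * est_a + (1.0 - m) * est_b  # convex per-pixel blend

# Toy example: left half of the scene lit by A, right half by B.
mask = np.zeros((4, 4))
mask[:, :2] = 1.0
ill = per_pixel_illumination(np.array([1.0, 0.8, 0.6]),
                             np.array([0.5, 0.7, 1.0]), mask)
```

A hard binary mask reduces this to selecting one estimate per region; a soft mask additionally models gradual transitions between the two illuminants.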
Determining the demographic characteristics of a person post-mortem is a fundamental task for forensic experts, and the dental system is a crucial source of that information. Those characteristics, namely age and sex, can be determined reliably. The mandible and individual teeth survive even the harshest conditions, making them a prime target for forensic analysis. Current methods in forensic odontology rely on time-consuming manual measurements and reference tables, many of which depend on the correct determination of the tooth type. This study thoroughly explores the applicability of deep learning to sex assessment, age estimation, and tooth type determination from x-ray images of individual teeth. A series of models that use state-of-the-art feature extraction architectures and attention have been trained and evaluated. Their hyperparameters have been explored and optimized using a combination of grid and random search, totaling over a thousand experiments and 14076 hours of GPU compute time. Our dataset contains 86495 individual tooth x-ray image samples, with a subset of 7630 images carrying additional information about tooth alterations. The best-performing models are fine-tuned, the impact of tooth alterations is analyzed, and model performance is compared to current methods in the forensic odontology literature. We achieve an accuracy of 76.41% for sex assessment, a median absolute error of 4.94 years for age estimation, and accuracies from 87.24% to 99.15% for tooth type determination. The constructed models are fully automated and fast, their results are reproducible, and their performance is equal to or better than current state-of-the-art methods in forensic odontology.
Color constancy is one of the key steps in the process of image formation in digital cameras. Its goal is to process the image so that the color of the illumination has no influence on the colors of objects and surfaces. To capture the target scene colors as accurately as possible, it is crucial to estimate the illumination vector with high accuracy. Unfortunately, illumination estimation is an ill-posed problem, and solving it most often relies on assumptions. To date, various assumptions have been proposed, resulting in a wide variety of illumination estimation methods. Statistics-based methods have been shown to be appropriate for hardware implementation, but learning-based methods achieve state-of-the-art results, especially those that use deep neural networks. The large learning capacity and generalization ability of deep neural networks can be used to develop illumination estimation methods that are more general and precise. This approach avoids introducing many new assumptions, which often hold only in specific situations. In this paper, a new method for illumination estimation based on light source classification is proposed. In the first step, the set of possible illuminations is reduced by classifying the input image into one of three classes: images captured in outdoor scenes under natural illumination, images captured in outdoor scenes under artificial illumination, and images captured in indoor scenes under artificial illumination. In the second step, a deep illumination estimation network, trained exclusively on images of the class predicted in the first step, is applied to the input image. Dividing the illumination space into smaller regions makes the training of illumination estimation networks simpler because the distribution of image scenes and illuminations is less diverse.
Experiments on the Cube+ image dataset have shown a median illumination estimation error of 1.27°, an improvement of more than 25% compared to using a single network for all illuminations.
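The error quoted above is the standard recovery angular error between the estimated and ground-truth illumination vectors, i.e. the angle between them in RGB space. A minimal sketch of that metric (the function name is ours, not from the paper):

```python
import numpy as np

def angular_error_deg(est, gt):
    """Angle in degrees between an estimated and a ground-truth illuminant.

    Both inputs are 3D RGB vectors; only their directions (chromaticities)
    matter, so the magnitudes cancel out in the normalization.
    """
    est = np.asarray(est, dtype=float)
    gt = np.asarray(gt, dtype=float)
    cos = np.dot(est, gt) / (np.linalg.norm(est) * np.linalg.norm(gt))
    # Clip to guard against floating-point values just outside [-1, 1].
    return float(np.degrees(np.arccos(np.clip(cos, -1.0, 1.0))))
```

Because the metric is scale-invariant, an estimate that is a scalar multiple of the ground truth has zero error, which matches the fact that only the illuminant's color, not its intensity, needs to be recovered.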
In the image processing pipelines of digital cameras, one of the first steps is to achieve invariance to scene illumination, namely computational color constancy. Usually, this is done in two successive steps: illumination estimation and chromatic adaptation. Illumination estimation aims to estimate a three-dimensional vector from image pixels. This vector represents the scene illumination, and it is used in the chromatic adaptation step, which eliminates the bias in image colors caused by the color of the illumination. Accurate illumination estimation is crucial for successful computational color constancy. However, this is an ill-posed problem, and many methods approach it under different assumptions. In this paper, an iterative method for estimating the scene illumination color is proposed. The method calculates the illumination vector through a series of intermediate illumination estimations and chromatic adaptations of the input image using a convolutional neural network. The network has been trained to iteratively compute intermediate incremental illumination estimates from the original image. The incremental illumination estimates are combined by per-element multiplication to obtain the final illumination estimate. The approach aims to reduce the large estimation errors that usually occur with highly saturated light sources. Experimental results show that the proposed method outperforms the vast majority of illumination estimation methods in terms of median angular error. Moreover, on the worst-performing samples, i.e., the samples for which a method errs the most, the proposed method outperforms all other methods by a margin of more than 18% with respect to the mean of the estimation errors in the third quartile.
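The iterative loop described above, alternating incremental estimation with chromatic adaptation and accumulating the increments multiplicatively, can be sketched as follows. Here `step_fn` is a stand-in for the trained network, and the fixed step count and the interface are our illustrative assumptions:

```python
import numpy as np

def iterative_estimate(image, step_fn, n_steps=3):
    """Iteratively refine an illumination estimate.

    image   : (H, W, 3) RGB image
    step_fn : stand-in for the trained network; maps an image to an
              incremental (3,) RGB illuminant estimate (hypothetical
              interface, not the paper's actual model)
    After each step the image is chromatically adapted by per-channel
    division (von Kries style), and the increments are combined by
    per-element multiplication into the final estimate.
    """
    total = np.ones(3)
    for _ in range(n_steps):
        inc = step_fn(image)   # intermediate incremental estimate
        image = image / inc    # chromatic adaptation of the current image
        total *= inc           # accumulate the per-element product
    return total
```

Once the adapted image is nearly neutral, subsequent increments approach (1, 1, 1) and leave the accumulated estimate unchanged, so the iteration naturally converges.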
Development of a convolutional neural network that can precisely and quickly identify teeth from x-ray images, without using neighbouring structures as a frame of reference.
Using a database of 11403 x-ray images that were precisely annotated by dental professionals, we have trained, validated and tested a convolutional neural network (CNN) that can identify teeth according to their position in the oral cavity. Four “levels” were tested, the first being classification by morphological tooth type. This consisted of 4 categories: incisor, canine, premolar and molar. The second “level” added differentiation between types of incisors, premolars and molars; it had 8 categories, imitating a dental quadrant. The third “level” added maxillary or mandibular origin, for a total of 16 categories. Finally, the fourth “level” had 32 categories, meaning every tooth had its own.
The first level achieved 97.83% accuracy on unseen data, the second level 92.13%, and the third level 91.14%. The fourth level, while being the most demanding, achieved 91.13%.
The results were best at the 4-category “level” and least successful at the 32-category “level”. Interestingly, the difference between the 32- and 16-category levels was not significant at all. The developed CNN can identify the morphological type of a tooth with very high accuracy. This opens the door to implementing artificial intelligence for rapid analysis and cross-referencing in (forensic) dental medicine.
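The four "levels" form a strict coarsening hierarchy: each finer label determines all coarser ones. As an illustration, the sketch below derives all four granularities from a tooth identified in FDI notation (quadrant 1-4, position 1-8); whether the study used this particular encoding is our assumption, but the category counts match the abstract.

```python
def coarsen_labels(quadrant, position):
    """Map an FDI tooth label to the four classification granularities.

    quadrant : 1-4 (1-2 maxillary, 3-4 mandibular, per FDI notation)
    position : 1-8 within the quadrant (1-2 incisors, 3 canine,
               4-5 premolars, 6-8 molars)
    Returns (level1, level2, level3, level4) with 4, 8, 16, and 32
    possible values respectively. Illustrative mapping only; the
    study's exact label encoding is not specified in the abstract.
    """
    types = {1: "incisor", 2: "incisor", 3: "canine", 4: "premolar",
             5: "premolar", 6: "molar", 7: "molar", 8: "molar"}
    level1 = types[position]                        # 4 morphological classes
    level2 = position                               # 8 quadrant positions
    jaw = "maxillary" if quadrant in (1, 2) else "mandibular"
    level3 = (jaw, position)                        # 16 classes
    level4 = (quadrant, position)                   # 32 classes, one per tooth
    return level1, level2, level3, level4
```

Because the hierarchy is deterministic, a classifier trained at the 32-category level implicitly solves the coarser tasks as well, which is consistent with the small accuracy gap between the 16- and 32-category levels.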
This study has been supported as a part of the Croatian Science Foundation under the project IP-2020-02-9423.
Age estimation is a key component in forensic analysis, be it in legal proceedings or archeological research. Current methods in forensic odontology are based on manual measurements of a wide array of morphometric parameters, typically from dental x-ray images and occasionally from material remains. While those parameters follow a set progression during human development, thereby allowing current methods to precisely estimate the age of juveniles, estimation for adults and seniors proves more difficult. In this study, we explore the applicability of deep learning to the problem of chronological age estimation. We determine the best convolutional neural network model derived from state-of-the-art architectures, we determine the best-performing model parameters using pretrained general-purpose vision model parameters as the starting point, and we perform ablation experiments to highlight which anatomical regions of the dental system contribute the most to the estimation. The proposed approach attains the lowest estimation error in the literature for adult and senior subjects, which we verify on one of the largest datasets of panoramic dental x-ray images in the literature. The dataset consists of 4035 panoramic dental x-ray images of male and female subjects aged between 19 and 90 years. This study also evaluates the feasibility of the proposed model for age estimation from individual teeth, achieving an estimation error competitive with current methods while being fully automated. The estimation error is verified on our dataset of 76416 individual tooth images, the largest dataset to date in the forensic odontology literature. Unlike current methods, dental alterations, decay, illnesses, or missing teeth do not pose a problem for the proposed model.
With a median estimation error of 2.95 years for panoramic dental x-ray images and 4.68 years for individual teeth, and by deriving the model from state-of-the-art architectures, verifying those results on the largest dataset in the forensic odontology literature, and demonstrating the importance of different anatomical regions of the dental system for estimation, this study sets the baseline for future research on automated chronological age estimation in forensic odontology.
•Age estimation of adults is a hard problem in forensic odontology.
•Only manual measurement methods are currently used in practice.
•This deep learning approach eclipses expert performance in a fraction of the time.
•The interpretability analysis gives new insight into anatomical changes due to age.
•The method is verified on the largest dataset in the literature.
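The median estimation error reported for the age models is the median of the per-subject absolute errors in years, a metric robust to a few badly estimated outliers. A minimal sketch (function name ours):

```python
import numpy as np

def median_absolute_error(predicted, actual):
    """Median of per-subject absolute errors, in the units of the inputs
    (years here). Unlike the mean, it is insensitive to a small number of
    large outlier errors."""
    predicted = np.asarray(predicted, dtype=float)
    actual = np.asarray(actual, dtype=float)
    return float(np.median(np.abs(predicted - actual)))

# Three subjects with absolute errors of 3, 5, and 1 years:
# the median absolute error is 3.0 years.
err = median_absolute_error([25.0, 40.0, 61.0], [22.0, 45.0, 60.0])
```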
Computational color constancy has the important task of reducing the influence of the scene illumination on the object colors. As such, it is an essential part of the image processing pipelines of most digital cameras. One of the important parts of computational color constancy is illumination estimation, i.e. estimating the illumination color. When an illumination estimation method is proposed, its accuracy is usually reported by providing the values of error metrics obtained on images from publicly available datasets. However, over time it has been shown that many of these datasets have problems such as too few images, inappropriate image quality, lack of scene diversity, absence of version tracking, violation of various assumptions, GDPR violations, lack of additional shooting-procedure information, etc. In this paper, a new illumination estimation dataset is proposed that aims to alleviate many of the mentioned problems and to support illumination estimation research. It consists of 4890 images with known illumination colors, as well as additional semantic data that can further make the learning process more accurate. Due to the use of the SpyderCube color target, every image has two ground-truth illumination records covering different directions. Because of that, the dataset can be used for training and testing of methods that perform single- or two-illuminant estimation, which makes it superior to many similar existing datasets. The dataset, its smaller version SimpleCube++, and the accompanying code are available at https://github.com/Visillect/CubePlusPlus/ .
Non-destructive testing (NDT) is a set of techniques used for material inspection and detection of defects. Ultrasonic testing (UT) is one of the NDT techniques, commonly used to inspect components in the oil and gas industry, aerospace, and various types of power plants. Acquisition of UT data is currently done automatically using robotic manipulators, which ensures the precision and uniformity of the acquired data. The analysis, on the other hand, is still done manually by trained experts. Since the acquired UT data can be represented in the form of images, computer vision algorithms can be applied to analyze the content of the images and localize defects. In this work, we propose a novel deep learning architecture designed specifically for defect detection from UT images. We propose a lightweight feature extractor that improves the precision and efficiency of the detector, and we modify the detection head to improve the detection of objects with extreme aspect ratios, which are common in UT images. We tested our approach on an in-house dataset of over 4000 images. The proposed architecture outperformed the previous state-of-the-art method by 1.7% (512 × 512 px input resolution) and 2.7% (384 × 384 px input resolution) while significantly decreasing the inference time.
Nondestructive evaluation (NDE) is a set of techniques used for material inspection and defect detection without causing damage to the inspected component. One of the commonly used nondestructive techniques is ultrasonic inspection. The acquisition of ultrasonic data has mostly been automated in recent years, but the analysis of the collected data is still performed manually. This process is therefore expensive, inconsistent, and prone to human error. An automated system would significantly increase the efficiency of the analysis, but the methods presented so far fail to generalize well to new cases and are not used in real-life inspections. Many similar data analysis problems have recently been tackled by deep learning methods. This approach outperforms classical methods but requires large amounts of training data, which are difficult to obtain in the NDE domain. In this work, we train the deep learning architecture EfficientDet to automatically detect defects in ultrasonic images. We show how some of the hyperparameters can be tweaked to improve the detection of defects with extreme aspect ratios, which are common in ultrasonic images. The proposed object detector was trained on the largest dataset of ultrasonic images seen in the literature so far. To collect the dataset, six steel blocks containing 68 defects were scanned with a phased-array probe. More than 4000 VC-B-scans were acquired and used for training and evaluation of EfficientDet. The proposed model achieved a mean average precision (mAP) of 89.6% during fivefold cross-validation, a significant improvement compared to similar methods previously used for this task. A detailed per-fold performance overview revealed that EfficientDet-D0 successfully detects all of the defects present in the inspected material.
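One common way to adapt an anchor-based detector such as EfficientDet to extreme aspect ratios is to extend the set of anchor aspect ratios so that elongated boxes are covered by at least one anchor shape. The sketch below generates equal-area anchor shapes for a given ratio list; the specific ratio values and the function itself are illustrative, not the exact hyperparameters used in the study.

```python
def anchor_shapes(base_size, aspect_ratios):
    """Anchor (width, height) pairs of equal area for given aspect ratios.

    base_size     : anchor side length for the square (ratio 1.0) anchor
    aspect_ratios : list of width/height ratios; adding elongated ratios
                    (e.g. 8.0) is one way to cover wide, flat indications
                    common in ultrasonic B-scans (illustrative values)
    Each returned shape has area base_size**2, so only the elongation
    changes between anchors.
    """
    shapes = []
    for r in aspect_ratios:                 # r = width / height
        h = (base_size ** 2 / r) ** 0.5     # solve w * h = base**2, w = r * h
        shapes.append((r * h, h))
    return shapes

# e.g. anchor_shapes(32, [0.5, 1.0, 2.0, 8.0]) adds a 90.5 x 11.3 px anchor
# alongside the usual near-square shapes.
```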
•A novel GAN architecture for generating images with objects at precise locations.
•The proposed GAN can generate data that improves the object detector’s performance.
•An improvement of 6% in the detector’s precision when training on generated images.
Non-destructive testing is a set of techniques for defect detection in materials. While there are many imaging techniques, ultrasonic imaging is the one used most. The analysis is mainly performed by human inspectors manually analyzing the acquired images. The low number of defects in real ultrasonic inspections and legal issues concerning data from such inspections make it difficult to obtain proper results from automatic ultrasonic image (B-scan) analysis. The goal of the presented research is to improve detection results by expanding the training data set with realistic synthetic samples. In this paper, we present a novel deep learning Generative Adversarial Network model for generating realistic ultrasonic B-scans with defects at distinct locations. Furthermore, we show that the generated B-scans can be used for synthetic data augmentation and can improve the performance of deep convolutional object detection networks. Our novel method was developed on a dataset with almost 4000 images and more than 6000 annotated defects. When trained only on real data, the detector achieves an average precision of 70%. Training only on generated data increases this to 72%, and mixing generated and real data achieves almost 76% average precision. We believe that synthetic data generation can generalize to other tasks with limited data. It could also be used for training human personnel.
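The best result above comes from training on a mixture of real and GAN-generated B-scans. A minimal sketch of building such a mixed training pool is shown below; the function name, the 50/50 default ratio, and the sampling strategy are our illustrative assumptions, since the abstract does not state the mixing ratio used.

```python
import random

def mixed_training_set(real, synthetic, synth_fraction=0.5, seed=0):
    """Blend real and synthetic samples into one shuffled training pool.

    real, synthetic : lists of training samples (e.g. image paths)
    synth_fraction  : desired share of synthetic samples in the pool
                      (illustrative default; not the paper's value)
    """
    rng = random.Random(seed)  # fixed seed for a reproducible split
    # How many synthetic samples give the requested fraction of the pool.
    n_synth = int(len(real) * synth_fraction / (1.0 - synth_fraction))
    pool = list(real) + rng.sample(list(synthetic), min(n_synth, len(synthetic)))
    rng.shuffle(pool)          # interleave real and synthetic samples
    return pool
```

Sweeping `synth_fraction` from 0 to 1 reproduces the three regimes compared in the abstract: real-only, generated-only, and mixed training data.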