Different approaches have been proposed to design deep CNNs for semantic segmentation. Usually, they are built upon an encoder–decoder architecture and require computationally expensive operations on high-resolution activation maps. Since computational cost is critical for real-time segmentation, efficient approaches compromise spatial information to reach real-time speed, but with a considerable drop in accuracy. We introduce a new module based on depthwise separable, shuffled and grouped convolutions that optimize up-sampling operations by using a sizeable receptive field and preserving spatial information. We then design an efficient network based on dense connectivity to achieve a remarkable trade-off between accuracy and speed. We show through a set of experiments that, even when up-sampling with a lightweight decoder, our architecture scores 69.5% mean IoU on Cityscapes with 1024 × 512 inputs and 95.2 FPS on the test set.
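The building blocks named above can be illustrated with a small sketch. It is not the authors' module: it only shows, under the usual textbook definitions, why a depthwise separable convolution is cheaper than a standard one (bias terms ignored) and how a channel shuffle interleaves channels across groups. The function names are hypothetical.

```python
def standard_conv_params(k, c_in, c_out):
    """Weights in a standard k x k convolution (no bias)."""
    return k * k * c_in * c_out


def separable_conv_params(k, c_in, c_out):
    """Weights in a depthwise k x k convolution followed by a 1 x 1 pointwise one."""
    return k * k * c_in + c_in * c_out


def channel_shuffle(channels, groups):
    """Interleave channels across groups so information mixes between groups."""
    n = len(channels)
    if n % groups != 0:
        raise ValueError("channel count must be divisible by groups")
    per_group = n // groups
    # Equivalent to reshaping to (groups, per_group), transposing, and flattening.
    return [channels[g * per_group + i]
            for i in range(per_group)
            for g in range(groups)]


print(standard_conv_params(3, 64, 128))    # 73728
print(separable_conv_params(3, 64, 128))   # 8768
print(channel_shuffle(list(range(6)), 2))  # [0, 3, 1, 4, 2, 5]
```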
Deep neural networks are the most widely used machine learning systems in the literature, because they can be trained effectively on huge amounts of data with a large number of parameters. However, one of the problems that such networks face is overfitting. There are many ways to address overfitting, one of which is regularization using the dropout function. Dropout has the benefit of combining different networks in one architecture and preventing units from co-adapting excessively. The dropout function is known to work well in fully-connected layers as well as in pooling layers. In this work, we propose a novel method called Mixed-Pooling-Dropout that adapts the dropout function with a mixed-pooling strategy. The dropout operation is represented by a binary mask with each element drawn independently from a Bernoulli distribution. Experimental results show that our proposed method outperforms conventional pooling methods as well as the max-pooling-dropout method by a notable margin (0.926 vs 0.868) regardless of the retaining probability.
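One possible reading of the idea, for a single pooling region, is sketched below. The abstract does not give the exact formulation, so several points are assumptions: the mask is applied before pooling, the choice between max and average pooling is made at random with equal probability, and activations are non-negative (for example after a ReLU). The function name is hypothetical.

```python
import random


def mixed_pooling_dropout(region, retain_prob, rng):
    """Pool one region after masking it with a Bernoulli(retain_prob) mask.

    Assumption: max or average pooling is picked at random per region.
    """
    # Binary mask: each element is kept independently with probability retain_prob.
    mask = [1 if rng.random() < retain_prob else 0 for _ in region]
    kept = [x * m for x, m in zip(region, mask)]
    # Mixed pooling: randomly select max pooling or average pooling.
    if rng.random() < 0.5:
        return max(kept)
    return sum(kept) / len(kept)


rng = random.Random(0)
print(mixed_pooling_dropout([1.0, 2.0, 3.0, 6.0], retain_prob=0.8, rng=rng))
```

At test time, dropout-style methods usually replace the random mask with a deterministic rescaling; that step is left out of this sketch.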
In autonomous driving systems, object detection plays a pivotal role by enabling the system to perceive the surrounding road environment effectively. The foremost challenge of object detection is real-time operation, which requires reducing the detectors' computational complexity while preserving their accuracy. Nevertheless, most object detection approaches divide image processing over multiple heads, each tasked with detecting objects at particular scales. Even though this improves detection accuracy, it adds an extra computational burden. In this study, our objective is to assess the feasibility of employing a single head within the originally multi-headed architecture of the FCOS detector. In response to the challenges posed by this significant modification, we propose a set of straightforward solutions, resulting in a novel Fully Convolutional One-Stage with a Single Head (FCOSH) detector. In experiments on the BDD100K benchmark, our FCOSH detector exhibits substantial improvements in computational efficiency relative to the original FCOS while also achieving a 0.5% improvement in detection accuracy. Specifically, FCOSH achieves an 18% reduction in inference time, a 24% reduction in required FLOPs, and a 10% decrease in the number of model parameters compared to FCOS.
• Relying on a single head within FCOS demonstrates considerable potential for reducing the computational complexity.
• The single-head approach showcases improvement in relation to detection accuracy.
• The effective addressing of challenges associated with this approach is crucial to unlocking its full benefits.
3D facial attractiveness enhancement using free form deformation
Manal, El Rhazi; Arsalane, Zarghili; Aicha, Majda ...
Journal of King Saud University. Computer and information sciences, June 2022, Volume 34, Issue 6
Journal article, peer reviewed, open access
Physical attractiveness has an important influence on human social life: it has been widely linked to the possession of a variety of positive qualities that set an individual apart from others, and it offers many advantages in an individual’s life, especially in a society obsessed with beauty and attractiveness. As a consequence, people try to look more attractive by using makeup or, in many cases, plastic surgery. In parallel with the growing demand for plastic surgery, there is a need to simplify and show how the face will look after the surgery, in order to plan the procedures, verify the patient's satisfaction, and reduce the risks during the surgery. In this paper, we introduce a novel system to analyze and enhance the attractiveness of 3D faces using a Free Form Deformation technique based on the Bézier function. The system was tested on two 3D face datasets, and the experimental results show that the edited 3D faces are more attractive than the original faces in terms of respecting the beauty canons: facial symmetry, the golden ratio, neoclassical proportions, and the angular profile.
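For readers unfamiliar with free form deformation, the classic Bézier (Bernstein polynomial) formulation can be sketched as follows. This is the generic textbook technique, not the system described in the paper; the lattice layout, the local coordinates in [0, 1], and the function names are assumptions made for illustration.

```python
from math import comb


def bernstein(n, i, t):
    """Bernstein basis polynomial of degree n, index i, evaluated at t."""
    return comb(n, i) * t**i * (1 - t) ** (n - i)


def ffd(point, lattice):
    """Deform a point given in local lattice coordinates (s, t, u) in [0, 1].

    lattice[i][j][k] is the (x, y, z) position of a control point.
    """
    s, t, u = point
    l, m, n = len(lattice) - 1, len(lattice[0]) - 1, len(lattice[0][0]) - 1
    out = [0.0, 0.0, 0.0]
    for i in range(l + 1):
        for j in range(m + 1):
            for k in range(n + 1):
                w = bernstein(l, i, s) * bernstein(m, j, t) * bernstein(n, k, u)
                for axis in range(3):
                    out[axis] += w * lattice[i][j][k][axis]
    return tuple(out)


# Regular 3 x 3 x 3 lattice over the unit cube: leaves every point unchanged.
lattice = [[[[i / 2, j / 2, k / 2] for k in range(3)] for j in range(3)]
           for i in range(3)]
# Pull the central control point up along z; nearby points follow smoothly.
lattice[1][1][1][2] += 0.5
print(ffd((0.5, 0.5, 0.5), lattice))
```

Moving a control point displaces each embedded point in proportion to its Bernstein weight, which is what makes the deformation smooth.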
Lung CT image segmentation is a necessary initial step for lung image analysis; it is a prerequisite for accurate lung CT image analysis tasks such as lung cancer detection.
In this work, we propose a lung CT image segmentation method using the U-net architecture, one of the most used architectures in deep learning for image segmentation. The architecture consists of a contracting path to extract high-level information and a symmetric expanding path that recovers the information needed. This network can be trained end-to-end from very few images and outperforms many methods.
Experimental results show an accurate segmentation, with a Dice coefficient of 0.9502.
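The Dice coefficient reported above is conventionally defined as twice the overlap between the predicted and reference masks divided by the sum of their sizes. A minimal sketch for flattened binary masks follows; the function name and the convention of returning 1.0 for two empty masks are assumptions.

```python
def dice_coefficient(pred, target):
    """Dice = 2 * |pred AND target| / (|pred| + |target|) for binary masks."""
    if len(pred) != len(target):
        raise ValueError("masks must have the same length")
    intersection = sum(1 for p, t in zip(pred, target) if p and t)
    total = sum(1 for p in pred if p) + sum(1 for t in target if t)
    if total == 0:
        # Both masks are empty: treated here as perfect agreement.
        return 1.0
    return 2.0 * intersection / total


print(dice_coefficient([1, 1, 0, 0], [1, 0, 0, 0]))
```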
Facial landmark detection is an important and basic step in many face analysis applications. It is also considered a challenging task, because the final results of the analysis depend on the accuracy of the landmark detection. Decades of research have investigated approaches for two-dimensional (2D) facial landmark detection; however, despite the good results obtained, these approaches still suffer from weaknesses regarding pose and illumination variations. Recently, the wide availability of 3D scans has made 3D face models easier to use and has thus helped overcome the problems caused by 2D images. Many papers have studied the problem of 3D facial landmark detection; nevertheless, there is a lack of literature reviews giving an overview of the research related to 3D face landmark detection. In this study, the authors present a detailed survey of the latest (2010–2018) approaches based on geometric information for 3D face landmark detection, including the limitations and strengths of each work.
In this paper, we propose a 3D object recognition approach based on the D2 shape distribution and artificial neural networks. The challenge is to discriminate between similar and dissimilar shapes by finding a shape signature that can be constructed and classified quickly. We propose a connectionist system to recognize 3D objects in VRML (Virtual Reality Modeling Language) format. The key idea is to represent the signature of an object as a shape distribution sampled from a shape function measuring global geometric properties of the object. The proposed strategy is the following: the polygonal object to be recognized is first triangulated. Then, distances are calculated between pairs of random points on the triangulated surface of the 3D object. The frequency of these distances is represented by a normalized histogram. The values of these histograms feed a multi-layer neural network trained with back-propagation. We demonstrate the potential of this approach in a set of experiments, which show that our system can achieve a recognition rate above 91.7%. In addition, to evaluate the efficiency of our method, we compare our classifier with a support vector machine and k-nearest neighbours. The simulation results highlight the performance of the proposed approach.
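The signature construction described above can be sketched as follows. This is an illustrative version, not the authors' code: area-weighted triangle selection and the square-root barycentric sampling formula are standard choices assumed here, as are the bin count, the normalization by the largest sampled distance, and all function names.

```python
import math
import random


def triangle_area(a, b, c):
    """Area of a 3D triangle from the cross product of two edge vectors."""
    ab = [b[i] - a[i] for i in range(3)]
    ac = [c[i] - a[i] for i in range(3)]
    cross = (ab[1] * ac[2] - ab[2] * ac[1],
             ab[2] * ac[0] - ab[0] * ac[2],
             ab[0] * ac[1] - ab[1] * ac[0])
    return 0.5 * math.sqrt(sum(x * x for x in cross))


def sample_point(tri, rng):
    """Draw a uniformly distributed point inside a triangle."""
    a, b, c = tri
    s, r2 = math.sqrt(rng.random()), rng.random()
    return tuple((1 - s) * a[i] + s * (1 - r2) * b[i] + s * r2 * c[i]
                 for i in range(3))


def d2_histogram(triangles, n_pairs, n_bins, rng):
    """Normalized histogram of distances between random surface point pairs."""
    areas = [triangle_area(*t) for t in triangles]
    dists = []
    for _ in range(n_pairs):
        # Larger triangles are chosen more often, so sampling is uniform by area.
        t1, t2 = rng.choices(triangles, weights=areas, k=2)
        dists.append(math.dist(sample_point(t1, rng), sample_point(t2, rng)))
    d_max = max(dists) or 1.0
    hist = [0] * n_bins
    for d in dists:
        hist[min(int(d / d_max * n_bins), n_bins - 1)] += 1
    return [h / n_pairs for h in hist]


# Unit square split into two triangles, used as a toy mesh.
square = [((0, 0, 0), (1, 0, 0), (1, 1, 0)), ((0, 0, 0), (1, 1, 0), (0, 1, 0))]
print(d2_histogram(square, n_pairs=1000, n_bins=8, rng=random.Random(42)))
```

The resulting fixed-length vector is the kind of input that could then be fed to a classifier such as a multi-layer network.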
In computer vision, various machine learning algorithms have proven to be very effective. Convolutional Neural Networks (CNNs) are a kind of deep learning algorithm that has become widely used in image processing, with a remarkable success rate compared to conventional machine learning algorithms. CNNs are widely used in different computer vision fields, especially in the medical domain. In this study, we perform semantic brain tumor segmentation using a novel deep learning architecture that we call the multi-scale ConvLSTM Attention Neural Network. It is built on Convolutional Long-Short-Term-Memory (ConvLSTM) and Attention units, together with multiple feature extraction blocks such as Inception, Squeeze-Excitation and Residual Network blocks. Each of these blocks is known to boost model performance when used separately; in our case, we show that their combination also has a beneficial effect on accuracy. Experimental results show that our model performs brain tumor segmentation effectively compared to standard U-Net, Attention U-net and Fully Connected Network (FCN), with a Dice score of 79.78 for our method compared to 78.61, 73.65 and 72.89 for Attention U-net, standard U-net and FCN respectively.
In today’s medicine, Computer-Aided Diagnosis (CAD) systems are widely used to improve the screening test accuracy of pulmonary nodules. Processing, classification, and detection techniques form the basis of the CAD architecture. In this work, we focus on the classification step of a CAD system, where we use the Discrete Cosine Transform (DCT) along with a Convolutional Neural Network (CNN) to build an efficient classification method for pulmonary nodules. By combining DCT and CNN, the proposed method provides high accuracy and outperforms the conventional CNN model.
In this paper, a color face recognition system is developed to identify human faces using a backpropagation neural network. The architecture we adopt is All-Class-in-One-Network, where all the classes are placed in a single network. To accelerate the learning process, we propose the use of the Bhattacharyya distance as the total error to train the network. In the experimental section, we compare how the algorithm converges using the mean square error and the Bhattacharyya distance. Experimental results indicate that face images can be recognized by the proposed system effectively and swiftly.
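For reference, the Bhattacharyya distance between two discrete probability distributions is the negative logarithm of their Bhattacharyya coefficient. The sketch below shows only this standard definition; how the paper turns network outputs and targets into distributions is not stated in the abstract, so that step is not modelled, and the function name is hypothetical.

```python
import math


def bhattacharyya_distance(p, q):
    """D_B(p, q) = -ln(sum_i sqrt(p_i * q_i)) for discrete distributions.

    Assumes p and q are non-negative and each sums to 1.
    """
    if len(p) != len(q):
        raise ValueError("distributions must have the same length")
    coefficient = sum(math.sqrt(pi * qi) for pi, qi in zip(p, q))
    if coefficient == 0.0:
        # No overlap at all: the distance is unbounded.
        return math.inf
    return -math.log(coefficient)


print(bhattacharyya_distance([0.5, 0.5], [0.9, 0.1]))
```

The distance is zero for identical distributions and grows as their overlap shrinks, which is the property that makes it usable as an error measure.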