Illumination variation occurs frequently in visual tracking and can severely degrade tracker performance. Many trackers based on the discriminative correlation filter (DCF) have recently obtained promising performance and show some robustness to illumination variation. However, when the target undergoes significant appearance change under intense illumination variation, the features extracted from it can no longer be discriminated from the background, and the tracker loses the target. In this paper, to improve the accuracy and robustness of DCF trackers under intense illumination variation, we propose a strategy that performs multiple region detection and uses alternate templates (MRAT). Using parallel computation, we run detection on multiple regions simultaneously, which is equivalent to enlarging the search region. Meanwhile, an alternate template is saved by a template update mechanism to improve tracker accuracy under strong illumination variation. Experimental results on large-scale public benchmark datasets show the effectiveness of the proposed method compared with state-of-the-art methods.
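The multiple-region idea above can be illustrated with a minimal sketch. It assumes that each search region has already produced a correlation response map, and it simply keeps the strongest peak across regions. The names `peak`, `detect_multi_region`, and `select_template`, the `(top, left)` offset convention, and the confidence threshold are all hypothetical; the abstract does not specify how MRAT combines regions or when it switches templates.

```python
from typing import List, Tuple

Response = List[List[float]]  # correlation response map of one search region


def peak(response: Response) -> Tuple[float, int, int]:
    """Return (score, row, col) of the largest value in one response map."""
    best = (float("-inf"), 0, 0)
    for r, row in enumerate(response):
        for c, value in enumerate(row):
            if value > best[0]:
                best = (value, r, c)
    return best


def detect_multi_region(responses: List[Response],
                        offsets: List[Tuple[int, int]]):
    """Keep the strongest peak over all regions, in image coordinates.

    offsets[i] is the assumed (top, left) corner of region i in the image.
    """
    best_score, best_pos = float("-inf"), (0, 0)
    for response, (top, left) in zip(responses, offsets):
        score, r, c = peak(response)
        if score > best_score:
            best_score, best_pos = score, (top + r, left + c)
    return best_score, best_pos


def select_template(score_main: float, score_alt: float,
                    min_confidence: float) -> str:
    """Assumed rule: fall back to the alternate template only when the main
    template is not confident and the alternate one responds more strongly."""
    if score_main < min_confidence and score_alt > score_main:
        return "alternate"
    return "main"
```

In a real tracker the regions would be evaluated in parallel; the loop here only shows how their results could be merged.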
Deep neural networks (DNNs) exhibit state-of-the-art performance in many fields, including microstructure recognition, where large datasets are used in training. However, a DNN trained by conventional methods on a small dataset commonly performs worse than traditional machine learning methods such as shallow neural networks and support vector machines. This limitation has prevented the wide adoption of DNNs in materials research, because collecting and assembling large datasets in materials science is a challenge. In this study, we attempted to predict solidification defects by DNN regression with a small dataset of 487 data points. We found that a pre-trained and fine-tuned DNN shows better generalization performance than a shallow neural network, a support vector machine, and a DNN trained by conventional methods. The trained DNN transforms scattered experimental data points into a high-accuracy map in the high-dimensional space of chemistry and processing parameters. Although a DNN with a large dataset is the optimal solution, a DNN with a small dataset and pre-training can be a reasonable choice when large datasets are unavailable in materials research.
• A deep neural network model for predicting the solidification cracking susceptibility of stainless steels is developed.
• A stacked auto-encoder is used to pre-train the deep neural network with a small dataset to optimize the initial weights.
• The deep neural network model shows better generalization performance than a shallow neural network and a support vector machine.
Running gait patterns may help reveal why injuries differ between higher-mileage and low-mileage runners. However, there is limited research on the possible relationships between running gait patterns and weekly running mileage. In recent years, machine learning algorithms have been used for pattern recognition and classification of gait features to emphasize the uniqueness of gait patterns. However, these models share the problem of being black boxes: the predictions of the classifier are often hard to interpret. Therefore, this study used a Deep Neural Network (DNN) model and the Layer-wise Relevance Propagation (LRP) technique to investigate the differences in running gait patterns between higher-mileage and low-mileage runners. It was found that the ankle and knee provide considerable information for recognizing gait features, especially in the sagittal and transverse planes. These differences in gait patterns may explain why high-mileage and low-mileage runners have different injury patterns. The early stages of stance are very important in gait pattern recognition because they contain effective information related to gait. The findings show that LRP provides a feasible interpretation of the model's predictions, and thus offers additional insight and more effective information for analyzing gait patterns.
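As an illustration of how LRP attributes a prediction to input features, the following minimal sketch applies the commonly used epsilon rule to a single dense layer. The abstract does not state which LRP rule the study used, so the rule choice, the omission of bias terms, and the function name `lrp_epsilon` are assumptions made for this example.

```python
from typing import List


def lrp_epsilon(activations: List[float],
                weights: List[List[float]],
                relevance_out: List[float],
                eps: float = 1e-9) -> List[float]:
    """Redistribute output relevance to the inputs of one dense layer.

    weights[i][j] connects input i to output j; biases are omitted
    (an assumption that keeps relevance approximately conserved).
    """
    n_in, n_out = len(activations), len(relevance_out)
    # Pre-activation of every output neuron.
    z = [sum(activations[i] * weights[i][j] for i in range(n_in))
         for j in range(n_out)]
    relevance_in = [0.0] * n_in
    for j in range(n_out):
        # The epsilon term stabilises the division when z[j] is near zero.
        denom = z[j] + (eps if z[j] >= 0 else -eps)
        for i in range(n_in):
            contribution = activations[i] * weights[i][j]
            relevance_in[i] += contribution / denom * relevance_out[j]
    return relevance_in
```

Applying this rule layer by layer, from the output back to the input, yields a relevance score for each input variable, which is the kind of per-joint and per-plane attribution the abstract describes.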
Sparse representation-based visual tracking approaches have attracted increasing interest in the community in recent years. The main idea is to linearly represent each target candidate using a set of target and trivial templates, while imposing a sparsity constraint on the representation coefficients. After the coefficients are obtained using ℓ1-norm minimization methods, the candidate with the lowest error, when reconstructed using only the target templates and the associated coefficients, is taken as the tracking result. Although promising performance has been widely reported, it is unclear whether the performance of these trackers can be maximized. In addition, the computational complexity caused by the dimensionality of the feature space limits these algorithms in real-time applications. In this paper, we propose a real-time visual tracking method based on structurally random projection (RP) and weighted least squares (WLS) techniques. In particular, to enhance the discriminative capability of the tracker, we introduce background templates into the linear representation framework. To handle appearance variations over time, we relax the sparsity constraint and use a WLS method to obtain the representation coefficients. To further reduce the computational complexity, structurally RP is used to reduce the dimensionality of the feature space while preserving the pairwise distances between the data points in the feature space. Experimental results show that the proposed approach outperforms several state-of-the-art tracking methods.
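The WLS step can be illustrated with a minimal sketch that solves the weighted normal equations (AᵀWA + λI)c = AᵀWy for the representation coefficients c. This is not the paper's implementation: the small ridge term λ, the way the per-feature weights are chosen, and the function names are assumptions, and the structurally random projection that would reduce the feature dimension beforehand is left out.

```python
from typing import List


def solve(matrix: List[List[float]], rhs: List[float]) -> List[float]:
    """Solve a small dense linear system by Gaussian elimination
    with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [b] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[r][k] -= factor * aug[col][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(aug[r][k] * x[k] for k in range(r + 1, n))
        x[r] = (aug[r][n] - tail) / aug[r][r]
    return x


def weighted_least_squares(templates: List[List[float]],
                           candidate: List[float],
                           weights: List[float],
                           ridge: float = 1e-8) -> List[float]:
    """Coefficients c minimising sum_k w_k (y_k - (A c)_k)^2 + ridge*|c|^2.

    templates[k][j] is feature k of template j (the matrix A);
    candidate is the feature vector y of one target candidate.
    """
    n_feat, n_tmpl = len(templates), len(templates[0])
    gram = [[sum(weights[k] * templates[k][i] * templates[k][j]
                 for k in range(n_feat)) + (ridge if i == j else 0.0)
             for j in range(n_tmpl)]
            for i in range(n_tmpl)]
    rhs = [sum(weights[k] * templates[k][i] * candidate[k]
               for k in range(n_feat))
           for i in range(n_tmpl)]
    return solve(gram, rhs)
```

Unlike ℓ1-norm minimization, this has a closed-form solution, which is one plausible reason a WLS formulation suits real-time tracking.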
A scale invariant feature transform (SIFT) based mean shift algorithm is presented for object tracking in real scenarios. SIFT features are used to match regions of interest across frames. Meanwhile, mean shift is applied to conduct a similarity search via color histograms. The probability distributions from these two measurements are evaluated in an expectation–maximization scheme to achieve maximum likelihood estimation of similar regions. This mutual support mechanism can maintain consistent tracking performance when one of the two measurements becomes unstable. Experiments demonstrate that the proposed mean shift/SIFT strategy improves on the tracking performance of the classical mean shift and SIFT tracking algorithms in complicated real scenarios.
Learning to recognize novel visual categories from a few examples is a challenging task for machines in real-world industrial applications. In contrast, humans can discriminate even similar objects with little supervision. This article addresses the few-shot fine-grained image classification problem. We propose a feature fusion model that explores discriminative features by focusing on key regions. The model uses a focus-area location mechanism to discover perceptually similar regions among objects. High-order integration is employed to capture the interaction information among intraparts. We also design a center neighbor loss to form robust embedding space distributions. Furthermore, we build a typical fine-grained and few-shot learning dataset, mini PPlankton, from a real-world application in the area of marine ecological environments. Extensive experiments are carried out to validate the performance of our method. The results demonstrate that our model achieves competitive performance compared with state-of-the-art models. Our work is a valuable complement to models for domain-specific industrial applications.
• Examined the spatiotemporal pattern of BSS using random forest with PDP.
• Analyzed the relative importance of influencing factors across different scenarios.
• Revealed the nonlinear and threshold effect of built environment factors.
• Compared the heterogeneous impact of influencing factors between study areas.
To better understand dockless bike-sharing (DBS) usage and advance knowledge of shared bicycle services, this study empirically investigated riding behavior in the time and space dimensions based on multisource datasets. Taking the Central Business District (CBD) and Beijing West Railway Station (BWRS) as study areas, this study analyzed and compared DBS usage between the two areas on a traffic grid basis. Furthermore, a random forest (RF) model was applied to investigate the contribution of influencing factors to origin/destination and origin–destination pair trip volume. Partial Dependence Plots (PDP) analysis was conducted to explore the nonlinear effects of influencing factors. Results show considerable variation across different scenarios. Variables such as government agencies, restaurants, bus stop distance, and metro distance show nonlinear and threshold effects on DBS usage. The findings offer valuable insights for urban infrastructure development, bike rebalancing strategies, and the formulation of green and sustainable transportation policies.
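To make the partial dependence idea concrete, here is a minimal sketch that computes a one-dimensional partial dependence curve for any prediction function. The function `toy_model` is a hypothetical stand-in for a trained random forest, and its 300 m saturation point and the feature layout are invented purely for illustration; they are not results from the study.

```python
from typing import Callable, List, Sequence


def partial_dependence(predict: Callable[[Sequence[float]], float],
                       rows: List[List[float]],
                       feature_index: int,
                       grid: List[float]) -> List[float]:
    """Average prediction over the data with one feature forced to each
    grid value in turn, leaving all other features as observed."""
    curve = []
    for value in grid:
        total = 0.0
        for row in rows:
            modified = list(row)
            modified[feature_index] = value
            total += predict(modified)
        curve.append(total / len(rows))
    return curve


def toy_model(x: Sequence[float]) -> float:
    """Hypothetical model: the effect of feature 0 (say, a distance in
    metres) saturates at 300, which mimics a threshold effect."""
    return min(x[0], 300) / 100 + x[1]


rows = [[100, 1.0], [500, 3.0]]
curve = partial_dependence(toy_model, rows, 0, [0, 300, 600])
print(curve)  # the last two values are equal: flat beyond the threshold
```

A flat segment in such a curve is what a threshold effect looks like in a partial dependence plot: beyond that value, changing the factor no longer moves the predicted trip volume.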
Synthetic aperture radar automatic target recognition (SAR-ATR) has made great progress in recent years. Most established recognition methods are supervised and depend strongly on image labels. However, obtaining labels for radar images is expensive and time-consuming. In this paper, we present a semi-supervised learning method based on standard deep convolutional generative adversarial networks (DCGANs). We double the discriminator used in DCGANs and train the two discriminators jointly. In this process, we introduce a noisy data learning theory to reduce the negative impact of incorrectly labeled samples on the performance of the networks. We replace the last layer of the classic discriminators with the standard softmax function, which outputs a vector of class probabilities so that multiple object classes can be recognized. We then modify the loss function to fit the revised network structure. In our model, the two discriminators share the same generator, and we take their average when computing the loss function of the generator, which can improve the training stability of DCGANs to some extent. We also select higher-quality images from the generated images for training to improve the performance of the networks. Our method achieves state-of-the-art results on the Moving and Stationary Target Acquisition and Recognition (MSTAR) dataset, and we show that using the generated images to train the networks can improve recognition accuracy with a small number of labeled samples.
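The softmax output and the averaged generator loss can be sketched as follows. This is one possible reading of the abstract, not the paper's loss: it assumes each discriminator supplies a probability that a generated sample is real, uses the common non-saturating form −log D(G(z)), and averages the two per-discriminator losses. The names `softmax` and `generator_loss` are hypothetical.

```python
import math
from typing import List


def softmax(logits: List[float]) -> List[float]:
    """Turn raw scores into class probabilities (numerically stable)."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def generator_loss(p_real_d1: List[float], p_real_d2: List[float]) -> float:
    """Average of the two discriminators' non-saturating generator losses.

    p_real_d1[k] and p_real_d2[k] are the assumed probabilities that
    generated sample k is real, according to discriminators 1 and 2.
    """
    def one_loss(probs: List[float]) -> float:
        return -sum(math.log(p) for p in probs) / len(probs)

    return 0.5 * (one_loss(p_real_d1) + one_loss(p_real_d2))
```

Averaging means that a single overconfident discriminator cannot dominate the generator's gradient, which is consistent with the stability benefit the abstract mentions.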
In recent years, as the imaging resolution of synthetic aperture radar (SAR) has improved, methods with higher accuracy and faster speed are urgently needed for ship detection in high-resolution SAR images. Among the available methods, deep-learning-based algorithms offer promising performance owing to end-to-end detection and automated feature extraction. However, several challenges remain: (1) standard anchor-based deep learning detectors have unsolved problems, such as the tuning of anchor-related parameters, scale variation, and high computational cost; (2) SAR data volumes are large, but labeled data are relatively scarce, which may lead to overfitting in training; (3) to improve detection speed, deep learning detectors generally detect targets on low-resolution features, which may cause missed detections of small targets. To address these problems, an anchor-free convolutional network with dense attention feature aggregation is proposed in this paper. First, we use a lightweight feature extractor to extract multiscale ship features. Its inverted residual blocks with depth-wise separable convolution reduce the number of network parameters and improve detection speed. Second, a novel feature aggregation scheme called dense attention feature aggregation (DAFA) is proposed to obtain a high-resolution feature map with multiscale information. By combining the multiscale features through dense connections and iterative fusions, DAFA improves the generalization performance of the network. In addition, an attention block, namely the spatial and channel squeeze and excitation (SCSE) block, is embedded in the upsampling process of DAFA to enhance the salient features of the target and suppress background clutter. Third, an anchor-free detector, a center-point-based ship predictor (CSP), is adopted. CSP regresses the ship centers and ship sizes simultaneously on the high-resolution feature map to implement anchor-free and nonmaximum suppression (NMS)-free ship detection. Experiments on the AirSARShip-1.0 dataset demonstrate the effectiveness of our method. The results show that the proposed method outperforms several mainstream detection algorithms in both accuracy and speed.
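The center-point decoding step can be illustrated with a minimal sketch: a location counts as a ship center when its heatmap score clears a threshold and is the maximum of its 3×3 neighbourhood, and the box is built from the size predicted at that location. The threshold, the stride, the `(width, height)` layout of `sizes`, and the function name `decode_centers` are assumptions for this example rather than details of the CSP head.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float, float]  # x1, y1, x2, y2, score


def decode_centers(heatmap: List[List[float]],
                   sizes: List[List[Tuple[float, float]]],
                   threshold: float = 0.5,
                   stride: int = 1) -> List[Box]:
    """Turn a center heatmap and a size map into boxes without anchors.

    sizes[r][c] is the assumed (width, height) in image pixels predicted
    at feature-map cell (r, c); stride maps cells back to image pixels.
    """
    h, w = len(heatmap), len(heatmap[0])
    boxes = []
    for r in range(h):
        for c in range(w):
            score = heatmap[r][c]
            if score < threshold:
                continue
            # Keep only local peaks of the 3x3 neighbourhood. This plays
            # the role that NMS has in anchor-based detectors.
            neighbours = [heatmap[rr][cc]
                          for rr in range(max(0, r - 1), min(h, r + 2))
                          for cc in range(max(0, c - 1), min(w, c + 2))]
            if score < max(neighbours):
                continue
            box_w, box_h = sizes[r][c]
            cx, cy = c * stride, r * stride
            boxes.append((cx - box_w / 2, cy - box_h / 2,
                          cx + box_w / 2, cy + box_h / 2, score))
    return boxes
```

Because each object is represented by a single peak, no overlapping candidate boxes are generated in the first place, so a separate suppression pass is unnecessary.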
In this paper, we propose a biologically inspired appearance model for robust visual tracking. Motivated in part by the success of the hierarchical organization of the primary visual cortex (area V1), we establish an architecture consisting of five layers: whitening, rectification, normalization, coding, and pooling. The first three layers stem from models developed for object recognition. Our attention here focuses on the coding and pooling layers. In particular, we use a discriminative sparse coding method in the coding layer along with a spatial pyramid representation in the pooling layer, which makes it easier to distinguish the target to be tracked from its background in the presence of appearance variations. An extensive experimental study shows that the proposed method has higher tracking accuracy than several state-of-the-art trackers.