The integrated development of city clusters has given rise to an increasing demand for intercity travel. Intercity ride-pooling services show considerable potential for upgrading traditional intercity bus services through demand-responsive enhancements. Nevertheless, their online operation is inherently complex because vehicle resource allocation among cities is coupled with pooled-ride vehicle routing. To tackle these challenges, this study proposes a two-level framework designed to facilitate online fleet management. Specifically, a novel multi-agent feudal reinforcement learning model is proposed at the upper level of the framework to cooperatively assign idle vehicles to different intercity lines, while the lower level updates the routes of vehicles using an adaptive large neighborhood search heuristic. Numerical studies based on a realistic dataset of Xiamen and its surrounding cities in China show that the proposed framework effectively mitigates supply and demand imbalances and achieves significant improvements in both the average daily system profit and the order fulfillment ratio.
• Investigate the online fleet operation problem for on-demand intercity ride-pooling services.
• Propose a bi-level framework for the coupled problem of vehicle allocation and routing.
• Develop a multi-agent feudal network for fleet assignment to enhance agent cooperation.
• Verify the performance under various network topologies and supply-demand fluctuations.
• Conduct experiments based on realistic operational data to provide managerial insights.
This paper presents a novel multi-pooling architecture that combines the advantages of wavelet and max-pooling operations in convolutional neural networks (CNNs), focusing on semantic segmentation tasks. CNNs often use pooling to reduce the number of parameters, improve invariance to certain distortions, and enlarge the receptive field. However, pooling can cause information loss and is thus detrimental to further operations such as feature extraction and analysis. This problem is particularly critical for semantic segmentation, where each pixel of an image is assigned to a specific class to divide the image into disjoint regions of interest. To address this problem, pooling strategies based on wavelet operations have been proposed with the promise of achieving a better trade-off between receptive field size and computational efficiency. Previous works have confirmed the superiority of wavelet pooling over traditional pooling in semantic segmentation tasks. However, in our computational experiments, the expressive gains reported for wavelet pooling in other segmentation tasks were not observed for aerial imagery, owing to imprecision in the segmentation of image details. The combination of wavelet pooling and max-pooling, a solution not yet reported in the literature, can address that issue. This gap motivated the two proposals that are the main contributions of this paper: (a) a new multi-pooling strategy combining wavelet and traditional pooling in a new network structure suitable for aerial image segmentation tasks; (b) two-stream architectures using traditional max-pooling and wavelet pooling as streams. These proposals were implemented using Segnet, a well-known architecture for semantic segmentation.
The computational experiments, based on the IRRG images from the Potsdam and Vaihingen data sets, demonstrated that the proposed architectures surpassed the original Segnet architecture’s performance with results comparable to state-of-the-art approaches.
• A new multi-pooling strategy combining wavelet and traditional pooling.
• A new version of Segnet, named MPSegnet, using the new multi-pooling strategy.
• A two-stream architecture that combines the Segnet network with MPSegnet.
• The efficiency of the multi-pooling scheme was verified using complexity and visual analysis.
• Experiments showing that our methods are comparable to the state-of-the-art.
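As a rough illustration of what combining wavelet and max-pooling can look like, the sketch below pools one 2D feature map in two ways and stacks the results. It assumes a single-level orthonormal Haar transform whose low-pass (LL) subband stands in for wavelet pooling, and it assumes the two outputs are simply stacked as channels. The function names and the fusion rule are illustrative assumptions, not the MPSegnet design.

```python
import numpy as np

def max_pool_2x2(x):
    """2x2 max pooling with stride 2 on an (H, W) map; H and W must be even."""
    h, w = x.shape
    blocks = x.reshape(h // 2, 2, w // 2, 2)
    return blocks.max(axis=(1, 3))

def haar_ll_pool_2x2(x):
    """Low-pass (LL) subband of a one-level orthonormal 2D Haar transform."""
    h, w = x.shape
    blocks = x.reshape(h // 2, 2, w // 2, 2)
    # Orthonormal Haar LL coefficient of a 2x2 block: (a + b + c + d) / 2.
    return blocks.sum(axis=(1, 3)) / 2.0

def multi_pool(x):
    """Hypothetical fusion: stack both pooled maps along a new channel axis."""
    return np.stack([max_pool_2x2(x), haar_ll_pool_2x2(x)], axis=0)

x = np.arange(16, dtype=float).reshape(4, 4)
pooled = multi_pool(x)  # shape (2, 2, 2): channel 0 is max, channel 1 is Haar LL
```

Max-pooling keeps the strongest local response, while the Haar LL subband keeps a smoothed summary of the whole block, so stacking them gives later layers access to both kinds of information.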
Face anti-spoofing (FAS) plays a vital role in securing face recognition systems. Existing methods rely heavily on expert-designed networks, which may lead to a sub-optimal solution for the FAS task. Here we propose the first FAS method based on neural architecture search (NAS), called NAS-FAS, to discover well-suited task-aware networks. Unlike previous NAS works, which mainly focus on developing efficient search strategies for generic object classification, we pay more attention to studying the search spaces for the FAS task. The challenges of utilizing NAS for FAS are twofold: the networks searched on 1) a specific acquisition condition might perform poorly in unseen conditions, and 2) particular spoofing attacks might generalize badly to unseen attacks. To overcome these two issues, we develop a novel search space consisting of central difference convolution and pooling operators. Moreover, an efficient static-dynamic representation is exploited for fully mining the FAS-aware spatio-temporal discrepancy. In addition, we propose Domain/Type-aware Meta-NAS, which leverages cross-domain/type knowledge for robust searching. Finally, in order to evaluate NAS transferability across datasets and unknown attack types, we release a large-scale 3D mask dataset, namely CASIA-SURF 3DMask, to support the new 'cross-dataset cross-type' testing protocol. Experiments demonstrate that the proposed NAS-FAS achieves state-of-the-art performance on nine FAS benchmark datasets with four testing protocols.
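The central difference convolution operator mentioned above can be sketched on a single-channel image as follows. The sketch assumes the commonly used formulation in which a fraction theta of the kernel-sum-weighted centre pixel is subtracted from a vanilla convolution, so that theta = 0 recovers the vanilla operator. The function names, the default theta, and the single-channel "valid" setting are illustrative assumptions rather than details taken from the abstract.

```python
import numpy as np

def conv2d_valid(x, w):
    """Plain 'valid' cross-correlation of an (H, W) image with a (kh, kw) kernel."""
    kh, kw = w.shape
    oh, ow = x.shape[0] - kh + 1, x.shape[1] - kw + 1
    out = np.zeros((oh, ow))
    for i in range(kh):
        for j in range(kw):
            out += w[i, j] * x[i:i + oh, j:j + ow]
    return out

def central_difference_conv2d(x, w, theta=0.7):
    """Vanilla response minus theta * sum(w) * centre pixel of each window."""
    kh, kw = w.shape
    vanilla = conv2d_valid(x, w)
    oh, ow = vanilla.shape
    # Centre pixel of every sliding window (odd kernel sizes assumed).
    center = x[kh // 2:kh // 2 + oh, kw // 2:kw // 2 + ow]
    return vanilla - theta * w.sum() * center
```

With theta = 1 the operator responds only to differences between neighbours and the centre pixel, so a constant image yields an all-zero output.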
• A novel multi-scale convolutional transfer learning network is established.
• The proposed method focuses on rolling bearing fault diagnosis without any signal preprocessing or feature pre-extraction.
• The proposed method shows excellent adaptability to interference from different working conditions and domains.
• The proposed model improves the richness of the features through parallel stacking in the width direction.
Intelligent fault detection and diagnosis plays a crucial role in ensuring the stable, reliable and safe operation of rolling bearings, which are among the main components of rotating machinery. However, data distribution shift is inevitable in practice due to changes in internal and external environments, so it remains challenging to establish an effective fault diagnosis model that does not rely on the same-distribution assumption. In light of these demands, a novel transfer learning framework based on a deep multi-scale convolutional neural network (MSCNN) is presented in this paper. First, a novel multi-scale module is established based on dilated convolution and is used as the key part to obtain differential features through different receptive fields. Then, in order to further reduce the complexity of the proposed model, global average pooling is adopted to replace the traditional fully-connected layer. Finally, the architecture and weights of the MSCNN pre-trained on the source domain are transferred to other different but similar tasks with proper fine-tuning instead of training a network from scratch. The proposed MSCNN is evaluated on different transfer scenarios constructed on two well-known rolling bearing test beds. Three case studies show that the proposed framework not only has excellent performance on the source domain, but also has superior transferability across variable working conditions and domains.
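The two building blocks named above, dilated convolution at several scales and global average pooling, can be sketched for a 1D vibration signal as follows. This is a minimal illustration that assumes one shared kernel, three parallel dilation rates, and no padding. The names `dilated_conv1d` and `multi_scale_features` and the dilation rates are hypothetical, not the MSCNN configuration.

```python
import numpy as np

def dilated_conv1d(x, w, dilation):
    """'Valid' 1D cross-correlation with kernel taps spaced `dilation` samples apart."""
    k = len(w)
    span = dilation * (k - 1)          # receptive field is span + 1 samples
    n_out = len(x) - span
    out = np.zeros(n_out)
    for i in range(k):
        out += w[i] * x[i * dilation:i * dilation + n_out]
    return out

def multi_scale_features(x, w, dilations=(1, 2, 4)):
    """Run parallel dilated branches, then global-average-pool each branch."""
    return np.array([dilated_conv1d(x, w, d).mean() for d in dilations])

signal = np.arange(10, dtype=float)
kernel = np.array([1.0, 0.0, -1.0])
features = multi_scale_features(signal, kernel)  # one value per dilation rate
```

Each branch sees a different receptive field with the same number of weights, and global average pooling reduces every branch to a single value without adding the parameters a fully-connected layer would.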
We present a deep neural network-based approach to image quality assessment (IQA). The network is trained end-to-end and comprises ten convolutional layers and five pooling layers for feature extraction, and two fully connected layers for regression, which makes it significantly deeper than related IQA models. Unique features of the proposed architecture are that: 1) with slight adaptations it can be used in a no-reference (NR) as well as in a full-reference (FR) IQA setting and 2) it allows for joint learning of local quality and local weights, i.e., relative importance of local quality to the global quality estimate, in a unified framework. Our approach is purely data-driven and does not rely on hand-crafted features or other types of prior domain knowledge about the human visual system or image statistics. We evaluate the proposed approach on the LIVE, CSIQ, and TID2013 databases as well as the LIVE In the Wild Image Quality Challenge Database and show superior performance to state-of-the-art NR and FR IQA methods. Finally, cross-database evaluation shows a high ability to generalize between different databases, indicating a high robustness of the learned features.
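The idea of combining local quality estimates with local weights can be illustrated with a weighted average over image patches. The sketch below assumes the global score is the weight-normalized sum of per-patch scores and that weights are non-negative. The function name, the `eps` stabilizer, and the input values are illustrative assumptions. In the described network both inputs would be predicted by the model rather than supplied by hand.

```python
import numpy as np

def weighted_quality(local_quality, local_weights, eps=1e-8):
    """Aggregate per-patch quality scores into one global score.

    Each weight expresses how much its patch contributes to the global estimate.
    """
    q = np.asarray(local_quality, dtype=float)
    w = np.asarray(local_weights, dtype=float)
    if np.any(w < 0):
        raise ValueError("weights are assumed to be non-negative")
    # eps guards against division by zero when all weights vanish.
    return float((w * q).sum() / (w.sum() + eps))

score = weighted_quality([1.0, 3.0], [1.0, 3.0])  # the second patch dominates
```

With equal weights the result reduces to the plain mean of the patch scores.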
We present a novel and practical deep fully convolutional neural network architecture for semantic pixel-wise segmentation termed SegNet. This core trainable segmentation engine consists of an encoder network, a corresponding decoder network followed by a pixel-wise classification layer. The architecture of the encoder network is topologically identical to the 13 convolutional layers in the VGG16 network [1]. The role of the decoder network is to map the low resolution encoder feature maps to full input resolution feature maps for pixel-wise classification. The novelty of SegNet lies in the manner in which the decoder upsamples its lower resolution input feature map(s). Specifically, the decoder uses pooling indices computed in the max-pooling step of the corresponding encoder to perform non-linear upsampling. This eliminates the need for learning to upsample. The upsampled maps are sparse and are then convolved with trainable filters to produce dense feature maps. We compare our proposed architecture with the widely adopted FCN [2] and also with the well-known DeepLab-LargeFOV [3] and DeconvNet [4] architectures. This comparison reveals the memory versus accuracy trade-off involved in achieving good segmentation performance. SegNet was primarily motivated by scene understanding applications. Hence, it is designed to be efficient both in terms of memory and computational time during inference. It is also significantly smaller in the number of trainable parameters than other competing architectures and can be trained end-to-end using stochastic gradient descent. We also performed a controlled benchmark of SegNet and other architectures on both road scenes and SUN RGB-D indoor scene segmentation tasks. These quantitative assessments show that SegNet provides good performance with competitive inference time and the most efficient inference memory-wise as compared to other architectures.
We also provide a Caffe implementation of SegNet and a web demo at http://mi.eng.cam.ac.uk/projects/segnet/.
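The index-based upsampling described above can be sketched in NumPy for a single 2D feature map. The encoder step records which position in each 2x2 window held the maximum, and the decoder step writes each pooled value back to that position, leaving zeros elsewhere. This is a minimal illustration with hypothetical function names, not the Caffe implementation.

```python
import numpy as np

def max_pool_with_indices(x):
    """2x2 max pooling (stride 2) that also returns the argmax inside each window."""
    h, w = x.shape
    # Group each 2x2 window into a length-4 vector ordered (row offset, col offset).
    blocks = x.reshape(h // 2, 2, w // 2, 2).transpose(0, 2, 1, 3)
    blocks = blocks.reshape(h // 2, w // 2, 4)
    return blocks.max(axis=2), blocks.argmax(axis=2)

def max_unpool(pooled, idx):
    """Place each pooled value at its recorded position; all other entries are zero."""
    ph, pw = pooled.shape
    out = np.zeros((ph * 2, pw * 2), dtype=pooled.dtype)
    rows = 2 * np.arange(ph)[:, None] + idx // 2
    cols = 2 * np.arange(pw)[None, :] + idx % 2
    out[rows, cols] = pooled
    return out

x = np.array([[1., 2., 0., 0.],
              [3., 4., 0., 9.],
              [5., 0., 0., 0.],
              [0., 0., 7., 0.]])
pooled, idx = max_pool_with_indices(x)
sparse = max_unpool(pooled, idx)  # sparse map, to be densified by trainable filters
```

Only the indices need to be stored for the decoder, which is much cheaper than keeping the full encoder feature maps, and the unpooling step itself has no learnable parameters.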
We propose a deep bilinear model for blind image quality assessment that works for both synthetically and authentically distorted images. Our model comprises two streams of deep convolutional neural networks (CNNs), specializing in two distortion scenarios separately. For synthetic distortions, we first pre-train a CNN to classify the distortion type and the level of an input image, whose ground truth label is readily available at a large scale. For authentic distortions, we make use of a CNN (VGG-16) pre-trained for the image classification task. The two feature sets are bilinearly pooled into one representation for a final quality prediction. We fine-tune the whole network on the target databases using a variant of stochastic gradient descent. The extensive experimental results show that the proposed model achieves state-of-the-art performance on both synthetic and authentic IQA databases. Furthermore, we verify the generalizability of our method on the large-scale Waterloo Exploration Database, and demonstrate its competitiveness using the group maximum differentiation competition methodology.
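Bilinear pooling of two feature sets can be sketched as the sum, over spatial locations, of the outer product of the two streams' feature vectors. The sketch below assumes both streams produce maps with the same spatial size and omits any normalization applied afterwards. The function name and the toy inputs are illustrative assumptions.

```python
import numpy as np

def bilinear_pool(f1, f2):
    """Bilinearly pool feature maps of shape (C1, H, W) and (C2, H, W).

    Returns a vector of length C1 * C2 holding the outer products of the two
    streams' feature vectors, summed over all spatial locations.
    """
    c1, c2 = f1.shape[0], f2.shape[0]
    a = f1.reshape(c1, -1)   # (C1, H*W)
    b = f2.reshape(c2, -1)   # (C2, H*W)
    return (a @ b.T).reshape(-1)

f1 = np.array([[[1.0, 2.0]], [[0.0, 1.0]]])               # 2 channels, 1x2 map
f2 = np.array([[[1.0, 0.0]], [[0.0, 1.0]], [[1.0, 1.0]]])  # 3 channels, 1x2 map
pooled = bilinear_pool(f1, f2)  # length 6, independent of the spatial size
```

Because the spatial dimensions are summed out, the pooled vector captures pairwise interactions between channels of the two streams regardless of input resolution.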