Forecasting the flow of crowds is of great importance to traffic management and public safety, and very challenging as it is affected by many complex factors, including spatial dependencies (nearby and distant), temporal dependencies (closeness, period, trend), and external conditions (e.g., weather and events). We propose a deep-learning-based approach, called ST-ResNet, to collectively forecast two types of crowd flows (i.e., inflow and outflow) in each and every region of a city. We design an end-to-end structure of ST-ResNet based on unique properties of spatio-temporal data. More specifically, we employ the residual neural network framework to model the temporal closeness, period, and trend properties of crowd traffic. For each property, we design a branch of residual convolutional units, each of which models the spatial properties of crowd traffic. ST-ResNet learns to dynamically aggregate the outputs of the three residual neural networks based on data, assigning different weights to different branches and regions. The aggregation is further combined with external factors, such as weather and day of the week, to predict the final crowd traffic in each and every region. We have developed a real-time system based on the Microsoft Azure Cloud, called UrbanFlow, providing crowd flow monitoring and forecasting in Guiyang City, China. In addition, we present an extensive experimental evaluation using two types of crowd flows in Beijing and New York City (NYC), where ST-ResNet outperforms nine well-known baselines.
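To make the aggregation step concrete, here is a minimal NumPy sketch of a parametric fusion of the three branch outputs with an external-factor map. The shapes and random weights are illustrative assumptions, not the authors' implementation:

```python
import numpy as np

# Minimal sketch of the fusion step described in the abstract (not the
# authors' code). Assumes each temporal branch (closeness, period, trend)
# has already produced a 2-channel flow map of shape (2, H, W): inflow
# and outflow for every region of the city grid.
H, W = 32, 32
rng = np.random.default_rng(0)
X_close, X_period, X_trend = (rng.standard_normal((2, H, W)) for _ in range(3))

# Parametric fusion: a learnable weight per branch, channel, and region,
# so different regions can favor different temporal cues.
W_c, W_p, W_t = (rng.standard_normal((2, H, W)) for _ in range(3))
fused = W_c * X_close + W_p * X_period + W_t * X_trend

# External factors (weather, day of week, ...) are embedded into a map of
# the same shape and added before the final squashing nonlinearity.
X_ext = rng.standard_normal((2, H, W))  # placeholder external embedding
prediction = np.tanh(fused + X_ext)     # flows scaled to [-1, 1]
```

The per-region weights are what lets the model assign "different weights to different branches and regions", as the abstract puts it.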
Video clip retrieval and captioning tasks play an essential role in multimodal research and are fundamental research problems for multimodal understanding and generation. The CLIP (Contrastive Language-Image Pre-training) model has demonstrated the power of learning visual concepts from web-collected image-text datasets. In this paper, we propose a CLIP4Clip model to transfer the knowledge of the image-text pre-trained CLIP model to video-text tasks in an end-to-end manner. Furthermore, we conduct several empirical studies addressing: 1) whether image features alone are sufficient for video-text retrieval and captioning; 2) how post-pretraining on a large-scale video-text dataset based on CLIP affects performance; 3) what the practical mechanisms are for modeling temporal dependency between video frames; and 4) the hyper-parameter sensitivity of the model. Extensive experimental results show that the CLIP4Clip model transferred from CLIP can achieve SOTA results on various video-text datasets, including MSR-VTT, MSVD, LSMDC, and DiDeMo, for multimodal understanding and generation tasks.
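As one concrete reading of question 3), the sketch below illustrates the simplest, parameter-free temporal aggregation in this setting: mean-pooling per-frame embeddings and scoring them against a text embedding by cosine similarity. The random vectors are placeholders standing in for the outputs of CLIP's image and text encoders:

```python
import numpy as np

# Sketch of parameter-free temporal aggregation for video-text retrieval:
# mean-pool per-frame image embeddings into a single video embedding,
# then rank candidates by cosine similarity to the text embedding.
rng = np.random.default_rng(0)
frame_emb = rng.standard_normal((12, 512))  # 12 frames, 512-d each (placeholder)
text_emb = rng.standard_normal(512)         # placeholder text embedding

video_emb = frame_emb.mean(axis=0)          # temporal mean pooling

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

score = cosine(video_emb, text_emb)         # retrieval ranking score
```

Sequence-aware alternatives (e.g., a small transformer over the frame embeddings) trade this simplicity for explicit temporal modeling, which is exactly the trade-off the empirical studies probe.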
With the rapid evolution of scientific research, a huge volume of papers is published every year and the number of scholars is also growing fast. How to effectively predict scientific impact has become an important research problem, attracting the attention of researchers in various fields, and it is of great significance for improving research efficiency and assisting in decision-making and scientific evaluation. In this paper, we propose a new framework to perform a systematic survey of scientific impact prediction research. Specifically, we take the four common academic entities into account: papers, scholars, venues, and institutions. We review all the prediction tasks reported in the literature in detail; the input features are divided into six groups: paper-related, author-related, venue-related, institution-related, network-related, and altmetrics-related. Moreover, we classify the forecasting methods into mathematical statistics-based, traditional machine learning-based, deep learning-based, and graph-based, and subdivide each category according to its characteristics. Finally, we discuss open issues and existing challenges, and provide potential research directions.
The amount of data collected from different real-world applications is increasing rapidly. When the volume of data is too large to be loaded into memory, it may be impossible to analyze it using a single computer. Although efforts have been made to manage big data with a single computer, the problem may not be solved in an acceptable time frame, making parallel computing an indispensable way to handle big data. In this paper, we investigate approaches to attribute reduction in parallel using dominance-based neighborhood rough sets (DNRS), which take into consideration the partial orders among numerical and categorical attribute values and can be utilized in multicriteria decision making. We first present some properties of attribute reduction in DNRS, and then investigate principles of parallel attribute reduction in DNRS. Parallelization of the different components of attribute reduction is explored in detail. Furthermore, parallel attribute reduction algorithms in DNRS are proposed. Experimental results on UCI data and big data show that the proposed parallel algorithms are both effective and efficient.
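The paper's algorithms are not reproduced here, but the simplified sketch below illustrates why the computation parallelizes naturally: each object's dominance-based neighborhood can be computed independently of the others. The definition used (objects dominating on every criterion within a distance delta) and the data are assumptions for illustration:

```python
import numpy as np
from concurrent.futures import ProcessPoolExecutor

# Simplified illustration (not the paper's algorithm) of a dominance-based
# neighborhood: y belongs to the neighborhood of x if y dominates x on
# every criterion and lies within distance DELTA of x. Each object's
# neighborhood is independent, making data-level parallelization natural.
rng = np.random.default_rng(0)
X = rng.random((1000, 4))   # 1000 objects, 4 numerical criteria
DELTA = 0.3

def dominance_neighborhood(i):
    dominates = np.all(X >= X[i], axis=1)               # partial order on criteria
    close = np.linalg.norm(X - X[i], axis=1) <= DELTA   # neighborhood constraint
    return np.flatnonzero(dominates & close)

if __name__ == "__main__":
    with ProcessPoolExecutor() as pool:
        neighborhoods = list(pool.map(dominance_neighborhood, range(len(X))))
```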
•We proposed a TES-based defrosting method for cascade air source heat pumps.•The method can shorten defrosting duration and reduce defrosting energy consumption.•The thermal energy stored can also become a source for indoor space heating.
Air source heat pumps (ASHPs) offer the advantage of high energy efficiency, and adopting cascade air source heat pumps (CASHPs) is a promising option for extending their application to colder areas. When CASHPs are operated in heating mode, however, frosting/defrosting has been problematic, and the defrosting methods commonly used by conventional ASHPs cannot be directly applied to CASHPs. Therefore, a thermal energy storage (TES) based reverse cycle defrosting method for CASHPs is proposed, and an experimental study of its operating performance was carried out. Comparative tests using both the standard hot gas by-pass defrosting method and the TES-based reverse cycle defrosting method were performed. The results suggest that with the TES-based reverse cycle defrosting method, the defrosting duration was shortened by 71.4–80.5% and the defrosting energy consumption was reduced by 65.1–85.2%, compared with the standard hot gas by-pass defrosting method. In addition, the stored thermal energy can also serve as a source for indoor space heating: 37.7% of the normal heating capacity could be provided to the heated indoor environment when using the TES-based reverse cycle defrosting method.
Feature selection has attracted extensive attention and aims at selecting features that are highly relevant to classification from raw datasets to improve the performance of a learning model. Fuzzy rough set theory is a powerful mathematical tool for feature selection. However, the classical fuzzy rough set model is very sensitive to noise, and noisy samples frequently appear in classification data. In addition, fuzzy rough set theory does not fit well when the density distribution of the samples in the dataset varies greatly. Thus, it is of great significance to improve the robustness of fuzzy rough set models and their adaptability to data for feature selection. Motivated by these issues, we focus on a robust fuzzy rough set approach for feature selection. We first propose a robust fuzzy rough set model based on data distribution, the Noise-aware Fuzzy Rough Sets (NFRS) model, to achieve noise resistance. This model introduces a novel mechanism that weakens the sensitivity of the approximation operator to noise by weighting samples according to their distribution within the decision classes, and further distinguishes three kinds of samples: intra-class samples, boundary samples, and outlier samples. Then, the degree of relevance of a feature to the class is defined by the dependency function based on the NFRS model to evaluate the significance of a feature subset. On this basis, an evaluation function of feature significance is constructed, which simultaneously considers the relevance and redundancy of a candidate feature with respect to the selected subset and the remaining features. A novel forward greedy search algorithm is presented to select a feature sequence, and the selected features are subsequently evaluated with downstream classification tasks. Experiments on real-world datasets demonstrate the effectiveness of the proposed model and its superiority over baseline methods.
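The following is a loose, simplified sketch of the weighting idea, not the paper's exact NFRS operators: samples are weighted by how typical they are of their own decision class, and those weights discount likely-noisy opponents inside a fuzzy lower approximation that serves as a dependency score. All formulas here are illustrative assumptions:

```python
import numpy as np

# Simplified sketch of a noise-aware fuzzy-rough dependency (illustrative,
# not the NFRS definitions): central samples of a class get weight near 1,
# outliers get small weight, so a single noisy sample cannot collapse the
# lower approximation of a feature subset.
rng = np.random.default_rng(0)
X = rng.random((60, 3))        # samples under a candidate feature subset
y = rng.integers(0, 2, 60)     # decision classes

# Fuzzy similarity between every pair of samples (1 = identical).
sim = 1.0 - np.abs(X[:, None, :] - X[None, :, :]).mean(axis=2)

# A sample's weight: its average similarity to its own decision class.
weight = np.array([sim[i, y == y[i]].mean() for i in range(len(y))])

# Weighted fuzzy lower approximation of each sample's own class: the inf
# over other-class samples of (1 - similarity), with low-weight (likely
# noisy) opponents discounted before taking the infimum.
lower = np.array([
    np.min(1.0 - sim[i, y != y[i]] * weight[y != y[i]])
    for i in range(len(y))
])
dependency = lower.mean()      # significance score of the feature subset
```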
Dividing a network into communities has great benefits in understanding the characteristics of the network. The label propagation algorithm (LPA) is a fast and convenient community detection algorithm. However, the community initialization of LPA does not take advantage of the topological information of networks, and its robustness is poor. In this paper, we propose a stable community detection algorithm based on density peak clustering and label propagation (DS-LPA). First, the local density calculation method of the density peak clustering algorithm is improved for finding the community centers of the network, so as to build suitable initial communities, which improves the quality of the community partition. Then, the label update order is determined reasonably by computing the information transmission power of nodes, and solutions for handling multiple candidate labels are provided, which greatly improves the robustness of the algorithm. DS-LPA is compared with seven other algorithms on synthetic and real-world networks, using NMI, ARI, and modularity to evaluate the algorithms. Statistical testing shows that DS-LPA achieves higher performance than most comparison algorithms on synthetic networks with ten different mixing parameters, and it can quickly compute the best community partition on real-world networks of different sizes.
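As a rough illustration of the initialization idea (a simplification, not DS-LPA itself), one can score nodes by a graph-adapted local density and seed initial communities at the densest nodes. The density measure below, degree plus neighborhood overlap, is an assumption for illustration:

```python
import networkx as nx

# Illustrative sketch: score each node by a graph-adapted "local density"
# (degree plus Jaccard overlap with its neighbors' neighborhoods) and pick
# the highest-density nodes as candidate community centers before label
# propagation. This stands in for the improved density-peak step.
G = nx.karate_club_graph()

def local_density(v):
    nv = set(G[v])
    overlap = sum(len(nv & set(G[u])) / len(nv | set(G[u])) for u in nv)
    return G.degree(v) + overlap

density = {v: local_density(v) for v in G}
seeds = sorted(G, key=density.get, reverse=True)[:4]  # candidate centers
print(seeds)
```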
•A novel consensus multi-view clustering model via non-negative matrix factorization has been proposed to predict Alzheimer’s disease progression.•Multiple brain MRI view datasets were constructed using SIFT, KAZE, and Gabor filter technologies.•Demonstrating that the proposed method achieves superior performance.•Providing a new solution to the medical examination and prevention of Alzheimer’s disease progression.
Machine learning has been used in the past for the auxiliary diagnosis of Alzheimer’s Disease (AD). However, most existing technologies only explore single-view data, require manual parameter setting, and focus on two-class (i.e., dementia or not) classification problems. Unlike single-view data, multi-view data provide a more powerful feature representation capability. Learning with multi-view data is referred to as multi-view learning, which has received growing attention in recent years. In this paper, we propose a new multi-view clustering model called Consensus Multi-view Clustering (CMC) based on non-negative matrix factorization for predicting the multiple stages of AD progression. The proposed CMC applies the multi-view learning idea to fully capture data features from limited medical images, models similarity relations between different entities, addresses the shortcoming of multi-view fusion methods that require manually set parameters, and further acquires a consensus representation containing the shared features and complementary knowledge of the multiple views. It can not only improve the prediction performance for AD, but also screen and classify the symptoms of different phases of AD. Experimental results on data with twelve views, constructed from the brain Magnetic Resonance Imaging (MRI) database of the Alzheimer’s Disease Neuroimaging Initiative, demonstrate the effectiveness of the proposed model.
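A heavily simplified sketch of the consensus idea follows (not the paper's CMC optimization): factorize each view with off-the-shelf NMF, average the row-normalized sample representations into a consensus, and cluster it. The view dimensions and stage count are placeholders:

```python
import numpy as np
from sklearn.decomposition import NMF
from sklearn.cluster import KMeans

# Illustrative consensus pipeline, a simplification of CMC: one NMF per
# view yields a per-sample representation; their average is a consensus
# representation that mixes shared and complementary view information.
rng = np.random.default_rng(0)
n_samples, n_stages = 120, 3
views = [np.abs(rng.standard_normal((n_samples, d))) for d in (50, 80, 40)]

reps = []
for Xv in views:
    W = NMF(n_components=n_stages, init="nndsvda", max_iter=500,
            random_state=0).fit_transform(Xv)            # samples x components
    reps.append(W / (W.sum(axis=1, keepdims=True) + 1e-12))  # row-normalize

consensus = np.mean(reps, axis=0)   # shared representation across views
stages = KMeans(n_clusters=n_stages, n_init=10,
                random_state=0).fit_predict(consensus)   # predicted AD stages
```

The actual CMC model couples the per-view factorizations through a joint objective rather than averaging after the fact; the sketch only conveys why a consensus representation can outperform any single view.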
Attribute reduction is a key step to discover interesting patterns in decision systems with large numbers of attributes. In recent years, with the fast development of data processing tools, the attributes of an information system may increase quickly over time. Since the result of attribute reduction may change as attributes are added, how to update attribute reducts efficiently under attribute generalization has become an important task in knowledge discovery. This paper investigates incremental attribute reduction algorithms based on knowledge granularity in decision systems under the variation of attributes. Incremental mechanisms to calculate the new knowledge granularity are first introduced. Then, corresponding incremental algorithms for attribute reduction based on the calculated knowledge granularity are presented for the case where multiple attributes are added to the decision system. Finally, experiments performed on UCI data sets, together with a complexity analysis, show that the proposed incremental methods are effective and efficient for updating attribute reducts as attributes increase.
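For reference, the sketch below computes knowledge granularity as commonly defined in the rough-set literature, GK(A) = Σ|X_i|² / |U|² over the equivalence classes X_i induced by an attribute set A. Adding attributes refines the partition, so GK can only decrease or stay equal; tracking that change cheaply is what the incremental mechanisms do. The toy decision table is a made-up example:

```python
from collections import defaultdict

# Knowledge granularity of attribute set A over universe U:
# GK(A) = sum over equivalence classes X_i of |X_i|^2, divided by |U|^2.
def knowledge_granularity(table, attrs):
    classes = defaultdict(int)
    for row in table:  # group objects by their values on the chosen attributes
        classes[tuple(row[a] for a in attrs)] += 1
    n = len(table)
    return sum(c * c for c in classes.values()) / (n * n)

U = [{"a": 0, "b": 1}, {"a": 0, "b": 0}, {"a": 1, "b": 1}, {"a": 0, "b": 1}]
print(knowledge_granularity(U, ["a"]))        # 0.625, coarser partition
print(knowledge_granularity(U, ["a", "b"]))   # 0.375, refined after adding "b"
```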
•Investigation of matrix-based incremental mechanisms of knowledge granularity.•Development of a matrix-based incremental attribute reduction updating method when adding multiple attributes.•Presentation of a non-matrix-based incremental reduction method to improve the efficiency of the matrix-based method.
Approximations of a concept by the variable precision rough set model (VPRS) usually vary in a dynamic information system environment. It is thus effective to update approximations incrementally by utilizing previously computed data structures. This paper focuses on a new incremental method for updating approximations of VPRS as objects in the information system change dynamically. It discusses properties of information granulation and approximations in a dynamic environment in which the objects of the universe evolve over time. The variation of an attribute's domain is also considered when performing incremental updates of VPRS approximations. Finally, an extensive experimental evaluation validates the efficiency of the proposed method for the dynamic maintenance of VPRS approximations.
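As background, the sketch below computes β-approximations in Ziarko's VPRS sense: an equivalence class enters the β-lower approximation of a concept X when at least a fraction β of its objects belongs to X, and the β-upper approximation when that fraction exceeds 1 − β (β = 1 recovers the classical model). An incremental method's job is to re-examine only the classes touched by arriving or departing objects. The classes and concept here are a made-up example:

```python
# Variable precision rough set (VPRS) approximations of a concept X,
# given a partition of the universe into equivalence classes.
def vprs_approximations(classes, concept, beta):
    lower, upper = set(), set()
    for eq_class in classes:
        inclusion = len(eq_class & concept) / len(eq_class)
        if inclusion >= beta:
            lower |= eq_class      # confidently inside the concept
        if inclusion > 1 - beta:
            upper |= eq_class      # possibly inside the concept
    return lower, upper

classes = [{1, 2, 3}, {4, 5}, {6, 7, 8, 9}]
concept = {1, 2, 3, 4, 6}
print(vprs_approximations(classes, concept, beta=0.6))
# lower = {1, 2, 3}; upper = {1, 2, 3, 4, 5}
```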