Landmark retrieval aims to return a set of images whose landmarks are similar to those of the query images. Existing studies on landmark retrieval focus on exploiting the geometries of landmarks for visual similarity matching. However, the visual content of social images varies widely within many landmarks, and some images share common patterns across different landmarks. On the other hand, it has been observed that social images usually contain multimodal content, i.e., visual content and text tags, and each landmark has unique characteristics in both. Therefore, approaches based on similarity matching may not be effective in this environment. In this paper, we investigate whether the geographical correlation between the visual content and the text content can be exploited for landmark retrieval. In particular, we propose an effective multimodal landmark classification paradigm that leverages the multimodal content of social images for landmark retrieval, integrating feature refinement and a landmark classifier over multimodal content in a joint model. Geo-tagged images are automatically labeled for classifier learning. Visual features are refined based on low-rank matrix recovery, and a multimodal classifier with group sparsity is learned from the automatically labeled images. Finally, candidate images are ranked by combining the classification result with a measure of semantic consistency between the visual content and the text content. Experiments on real-world datasets demonstrate the superiority of the proposed approach compared to existing methods.
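To illustrate the feature-refinement step, the following is a minimal numpy sketch of robust-PCA-style low-rank matrix recovery (illustrative only, not the paper's actual algorithm; the thresholds `tau` and `lam` are assumed defaults): a noisy visual-feature matrix X is split into a low-rank part L (shared structure) and a sparse part S (corruptions) by alternating singular-value thresholding and soft thresholding.

```python
import numpy as np

def refine_features(X, lam=None, tau=1.0, n_iter=50):
    """Split a noisy feature matrix X into a low-rank part L (shared
    visual structure) and a sparse part S (corruptions), X ~ L + S,
    by alternating singular-value thresholding and soft thresholding."""
    m, n = X.shape
    if lam is None:
        lam = 1.0 / np.sqrt(max(m, n))   # common robust-PCA weight
    S = np.zeros_like(X)
    for _ in range(n_iter):
        # low-rank step: shrink the singular values of X - S by tau
        U, sig, Vt = np.linalg.svd(X - S, full_matrices=False)
        L = U @ np.diag(np.maximum(sig - tau, 0.0)) @ Vt
        # sparse step: soft-threshold the residual elementwise
        R = X - L
        S = np.sign(R) * np.maximum(np.abs(R) - lam * tau, 0.0)
    return L, S
```

The refined low-rank part L would then feed the downstream classifier in place of the raw features.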
Accurately predicting user–item interactions is critically important in many real applications, including recommender systems and user behavior analysis in social networks. One major drawback of existing studies is that they generally analyze the sparse user–item interaction data directly, without considering its semantic correlations and the structural information hidden in the data. Another limitation is that existing approaches usually embed users and items into different embedding spaces in a static way, ignoring the dynamic characteristics of both. In this paper, we propose to learn dynamic embedding-vector trajectories, rather than static embedding vectors, for users and items simultaneously. A Metapath-guided Recursive RNN based Shift embedding method named MRRNN-S is proposed to learn the continuously evolving embeddings of users and items for more accurately predicting their future interactions. MRRNN-S extends our previous model RRNN-S, proposed in our earlier work. Compared with RRNN-S, we add a word2vec module and a skip-gram-based meta-path module to better capture the rich auxiliary information in the user–item interaction data. Specifically, we first treat each user's interaction sequence with items as sentence data to model its semantic and sequential information, and construct the user–item interaction graph. We then sample instances of meta-paths to capture the heterogeneity and structural information of the user–item interaction graph. A recursive RNN is proposed to iteratively and mutually learn the dynamic user and item embeddings in the same latent space based on their historical interactions. Next, a shift embedding module is proposed to predict the future user embeddings. To predict which item a user will interact with, we output the item embedding instead of the pairwise interaction probability between users and items, which is much more efficient.
Through extensive experiments on three real-world datasets, we demonstrate that MRRNN-S outperforms state-of-the-art baseline models.
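The recursive mutual update and shift-based prediction can be made concrete with a toy numpy sketch (not the actual MRRNN-S architecture: the dimensions, tanh updates, and linear shift are all illustrative stand-ins). Each interaction refreshes both the user and item embedding in one shared latent space; prediction shifts the user embedding forward in time and returns the nearest item embedding rather than scoring all user–item pairs.

```python
import numpy as np

rng = np.random.default_rng(0)
D = 8                                   # embedding dimension (hypothetical)
n_users, n_items = 4, 6
U = rng.normal(0, 0.1, (n_users, D))    # user embeddings
V = rng.normal(0, 0.1, (n_items, D))    # item embeddings
Wu = rng.normal(0, 0.1, (D, 2 * D))     # user-update weights
Wi = rng.normal(0, 0.1, (D, 2 * D))     # item-update weights
Ws = rng.normal(0, 0.1, (D, D))         # shift (drift) weights

def step(u, i):
    """One mutual recursive update: each embedding is refreshed from
    the concatenation of both current embeddings."""
    u_new = np.tanh(Wu @ np.concatenate([U[u], V[i]]))
    i_new = np.tanh(Wi @ np.concatenate([V[i], U[u]]))
    U[u], V[i] = u_new, i_new

def predict_item(u, dt):
    """Shift the user embedding forward by elapsed time dt, then return
    the nearest item embedding (no pairwise probability scoring)."""
    u_future = U[u] + dt * (Ws @ U[u])
    return int(np.argmin(np.linalg.norm(V - u_future, axis=1)))

for u, i in [(0, 1), (0, 2), (1, 2), (0, 1)]:   # toy interaction stream
    step(u, i)
next_item = predict_item(0, dt=0.5)
```

Returning the nearest item embedding reduces prediction to a single nearest-neighbor lookup, which is the efficiency point made above.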
Many applications use Global Positioning System (GPS) data, which provide rich context information for multiple purposes. The easier availability of GPS data can facilitate various mobile applications, one of which is inferring the mobility of a user. Most existing works on inferring users' transportation modes need to combine GPS data with other types of data, such as accelerometer and Global System for Mobile Communications (GSM) data. However, this dependency on data sources other than GPS makes such applications difficult to use when the peer data source is unavailable. In this paper, we introduce a new generic framework for inferring transportation mode using only GPS data. Our contribution is threefold. First, we propose a new method for GPS trajectory data preprocessing using a grid probability distribution function. Second, we introduce an algorithm for change point-based trajectory segmentation to more effectively identify single-mode segments from GPS trajectories. Third, we introduce new statistical topographic features that are more discriminative for transportation mode detection. Through extensive evaluation on the large trajectory dataset GeoLife, our approach shows significant improvement in accuracy over state-of-the-art baseline models.
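The change point-based segmentation in the second contribution can be illustrated with a simple speed-based detector (a hedged sketch, not the paper's algorithm: the window size, threshold, cooldown, and use of a pure speed signal are illustrative assumptions). A change point is declared wherever the mean speed shifts sharply, splitting the trajectory into candidate single-mode segments.

```python
import numpy as np

def segment_by_change_points(speeds, win=5, thresh=2.0):
    """Split a speed sequence (m/s) into single-mode segments: declare a
    change point where the mean of the next `win` samples differs from
    the mean of the previous `win` samples by more than `thresh`, with a
    cooldown of 2*win samples between consecutive change points."""
    cps = []
    for t in range(win, len(speeds) - win):
        before = np.mean(speeds[t - win:t])
        after = np.mean(speeds[t:t + win])
        if abs(after - before) > thresh and (not cps or t - cps[-1] >= 2 * win):
            cps.append(t)
    bounds = [0] + cps + [len(speeds)]
    return [speeds[a:b] for a, b in zip(bounds, bounds[1:])]

# toy trajectory: walking (~1.5 m/s) then a bus ride (~8 m/s)
speeds = np.r_[np.full(30, 1.5), np.full(30, 8.0)]
segments = segment_by_change_points(speeds)
```

Each returned segment would then be classified independently by the mode detector.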
Currently, many intelligence systems contain texts from multiple sources, e.g., bulletin board system posts, tweets, and news. These texts can be "comparative", since they may be semantically correlated and thus provide different perspectives on the same topics or events. To better organize multi-sourced texts and obtain more comprehensive knowledge, we propose to study the novel problem of Mutual Clustering on Comparative Texts (MCCT), which aims to cluster the comparative texts simultaneously and collaboratively. The MCCT problem is difficult to address because 1) comparative texts usually present different data formats and structures, which makes them hard to organize, and 2) an effective method is lacking for connecting the semantically correlated comparative texts so that they can be clustered in a unified way. To this end, in this paper we propose a Heterogeneous Information Network-based Text clustering framework, HINT. HINT first models multi-sourced texts (e.g., news and tweets) as heterogeneous information networks by introducing shared "anchor texts" to connect the comparative texts. Next, two similarity matrices based on HINT, as well as a transition matrix for cross-text-source knowledge transfer, are constructed. The comparative texts are then clustered using the constructed matrices. Finally, a mutual clustering algorithm is proposed to further unify the separate clustering results of the comparative texts by introducing a clustering consistency constraint. We conduct extensive experiments on three tweets-news datasets, and the results demonstrate the effectiveness and robustness of the proposed method in addressing the MCCT problem.
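A toy numpy sketch of the cross-source knowledge-transfer step (illustrative only: the transition matrix here is a hard 0/1 anchor indicator, and a two-way spectral split stands in for the paper's mutual clustering algorithm). Source-side similarity structure is projected into the target text space through the anchor-based transition matrix and blended into the target similarity before clustering.

```python
import numpy as np

def blend_with_transfer(S_tgt, S_src, T, alpha=0.3):
    """Blend target-side similarity with source-side structure
    transferred through the anchor-text transition matrix T
    (T[i, j] = 1 if target text i and source text j share an anchor)."""
    transferred = T @ S_src @ T.T
    m = transferred.max()
    if m > 0:
        transferred = transferred / m   # keep values on a similarity scale
    return (1 - alpha) * S_tgt + alpha * transferred

def spectral_2way(S):
    """Split texts into two clusters from the sign pattern of the second
    eigenvector of the normalized similarity (a minimal spectral stand-in
    for the clustering step)."""
    d = S.sum(axis=1)
    Dn = np.diag(1.0 / np.sqrt(np.maximum(d, 1e-12)))
    w, v = np.linalg.eigh(Dn @ S @ Dn)   # eigenvalues in ascending order
    return (v[:, -2] > 0).astype(int)
```

On clean block-structured similarities the second eigenvector changes sign exactly between the two text groups, which is why the sign pattern recovers the clusters.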
Introduction
Severe typhoons, as extreme weather events, can cause large numbers of casualties and extensive property damage in coastal areas. There are mainly three kinds of methods for predicting severe typhoon formation: numerical methods, statistical methods, and machine learning-based methods. However, existing methods do not consider the imbalance between the numbers of ordinary typhoon samples and severe typhoon samples, which makes their accuracy in predicting severe typhoons much lower than for ordinary typhoons.
Methods
In this paper, we propose an unbalanced severe typhoon formation prediction (USFP) framework based on transfer learning. We first propose a severe typhoon pre-learning model, which learns prior knowledge from a constructed balanced dataset. We then propose an unbalanced severe typhoon re-learning model, which utilizes the prior knowledge learned by the pre-learning model. Our USFP framework fuses three kinds of variables: atmospheric variables, sea surface variables, and ocean hydrographic variables.
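The pre-learn/re-learn pipeline can be sketched with a toy warm-started classifier (a minimal illustration, not USFP itself: plain logistic regression replaces the paper's models, and the features are synthetic). Stage 1 trains on a balanced subset; stage 2 fine-tunes on the full imbalanced data starting from the stage-1 weights, which is how the prior knowledge is transferred.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fit_logreg(X, y, w=None, lr=0.1, n_iter=500):
    """Logistic regression by gradient descent; `w` warm-starts the
    weights, which is how the re-learning stage reuses prior knowledge."""
    if w is None:
        w = np.zeros(X.shape[1])
    for _ in range(n_iter):
        w -= lr * X.T @ (sigmoid(X @ w) - y) / len(y)
    return w

rng = np.random.default_rng(0)
# imbalanced data: 200 ordinary vs 20 severe samples (synthetic features)
X_ord = rng.normal(0.0, 1.0, (200, 5))
X_sev = rng.normal(1.5, 1.0, (20, 5))
X = np.vstack([X_ord, X_sev])
y = np.r_[np.zeros(200), np.ones(20)]
# stage 1 (pre-learning): balanced subset of all severe samples plus an
# equal number of subsampled ordinary samples
idx = rng.choice(200, 20, replace=False)
w0 = fit_logreg(np.vstack([X_ord[idx], X_sev]),
                np.r_[np.zeros(20), np.ones(20)])
# stage 2 (re-learning): fine-tune on the full imbalanced set,
# warm-started from the pre-learned weights
w = fit_logreg(X, y, w=w0.copy(), lr=0.02, n_iter=200)
```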
Results
Extensive experiments on datasets from three different regions show that our USFP framework outperforms the numerical IFS model of the ECMWF and existing machine learning methods.
Higher-accuracy long-term ocean temperature prediction plays a critical role in ocean-related research fields and climate forecasting (e.g., oceanic internal waves and mesoscale eddies). The essential component of traditional physics-based numerical models for ocean temperature prediction is solving partial differential equations (PDEs), which poses immense challenges in terms of parameterization, initial values, and boundary condition setting. Moreover, existing machine learning models for ocean temperature prediction have "black box" problems, do not consider the influence of external dynamic factors, and make it hard to judge whether the model satisfies certain physical laws. In this paper, we propose a physics-guided spatio-temporal data analysis model based on the widely used ConvLSTM model to achieve long-term ocean temperature prediction, and adopt two training schemes: vector output, and multiple-parallel-input multi-step output. Meanwhile, considering the spatio-temporal correlation, physical information such as stable ocean stratification is introduced to guide model training. We evaluate our proposed approach against several popular deep learning models over different timesteps and data volumes on the northern coast of the South China Sea, where the frequent occurrence of internal waves leads to strong local variations in sea temperature. The results show higher prediction accuracy compared with traditional LSTM and ConvLSTM models, and the introduction of physical laws improves data utilization while enhancing the physical consistency of the model.
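The physics-guided idea (penalizing predictions that violate stable stratification) can be sketched as an augmented training loss. This is an illustrative numpy version; the squared-hinge penalty form and weight `lam` are assumptions, not the paper's exact formulation: in a stably stratified ocean, temperature should not increase with depth, so positive vertical temperature gradients in the prediction are penalized on top of the data-fit error.

```python
import numpy as np

def physics_guided_loss(pred, target, depth_axis=0, lam=0.1):
    """Training loss = MSE + a stable-stratification penalty: any
    positive temperature gradient with increasing depth (an unphysical,
    unstable profile) adds to the loss. `pred` is ordered shallow ->
    deep along `depth_axis`."""
    mse = np.mean((pred - target) ** 2)
    d_temp = np.diff(pred, axis=depth_axis)   # T(deeper) - T(shallower)
    violation = np.mean(np.maximum(d_temp, 0.0) ** 2)
    return mse + lam * violation
```

In training, this term steers the model toward physically consistent profiles even where observations are sparse, which is the data-utilization benefit noted above.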
Image location prediction estimates the geolocation where an image was taken, which is important for many image applications, such as image retrieval and image browsing and organization. Since a social image contains heterogeneous content, such as visual content and textual content, effectively incorporating this content to predict location is nontrivial. Moreover, it is observed that image content patterns and the locations where they may appear correlate hierarchically. Traditional image location prediction methods mainly adopt a single-level architecture and assume images are independently distributed in geographical space, which does not adapt directly to this hierarchical correlation. In this paper, we propose a geographically hierarchical bi-modal deep belief network (GH-BDBN) model, a compositional learning architecture that integrates a multi-modal deep learning model with a non-parametric hierarchical prior model. GH-BDBN learns a joint representation capturing the correlations among different types of image content using a bi-modal DBN, with a geographically hierarchical prior over the joint representation to model the hierarchical correlation between image content and location. An efficient inference algorithm is then proposed to learn the parameters and the hierarchical structure of geographical locations. Experimental results demonstrate the superiority of our model for image location prediction.
Traffic flow prediction has received rising research interest recently, since it is a key step to preventing and relieving traffic congestion in urban areas. Existing methods mostly focus on road-level or region-level traffic prediction, and fail to deeply capture the high-order spatial-temporal correlations among road links needed to perform road network-level prediction. In this paper, we propose a network-scale deep traffic prediction model called TrafficGAN, in which Generative Adversarial Nets (GAN) are utilized to predict traffic flows under an adversarial learning framework. To capture the spatial-temporal correlations among the road links of a road network, both Convolutional Neural Net (CNN) and Long Short-Term Memory (LSTM) models are embedded into TrafficGAN. In addition, we design a deformable convolution kernel for the CNN to better handle the input road network data. We extensively evaluate our proposal over two large GPS probe datasets covering the arterial road network of downtown Chicago and the Bay Area of California. The results show that TrafficGAN significantly outperforms both traditional statistical models and state-of-the-art deep learning models in network-scale short-term traffic flow prediction.
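The deformable convolution kernel mentioned above hinges on sampling the feature map at learned fractional offsets rather than the fixed integer kernel grid. A minimal numpy sketch of that bilinear sampling core follows (the sampling points here are hand-picked for illustration; a real deformable kernel would add learned offsets to each grid position and then apply learned weights):

```python
import numpy as np

def deformable_sample(feat, points):
    """Bilinear sampling at fractional (y, x) locations: the core of a
    deformable convolution, which reads the feature map at kernel-grid
    positions plus learned offsets instead of the fixed integer grid."""
    H, W = feat.shape
    out = np.zeros(len(points))
    for k, (py, px) in enumerate(points):
        y0 = int(np.clip(np.floor(py), 0, H - 2))
        x0 = int(np.clip(np.floor(px), 0, W - 2))
        dy, dx = py - y0, px - x0
        out[k] = ((1 - dy) * (1 - dx) * feat[y0, x0]
                  + (1 - dy) * dx * feat[y0, x0 + 1]
                  + dy * (1 - dx) * feat[y0 + 1, x0]
                  + dy * dx * feat[y0 + 1, x0 + 1])
    return out

feat = np.arange(16, dtype=float).reshape(4, 4)   # toy feature map
samples = deformable_sample(feat, [(1.5, 1.5), (2.0, 3.0)])
```

Letting the offsets bend the sampling grid is what allows a kernel to follow irregular road-network geometry instead of a rigid image lattice.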
Exploiting deep learning techniques for traffic flow prediction has become increasingly widespread. Most existing studies combine CNNs or GCNs with recurrent neural networks to extract the spatio-temporal features in traffic networks. Traffic networks can be naturally modeled as graphs, which are effective for capturing the topology and spatial correlations among road links. The issue is that the traffic network is dynamic due to continuous changes in the traffic environment. Compared with a static graph, a dynamic graph can better reflect the spatio-temporal features of the traffic network. However, in practical applications, due to the limited accuracy and timeliness of data, it is hard to generate graph structures from frequent statistical data. It is therefore necessary to design a method that overcomes data defects in traffic flow prediction. In this paper, we propose a long-term traffic flow prediction method based on dynamic graphs. The traffic network is modeled by dynamic traffic flow probability graphs, and graph convolution is performed on the dynamic graphs to learn spatial features, which are then combined with LSTM units to learn temporal features. In particular, we further propose to use a graph convolutional policy network based on reinforcement learning to generate dynamic graphs when they are incomplete due to the data sparsity issue. Tests of our method on city-bike data from New York City demonstrate that our model achieves stable and effective long-term traffic flow predictions and reduces the impact of data defects on prediction results.
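As a toy illustration of combining per-timestep graph convolution on dynamic graphs with a recurrent update (a hedged sketch: random weights, a plain tanh recurrence in place of the LSTM units, and random symmetric graphs in place of traffic flow probability graphs): each timestep applies a normalized graph convolution on that step's adjacency matrix, then folds the result into a per-node recurrent state.

```python
import numpy as np

rng = np.random.default_rng(0)
N, F, H = 5, 3, 4                   # road links, features, hidden size
Wg = rng.normal(0, 0.3, (F, H))     # graph-conv weights
Wh = rng.normal(0, 0.3, (H, H))     # recurrent weights
Wx = rng.normal(0, 0.3, (H, H))     # input weights

def norm_adj(A):
    """Symmetrically normalize an adjacency matrix with self-loops,
    as in standard graph convolution."""
    A_hat = A + np.eye(len(A))
    d = A_hat.sum(axis=1)
    Dn = np.diag(d ** -0.5)
    return Dn @ A_hat @ Dn

def forward(adjs, feats):
    """For each timestep t, run one graph convolution on that step's
    dynamic graph A_t, then update a per-node recurrent state (a tanh
    recurrence standing in for LSTM units)."""
    h = np.zeros((N, H))
    for A_t, X_t in zip(adjs, feats):
        z = np.maximum(norm_adj(A_t) @ X_t @ Wg, 0.0)   # spatial features
        h = np.tanh(h @ Wh + z @ Wx)                    # temporal update
    return h

# six timesteps of changing symmetric graphs and node features
adjs = [rng.integers(0, 2, (N, N)) for _ in range(6)]
adjs = [np.triu(a, 1) + np.triu(a, 1).T for a in adjs]
feats = [rng.normal(size=(N, F)) for _ in range(6)]
h_final = forward(adjs, feats)
```

The final state h_final would feed a prediction head; a policy network, as proposed above, would supply the missing adjacency matrices when the dynamic graphs are incomplete.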
Traditional Support Vector Machine (SVM) solutions suffer from O(n^2) time complexity, which makes them impractical for very large datasets. To reduce this high computational complexity, several data reduction methods have been proposed in previous studies. However, such methods are not effective at extracting informative patterns. In this paper, a two-stage informative pattern extraction approach is proposed. The first stage of our approach is data cleaning based on bootstrap sampling. A bundle of weak SVM classifiers is constructed on the sampled datasets. Training data correctly classified by all the weak classifiers are removed, as they carry little useful information for training. To further extract more informative training data, two informative pattern extraction algorithms are proposed in the second stage. As most training data are eliminated and only the more informative samples remain, the final SVM training time is reduced significantly. The contributions of this paper are threefold. (1) First, a parallelized bootstrap sampling-based method is proposed to clean the initial training data, eliminating a large number of training samples that carry little information. (2) Then, we present two algorithms to effectively extract more informative training data. Both algorithms are based on maximum information entropy according to the empirical misclassification probability of each sample estimated in the first stage, so training time can be further reduced as the training data are further reduced. (3) Finally, empirical studies on four large datasets show the effectiveness of our approach in reducing the training data size and the computational cost compared with state-of-the-art algorithms, including PEGASOS, LIBLINEAR SVM, and RSVM. Meanwhile, the generalization performance of our approach is comparable with the baseline methods.
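The first-stage cleaning can be sketched as follows (an illustrative numpy version, not the paper's implementation: a tiny perceptron stands in for the weak SVM classifiers, the data are synthetic, and the rounds run sequentially rather than in parallel). Weak classifiers are trained on bootstrap samples; points that every round classifies correctly are dropped, and each surviving point's misclassification frequency estimates the empirical misclassification probability used by the second-stage entropy criterion.

```python
import numpy as np

def perceptron(X, y, n_iter=20):
    """A tiny linear classifier standing in for the weak SVMs."""
    w = np.zeros(X.shape[1] + 1)
    Xb = np.c_[X, np.ones(len(X))]
    for _ in range(n_iter):
        for xi, yi in zip(Xb, y):
            if yi * (xi @ w) <= 0:
                w += yi * xi
    return w

def clean_by_bootstrap(X, y, n_rounds=9, seed=0):
    """Stage 1: train weak classifiers on bootstrap samples and drop the
    points every classifier gets right. Also return each point's
    empirical misclassification probability; stage 2 would keep the
    samples whose probability has the highest entropy (near 0.5)."""
    rng = np.random.default_rng(seed)
    Xb = np.c_[X, np.ones(len(X))]
    miss = np.zeros(len(X), dtype=int)
    for _ in range(n_rounds):
        idx = rng.integers(0, len(X), len(X))     # bootstrap sample
        w = perceptron(X[idx], y[idx])
        miss += (y * (Xb @ w) <= 0).astype(int)
    keep = miss > 0     # always-correct points carry little information
    return keep, miss / n_rounds

rng = np.random.default_rng(1)
X = np.vstack([rng.normal(-1.5, 1, (100, 2)), rng.normal(1.5, 1, (100, 2))])
y = np.r_[-np.ones(100), np.ones(100)]
keep, p_miss = clean_by_bootstrap(X, y)
```

The final SVM would then be trained only on `X[keep]`, which is where the training-time savings come from.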