Rapid advances in mobile devices and cloud-based music services now allow consumers to enjoy music anytime and anywhere. Consequently, there has been increasing demand for intelligent techniques to facilitate context-aware music recommendation. However, one important context that is generally overlooked is the user’s venue, which reflects the surrounding atmosphere, correlates with activities, and greatly influences the user’s music preferences. In this article, we present a novel venue-aware music recommender system called VenueMusic to effectively identify suitable songs for various types of popular venues in our daily lives. Toward this goal, a Location-aware Topic Model (LTM) is proposed to (i) mine the common features of songs that are suitable for a venue type in a latent semantic space and (ii) represent songs and venue types in the shared latent space, in which songs and venue types can be directly matched. Notably, to discover meaningful latent topics with the LTM, a Music Concept Sequence Generation (MCSG) scheme is designed to extract effective semantic representations for songs. An extensive experimental study on two large music test collections demonstrates the effectiveness of the proposed topic model and MCSG scheme. Comparisons with state-of-the-art music recommender systems demonstrate the superior recommendation accuracy of the VenueMusic system, achieved by associating venues and music content through a latent semantic space. This work is a pioneering study on the development of a venue-aware music recommender system. The results show the importance of considering the influence of venue types in the development of context-aware music recommender systems.
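Once songs and venue types are embedded in the shared latent space the abstract describes, matching reduces to ranking songs by similarity to a venue's topic vector. The following is a minimal illustrative sketch of that final matching step, not the LTM itself; the topic vectors and the choice of cosine similarity are assumptions for illustration.

```python
import numpy as np

def cosine(a, b):
    """Cosine similarity between two latent topic vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def rank_songs_for_venue(venue_topics, song_topics):
    """Return song indices sorted by descending similarity to the venue."""
    scores = [cosine(venue_topics, s) for s in song_topics]
    return sorted(range(len(song_topics)), key=lambda i: -scores[i])

# Hypothetical 3-topic embeddings (illustrative values, not from the paper)
venue = np.array([0.7, 0.2, 0.1])           # e.g. one venue type
songs = np.array([[0.1, 0.1, 0.8],          # song 0: mismatched topics
                  [0.6, 0.3, 0.1],          # song 1: close topic match
                  [0.3, 0.4, 0.3]])         # song 2: middling match
ranking = rank_songs_for_venue(venue, songs)  # song 1 ranks first
```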
As an emerging technology to support scalable content-based image retrieval (CBIR), hashing has recently received great attention and has become a very active research domain. In this study, we propose a novel unsupervised visual hashing approach called semantic-assisted visual hashing (SAVH). Distinguished from semi-supervised and supervised visual hashing, its core idea is to effectively extract the rich semantics latently embedded in the auxiliary texts of images to boost the effectiveness of visual hashing without any explicit semantic labels. To achieve this, a unified unsupervised framework is developed to learn hash codes by simultaneously preserving the visual similarities of images, integrating semantic assistance from auxiliary texts when modeling high-order inter-image relationships, and characterizing the correlations between images and shared topics. Our performance study on three publicly available image collections (Wiki, MIR Flickr, and NUS-WIDE) indicates that SAVH achieves superior performance over several state-of-the-art techniques.
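The basic machinery underlying any visual hashing scheme is to project image features, binarize, and retrieve by Hamming distance. The sketch below shows only that generic pipeline with a random projection as a stand-in; SAVH itself learns its projections with semantic assistance from auxiliary texts, which is not reproduced here.

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(6, 16))        # 6 images, 16-dim visual features (toy data)
W = rng.normal(size=(16, 8))        # stand-in projection to 8-bit codes

# Binary codes: sign of zero-centered projections
codes = (X - X.mean(axis=0)) @ W > 0

def hamming(a, b):
    """Number of differing bits between two binary codes."""
    return int(np.count_nonzero(a != b))

# Retrieval: rank the database against a query by Hamming distance
query = codes[0]
dists = [hamming(query, c) for c in codes[1:]]
nearest = 1 + int(np.argmin(dists))  # most similar other image
```

Hamming distance on short binary codes is what makes hashing cheap on mobile devices: comparisons are bitwise operations and the codes are tiny to transmit.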
Social media has created a continuous need for automatic travel recommendation. Collaborative filtering (CF) is the most well-known approach, but existing approaches generally suffer from various weaknesses. For example, sparsity can significantly degrade the performance of traditional CF: if a user has visited only a very few locations, accurately identifying similar users becomes very challenging for lack of sufficient information. Moreover, existing recommendation approaches often ignore rich user information, such as the textual descriptions of photos, which can reflect users' travel preferences. The topic model (TM) method is an effective way to alleviate the sparsity problem, but it is still far from satisfactory. In this paper, an author-topic-model-based collaborative filtering (ATCF) method is proposed to facilitate comprehensive point-of-interest (POI) recommendations for social users. In our approach, user preference topics, such as cultural, cityscape, or landmark, are extracted from the geo-tag-constrained textual descriptions of photos via the author topic model, rather than from the geo-tags (GPS locations) alone. The advantages and superior performance of our approach are demonstrated by extensive experiments on a large data collection.
Hashing compresses high-dimensional features into compact binary codes. It is a promising technique for supporting efficient mobile image retrieval, owing to its low data transmission cost and fast retrieval response. However, most existing hashing strategies rely simply on low-level features; thus, they may generate hash codes with limited discriminative capability. Moreover, many of them fail to exploit the complex, high-order semantic correlations that inherently exist among images. Motivated by these observations, we propose a novel unsupervised hashing scheme, called topic hypergraph hashing (THH), to address these limitations. THH effectively mitigates the semantic shortage of hash codes by exploiting the auxiliary texts around images. In our method, the relations between images and semantic topics are first discovered via robust collective non-negative matrix factorization. Afterwards, a unified topic hypergraph, in which images and topics are represented by independent vertices and hyperedges, respectively, is constructed to model the inherent high-order semantic correlations of images. Finally, hash codes and functions are learned by simultaneously enforcing semantic consistency and preserving the discovered semantic relations. Experiments on publicly available datasets demonstrate that THH achieves superior performance compared with several state-of-the-art methods and is more suitable for mobile image retrieval.
Due to the popularity of social media websites, extensive research efforts have been dedicated to tag-based social image search, in which both visual information and tags have been investigated. However, most existing methods use tags and visual characteristics either separately or sequentially to estimate the relevance of images. In this paper, we propose an approach that simultaneously utilizes both visual and textual information to estimate the relevance of user-tagged images. The relevance estimation is performed with a hypergraph learning approach, in which a social image hypergraph is constructed whose vertices represent images and whose hyperedges represent visual or textual terms. Learning is driven by a set of pseudo-positive images, and the weights of the hyperedges are updated throughout the learning process. In this way, the impact of different tags and visual words can be automatically modulated. Comparative results of experiments conducted on a dataset of 370+ images demonstrate the effectiveness of the proposed approach.
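The hypergraph construction described above can be made concrete with an incidence matrix: one row per image (vertex) and one column per tag or visual word (hyperedge), so that a hyperedge groups all images sharing that term. The toy data below is illustrative only; the paper's learning step, which reweights hyperedges from pseudo-positive images, is not reproduced.

```python
import numpy as np

# Toy tag sets: one set of textual terms per image (assumed for illustration)
tags_per_image = [
    {"beach", "sunset"},   # image 0
    {"beach"},             # image 1
    {"city", "sunset"},    # image 2
]
terms = sorted({t for tags in tags_per_image for t in tags})

# Incidence matrix H: H[v, e] = 1 iff image v belongs to hyperedge e
H = np.array([[1.0 if term in tags else 0.0 for term in terms]
              for tags in tags_per_image])

edge_degrees = H.sum(axis=0)    # how many images each term covers
vertex_degrees = H.sum(axis=1)  # how many hyperedges each image joins
```

Hyperedge weights (uniform here, learned in the paper) and these degree vectors are the ingredients of the hypergraph Laplacian used for relevance propagation.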
How can we find a general way to choose the most suitable samples for training a classifier, even with very limited prior information? Active learning, which can be regarded as an iterative optimization procedure, plays a key role in constructing a refined training set that improves classification performance in a variety of applications, such as text analysis, image recognition, and social network modeling. Although combining the representativeness and informativeness of samples has proven promising for active sampling, state-of-the-art methods perform well only under certain data structures. Can we then fuse the two active sampling criteria without any assumption on the data? This paper proposes a general active learning framework that effectively fuses the two criteria. Inspired by the two-sample discrepancy problem, triple measures are elaborately designed to guarantee that the query samples not only possess the representativeness of the unlabeled data but also reveal the diversity of the labeled data. Any appropriate similarity measure can be employed to construct the triple measures. Meanwhile, an uncertainty measure is leveraged to generate the informativeness criterion, which can be carried out in different ways. Rooted in this framework, a practical active learning algorithm is proposed that exploits a radial basis function together with the estimated probabilities to construct the triple measures, and a modified best-versus-second-best strategy to construct the uncertainty measure. Experimental results on benchmark datasets demonstrate that our algorithm consistently achieves superior performance over state-of-the-art active learning algorithms.
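The best-versus-second-best (BvSB) idea mentioned above has a simple core: a sample is most informative when the gap between its two highest predicted class probabilities is smallest. The sketch below shows the plain BvSB margin on toy probabilities; the paper's modified strategy and the triple representativeness measures are not reproduced.

```python
import numpy as np

def bvsb_margin(probs):
    """Gap between the two largest class probabilities (small = uncertain)."""
    top2 = np.sort(probs)[-2:]
    return float(top2[1] - top2[0])

def most_uncertain(prob_matrix):
    """Index of the unlabeled sample with the smallest BvSB margin."""
    margins = [bvsb_margin(p) for p in prob_matrix]
    return int(np.argmin(margins))

# Estimated class probabilities for three unlabeled samples (toy values)
probs = np.array([[0.80, 0.15, 0.05],   # confident prediction
                  [0.45, 0.40, 0.15],   # ambiguous between two classes
                  [0.55, 0.25, 0.20]])
query = most_uncertain(probs)           # sample 1 is queried
```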
Weakly supervised image segmentation is a challenging problem with multidisciplinary applications in multimedia content analysis and beyond. It aims to segment an image by leveraging its image-level semantics (i.e., tags). This paper presents a weakly supervised image segmentation algorithm that learns the distribution of spatially structured superpixel sets from image-level labels. More specifically, we first extract graphlets from a given image: small graphs consisting of superpixels that encapsulate their spatial structure. Then, an efficient manifold embedding algorithm is proposed to transfer labels from training images to graphlets. We further observe that many graphlets are redundant and not discriminative for semantic categories; these are abandoned by a graphlet selection scheme, as they contribute nothing to the subsequent segmentation. Thereafter, we use a Gaussian mixture model (GMM) to learn the distribution of the selected post-embedding graphlets (i.e., the vectors output by the graphlet embedding). Finally, we propose an image segmentation algorithm, termed representative graphlet cut, which leverages the learned GMM prior to measure the structural homogeneity of a test image. Experimental results on five popular segmentation datasets show that the proposed approach outperforms state-of-the-art weakly supervised image segmentation methods and performs competitively with fully supervised segmentation models.
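Once graphlets are embedded as vectors, the learned GMM acts as a prior: an embedded graphlet close to a learned mode receives high density, signalling structural homogeneity. The sketch below evaluates a fixed two-component isotropic GMM by hand; all parameter values are assumptions for illustration, whereas the paper learns the mixture from training graphlets.

```python
import numpy as np

# Hypothetical learned GMM: two isotropic 2-D components
means = np.array([[0.0, 0.0], [3.0, 3.0]])
variances = np.array([1.0, 1.0])
weights = np.array([0.5, 0.5])

def gmm_log_density(x):
    """Log density of one embedded graphlet under the mixture."""
    d = x.shape[0]
    comp = []
    for mu, var, w in zip(means, variances, weights):
        sq = np.sum((x - mu) ** 2) / var
        log_norm = -0.5 * d * np.log(2 * np.pi * var)
        comp.append(np.log(w) + log_norm - 0.5 * sq)
    return float(np.logaddexp.reduce(comp))  # stable log-sum-exp

# A graphlet near a learned mode scores far higher than an outlier
near = gmm_log_density(np.array([0.1, -0.1]))
far = gmm_log_density(np.array([10.0, 10.0]))
```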
The vocabulary gap between health seekers and providers has hindered cross-system operability and inter-user reusability. To bridge this gap, this paper presents a novel scheme that codes medical records by jointly utilizing local mining and global learning, two approaches that are tightly linked and mutually reinforcing. Local mining attempts to code an individual medical record by independently extracting medical concepts from the record itself and mapping them to authenticated terminologies. A corpus-aware terminology vocabulary is naturally constructed as a byproduct and used as the terminology space for global learning. Local mining, however, may suffer from information loss and low precision, caused by the absence of key medical concepts and the presence of irrelevant ones. Global learning, on the other hand, enhances local medical coding by collaboratively discovering missing key terminologies and filtering out irrelevant ones through analysis of social neighbors. Comprehensive experiments validate the proposed scheme and each of its components. Practically, this unsupervised scheme holds potential for large-scale data.
In this paper, a novel approach is developed to achieve automatic image collection summarization. The effectiveness of a summary is reflected by its ability to reconstruct the original set or each individual image in it. We leverage the dictionary learning for sparse representation model to construct the summary and to represent the images. Specifically, we reformulate the summarization problem as a dictionary learning problem: selecting bases that can be sparsely combined to represent the original images while achieving a minimum global reconstruction error, such as the mean squared error (MSE). The resulting “Sparse Least Square” problem is NP-hard, so a simulated annealing algorithm is adopted to learn such a dictionary, i.e., the image summary, by minimizing the proposed objective function. A quantitative measure is defined to assess the quality of an image summary by investigating both its reconstruction ability and its representativeness of large original image sets. We also compare the performance of our image summarization approach with that of six baseline summarization tools on multiple image sets (ImageNet, NUS-WIDE-SCENE, and the Event image set). Our experimental results show that the proposed dictionary learning approach obtains more accurate results than the six baseline summarization algorithms.
► Reformulate the summarization problem as a dictionary learning problem. ► Propose an adapted simulated annealing algorithm for basis selection. ► Evaluate the proposed algorithm against six baseline algorithms on three image datasets. ► Design a user-interactive system for subjective evaluation of the results.
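The annealing-based basis selection above can be sketched generically: choose k images whose least-squares combination best reconstructs the whole set, accepting occasional uphill swaps to escape local minima. This is a minimal sketch under assumed toy data and schedule parameters; it omits the sparsity constraint and the paper's specific objective.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(20, 8))   # 20 "images" with 8-dim features (toy data)
k = 3                          # summary size

def mse(subset):
    """Global reconstruction error when the images in `subset` are bases."""
    D = X[list(subset)].T                          # 8 x k dictionary
    coef, *_ = np.linalg.lstsq(D, X.T, rcond=None) # least-squares coefficients
    return float(np.mean((D @ coef - X.T) ** 2))

current = set(rng.choice(len(X), size=k, replace=False))
best, best_err = set(current), mse(current)
temp = 1.0
for step in range(300):
    # Propose swapping one selected image for an unselected one
    out = rng.choice(sorted(current))
    cand = set(current) - {out}
    cand.add(int(rng.choice(sorted(set(range(len(X))) - current))))
    delta = mse(cand) - mse(current)
    # Metropolis acceptance: always take improvements, sometimes take worse
    if delta < 0 or rng.random() < np.exp(-delta / temp):
        current = cand
    if mse(current) < best_err:
        best, best_err = set(current), mse(current)
    temp *= 0.98               # geometric cooling schedule
```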
While content-based landmark image search has recently received a great deal of attention and has become a very active domain, it remains a challenging problem, most significantly because of its highly diverse visual content. It is common that, for the same landmark, images with a wide range of visual appearances can be found from different sources, while different landmarks may share very similar sets of images. As a consequence, it is very hard to accurately estimate the similarities between landmarks based purely on a single type of visual feature. Moreover, the relationships between landmark images can be very complex, and how to develop an effective modeling scheme to characterize these associations remains an open question. Motivated by these concerns, we propose a multimodal hypergraph (MMHG) to characterize the complex associations between landmark images. In MMHG, images are modeled as independent vertices, and each hyperedge contains several vertices corresponding to particular views. Multiple hypergraphs are first constructed independently from different visual modalities to describe the hidden high-order relations from different aspects, and are then integrated to incorporate discriminative information from heterogeneous sources. We also propose a novel content-based visual landmark search system based on MMHG to facilitate effective search. Unlike existing approaches, we design a unified computational module to support query-specific combination weight learning. An extensive experimental study on a large-scale test collection demonstrates the effectiveness of our scheme over state-of-the-art approaches.