This paper concentrates on how to retrieve, from a music database, a list of songs similar to a given one. Content-based music retrieval is one of the most popular research subjects, and it mostly focuses on finding the exact song in a database by humming a tune or submitting a recording. However, people may also be interested in songs that are similar to, but not exactly the same as, the given one. In this paper, we propose a classification framework that solves this problem using string-based methods. By introducing a string-based similarity measure, our framework achieves lower computational complexity and better effectiveness. We also developed a new distributed clustering algorithm under the MapReduce framework, which performed well on massive audio data. Experiments are performed and analyzed to show the efficiency and effectiveness of the proposed framework.
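The abstract does not say which string-based similarity measure is used. As a minimal sketch, assuming each song has already been encoded as a string (for example, a sequence of note names) and assuming Levenshtein edit distance as the measure, a similarity ranking could look like the following. The function names and the toy database are hypothetical and not taken from the paper.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance between two strings, using two rolling rows."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def similarity(a: str, b: str) -> float:
    """Normalize the distance to [0, 1]; 1.0 means identical strings."""
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1.0 - edit_distance(a, b) / longest


def most_similar(query: str, database: dict, k: int = 3) -> list:
    """Return the k song ids whose strings are most similar to the query."""
    ranked = sorted(database,
                    key=lambda song_id: similarity(query, database[song_id]),
                    reverse=True)
    return ranked[:k]


# Hypothetical toy database: song id -> note string (an assumption for illustration).
songs = {"a": "CDEFG", "b": "CDEFA", "c": "GGGGG"}
print(most_similar("CDEFG", songs, k=2))
```

A ranking like this returns near matches as well as exact ones, which is the kind of result the abstract is after; the distributed clustering step under MapReduce is not sketched here.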
On Implementing a Text-Database-as-a-Service Yaoliang Chen; Chang Liu; Jianfeng Zhang ...
2016 IEEE International Conference on Web Services (ICWS),
06/2016
Conference Proceeding
The emergence of heterogeneous big data in the last decade calls for a hybrid data service that can manage all kinds of data, including relational data, JSON data, and text data, in a unified way. Among them, text data play an important role in many fields such as the Internet of Things, biology, and social networks. For example, a smart meter application that detects anomalies in electricity use might want to link each anomaly in a certain area to a meaningful social event mined from plain-text news. As a result, text data services have attracted increasing attention from the research community. Most such services are implemented on top of a content management system such as ElasticSearch or Solr. However, we found that content management capabilities alone are not enough. On the one hand, text data queries often require join operations with relational or JSON data in an existing DBMS. On the other hand, users often have to pull large volumes of text data out to an independent system or service for further text analytics. In this paper, we present our Text-DataBase-as-a-Service (TDBaaS), which is built on top of the Hybrid Data Service (HDS) from IBM Research. The TDBaaS is designed to manage text data together with relational data and JSON data in a single service. Basic text analytics can be conducted directly inside the database in the form of ordinary SQL statements. Moreover, the extensible framework allows the service to offer abundant text analytic capabilities with high performance. As a case study, we investigate the implementation of the top-k word algorithm and show how common computations are shared across different tenants in the TDBaaS. The experimental results demonstrate the high performance of the TDBaaS on both text data management and text data analytics.
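The abstract names a top-k word algorithm but gives no implementation details. The following is only a generic sketch of the underlying computation (count word frequencies across documents, then keep the k most frequent), written with the Python standard library. The tokenization rule and the function name are assumptions, and the cross-tenant sharing described in the abstract is not modeled.

```python
import re
from collections import Counter


def top_k_words(documents, k):
    """Return the k most frequent words across documents as (word, count) pairs."""
    counts = Counter()
    for text in documents:
        # Assumed tokenization: lowercase runs of letters, digits, and apostrophes.
        counts.update(re.findall(r"[a-z0-9']+", text.lower()))
    return counts.most_common(k)


docs = ["The meter reported the anomaly", "The news covered the event"]
print(top_k_words(docs, 1))
```

Inside a database service the same counting would be expressed through SQL rather than application code; the sketch only shows what the query has to compute.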
Based on the definition of spare parts deployment efficiency, the factors influencing deployment efficiency were analyzed, and the spare parts fill rate and utilization rate were introduced to measure it. The utilization rate was calculated using closed Jackson queuing networks, while the fill rate was derived from the dynamic equilibrium equation of logistics. With fill rate, utilization rate, and cost as constraints, an optimization model of spare parts deployment efficiency was established. Using SUMT (the sequential unconstrained minimization technique), which transforms the constrained problem into an unconstrained minimization problem, combined with horizontal search and iteration within the feasible region, the optimal point can be found. The method provides a new way to increase spare parts deployment efficiency.
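The abstract does not give the model's equations, so the sketch below only illustrates the general SUMT idea on a toy one-dimensional problem: add a barrier term that keeps iterates inside the feasible region, minimize the resulting unconstrained function, then shrink the barrier weight and repeat. The logarithmic barrier, the golden-section line search, the toy objective, and all names are assumptions for illustration, not the authors' formulation.

```python
import math


def golden_section_min(func, lo, hi, tol=1e-9):
    """Minimize a unimodal 1-D function on [lo, hi] by golden-section search."""
    inv_phi = (math.sqrt(5) - 1) / 2
    c = hi - inv_phi * (hi - lo)
    d = lo + inv_phi * (hi - lo)
    while hi - lo > tol:
        if func(c) < func(d):
            hi, d = d, c                      # minimum lies in [lo, d]
            c = hi - inv_phi * (hi - lo)
        else:
            lo, c = c, d                      # minimum lies in [c, hi]
            d = lo + inv_phi * (hi - lo)
    return (lo + hi) / 2


def sumt(objective, constraints, lo, hi, r=1.0, shrink=0.1, rounds=8):
    """Interior-point SUMT for constraints of the form g(x) <= 0."""
    def barrier(x, weight):
        total = objective(x)
        for g in constraints:
            slack = -g(x)
            if slack <= 0:
                return math.inf               # outside the feasible region
            total -= weight * math.log(slack)
        return total

    x = None
    for _ in range(rounds):
        x = golden_section_min(lambda t: barrier(t, r), lo, hi)
        r *= shrink                           # weaken the barrier each round
    return x


# Toy problem: minimize x subject to x >= 1, written as g(x) = 1 - x <= 0.
x_opt = sumt(lambda x: x, [lambda x: 1 - x], lo=1.0, hi=10.0)
print(round(x_opt, 4))
```

For this toy problem the barrier minimizer at weight r is x = 1 + r, so the iterates approach the constrained optimum x = 1 from inside the feasible region as r shrinks.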
A program analysis tool can play an important role in helping users understand and improve large application codes. Dragon is a robust interactive program analysis tool based on the Open64 compiler, which is an open-source C/C++/Fortran77/90 compiler for Intel Itanium systems. We designed and developed the Dragon analysis tool to support manual optimization and parallelization of large applications by exploiting the powerful analyses of the Open64 compiler. Dragon enables users to visualize and print the essential program structure of their large applications and to obtain information about them. Current features include the call graph, flow graph, and data dependences. Ongoing work extends both Open64 and Dragon with a new call graph construction algorithm and its related interprocedural analysis, global variable definition and usage analysis, and an external interface that other tools, such as profilers and debuggers, can use to share program analysis information. Future work includes supporting the creation and optimization of shared-memory parallel programs written using OpenMP.
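To make the call graph feature concrete, here is a toy sketch of what such a structure records: for each function, the set of functions it calls directly. It analyzes Python source with the standard `ast` module purely for illustration. Dragon itself works on C/C++/Fortran through Open64, and nothing below reflects its actual algorithm or interfaces.

```python
import ast
from collections import defaultdict


def build_call_graph(source: str) -> dict:
    """Map each function name to the names it calls directly.

    Limitations of this sketch: only plain-name calls such as f() are
    recorded, and calls made in nested functions are also attributed
    to the enclosing function.
    """
    tree = ast.parse(source)
    graph = defaultdict(set)
    for node in ast.walk(tree):
        if isinstance(node, ast.FunctionDef):
            graph[node.name]  # make sure functions with no calls still appear
            for inner in ast.walk(node):
                if isinstance(inner, ast.Call) and isinstance(inner.func, ast.Name):
                    graph[node.name].add(inner.func.id)
    return dict(graph)


example = "def a():\n    b()\n    c()\ndef b():\n    c()\ndef c():\n    pass\n"
for caller, callees in build_call_graph(example).items():
    print(caller, "->", sorted(callees))
```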
Recent computer architectures provide new kinds of on-chip parallelism, including support for multithreading. This trend toward hardware support for multithreading is expected to continue for PC, workstation, and high-end architectures. Given the need to find sequences of independent instructions, and the difficulty of achieving this via compiler technology alone, OpenMP could become an excellent means for application developers to describe the parallelism inherent in applications for such architectures. In this paper, we report on several experiments designed to increase our understanding of the behavior of current OpenMP on such architectures. We have tested two different systems: a Sun Fire V490 with Chip Multiprocessor technology and a Dell Precision 450 workstation with Simultaneous MultiThreading technology. OpenMP performance is studied using the EPCC Microbenchmark suite and subsets of the benchmarks in the SPEC OMPM2001 and NAS Parallel Benchmarks 3.0 suites.