Many previous studies that mine user queries in peer-to-peer systems have focused on the distribution of query contents. However, little has been done to understand the time series distribution of these queries, which is vital for system performance. To remedy this, this paper mines query streams using automatic time series analysis to evaluate different linear models (Box-Jenkins models and some simple windowed-mean models) for predicting the number of duplicated queries from 10 minutes to 2 hours into the future. Both the predictive power and the computational cost of these models are evaluated over 318,942,450 real-world Gnutella queries collected over 3 months. We find that the number of duplicated queries is consistently predictable. Simple, practical models such as AR perform well on prediction.
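To make the two model families named above concrete, the following is a minimal sketch of a windowed-mean predictor and a first-order autoregressive (AR(1)) predictor fitted by least squares. The function names, the choice of order 1, and the toy series are illustrative assumptions; the abstract does not state which model orders or fitting procedure the paper used.

```python
from statistics import mean


def windowed_mean_forecast(series, window):
    """Predict the next value as the mean of the last `window` observations."""
    return mean(series[-window:])


def fit_ar1(series):
    """Least-squares fit of x[t] = c + phi * x[t-1]; returns (c, phi)."""
    x_prev, x_next = series[:-1], series[1:]
    mean_prev, mean_next = mean(x_prev), mean(x_next)
    var = sum((p - mean_prev) ** 2 for p in x_prev)
    cov = sum((p - mean_prev) * (n - mean_next) for p, n in zip(x_prev, x_next))
    phi = cov / var if var else 0.0
    return mean_next - phi * mean_prev, phi


def ar1_forecast(series, steps=1):
    """Iterate the fitted AR(1) recurrence `steps` intervals into the future."""
    c, phi = fit_ar1(series)
    value = series[-1]
    for _ in range(steps):
        value = c + phi * value
    return value


# Hypothetical counts of duplicated queries per interval (not real data).
counts = [100.0, 60.0, 40.0, 30.0, 25.0, 22.5]
next_by_mean = windowed_mean_forecast(counts, window=3)
next_by_ar1 = ar1_forecast(counts, steps=1)
```

With a fixed interval length, a longer horizon corresponds to a larger `steps` value; for example, 10-minute intervals and `steps=12` would give a 2-hour-ahead forecast under this sketch.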
The idea of building query-oriented routing indices has fundamentally changed how routing efficiency is improved, because such indices can learn the content distribution during the query routing process. Routing efficiency improves gradually, without excessive network overhead for constructing and maintaining the routing index. However, the previously proposed mechanism is not effective in practice because routing efficiency improves slowly. In this paper, we propose a novel mechanism for query-oriented routing indices that quickly achieves high routing efficiency at low cost. The maintenance method employs reinforcement learning, using the behavior of the mass of peers to construct and maintain routing indices. It explicitly uses the expected number of returned contents to describe the content distribution, which helps it quickly approximate the real distribution. Meanwhile, the routing method aims to retrieve as many contents as possible, which further speeds up the learning process. The experimental evaluation shows that the mechanism has high routing efficiency, quick learning ability, and satisfactory performance under churn.
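The idea of tracking an expected number of returned contents per neighbor can be sketched as follows. This is only an illustration under stated assumptions: the `RoutingIndex` class, the incremental update rule with a fixed learning rate, and the greedy ranking are generic reinforcement-learning choices, not the update or routing rules of the proposed mechanism, which the abstract does not specify.

```python
class RoutingIndex:
    """Hypothetical per-peer index: (neighbor, topic) -> expected number of returned contents."""

    def __init__(self, learning_rate=0.5):
        self.learning_rate = learning_rate  # assumed constant step size
        self.expected = {}

    def update(self, neighbor, topic, returned):
        """Move the estimate toward the number of contents actually returned."""
        key = (neighbor, topic)
        old = self.expected.get(key, 0.0)
        self.expected[key] = old + self.learning_rate * (returned - old)

    def rank(self, neighbors, topic, k=1):
        """Pick the k neighbors expected to return the most contents for `topic`."""
        return sorted(
            neighbors,
            key=lambda n: self.expected.get((n, topic), 0.0),
            reverse=True,
        )[:k]


index = RoutingIndex()
index.update("peer_a", "jazz", returned=4)
index.update("peer_a", "jazz", returned=4)
index.update("peer_b", "jazz", returned=1)
best = index.rank(["peer_a", "peer_b", "peer_c"], "jazz", k=2)
```

Each answered query serves as a reward signal, so the estimates are refined as a side effect of normal routing rather than through separate index-maintenance traffic.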
Locality sensitive hashing (LSH) is widely used in peer-to-peer (P2P) systems. Although it can support range or similarity queries, replacing consistent hashing with LSH breaks the load balancing mechanism of traditional distributed hash table (DHT) based systems. To solve the imbalance problem, current systems either weaken the locality-preserving ability from similarity preserving to order preserving, or adopt a load-aware peer join mechanism. The first method does not support similarity queries because it loses the similarity information, and the second is strongly affected by the dynamic nature of P2P networks. In this paper, we propose a novel system, cuckoo ring, which preserves similarity information while remaining load balanced. It does not guide newly joining peers to the hot areas; instead, it moves items from the hot areas to cold areas, so that short-lived peers are distributed uniformly across the network rather than being concentrated in the hot areas. Compared to traditional DHT systems, cuckoo ring maintains only a little more information: the globally lightly loaded peers and the moved index items.
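The general idea of moving items from hot areas to cold areas while remembering where they went can be sketched as below. This is a simplified illustration, not the cuckoo ring algorithm itself: the `rebalance` function, the fixed per-peer `capacity`, and the choice of the single least-loaded peer as the target are all assumptions made for the example.

```python
def rebalance(loads, capacity):
    """Move overflow items from overloaded peers to the least-loaded peer.

    loads: dict mapping peer id -> list of indexed items (modified in place).
    Returns a dict mapping each moved item -> the peer that now holds it,
    standing in for the extra "moved items" bookkeeping.
    """
    moved = {}
    for peer in list(loads):
        while len(loads[peer]) > capacity:
            coldest = min(loads, key=lambda p: len(loads[p]))
            if len(loads[coldest]) >= capacity:
                break  # no peer has spare room, so stop moving items
            item = loads[peer].pop()
            loads[coldest].append(item)
            moved[item] = coldest
    return moved


# Hypothetical ring where LSH has clustered similar items on one peer.
ring = {"p1": ["a", "b", "c", "d"], "p2": [], "p3": ["e"]}
forwarding = rebalance(ring, capacity=2)
```

In this sketch, a lookup that reaches the original peer would consult the forwarding table to find a relocated item, so the LSH placement that supports similarity queries is left unchanged.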