In many web integration applications, multiple sources describe the same entity with different values, which leads to many conflicts. Resolving these conflicts and finding the truth can improve the quality of integration or help build a high-quality knowledge base. In the single-truth conflict scenario, existing methods have difficulty distinguishing false negatives (also called missing data) from false positives, so their source quality measurements are inadequate. Therefore, in this paper, we use recall and false positive rate to measure source quality and present a method to discover truth. The experimental results on three real-world data sets show that the proposed algorithm can effectively distinguish missing data from false positives and improve the precision of truth discovery.
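To make the two source quality measures in the abstract above concrete, the following is a minimal sketch of how recall and false positive rate could be computed for one source when the true values are known. The function name `source_quality` and the data layout (sets of `(entity, value)` pairs) are assumptions for illustration, not the paper's method, which has to estimate these quantities without ground truth.

```python
def source_quality(claims, truths, candidates):
    """Return (recall, false_positive_rate) for a single source.

    claims:     set of (entity, value) pairs the source asserts
    truths:     set of (entity, value) pairs that are actually true
    candidates: set of all (entity, value) pairs under consideration
    """
    false_values = candidates - truths
    true_positives = len(claims & truths)
    false_positives = len(claims & false_values)
    # Recall drops when the source omits true values (missing data).
    recall = true_positives / len(truths) if truths else 0.0
    # False positive rate rises when the source asserts wrong values.
    fpr = false_positives / len(false_values) if false_values else 0.0
    return recall, fpr


truths = {("a", 1), ("b", 2), ("c", 3), ("d", 4)}
candidates = truths | {("a", 9), ("b", 8), ("c", 7), ("d", 6)}

# A source that stays silent on two entities (missing data only).
silent_source = {("a", 1), ("b", 2)}
# A source that answers every entity but gets two of them wrong.
wrong_source = {("a", 1), ("b", 2), ("c", 7), ("d", 6)}

silent_quality = source_quality(silent_source, truths, candidates)
wrong_quality = source_quality(wrong_source, truths, candidates)
```

Both example sources have the same recall, but only the second has a nonzero false positive rate, which is the distinction a single accuracy score would hide.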
This paper addresses the challenge of truth discovery from noisy social sensing data. The work is motivated by the emergence of social sensing as a data collection paradigm of growing interest, where humans perform sensory data collection tasks. A challenge in social sensing applications lies in the noisy nature of data. Unlike the case with well-calibrated and well-tested infrastructure sensors, humans are less reliable, and the likelihood that participants' measurements are correct is often unknown a priori. Given a set of human participants of unknown reliability together with their sensory measurements, this paper poses the question of whether one can use this information alone to determine, in an analytically founded manner, the probability that a given measurement is true. The paper focuses on binary measurements. While some previous work approached the answer in a heuristic manner, we offer the first optimal solution to the above truth discovery problem. Optimality, in the sense of maximum likelihood estimation, is attained by solving an expectation maximization problem that returns the best guess regarding the correctness of each measurement. The approach is shown to outperform state-of-the-art fact-finding heuristics, as well as simple baselines such as majority voting.
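The expectation maximization idea described above can be sketched as follows for binary reports. This is a simplified illustration under assumed modeling choices, not the paper's exact algorithm: each participant `i` is assumed to report a true claim with probability `a[i]` and a false claim with probability `b[i]`, reports are assumed independent, and the initial values (0.6, 0.4, 0.5) and the name `em_truth_discovery` are hypothetical.

```python
def em_truth_discovery(reports, n_iter=50, eps=1e-6):
    """Estimate the probability that each claim is true.

    reports[i][j] is 1 if participant i reported claim j, else 0.
    """
    m, n = len(reports), len(reports[0])

    def clip(p):
        # Keep probabilities away from 0 and 1 so no likelihood collapses.
        return min(max(p, eps), 1.0 - eps)

    a = [0.6] * m  # assumed init: P(i reports claim | claim is true)
    b = [0.4] * m  # assumed init: P(i reports claim | claim is false)
    d = 0.5        # assumed init: prior probability that a claim is true
    z = [0.5] * n  # posterior probability that each claim is true

    for _ in range(n_iter):
        # E-step: posterior of each claim given current parameters.
        for j in range(n):
            like_true, like_false = d, 1.0 - d
            for i in range(m):
                if reports[i][j]:
                    like_true *= a[i]
                    like_false *= b[i]
                else:
                    like_true *= 1.0 - a[i]
                    like_false *= 1.0 - b[i]
            z[j] = like_true / (like_true + like_false)

        # M-step: re-estimate participant parameters and the prior.
        total = sum(z)
        for i in range(m):
            reported_true = sum(z[j] for j in range(n) if reports[i][j])
            n_reported = sum(reports[i])
            a[i] = clip(reported_true / max(total, eps))
            b[i] = clip((n_reported - reported_true) / max(n - total, eps))
        d = clip(total / n)
    return z


# Three mostly agreeing participants and one who reports different claims.
reports = [
    [1, 1, 1, 0, 0, 0],
    [1, 1, 0, 0, 0, 0],
    [1, 0, 1, 0, 0, 0],
    [0, 0, 0, 1, 1, 1],
]
posteriors = em_truth_discovery(reports)
```

Multiplying raw likelihoods is adequate for a toy example of this size; with many participants a log-domain computation would be the safer design choice.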
Incentive mechanisms are essential for stimulating adequate worker participation to achieve good truth discovery performance in mobile crowdsensing (MCS) systems. However, most existing incentive mechanisms only consider compensating workers' sensing cost, while the cost incurred by potential privacy leakage has been largely neglected. Moreover, none of the existing privacy-preserving incentive mechanisms has incorporated workers' different privacy preferences to provide personalized payments for them. In this paper, we propose a contract-based personalized privacy-preserving incentive mechanism for truth discovery in MCS systems, named Paris-TD, which provides personalized payments to workers as compensation for privacy cost while achieving accurate truth discovery. The basic idea is that the platform offers a set of different contracts to workers with different privacy preferences, and each worker chooses to sign a contract that specifies a privacy-preserving degree (PPD) and the corresponding payment the worker will receive if she submits perturbed data with that PPD. Specifically, we analytically design a set of optimal contracts under both full and incomplete information models, which maximize the truth discovery accuracy under a given budget while satisfying the individual rationality and incentive compatibility properties. The feasibility and effectiveness of Paris-TD are validated through experiments on both synthetic and real-world datasets.
In mobile crowdsensing, finding the best match between tasks and users is crucial to ensure both the quality and effectiveness of a crowdsensing system. Existing works usually assume a centralized task assignment by the crowdsensing platform, without addressing the need for fine-grained personalized task matching. In this paper, we argue that it is essential to match tasks to users based on a careful characterization of both the users' preference and reliability. To that end, we propose a personalized task recommender system for mobile crowdsensing, which recommends tasks to users based on a recommendation score that jointly takes each user's preference and reliability into consideration. We first present a hybrid preference metric to characterize users' preference by exploiting their implicit feedback. Then, to profile users' reliability levels, we formalize the problem as a semi-supervised learning model, and propose an efficient block coordinate descent algorithm to solve the problem. For some tasks that lack users' historical information, we further propose a matrix factorization method to infer the users' reliability levels on those tasks. We conduct extensive experiments to evaluate the performance of our system, and the evaluation results demonstrate that our system can achieve superior performance to the benchmarks in both user profiling and personalized task recommendation.
Differential privacy (DP) has gained popularity in truth discovery recently due to its strong privacy guarantee. However, existing DP mechanisms for streaming data publication are not suitable for truth discovery as they fail to consider the different reliabilities of individuals, while the DP-based approaches for truth discovery are not suitable for streaming data because they ignore the correlations between truths over time. Directly applying these existing methods to streaming crowdsourced data would lead to low accuracy of the discovered truth. To solve this problem, in this paper, we propose an edge computing based privacy-preserving truth discovery mechanism, named PrivSTD, for streaming crowdsourced data to realize high accuracy of discovered truth while protecting the privacy of workers. Specifically, edge servers are introduced between the untrusted cloud server and workers to securely calculate the local truths and workers' reliabilities. A truth-dependent budget recycle mechanism is proposed for each edge server to adaptively determine the perturbed timestamp and allocate the privacy budget according to the changing pattern of local truths. Besides, a reliability-based perturbation mechanism is proposed to reduce the perturbation magnitude on the basis of each worker's reliability. We theoretically analyze the data utility and computation cost of PrivSTD, and prove that PrivSTD can satisfy $w$-event ($\epsilon,\delta$)-differential privacy. Extensive experimental results on synthetic and real-world datasets demonstrate that PrivSTD achieves better utility than the state-of-the-art approaches.
Mobile crowdsensing (MCS) has emerged as a popular and promising paradigm for solving challenging problems by utilizing collective wisdom and resources. However, the system architecture and operational rules for MCS have not been well-defined, and obtaining accurate and reliable results from conflicting data collected by workers is difficult due to discrepancies in sensor quality and privacy protection requirements. In this paper, we combine the methodologies of Dynamic Truth Discovery (DTD), Combinatorial Multi-Armed Bandit (CMAB), and Multi-Attribute Reverse Auction to develop a novel MCS ecosystem, with the objective of maximizing the sensing accuracy-aware utility under the budget constraint. We first establish the data collection model by jointly considering the task completion duration as well as the deviation caused by both endogenous errors and privacy protection-oriented injected noise. Then, we theoretically evaluate the accuracy of truth discovery and quantify the contribution of each worker to MCS to form the worker selection criterion. As the qualities of workers are initially unknown, the platform faces the exploration-exploitation dilemma. Therefore, we apply CMAB to transform the worker recruitment problem into a combinatorial arm-pulling problem and design an Upper Confidence Bound (UCB) algorithm to achieve a desirable exploration-exploitation tradeoff. Moreover, we design an auction-based payment method for the platform, stimulating workers to report their quoted prices honestly while ensuring individual rationality. Extensive simulations and comparison results demonstrate the feasibility and effectiveness of our proposed MCS ecosystem.
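The combinatorial arm-pulling view of worker recruitment mentioned above can be illustrated with a generic combinatorial UCB sketch. It is not the paper's algorithm: it ignores the budget, the auction, and truth discovery, and it assumes a standard UCB1-style index with a fixed number `k` of workers recruited per round. The names `cucb_recruit` and `observe` are hypothetical.

```python
import math


def cucb_recruit(observe, n_workers, k, rounds):
    """Recruit k workers per round using a UCB index on unknown qualities.

    observe(i) returns the observed quality (reward in [0, 1]) of worker i.
    Returns (counts, means): times recruited and empirical mean quality.
    """
    counts = [0] * n_workers
    means = [0.0] * n_workers

    for t in range(1, rounds + 1):
        def ucb_index(i):
            if counts[i] == 0:
                return math.inf  # force one exploration of every worker
            bonus = math.sqrt(2.0 * math.log(t) / counts[i])
            return means[i] + bonus

        # Exploit high means, explore rarely recruited workers.
        chosen = sorted(range(n_workers), key=ucb_index, reverse=True)[:k]
        for i in chosen:
            reward = observe(i)
            counts[i] += 1
            means[i] += (reward - means[i]) / counts[i]  # running average
    return counts, means


# Deterministic toy qualities so the behavior is easy to inspect.
true_quality = [0.9, 0.8, 0.3, 0.2]
counts, means = cucb_recruit(lambda i: true_quality[i], n_workers=4, k=2, rounds=200)
most_recruited = set(sorted(range(4), key=lambda i: counts[i], reverse=True)[:2])
```

In the toy run, low-quality workers are still recruited occasionally because their confidence bonus grows while they are left unexplored, which is the tradeoff the abstract refers to.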
Although crowdsensing has emerged as a popular information collection paradigm, its security and privacy vulnerabilities have come to the forefront in recent years. However, one major limitation of previous research is that the security domain and the privacy domain are typically considered separately. Therefore, it is unclear whether the defense methods in the privacy domain will have an unexpected impact on the security domain. To bridge this gap, in this paper, we propose a novel Disguise-based Data Poisoning Attack (DDPA) against differentially private crowdsensing systems empowered with the truth discovery method. Specifically, we propose a novel stealth strategy, i.e., disguising the malicious behavior as privacy behavior, to avoid being detected by truth discovery methods. With this stealth strategy, we avoid sacrificing attack effectiveness by formulating a bi-level optimization problem, which can be solved with an alternating optimization algorithm. Moreover, we show that differentially private crowdsensing systems are vulnerable to data poisoning attacks, and that enhancing the level of privacy brings more serious security threats. Finally, the evaluation results on the real-world dataset Emotion and the synthetic dataset SynData demonstrate that DDPA can not only achieve maximum utility damage but also remain undetected.
In recent years, an increasing number of map apps have provided route planning services to their users. However, the quality of route planning services relies heavily on having correct data about transportation infrastructure. As many planned subway lines are being built across cities, there are conflicts between the actual conditions and the data provided by map apps for temporary bus stops, which may result in complaints against public transportation operators. However, it is difficult to tackle these complaints, as public transportation operators can obtain only inaccurate information about the locations of temporary bus stops. To resolve these conflicts, crowd bus sensing (CBS) is proposed in this paper. CBS is a new sensing paradigm that takes advantage of the extensive deployment of GPS trackers and prior knowledge about the transportation infrastructures covered by scheduled bus routes. Extensive experimental evaluations on real-world and synthetic datasets show that the proposed CBS system outperforms state-of-the-art methods.
Crowdsensed Data Trading (CDT) is a novel data trading paradigm, in which each data consumer can publicize its data demand as a set of crowdsensing tasks, and mobile users (i.e., data sellers) can compete for these tasks, collect the corresponding data, and sell the results to the consumers. Existing CDT systems generally depend on a data trading broker, which inevitably raises data consumers' concerns about the trustworthiness of the systems and the truthfulness of the data. To address this problem, we propose a Blockchain-based Crowdsensed Data Trading (BCDT) system, mainly consisting of a smart contract called BCDToken. First, we replace the data trading broker with blockchain to guarantee the trustworthiness of the data trading. Meanwhile, BCDToken adopts a Blockchain-based Reverse Auction (BRA) to assign sensing tasks to data sellers. The BRA mechanism satisfies truthfulness and individual rationality, which ensures that data sellers report data collection costs honestly and prevents sellers from manipulating the auction. Moreover, we implement a Secure Truth Discovery and reliability Rating (STDR) mechanism in BCDToken based on homomorphic cryptography, which can incentivize sellers to upload truthful data and consumers to truthfully rate the reliabilities of sellers based on the collected data without revealing any private data. Additionally, we deploy BCDToken on an Ethereum test network to demonstrate its practicability and performance.
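To illustrate what truthfulness and individual rationality mean for a reverse auction, the following is a minimal off-chain sketch of a textbook uniform-price rule: the `k` lowest bidders win, and each is paid the (k+1)-th lowest bid. This rule is a swapped-in standard mechanism assumed for illustration, not the BRA mechanism or the BCDToken contract from the paper, and the name `reverse_auction` is hypothetical.

```python
def reverse_auction(bids, k):
    """Select k sellers and compute their payments.

    bids: dict mapping seller id -> claimed data collection cost
    Returns a dict mapping each winning seller -> payment.
    """
    if not 0 < k < len(bids):
        raise ValueError("k must be positive and smaller than the number of bids")
    # Rank by bid, breaking ties by seller id for determinism.
    ranked = sorted(bids, key=lambda seller: (bids[seller], seller))
    winners = ranked[:k]
    # Payment is the first losing bid, so it does not depend on a winner's own bid.
    clearing_price = bids[ranked[k]]
    return {seller: clearing_price for seller in winners}


bids = {"s1": 4.0, "s2": 7.0, "s3": 5.0, "s4": 9.0}
payments = reverse_auction(bids, k=2)

# If s1 overstates its cost but still bids below the clearing price,
# its payment is unchanged, so misreporting brings no gain.
inflated = dict(bids, s1=6.0)
payments_inflated = reverse_auction(inflated, k=2)
```

Because every winner is paid at least its own bid, participation never yields negative utility for a seller who bids its true cost, which is the individual rationality property in this simplified setting.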