Users often fail to formulate their complex information needs in a single query. As a consequence, they may need to scan multiple result pages or reformulate their queries, which may be a frustrating ...experience. Alternatively, systems can improve user satisfaction by proactively asking questions of the users to clarify their information needs. Asking clarifying questions is especially important in conversational systems since they can only return a limited number of (often only one) result(s).
In this paper, we formulate the task of asking clarifying questions in open-domain information-seeking conversational systems. To this end, we propose an offline evaluation methodology for the task and collect a dataset, called Qulac, through crowdsourcing. Our dataset is built on top of the TREC Web Track 2009-2012 data and consists of over 10K question-answer pairs for 198 TREC topics with 762 facets. Our experiments on an oracle model demonstrate that asking only one good question leads to over 170% retrieval performance improvement in terms of P@1, which clearly demonstrates the potential impact of the task. We further propose a retrieval framework consisting of three components: question retrieval, question selection, and document retrieval. In particular, our question selection model takes into account the original query and previous question-answer interactions while selecting the next question. Our model significantly outperforms competitive baselines. To foster research in this area, we have made Qulac publicly available.
In this paper, we present a new multimedia retrieval paradigm to innovate large-scale search of heterogenous multimedia data. It is able to return results of different media types from heterogeneous ...data sources, e.g., using a query image to retrieve relevant text documents or images from different data sources. This utilizes the widely available data from different sources and caters for the current users' demand of receiving a result list simultaneously containing multiple types of data to obtain a comprehensive understanding of the query's results. To enable large-scale inter-media retrieval, we propose a novel inter-media hashing (IMH) model to explore the correlations among multiple media types from different data sources and tackle the scalability issue. To this end, multimedia data from heterogeneous data sources are transformed into a common Hamming space, in which fast search can be easily implemented by XOR and bit-count operations. Furthermore, we integrate a linear regression model to learn hashing functions so that the hash codes for new data points can be efficiently generated. Experiments conducted on real-world large-scale multimedia datasets demonstrate the superiority of our proposed method compared with state-of-the-art techniques.
Private information retrieval (PIR) is the problem of retrieving, as efficiently as possible, one out of <inline-formula> <tex-math notation="LaTeX">K </tex-math></inline-formula> messages from ...<inline-formula> <tex-math notation="LaTeX">N </tex-math></inline-formula> non-communicating replicated databases (each holds all <inline-formula> <tex-math notation="LaTeX">K </tex-math></inline-formula> messages) while keeping the identity of the desired message index a secret from each individual database. Symmetric PIR (SPIR) is a generalization of PIR to include the requirement that beyond the desired message, the user learns nothing about the other <inline-formula> <tex-math notation="LaTeX">K-1 </tex-math></inline-formula> messages. The information theoretic capacity of SPIR (equivalently, the reciprocal of minimum download cost) is the maximum number of bits of desired information that can be privately retrieved per bit of downloaded information. We show that the capacity of SPIR is <inline-formula> <tex-math notation="LaTeX">1-1/N </tex-math></inline-formula> regardless of the number of messages <inline-formula> <tex-math notation="LaTeX">K </tex-math></inline-formula>, if the databases have access to common randomness (not available to the user) that is independent of the messages, in the amount that is at least <inline-formula> <tex-math notation="LaTeX">1/(N-1) </tex-math></inline-formula> bits per desired message bit. Otherwise, if the amount of common randomness is less than <inline-formula> <tex-math notation="LaTeX">1/(N-1) </tex-math></inline-formula> bits per message bit, then the capacity of SPIR is zero. Extensions to the capacity region of SPIR and the capacity of finite length SPIR are provided.
The milestone improvements brought about by deep representation learning and pre-training techniques have led to large performance gains across downstream NLP, IR and Vision tasks. Multimodal ...modeling techniques aim to leverage large high-quality visio-linguistic datasets for learning complementary information across image and text modalities. In this paper, we introduce the Wikipedia-based Image Text (WIT) Dataset to better facilitate multimodal, multilingual learning. WIT is composed of a curated set of 37.5 million entity rich image-text examples with 11.5 million unique images across 108 Wikipedia languages. Its size enables WIT to be used as a pretraining dataset for multimodal models, as we show when applied to downstream tasks such as image-text retrieval. WIT has four main and unique advantages. First, WIT is the largest multimodal dataset by the number of image-text examples by 3x (at the time of writing). Second, WIT is massively multilingual (first of its kind) with coverage over 100+ languages (each of which has at least 12K examples) and provides cross-lingual texts for many images. Third, WIT represents a more diverse set of concepts and real world entities relative to what previous datasets cover. Lastly, WIT provides a very challenging real-world test set, as we empirically illustrate using an image-text retrieval task as an example. WIT Dataset is available for download and use via a Creative Commons license here: https://github.com/google-research-datasets/wit.
Intelligent assistants are increasingly being used on smart speaker devices, such as Amazon Echo, Google Home, Apple Homepod, and Harmon Kardon Invoke with Cortana. Typically, user satisfaction ...measurement relies on user interaction signals, such as clicks and scroll movements, in order to determine if a user was satisfied. However, these signals do not exist for smart speakers, which creates a challenge for user satisfaction evaluation on these devices. In this paper, we propose a new signal, user intent, as a means to measure user satisfaction. We propose to use this signal to model user satisfaction in two ways: 1) by developing intent sensitive word embeddings and then using sequences of these intent sensitive query representations to measure user satisfaction; 2) by representing a user's interactions with a smart speaker as a sequence of user intents and thus using this sequence to identify user satisfaction. Our experimental results indicate that our proposed user satisfaction models based on the intent-sensitive query representations have statistically significant improvements over several baselines in terms of common classification evaluation metrics. In particular, our proposed task satisfaction prediction model based on intent-sensitive word embeddings has a 11.81% improvement over a generative model baseline and 6.63% improvement over a user satisfaction prediction model based on Skip-gram word embeddings in terms of the F1 metric. Our findings have implications for the evaluation of Intelligent Assistant systems.
Casting sequential recommendation (SR) as a reinforcement learning (RL) problem is promising and some RL-based methods have been proposed for SR. However, these models are sub-optimal due to the ...following limitations: (a) they fail to leverage the supervision signals in the RL training to capture users’ explicit preferences, leading to slow convergence; and (b) they do not utilize auxiliary information (e.g., knowledge graph) to avoid blindness when exploring users’ potential interests. To address the above-mentioned limitations, we propose a multiplex information-guided RL model (MELOD), which employs a novel RL training framework with Teach and Explore components for SR. We adopt a Teach component to accurately capture users’ explicit preferences and speed up RL convergence. Meanwhile, we design a dynamic intent induction network (DIIN) as a policy function to generate diverse predictions. We utilize the DIIN for the Explore component to mine users’ potential interests by conducting a sequential and knowledge information joint-guided exploration. Moreover, a sequential and knowledge-aware reward function is designed to achieve stable RL training. These components significantly improve MELOD’s performance and convergence against existing RL algorithms to achieve effectiveness and efficiency. Experimental results on seven real-world datasets show that our model significantly outperforms state-of-the-art methods.
This paper studies the problem of \em answering Why-questions for graph pattern queries. Given a query Q, its answers $Q(G)$ in a graph G, and an exemplar $\E$ that describes desired answers, it aims ...to compute a query rewrite $Q'$, such that $Q'(G)$ incorporates relevant entities and excludes irrelevant ones wrt $\E$ under a closeness measure. (1) We characterize the problem by \em Q-Chase. It rewrites Q by applying a sequence of applicable operators guided by $\E$, and backtracks to derive optimal query rewrite. (2) We develop feasible Q-Chase-based algorithms, from anytime solutions to fixed-parameter approximations to compute query rewrites. These algorithms implement Q-Chase by detecting picky operators at run time, which discriminately enforce $\E$ to retain answers that are closer to exemplars, and effectively prune both operators and irrelevant matches, by consulting a cache of star patterns (called \em star views ). Using real-world graphs, we experimentally verify the efficiency and effectiveness of \qchase techniques and their applications.
Controlling Fairness and Bias in Dynamic Learning-to-Rank Morik, Marco; Singh, Ashudeep; Hong, Jessica ...
Proceedings of the 43rd International ACM SIGIR Conference on Research and Development in Information Retrieval,
07/2020
Conference Proceeding
Odprti dostop
Rankings are the primary interface through which many online platforms match users to items (e.g. news, products, music, video). In these two-sided markets, not only the users draw utility from the ...rankings, but the rankings also determine the utility (e.g. exposure, revenue) for the item providers (e.g. publishers, sellers, artists, studios). It has already been noted that myopically optimizing utility to the users -- as done by virtually all learning-to-rank algorithms -- can be unfair to the item providers. We, therefore, present a learning-to-rank approach for explicitly enforcing merit-based fairness guarantees to groups of items (e.g. articles by the same publisher, tracks by the same artist). In particular, we propose a learning algorithm that ensures notions of amortized group fairness, while simultaneously learning the ranking function from implicit feedback data. The algorithm takes the form of a controller that integrates unbiased estimators for both fairness and utility, dynamically adapting both as more data becomes available. In addition to its rigorous theoretical foundation and convergence guarantees, we find empirically that the algorithm is highly practical and robust.
Outsourced symmetric private information retrieval Jarecki, Stanislaw; Jutla, Charanjit; Krawczyk, Hugo ...
Proceedings of the 2013 ACM SIGSAC conference on Computer & communications security,
11/2013
Conference Proceeding
Odprti dostop
In the setting of searchable symmetric encryption (SSE), a data owner D outsources a database (or document/file collection) to a remote server E in encrypted form such that D can later search the ...collection at E while hiding information about the database and queries from E. Leakage to E is to be confined to well-defined forms of data-access and query patterns while preventing disclosure of explicit data and query plaintext values. Recently, Cash et al. presented a protocol, OXT, which can run arbitrary boolean queries in the SSE setting and which is remarkably efficient even for very large databases.
In this paper we investigate a richer setting in which the data owner D outsources its data to a server E but D is now interested to allow clients (third parties) to search the database such that clients learn the information D authorizes them to learn but nothing else while E still does not learn about the data or queried values as in the basic SSE setting. Furthermore, motivated by a wide range of applications, we extend this model and requirements to a setting where, similarly to private information retrieval, the client's queried values need to be hidden also from the data owner D even though the latter still needs to authorize the query. Finally, we consider the scenario in which authorization can be enforced by the data owner D without D learning the policy, a setting that arises in court-issued search warrants.
We extend the OXT protocol of Cash et al. to support arbitrary boolean queries in all of the above models while withstanding adversarial non-colluding servers (D and E) and arbitrarily malicious clients, and while preserving the remarkable performance of the protocol.
The application of information retrieval techniques to search tasks in software engineering is made difficult by the lexical gap between search queries, usually expressed in natural language (e.g. ...English), and retrieved documents, usually expressed in code (e.g. programming languages). This is often the case in bug and feature location, community question answering, or more generally the communication between technical personnel and non-technical stake holders in a software project. In this paper, we propose bridging the lexical gap by projecting natural language statements and code snippets as meaning vectors in a shared representation space. In the proposed architecture, word embeddings are rst trained on API documents, tutorials, and reference documents, and then aggregated in order to estimate semantic similarities between documents. Empirical evaluations show that the learned vector space embeddings lead to improvements in a previously explored bug localization task and a newly de ned task of linking API documents to computer programming questions.