As seen in Wired and Time
A revealing look at how negative biases against women of color are embedded in search engine results and algorithms
Run a Google search for "black girls": what will you find? "Big Booty" and other sexually explicit terms are likely to come up as top search terms. But if you type in "white girls," the results are radically different. The suggested porn sites and unmoderated discussions about "why black women are so sassy" or "why black women are so angry" present a disturbing portrait of black womanhood in modern society.
In Algorithms of Oppression, Safiya Umoja Noble challenges the idea that search engines like Google offer an equal playing field for all forms of ideas, identities, and activities. Data discrimination is a real social problem; Noble argues that the combination of private interests in promoting certain sites, along with the monopoly status of a relatively small number of Internet search engines, leads to a biased set of search algorithms that privilege whiteness and discriminate against people of color, specifically women of color.
Through an analysis of textual and media searches as well as extensive research on paid online advertising, Noble exposes a culture of racism and sexism in the way discoverability is created online. As search engines and their related companies grow in importance (operating as a source for email, a major vehicle for primary and secondary school learning, and beyond), understanding and reversing these disquieting trends and discriminatory practices is of utmost importance.
An original, surprising and, at times, disturbing account of bias on the internet, Algorithms of Oppression contributes to our understanding of how racism is created, maintained, and disseminated in the 21st century.
Safiya Noble discusses search engine bias in an interview with USC Annenberg School for Communication and Journalism
We present a new approach for the approximate K-nearest neighbor search based on navigable small world graphs with controllable hierarchy (Hierarchical NSW, HNSW). The proposed solution is fully graph-based, without any need for additional search structures (typically used at the coarse search stage of most proximity graph techniques). Hierarchical NSW incrementally builds a multi-layer structure consisting of a hierarchical set of proximity graphs (layers) for nested subsets of the stored elements. The maximum layer in which an element is present is selected randomly with an exponentially decaying probability distribution. This allows producing graphs similar to the previously studied Navigable Small World (NSW) structures while additionally having the links separated by their characteristic distance scales. Starting the search from the upper layer together with utilizing the scale separation boosts the performance compared to NSW and allows a logarithmic complexity scaling. Additional employment of a heuristic for selecting proximity graph neighbors significantly increases performance at high recall and in the case of highly clustered data. Performance evaluation has demonstrated that the proposed general metric space search index is able to strongly outperform previous open-source state-of-the-art vector-only approaches. Similarity of the algorithm to the skip list structure allows straightforward balanced distributed implementation.
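The random layer assignment described in this abstract can be illustrated with a minimal sketch. It assumes the floor-of-scaled-exponential rule commonly associated with HNSW, with a normalization factor set to 1/ln(M). The names `sample_max_layer`, `layer_histogram`, and `m_l`, and the value M = 16, are illustrative assumptions and do not come from the abstract.

```python
import math
import random


def sample_max_layer(m_l, rng):
    """Draw the top layer for one element (hypothetical helper).

    floor(-ln(u) * m_l) with u uniform on (0, 1] gives layer numbers
    whose probability decays exponentially with the layer index.
    """
    u = 1.0 - rng.random()  # in (0, 1], so log(u) is always defined
    return int(-math.log(u) * m_l)


def layer_histogram(n_elements, m_l, seed=0):
    """Count how many of n_elements get each top layer."""
    rng = random.Random(seed)
    counts = {}
    for _ in range(n_elements):
        layer = sample_max_layer(m_l, rng)
        counts[layer] = counts.get(layer, 0) + 1
    return counts


if __name__ == "__main__":
    M = 16  # assumed number of links per node, for illustration only
    hist = layer_histogram(10_000, 1.0 / math.log(M))
    for layer in sorted(hist):
        print(layer, hist[layer])
```

With this choice of `m_l`, roughly a fraction 1/M of the elements present on a layer also appears on the layer above, which is what makes the upper layers sparse.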
A good search strategy is essential for a successful systematic literature study. Historically, database searches have been the norm, later complemented by snowball searches. Our conjecture is that we can perform even better searches by combining these two search approaches, referred to as a hybrid search strategy.
Our main objective was to compare and evaluate a hybrid search strategy. Furthermore, we compared four alternative hybrid search strategies to assess whether we could identify more cost-efficient ways of searching for relevant primary studies.
To compare and evaluate the hybrid search strategy, we replicated the search procedure in a systematic literature review (SLR) on industry–academia collaboration in software engineering. The SLR used a more “traditional” approach to searching for relevant articles for an SLR, while our replication was executed using a hybrid search strategy.
In our evaluation, the hybrid search strategy was superior in identifying relevant primary studies. It identified 30% more primary studies and even more studies when focusing only on peer-reviewed articles. To embrace individual viewpoints when assessing research articles and minimise the risk of missing primary studies, we introduced two new concepts, wild cards and borderline articles, when performing systematic literature studies.
The hybrid search strategy is a strong contender for being used when performing systematic literature studies. Furthermore, alternative hybrid search strategies may be viable if selected wisely in relation to the start set for snowballing. Finally, the two new concepts were judged as essential to cater for different individual judgements and to minimise the risk of excluding primary studies that ought to be included.
Fast data search is an important element of big data in the modern era of the internet of things, cloud computing, and social networks. Search using the traditional binary search algorithm can be accelerated by employing an interpolation search technique when the data is regularly distributed. This work investigates cases in which interpolation search makes unexpectedly sluggish progress in a large database because the data is irregularly distributed. An irregular distribution does not allow the interpolation to make a good prediction about the location of the search item. To overcome this issue, an interpolation–extrapolation search (IES) method is proposed in which the interpolation method is integrated with an extrapolation method that balances the lower and upper bounds of the search interval. The proposed method converges faster than both binary search and interpolation search. Hence, the proposed IES method provides a faster search for items in a big database.
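For readers unfamiliar with the baseline, the following is a minimal sketch of classic interpolation search over a sorted list of integers. It does not reproduce the proposed IES method, whose extrapolation step is not specified in the abstract. The function name and the probe counter are illustrative additions.

```python
def interpolation_search(values, target):
    """Classic interpolation search on a sorted list of integers.

    Returns (index, probes); index is -1 when target is absent.
    """
    lo, hi = 0, len(values) - 1
    probes = 0
    while lo <= hi and values[lo] <= target <= values[hi]:
        if values[hi] == values[lo]:
            # All remaining keys are equal, and the loop condition
            # guarantees they equal the target.
            return lo, probes
        # Estimate the position by linear interpolation between the bounds.
        pos = lo + (target - values[lo]) * (hi - lo) // (values[hi] - values[lo])
        probes += 1
        if values[pos] == target:
            return pos, probes
        if values[pos] < target:
            lo = pos + 1
        else:
            hi = pos - 1
    return -1, probes


if __name__ == "__main__":
    regular = [1, 3, 5, 7, 9, 11]
    skewed = [1, 2, 3, 4, 5, 1000]  # one outlier distorts the estimate
    print(interpolation_search(regular, 7))
    print(interpolation_search(skewed, 5))
```

On the skewed list the single large key pulls every position estimate toward the low end, so the search advances one element per probe. This is the kind of slow progress on irregularly distributed data that the abstract describes.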
Providing Advice to Jobseekers at Low Cost
BELOT, MICHÈLE; KIRCHER, PHILIPP; MULLER, PAUL
The Review of economic studies, 07/2019, Volume 86, Issue 4 (309)
Journal Article
Peer reviewed
Open access
We develop and evaluate experimentally a novel tool that redesigns the job search process by providing tailored advice at low cost. We invited jobseekers to our computer facilities for twelve consecutive weekly sessions to search for real jobs on our web interface. For one-half, instead of relying on their own search criteria, we use readily available labour market data to display relevant alternative occupations and associated jobs. The data indicate that this broadens the set of jobs they consider and increases their job interviews, especially for participants who otherwise search narrowly and have been unemployed for a few months.
Invisible Search and Online Search Engines considers the use of search engines in contemporary everyday life and the challenges this poses for media and information literacy. Looking for mediated information is mostly done online and arbitrated by the various tools and devices that people carry with them on a daily basis. Because of this, search engines have a significant impact on the structure of our lives, and personal and public memories. Haider and Sundin consider what this means for society, whilst also uniting research on information retrieval with research on how people actually look for and encounter information. Search engines are now one of society’s key infrastructures for knowing and becoming informed. While their use is dispersed across myriads of social practices, where they have acquired close to naturalised positions, they are commercially and technically centralised. Arguing that search, searching, and search engines have become so widely used that we have stopped noticing them, Haider and Sundin consider what it means to be so reliant on this all-encompassing and increasingly invisible information infrastructure. Invisible Search and Online Search Engines is the first book to approach search and search engines from a perspective that combines insights from the technical expertise of information science research with a social science and humanities approach. As such, the book should be essential reading for academics, researchers, and students working on and studying information science, library and information science (LIS), media studies, journalism, digital cultures, and educational sciences.
Deep learning has made substantial breakthroughs in many fields due to its powerful automatic representation capabilities. It has been proven that neural architecture design is crucial to the feature representation of data and the final performance. However, the design of the neural architecture heavily relies on the researchers’ prior knowledge and experience. And due to the limitations of humans’ inherent knowledge, it is difficult for people to jump out of their original thinking paradigm and design an optimal model. Therefore, an intuitive idea would be to reduce human intervention as much as possible and let the algorithm automatically design the neural architecture. Neural Architecture Search (NAS) is just such a revolutionary algorithm, and the related research work is complicated and rich. Therefore, a comprehensive and systematic survey on NAS is essential. Previous related surveys have classified existing work mainly based on the key components of NAS: search space, search strategy, and evaluation strategy. While this classification method is more intuitive, it is difficult for readers to grasp the challenges and the landmark work involved. Therefore, in this survey, we provide a new perspective: beginning with an overview of the characteristics of the earliest NAS algorithms, summarizing the problems in these early NAS algorithms, and then providing solutions for subsequent related research work. In addition, we conduct a detailed and comprehensive analysis, comparison, and summary of these works. Finally, we provide some possible future research directions.
When conducting a Systematic Literature Review (SLR), researchers usually face the challenge of designing a search strategy that appropriately balances result quality and review effort. Using digital library (or database) searches or snowballing alone may not be enough to achieve high-quality results. On the other hand, using both digital library searches and snowballing together may increase the overall review effort.
The goal of this research is to propose and evaluate hybrid search strategies that selectively combine database searches with snowballing.
We propose four hybrid search strategies combining database searches in digital libraries with iterative, parallel, or sequential backward and forward snowballing. We simulated the strategies over three existing SLRs in SE that adopted both database searches and snowballing. We compared the outcome of digital library searches, snowballing, and hybrid strategies using precision, recall, and F-measure to investigate the performance of each strategy.
Our results show that, for the analyzed SLRs, combining database searches from the Scopus digital library with parallel or sequential snowballing achieved the most appropriate balance of precision and recall.
We put forward that, depending on the goals of the SLR and the available resources, using a hybrid search strategy involving a representative digital library and parallel or sequential snowballing tends to represent an appropriate alternative to be used when searching for evidence in SLRs.
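The three measures used in this comparison can be computed as in the sketch below. It assumes each search strategy yields a set of retrieved study identifiers and that a reference set of relevant studies is known. The function name and the sample identifiers are hypothetical and not taken from the study.

```python
def precision_recall_f1(retrieved, relevant):
    """Precision, recall, and F-measure (F1) for one search strategy.

    retrieved: identifiers returned by the strategy
    relevant:  identifiers of the studies that should have been found
    """
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


if __name__ == "__main__":
    # Hypothetical identifiers, for illustration only.
    found = {"s1", "s2", "s3", "s4"}
    truth = {"s2", "s3", "s5"}
    print(precision_recall_f1(found, truth))
```

Precision falls when a strategy returns many irrelevant studies (more review effort), while recall falls when it misses relevant ones. The F-measure balances the two.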
Users frequently use search systems on the Web as well as online social media to learn about ongoing events and public opinion on personalities. Prior studies have shown that the top-ranked results returned by these search engines can shape user opinion about the topic (e.g., event or person) being searched. In the case of polarizing topics like politics, where multiple competing perspectives exist, the political bias in the top search results can play a significant role in shaping public opinion towards (or away from) certain perspectives. Given the considerable impact that search bias can have on the user, we propose a generalizable search bias quantification framework that not only measures the political bias in the ranked list output by the search system but also decouples the bias introduced by the different sources: input data and ranking system. We apply our framework to study the political bias in searches related to the 2016 US Presidential primaries in Twitter social media search and find that both input data and ranking system matter in determining the final search output bias seen by the users. Finally, we use the framework to compare the relative bias for two popular search systems (Twitter social media search and Google web search) for queries related to politicians and political events. We end by discussing some potential solutions to signal the bias in the search results to make the users more aware of it.
In environments with scarce resources, adopting the right search strategy can make the difference between succeeding and failing, even between life and death. At different scales, this applies to molecular encounters in the cell cytoplasm, to animals looking for food or mates in natural landscapes, to rescuers during search and rescue operations in disaster zones, and to genetic computer algorithms exploring parameter spaces. When looking for sparse targets in a homogeneous environment, a combination of ballistic and diffusive steps is considered optimal; in particular, more ballistic Lévy flights with exponent α ≤ 1 are generally believed to optimize the search process. However, most search spaces present complex topographies. What is the best search strategy in these more realistic scenarios? Here, we show that the topography of the environment significantly alters the optimal search strategy toward less ballistic and more Brownian strategies. We consider an active particle performing a blind cruise search for nonregenerating sparse targets in a 2D space with steps drawn from a Lévy distribution with the exponent varying from α = 1 to α = 2 (Brownian). We show that, when boundaries, barriers, and obstacles are present, the optimal search strategy depends on the topography of the environment, with α assuming intermediate values in the whole range under consideration. We interpret these findings using simple scaling arguments and discuss their robustness to varying searcher’s size. Our results are relevant for search problems at different length scales from animal and human foraging to microswimmers’ taxis to biochemical rates of reaction.
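A heavy-tailed random walk of the kind discussed here can be sketched as follows. This is not the authors' simulation. Step lengths are drawn from a Pareto distribution with tail exponent α as a simple stand-in for a Lévy step distribution, directions are uniform, and targets, boundaries, and obstacles are omitted. The function name and the parameters `l_min` and `seed` are assumptions.

```python
import math
import random


def levy_like_walk(n_steps, alpha, l_min=1.0, seed=0):
    """2D walk with Pareto-tailed step lengths (stand-in for Levy steps).

    Smaller alpha gives heavier tails, i.e. more long, ballistic-looking
    relocations; larger alpha gives shorter, more diffusive steps.
    """
    rng = random.Random(seed)
    x, y = 0.0, 0.0
    path = [(x, y)]
    for _ in range(n_steps):
        length = l_min * rng.paretovariate(alpha)  # always >= l_min
        angle = rng.uniform(0.0, 2.0 * math.pi)    # isotropic direction
        x += length * math.cos(angle)
        y += length * math.sin(angle)
        path.append((x, y))
    return path


if __name__ == "__main__":
    for alpha in (1.0, 1.5, 2.0):
        end_x, end_y = levy_like_walk(1000, alpha)[-1]
        print(alpha, round(math.hypot(end_x, end_y), 1))
```

Varying `alpha` between 1 and 2 in such a sketch moves the walk between long relocations and local exploration. The abstract reports that the optimal trade-off between these depends on the topography of the environment.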