We conducted an online experiment to study people’s perception of automated computer-written news. Using a 2 × 2 × 2 design, we varied the article topic (sports, finance; within-subjects) and both the articles’ actual and declared source (human-written, computer-written; between-subjects). Nine hundred eighty-six subjects rated two articles on credibility, readability, and journalistic expertise. Varying the declared source had small but consistent effects: subjects always rated articles declared to be human-written more favorably, regardless of the actual source. Varying the actual source had larger effects: subjects rated computer-written articles as more credible and higher in journalistic expertise but less readable. Subjects’ perceptions did not differ across topics. The results provide conservative estimates of the favorability of computer-written news, a favorability likely to increase further over time, and endorse prior calls for establishing an ethics of computer-written news.
Susan Schneider (2019) has proposed two new tests for consciousness in AI (artificial intelligence) systems: the AI Consciousness Test and the Chip Test. On their face, the two tests seem to have the virtue of proving satisfactory to a wide range of consciousness theorists holding divergent theoretical positions, rather than relying narrowly on the truth of any particular theory of consciousness. Unfortunately, both tests are undermined by an 'audience problem': those theorists with the kind of architectural worries that motivate the need for such tests should, on similar grounds, doubt that the tests establish the existence of genuine consciousness in the AI in question. Nonetheless, the proposed tests constitute progress, as they could find use by theorists holding fitting views about consciousness, and perhaps in conjunction with other tests for AI consciousness.
In this commentary, we discuss the nature of reversible and irreversible questions, that is, questions that may enable one to identify the nature of the source of their answers. We then introduce GPT-3, a third-generation, autoregressive language model that uses deep learning to produce human-like texts, and use the previous distinction to analyse it. We expand the analysis to present three tests based on mathematical, semantic (that is, the Turing Test), and ethical questions and show that GPT-3 is not designed to pass any of them. This is a reminder that GPT-3 does not do what it is not supposed to do, and that any interpretation of GPT-3 as the beginning of the emergence of a general form of artificial intelligence is merely uninformed science fiction. We conclude by outlining some of the significant consequences of the industrialisation of automatic and cheap production of good, semantic artefacts.
This article considers the Turing test as a problem of communication, particularly by asking how the language of artificial intelligence (AI) appears to human experience in comparison to the language of the Other. This question is approached through Levinas’ philosophy, by considering the possibility of AI as an absolute alterity, rather than reducing its alterity to the Same. This perspective diverges from traditional accounts of AI, which are more concerned with identifying structures of consciousness in the machine that are analogous to those evident in firsthand experience. This article asks how exactly AI appears to human consciousness, and whether this appearance precludes the appearance of AI as a thinking-being. In the final analysis, the author argues that AI diverges from Levinas’ understanding of alterity, which centers around the exteriority of the Other. The alterity of AI, in contrast, centers around anteriority, defined as the appearance of language's origin-in-itself.
Visual Turing test for computer vision systems. Geman, Donald; Geman, Stuart; Hallonquist, Neil; et al. Proceedings of the National Academy of Sciences, vol. 112, no. 12, March 2015. Journal article; peer reviewed; open access.
Today, computer vision systems are tested by their accuracy in detecting and localizing instances of objects. As an alternative, and motivated by the ability of humans to provide far richer descriptions and even tell a story about an image, we construct a “visual Turing test”: an operator-assisted device that produces a stochastic sequence of binary questions from a given test image. The query engine proposes a question; the operator either provides the correct answer or rejects the question as ambiguous; the engine proposes the next question (“just-in-time truthing”). The test is then administered to the computer-vision system, one question at a time. After the system’s answer is recorded, the system is provided the correct answer and the next question. Parsing is trivial and deterministic; the system being tested requires no natural language processing. The query engine employs statistical constraints, learned from a training set, to produce questions with essentially unpredictable answers—the answer to a question, given the history of questions and their correct answers, is nearly equally likely to be positive or negative. In this sense, the test is only about vision. The system is designed to produce streams of questions that follow natural story lines, from the instantiation of a unique object, through an exploration of its properties, and on to its relationships with other uniquely instantiated objects.
Significance: In computer vision, as in other fields of artificial intelligence, the methods of evaluation largely define the scientific effort. Most current evaluations measure detection accuracy, emphasizing the classification of regions according to objects from a predefined library. But detection is not the same as understanding. We present here a different evaluation system, in which a query engine prepares a written test (“visual Turing test”) that uses binary questions to probe a system’s ability to identify attributes and relationships in addition to recognizing objects.
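The “essentially unpredictable answers” constraint described in this abstract can be illustrated with a toy selection rule: pick the unasked question whose probability of a positive answer, under learned statistics, is closest to a coin flip. The question pool, probabilities, and function name below are illustrative assumptions, not the authors’ actual query engine, which conditions on the full history of questions and answers rather than on fixed marginals:

```python
# Toy sketch of the unpredictability constraint: prefer questions
# whose answer is nearly equally likely to be positive or negative.
# The probabilities here are hypothetical "learned statistics"; the
# real engine conditions these on the question-answer history.
LEARNED_P_YES = {
    "is there a person?": 0.95,            # almost always yes: predictable
    "is the person wearing a hat?": 0.48,  # near 0.5: informative
    "is there a vehicle?": 0.55,
    "is the sky visible?": 0.90,
}

def next_question(history):
    """Return the unasked question whose P(yes) is closest to 0.5."""
    asked = {q for q, _ in history}
    candidates = [q for q in LEARNED_P_YES if q not in asked]
    if not candidates:
        return None
    return min(candidates, key=lambda q: abs(LEARNED_P_YES[q] - 0.5))

print(next_question([]))  # -> "is the person wearing a hat?" (closest to 0.5)
```

In this sketch, a question like “is there a person?” is skipped while its answer is predictable; in the paper’s framing, questions whose answers are nearly determined by the history carry no information about the system’s visual competence.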
We administer a Turing test to AI chatbots. We examine how chatbots behave in a suite of classic behavioral games designed to elicit characteristics such as trust, fairness, risk aversion, and cooperation, as well as how they respond to a traditional Big Five psychological survey that measures personality traits. ChatGPT-4 exhibits behavioral and personality traits that are statistically indistinguishable from those of a random human drawn from tens of thousands of subjects from more than 50 countries. Chatbots also modify their behavior based on previous experience and context, "as if" they were learning from the interactions, and they change their behavior in response to different framings of the same strategic situation. Their behaviors are often distinct from average and modal human behaviors; when they are, they tend to fall on the more altruistic and cooperative end of the distribution. We estimate that they act as if they are maximizing an average of their own and their partner's payoffs.
The uncanny valley—the unnerving nature of humanlike robots—is an intriguing idea, but both its existence and its underlying cause are debated. We propose that humanlike robots are not only unnerving, but are so because their appearance prompts attributions of mind. In particular, we suggest that machines become unnerving when people ascribe to them experience (the capacity to feel and sense), rather than agency (the capacity to act and do). Experiment 1 examined whether a machine’s humanlike appearance prompts both ascriptions of experience and feelings of unease. Experiment 2 tested whether a machine capable of experience remains unnerving, even without a humanlike appearance. Experiment 3 investigated whether the perceived lack of experience can also help explain the creepiness of unfeeling humans and philosophical zombies. These experiments demonstrate that feelings of uncanniness are tied to perceptions of experience, and also suggest that experience—but not agency—is seen as fundamental to humans, and fundamentally lacking in machines.