My focus is the 'logico-rhetorical module' (Sperber, 2000). This mental module, Sperber hypothesizes, is an evolved human ability to examine critically what someone is saying, for example, to detect inconsistency or inadequate evidence in an argument. On the assumption that we have this natural ability, Chilton (2005) questions the need for Critical Discourse Analysis; in contrast, on his reading of Sperber's work, Hart (this issue) argues the opposite. In this article, I agree with Chilton's (2005) stance to the extent that the competence of the logico-rhetorical module is, generally speaking, adequate for enabling critical engagement with verbal input. That said, I highlight two (non-competence-related) limitations of the logico-rhetorical module for detecting inconsistency in arguments. To address these limitations, I hold that a new approach is needed in Critical Discourse Analysis, one which draws on corpus linguistic method; I refer to it as Electronic Deconstruction.
My broad concern is with texts which aim to persuade an audience of a particular point of view on a particular topic - persuasion texts. Political speeches and newspaper editorials are examples of this text type. I put forward a strategy for critically engaging with such texts. Its focus is a persuasion text's cohesion - how it is held together through its vocabulary and grammar. The strategy explores whether or not the cohesion of a persuasion text is unstable, that is, whether it deconstructs. Since a persuasion text's credibility is dependent, amongst other things, on effective cohesion, showing where a persuasion text's cohesion deconstructs diminishes its credibility. I call this critical reading strategy Electronic Deconstruction: 'Electronic' reflects the fact that the strategy draws on corpus linguistic method; 'Deconstruction' refers to the deconstructive approach of this strategy. An advantage of Electronic Deconstruction is that it can still facilitate critical engagement with an argument where it is difficult to reconstruct all its premises. This is because its evaluative focus is a persuasion text's cohesive structure rather than its logical structure. To demonstrate Electronic Deconstruction, the article employs a case study, a text written in 2008 by the late political journalist Christopher Hitchens, which justified his continuing support for the 2003 Iraq intervention by US-led coalition forces. After highlighting a number of frustrations in identifying its arguments, and thus in critically assessing its logical soundness, I show how Electronic Deconstruction, as an alternative form of critical engagement, circumvents incomplete reconstruction by revealing that the text's cohesive structure is unstable, via an electronic deconstructive analysis which draws on a two-billion-word corpus, the Oxford English Corpus.
The article employs two corpus linguistic software tools: 'Sketchengine' (http://www.sketchengine.co.uk/) and 'WMatrix' (http://ucrel.lancs.ac.uk/wmatrix/).
This article tackles multilingual automatic alignment. Alignment refers to the process by which segments that are translations of one another are automatically matched. Instead of comparing only pairs of languages at sentence level, as is usually done to conform to the human translation process, the computer is used here for its capacity to infer semantic alignment from a collection of texts that are translations of the same content. The corpus contains press releases from Europa, the European Community website, available in up to 23 languages. The alignment process takes advantage of frequency similarity between different linguistic versions of a document by computing matching features for each repeated string in all versions. This is done to find reliable anchors in the process of linking versions. The question of the best granularity for bringing out semantic equivalences when comparing two linguistic versions is raised: character N-grams or word N-grams. Alignment systems are traditionally based on word N-gram splitting. Observing the morphological variety of languages, even inside a single linguistic family, quickly shows that word granularity is inadequate for a widely multilingual system, i.e. a language-independent system able to handle inflectional languages as well as positional languages. Instead, when starting from a multilingual collection to focus on pairs of texts, we argue that character N-gram alignment is more efficient than word N-gram alignment.
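The idea of frequency-based anchors at two granularities can be illustrated with a minimal Python sketch. It is not the authors' system: the function names, the toy English and French documents, and the matching criterion (two strings are paired when they have exactly the same count profile across the collection) are all assumptions made for illustration, since the abstract does not specify its matching features.

```python
from collections import Counter

def char_ngrams(text, n):
    """Return all overlapping character n-grams of a text."""
    return [text[i:i + n] for i in range(len(text) - n + 1)]

def word_ngrams(text, n):
    """Return all overlapping word n-grams of a text, split on whitespace."""
    words = text.split()
    return [" ".join(words[i:i + n]) for i in range(len(words) - n + 1)]

def distribution(docs, extract, n):
    """Map each n-gram to its tuple of counts across the documents."""
    per_doc = [Counter(extract(doc, n)) for doc in docs]
    grams = set().union(*per_doc)
    return {g: tuple(c[g] for c in per_doc) for g in grams}

def anchors(docs_a, docs_b, extract, n, min_total=2):
    """Pair repeated n-grams of two versions that share one count profile.

    Exact profile equality is a simplifying assumption of this sketch.
    """
    dist_a = distribution(docs_a, extract, n)
    dist_b = distribution(docs_b, extract, n)
    by_profile = {}
    for gram, profile in dist_b.items():
        by_profile.setdefault(profile, []).append(gram)
    return {
        (gram_a, gram_b)
        for gram_a, profile in dist_a.items()
        if sum(profile) >= min_total  # keep repeated strings only
        for gram_b in by_profile.get(profile, [])
    }

# Hypothetical toy collection: three documents, each in two versions.
english = [
    "the commission met the council",
    "the parliament voted",
    "the commission and the parliament",
]
french = [
    "la commission a rencontré le conseil",
    "le parlement a voté",
    "la commission et le parlement",
]

char_pairs = anchors(english, french, char_ngrams, 4)
word_pairs = anchors(english, french, word_ngrams, 1)
```

On this toy collection both granularities produce candidate anchors. The sketch is too small to show the morphological effect the abstract argues for; it only makes the two splitting strategies concrete and comparable.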
Contemporary crime novels often contain detailed literary representations of urban life worlds. These stagings can provide access to city-specific patterns and structures of thought, action and feeling, as well as locally established bodies of knowledge and processes of sense-making. Therefore, their systematic analysis can generate insights into the intrinsic logic of cities. To grasp such patterns at city level, as broad an empirical basis as possible is needed, but the study of large amounts of literary works poses a methodological challenge. This article presents a mix of methods that permits the analysis of vast quantities of (literary) texts by combining classical qualitative close reading with elements of computer-aided qualitative content analysis, basic instruments from corpus linguistics and the methodology of distant reading in an iterative research process. It illustrates how to analyze qualitative data quantitatively as well, and on different levels with regard to social and spatial aspects of the depicted life worlds, thereby showing how novels could be used as a data basis for urban sociology and interdisciplinary research questions about the distinctiveness of cities.
This article is concerned with the discourse and policy of lifelong learning in Europe after 1973. It addresses the question of whether the 1980s should be seen as a period of diminishing interest in lifelong learning or as a formative phase of so-called neoliberalism. Applying a corpus linguistic approach, it analyses the official documents of the European Communities as archived in the database EUR-Lex. The results are interpreted as the expression of a new European governance regime called “technology corporatism” (Bornschier), which was jointly supported by supranational political actors, large-scale industry and science and also affected their understanding of lifelong learning.
The authors develop a customizable corpus tool for building corpora of historical and religious texts. A Big Data approach to Natural Language Processing and Natural Language Understanding was used to develop this corpus data platform. Calculating qualitative and quantitative characteristics and building search queries are among the features most important to the effectiveness of the adaptable text corpus. The number of computer-based calculations and the amount of data processed have been reduced, and the computations parallelized, to achieve higher performance at the level of both the computational methods and the implemented system. A higher level of efficiency, as a trade-off between effectiveness and computational time, has been achieved by choosing proper parameters for the computational methods. Latent Semantic Analysis is used as one of the core methods for making queries. The methods applied are mostly based on Singular Value Decomposition. The parameters of the decomposition are analyzed and justified. The suggested approach has been verified on test data available in different languages.
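To make the query mechanism concrete, the following is a minimal Python sketch of Latent Semantic Analysis on a toy term-document matrix. It is not the authors' implementation. The matrix, the query, the rank k=2 and the pure-Python power iteration used to approximate the leading left singular vectors are all assumptions for illustration; a real system would normally call a library SVD routine. The sketch also assumes that k does not exceed the rank of the matrix.

```python
import math

def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

def matvec(m, v):
    return [dot(row, v) for row in m]

def left_singular_vectors(a, k, iters=200):
    """Approximate the k leading left singular vectors of matrix a
    (a list of rows) by power iteration on A A^T with deflation."""
    a_t = [list(col) for col in zip(*a)]
    basis = []
    for _ in range(k):
        v = [1.0 / (i + 1) for i in range(len(a))]  # deterministic start vector
        for _step in range(iters):
            for u in basis:  # project out directions already found
                proj = dot(v, u)
                v = [x - proj * y for x, y in zip(v, u)]
            w = matvec(a, matvec(a_t, v))  # one multiplication by A A^T
            length = math.sqrt(dot(w, w))
            if length == 0.0:
                break
            v = [x / length for x in w]
        basis.append(v)
    return basis

def cosine(u, v):
    nu, nv = math.sqrt(dot(u, u)), math.sqrt(dot(v, v))
    return dot(u, v) / (nu * nv) if nu and nv else 0.0

def lsa_scores(term_doc, query, k):
    """Score each document against a query in the k-dimensional latent space."""
    basis = left_singular_vectors(term_doc, k)

    def project(vec):
        return [dot(u, vec) for u in basis]

    docs = [list(col) for col in zip(*term_doc)]
    q = project(query)
    return [cosine(q, project(d)) for d in docs]

# Hypothetical toy data: rows are terms, columns are four short documents.
terms = ["church", "prayer", "saint", "king", "battle"]
term_doc = [
    [1, 0, 0, 0],  # church
    [1, 1, 0, 0],  # prayer
    [0, 1, 0, 0],  # saint
    [0, 0, 1, 2],  # king
    [0, 0, 1, 1],  # battle
]
query = [1, 0, 0, 0, 0]  # the single word "church"
scores = lsa_scores(term_doc, query, k=2)
```

In this toy setting the second document does not contain "church", yet it scores close to the first because both share "prayer". The two documents about kings and battles score near zero. The choice of k is the kind of decomposition parameter whose effect on retrieval quality and computation time would need to be examined.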
The discourse produced during a tourist visit stems from different communicative modalities, including the visit assisted by a socio-technical device and the guided lecture tour led by a mediator. These two modalities share characteristics proper to the discursive genre of the mediated visit, yet they also differ significantly enough to constitute a study corpus subdivided by modality of production and source language: texts written by professionals in the field and texts recorded in the presence of visitors, in French and in Spanish. Several questions arise, among them a taxonomy of the discourse genres linked to the specialised field studied, a unit of textual segmentation that is independent of the written or oral mode of text production, and the categorisation of a text within a discursive genre. Indeed, the values of the characterisation parameters must allow the introduction of a prototype, which is essential to the categorisation and textual indexing of the genre studied. This research therefore falls within the theoretical framework of text linguistics and the analysis of discursive practices as indicators of social praxis, but the methodology employed extends this theoretical base to post-Gricean contributional linguistics, which legitimises the introduction of the contribution as the unit of textual segmentation. In addition, the quantitative treatment of a compilation of selected texts is grounded in discourse analysis and corpus linguistics. The method followed, which introduces textual segmentation rules including qualitative manual annotation, together with the quantitative analysis, makes it possible to propose an organisational model for each genre considered.
Beyond the considerable interest of characterising new specialised discourses, this work introduces, on the one hand, an analytical method that underlies the development of a segmentation, annotation and indexing program, along with a didactic application in the teaching of languages for specific purposes; and, on the other hand, the development of interfaces offering new modalities of mediation whose discourse is designed before they are built.
The paper analyzes the functioning of contact size adjectives in such fields as clothes, footwear, jewelry, food, facial features and the human body in the English and Tatar languages, applying digital methods, including corpus linguistic methods. The research material is taken from national language corpora, i.e. the British National Corpus (96,263,399) and the Tatar National Corpus (26,000,000). The results show that in real-life communication, English speakers occasionally describe necklaces with size adjectives, while for Tatar speakers the length parameter does not matter, since a traditionally or stereotypically specified necklace length is presupposed in communication. Parametric adjectives in Tatar instead describe the form of the necklace (kiñ/keçkenä). In both languages, positive size adjectives prevail four- or five-fold in situations describing outfits. In English, the frequency of these adjectives is half as high. In both languages, negative size adjectives have the same frequency in describing clothes. In both languages, size characterization of shoes is extremely rare. It seems that discrepancies between shoe and foot sizes pose no problem. Food is rarely characterized by size adjectives. In the British National Corpus, the most common phrase is big mac. In Tatar, the adjective vak ('small') dominates when describing food parameters. In English, phrases with human body parameters also represent a large group. Phrases with the positive parameter adjective are distributed almost equally with respect to all parts of the face. British people are prone to a detailed description of a person's face. Additionally, Tatar speakers tend to use complex adjectives to describe someone's appearance. All of these distinctive linguistic and cultural features, brought out by modern digital linguistic methods, should be taken into account in actual intercultural communication.
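Because the two corpora differ considerably in size, raw counts are not directly comparable across them. A common way to compare frequencies is to normalize them per million tokens, as in the minimal Python sketch below. The corpus sizes are those given in the abstract, assumed here to be token counts; the raw counts are hypothetical and serve only to show the arithmetic, and nothing in the abstract confirms that the authors used this normalization.

```python
# Corpus sizes as given in the text (assumed here to be token counts).
BNC_SIZE = 96_263_399
TATAR_SIZE = 26_000_000

def per_million(raw_count, corpus_size):
    """Convert a raw frequency into occurrences per million tokens."""
    return raw_count * 1_000_000 / corpus_size

# Hypothetical raw counts for one adjective-noun pattern, for illustration only.
hypothetical_counts = {"English": 963, "Tatar": 520}
sizes = {"English": BNC_SIZE, "Tatar": TATAR_SIZE}

normalized = {
    lang: per_million(hypothetical_counts[lang], sizes[lang]) for lang in sizes
}
for lang, value in normalized.items():
    print(f"{lang}: {value:.2f} per million")
```

With these invented counts the English raw count is larger, but the Tatar pattern is relatively more frequent once corpus size is taken into account.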
This paper presents a novel version of ExATO, a term extractor originally designed to extract relevant terms from corpora in Portuguese. The new version handles not only Portuguese corpora but also English texts. This extension is likely to offer the same level of quality already achieved for Portuguese. In this paper, we analyze results on parallel corpora with respect to the intrinsic differences between Portuguese and English, and also the usage environment of ExATO for Portuguese and English corpora. A brief comparison of ExATO with other similar tools is presented to illustrate the higher quality of ExATO extraction from English corpora.