Verb-stranding ellipsis, in which a verb is stranded outside the ellipsis site in which it originated, has been identified in a number of languages (Irish, McCloskey 1991; Hebrew, Doron 1999, Goldberg 2005; Greek, Merchant 2018; Uzbek, Gribanova 2020; i.a.), and has been invoked productively in analyses investigating the position to which verbs move and the timing of verb movement in the grammar. Recently, Landau (2018; 2020a; b) has proposed a phase-based negative licensing condition which restricts head-stranding ellipsis and precludes verb-stranding verb phrase ellipsis (VPE) altogether. He claims that apparent verb-stranding VPE must be reanalyzed either as argument ellipsis (Oku 1998; Kim 1999; Takahashi 2008) or as a clause-sized ellipsis that strands main verbs (Gribanova 2018). This article approaches this debate through an analysis of head movement and head-stranding ellipsis in the Indic verb-second (V2) language Kashmiri. We show that Landau’s phase-based approach encounters empirical challenges in accounting for ellipsis in V2 languages and requires an unworkable approach to V2 itself, at odds with accounts of V2 in Kashmiri and crosslinguistically (Holmberg 1986; Travis 1991; Vikner 1995; Zwart 1997; Bhatt 1999; Munshi and Bhatt 2009; Manetta 2011). While the present article argues in favor of the standard account of ellipsis (Merchant 2001; 2008), we affirm the important contribution of Landau’s work in identifying challenges that remain for any complete account of head-stranding ellipsis licensing.
Bilingual lexicon induction is the task of inducing word translations from monolingual corpora in two languages. In this article we present the most comprehensive analysis of bilingual lexicon induction to date. We present experiments on a wide range of languages and data sizes. We examine translation into English from 25 foreign languages: Albanian, Azeri, Bengali, Bosnian, Bulgarian, Cebuano, Gujarati, Hindi, Hungarian, Indonesian, Latvian, Nepali, Romanian, Serbian, Slovak, Somali, Spanish, Swedish, Tamil, Telugu, Turkish, Ukrainian, Uzbek, Vietnamese, and Welsh. We analyze the behavior of bilingual lexicon induction on low-frequency words, rather than testing solely on high-frequency words, as previous research has done. Low-frequency words are more relevant to statistical machine translation, where systems typically lack translations of rare words that fall outside of their training data. We systematically explore a wide range of features and phenomena that affect the quality of the translations discovered by bilingual lexicon induction. We provide illustrative examples of the highest ranking translations for orthogonal signals of translation equivalence like contextual similarity and temporal similarity. We analyze the effects of frequency and burstiness, and the sizes of the seed bilingual dictionaries and the monolingual training corpora. Additionally, we introduce a novel discriminative approach to bilingual lexicon induction. Our discriminative model is capable of combining a wide variety of features that individually provide only weak indications of translation equivalence. When feature weights are discriminatively set, these signals produce dramatically higher translation quality than previous approaches that combined signals in an unsupervised fashion (e.g., using minimum reciprocal rank).
We also directly compare our model's performance against a sophisticated generative approach, the matching canonical correlation analysis (MCCA) algorithm used by Haghighi et al. Our algorithm achieves an accuracy of 42% versus MCCA's 15%.
Central Asia is distinguished by its high level of multilingualism. Combining language portrait and ethnographic methods, this paper attempts to uncover how young Uzbeks negotiate their use of language in multilingual Uzbekistan and its connections to education, opportunity, identity, and group membership; furthermore, it examines how they construct and negotiate their identities during this process. Taking a micro, bottom-up approach, this research finds that the youth of Uzbekistan regard multilingualism as a semiotic source of mobility that allows them to function adequately in a globalized world, and that Uzbek, as a mother tongue, plays an important role in their ethnic and cultural identification. Being Uzbek always occupies the first position among their many identities at the intersection of tradition and modernity, as well as localization and globalization. This highlights the relationship between the mother tongue as heritage and individual development, which can be an important life anchor left for younger generations as a part of their own history and tradition, especially in the current historical period, in which relationships are characterized by high mobility and virtualization.
In spite of the sharp rise of research interest in linguistic landscapes worldwide, little attention has been given to the multilingual urban discourse of Kazakhstan. As the first investigation into the multilingual practices characteristic of the linguistic landscape in the western region of Kazakhstan, our study adds to the number of linguistic landscape analyses conducted through a translanguaging lens. This paper explores translingual practices on local "bottom-up" commercial public signs, using the example of four major cities in the region: Aktau, Aktobe, Atyrau and Uralsk. The study uses a mixed-methods research design combining qualitative and quantitative analysis of multilingual urban texts with semi-structured ethnographic interviews with owners of commercial establishments. In our analysis, we specify various dynamic and creative forms of mixing the state language Kazakh, interethnic Russian, international English and/or other local languages such as Uzbek and Arabic. We demonstrate how these languages are involved in the creation of symbolic meanings and the attraction of potential consumers, and how they contribute to the construction of the urban space of the western region of Kazakhstan. We provide illustrations of the ever-growing presence of English in multilingual written urban texts of the region as a symbol of modernity, high quality, innovation, technical progress and prestige. We also show the indexical potential of the Kazakh and Russian languages as markers of local affiliation and tradition, and of the Uzbek and Arabic languages as symbols of the Turkic and Islamic cultures.
This paper investigates the interaction between head movement of the verb and ellipsis of vP (verb-stranding ellipsis, VSE) in Uzbek, an understudied Turkic language of Central Asia. I argue that Uzbek verbal predicates are formed by head movement, while non-verbal predicates are formed by a species of Local Dislocation (Embick & Noyer 2001; Embick 2003). Uzbek has two distinct ellipsis strategies that yield similar strings: argument ellipsis (AE) and VSE. VSE occurs only with (head-moved) verbs, and can elide non-verbal predicates, while AE cannot. Uzbek VSE imposes a strict identity requirement on the heads extracted from the ellipsis site (the Verbal Identity Condition (Goldberg 2005b)). Both the genuine existence of this condition, and its source, have recently come under scrutiny; this paper presents Uzbek evidence in support of the claim that the Verbal Identity Condition is genuinely present in a subset of typologically diverse languages with VSE (see Gribanova 2018b). Variable crosslinguistic behavior with respect to the Verbal Identity Condition is predicted by an independently supported view of head movement (Harizanov & Gribanova 2019) in which certain types of head movement are syntactic (yielding the potential for mismatches of extracted material, by analogy with phrasal movement; Merchant 2001) while others are postsyntactic (yielding the Uzbek-type VSE pattern). The Uzbek investigation therefore provides crucial evidence in favor of a particular view of the crosslinguistic landscape of VSE, and moves us a step closer to explaining why head movement out of ellipsis domains varies systematically in its behavior across languages.
The article considers the concept of Schastie/Bakht (Happiness) in the Russian and Uzbek linguistic cultures as one of the most important universal concepts with a national component. On the one hand, the study is motivated by the interest of modern contrastive linguistics in the comparative research of concepts with a national component. On the other hand, it continues scholarly work on the concept of Schastie in the Russian linguistic culture and the concept of Bakht in the Uzbek linguistic culture. The novelty of this study lies in the fact that this concept is compared for the first time using set phrases of the two languages and an analytical review of the relevant sources. The article aims to determine the shared and distinct components of the Happiness concept in the Uzbek and Russian linguistic cultures (according to the data obtained from the analysis of the above-mentioned material). The article presents the results of an analytical review of studies on the concept of Schastie in the Russian linguistic culture and the concept of Bakht in the Uzbek linguistic culture, as well as a contrastive analysis of phraseological units related to the verbalization of these concepts. To analyze and compare idioms of two unrelated languages (Russian and Uzbek) and ways of verbalizing the concept, the authors used the method of linguistic and cultural description, supplemented by the component analysis of lexemes and the comparative method. As a result, general and specific meanings of the words “schastie” and “bakht” were identified, as well as general and specific components of the Happiness concept.
The primary aim of this study was to contribute to the development of multilingual automatic speech recognition for lower-resourced Turkic languages. Ten languages (Azerbaijani, Bashkir, Chuvash, Kazakh, Kyrgyz, Sakha, Tatar, Turkish, Uyghur, and Uzbek) were considered. A total of 22 models were developed (13 monolingual and 9 multilingual). The multilingual models that were trained using joint speech data performed more robustly than the baseline monolingual models, with the best model achieving average character and word error rate reductions of 56.7% and 54.3%, respectively. The results of the experiment showed that character and word error rate reduction was more likely when multilingual models were trained with data from related Turkic languages than when they were developed using data from unrelated, non-Turkic languages, such as English and Russian. The study also presented an open-source Turkish speech corpus. The corpus contains 218.2 h of transcribed speech with 186,171 utterances and is the largest publicly available Turkish dataset of its kind. The datasets and code used to train the models are available for download from our GitHub page.
With the increase in linguistic and cultural diversity in South Korea, the landscape of English education in Korean classrooms has been changing. This has led to an increased need to explore the language and literacy practices of emergent multilingual youth in Korea, where one (official) language (Korean) has been predominantly used as the medium of instruction for English teaching and learning. Addressing this need for more research on how emerging multilingual children learn English in the diverse Korean classrooms of today, this four-year longitudinal case study explored the out-of-classroom English language learning experiences of three Uzbek students in South Korea. Drawing upon the conceptual framework of translanguaging and agency, data were collected from various sources. I found that the actions taken by these students to learn English depended on their interlocutors, practical and academic purposes, and language ideologies embedded in contexts, which in turn influenced learners’ agency and translanguaging practices. More specifically, the findings show that the students exercised agency over their choice of linguistic and non-linguistic resources in order to expand their linguistic repertoires of their own accord. These findings provide implications for EFL research and pedagogy, particularly within the context of the transition from monolingual to multilingual.
Making natural language processing technologies available for low-resource languages is an important goal for improving access to technology in their communities of speakers. In this paper, we provide the first annotated corpora for polarity classification for the Uzbek language. Our methodology involves collecting a medium-size manually annotated dataset and a larger dataset automatically translated from existing resources. We then use these datasets to train sentiment analysis models for the Uzbek language, using both traditional machine learning techniques and recent deep learning models.
Literary relations are the result of intercultural communication, which is rooted in the ancient history of mankind and marked the beginning of the process of globalization. The article examines the historical foundations of Kazakh-Uzbek literary relations and the peculiarities of their development. The subject of the research is Kazakh-Uzbek literary relations. The study of Kazakh and Uzbek literature is important in the context of comparative literary studies, as it shows their common historical roots. The literary relations between them are also important and are divided into chronological periods in accordance with the principles of historical development. The article uses both chronological and typological methods to identify the stages and types of literary translation in the literature of the two languages. With the help of the cultural-historical method, the historical origin of literary translation in Kazakh and Uzbek literature was determined, and the national characteristics influencing literary works, as well as the historical, genetic, and geographical factors that serve as the basis for their integration, were identified.