Semantic relatedness deals with the problem of measuring how much two words are related to each other. While there is a large body of research on developing new measures, the use of semantic relatedness (SR) measures in topic segmentation has not been explored. In this research, the performance of different SR measures is evaluated on the topic segmentation problem. To this end, two topic segmentation algorithms that use the difference in SR of words are introduced. Our results indicate that using an SR measure trained on a general-domain corpus achieves better results than topic segmentation algorithms using WordNet or simple word repetition. Furthermore, when compared with computationally more complex algorithms performing global analysis, our local analysis, enhanced with general-domain lexical semantic information, achieves comparable results.
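As a rough illustration of what a local, SR-based segmenter might look like, the sketch below scores each gap between sentences by the average relatedness of the words on either side and places a boundary where that score dips to a local minimum under a threshold. This is not the algorithm from the abstract, whose details are not given here; `toy_sr`, `TOPIC_OF`, `window` and `threshold` are all hypothetical stand-ins chosen for the example.

```python
# Hypothetical lookup table standing in for a trained SR measure.
TOPIC_OF = {
    "team": "sport", "match": "sport", "striker": "sport", "goal": "sport",
    "bank": "finance", "loan": "finance", "interest": "finance", "credit": "finance",
}


def toy_sr(a, b):
    """Toy relatedness score in [0, 1]; an assumption, not a real SR measure."""
    if a == b:
        return 1.0
    topic_a, topic_b = TOPIC_OF.get(a), TOPIC_OF.get(b)
    return 0.8 if topic_a is not None and topic_a == topic_b else 0.0


def block_relatedness(left, right, sr):
    """Average pairwise relatedness between two lists of words."""
    pairs = [(a, b) for a in left for b in right]
    if not pairs:
        return 0.0
    return sum(sr(a, b) for a, b in pairs) / len(pairs)


def segment(sentences, sr, window=2, threshold=0.3):
    """Return indices i such that a topic boundary is placed before sentence i."""
    scores = {}
    for i in range(1, len(sentences)):
        left = [w for s in sentences[max(0, i - window):i] for w in s]
        right = [w for s in sentences[i:i + window] for w in s]
        scores[i] = block_relatedness(left, right, sr)

    boundaries = []
    for i, score in scores.items():
        prev_score = scores.get(i - 1, float("inf"))
        next_score = scores.get(i + 1, float("inf"))
        # A boundary is a low-scoring gap that is also a local minimum.
        if score < threshold and score <= prev_score and score <= next_score:
            boundaries.append(i)
    return boundaries


sentences = [
    ["team", "match"],
    ["striker", "goal"],
    ["bank", "loan"],
    ["interest", "credit"],
]
boundaries = segment(sentences, toy_sr)
```

The analysis is local in the sense that each gap is scored only from the few sentences around it; swapping `toy_sr` for a corpus-trained or WordNet-based measure would leave the rest of the sketch unchanged.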
► Adaptation of lexical cohesion based topic segmentation to TV programs specifics. ► Confidence measures and semantic relations used as additional information. ► Language model interpolation techniques used for better language model estimation. ► Domain independent technique applied on two corpora composed of TV news and reports. ► F1-measure improved by +4.9 and +3.7 on the two corpora.
Transcript-based topic segmentation of TV programs faces several difficulties arising from transcription errors, from the presence of potentially short segments and from the limited number of word repetitions to enforce lexical cohesion, i.e., lexical relations that exist within a text to provide a certain unity. To overcome these problems, we extend a probabilistic measure of lexical cohesion based on generalized probabilities with a unigram language model. On the one hand, confidence measures and semantic relations are considered as additional sources of information. On the other hand, language model interpolation techniques are investigated for better language model estimation. Experimental topic segmentation results are presented on two corpora with distinct characteristics, composed respectively of broadcast news and reports on current affairs. Significant improvements are obtained on both corpora, demonstrating the effectiveness of the extended lexical cohesion measure for spoken TV contents, as well as its genericity over different programs.
• A teacher made analogies using daily life situations to illustrate a proof by contradiction. • The teacher used lexical cohesion for mapping the target and base of analogies. • The teacher used conjunctions to make the consequences of the analogy explicit. • The teacher created cohesion through reported speech. • The teacher created a parallel structure between the target and base of the analogies.
Researchers have documented the use of analogies by teachers when introducing mathematical concepts. This article asks: what linguistic resources do teachers use to create analogies? The article applies systemic functional linguistics to examine examples in which a geometry teacher used analogies to connect daily life instances and mathematical ideas. Specifically, the method applies cohesion analysis to examine the teacher's use of lexical cohesion, conjunctions, and reported speech in the creation of analogies. The teacher created a parallel structure between the target and base of the analogies. The study demonstrates how linguistic analysis can be useful for researchers studying how teachers construct the mathematical classroom register through analogies, particularly when connecting colloquial and mathematical discourses.
Abstract
This paper contrasts lexical cohesion between English and German spoken and
written registers, reporting findings from a quantitative lexical analysis.
After an overview of research aims and motivations we formulate hypotheses on
distributions of shallow features as indicators of lexical cohesion across
languages and modes and with respect to register ranking and variation. The
shallow features analysed are: highly frequent words in texts, lexical density,
standardized type-token-ratio, top-frequent content words of the language within
individual registers and texts, and several types of Latinate words. Descriptive
analyses of the corpus are then presented and statistically validated with the
help of univariate and multivariate analyses. The results are interpreted
relative to our hypotheses and related to the following properties of texts in
terms of lexical cohesion: semantic variability, cohesive strength, number and
length of nominal chains, degree of specification of lexis, and degree of
variation along all of these properties.
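Two of the shallow features named above, lexical density and standardized type-token ratio, can be sketched in a few lines. The definitions used here are common ones (share of content words among all tokens; type-token ratio averaged over fixed-size chunks) and are assumed rather than taken from the paper; `FUNCTION_WORDS` is a tiny hypothetical stop list, and the tokenizer is a deliberate simplification.

```python
import re

# Hypothetical, very small function-word list; a real study would use a
# part-of-speech tagger or a full stop list for each language.
FUNCTION_WORDS = {
    "the", "a", "an", "of", "and", "in", "to", "is", "are",
    "on", "with", "for", "it", "that", "this",
}


def tokenize(text):
    """Lowercase the text and keep alphabetic tokens (including German umlauts)."""
    return re.findall(r"[a-zäöüß]+", text.lower())


def lexical_density(tokens):
    """Proportion of tokens that are not in the function-word list."""
    if not tokens:
        return 0.0
    content = [t for t in tokens if t not in FUNCTION_WORDS]
    return len(content) / len(tokens)


def standardized_ttr(tokens, chunk_size=1000):
    """Mean type-token ratio over consecutive full chunks of chunk_size tokens.

    Trailing tokens that do not fill a chunk are ignored. If the text is
    shorter than one chunk, the plain type-token ratio is returned.
    """
    chunks = [
        tokens[i:i + chunk_size]
        for i in range(0, len(tokens) - chunk_size + 1, chunk_size)
    ]
    if not chunks:
        return len(set(tokens)) / len(tokens) if tokens else 0.0
    return sum(len(set(c)) / len(c) for c in chunks) / len(chunks)


tokens = tokenize("The cat sat on the mat and the dog sat on the cat")
density = lexical_density(tokens)
sttr = standardized_ttr(tokens, chunk_size=4)
```

Averaging over equal-sized chunks is what makes the ratio comparable across texts of different lengths, since the plain type-token ratio falls as a text gets longer.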
► Lexical cohesion is affected by genre-specific factors. ► Repetition is the most frequent cohesive device, followed by associative cohesion and inclusion. ► Specification ties are more common than generalization links. ► There is a direct proportional relation between number of ties and number of speakers. ► Ties are overwhelmingly realized across turns in remote-mediated ties.
Ever since the publication of Halliday and Hasan’s (1976) seminal work on cohesion, many scholars have sought to explain different aspects of this textual relation in discourse. The purpose of this paper is twofold: first, to add to the study of the interaction between lexical cohesion and coherence (Hellman, 1995; Hoey, 1991b; Sanders and Pander Maat, 2006); and second, to contribute to the exploration of lexical cohesion as a measure in generic and register analysis (Louwerse et al., 2004; Taboada, 2004; Tanskanen, 2006; Thompson, 1994).
I present an integrated model of lexical cohesion which challenges existing proposals, affording particular attention to what I call ‘associative cohesion’. Using both quantitative and qualitative methods, the adequacy of this model is tested against a 15,683-word corpus of broadcast discussions extracted from the International Corpus of English. The analysis of 11,199 lexical ties reports repetition (59%) as the most frequent lexical cohesion device, followed by associative cohesion (24%) and inclusive relations (8.2%), which are mostly produced in remote-mediated ties (81.8%) over speakers’ turns (90.7%). These are shown to be sensitive to genre-specific factors and to collaborate in topic management processes, thereby demonstrating the descriptive potential and applicability of the framework.
An Analysis Lexical Cohesion In Jakarta Post News
Batubara, Muhammad Hasyimsyah; Rahila, Cut Dara Ilfa; Ridaini, Ridaini
Journal of Linguistics, Literature and Language Teaching (Online), 11/2021, Volume: 1, Issue: 1
Journal Article
Open access
This research is about lexical cohesion (repetition, synonym, antonym, hyponym, collocation). The study's objectives were: (1) to find the types of LC in the reportage and opinion columns of Jakarta Post News and (2) to find the dominant type of LC in the reportage and opinion columns of Jakarta Post News. The researchers used library research, and the data sources were the news reports and opinion texts of the Jakarta Post News, consisting of 30 opinion texts and 30 reportage texts from the October edition. Data analysis used Miles and Huberman's model (reduction, display and verification). The results found synonym 94, repetition 87, antonym 67, hyponym 40, collocation 30. The total LC in the Jakarta Post News is 318. The dominant type of LC in the Jakarta Post is synonymy, with a total of 94 occurrences.
Lexical cohesion
Mahlberg, Michaela
International journal of corpus linguistics, 2006, Volume: 11, Issue: 3
Journal Article
Peer reviewed
Cohesion is generally described with regard to two broad categories: ‘grammatical cohesion’ and ‘lexical cohesion’. These categories reflect a view on language that treats grammar and lexis along separate lines. Language teaching textbooks on cohesion often follow this division. In contrast, a corpus theoretical approach to the description of English prioritises lexis and does not assume that lexical and grammatical phenomena can be clearly distinguished. Consequently, cohesion can be seen in a new light: cohesion is created by interlocking lexico-grammatical patterns and overlapping lexical items. A corpus theoretical approach to cohesion has important implications for English language teaching. The article looks at difficulties of teaching cohesion, shows links between communicative approaches to ELT and corpus linguistics, and suggests practical applications of corpus theoretical concepts.