We investigate the interaction between coherence and lexical cohesion in expository and persuasive texts using seven encyclopedia texts and seven fundraising letters. We describe genre structure in terms of genre-specific moves and coherence structure with Rhetorical Structure Theory. For lexical cohesion, we identify repetitions, systematic semantic relations and collocations across discourse units, modeled as weighted multigraphs. By comparing the prominence of discourse units in the coherence structure with the centrality in the lexical cohesion structure in the two genres, we show that lexical cohesion is closely aligned with coherence in the expository texts, but not in the persuasive texts.
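As a rough illustration of the weighted-multigraph idea, the sketch below stores cohesive ties between discourse units as parallel weighted edges and scores each unit by the sum of its incident weights. The tie list, the weights, and the choice of weighted degree as the centrality measure are all assumptions made for this example; the abstract does not say which centrality measure or weighting scheme the study uses.

```python
from collections import defaultdict

# Hypothetical cohesive ties: (unit_a, unit_b, relation, weight).
# Several ties may link the same pair of units, which makes this a multigraph.
TIES = [
    (0, 1, "repetition", 1.0),
    (0, 1, "collocation", 0.5),
    (0, 2, "repetition", 1.0),
    (1, 2, "semantic", 0.75),
    (2, 3, "collocation", 0.5),
]

def weighted_degree(ties):
    """Sum the weights of all ties incident to each discourse unit."""
    strength = defaultdict(float)
    for a, b, _relation, weight in ties:
        strength[a] += weight
        strength[b] += weight
    return dict(strength)

def rank_units(scores):
    """Order unit ids from most to least central, breaking ties by id."""
    return sorted(scores, key=lambda unit: (-scores[unit], unit))

centrality = weighted_degree(TIES)
ranking = rank_units(centrality)
```

A ranking of this kind could then be set against the prominence of the same units in the coherence structure, which is the comparison the abstract describes.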
Purpose
Current segmentation systems almost invariably focus on linear segmentation and can only divide text into linear sequences of segments. This suits cohesive text such as news feeds but not coherent texts such as the documents of a digital library, which have hierarchical structures. To overcome the focus on linear segmentation in document segmentation and to realize hierarchical segmentation for a digital library’s structured resources, this paper aimed to propose a new multi-granularity hierarchical topic-based segmentation system (MHTSS) to decide section breaks.
Design/methodology/approach
MHTSS adopts an up-down segmentation strategy to divide a structured digital library document into a document segmentation tree. Specifically, it works in a three-stage process: document parsing, coarse segmentation based on document access structures, and fine-grained segmentation based on lexical cohesion.
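The fine-grained stage relies on lexical cohesion. One common way to operationalize that, shown here only as a minimal sketch and not as the MHTSS implementation, is to place a break wherever the vocabulary overlap between adjacent sentences drops below a threshold. The function names, the bag-of-words cosine measure, the threshold of 0.2, and the sample sentences are all assumptions for illustration.

```python
import math
import re
from collections import Counter

def bag_of_words(text):
    """Lowercased word counts for one text unit."""
    return Counter(re.findall(r"[a-z]+", text.lower()))

def cosine(a, b):
    """Cosine similarity between two word-count Counters; 0.0 if either is empty."""
    dot = sum(a[word] * b[word] for word in a if word in b)
    norm_a = math.sqrt(sum(count * count for count in a.values()))
    norm_b = math.sqrt(sum(count * count for count in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def cohesion_breaks(sentences, threshold=0.2):
    """Return each index i where a break falls between sentence i and sentence i + 1."""
    bags = [bag_of_words(sentence) for sentence in sentences]
    return [
        i for i in range(len(bags) - 1)
        if cosine(bags[i], bags[i + 1]) < threshold
    ]

sentences = [
    "The library catalogue indexes every book.",
    "Each book in the catalogue has a record.",
    "Whales migrate across the ocean.",
    "The ocean currents guide whales north.",
]
breaks = cohesion_breaks(sentences)
```

In a hierarchical setting, a function like `cohesion_breaks` would be applied inside each coarse segment produced by the earlier stages rather than to the whole document at once.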
Findings
This paper analyzed the limitations of document segmentation methods for structured digital library resources. The authors found that document access structures and lexical cohesion techniques complement each other and, in combination, allow for a better segmentation of structured digital library resources. Based on this finding, this paper proposed the MHTSS for structured digital library resources. To evaluate it, MHTSS was compared to the TT and C99 algorithms on real-world digital library corpora. The comparison showed that MHTSS achieves the top overall performance.
Practical implications
With MHTSS, digital library users can get their relevant information directly in segments instead of receiving the whole document. This will improve retrieval performance as well as dramatically reduce information overload.
Originality/value
This paper proposed MHTSS for the structured, digital library resources, which combines the document access structures and lexical cohesion techniques to decide section breaks. With this system, end-users can access a document by sections through a document structure tree.
The main purpose of this research was to determine (1) whether there is a meaningful relationship between translation students' knowledge of Lexical Cohesion Patterns (LCPs) and their performance in the translation of English texts, and (2) whether there is any relationship between participants’ gender and their performance in translating English texts. Ninety undergraduate translation students (45 male and 45 female) from Kermanshah Razi University and Payame Noor University, Illam Branch, took part in the study. They were assigned to three groups using Allen's (1992) Placement Test. Based on . The participants received 6 texts featuring different LCPs (Lexical Repetition, Synonymy, Antonymy, Super-ordinate Repetition, Hyponymic Repetition, Co-Reference, Labeling, Non-lexical Relations, and Substitution). The findings of the study lend support to the positive effect of the students’ language proficiency level, especially their knowledge of LCPs, on their performance in the translation of English texts. They also indicated that the participants’ performance differed across LCPs. The results further showed that there is no meaningful relationship between participants’ gender and their performance in the translation of English texts featuring different LCPs.
In this work, we investigate how speaker-based information and lexical-based information can be fused efficiently for topic segmentation of spoken content. While in recent work we proposed an early fusion scheme to jointly model speaker and lexical distributions, we propose here a co-segmentation framework between segmentations performed in the speaker space and in the lexical space. Experiments carried out on two distinct corpora (Radio talk show and TV Broadcast News) show that, even though the performance of speaker information alone varies and is closely related to the content structure, its integration with lexical information, either by early fusion or by co-segmentation, is always effective. Absolute gains of 16% (Radio corpus) and 5% (TV corpus) are observed for topic boundary detection performance.
The unity of a text relies heavily on its coherence and cohesion. Cohesive elements play a vital role in connecting words, phrases, clauses, and sentences, resulting in a cohesive structure that contributes to a logical and coherent organization. The comprehension of a text is greatly aided by its coherence. In this study, we examine the use of grammatical and lexical cohesive devices in texts written spontaneously by students of engineering and physical science. Furthermore, we examine how the discipline of study influences the students' capacity to generate coherent texts. A reader-centric approach is employed to identify coherent texts for subsequent analysis. A total of 197 texts that readers found coherent are examined to identify the cohesive devices used in them. The analysis reveals that cohesive devices are employed with notable quantitative and qualitative distinctions. Additionally, both groups place greater emphasis on grammatical cohesive devices, with repetition being the sole lexical cohesive device employed to achieve coherence. The discipline has an impact, though not a strong one, on the use of cohesive devices. The study's findings suggest potential implications for pedagogy.
This study investigated the reasons for misunderstanding of the texts in English Book 3 used in Iranian high schools. In order for a text to maintain its go-togetherness, it should be both cohesive and coherent. That is, it should utilize cohesive devices such as reference, substitution, ellipsis, conjunction, and lexical cohesion, the last of which accounts for the state of coherence in a text. In this analysis, these five cohesive devices were examined in each unit. Based on a calculated frequency, it was shown that ellipsis and substitution were the two cohesive devices used least often in the reading passages of this book. In addition, lexical cohesion pertinent to coherence was hardly used, causing the texts to be incoherent.
Intra-content term weighting for topic segmentation
Bouchekif, Abdessalam; Damnati, Geraldine; Charlet, Delphine
2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
Conference Proceeding
Term weighting is an important task in many applications, such as information retrieval, extraction of significant words or automatic summarization. It translates the capacity of a term to discriminate a document within a collection, or a part of a document within a whole document. This paper deals with term weighting strategies in the context of lexical-cohesion-based topic segmentation. The aim is to propose a term weighting method which does not require any external data. Weights are estimated from the content itself, which is considered as a collection of mono-thematic documents. Two approaches are proposed, and significant improvements are observed on a rich corpus covering various formats of Broadcast News shows from 8 French TV channels.
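To make the idea of estimating weights from the content itself concrete, the sketch below treats the segments of a single show as a small document collection and derives an IDF-style weight from it, with no external corpus. This is an assumed, generic TF-IDF variant chosen for illustration; the abstract does not specify the two approaches the paper proposes, and the function names and sample segments are hypothetical.

```python
import math
from collections import Counter

def intra_content_idf(segments):
    """IDF over the segments of one content item, using no external corpus."""
    n_segments = len(segments)
    document_frequency = Counter()
    for segment in segments:
        document_frequency.update(set(segment))  # count each term once per segment
    return {
        term: math.log(n_segments / count)
        for term, count in document_frequency.items()
    }

def weight_segment(segment, idf):
    """Term frequency within the segment multiplied by the intra-content IDF."""
    term_frequency = Counter(segment)
    return {term: freq * idf[term] for term, freq in term_frequency.items()}

# Hypothetical pre-tokenized segments from a single news show.
segments = [
    ["election", "vote", "minister", "news"],
    ["football", "goal", "match", "news"],
    ["storm", "rain", "forecast", "news"],
]
idf = intra_content_idf(segments)
weights = weight_segment(segments[0], idf)
```

A term such as "news" that occurs in every segment receives zero weight, so it no longer contributes to the lexical cohesion measured between neighbouring segments, while topic-specific terms keep a positive weight.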
User intention modeling is a key component for providing appropriate services
within ubiquitous and pervasive computing environments. Intention modeling
should be concentrated on inferring user activities based on the objects a
user approaches or touches. In order to support this kind of modeling, we
propose the creation of object-activity pairs based on relatedness in a
general domain. In this paper, we show our method for achieving this and
evaluate its effectiveness.