This study examines the issues involved in comparing English translations of the Holy Quran with its Arabic text from a discourse-structure perspective. Several English translations of the Quran exist, differing in structure and word choice. The order of sentences, phrases, and words varies across these translations, which affects the results of computational text analysis. Studying Quran translations from the perspective of entity coherence and lexical cohesion is a novel method for evaluating the accuracy and equivalence of existing translations, and the results may also prove useful for machine translation in the future. This research is a preliminary stage: it investigates the issues, constructs a platform, and defines some preliminary rules for comparing and evaluating the discourse structure of translations.
Automatically detecting the appropriate physical boundaries of subtopics within a document is a difficult and highly useful task in text processing. Several methods attempt to solve this problem, many of them with favorable results, although they have some shortcomings; moreover, many of these solutions depend on the application domain. We analyze two document-segmentation algorithms and compare the results obtained with each of them.
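The abstract does not name the two algorithms compared, but a common lexical-cohesion baseline for subtopic segmentation places boundaries where vocabulary overlap between adjacent sentences drops. A minimal sketch of that idea, assuming a simple bag-of-words cosine similarity and a hypothetical threshold:

```python
from collections import Counter
import math

def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    num = sum(a[w] * b[w] for w in a if w in b)
    den = (math.sqrt(sum(v * v for v in a.values()))
           * math.sqrt(sum(v * v for v in b.values())))
    return num / den if den else 0.0

def segment(sentences, threshold=0.1):
    """Place a subtopic boundary before sentence i+1 whenever lexical
    similarity between adjacent sentences falls below the threshold."""
    vecs = [Counter(s.lower().split()) for s in sentences]
    return [i + 1 for i in range(len(vecs) - 1)
            if cosine(vecs[i], vecs[i + 1]) < threshold]
```

Real systems smooth similarity over multi-sentence windows and pick boundaries at local minima rather than using a fixed threshold; this sketch only illustrates the cohesion signal being exploited.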
Undergraduate thesis. Degree in English Studies. Academic year 2013/2014.
EN: This dissertation presents a case study of the lexical patterns found in the discourse of a set of selected articles from The Guardian and The Independent, all of them dealing with the announcement of the Nobel Prizes in Literature and published over a time span of three decades. It aims to contribute to the study of how media discourse uses lexical networks to help the reader understand what it intends to communicate. Seminal research on lexical cohesion, and more concretely Tanskanen's taxonomy, forms the basis for this analysis of media discourse. The study provides information about how the use of lexical networks in articles dealing with the same topic has developed from a chronological perspective.
This paper investigates using lexical cohesion to generate a moderately fluent semantic summary from a collection of documents written in Chinese. Based on a cohesion-analysis algorithm that uses the relationships among words in the HowNet knowledge base, the system computes concept frequency rather than word frequency as a measure of importance. It combines lexical-semantic analysis with summarization principles to remove redundancy while retaining the differences among multiple documents. This approach reduces information loss due to vocabulary switching in the summarization process and, through a more general notion of relatedness based on lexical semantics, takes more distant relationships between words into account. Evaluation results show that the presented system clearly outperforms the baseline system. The system can be applied to online web-text processing.
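The abstract's key move is counting concepts rather than surface words, so that synonyms reinforce the same importance score. A minimal sketch, using a toy word-to-concept map as a stand-in for the HowNet lookup (the mapping and names are hypothetical, not the paper's actual resource):

```python
from collections import Counter

# Toy stand-in for a HowNet-style word-to-concept lookup (hypothetical).
CONCEPT = {
    "car": "vehicle", "automobile": "vehicle", "truck": "vehicle",
    "buy": "purchase", "purchase": "purchase",
}

def concept_freq(words):
    """Count frequencies over concepts instead of surface words, so
    vocabulary switching ("car" vs. "automobile") does not split the
    importance score of a single underlying concept."""
    return Counter(CONCEPT.get(w, w) for w in words)
```

With plain word frequency, "car" and "automobile" would each count once; with concept frequency they jointly raise the count of a single concept, which is the effect the summarizer relies on.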
Measuring Lexical Cohesion in a Document. Gupta, K.; Sadiq, M.; Sridhar, V.
2008 Seventh Mexican International Conference on Artificial Intelligence, Oct. 2008.
Conference Proceeding
Lexical cohesion is one of the important features for analyzing document structure and improving the accuracy of text processing. We present a hierarchical graph-based model that measures cohesion by grouping lexically cohesive units together in a text. Latent semantic analysis is used to construct a relational graph that uncovers the hidden semantics of the text by discovering new relations and their weights. A coherence score is derived from the learned weights assigned to the edges of the graph and from the number of disjoint partitions in the graph.
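The abstract combines two signals: edge weights in the relational graph and the number of disjoint partitions. A minimal sketch of how such a score could be composed, assuming the graph is already built (the LSA step is omitted, and this particular formula is illustrative, not the paper's):

```python
def coherence_score(n, edges):
    """Illustrative coherence score for a graph of n lexical units:
    mean edge weight divided by the number of disjoint partitions
    (connected components), so a fragmented graph scores lower.
    edges is a list of (u, v, weight) tuples."""
    parent = list(range(n))

    def find(x):
        # Union-find with path halving to locate a node's component.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for u, v, _ in edges:
        parent[find(u)] = find(v)

    components = len({find(i) for i in range(n)})
    mean_w = sum(w for _, _, w in edges) / len(edges) if edges else 0.0
    return mean_w / components
```

A text whose units fall into many disconnected groups is penalized even when its individual links are strong, which mirrors the abstract's use of partition count alongside edge weights.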