Even in highly developed countries, as many as 15–30% of the population can only understand texts written using a basic vocabulary. Their understanding of everyday texts is limited, which prevents them from taking an active role in society and making informed decisions regarding healthcare, legal representation, or democratic choice. Lexical simplification is a natural language processing task that aims to make text understandable to everyone by replacing complex vocabulary and expressions with simpler ones, while preserving the original meaning. It has attracted considerable attention in the last 20 years, and fully automatic lexical simplification systems have been proposed for various languages. The main obstacle to progress in the field is the absence of high-quality datasets for building and evaluating lexical simplification systems. In this study, we present a new benchmark dataset for lexical simplification in English, Spanish, and (Brazilian) Portuguese, and provide details about data selection and annotation procedures, to enable compilation of comparable datasets in other languages and domains. As the first multilingual lexical simplification dataset in which instances in all three languages were selected and annotated using comparable procedures, it is the first dataset that offers a direct comparison of lexical simplification systems for three languages. To showcase the usability of the dataset, we adapt two state-of-the-art lexical simplification systems with differing architectures (neural vs. non-neural) to all three languages (English, Spanish, and Brazilian Portuguese) and evaluate their performance on our new dataset. For a fairer comparison, we use several evaluation measures which capture varied aspects of the systems' efficacy, and discuss their strengths and weaknesses.
We find that a state-of-the-art neural lexical simplification system outperforms a state-of-the-art non-neural lexical simplification system in all three languages, according to all evaluation measures. More importantly, we find that the state-of-the-art neural lexical simplification systems perform significantly better for English than for Spanish and Portuguese, which raises the question of whether such an architecture can be used for successful lexical simplification in other languages, especially low-resource ones.
Verticillium nonalfalfae (V. nonalfalfae) is one of the most problematic hop (Humulus lupulus L.) pathogens, as the highly virulent fungal pathotypes cause severe annual yield losses due to infections of entire hop fields. In recent years, the RNA interference (RNAi) mechanism has become one of the main areas of focus in plant-fungal pathogen interaction studies and has been implicated as one of the major contributors to fungal pathogenicity. MicroRNA-like RNAs (milRNAs) have been identified in several important plant pathogenic fungi; however, to date, no milRNA has been reported in the Verticillium species. In the present study, using a high-throughput sequencing approach and extensive bioinformatics analysis, a total of 156 milRNA precursors were identified in the annotated V. nonalfalfae genome, and 27 of these milRNA precursors were selected as true milRNA candidates, with appropriate microRNA hairpin secondary structures. The stem-loop RT-qPCR assay was used for milRNA validation; a total of nine V. nonalfalfae milRNAs were detected, and their expression was confirmed. The milRNA expression patterns, determined by the absolute quantification approach, imply that milRNAs play an important role in the pathogenicity of highly virulent V. nonalfalfae pathotypes. Computational analysis predicted milRNA targets in the V. nonalfalfae genome and in the host hop transcriptome, and the activity of milRNA-mediated RNAi target cleavage was subsequently confirmed for two selected endogenous fungal target gene models using the 5' RLM-RACE approach.
• We built state-of-the-art lexical simplification systems for Spanish.
• We produced new resources for lexical simplification for Spanish.
• New resources improve grammaticality of simplified output.
• New resources improve meaning preservation during simplification.
• New resources increase the number and correctness of lexical changes.
The current bottleneck of all data-driven lexical simplification (LS) systems is scarcity and small size of parallel corpora (original sentences and their manually simplified versions) used for training. This is especially pronounced for languages other than English. We address this problem, taking Spanish as an example of such a language, by building new simplification-specific datasets of synonyms and paraphrases using freely available resources. We test their usefulness in the LS task by adding them, in various combinations, to the existing text simplification (TS) training dataset in a phrase-based statistical machine translation (PBSMT) approach. Our best systems significantly outperform the state-of-the-art LS systems for Spanish, by the number of transformations performed and the grammaticality, simplicity and meaning preservation of the output sentences. The results of a detailed manual analysis show that some of the newly built TS resources, although they have a good lexical coverage and lead to a high number of transformations, often change the original meaning and do not generate simpler output when used in this PBSMT setup. The good combinations of these additional resources with the TS training dataset and a good choice of language model, in contrast, improve the lexical coverage and produce sentences which are grammatical, simpler than the original, and preserve the original meaning well.
Olive is considered one of the oldest and most important cultivated fruit trees in Albania. In the present study, the genetic diversity and structure of Albanian olive germplasm is represented by a set of 194 olive genotypes collected in situ in their natural ecosystems and in the ex situ collection. The study was conducted using 26 microsatellite markers (14 genomic SSR and 12 Expressed Sequence Tag microsatellites). The identity analysis revealed 183 unique genotypes. Genetic distance-based and model-based Bayesian analyses were used to investigate the genetic diversity, relatedness, and the partitioning of the genetic variability among the Albanian olive germplasm. The genetic distance-based analysis grouped olives into 12 clusters, with an average similarity of 50.9%. Albanian native olives clustered in one main group separated from introduced foreign cultivars, which was also supported by Principal Coordinate Analysis (PCoA) and model-based methods. A core collection of 57 genotypes representing all allelic richness found in Albanian germplasm was developed for the first time. Herein, we report the first extended genetic characterization and structure of olive germplasm in Albania. The findings suggest that Albanian olive germplasm is a unique gene pool and provides an interesting genetic basis for breeding programs.
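The identity analysis mentioned above reduces a set of samples to its unique multilocus genotypes. As a rough illustration only, the sketch below groups samples whose microsatellite profiles match exactly at every locus. The function name `unique_genotypes`, the toy allele sizes, and the exact-match criterion are all assumptions made for this example; the study's own software and matching rules are not described in the abstract.

```python
from collections import defaultdict


def unique_genotypes(profiles):
    """Group samples that share an identical multilocus profile.

    profiles maps a sample name to a tuple of per-locus allele pairs,
    e.g. ((180, 184), (201, 201)). Allele order within a locus is ignored.
    Returns a list of groups, each group being a list of sample names.
    """
    groups = defaultdict(list)
    for sample, loci in profiles.items():
        # Sort each allele pair so (180, 184) and (184, 180) compare equal.
        key = tuple(tuple(sorted(pair)) for pair in loci)
        groups[key].append(sample)
    return list(groups.values())


# Hypothetical toy data: three samples typed at two SSR loci.
toy = {
    "A": ((180, 184), (201, 201)),
    "B": ((184, 180), (201, 201)),  # same genotype as A, alleles in another order
    "C": ((180, 186), (201, 205)),
}

groups = unique_genotypes(toy)
print(len(groups), "unique genotypes among", len(toy), "samples")
```

Exact matching is the simplest possible criterion; real SSR data usually need some tolerance for missing data and scoring errors, which this sketch does not attempt.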
Diseases caused by viruses and virus-like organisms are one of the major problems in viticulture and grapevine marketing worldwide. Therefore, rapid and accurate diagnosis and identification are crucial. In this study, we used HTS of virus- and viroid-derived small RNAs to determine the virome status of Slovenian preclonal candidates of autochthonous and local grapevine varieties (Vitis vinifera L.). The method applied to the studied vines revealed the presence of nine viruses and two viroids. All viral entities were validated and more than 160 Sanger sequences were generated and deposited in NCBI. In addition, a complete description of the co-infections in each plant studied was obtained. No vine was found to be virus- and viroid-free, and no vine was found to be infected with only one virus or viroid, while the highest number of viral entities in a plant was eight.
Aggregation operators play an important role in many theoretical and practical aspects of applied mathematics. Recently, the focus has been on operators with an annihilator; therefore, the topic of this paper is distributivity, both conditional and regular, for certain classes of aggregation operators with this property. A characterization is given of all pairs (F, G) of aggregation operators that satisfy the distributivity law, on both the whole and the restricted domain, where F is a T-uninorm in Umax and G is a t-conorm or a uninorm from Umin or Umax.
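To make the difference between regular and conditional distributivity concrete, here is a minimal numerical sketch. It tests the law F(x, G(y, z)) = G(F(x, y), F(x, z)) on a grid over the unit cube, optionally restricted by a predicate. The helper `is_distributive` and the chosen operators (product t-norm, maximum, probabilistic sum, Łukasiewicz t-conorm) are illustrative assumptions; they are not the T-uninorms or uninorms characterized in the paper, and a grid check can only refute the law, not prove it.

```python
import itertools


def is_distributive(F, G, steps=21, tol=1e-12, condition=None):
    """Test F(x, G(y, z)) == G(F(x, y), F(x, z)) on a uniform grid in [0, 1]^3.

    condition, if given, is a predicate (x, y, z) -> bool that restricts the
    domain on which the law is tested (conditional distributivity).
    """
    grid = [i / (steps - 1) for i in range(steps)]
    for x, y, z in itertools.product(grid, repeat=3):
        if condition is not None and not condition(x, y, z):
            continue
        left = F(x, G(y, z))
        right = G(F(x, y), F(x, z))
        if abs(left - right) > tol:
            return False
    return True


def product(x, y):
    return x * y


def prob_sum(x, y):
    return x + y - x * y


def lukasiewicz_conorm(x, y):
    return min(1.0, x + y)


# Regular distributivity on the whole domain.
print(is_distributive(product, max))                 # product over maximum
print(is_distributive(product, prob_sum))            # product over probabilistic sum
print(is_distributive(product, lukasiewicz_conorm))  # product over Lukasiewicz t-conorm

# Conditional distributivity: test only where G(y, z) < 1.
print(is_distributive(
    product,
    lukasiewicz_conorm,
    condition=lambda x, y, z: lukasiewicz_conorm(y, z) < 1.0,
))
```

The last two calls show the point of the restricted domain: the product fails to distribute over the Łukasiewicz t-conorm on the whole cube, but the law holds once triples with G(y, z) = 1 are excluded.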
The aim of this study is to contribute to understanding the mechanisms underlying the formation of biologically relevant minerals by comparing the properties of solid phases formed in calcium phosphate (CaP) or calcium carbonate (CaCO₃) precipitation systems, at defined initial experimental conditions: supersaturation, constituent ions ratio, ionic strength, and/or presence of relevant inorganic ions. Thus, three systems of different chemical complexities were investigated: (a) system containing constituent ions, (b) system containing additional co-ions, and (c) system with higher ionic strength and addition of Mg²⁺. The respective precipitation diagrams were constructed, and supersaturation domains of different CaP and CaCO₃ solid phases formation were identified. The obtained results may have implications not only for biomineralization and geochemistry, but also for materials science in general.
Making It Simplext
Saggion, Horacio; Štajner, Sanja; Bott, Stefan ...
ACM Transactions on Accessible Computing, 06/2015, Volume 6, Issue 4
Journal Article
Peer reviewed
The way in which a text is written can be a barrier for many people. Automatic text simplification is a natural language processing technology that, when mature, could be used to produce texts that are adapted to the specific needs of particular users. Most research in the area of automatic text simplification has dealt with the English language. In this article, we present results from the Simplext project, which is dedicated to automatic text simplification for Spanish. We present a modular system with dedicated procedures for syntactic and lexical simplification that are grounded in the analysis of a corpus manually simplified for people with special needs. We carried out an automatic evaluation of the system’s output, taking into account the interaction between three different modules dedicated to different simplification aspects. One evaluation is based on readability metrics for Spanish and shows that the system is able to reduce the lexical and syntactic complexity of the texts. We also show, by means of a human evaluation, that sentence meaning is preserved in most cases. Although our work represents the first automatic text simplification system for Spanish that addresses different linguistic aspects, our results are comparable to the state of the art in English automatic text simplification.
The issue of conditional distributivity, also called restricted distributivity, which is a form of relaxed distributivity on a restricted domain, is crucial for many different areas such as utility theory and integration theory. The focus of this paper is on this specific form of distributivity for a continuous semi-t-operator with respect to a continuous t-conorm and for a continuous semi-t-operator with respect to a uninorm of the form Umin or Umax with continuous underlying t-norm and t-conorm.
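For orientation, one common way of writing the law for a binary operator F on [0, 1] with respect to a t-conorm G is given below. The restriction G(y, z) < 1 is the usual choice in the literature when G is a t-conorm; it is stated here as an assumption, since the abstract does not spell out the restricted domain, and the domain used for uninorms from Umin or Umax may differ.

```latex
% Distributivity of F over G on the whole domain:
F\bigl(x, G(y, z)\bigr) = G\bigl(F(x, y), F(x, z)\bigr)
\quad \text{for all } x, y, z \in [0, 1].

% Conditional (restricted) distributivity, assuming G is a t-conorm:
F\bigl(x, G(y, z)\bigr) = G\bigl(F(x, y), F(x, z)\bigr)
\quad \text{for all } x, y, z \in [0, 1] \text{ such that } G(y, z) < 1.
```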
The full characterization of different pairs of aggregation operators that fulfill the distributivity law is crucial for many different areas such as utility theory and integration theory. This paper addresses the problem of distributivity between uni-nullnorms and Mayor's aggregation operators. The study presented here is the next step in research on this topic and extends some previously obtained results.