Most of the relevant Big Data processing frameworks (e.g., Apache Hadoop, Apache Spark) only support JVM (Java Virtual Machine) languages by default. To support non-JVM languages, subprocesses are created and connected to the framework using system pipes. With this technique, data cannot be managed at thread level and performance drops significantly. To address this problem we introduce Ignis, a new Big Data framework that creates multi-language executors managed through an RPC system. As a consequence, the new system can natively execute applications implemented in non-JVM languages. In addition, Ignis allows users to implement each computational task of an application in the best-suited programming language without additional overhead. The system runs completely inside Docker containers, isolating the execution environment from the physical machine. A comparison with Apache Spark shows the advantages of our proposal in terms of performance and scalability.
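The pipe-based technique described above can be illustrated with a minimal sketch. It is not code from Ignis, Hadoop, or Spark: the worker script, the line-oriented text protocol, and the function name `run_worker_over_pipes` are assumptions chosen for illustration. A driver launches an external process and exchanges records with it over stdin and stdout, so every record is serialised to text and copied between processes.

```python
import subprocess
import sys

# Hypothetical worker: reads one integer per line from stdin and
# writes its square to stdout. It stands in for a non-JVM executor.
WORKER_SOURCE = (
    "import sys\n"
    "for line in sys.stdin:\n"
    "    print(int(line) ** 2)\n"
)


def run_worker_over_pipes(values):
    """Send values to a child process through pipes and collect the results."""
    payload = "".join(f"{v}\n" for v in values)
    completed = subprocess.run(
        [sys.executable, "-c", WORKER_SOURCE],
        input=payload,
        capture_output=True,
        text=True,
        check=True,
    )
    return [int(line) for line in completed.stdout.split()]


if __name__ == "__main__":
    print(run_worker_over_pipes([1, 2, 3]))
```

In this sketch the driver only sees text crossing the pipe and has no access to the worker's memory, which is one way to picture why data cannot be managed at thread level with this approach.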
• A new step forward toward the real convergence of HPC and Big Data worlds.
• Efficient execution of multi-language applications without additional overhead.
• Outperforms Spark considering some of the most typical algorithmic models.
• Ignis API inspired by Spark to facilitate the adoption by the research community.
• A completely isolated framework running inside Docker containers.
In this paper we propose a scalable platform for real-time processing of Social Media data. The platform ingests huge amounts of content, such as Social Media posts or comments, and can support Public Health surveillance tasks. The processing and analytical needs of multiple screening tasks can easily be handled by incorporating user-defined
. The design is modular and supports different processing elements, such as crawlers to extract relevant contents or classifiers to categorise Social Media. We describe here an implementation of a use case built on the platform that monitors Social Media users and detects early signs of depression.
Political bots, through astroturfing and other strategies, have become important players in recent elections in several countries. This study aims to provide researchers and the citizenry with the necessary knowledge to design strategies to identify bots and counteract what international organizations have deemed bots’ harmful effects on democracy and, simultaneously, to improve their automatic detection. This study is based on two innovative methodological approaches: (1) dealing with bots using hybrid intelligence (HI), a multidisciplinary perspective that combines artificial intelligence (AI), natural language processing, political science, and communication science, and (2) applying framing theory to political bots. This paper contributes to the literature in the field by (a) applying framing to the analysis of political bots, (b) defining characteristics to identify signs of automation in Spanish, (c) building a Spanish-language bot database, (d) developing a specific classifier for Spanish-language accounts, (e) using HI to detect bots, and (f) developing tools that enable the everyday citizen to identify political bots through framing.
The Luna Valley complex geosite (northwestern Spain) is a region of geoheritage significance located in an area with high environmental value. Geological studies began in the mid-20th century and continue to provide scientific data of significant relevance to the knowledge regarding the Palaeozoic stratigraphy of northern Gondwana and the tectonics of the Variscan orogen. This region also has high value for geoeducation, being visited regularly by both students and the general public. Educational use of the area has promoted the creation of several publicly available materials and activities that include trails, guides, displays and brochures, as well as the development of a small museum. However, over time, weathering; the abandonment of rural life; and the intensive, uncontrolled, and careless use of this region as a geosite for scientific and educational purposes have led to significant degradation and the consequent loss of its geoheritage value. This paper describes the geology of five key geosites in the Luna Valley. This is followed by a review of the promotional initiatives carried out in the area. These data, along with our knowledge of the area, allow us to develop a heritage analysis that includes the main geological interests, conservation status and some key management issues for each of these five individual sites. Several recommendations aim to control the physical degradation of the geosites, encourage their regular monitoring and the updating of the outreach materials using virtual tools, and promote the involvement of the local population in the conservation of this unique site.
In this paper, we explore a real-time automation challenge: the problem of focused extraction of Social Media users. This challenge can be seen as a special form of focused crawling where the main target is to detect users with certain patterns. Given a specific user profile, the task consists of rapidly ingesting Social Media data and detecting target users early. This is a real-time intelligent automation task that has numerous applications in domains such as safety, health or marketing. The volume and dynamics of Social Media content demand efficient real-time solutions able to predict which users are worth exploring. To meet this aim, we propose and evaluate several methods that effectively allow us to harvest relevant users. Even with little contextual information (e.g., a single user submission), our methods quickly focus on the most promising users. We also developed a distributed microservice architecture that supports real-time parallel extraction of Social Media users. This modular architecture scales up in clusters of computers and can be easily adapted for user extraction in multiple domains and Social Media sources. Our experiments suggest that some of the proposed prioritisation methods, which work with minimal user context, are effective at rapidly focusing on the most relevant users. These methods perform satisfactorily with huge volumes of users and interactions and lead to harvest ratios 2 to 9 times higher than those achieved by random prioritisation.
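The notion of a harvest ratio under prioritised exploration can be sketched as follows. This is only an illustration, not the paper's prioritisation methods: the scores, the user identifiers, and the function names `harvest_ratio` and `explore_by_priority` are hypothetical, and the harvest ratio is assumed here to be the fraction of explored users that turn out to be relevant.

```python
import heapq


def harvest_ratio(explored, relevant):
    """Fraction of explored users that belong to the relevant set."""
    if not explored:
        return 0.0
    hits = sum(1 for user in explored if user in relevant)
    return hits / len(explored)


def explore_by_priority(scores, budget):
    """Return the `budget` users with the highest priority score."""
    return heapq.nlargest(budget, scores, key=scores.get)


if __name__ == "__main__":
    # Hypothetical priority scores, e.g. derived from a single submission.
    scores = {"u1": 0.9, "u2": 0.1, "u3": 0.8, "u4": 0.2}
    relevant = {"u1", "u3"}
    chosen = explore_by_priority(scores, budget=2)
    print(chosen, harvest_ratio(chosen, relevant))
```

Comparing the ratio obtained this way with the ratio of a random selection of the same size gives the kind of relative improvement the abstract reports.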
Gastroenteropancreatic neuroendocrine tumors are rare neoplasms distributed along the digestive tract. They have distinctive features, such as silver salt uptake, expression of neuroendocrine cell markers, and hormone-containing secretory granules. Depending on their size, anatomical location, and the presence of metastases, these tumors present with different clinical characteristics and prognoses. Early diagnosis, which requires a high degree of suspicion and confirmation with specialized studies, is invaluable for treating these lesions in time and increasing patient survival. Surgical treatment is the first-line approach, and other medical therapies help improve symptoms and quality of life in patients with unresectable lesions. This review covers the most relevant aspects of the classification, morphology, location, diagnosis, and treatment of these gastrointestinal neoplasms and, finally, presents the only Colombian experience on the epidemiology and management of neuroendocrine tumors.
Background: the growing resistance of Helicobacter pylori to antibiotics leads to failure of eradication therapy, so changes are sought not only to its duration but also to the antibiotic regimen. Materials and methods: after randomized assignment, two standard treatment schedules (7 days in group 1 versus 10 days in group 2) were compared, each with omeprazole 20 mg plus amoxicillin 1 g and clarithromycin 500 mg, all given orally every 12 hours, in patients with non-ulcer dyspepsia (NUD) and ulcer dyspepsia (UD), to evaluate eradication effectiveness with the breath test. At one year, the clinical response to each therapy was compared again in patients with NUD and UD. Tolerance to therapy was also evaluated in each group. Results: 149 patients were randomly assigned to group 1 and 144 patients to group 2. The eradication rate in the intention-to-treat analysis was 67.8% in group 1 and 74.3% in group 2 (p=0.24), and in the per-protocol analysis it was 72.1% and 81.1% (p=0.08), respectively. The eradication rate was similar in both groups regardless of the degree of H. pylori infection (p=0.22), and no differences were found in the degree of infection and the presence of NUD or UD (p=0.19). Adverse effects were more frequent in group 2 (27.5% versus 36.1%), although without statistical significance (p=0.4). The eradication rate was similar in both groups for patients with NUD (73.8% versus 81.1%) and UD (64.3% versus 73%). One-year follow-up showed that clinical manifestations were not related to whether or not the bacterium had been eradicated (p=0.7), although the clinical response of patients with UD was better than that observed in patients with NUD. Conclusions: standard therapy for 7 or 10 days is insufficient for the eradication of H. pylori, regardless of the degree of infection by this microorganism or the type of endoscopic finding (NUD or UD). Both therapies showed suboptimal eradication rates and a poor clinical response at one-year follow-up in the group with NUD.
Stomach cancer (SC) incidence and mortality are relevant public health issues worldwide. In Colombia, screening for preneoplastic lesions (PNL) and the presence of H. pylori is not routinely performed. Therefore, the aim of this study was to evaluate OLGA-OLGIM staging and the interobserver agreement in gastritis and preneoplastic lesions in patients with gastroduodenal symptoms from Colombia. A cross-sectional study was conducted in 272 patients with gastroduodenal symptoms. Gastric biopsies were taken following the Updated Sydney System with the OLGA-OLGIM classification, and the results were evaluated by two pathologists. Chronic gastritis and PNL were reported in 76% and 24% of the patients, respectively. Furthermore, 25% of the patients with PNL displayed gastric atrophy (GA) and 75% intestinal metaplasia (IM). Agreement in the histopathological reading was good for IM, variable for OLGA, and poor for the H. pylori quantity. OLGA-OLGIM stages 0-II were the most frequent (96%), while stage III (4%) and SC (4%) were the least frequent. Age and coffee consumption were associated with a higher prevalence of PNL. This work determined that 4% of the population is at high risk of developing SC and would benefit from follow-up studies. Reinforcement of training programs to improve the agreement in histopathology readings is required.
In this paper we propose a streaming approach for real-time processing of huge amounts of data. CATENAE is a library for easy building and execution of Python topologies (e.g., web crawler, classifier). Topologies are designed for deployment inside Docker containers and, thus, horizontal scaling, granular resource assignment and isolation can be achieved easily. Furthermore, each micromodule can have its own dependencies (including the Python version), and the user can limit resources such as CPU or memory per instance. We describe an implementation of a use case composed of two topologies: (1) a crawler for tracking users in social media and (2) an early risk detector of depression. We also explain how CATENAE topologies can be connected to non-Python systems.
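The idea of a topology as a chain of small processing stages can be sketched with the Python standard library alone. This is not the CATENAE API, which the abstract does not show: the names `stage` and `run_topology` are hypothetical, and threads with in-memory queues stand in for containerised micromodules and their messaging layer.

```python
import queue
import threading

_STOP = object()  # sentinel that tells a stage to shut down


def stage(transform, inbox, outbox):
    """Consume items from inbox, apply transform, and emit to outbox."""
    while True:
        item = inbox.get()
        if item is _STOP:
            outbox.put(_STOP)  # propagate shutdown to the next stage
            return
        outbox.put(transform(item))


def run_topology(items, transforms):
    """Chain one thread per transform and stream items through them."""
    queues = [queue.Queue() for _ in range(len(transforms) + 1)]
    threads = [
        threading.Thread(target=stage, args=(fn, queues[i], queues[i + 1]))
        for i, fn in enumerate(transforms)
    ]
    for t in threads:
        t.start()
    for item in items:
        queues[0].put(item)
    queues[0].put(_STOP)
    results = []
    while True:
        out = queues[-1].get()
        if out is _STOP:
            break
        results.append(out)
    for t in threads:
        t.join()
    return results


if __name__ == "__main__":
    print(run_topology(["a", "b"], [str.upper, lambda s: s + "!"]))
```

Because each stage only communicates through its queues, a stage could in principle be replaced by a separate process or container without changing its neighbours, which is the property the abstract attributes to micromodules.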