In the field of Business Process Management (BPM), modeling business processes and related data is a critical issue since process activities need to manage data stored in databases. The connection between processes and data is usually handled at the implementation level, even if modeling both processes and data at the conceptual level should help designers in improving business process models and identifying requirements for implementation. Especially in data- and decision-intensive contexts, business process activities need to access data stored both in databases and data warehouses. In this paper, we complete our approach for defining a novel conceptual view that bridges process activities and data. The proposed approach allows the designer to model the connection between business processes and database models and define the operations to perform, providing interesting insights on the overall connected perspective and hints for identifying activities that are crucial for decision support.
• The NLP software MagiCoder is introduced.
• MagiCoder automatically maps spontaneous reports into MedDRA terminology.
• We tested MagiCoder against a gold standard of about 1800 manually revised reports.
• We measured an average recall and precision of 86.9% and 91.8%, respectively.
• MagiCoder reduces the time required for encoding ADR reports.
The collection of narrative spontaneous reports is an irreplaceable source for the prompt detection of suspected adverse drug reactions (ADRs). In this task, qualified domain experts manually revise a huge number of narrative descriptions and then encode the texts according to the MedDRA standard terminology. The manual annotation of narrative documents with medical terminology is a subtle and expensive task, since the number of reports is growing day by day.
Natural Language Processing (NLP) applications can support the work of people responsible for pharmacovigilance. Our objective is to develop NLP algorithms and tools for the detection of ADR clinical terminology. Efficient applications can concretely improve the quality of the experts’ revisions. NLP software can quickly analyze narrative texts and offer an encoding (i.e., a list of MedDRA terms) that the expert has to revise and validate.
MagiCoder, an NLP algorithm, is proposed for the automatic encoding of free-text descriptions into MedDRA terms. The MagiCoder procedure is efficient in terms of computational complexity. We tested MagiCoder through several experiments. In the first one, we tested it on a large dataset of about 4500 manually revised reports, by performing an automated comparison between the human and the MagiCoder encoding. Moreover, we tested MagiCoder on a set of about 1800 reports, manually revised ex novo by domain experts, who also compared the automatic solutions with the gold reference standard. We also provide two initial experiments with reports written in English, giving initial evidence of the robustness of MagiCoder with respect to a change of language.
For the current base version of MagiCoder, we measured an average recall and precision of 86.9% and 91.8%, respectively.
From a practical point of view, MagiCoder reduces the time required for encoding ADR reports. Pharmacologists only have to review and validate the MedDRA terms proposed by the application, instead of choosing the right terms among the 70 K low level terms of MedDRA. Such an improvement in the efficiency of pharmacologists’ work also has a relevant impact on the quality of the subsequent data analysis. We developed MagiCoder for the Italian pharmacovigilance language. However, our proposal is based on a general approach, which depends neither on the considered language nor on the term dictionary.
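As an illustration of how recall and precision figures such as those reported above can be obtained, the following minimal sketch compares a proposed encoding with a gold-standard encoding for each report and then averages over reports. The per-report (macro) averaging, the function names, and the example terms are assumptions made for illustration; the abstract does not specify the exact evaluation protocol.

```python
from typing import Iterable, Set, Tuple


def precision_recall(proposed: Set[str], gold: Set[str]) -> Tuple[float, float]:
    """Precision and recall of one proposed encoding against the gold terms."""
    true_positives = len(proposed & gold)
    precision = true_positives / len(proposed) if proposed else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall


def macro_average(scores: Iterable[Tuple[float, float]]) -> Tuple[float, float]:
    """Average per-report precision and recall (assumed averaging scheme)."""
    scores = list(scores)
    n = len(scores)
    return (sum(p for p, _ in scores) / n, sum(r for _, r in scores) / n)


# Hypothetical reports: (terms proposed automatically, terms chosen by experts).
reports = [
    ({"headache", "nausea"}, {"headache", "nausea", "rash"}),
    ({"fever", "cough"}, {"fever"}),
]
per_report = [precision_recall(proposed, gold) for proposed, gold in reports]
avg_precision, avg_recall = macro_average(per_report)
print(f"precision={avg_precision:.3f} recall={avg_recall:.3f}")
```

Sets are used because an encoding is treated here as an unordered collection of distinct terms; a different choice (for example, pooling all terms before computing the ratios) would give micro-averaged figures instead.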
Text normalization into medical dictionaries is useful to support clinical tasks. A typical setting is pharmacovigilance (PV). The manual detection of suspected adverse drug reactions (ADRs) in narrative reports is time consuming, and natural language processing (NLP) provides concrete help to PV experts. In this paper, we carry out experiments for testing the performance of MagiCoder, an NLP application designed to extract MedDRA terms from narrative clinical text. Given a narrative description, MagiCoder proposes an automatic encoding. The pharmacologist reviews, (possibly) corrects, and then validates the solution. This drastically reduces the time needed for the validation of reports with respect to a completely manual encoding. In previous work, we mainly tested the performance of MagiCoder on spontaneous reports written in Italian. In this paper, we include some new features, change the experiment design, and carry out more tests on MagiCoder. Moreover, we change the language, moving to English documents. In particular, we tested MagiCoder on the CADEC dataset, a corpus of manually annotated posts about ADRs collected from social media.
Objective: The main goal of this work is to propose a framework for the visual specification and query of consistent multi-granular clinical temporal abstractions. We focus on the issue of querying patient clinical information by visually defining and composing temporal abstractions, i.e., high-level patterns derived from several time-stamped raw data. In particular, we focus on the visual specification of consistent temporal abstractions with different granularities and on the visual composition of different temporal abstractions for querying clinical databases.

Background: Temporal abstractions on clinical data provide a concise and high-level description of temporal raw data, and a suitable way to support decision making. Granularities define partitions on the time line and allow one to represent time and, thus, temporal clinical information at different levels of detail, according to the requirements coming from the represented clinical domain. The visual representation of temporal information has been studied for several years in clinical domains. Proposed visualization techniques must be easy and quick to understand, and could benefit from visual metaphors that do not lead to ambiguous interpretations. Recently, physical metaphors such as strips, springs, weights, and wires have been proposed and evaluated with clinical users for the specification of temporal clinical abstractions. Visual approaches to boolean queries have been considered in recent years and have confirmed that visual support for the specification of complex boolean queries is both an important and difficult research topic.

Methodology: We propose and describe a visual language for the definition of temporal abstractions based on a set of intuitive metaphors (striped wall, plastered wall, brick wall), allowing the clinician to use different granularities. A new algorithm, underlying the visual language, allows the physician to specify only consistent abstractions, i.e., abstractions not containing contradictory conditions on the component abstractions. Moreover, we propose a visual query language where different temporal abstractions can be composed to build complex queries: temporal abstractions are visually connected through the usual logical connectives AND, OR, and NOT.

Results: The proposed visual language allows one to simply define temporal abstractions by using intuitive metaphors, and to specify temporal intervals related to abstractions by using different temporal granularities. The physician can interact with the designed and implemented tool by point-and-click selections, and can visually compose queries involving several temporal abstractions. The evaluation of the proposed granularity-related metaphors consisted of two parts: (i) solving 30 interpretation exercises by choosing the correct interpretation of a given screenshot representing a possible scenario, and (ii) solving a complex exercise, by visually specifying through the interface a scenario described only in natural language. The exercises were done by 13 subjects. The percentages of correct answers to the interpretation exercises differed slightly across the considered metaphors (54.4% for the striped wall, 73.3% for the plastered wall, 61% for the brick wall, and 61% for no wall), but post hoc statistical analysis on means confirmed that the differences were not statistically significant. The results of the user satisfaction questionnaire related to the evaluation of the proposed granularity-related metaphors confirmed that there was no preference for any one of them. The evaluation of the proposed logical notation consisted of two parts: (i) solving five interpretation exercises, each consisting of a screenshot representing a possible scenario and three different possible interpretations, of which only one was correct, and (ii) solving five exercises, by visually defining through the interface a scenario described only in natural language. The exercises were of increasing difficulty. The evaluation involved a total of 31 subjects. Results related to this evaluation phase confirmed the soundness of the proposed solution, even in comparison with a well-known proposal based on a tabular query form (the only significant difference is that our proposal requires more time for the training phase: 21 min versus 14 min).

Discussion and conclusions: In this work we have considered the issue of visually composing and querying temporal clinical patient data. In this context we have proposed a visual framework for the specification of consistent temporal abstractions with different granularities and for the visual composition of different temporal abstractions to build (possibly) complex queries on clinical databases. A new algorithm has been proposed to check the consistency of the specified granular abstraction. From the evaluation of the proposed metaphors and interfaces and from the comparison of the visual query language with a well-known visual method for boolean queries, the soundness of the overall system has been confirmed; moreover, pros and cons and possible improvements emerged from the comparison of different visual metaphors and solutions.
In the temporal database literature, every fact stored in a database may be equipped with two temporal dimensions: the valid time, which describes the time when the fact is true in the modeled reality, and the transaction time, which describes the time when the fact is current in the database and can be retrieved. Temporal functional dependencies (TFDs) add valid time to classical functional dependencies (FDs) in order to express database integrity constraints over the flow of time. Currently, proposals dealing with TFDs adopt a point-based approach, where tuples hold at specific time points, to express integrity constraints such as “for each month, the salary of an employee depends only on his role”. To the best of our knowledge, there are no proposals dealing with interval-based temporal functional dependencies (ITFDs), where the associated valid time is represented by an interval and there is a need to represent both point-based and interval-based data dependencies. In this paper, we propose ITFDs based on Allen’s interval relations and discuss their expressive power with respect to other TFDs proposed in the literature: ITFDs allow us to express interval-based data dependencies, which cannot be expressed through the existing point-based TFDs. ITFDs allow one to express constraints such as “employees starting to work the same day with the same role get the same salary” or “employees with a given role working on a project cannot start to work with the same role on another project that will end before the first one”. Furthermore, we propose new algorithms based on B-trees to efficiently verify the satisfaction of ITFDs in a temporal database. These algorithms guarantee that, starting from a relation satisfying a set of ITFDs, the updated relation still satisfies the given ITFDs.
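To make the first example constraint concrete, the sketch below checks “employees starting to work the same day with the same role get the same salary” on a small in-memory relation. It is a naive illustration that groups tuples with a hash map, not the B-tree-based algorithms proposed in the paper; the `Contract` schema, the integer day encoding, and the function name are assumptions made for this example.

```python
from collections import defaultdict
from typing import List, NamedTuple, Tuple


class Contract(NamedTuple):
    """Hypothetical tuple with an interval valid time [start, end]."""
    employee: str
    role: str
    salary: int
    start: int  # first day of validity
    end: int    # last day of validity, start <= end


def same_start_violations(relation: List[Contract]) -> List[Tuple[Contract, Contract]]:
    """Return pairs of tuples with equal role and equal start but different salary.

    Two intervals with the same start are related by one of Allen's
    relations starts, started-by, or equals.
    """
    groups = defaultdict(list)
    for contract in relation:
        groups[(contract.role, contract.start)].append(contract)

    violations = []
    for contracts in groups.values():
        for i, first in enumerate(contracts):
            for second in contracts[i + 1:]:
                if first.salary != second.salary:
                    violations.append((first, second))
    return violations


relation = [
    Contract("ann", "engineer", 3000, start=10, end=40),
    Contract("bob", "engineer", 3200, start=10, end=25),  # same role and start as ann
    Contract("eve", "engineer", 3200, start=12, end=30),  # different start: no conflict
    Contract("dan", "analyst", 2800, start=10, end=20),   # different role: no conflict
]
for first, second in same_start_violations(relation):
    print(f"violation: {first.employee} and {second.employee}")
```

Grouping on the pair (role, start) keeps the check linear in the number of tuples plus the size of the conflicting groups; constraints based on other Allen relations, such as the second example about project end times, would need an ordered index over interval endpoints instead.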
The joint modeling of genetic data and brain imaging information allows for determining the pathophysiological pathways of neurodegenerative diseases such as Alzheimer’s disease (AD). This task has typically been approached using mass-univariate methods that rely on a complete set of Single Nucleotide Polymorphisms (SNPs) to assess their association with selected image-derived phenotypes (IDPs). However, such methods are prone to multiple comparisons bias and, most importantly, fail to account for potential cross-feature interactions, resulting in insufficient detection of significant associations. Ways to overcome these limitations while reducing the number of traits aim at conveying genetic information at the gene level and capturing the integrated genetic effects of a set of genetic variants, rather than looking at each SNP individually. Their associations with brain IDPs are still largely unexplored in the current literature, though they can uncover new potential genetic determinants for brain modulations in the AD continuum. In this work, we explored an explainable multivariate model to analyze the genetic basis of the grey matter modulations, relying on the AD Neuroimaging Initiative (ADNI) phase 3 dataset. Cortical thicknesses and subcortical volumes derived from T1-weighted Magnetic Resonance were considered to describe the imaging phenotypes. At the same time, the genetic counterpart was represented by gene variant scores extracted by the Sequence Kernel Association Test (SKAT) filtering model. Moreover, transcriptomic analysis was carried out to assess the expression of the resulting genes in the main brain structures as a form of validation. Results highlighted meaningful genotype–phenotype interactions as defined by three latent components showing a significant difference in the projection scores between patients and controls.
Among the significant associations, the model highlighted EPHX1 and BCAS1 gene variant scores involved in neurodegenerative and myelination processes, hence relevant for AD. In particular, the first was associated with decreased subcortical volumes and the second with decreased temporal lobe thickness. Notably, BCAS1 is particularly expressed in the dentate gyrus. Overall, the proposed approach allowed capturing genotype–phenotype interactions in a restricted study cohort that were confirmed by transcriptomic analysis, offering insights into the underlying mechanisms of neurodegeneration in AD in line with previous findings and suggesting new potential disease biomarkers.
• PLS reveals genotype–phenotype interactions in AD continuum.
• Using SKAT filtering, gene-based variant scores were derived in a restricted cohort.
• Gene variant scores are associated with brain modulations.
• EPHX1 and BCAS1 are tied to neurodegeneration.
• Transcriptomic validation reinforces model insights.
A large volume of research in temporal data mining focuses on discovering temporal rules from time-stamped data. The majority of the methods proposed so far have been mainly devoted to the mining of temporal rules which describe relationships between data sequences or instantaneous events and do not consider the presence of complex temporal patterns in the dataset. Such complex patterns, such as trends or up and down behaviors, are often very interesting for the users. In this paper we propose a new kind of temporal association rule and the related extraction algorithm; the learned rules involve complex temporal patterns in both their antecedent and consequent. Within our proposed approach, the user defines a set of complex patterns of interest that constitute the basis for the construction of the temporal rule; such complex patterns are represented and retrieved in the data through the formalism of knowledge-based Temporal Abstractions. An Apriori-like algorithm then looks for meaningful temporal relationships (in particular, precedence temporal relationships) among the complex patterns of interest. The paper presents the results obtained by the rule extraction algorithm on a simulated dataset and on two different datasets related to biomedical applications: the first one concerns the analysis of time series coming from the monitoring of different clinical variables during hemodialysis sessions, while the other one deals with the biological problem of inferring relationships between genes from DNA microarray data.
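The idea of a precedence relationship between complex patterns can be illustrated with a small sketch. Here each pattern is assumed to have already been detected as a list of intervals (episodes), and a rule "antecedent precedes consequent" is scored by the fraction of antecedent episodes followed by a consequent episode within a maximum gap. The `precedes` definition, the `max_gap` parameter, and this notion of confidence are assumptions for illustration only, not the definitions used by the paper's algorithm.

```python
from typing import List, Tuple

Interval = Tuple[int, int]  # (start, end) with start <= end


def precedes(a: Interval, b: Interval, max_gap: int) -> bool:
    """True when b starts at or after the end of a, within max_gap time units."""
    return 0 <= b[0] - a[1] <= max_gap


def rule_confidence(antecedent: List[Interval],
                    consequent: List[Interval],
                    max_gap: int) -> float:
    """Fraction of antecedent episodes followed by some consequent episode."""
    if not antecedent:
        return 0.0
    followed = sum(
        1 for a in antecedent
        if any(precedes(a, b, max_gap) for b in consequent)
    )
    return followed / len(antecedent)


# Hypothetical episodes, e.g. "increasing trend" and "decreasing trend".
increasing = [(0, 3), (10, 12), (20, 25)]
decreasing = [(4, 6), (30, 35)]
print(rule_confidence(increasing, decreasing, max_gap=2))
```

An Apriori-like search would evaluate such a score for candidate pattern combinations and keep only those above user-defined thresholds; that search is not shown here.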