Abstract
Gender diversity and the lack of women in leadership in academia have been issues of academic interest for decades. However, little is known about gender diversity at academic conferences as an essential aspect of academia. We investigated 86,719 contributions to International Communication Association (ICA) conferences over the past 18 years with regard to female and male authorship and how it changed following the introduction of childcare, during the global pandemic, and under female division leadership. Lastly, we analyzed divisions/interest groups, authors’ gender, and national affiliation. We found that the proportion of female authors is high in all conference years and is representative of ICA membership. We found differences in how women and men are represented across divisions, countries of author affiliation, based on the availability of childcare, and during the global pandemic. We discuss implications at societal, organizational, and individual levels.
Continuous blood pressure (BP) estimation using pulse transit time (PTT) is a promising method for unobtrusive BP measurement. However, the accuracy of this approach must be improved for it to be viable for a wide range of applications. This study proposes a novel continuous BP estimation approach that combines data mining techniques with a traditional mechanism-driven model. First, 14 features derived from simultaneous electrocardiogram and photoplethysmogram signals were extracted for beat-to-beat BP estimation. A genetic algorithm-based feature selection method was then used to select BP indicators for each subject. Multivariate linear regression and support vector regression were employed to develop the BP model. The accuracy and robustness of the proposed approach were validated for static, dynamic, and follow-up performance. Experimental results based on 73 subjects showed that the proposed approach exhibited excellent accuracy in static BP estimation, with a correlation coefficient and mean error of 0.852 and -0.001 ± 3.102 mmHg for systolic BP, and 0.790 and -0.004 ± 2.199 mmHg for diastolic BP. Similar performance was observed for dynamic BP estimation. The robustness results indicated that the estimation accuracy was lower by a certain degree one day after model construction but was relatively stable from one day to six months after construction. The proposed approach outperforms the state-of-the-art PTT-based model, with an approximately 2-mmHg reduction in the standard deviation at different time intervals, thus providing potentially novel insights for cuffless BP estimation.
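The accuracy figures quoted above (a correlation coefficient and a mean error ± standard deviation) can be computed along the lines of the following minimal sketch. The function name `bp_agreement` and the sample readings are hypothetical and are not taken from the study; the sketch only illustrates the metrics, not the estimation model itself.

```python
import math

def bp_agreement(reference, estimated):
    """Return (pearson_r, mean_error, sd_error) for paired BP readings in mmHg."""
    n = len(reference)
    if n != len(estimated) or n < 2:
        raise ValueError("need two equal-length series with at least 2 points")

    # Error is defined here as estimated minus reference (an assumption).
    errors = [e - r for r, e in zip(reference, estimated)]
    mean_err = sum(errors) / n
    # Sample standard deviation of the errors (n - 1 in the denominator).
    sd_err = math.sqrt(sum((x - mean_err) ** 2 for x in errors) / (n - 1))

    # Pearson correlation between reference and estimated values.
    mean_ref = sum(reference) / n
    mean_est = sum(estimated) / n
    cov = sum((r - mean_ref) * (e - mean_est) for r, e in zip(reference, estimated))
    var_ref = sum((r - mean_ref) ** 2 for r in reference)
    var_est = sum((e - mean_est) ** 2 for e in estimated)
    pearson_r = cov / math.sqrt(var_ref * var_est)
    return pearson_r, mean_err, sd_err

# Hypothetical systolic readings (mmHg): cuff reference vs. model estimate.
reference_sbp = [118.0, 122.0, 125.0, 130.0]
estimated_sbp = [119.0, 121.0, 126.0, 129.0]
r, mean_err, sd_err = bp_agreement(reference_sbp, estimated_sbp)
```

A mean error near zero with a small standard deviation, as in the reported results, indicates little systematic bias and limited scatter around the reference.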
Abstract
Europe PMC (https://europepmc.org) is a database of research articles, including peer reviewed full text articles and abstracts, and preprints - all freely available for use via website, APIs and bulk download. This article outlines new developments since 2017 where work has focussed on three key areas: (i) Europe PMC has added to its core content to include life science preprint abstracts and a special collection of full text of COVID-19-related preprints. Europe PMC is unique as an aggregator of biomedical preprints alongside peer-reviewed articles, with over 180 000 preprints available to search. (ii) Europe PMC has significantly expanded its links to content related to the publications, such as links to Unpaywall, providing wider access to full text, preprint peer-review platforms, all major curated data resources in the life sciences, and experimental protocols. The redesigned Europe PMC website features the PubMed abstract and corresponding PMC full text merged into one article page; there is more evident and user-friendly navigation within articles and to related content, plus a figure browse feature. (iii) The expanded annotations platform offers ∼1.3 billion text mined biological terms and concepts sourced from 10 providers and over 40 global data resources.
Given the competition for top journal space, there is an incentive to produce "significant" results. With the combination of unreported tests, lack of adjustment for multiple tests, and direct and indirect p-hacking, many of the results being published will fail to hold up in the future. In addition, there are basic issues with the interpretation of statistical significance. Increasing thresholds may be necessary, but still may not be sufficient: if the effect being studied is rare, even t > 3 will produce a large number of false positives. Here I explore the meaning and limitations of a p-value. I offer a simple alternative (the minimum Bayes factor). I present guidelines for a robust, transparent research culture in financial economics. Finally, I offer some thoughts on the importance of risk-taking (from the perspective of authors and editors) to advance our field.
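The point about rare effects can be illustrated with a short sketch. It assumes the common form of the minimum Bayes factor for a t-statistic, exp(-t²/2); the abstract itself does not state the formula, and the function names and the 1% prior below are hypothetical choices made for illustration.

```python
import math

def minimum_bayes_factor(t_stat):
    """Minimum Bayes factor (null vs. alternative), assumed form exp(-t^2 / 2)."""
    return math.exp(-(t_stat ** 2) / 2.0)

def posterior_prob_null(t_stat, prior_prob_alt):
    """Lower bound on P(null | data) given the prior probability of a true effect."""
    if not 0.0 < prior_prob_alt < 1.0:
        raise ValueError("prior_prob_alt must lie strictly between 0 and 1")
    # Prior odds in favour of the null hypothesis.
    prior_odds_null = (1.0 - prior_prob_alt) / prior_prob_alt
    # Bayes' rule in odds form: posterior odds = Bayes factor * prior odds.
    posterior_odds_null = minimum_bayes_factor(t_stat) * prior_odds_null
    return posterior_odds_null / (1.0 + posterior_odds_null)

# Hypothetical scenario: only 1 in 100 tested effects is real, and t = 3.
mbf = minimum_bayes_factor(3.0)
p_null = posterior_prob_null(3.0, prior_prob_alt=0.01)
```

Under these assumptions the probability that the null is true stays slightly above one half even at t = 3, which is the sense in which a higher threshold alone may not be sufficient for rare effects.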
The quality of education is one of the pillars of sustainable development, as set out in “The 2030 Agenda for Sustainable Development”, adopted by all United Nations Member States in 2015. Recent social and technological developments, as well as events such as the COVID-19 pandemic or conflicts in many parts of the world, have led to essential changes in the way education processes are carried out. In addition, they have made it possible to generate, collect and store large amounts of data related to these processes, data that can hide useful information for decisions that, in the medium or long term, can lead to a significant increase in the quality of education. Uncovering this information is the subject of Educational Data Mining. To understand the state of the art reflected by recent developments, trends, theories, methodologies, and applications in this field in the European Union, we considered it appropriate to conduct a systematic and critical literature review. Our paper aims to identify, analyze, and synthesize relevant information from these articles, both to build a foundation for further studies and to identify gaps or unexplored issues that can be addressed in future research. The analysis is based on research identified in three international databases recognized for content quality: Scopus, ScienceDirect, and IEEE Xplore.
Malaria is a life-threatening disease caused by Plasmodium parasites that infect the red blood cells (RBCs). Manual identification and counting of parasitized cells in microscopic thick/thin-film blood examination remains the common, but burdensome method for disease diagnosis. Its diagnostic accuracy is adversely impacted by inter/intra-observer variability, particularly in large-scale screening under resource-constrained settings.
State-of-the-art computer-aided diagnostic tools are based on data-driven deep learning algorithms such as convolutional neural networks (CNNs), which have become the architecture of choice for image recognition tasks. However, CNNs suffer from high variance and may overfit due to their sensitivity to training data fluctuations.
The primary aim of this study is to reduce model variance and improve robustness and generalization by constructing model ensembles for detecting parasitized cells in thin-blood smear images.
We evaluate the performance of custom and pretrained CNNs and construct an optimal model ensemble for the challenge of classifying parasitized and normal cells in thin-blood smear images. Cross-validation studies are performed at the patient level to prevent data leakage into the validation set and to reduce generalization error. The models are evaluated in terms of the following performance metrics: (a) Accuracy; (b) Area under the receiver operating characteristic (ROC) curve (AUC); (c) Mean squared error (MSE); (d) Precision; (e) F-score; and (f) Matthews Correlation Coefficient (MCC).
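Two of the ideas above, patient-level splitting and the threshold-based metrics, can be sketched as follows. This is a minimal illustration under stated assumptions: the function names, the round-robin fold assignment, and the toy labels are hypothetical, and the study's actual splitting procedure is not described in the abstract.

```python
import math

def patient_level_folds(patient_ids, n_folds):
    """Assign every sample of a patient to the same fold (round-robin over patients)."""
    unique_patients = sorted(set(patient_ids))
    fold_of = {pid: i % n_folds for i, pid in enumerate(unique_patients)}
    return [fold_of[pid] for pid in patient_ids]

def binary_metrics(y_true, y_pred):
    """Accuracy, precision, F-score and MCC from binary labels (1 = parasitized)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)

    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f_score = (2 * precision * recall / (precision + recall)
               if (precision + recall) else 0.0)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "f_score": f_score, "mcc": mcc}

# Hypothetical cell images, each tagged with the patient it came from.
patients = ["p1", "p1", "p2", "p3", "p3", "p2"]
folds = patient_level_folds(patients, n_folds=2)

# Hypothetical ground truth and predictions for six cells.
metrics = binary_metrics([1, 1, 1, 0, 0, 0], [1, 1, 0, 0, 0, 1])
```

Because all cells from one patient share a fold, no patient contributes images to both the training and validation sides of a split, which is the leakage that patient-level cross-validation is meant to prevent.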
It is observed that the ensemble model constructed with VGG-19 and SqueezeNet outperformed the state-of-the-art in several performance metrics in classifying parasitized and uninfected cells, which can aid in improved disease screening.
Ensemble learning reduces model variance by optimally combining the predictions of multiple models, and it decreases sensitivity to the specifics of the training data and the choice of training algorithm. The performance of the model ensemble simulates real-world conditions, with reduced variance and overfitting, and leads to improved generalization.
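As a rough illustration of combining model predictions, the sketch below averages per-cell probabilities from two models and thresholds the result. Simple unweighted averaging is an assumption made here; the abstract does not specify how the ensemble combines its members, and the probability values are hypothetical.

```python
def average_ensemble(model_probs):
    """Average per-sample probabilities across models (unweighted)."""
    if not model_probs:
        raise ValueError("at least one model is required")
    n_samples = len(model_probs[0])
    if any(len(probs) != n_samples for probs in model_probs):
        raise ValueError("all models must score the same samples")
    n_models = len(model_probs)
    return [sum(probs[i] for probs in model_probs) / n_models
            for i in range(n_samples)]

def classify(probs, threshold=0.5):
    """Label a sample as parasitized (1) when its probability reaches the threshold."""
    return [1 if p >= threshold else 0 for p in probs]

# Hypothetical P(parasitized) outputs of two models for three cells.
model_a = [0.9, 0.4, 0.2]
model_b = [0.7, 0.8, 0.1]
ensemble_probs = average_ensemble([model_a, model_b])
labels = classify(ensemble_probs)
```

In this toy case the second cell is rejected by one model alone but accepted by the average, showing how combining predictions dampens the effect of a single model's error.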
•Privacy-preservation issue for a novel collaborative data model called the semi-fully distributed setting is investigated.
•A privacy-preserving Naive Bayes classification solution based on secure multi-party computation is proposed for the semi-fully distributed scenario.
•The proposed Naive Bayes classifier has the capability to guarantee the accuracy property of classification model, as well as to protect honest parties’ privacy against corrupted participants.
•The proposed Naive Bayes classification method for the semi-fully distributed setting is efficient in real-life applications.
In recent years, issues of privacy preservation in data mining and machine learning have received more and more attention from the research community. Privacy-preserving data mining and machine learning solutions enable data holders to jointly discover knowledge and valuable information, as well as construct machine learning models without privacy concerns. In this paper, we address the problem of privacy preservation for a novel data model called the semi-fully distributed setting. Unlike in existing scenarios, each record of the dataset in this data model is composed of three parts: the first part is privately kept by a data user, the second is securely stored by the miner, and the rest is publicly known by both the miner and the data user. For this new data model, we propose a privacy-preserving Naive Bayes classification solution based on secure multi-party computation. Our proposed solution not only achieves a high level of privacy but also guarantees the accuracy of the classification model. The experimental results show that the new proposal is efficient in real-life applications. Furthermore, our pioneering study paves the way for new research into privacy preservation issues for the semi-fully distributed data model.
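Naive Bayes training reduces to counting, so one generic building block from secure multi-party computation is a sum over private counts. The sketch below simulates additive secret sharing in a single process. It is not the protocol proposed in the paper, which the abstract does not describe; the modulus, function names, and counts are hypothetical, and a real deployment would need network communication and a stated threat model.

```python
import secrets

# Public modulus, assumed to be larger than any true total count.
MODULUS = 2 ** 61 - 1

def share(value, n_parties):
    """Split value into n additive shares that sum to value modulo MODULUS."""
    shares = [secrets.randbelow(MODULUS) for _ in range(n_parties - 1)]
    last = (value - sum(shares)) % MODULUS
    return shares + [last]

def secure_sum(private_values):
    """Simulate parties summing private counts without revealing any single count."""
    n = len(private_values)
    # Party i splits its private value into one share per party.
    all_shares = [share(v, n) for v in private_values]
    # Party j adds the j-th share received from every party and
    # publishes only this partial sum, which looks random on its own.
    partial_sums = [sum(all_shares[i][j] for i in range(n)) % MODULUS
                    for j in range(n)]
    # Combining the published partial sums recovers the true total.
    return sum(partial_sums) % MODULUS

# Hypothetical per-party counts of records with a given class and feature value.
private_counts = [12, 7, 30]
total = secure_sum(private_counts)
```

The total is exact, which mirrors the abstract's claim that privacy need not cost classification accuracy: the Naive Bayes probabilities computed from such totals match those from pooled data.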
Hyperspectral (HS) images are characterized by approximately contiguous spectral information, enabling the fine identification of materials by capturing subtle spectral discrepancies. Due to their excellent locally contextual modeling ability, convolutional neural networks (CNNs) have been proven to be a powerful feature extractor in HS image classification. However, CNNs fail to mine and represent the sequence attributes of spectral signatures well due to the limitations of their inherent network backbone. To solve this issue, we rethink HS image classification from a sequential perspective with transformers and propose a novel backbone network called SpectralFormer. Beyond bandwise representations in classic transformers, SpectralFormer is capable of learning spectrally local sequence information from neighboring bands of HS images, yielding groupwise spectral embeddings. More significantly, to reduce the possibility of losing valuable information in the layerwise propagation process, we devise a cross-layer skip connection to convey memory-like components from shallow to deep layers by adaptively learning to fuse "soft" residuals across layers. It is worth noting that the proposed SpectralFormer is a highly flexible backbone network that is applicable to both pixelwise and patchwise inputs. We evaluate the classification performance of the proposed SpectralFormer on three HS datasets by conducting extensive experiments, showing its superiority over classic transformers and achieving a significant improvement in comparison with state-of-the-art backbone networks. The codes of this work will be available at https://github.com/danfenghong/IEEE_TGRS_SpectralFormer for the sake of reproducibility.
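The idea of grouping neighboring bands, as opposed to treating each band on its own, can be sketched for a single pixel as follows. This is only an illustration of the grouping step: the function name, the odd window size, and the edge handling (repeating the first and last band) are assumptions, and the learned projection that would turn each group into an embedding is omitted. The linked repository is the authoritative reference for the actual method.

```python
def group_neighboring_bands(spectrum, group_size):
    """For each band, collect a window of group_size neighboring bands centered on it."""
    if group_size < 1 or group_size % 2 == 0:
        raise ValueError("group_size must be a positive odd number")
    half = group_size // 2
    last = len(spectrum) - 1
    groups = []
    for center in range(len(spectrum)):
        # Clamp indices at the spectrum edges (assumed edge handling).
        window = [spectrum[min(max(center + offset, 0), last)]
                  for offset in range(-half, half + 1)]
        groups.append(window)
    return groups

# Hypothetical reflectance values of one pixel across four spectral bands.
pixel_spectrum = [0.1, 0.2, 0.3, 0.4]
band_groups = group_neighboring_bands(pixel_spectrum, group_size=3)
```

With a group size of 1 the function returns one value per band, which corresponds to the bandwise representation that the abstract contrasts with the groupwise one.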
Top data mining tools for the healthcare industry
Santos-Pereira, Judith; Gruenwald, Le; Bernardino, Jorge
Journal of King Saud University. Computer and information sciences, September 2022, Volume 34, Issue 8
Journal Article, Peer reviewed, Open access
The healthcare industry has become increasingly challenging, requiring retrieval of knowledge from large amounts of complex data to find the best treatments. Several works have suggested the use of data mining tools to overcome the challenges; however, none of them has suggested the best tool to do so. To fill this gap, this paper presents a survey of popular open-source data mining tools, proposes tool selection criteria based on healthcare application requirements, and identifies the best tools according to those criteria. The following popular open-source data mining tools are assessed: KNIME, R, RapidMiner, Scikit-learn, and Spark. The study shows that KNIME and RapidMiner provide the largest coverage of healthcare data mining requirements.