Research grants are important for researchers to sustain a good position in academia. There are many grant opportunities available from different funding agencies. However, finding relevant grant announcements is challenging and time-consuming for researchers. To resolve the problem, we proposed a grant announcement recommendation system for National Institutes of Health (NIH) grants using researchers' publications. We formulated the recommendation as a classification problem and proposed a recommender using a state-of-the-art deep learning technique, i.e., Bidirectional Encoder Representations from Transformers (BERT), to capture intrinsic, non-linear relationships between researchers' publications and grant announcements. Internal and external evaluations were conducted to assess the system's usefulness. During internal evaluations, the grant citations were used to establish grant-publication ground truth, and results were evaluated against Recall@k, Precision@k, Mean Reciprocal Rank (MRR) and Area Under the Receiver Operating Characteristic curve (ROC-AUC). During external evaluations, researchers' publications were clustered using a Dirichlet Process Mixture Model (DPMM), the grants recommended by our model were then aggregated per cluster through Recency Weight, and finally researchers were invited to rate the recommendations to calculate Precision@k. For comparison, baseline recommenders using Okapi Best Matching (BM25), Term Frequency-Inverse Document Frequency (TF-IDF), doc2vec, and Naïve Bayes (NB) were also developed. Both internal and external evaluations (all metrics) revealed favorable performance of our proposed BERT-based recommender.
Full text
Available for:
DOBA, IZUM, KILJ, NUK, PILJ, PNG, SAZU, SIK, UILJ, UKNU, UL, UM, UPUK
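The ranking metrics named in the abstract above (Precision@k, Recall@k and MRR) can be illustrated with a minimal sketch. The grant identifiers and function names here are hypothetical, and the code uses the standard textbook definitions rather than the authors' evaluation code.

```python
from typing import Sequence, Set, Tuple


def precision_at_k(ranked: Sequence[str], relevant: Set[str], k: int) -> float:
    """Fraction of the top-k recommended items that are relevant."""
    if k <= 0:
        raise ValueError("k must be positive")
    hits = sum(1 for item in ranked[:k] if item in relevant)
    return hits / k


def recall_at_k(ranked: Sequence[str], relevant: Set[str], k: int) -> float:
    """Fraction of all relevant items that appear in the top k."""
    if not relevant:
        return 0.0
    hits = sum(1 for item in ranked[:k] if item in relevant)
    return hits / len(relevant)


def reciprocal_rank(ranked: Sequence[str], relevant: Set[str]) -> float:
    """1 / position of the first relevant item, or 0 if none is found."""
    for position, item in enumerate(ranked, start=1):
        if item in relevant:
            return 1.0 / position
    return 0.0


def mean_reciprocal_rank(queries: Sequence[Tuple[Sequence[str], Set[str]]]) -> float:
    """Average reciprocal rank over (ranking, relevant-set) pairs."""
    return sum(reciprocal_rank(r, rel) for r, rel in queries) / len(queries)


# Hypothetical example: one publication, four ranked grant announcements.
ranked_grants = ["g3", "g1", "g7", "g2"]
cited_grants = {"g1", "g2"}  # ground truth taken from grant citations
p_at_2 = precision_at_k(ranked_grants, cited_grants, 2)
r_at_2 = recall_at_k(ranked_grants, cited_grants, 2)
mrr = mean_reciprocal_rank([(ranked_grants, cited_grants), (["g9", "g8"], {"g9"})])
```

Here the first relevant grant sits at rank 2 for the first query and rank 1 for the second, so the MRR averages 1/2 and 1.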
Improvement of agricultural water use efficiency is of major concern with drought problems being one of the most important factors limiting grain production worldwide. Effective management of water for crop production in water-scarce areas requires efficient approaches. Increasing crop water use efficiency and drought tolerance by genetic improvement and physiological regulation may be a means to achieve efficient and effective use of water. A limited water supply inhibits the photosynthesis of plants, causes changes of chlorophyll contents and components and damage to photosynthetic apparatus. It also inhibits photochemical activities and decreases the activities of enzymes in plants. Water stress is one of the important factors inhibiting the growth and photosynthetic abilities of plants through disturbing the balance between the production of reactive oxygen species and the antioxidant defence, causing accumulation of reactive oxygen species which induce oxidative stress to proteins, membrane lipids and other cellular components. A number of approaches are being used to enhance water use efficiency and to minimize the detrimental effect of water stress in crop plants. Proper plant nutrition is a good strategy to enhance water use efficiency and productivity in crop plants. Plant nutrients play a very important role in enhancing water use efficiency under limited water supply. In this paper we discuss the possible effective techniques to improve water use efficiency and some macronutrients (nitrogen, phosphorus, potassium, calcium and magnesium), micronutrients (zinc, boron, iron, manganese, molybdenum and chloride), and silicon (a beneficial nutrient) in detail to show how these nutrients play their role in enhancing water use efficiency in crop plants.
As most great discoveries and advancements in science and technology invariably involve the cooperation of a group of researchers, effective collaboration is the key factor. Nevertheless, finding suitable scholars and researchers to work with is challenging and, mostly, time-consuming for many. A recommender that is capable of finding and recommending collaborators would prove helpful. In this work, we utilized a life science and biomedical research database, i.e., MEDLINE, to develop a collaboration recommendation system based on novel graph neural networks, i.e., GraphSAGE and Temporal Graph Network, which can capture intrinsic, complex, and changing dependencies among researchers, including temporal user–user interactions. Baseline methods based on LightGCN and gradient boosting trees were also developed in this work for comparison. Internal automatic evaluations and external evaluations through end-users' ratings were conducted, and the results revealed that our graph neural network recommender exhibits consistently encouraging results.
The use of Electronic Health Records (EHR)/Electronic Medical Records (EMR) data is becoming more prevalent for research. However, analysis of this type of data has many unique complications due to how the data are collected and processed and the types of questions that can be answered. This book covers many important topics related to using EHR/EMR data for research, including data extraction, cleaning, processing, analysis, inference, and prediction, based on the authors' many years of practical experience. The book carefully evaluates and compares standard statistical models and approaches with machine learning and deep learning methods and reports unbiased comparison results for these methods in predicting clinical outcomes based on EHR data.
Key Features:
Written based on hands-on experience of contributors from multidisciplinary EHR research projects, which include methods and approaches from statistics, computing, informatics, data science and clinical/epidemiological domains.
Documents the detailed experience on EHR data extraction, cleaning and preparation
Provides a broad view of statistical approaches and machine learning prediction models to deal with the challenges and limitations of EHR data.
Considers the complete cycle of EHR data analysis.
The use of EHR/EMR analysis requires close collaborations between statisticians, informaticians, data scientists and clinical/epidemiological investigators. This book reflects that multidisciplinary perspective.
Accurate estimates of natural and/or vaccine-induced antibodies to SARS-CoV-2 are difficult to obtain. Although model-based estimates of seroprevalence have been proposed, they require inputting unknown parameters including viral reproduction number, longevity of immune response, and other dynamic factors. In contrast to a model-based approach, the current study presents a data-driven detailed statistical procedure for estimating total seroprevalence (defined as antibodies from natural infection or from full vaccination) in a region using prospectively collected serological data and state-level vaccination data. Specifically, we conducted a longitudinal statewide serological survey with 88,605 participants 5 years or older with 3 prospective blood draws beginning September 30, 2020. Along with state vaccination data, as of October 31, 2021, the estimated percentage of those 5 years or older with naturally occurring antibodies to SARS-CoV-2 in Texas is 35.0% (95% CI = (33.1%, 36.9%)). This is 3x higher than the state-confirmed COVID-19 cases (11.83%) for all ages. The percentage with naturally occurring or vaccine-induced antibodies (total seroprevalence) is 77.42%. This methodology is integral to pandemic preparedness as accurate estimates of seroprevalence can inform policy-making decisions relevant to SARS-CoV-2.
Deep learning is widely used in many real-life applications. Despite their remarkable performance accuracies, deep learning networks are often poorly calibrated, which could be harmful in risk-sensitive scenarios. Uncertainty quantification offers a way to evaluate the reliability and trustworthiness of deep-learning-based model predictions. In this work, we introduced uncertainty quantification to our virtual research assistant recommender platform through both Monte Carlo dropout and ensemble techniques. We also proposed a new formula to incorporate the uncertainty estimates into our recommendation models. The experiments were carried out on two different components of the recommender platform (i.e., a BERT-based grant recommender and a temporal graph network (TGN)-based collaborator recommender) using real-life datasets. The recommendation results were compared in terms of both recommender metrics (AUC, AP, etc.) and the calibration/reliability metric (ECE). With uncertainty quantification, we were able to better understand the behavior of our regular recommender outputs: while our BERT-based grant recommender tends to be overconfident in its outputs, our TGN-based collaborator recommender tends to be underconfident in producing matching probabilities. Initial case studies also showed that our proposed model with uncertainty quantification adjustment from the ensemble gave the best-calibrated results together with the desirable recommender performance.
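The calibration metric mentioned above, expected calibration error (ECE), can be sketched as follows. This is a minimal illustration for binary match probabilities using equal-width bins. The function name, the bin count and the sample data are assumptions, and the abstract does not state which ECE variant the authors used.

```python
from typing import List, Sequence, Tuple


def expected_calibration_error(
    probs: Sequence[float], labels: Sequence[int], n_bins: int = 10
) -> float:
    """Weighted average gap between predicted probability and observed
    positive rate, computed over equal-width probability bins."""
    if len(probs) != len(labels) or len(probs) == 0:
        raise ValueError("probs and labels must be non-empty and equal length")
    bins: List[List[Tuple[float, int]]] = [[] for _ in range(n_bins)]
    for p, y in zip(probs, labels):
        # A probability of exactly 1.0 is placed in the last bin.
        index = min(int(p * n_bins), n_bins - 1)
        bins[index].append((p, y))
    total = len(probs)
    ece = 0.0
    for members in bins:
        if not members:
            continue
        mean_prob = sum(p for p, _ in members) / len(members)
        positive_rate = sum(y for _, y in members) / len(members)
        ece += (len(members) / total) * abs(positive_rate - mean_prob)
    return ece


# Hypothetical overconfident model: predicts 0.9 but is right 3 times out of 4.
overconfident = expected_calibration_error([0.9, 0.9, 0.9, 0.9], [1, 1, 1, 0])
# Hypothetical perfectly calibrated extreme predictions.
calibrated = expected_calibration_error([0.0, 1.0], [0, 1])
```

A lower value indicates better calibration. In the first example the mean predicted probability (0.9) exceeds the observed positive rate (0.75), which is the overconfident pattern the abstract describes for the grant recommender.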
Aberrant activation of the Sonic Hedgehog (SHH) gene is observed in various cancers. Previous studies have shown a "cross-talk" effect between the canonical Hedgehog signaling pathway and the Epidermal Growth Factor (EGF) pathway when SHH is active in the presence of EGF. However, the precise mechanism of the cross-talk effect on the entire gene population has not been investigated. Here, we re-analyzed publicly available data to study how SHH and EGF cooperate to affect the dynamic activity of the gene population. We used genome dynamic analysis to explore the expression profiles under different conditions in a human medulloblastoma cell line. Ordinary differential equations, equipped with solid statistical and computational tools, were exploited to extract the information hidden in the dynamic behavior of the gene population. Our results revealed that EGF stimulation plays a dominant role, overshadowing most of the SHH effects. We also identified cross-talk genes that exhibited expression profiles dissimilar to those seen under SHH or EGF stimulation alone. These unique cross-talk patterns were validated in a cell culture model. The cross-talk genes identified here may serve as valuable markers to study or test for EGF co-stimulatory effects in an SHH+ environment. Furthermore, these cross-talk genes may play roles in cancer progression; thus, they may be further explored as cancer treatment targets.
We report a new approach of using statistical context-based scores as encoded features to train neural networks to improve secondary structure prediction accuracy. The context-based scores are pseudo-potentials derived by evaluating statistical, high-order inter-residue interactions, which estimate the favorability of a residue adopting a certain secondary structure conformation within its amino acid environment. Encoding these context-based scores as important training and prediction features provides a way to address a long-standing difficulty in neural network-based secondary structure prediction: taking the interdependency among secondary structures of neighboring residues into account. Our computational results have shown that the context-based scores are effective features to enhance the accuracy of secondary structure predictions. An overall 7-fold cross-validated Q3 accuracy of 82.74% and Segment Overlap (SOV) accuracy of 86.25% are achieved on a set of more than 7987 protein chains with, at most, 25% sequence identity. The Q3 prediction accuracy on benchmarks of CB513, Manesh215, Carugo338, as well as CASP9 protein chains is higher than that of popularly used secondary structure prediction servers, including Psipred, Profphd, Jpred, Porter (ab initio), and Netsurf. More significant improvement is observed in the SOV accuracy, where more than 4% enhancement is observed, compared to the server with the best SOV accuracy. A Q8 accuracy of >70% (71.5%) is also found in eight-state secondary structure prediction. The majority of the Q3 accuracy improvement comes from correctly identifying β-sheets and α-helices. When the context-based scores are incorporated, there are 15.5% more residues predicted with >90% confidence. These high-confidence predictions usually have a rather high accuracy (∼95% on average). The three- and eight-state prediction servers (SCORPION) implementing our methods are available online.
Full text
Available for:
IJS, KILJ, NUK, PNG, UL, UM
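For readers unfamiliar with the Q3 measure reported in the abstract above, it is the percentage of residues whose three-state secondary structure (helix H, strand E, coil C) is predicted correctly. The sketch below assumes predictions and observations are given as equal-length strings over those three letters. The function name and the sequences are hypothetical, and the SOV measure is not covered.

```python
def q3_accuracy(predicted: str, observed: str) -> float:
    """Percentage of residues with a correctly predicted three-state label."""
    if len(predicted) != len(observed) or not observed:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(1 for p, o in zip(predicted, observed) if p == o)
    return 100.0 * matches / len(observed)


# Hypothetical 8-residue chain: only the last residue is mispredicted.
example_q3 = q3_accuracy("HHHEECCC", "HHHEECCH")
```

Q8 accuracy follows the same per-residue counting with eight structure classes instead of three.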
Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) infection and immunity remain uncertain in populations. The state of Texas ranks 2nd in infection with over 2.71 million cases and has seen a disproportionate rate of death across the state. The Texas CARES project was funded by the state of Texas to estimate the prevalence of SARS-CoV-2 antibody status in children and adults. Identifying strategies to understand natural as well as vaccine-induced antibody response to COVID-19 is critical.
The Texas CARES (Texas Coronavirus Antibody Response Survey) is an ongoing prospective population-based convenience sample from the Texas general population that commenced in October 2020. Volunteer participants are recruited across the state to participate in a 3-time-point data collection to assess antibody response over time. We use the Roche Elecsys® Anti-SARS-CoV-2 Immunoassay to determine SARS-CoV-2 antibody status.
The crude antibody positivity prevalence in Phase I was 26.1% (80/307). The fully adjusted seroprevalence of the sample was 31.5%. Specifically, 41.1% of males and 21.9% of females were seropositive. For age categories, 33.5% of those 18-34; 24.4% of those 35-44; 33.2% of those 45-54; and 32.8% of those 55+ were seropositive. In this sample, 42.2% (89/211) of those negative for the antibody test reported having had a COVID-19 test.
In this survey we enrolled and analyzed data for 307 participants, demonstrating a high survey and antibody test completion rate and the ability to implement a questionnaire and SARS-CoV-2 antibody testing within clinical settings. We were also able to determine our capability to estimate the cross-sectional seroprevalence within Texas's federally qualified health centers (FQHCs). The crude positivity prevalence for SARS-CoV-2 antibodies in this sample was 26.1%, indicating potentially high exposure to COVID-19 for clinic employees and patients. Data will also allow us to understand sex, age and chronic illness variation in seroprevalence by natural and vaccine-induced antibody status. These methods are being used to guide the completion of a large longitudinal survey in the state of Texas with implications for practice and population health.
To describe COVID-19 illness characteristics, risk factors, and SARS-CoV-2 serostatus by variant time period in a large community-based pediatric sample.
Data were collected prospectively over four timepoints between October 2020 and November 2022 from a population-based cohort ages 5 to 19 years old.
State of Texas, USA.
Participants ages 5 to 19 years were recruited from large pediatric healthcare systems, Federally Qualified Healthcare Centers, urban and rural clinical practices, health insurance providers, and a social media campaign.
SARS-CoV-2 infection.
SARS-CoV-2 antibody status was assessed by the Roche Elecsys® Anti-SARS-CoV-2 Immunoassay for detection of antibodies to the SARS-CoV-2 nucleocapsid protein (Roche N-test). Self-reported antigen or PCR COVID-19 test results and symptom status were also collected.
Over half (57.2%) of the sample (N = 3911) was antibody positive. Symptomatic infection increased over time from 47.09% during the pre-Delta variant time period, to 76.95% during Delta, to 84.73% during Omicron, and to 94.79% during Omicron BA.2. Those who were not vaccinated were more likely (OR 1.71, 95% CI 1.47, 2.00) to be infected than those who were fully vaccinated.
Results show an increase in symptomatic COVID-19 infection among non-hospitalized children with each progressive variant over the past two years. Findings here support the public health guidance that eligible children should remain up to date with COVID-19 vaccinations.