The burgeoning paradigm of high-throughput computations and materials informatics brings new opportunities in terms of targeted materials design and discovery. The discovery process can be significantly accelerated and streamlined if one can learn effectively from available knowledge and past data to predict materials properties efficiently. Indeed, a very active area in materials science research is to develop machine-learning-based methods that can deliver automated and cross-validated predictive models using either already available materials data or new data generated in a targeted manner. In the present contribution, we show that fast and accurate predictions of a wide range of properties of binary wurtzite superlattices, formed by a diverse set of chemistries, can be made by employing state-of-the-art statistical learning methods trained on quantum mechanical computations in combination with a judiciously chosen numerical representation to encode materials’ similarity. These surrogate learning models then allow for efficient screening of vast chemical spaces by providing instant predictions of the targeted properties. Moreover, the models can be systematically improved in an adaptive manner, can incorporate properties computed at different levels of fidelity, and are naturally amenable to inverse materials design strategies. While the learning approach to make predictions for a wide range of properties (including structural, elastic and electronic properties) is demonstrated here for a specific example set containing more than 1200 binary wurtzite superlattices, the adopted framework is equally applicable to other classes of materials.
•A volume estimation method is developed using connected vehicle trajectory data.
•The problem is formulated as a maximum likelihood problem and solved by expectation maximization.
•Two case studies were conducted using real-world connected vehicle data.
•The estimates had a mean absolute percentage error of 9–12%.
Recently, connected vehicle (CV) technology has received significant attention thanks to active pilot deployments supported by the US Department of Transportation (USDOT). At signalized intersections, CVs may serve as mobile sensors, providing opportunities to reduce dependence on conventional vehicle detectors for signal operation. However, most existing studies focus on scenarios in which CV penetration rates reach a certain level, e.g., 25%, which may not be feasible in the near future. How to utilize data from a small number of CVs to improve traffic signal operation remains an open question. In this work, we develop an approach to estimate traffic volume, a key input to many signal optimization algorithms, using GPS trajectory data from CVs or navigation devices under low market penetration rates. To estimate traffic volumes, we model vehicle arrivals at signalized intersections as a time-dependent Poisson process, which can account for signal coordination. The estimation problem is formulated as a maximum likelihood problem given multiple observed trajectories from CVs approaching the intersection. An expectation maximization (EM) procedure is derived to solve the estimation problem. Two case studies were conducted to validate our estimation algorithm. One uses the CV data from the Safety Pilot Model Deployment (SPMD) project, in which around 2800 CVs were deployed in the City of Ann Arbor, MI. The other uses vehicle trajectory data from users of a commercial navigation service in China. The mean absolute percentage error (MAPE) of the estimation is found to be 9–12%, based on manually collected benchmark data and data from loop detectors. Considering the existing scale of CV deployments, the proposed approach could be of significant help to traffic management agencies for evaluating and operating traffic signals, paving the way for using CVs for detector-free signal operation in the future.
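To make the estimation idea concrete, the following is a minimal sketch of expectation maximization for a Poisson arrival rate, together with the MAPE metric used for validation. It is not the formulation derived in the paper: it assumes a single constant rate per signal cycle rather than a time-dependent one, and it assumes a hypothetical observation model in which some cycles give an exact arrival count while others give only a lower bound on the count. The function names and the `lower_bounds` input are illustrative assumptions.

```python
import math


def poisson_tail(k, lam):
    """Return P(N >= k) for N ~ Poisson(lam)."""
    if k <= 0:
        return 1.0
    cdf = sum(math.exp(-lam) * lam ** n / math.factorial(n) for n in range(k))
    return 1.0 - cdf


def em_arrival_rate(exact_counts, lower_bounds, lam=1.0, max_iter=500, tol=1e-10):
    """Estimate a constant Poisson rate per cycle (toy model, not the paper's).

    exact_counts: cycles where the arrival count is fully observed.
    lower_bounds: cycles where we only know the count is at least k
                  (hypothetical partial observation).
    """
    total = len(exact_counts) + len(lower_bounds)
    for _ in range(max_iter):
        # E-step: E[N | N >= k] = lam * P(N >= k - 1) / P(N >= k).
        expected = [
            lam * poisson_tail(k - 1, lam) / poisson_tail(k, lam)
            for k in lower_bounds
        ]
        # M-step: the Poisson MLE is the mean of the completed counts.
        new_lam = (sum(exact_counts) + sum(expected)) / total
        converged = abs(new_lam - lam) < tol
        lam = new_lam
        if converged:
            break
    return lam


def mape(estimates, benchmarks):
    """Mean absolute percentage error, in percent."""
    errors = [abs(e - b) / b for e, b in zip(estimates, benchmarks)]
    return 100.0 * sum(errors) / len(errors)
```

With only exact counts, `em_arrival_rate` reduces to the sample mean. Adding a lower-bound cycle pulls the estimate above the value obtained by treating the bound as an exact count, because the E-step accounts for the arrivals that were not observed.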
Gastroesophageal adenocarcinomas (GEAs) are heterogeneous cancers in which immune checkpoint inhibitors have robust efficacy in heavily inflamed microsatellite instability (MSI) or Epstein-Barr virus (EBV)-positive subtypes. Immune checkpoint inhibitor responses are markedly lower in diffuse/genome stable (GS) and chromosomally unstable (CIN) GEAs. In contrast to EBV and MSI subtypes, the tumor microenvironment of CIN and GS GEAs has not been fully characterized to date, which limits our ability to improve immunotherapeutic strategies.
Here we aimed to identify tumor-immune cell associations across GEA subclasses using data from The Cancer Genome Atlas (N = 453 GEAs) and archival GEA resection specimens (N = 71). The Cancer Genome Atlas RNAseq data were used for computational inferences of immune cell subsets, which were correlated to tumor characteristics within and between subtypes. Archival tissues were used for spatial immune characterization spanning immunohistochemistry and mRNA expression analyses.
Our results confirmed substantial heterogeneity in the tumor microenvironment between distinct subtypes. While MSI-high and EBV+ GEAs harbored the most intense T cell infiltrates, the GS group showed enrichment of CD4+ T cells, macrophages and B cells and, in ∼50% of cases, evidence for tertiary lymphoid structures. In contrast, CIN cancers possessed CD8+ T cells predominantly at the invasive margin, while tumor-associated macrophages showed tumor-infiltrating capacity. Relatively T cell-rich ‘hot’ CIN GEAs were often from Western patients, while immunologically ‘cold’ CIN GEAs showed enrichment of MYC and cell cycle pathways, including amplification of CCNE1.
These results reveal the diversity of immune phenotypes of GEA. Half of GS gastric cancers have tertiary lymphoid structures and are therefore promising candidates for immunotherapy. The majority of CIN GEAs, however, exhibit T cell exclusion and infiltrating macrophages. Associations of immune-poor CIN GEAs with MYC activity and CCNE1 amplification may enable new studies to determine precise mechanisms of immune evasion, ultimately inspiring new therapeutic modalities.
•There is large heterogeneity in the immune contexture of gastroesophageal adenocarcinoma (GEA) subtypes.
•Chromosomally unstable GEAs are often T cell excluded, which is associated with enhanced MYC and cell cycle pathways.
•Genome stable cancers, in contrast, often have tertiary lymphoid structures.
•This study argues for more personalized immunotargeting strategies in gastroesophageal cancer treatment.
We introduce the TRUST4 open-source algorithm for reconstruction of immune receptor repertoires in αβ/γδ T cells and B cells from RNA-sequencing (RNA-seq) data. Compared with competing methods, TRUST4 supports both FASTQ and BAM formats and is faster and more sensitive in assembling longer, even full-length, receptor repertoires. TRUST4 can also call repertoire sequences from single-cell RNA-seq (scRNA-seq) data without V(D)J enrichment, and is compatible with both SMART-seq and 5' 10x Genomics platforms.
Despite growing numbers of immune checkpoint blockade (ICB) trials with available omics data, it remains challenging to evaluate the robustness of ICB response and immune evasion mechanisms comprehensively. To address these challenges, we integrated large-scale omics data and biomarkers on published ICB trials, non-immunotherapy tumor profiles, and CRISPR screens on a web platform TIDE (http://tide.dfci.harvard.edu). We processed the omics data for over 33K samples in 188 tumor cohorts from public databases, 998 tumors from 12 ICB clinical studies, and eight CRISPR screens that identified gene modulators of the anticancer immune response. Integrating these data on the TIDE web platform with three interactive analysis modules, we demonstrate the utility of public data reuse in hypothesis generation, biomarker optimization, and patient stratification.
Cancer cell molecular mimicry of stem cells (SC) imbues neoplastic cells with enhanced proliferative and renewal capacities. In support of this, numerous mediators of SC self-renewal have been shown to have oncogenic potential. We have recently reported that short-hairpin RNA-mediated knockdown of the embryonic stem cell (ESC) self-renewal gene NANOG significantly reduced the clonogenic and tumorigenic capabilities of various cancer cells. In this study, we sought to test the potential pro-tumorigenic functions of NANOG, particularly in prostate cancer (PCa). Using qRT-PCR, we first confirmed that PCa cells expressed NANOG mRNA primarily from the NANOGP8 locus on chromosome 15q14. We then constructed a lentiviral promoter reporter in which the -3.8-kb NANOGP8 genomic fragment was used to drive the expression of green fluorescent protein (GFP). We observed that NANOGP8-GFP(+) PCa cells showed cancer stem cell (CSC) characteristics such as enhanced clonal growth and tumor regenerative capacity. To further investigate the functions and mechanisms of NANOG in tumorigenesis, we established tetracycline-inducible NANOG-overexpressing cancer cell lines, including both PCa (Du145 and LNCaP) and breast (MCF-7) cancer cells. NANOG induction promoted drug resistance in MCF-7 cells, tumor regeneration in Du145 cells and, most importantly, castration-resistant tumor development in LNCaP cells. These pro-tumorigenic effects of NANOG were associated with key molecular changes, including an upregulation of molecules such as CXCR4, IGFBP5, CD133 and ALDH1. The present gain-of-function studies, coupled with our recent loss-of-function work, establish the integral role of NANOG in neoplastic processes and shed light on its mechanisms of action.
Anthracnose caused by Colletotrichum species is a serious disease of more than 30 plant genera. Several Colletotrichum species have been reported to infect chili in different countries. Although China is the largest chili-producing country, little is known about the species that have been infecting chili locally. Therefore, we collected samples of diseased chili from 29 provinces of China, from which 1285 strains were isolated. The morphological characters of all strains were observed and compared, and multi-locus phylogenetic analyses (ITS, ACT, CAL, CHS-1, GAPDH, TUB2, and HIS3) were performed on selected representative strains. Fifteen Colletotrichum species were identified, with C. fioriniae, C. fructicola, C. gloeosporioides, C. scovillei, and C. truncatum being prevalent. Three new species, C. conoides, C. grossum, and C. liaoningense, were recognised and described in this paper. Colletotrichum aenigma, C. cliviae, C. endophytica, C. hymenocallidis, C. incanum, C. karstii, and C. viniferum were reported for the first time from chili. Pathogenicity of all species isolated from chili was confirmed, except for C. endophytica. The current study improves the understanding of species causing anthracnose on chili and provides useful information for the effective control of the disease in China.
High-throughput CRISPR screens have shown great promise in functional genomics. We present MAGeCK-VISPR, a comprehensive quality control (QC), analysis, and visualization workflow for CRISPR screens. MAGeCK-VISPR defines a set of QC measures to assess the quality of an experiment, and includes a maximum-likelihood algorithm to call essential genes simultaneously under multiple conditions. The algorithm uses a generalized linear model to deconvolute different effects, and employs expectation-maximization to iteratively estimate sgRNA knockout efficiency and gene essentiality. MAGeCK-VISPR also includes VISPR, a framework for the interactive visualization and exploration of QC and analysis results. MAGeCK-VISPR is freely available at http://bitbucket.org/liulab/mageck-vispr.
This study evaluated the effects of Bacillus fermentation on soybean meal protein (SBMP) microstructure and major anti‐nutritional factors (ANFs) in soybean meal (SBM). The Bacillus siamensis isolate JL8, which produced a high yield of protease at 519·1 U g−1, was selected for the laboratory production of fermented soybean meal (FSBM). After 24 h of fermentation, the FSBM showed better properties than SBM: ANFs such as glycinin, β‐conglycinin and trypsin inhibitor significantly decreased by 86·0, 70·3 and 95·01%, respectively, while in vitro digestibility and absorbability increased by 8·7 and 18·9%, respectively. Scanning electron microscopy (SEM) images of fermented soybean meal protein showed smaller aggregates and a looser network than those of SBMP. Secondary structure examination of proteins revealed that fermentation significantly decreased the content of β‐sheet structure by 43·2% and increased the random coil structure by 59·9%. These results demonstrate that Bacillus fermentation improved the nutritional quality of SBM by degrading ANFs and changing the microstructure of SBMP.
Significance and Impact of the Study
There is limited information about the structural property changes of soybean protein during fermentation. In this study, physicochemical analysis of soybean meal protein showed evidence that the increase in in vitro digestibility and absorbability of fermented soybean meal reflected the decrease in β‐conformation and destruction of the original structure in soybean meal protein. The results directly advance the understanding of how Bacillus fermentation improves the nutritional quality of soybean meal, and support the potential use of Bacillus siamensis for fermented soybean meal production.
Genome-wide screening using CRISPR coupled with nuclease Cas9 (CRISPR-Cas9) is a powerful technology for the systematic evaluation of gene function. Statistically principled analysis is needed for the accurate identification of gene hits and associated pathways. Here, we describe how to perform computational analysis of CRISPR screens using the MAGeCKFlute pipeline. MAGeCKFlute combines the MAGeCK and MAGeCK-VISPR algorithms and incorporates additional downstream analysis functionalities. MAGeCKFlute is distinguished from other currently available tools by its comprehensive pipeline, which contains a series of functions for analyzing CRISPR screen data. This protocol explains how to use MAGeCKFlute to perform quality control (QC), normalization, batch effect removal, copy-number bias correction, gene hit identification and downstream functional enrichment analysis for CRISPR screens. We also describe gene identification and data analysis in CRISPR screens involving drug treatment. Completing the entire MAGeCKFlute pipeline requires ~3 h on a desktop computer running Linux or Mac OS with R support.