DNA base modifications, such as C5-methylcytosine (5mC) and N6-methyldeoxyadenosine (6mA), are important types of epigenetic regulation. Short-read bisulfite sequencing and long-read PacBio sequencing have inherent limitations in detecting DNA modifications. Here, using raw electric signals of Oxford Nanopore long-read sequencing data, we design DeepMod, a bidirectional recurrent neural network (RNN) with long short-term memory (LSTM) to detect DNA modifications. We sequence a human genome HX1 and a Chlamydomonas reinhardtii genome using Nanopore sequencing, and then evaluate DeepMod on three types of genomes (Escherichia coli, Chlamydomonas reinhardtii and human genomes). For 5mC detection, DeepMod achieves average precision up to 0.99 for both synthetically introduced and naturally occurring modifications. For 6mA detection, DeepMod achieves ~0.9 average precision on Escherichia coli data, and has improved performance over existing methods on Chlamydomonas reinhardtii data. In conclusion, DeepMod performs well for genome-scale detection of DNA modifications and will facilitate epigenetic analysis on diverse species.
The class 1 integrase gene intI1 has been considered a good proxy for anthropogenic pollution because it is linked to genes conferring resistance to antibiotics. The gene cassettes of class 1 integrons can carry diverse antibiotic resistance genes (ARGs) and mediate horizontal gene transfer among microorganisms. The present study applied a high-throughput sequencing technique combined with an intI1 database and genome assembly to quantify the abundance of intI1 in 64 environmental samples from 8 ecosystems, and to investigate the diverse arrangements of ARG-carrying gene cassettes (ACGCs) carried by class 1 integrons. The abundance of detected intI1 ranged from 3.83 × 10 to 4.26 × 10^0 intI1/cell. The high correlation (Pearson's r = 0.852) between intI1 and ARG abundance indicated that intI1 could be considered an important indicator of ARGs in environments. Aminoglycoside resistance genes were most frequently observed on gene cassettes, carried by 57% of assembled ACGCs, followed by trimethoprim and beta-lactam resistance genes. This study established a pipeline for broad monitoring of intI1 in various environmental samples and for scanning the ARGs carried by integrons. These findings supplement our knowledge of the distribution of class 1 integrons and of ARGs carried on mobile genetic elements, benefiting future studies on horizontal gene transfer of ARGs.
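As an illustration of the statistic reported above, the following is a minimal sketch of how Pearson's r between per-sample intI1 and ARG abundances could be computed. The function name `pearson_r` and the five abundance values are hypothetical and chosen only for demonstration; they are not data from the study.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of co-deviations and of squared deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical per-sample abundances (copies per cell); NOT data from the study.
inti1_per_cell = [0.001, 0.02, 0.15, 0.60, 1.80]
arg_per_cell = [0.05, 0.10, 0.40, 0.90, 2.50]
r = pearson_r(inti1_per_cell, arg_per_cell)
print(f"Pearson's r = {r:.3f}")
```

Because abundances of this kind often span several orders of magnitude, an analysis may choose to correlate log-transformed values instead; whether the study did so is not stated in the abstract.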
Debugging a genome sequence is imperative for successfully building a synthetic genome. As part of the effort to build a designer eukaryotic genome, yeast synthetic chromosome X (synX), designed as 707,459 base pairs, was synthesized chemically. SynX exhibited good fitness under a wide variety of conditions. A highly efficient mapping strategy called pooled PCRTag mapping (PoPM), which can be generalized to any watermarked synthetic chromosome, was developed to identify genetic alterations that affect cell fitness ("bugs"). A series of bugs were corrected that included a large region bearing complex amplifications, a growth defect mapping to a recoded sequence in
, and a loxPsym site affecting promoter function of
PoPM is a powerful tool for synthetic yeast genome debugging and an efficient strategy for phenotype-genotype mapping.
Long nanopore reads are advantageous in de novo genome assembly. However, nanopore reads usually have a broad error distribution and high-error-rate subsequences. Existing error correction tools cannot correct nanopore reads efficiently and effectively. Most methods trim high-error-rate subsequences during error correction, which reduces both the length of the reads and the contiguity of the final assembly. Here, we develop an error correction and de novo assembly tool designed to overcome complex errors in nanopore reads. We propose an adaptive read selection and two-step progressive method to quickly correct nanopore reads to high accuracy. We introduce a two-stage assembler to utilize the full length of nanopore reads. Our tool achieves superior performance in both error correction and de novo assembly of nanopore reads. It requires only 8122 hours to assemble a 35X coverage human genome and achieves a 2.47-fold improvement in NG50. Furthermore, our assembly of the human WERI cell line shows an NG50 of 22 Mbp. The high-quality assembly of nanopore reads can significantly reduce false positives in structural variation detection.
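For readers unfamiliar with the contiguity metric quoted above, the following is a minimal sketch of the standard NG50 definition: the length of the shortest contig in the smallest set of longest contigs that together cover at least half of the (estimated) genome size. The function name `ng50` and the example lengths are illustrative assumptions, not part of the tool described.

```python
def ng50(contig_lengths, genome_size):
    """Return the NG50 of an assembly.

    Contigs are taken from longest to shortest until their cumulative
    length reaches half of genome_size; the length of the contig that
    crosses this threshold is the NG50. Returns 0 if the assembly covers
    less than half of the genome.
    """
    if genome_size <= 0:
        raise ValueError("genome_size must be positive")
    half = genome_size / 2
    total = 0
    for length in sorted(contig_lengths, reverse=True):
        total += length
        if total >= half:
            return length
    return 0


# Toy example: five contigs and an assumed genome size of 200 units.
print(ng50([50, 40, 30, 20, 10], genome_size=200))  # prints 30
```

Unlike N50, which uses the total assembly length as the reference, NG50 uses the genome size, so assemblies of different total length remain comparable.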
In recent years, research in gravity field modeling has trended from constant to variable density and from low-order to high-order gravitational potential gradients. In this context, this paper focuses on the variable density model for gravitational curvatures (or gravity curvatures, third-order derivatives of gravitational potential) of a tesseroid and spherical shell in the spatial domain of gravity field modeling. In this contribution, the general formula of the gravitational curvatures of a tesseroid with arbitrary-order polynomial density is derived. The general expressions for gravitational effects up to the gravitational curvatures of a spherical shell with arbitrary-order polynomial density are derived when the computation point is located above, inside, and below the spherical shell. When the computation point is located above the spherical shell, the general expressions for the mass of a spherical shell and the relation between the radial gravitational effects up to arbitrary order and the mass of a spherical shell with arbitrary-order polynomial density are derived. The influence of the computation point's height and latitude on gravitational curvatures with polynomial density up to fourth order is numerically investigated using tesseroids to discretize a spherical shell. Numerical results reveal that the near-zone problem exists for the fourth-order polynomial density of the gravitational curvatures, i.e., relative errors in $\log_{10}$ scale of gravitational curvatures are larger than 0 below the height of about 50 km for a grid size of $15' \times 15'$. The polar-singularity problem does not occur for the gravitational curvatures with polynomial density up to fourth order because of the Cartesian integral kernels of the tesseroid. The density variation can be revealed in the absolute errors as the superposition effects of Laplace parameters of gravitational curvatures rather than in the relative errors. The derived expressions are examples of the high-order gravitational potential gradients of a mass body with variable density in the spatial domain, which will provide the theoretical basis for future applications of gravity field modeling in geodesy and geophysics.
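To make the relation between shell mass and radial effects concrete, the following worked equations sketch the standard result for a point above the shell. The notation is an assumption introduced here for illustration and is not taken from the paper: radial polynomial density $\rho(r') = \sum_{i=0}^{N} \rho_i r'^{\,i}$, inner and outer shell radii $r_1 < r_2$, gravitational constant $G$, and computation point radius $r > r_2$.

```latex
% Mass of a spherical shell with radial polynomial density (assumed notation)
M = \int_{0}^{2\pi}\!\int_{-\pi/2}^{\pi/2}\!\int_{r_1}^{r_2}
      \rho(r')\, r'^{2} \cos\varphi' \,\mathrm{d}r'\,\mathrm{d}\varphi'\,\mathrm{d}\lambda'
  = 4\pi \sum_{i=0}^{N} \rho_i \,\frac{r_2^{\,i+3} - r_1^{\,i+3}}{i+3}

% Outside the shell the potential equals that of a point mass M at the origin,
% so every radial derivative is proportional to M:
V(r) = \frac{GM}{r}, \qquad
\frac{\partial^{k} V}{\partial r^{k}} = (-1)^{k}\, k!\, \frac{GM}{r^{\,k+1}},
\quad k = 0, 1, 2, \dots
```

For $k = 3$ this gives the radial-radial-radial gravitational curvature $-6GM/r^{4}$, which offers a simple analytical check for tesseroid-based numerical results above the shell.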
In plants, cytosine DNA methylation (5mC) can occur in three sequence contexts, CpG, CHG, and CHH (where H = A, C, or T), which play different roles in the regulation of biological processes. Although long Nanopore reads are advantageous in the detection of 5mCs compared to short-read bisulfite sequencing, existing methods can only detect 5mCs in the CpG context, which limits their application in plants. Here, we develop DeepSignal-plant, a deep learning tool to detect genome-wide 5mCs of all three contexts in plants from Nanopore reads. We sequence Arabidopsis thaliana and Oryza sativa using both Nanopore and bisulfite sequencing. We develop a denoising process for training models, which enables DeepSignal-plant to achieve high correlations with bisulfite sequencing for 5mC detection in all three contexts. Furthermore, DeepSignal-plant can profile more 5mC sites, which will help to provide a more complete understanding of epigenetic mechanisms of different biological processes.
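The three sequence contexts defined above can be illustrated with a short sketch that classifies a forward-strand cytosine by its two downstream bases. The function name `cytosine_context` is hypothetical and this is not code from DeepSignal-plant; it handles only the forward strand, so reverse-strand cytosines would require the reverse complement first.

```python
def cytosine_context(seq, pos):
    """Classify the forward-strand cytosine at 0-based index pos.

    Returns 'CpG', 'CHG', or 'CHH' (H = A, C, or T), or None when the base
    is not a cytosine or the downstream bases are missing or ambiguous.
    """
    seq = seq.upper()
    if pos < 0 or pos >= len(seq) or seq[pos] != "C":
        return None
    downstream = seq[pos + 1:pos + 3]
    # CpG needs only the next base to be G.
    if len(downstream) >= 1 and downstream[0] == "G":
        return "CpG"
    # CHG and CHH need two unambiguous downstream bases, the first being H.
    if len(downstream) < 2 or downstream[0] not in "ACT" or downstream[1] not in "ACGT":
        return None
    return "CHG" if downstream[1] == "G" else "CHH"


for s in ("ACGT", "ACAGT", "ACTAT"):
    print(s, cytosine_context(s, 1))
```

Cytosines near the end of a sequence, or followed by an ambiguous base such as N, are left unclassified rather than guessed.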
Endometriosis is a common chronic inflammatory and estrogen-dependent disease that mostly affects people of childbearing age. The dietary inflammatory index (DII) is a novel instrument for assessing the overall inflammatory potential of diet. However, no studies have shown the relationship between DII and endometriosis to date. This study aimed to elucidate the relationship between DII and endometriosis. Data were acquired from the National Health and Nutrition Examination Survey (NHANES) 2001-2006. DII was calculated using an inbuilt function in the R package. Relevant patient information was obtained through a questionnaire containing their gynecological history. Based on an endometriosis questionnaire survey, participants who answered yes were considered cases (with endometriosis), and participants who answered no were considered controls (without endometriosis). Multivariate weighted logistic regression was applied to examine the correlation between DII and endometriosis. Subgroup analysis and smoothing curve fitting between DII and endometriosis were conducted as further investigation. Compared to the control group, participants with endometriosis tended to have a higher DII (P = 0.014). Adjusted multivariate regression models showed that DII was positively correlated with the incidence of endometriosis (P < 0.05). Analysis of subgroups revealed no significant heterogeneity. In middle-aged and older women (age ≥ 35 years), the smoothing curve fitting analysis demonstrated a non-linear relationship between DII and the prevalence of endometriosis. Therefore, using DII as an indicator of dietary-related inflammation may help to provide new insight into the role of diet in the prevention and management of endometriosis.
An accurate method with conditional split, double exponential quadrature rule, and numerical differentiation has been proposed in the paper “Accurate computation of gravitational field of a tesseroid” (Fukushima in J Geod 92(12):1371–1386, https://doi.org/10.1007/s00190-018-1126-2, 2018) to compute the gravitational field (i.e. gravitational potential, gravitational acceleration vector, and gravity gradient tensor) of a tesseroid. This study presents the corrections for some formulas in the main paper and electronic supplementary material of Fukushima (J Geod 92(12):1371–1386, https://doi.org/10.1007/s00190-018-1126-2, 2018). Moreover, the FORTRAN subroutines gtess (or qgtess) and ggtess (or qggtess) in the original codes xtess.txt (or xqtess.txt) in double (or quadruple) precision provided by Fukushima (J Geod 92(12):1371–1386, https://doi.org/10.1007/s00190-018-1126-2, 2018) are revised. The revised parts affect the calculation of the following components of the gravitational acceleration vector ($g_{\Phi}$ and $g_{\Lambda}$) and gravity gradient tensor ($\Gamma_{\Phi\Phi}$, $\Gamma_{\Phi\Lambda}$, $\Gamma_{\Phi H}$, $\Gamma_{\Lambda\Lambda}$, $\Gamma_{\Lambda H}$, and $\Gamma_{HH}$). The revised FORTRAN codes xtess.f90 and xqtess.f90 in double and quadruple precision are available on GitHub at https://github.com/xiaoledeng/xtess-xqtess. These revised FORTRAN codes can accurately compute the gravitational field of a tesseroid in double and quadruple precision regardless of whether the computation point is located outside, near the surface of, on the surface of, or inside the tesseroid. They can be applied to calculate the gravitational field of the different layers (e.g. atmosphere, topography, crust, and mantle) of the Earth or other celestial bodies, which supports various geoscience applications, e.g. geoid determination in geodesy and gravity interpretation in geophysics.
Gravity field modelling of the Earth's mass distributions is one of the primary fields in geodesy and geophysics. Among the mass bodies, a spherical shell has become a commonly used mass body to evaluate the gravitational effects of a tesseroid due to its analytical solutions and simple formulae. However, this classic numerical strategy has the shortcoming that its computation time grows as the grid size of the discretized tesseroids becomes finer, so that it has to be performed on a high-performance computer. In this contribution, simpler analytical expressions for the radial gravity vector and radial-radial gravity gradient tensor of a homogeneous spherical cap and spherical zonal band are derived. Moreover, new analytical formulae of the gravitational curvatures (i.e., third-order derivatives of the gravitational potential) of a homogeneous spherical cap and spherical zonal band are also derived. The analytical consistencies between the new and old radial gravity vector and radial-radial gravity gradient tensor of a spherical cap are confirmed. The computation time and relative approximation errors of a spherical zonal band and a spherical shell discretized using tesseroids are quantitatively analyzed with different grid sizes. Numerical experiments show that the computation time of a spherical zonal band discretized using tesseroids is about $180/n$ times less than that of a spherical shell discretized using tesseroids for the gravitational effects up to gravitational curvatures with different grid sizes in both double and quadruple precision, where $n$ is taken from the grid size $n^{\circ} \times n^{\circ}$. Moreover, the mean values of the relative approximation errors of the gravitational effects of a spherical zonal band discretized using tesseroids are smaller than those of a spherical shell discretized using tesseroids under the influence of the computation point's height at different grid sizes. Numerical results confirm the benefit of a spherical zonal band in comparison with a spherical shell discretized using tesseroids regarding both the computation time and errors. The numerical strategy of a spherical zonal band discretized using tesseroids can be applied instead of the commonly used numerical strategy of a spherical shell discretized using tesseroids in the numerical evaluation of a tesseroid with different numerical methods in future research.
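One plausible way to read the reported factor of about 180/n is as a simple count of tesseroids: on an n° × n° grid, a full spherical shell has (180/n) latitude rows of (360/n) cells each, whereas a single zonal band is one such row. The sketch below illustrates only this counting argument; the function name `tesseroid_counts` is hypothetical, and actual timings also depend on the integration method and precision used.

```python
import math


def tesseroid_counts(n_deg):
    """Return (shell_cells, band_cells) for an n_deg x n_deg degree grid.

    shell_cells: tesseroids needed to tile the whole spherical shell.
    band_cells:  tesseroids in a single zonal band (one latitude row).
    """
    if n_deg <= 0:
        raise ValueError("grid size must be positive")
    rows = round(180 / n_deg)
    cols = round(360 / n_deg)
    # Require the grid size to tile the sphere without remainder.
    if not (math.isclose(rows * n_deg, 180.0) and math.isclose(cols * n_deg, 360.0)):
        raise ValueError("grid size must divide 180 degrees evenly")
    return rows * cols, cols


for n in (1, 0.5, 0.25):
    shell, band = tesseroid_counts(n)
    print(f"n = {n}: shell = {shell}, band = {band}, ratio = {shell // band}")
```

Under this assumption the ratio of cell counts equals the number of latitude rows, 180/n, so halving the grid size doubles the relative saving of the zonal band.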