Specialized data structures are required for online algorithms to efficiently handle large sequencing datasets. The counting quotient filter (CQF), a compact hashtable, can efficiently store k-mers ...with a skewed distribution.
Here, we present the mixed-counters quotient filter (MQF) as a new variant of the CQF with novel counting and labeling systems. The new counting system adapts to a wider range of data distributions for increased space efficiency and is faster than the CQF for insertions and queries in most of the tested scenarios. A buffered version of the MQF can offload storage to disk, trading speed of insertions and queries for a significant memory reduction. The labeling system provides a flexible framework for assigning labels to member items while maintaining good data locality and a concise memory representation. These labels serve as a minimal perfect hash function but are approximately tenfold faster than BBhash, with no need to re-analyze the original data for further insertions or deletions.
The MQF is a flexible and efficient data structure that extends our ability to work with high throughput sequencing data.
Domestic dog breeds exhibit remarkable morphological variations that result from centuries of artificial selection and breeding. Identifying the genetic changes that contribute to these variations ...could provide critical insights into the molecular basis of tissue and organismal morphogenesis. Bulldogs, French Bulldogs and Boston Terriers share many morphological and disease-predisposition traits, including brachycephalic skull morphology, widely set eyes and short stature. Unlike other brachycephalic dogs, these breeds also exhibit vertebral malformations that result in a truncated, kinked tail (screw tail). Whole-genome sequencing of 100 dogs from 21 breeds identified 12.4 million bi-allelic variants that met inclusion criteria. Whole-genome association of these variants with the breed-defining phenotype of screw tail, performed using 10 cases and 84 controls, identified a frameshift mutation in the WNT pathway gene DISHEVELLED 2 (DVL2) (Chr5: 32195043_32195044del, p = 4.37 × 10^-37) as the most strongly associated variant in the canine genome. This DVL2 variant was fixed in Bulldogs and French Bulldogs and had a high allele frequency (0.94) in Boston Terriers. The DVL2 variant segregated with thoracic and caudal vertebral column malformations in a recessive manner, with incomplete and variable penetrance for thoracic vertebral malformations between different breeds. Importantly, analogous frameshift mutations in the human DVL1 and DVL3 genes cause Robinow syndrome, a congenital disorder characterized by similar craniofacial, limb and vertebral malformations. Analysis of the canine DVL2 variant protein showed that its ability to undergo WNT-induced phosphorylation is reduced, suggesting that altered WNT signaling may contribute to the Robinow-like syndrome in the screw tail breeds.
Despite significant healthcare advances in the 21st century, the exact etiology of dental caries remains unsolved. The past two decades have witnessed a tremendous growth in our understanding of ...dental caries amid the advent of revolutionary omics technologies. Accordingly, a consensus has been reached that dental caries is a community-scale metabolic disorder, and its etiology is beyond a single causative organism. This conclusion was based on a variety of microbiome studies following the flow of information along the central dogma of biology from genomic data to the end products of metabolism. These studies were facilitated by the unprecedented growth of the next-generation sequencing tools and omics techniques, such as metagenomics and metatranscriptomics, to estimate the community composition of oral microbiome and its functional potential. Furthermore, the rapidly evolving proteomics and metabolomics platforms, including nuclear magnetic resonance spectroscopy and/or mass spectrometry coupled with chromatography, have enabled precise quantification of the translational outcomes. Although the majority supports ‘conserved functional changes’ as indicators of dysbiosis, it remains unclear how caries dynamics impact the microbiota functions and vice versa, over the course of disease onset and progression. What compounds the situation is the host-microbiota crosstalk. Genome-wide association studies have been undertaken to elucidate the interaction of host genetic variation with the microbiome. However, these studies are challenged by the complex interaction of host genetics and environmental factors. All these complementary approaches need to be orchestrated to capture the key players in this multifactorial disease. Herein, we critically review the milestones in caries research focusing on the state-of-the-art singular and integrative omics studies, supplemented with a bibliographic network analysis to address the oral microbiome, the host factors, and their interactions.
Additionally, we highlight gaps in the dental literature and shed light on critical future research questions and study designs that could unravel the complexities of dental caries, the most globally widespread disease.
The importance of the Gallus gallus (chicken) as a model organism and agricultural animal merits a continuation of sequence assembly improvement efforts. We present a new version of the chicken ...genome assembly (Gallus_gallus-5.0; GCA_000002315.3), built from combined long single molecule sequencing technology, finished BACs, and improved physical maps. In overall assembled bases, we see a gain of 183 Mb, including 16.4 Mb in placed chromosomes with a corresponding gain in the percentage of intact repeat elements characterized. Of the 1.21 Gb genome, we include three previously missing autosomes, GGA30, 31, and 33, and improve sequence contig length 10-fold over the previous Gallus_gallus-4.0. Despite the significant base representation improvements made, 138 Mb of sequence is not yet located to chromosomes. When annotated for gene content, Gallus_gallus-5.0 shows an increase of 4679 annotated genes (2768 noncoding and 1911 protein-coding) over those in Gallus_gallus-4.0. We also revisited the question of what genes are missing in the avian lineage, as assessed by the highest quality avian genome assembly to date, and found that a large fraction of the original set of missing genes are still absent in sequenced bird species. Finally, our new data support a detailed map of MHC-B, encompassing two segments: one with a highly stable gene copy number and another in which the gene copy number is highly variable. The chicken model has been a critical resource for many other fields of study, and this new reference assembly will substantially further these efforts.
Automatic voice pathology detection and classification systems effectively contribute to the assessment of voice disorders, enabling the early detection of voice pathologies and the diagnosis of the ...type of pathology from which patients suffer. This paper concentrates on developing an accurate and robust feature extraction for detecting and classifying voice pathologies by investigating different frequency bands using autocorrelation and entropy. Using autocorrelation, we extracted maximum peak values and their corresponding lag values from each frame of a voiced signal as features to detect and classify pathological samples. We also extracted the entropy of each frame of the voice signal, after normalizing its values, for use as a feature. These features were investigated in distinct frequency bands to assess the contribution of each band to the detection and classification processes. Various samples of the sustained vowel /a/ for both normal and pathological voices were extracted from three different databases in English, German, and Arabic. A support vector machine was used as a classifier. We also performed U-tests to investigate whether there is a significant difference between the means of the normal and pathological samples. The best accuracies achieved in both detection and classification varied depending on the band, method, and database used. The most contributive bands in both detection and classification were between 1000 and 8000 Hz. The highest accuracies obtained for detection were 99.69%, 92.79%, and 99.79% for the Massachusetts Eye and Ear Infirmary (MEEI) database, the Saarbrücken Voice Database (SVD), and the Arabic Voice Pathology Database (AVPD), respectively. The highest accuracies achieved for classification were 99.54%, 99.53%, and 96.02% for MEEI, SVD, and AVPD, respectively, using the combined feature.
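The two per-frame features described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation: the function names, the non-overlapping framing, the `min_lag` cut-off that skips the zero-lag main lobe, and the choice of Shannon entropy over energy-normalized samples are all assumptions made here, and the band-pass filtering into frequency bands is omitted.

```python
import math

def split_frames(signal, size):
    """Split a signal into non-overlapping frames (framing scheme is an assumption)."""
    return [signal[i:i + size] for i in range(0, len(signal) - size + 1, size)]

def autocorr_peak(frame, min_lag=1):
    """Return (peak value, lag) of the normalized autocorrelation for lags >= min_lag."""
    n = len(frame)
    mean = sum(frame) / n
    x = [v - mean for v in frame]
    energy = sum(v * v for v in x)
    if energy == 0:
        return 0.0, 0  # silent or constant frame: no meaningful peak
    best_val, best_lag = float("-inf"), min_lag
    for lag in range(min_lag, n):
        # Biased estimate: sum of lagged products divided by the zero-lag energy.
        r = sum(x[i] * x[i + lag] for i in range(n - lag)) / energy
        if r > best_val:
            best_val, best_lag = r, lag
    return best_val, best_lag

def frame_entropy(frame):
    """Shannon entropy (bits) of the frame's energy-normalized samples (assumed definition)."""
    energy = sum(v * v for v in frame)
    if energy == 0:
        return 0.0
    probs = [v * v / energy for v in frame]
    return -sum(p * math.log2(p) for p in probs if p > 0)

# Synthetic example: a sinusoid with a period of 20 samples, one 200-sample frame.
signal = [math.sin(2 * math.pi * i / 20) for i in range(200)]
for frame in split_frames(signal, 200):
    peak, lag = autocorr_peak(frame, min_lag=10)
    print(f"peak={peak:.3f} lag={lag} entropy={frame_entropy(frame):.3f}")
```

For the synthetic sinusoid the peak lag equals the period in samples, which is the property that makes the peak and lag useful descriptors of voicing regularity. In a pipeline like the one described, the resulting feature vectors would then be passed to a classifier such as a support vector machine.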
Genome editing followed by reproductive cloning was previously used to produce two hornless dairy bulls. We crossed one genome-edited dairy bull, homozygous for the dominant PC Celtic POLLED allele, ...with horned cows (pp) and obtained six heterozygous (PCp) polled calves. The calves had no horns and were otherwise healthy and phenotypically unremarkable. We conducted whole-genome sequencing of all animals using an Illumina HiSeq4000 to achieve ~20× coverage. Bioinformatics analyses revealed the bull was a compound heterozygote, carrying one naturally occurring PC Celtic POLLED allele and an allele containing an additional introgression of the homology-directed repair donor plasmid along with the PC Celtic allele. These alleles segregated in the offspring of this bull, and inheritance of either allele produced polled calves. No other unintended genomic alterations were observed. These data can be used to inform conversations in the scientific community, with regulatory authorities and with the public around 'intentional genomic alterations' and future regulatory actions regarding genome-edited animals.
Summary
Background and Objective
Automatic voice-pathology detection and classification systems may help clinicians to detect the existence of any voice pathologies and the type of pathology from ...which patients suffer in the early stages. The main aim of this paper is to investigate Multidimensional Voice Program (MDVP) parameters to automatically detect and classify the voice pathologies in multiple databases, and then to find out which parameters performed well in these two processes.
Materials and Methods
Samples of the sustained vowel /a/ of normal and pathological voices were extracted from three different databases, which have three voice pathologies in common. The selected databases in this study represent three distinct languages: (1) the Arabic voice pathology database; (2) the Massachusetts Eye and Ear Infirmary database (English database); and (3) the Saarbruecken Voice Database (German database). A computerized speech lab program was used to extract MDVP parameters as features, and an acoustical analysis was performed. The Fisher discrimination ratio was applied to rank the parameters. A t-test was performed to highlight any significant differences in the means of the normal and pathological samples.
Results
The experimental results demonstrate a clear difference in the performance of the MDVP parameters using these databases. The highly ranked parameters also differed from one database to another. The best accuracies were obtained by using the three highest ranked MDVP parameters arranged according to the Fisher discrimination ratio: these accuracies were 99.68%, 88.21%, and 72.53% for the Saarbruecken Voice Database, the Massachusetts Eye and Ear Infirmary database, and the Arabic voice pathology database, respectively.
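The ranking step can be illustrated with a short Python sketch. It assumes the common two-class form of the Fisher discrimination ratio, (mean1 - mean2)^2 / (var1 + var2), with population variances; the abstract does not state which variant was used. The parameter names and sample values below are hypothetical placeholders, not MDVP measurements.

```python
import math
import statistics

def fisher_ratio(normal, pathological):
    """Two-class Fisher discrimination ratio for one parameter (assumed form)."""
    mean_diff = statistics.mean(normal) - statistics.mean(pathological)
    spread = statistics.pvariance(normal) + statistics.pvariance(pathological)
    if spread == 0:
        # Degenerate case: no within-class spread at all.
        return math.inf if mean_diff != 0 else 0.0
    return mean_diff ** 2 / spread

def rank_parameters(normal_by_param, pathological_by_param):
    """Return parameter names sorted from most to least discriminative."""
    scores = {
        name: fisher_ratio(normal_by_param[name], pathological_by_param[name])
        for name in normal_by_param
    }
    return sorted(scores, key=scores.get, reverse=True)

# Hypothetical values for two placeholder parameters.
normal = {"param_a": [1, 2, 3], "param_b": [1, 2, 3]}
pathological = {"param_a": [4, 5, 6], "param_b": [2, 3, 4]}

print(rank_parameters(normal, pathological))
```

A larger ratio means the class means are far apart relative to the within-class spread, so taking the top three names from the ranked list mirrors the selection of the three highest-ranked parameters described in the Results.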
The homologous recombination (HR) pathway is largely inactive in early embryos prior to the first cell division, making it difficult to achieve targeted gene knock-ins. The homology-mediated end ...joining (HMEJ)-based strategy has been shown to increase knock-in efficiency relative to HR, non-homologous end joining (NHEJ), and microhomology-mediated end joining (MMEJ) strategies in non-dividing cells.
By introducing gRNA/Cas9 ribonucleoprotein complex and an HMEJ-based donor template with 1 kb homology arms flanked by the H11 safe harbor locus gRNA target site, knock-in rates of 40% of a 5.1 kb bovine sex-determining region Y (SRY)-green fluorescent protein (GFP) template were achieved in Bos taurus zygotes. Embryos that developed to the blastocyst stage were screened for GFP, and nine were transferred to recipient cows, resulting in a live, phenotypically normal bull calf. Genomic analyses revealed no wildtype sequence at the H11 target site, but rather a 26 bp insertion allele and a complex 38 kb knock-in allele with seven copies of the SRY-GFP template and a single copy of the donor plasmid backbone. An additional minor 18 kb allele was detected that appears to be a derivative of the 38 kb allele resulting from the deletion of an inverted repeat of four copies of the SRY-GFP template.
The allelic heterogeneity in this biallelic knock-in calf appears to have resulted from a combination of homology directed repair, homology independent targeted insertion by blunt-end ligation, NHEJ, and rearrangement following editing of the gRNA target site in the donor template. This study illustrates the potential to produce targeted gene knock-in animals by direct cytoplasmic injection of bovine embryos with gRNA/Cas9, although further optimization is required to ensure a precise single-copy gene integration event.
Abstract
Mutations in IRF6, TFAP2A and GRHL3 cause orofacial clefting syndromes in humans. However, Tfap2a and Grhl3 are also required for neurulation in mice. Here, we found that homeostasis of Irf6 ...is also required for development of the neural tube and associated structures. Over-expression of Irf6 caused exencephaly, a rostral neural tube defect, through suppression of Tfap2a and Grhl3 expression. Conversely, loss of Irf6 function caused a curly tail and coincided with a reduction of Tfap2a and Grhl3 expression in tail tissues. To test whether Irf6 function in neurulation was conserved, we sequenced samples obtained from human cases of spina bifida and anencephaly. We found two likely disease-causing variants in two samples from patients with spina bifida. Overall, these data suggest that the Tfap2a-Irf6-Grhl3 genetic pathway is shared by two embryologically distinct morphogenetic events that previously were considered independent during mammalian development. In addition, these data suggest new candidates to delineate the genetic architecture of neural tube defects and new therapeutic targets to prevent this common birth defect.
Background
Adrenalectomy for pheochromocytoma (PHEO) is challenging because of the high risk of intraoperative hemodynamic instability (HDI). This study aimed to compare the incidence and risk ...factors of intraoperative HDI between laparoscopic left adrenalectomy (LLA) and laparoscopic right adrenalectomy (LRA).
Methods
We retrospectively analyzed 271 patients aged > 18 years with unilateral benign PHEO of any size who underwent transperitoneal laparoscopic adrenalectomy at our hospitals between September 2016 and September 2023. Patients were divided into LRA (N = 122) and LLA (N = 149) groups. Univariate and multivariate logistic regression analyses were used to predict intraoperative HDI. The multivariate model for predicting HDI included right-sided PHEO, PHEO size, preoperative comorbidities, and preoperative systolic blood pressure.
Results
Intraoperative HDI was significantly higher in the LRA group than in the LLA group (27% vs. 9.4%, p < 0.001). In the multivariate regression analysis, right-sided tumors showed a higher risk of intraoperative HDI (odds ratio [OR] 5.625, 95% confidence interval [CI] 1.147–27.577, p = 0.033). Tumor size (OR 11.019, 95% CI 3.996–30.38, p < 0.001), the presence of preoperative comorbidities (diabetes mellitus, hypertension, and coronary heart disease; OR 7.918, 95% CI 1.323–47.412, p = 0.023), and preoperative systolic blood pressure (OR 1.265, 95% CI 1.07–1.495, p = 0.006) were associated with a higher risk of HDI in both LRA and LLA, with no superiority of one side over the other.
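As a reading aid, the reported odds ratios, confidence intervals, and p-values can be checked against each other. The sketch below assumes the intervals are Wald intervals, symmetric on the log-odds scale; the abstract does not state this. Under that assumption it recovers the standard error, z statistic, and two-sided p-value from each OR and CI. The function and variable names are illustrative only.

```python
import math

Z_95 = 1.959964  # standard normal quantile for a two-sided 95% interval

def p_from_or_ci(odds_ratio, ci_low, ci_high):
    """Recover (standard error, z, two-sided p) from an OR and its 95% Wald CI."""
    log_or = math.log(odds_ratio)
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * Z_95)
    z = log_or / se
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided tail area of the standard normal
    return se, z, p

# Values as reported in the Results above.
reported = {
    "right-sided tumor": (5.625, 1.147, 27.577),
    "tumor size": (11.019, 3.996, 30.38),
    "preoperative comorbidities": (7.918, 1.323, 47.412),
    "preoperative systolic blood pressure": (1.265, 1.07, 1.495),
}

for name, (odds_ratio, low, high) in reported.items():
    se, z, p = p_from_or_ci(odds_ratio, low, high)
    print(f"{name}: SE={se:.3f}, z={z:.2f}, p={p:.4f}")
```

Under the Wald assumption, the recovered p-values round to the reported ones (0.033, < 0.001, 0.023, and 0.006), so the figures are internally consistent. The wide interval for right-sided tumors reflects a large standard error on the log-odds scale.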
Conclusion
LRA was associated with a significantly higher incidence of intraoperative HDI than LLA. Right-sided PHEO was a risk factor for intraoperative HDI.