Water buffalo (Bubalus bubalis) is an important source of meat and milk in countries with relatively warm weather. Compared to the cattle genome, little has been done to reveal its genome structure and genomic traits. This is due to the complications stemming from the large genome size, the complexity of the genome, and the high repetitive content. In this paper, we introduce a high-quality draft assembly of the Egyptian water buffalo genome. The Egyptian breed is used as a dual-purpose animal (milk/meat). It is distinguished by its adaptability to the local environment and to changes in feed quality, as well as its high resistance to diseases. The genome assembly of the Egyptian water buffalo has been achieved using a reference-based assembly workflow. Our workflow significantly reduced the computational complexity of the assembly process, and improved the assembly quality by integrating different public resources. We also compared our assembly to the currently available draft assemblies of water buffalo breeds. A total of 21,128 genes were identified in the produced assembly. Milk virgin-related, milk pregnancy-related, milk lactation-related, milk involution-related, and milk mastitis-related genes were identified in the assembly. Our results will significantly contribute to a better understanding of the genetics of the Egyptian water buffalo, which will eventually support the ongoing breeding efforts and facilitate the future discovery of genes responsible for complex processes of dairy and meat production and disease resistance, among other significant traits.
Ion Torrent is one of the major next generation sequencing (NGS) technologies and it is frequently used in medical research and diagnosis. The built-in software for the Ion Torrent sequencing machines delivers the sequencing results in the BAM format. In addition to the usual SAM/BAM fields, the Ion Torrent BAM file includes technology-specific flow signal data. The flow signals occupy a large portion of the BAM file (about 75% for the human genome). Compressing SAM/BAM into CRAM format significantly reduces the space needed to store the NGS results. However, the tools for generating the CRAM format are not designed to handle the flow signals. This missing feature has motivated us to develop a new program to improve the compression of the Ion Torrent files for long term archiving.
In this paper, we present IonCRAM, the first reference-based compression tool to compress Ion Torrent BAM files for long term archiving. For the BAM files, IonCRAM could achieve a space saving of about 43%. This space saving is superior to what is achieved with the CRAM format by about 8-9%.
Reducing the space consumption of NGS data reduces the cost of storage and data transfer. Therefore, developing efficient compression software for clinical NGS data goes beyond computational interest, as it ultimately contributes to the overall cost reduction of the clinical test. The space saving achieved by our tool is a practical step in this direction. The tool is open source and available at Code Ocean, GitHub, and http://ioncram.saudigenomeproject.com.
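The space-saving figures above can be read as one minus the ratio of compressed size to original size. The following is a minimal sketch of that arithmetic, not IonCRAM's own code. The file sizes are hypothetical values chosen only for illustration, and it assumes that the "8-9%" gap is measured in percentage points.

```python
def space_saving(original_bytes: int, compressed_bytes: int) -> float:
    """Return the fraction of space saved relative to the original file."""
    if original_bytes <= 0:
        raise ValueError("original_bytes must be positive")
    return 1.0 - compressed_bytes / original_bytes


# Hypothetical sizes (not measurements from the paper).
bam_size = 100_000_000       # original Ion Torrent BAM
ioncram_size = 57_000_000    # assumed IonCRAM output size
cram_size = 65_500_000       # assumed plain CRAM output size

ioncram_saving = space_saving(bam_size, ioncram_size)
cram_saving = space_saving(bam_size, cram_size)
gap_points = (ioncram_saving - cram_saving) * 100

print(f"IonCRAM saving: {ioncram_saving:.1%}")
print(f"CRAM saving:    {cram_saving:.1%}")
print(f"Gap:            {gap_points:.1f} percentage points")
```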
Background
Molecular genetics techniques are an essential diagnostic tool for primary immunodeficiency diseases (PIDs). The use of next-generation sequencing (NGS) provides a comprehensive way of concurrently screening a large number of PID genes. However, its validity and cost-effectiveness require verification.
Objectives
We sought to identify and overcome complications associated with the use of NGS in a comprehensive gene panel incorporating 162 PID genes. We aimed to ascertain the specificity, sensitivity, and clinical sensitivity of the gene panel and its utility as a diagnostic tool for PIDs.
Methods
A total of 162 PID genes were screened in 261 patients by using the Ion Torrent Proton NGS sequencing platform. Of the 261 patients, 122 had at least 1 known causal mutation at the onset of the study and were used to assess the specificity and sensitivity of the assay. The remaining samples were from unsolved cases that were biased toward more phenotypically and genotypically complicated cases.
Results
The assay was able to detect the mutation in 117 (96%) of 122 positive control subjects with known causal mutations. For the unsolved cases, our assay resulted in a molecular genetic diagnosis for 35 of 139 patients. Interestingly, most of these cases represented atypical clinical presentations of known PIDs.
Conclusions
The targeted NGS PID gene panel is a sensitive and cost-effective diagnostic tool that can be used as a first-line molecular assay in patients with PIDs. The assay is an alternative choice to the complex and costly candidate gene approach, particularly for patients with atypical presentation of known PID genes.
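The reported proportions can be reproduced from the counts given in the abstract. This is a minimal sketch of that arithmetic; the function and variable names are illustrative, and "diagnostic yield" is used here as a label for the 35 of 139 figure rather than a term taken from the source.

```python
def proportion(hits: int, total: int) -> float:
    """Return hits / total, guarding against an empty denominator."""
    if total <= 0:
        raise ValueError("total must be positive")
    return hits / total


total_patients = 261
positive_controls = 122   # patients with a known causal mutation
unsolved_cases = 139      # remaining patients

# Sensitivity: known mutations that the assay detected.
sensitivity = proportion(117, positive_controls)

# Share of previously unsolved cases that received a molecular diagnosis.
diagnostic_yield = proportion(35, unsolved_cases)

print(f"Sensitivity:      {sensitivity:.1%}")
print(f"Diagnostic yield: {diagnostic_yield:.1%}")
```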
Over the past decade the workflow system paradigm has evolved as an efficient and user-friendly approach for developing complex bioinformatics applications. Two popular workflow systems that have gained acceptance by the bioinformatics community are Taverna and Galaxy. Each system has a large user-base and supports an ever-growing repository of application workflows. However, workflows developed for one system cannot be imported and executed easily on the other. The lack of interoperability is due to differences in the models of computation, workflow languages, and architectures of both systems. This lack of interoperability limits sharing of workflows between the user communities and leads to duplication of development efforts.
In this paper, we present Tavaxy, a stand-alone system for creating and executing workflows based on using an extensible set of re-usable workflow patterns. Tavaxy offers a set of new features that simplify and enhance the development of sequence analysis applications: It allows the integration of existing Taverna and Galaxy workflows in a single environment, and supports the use of cloud computing capabilities. The integration of existing Taverna and Galaxy workflows is supported seamlessly at both run-time and design-time levels, based on the concepts of hierarchical workflows and workflow patterns. The use of cloud computing in Tavaxy is flexible, where the users can either instantiate the whole system on the cloud, or delegate the execution of certain sub-workflows to the cloud infrastructure.
Tavaxy reduces the workflow development cycle by introducing the use of workflow patterns to simplify workflow creation. It enables the re-use and integration of existing (sub-) workflows from Taverna and Galaxy, and allows the creation of hybrid workflows. Its additional features exploit recent advances in high performance cloud computing to cope with the increasing data size and complexity of analysis. The system can be accessed either through a cloud-enabled web-interface or downloaded and installed to run within the user's local environment. All resources related to Tavaxy are available at http://www.tavaxy.org.
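The idea of hierarchical workflows, where a sub-workflow is nested inside a parent and can be delegated to a different execution target, can be sketched in a few lines. This is an illustrative toy model only, not Tavaxy's API or workflow language. The names `Task`, `SequencePattern`, and `target` are hypothetical, and the "cloud" target is just a label recorded in a log.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple, Union


@dataclass
class Task:
    """A single processing step (hypothetical name)."""
    name: str
    run: Callable[[str], str]


@dataclass
class SequencePattern:
    """Hypothetical 'sequence' pattern: run steps one after another.

    A step may itself be a SequencePattern, which gives a hierarchical
    workflow. `target` only labels where the steps would run.
    """
    name: str
    steps: List[Union[Task, "SequencePattern"]]
    target: str = "local"

    def execute(self, data: str, log: List[Tuple[str, str]]) -> str:
        for step in self.steps:
            if isinstance(step, SequencePattern):
                # Delegate the nested sub-workflow with its own target.
                data = step.execute(data, log)
            else:
                log.append((step.name, self.target))
                data = step.run(data)
        return data


# A sub-workflow marked for cloud execution, nested in a local parent.
sub = SequencePattern("normalize", [Task("upper", str.upper)], target="cloud")
main = SequencePattern(
    "main",
    [Task("strip", str.strip), sub, Task("reverse", lambda s: s[::-1])],
)

log: List[Tuple[str, str]] = []
result = main.execute("  acgt ", log)
print(result)
print(log)
```

In this toy model the parent workflow does not need to know how the nested sub-workflow is built, which is the property that lets imported sub-workflows be reused as single nodes.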
Abstract
Context
Pediatric differentiated thyroid cancer (DTC) differs from adult DTC in its underlying genetics and clinicopathological features. In this report, we studied these aspects in 48 cases of pediatric DTC.
Patients and Methods
We used the comprehensive Oncomine Childhood Cancer Gene panel on the Ion Torrent next-generation sequencing platform. We included 48 patients (37 girls and 11 boys) with pediatric DTC (median age 17 years; range, 5-18 years) and studied the association between the detected genetic alterations and the clinicopathological features and outcome.
Results
Of 48 tumors, 33 (69%) had somatic genetic alterations that were mutually exclusive except in one tumor. BRAFV600E and RET-PTC1 were the most common, occurring in 9 different tumors (19%) each. RET-PTC3 and ETV6-NTRK3 were the next most common, with each occurring in 4 different tumors (8%). Other genetic alterations including EML4-NTRK1, EML4-ALK, NRAS, KRAS, PTEN, and CREBBP occurred once each. There were no differences between those who had mutations and those without mutations with respect to age, sex, tumor multifocality, extrathyroidal extension, vascular invasion, lymph node or distant metastasis, and American Thyroid Association response to therapy status at the last follow-up visits. Similarly, none of these factors was different between those with fusion genes vs single-point mutations vs no mutations.
Conclusions
In pediatric DTC, fusion genes are more common than single-point mutations. The most common genetic alterations are RET-PTC1, BRAFV600E, RET-PTC3, and ETV6-NTRK3. Other alterations occur rarely. Genetic alterations do not correlate with the clinicopathological features or the outcome.
Our knowledge of disease genes in neurological disorders is incomplete. With the aim of closing this gap, we performed whole-exome sequencing on 143 multiplex consanguineous families in whom known disease genes had been excluded by autozygosity mapping and candidate gene analysis. This prescreening step led to the identification of 69 recessive genes not previously associated with disease, of which 33 are here described (SPDL1, TUBA3E, INO80, NID1, TSEN15, DMBX1, CLHC1, C12orf4, WDR93, ST7, MATN4, SEC24D, PCDHB4, PTPN23, TAF6, TBCK, FAM177A1, KIAA1109, MTSS1L, XIRP1, KCTD3, CHAF1B, ARV1, ISCA2, PTRH2, GEMIN4, MYOCD, PDPR, DPH1, NUP107, TMEM92, EPB41L4A, and FAM120AOS). We also encountered instances in which the phenotype departed significantly from the established clinical presentation of a known disease gene. Overall, a likely causal mutation was identified in >73% of our cases. This study contributes to the global effort toward a full compendium of disease genes affecting brain function.
• Multiplex consanguineous families are rich sources for novel gene discovery
• Prescreening these families for known disease genes accelerates gene discovery
• 33 novel candidate genes are reported in this study
Using whole-exome sequencing on prescreened multiplex consanguineous families, Alazami et al. describe the identification of 33 novel candidate genes for various neurogenetic conditions. Such families are rich sources for novel gene discovery.
Abstract
The SARS-CoV-2 outbreak was detected in Wuhan in December 2019 and has since become a worldwide pandemic. This study aims to investigate the mutations in the sequence of the SARS-CoV-2 genome and to characterize the mutation patterns in Egyptian COVID-19 patients during different waves of infection. Samples were collected from 250 COVID-19 patients, and whole genome sequencing was conducted using Next Generation Sequencing. The viral sequence analysis showed 1115 different genome mutations across the Egyptian samples in the second wave, including 613 missense mutations, 431 synonymous mutations, 25 upstream gene mutations, 24 downstream gene mutations, 10 frame-shift deletions, and 6 stop-gained mutations. The Egyptian genomic strains sequenced in the second wave of infection differ from those of the first wave. We observe a shift of lineage prevalence from the strain B.1 to B.1.1.1. Only one case was of the new English lineage B.1.1.7. A few samples have one or two mutations of interest from the Brazil and South Africa isolates. The new clade 20B appeared by March 2020, and 20D appeared by May 2020 and continued until January 2021.
Most autosomal recessive diseases are rare, but they collectively account for a substantial proportion of disease burden, especially in consanguineous populations. Estimation of this disease burden, however, is hampered by many factors, including lack of countrywide registries. Establishing carrier frequency can be a practical surrogate to estimate disease burden, although the requirement of a large representative cohort may be challenging.
We propose that the application of clinical genomics in the diagnostic setting offers a unique opportunity to estimate carrier frequency in the population as a secondary benefit.
We used a data set of ~7,100 patients who underwent genomic testing for various Mendelian disorders to estimate the carrier frequency.
We were able to calculate the frequency of 259 confirmed founder recessive mutations. We found the corresponding disease burden to be, at minimum, ~7 per 1,000 children born to first-cousin parents, with disorders related to intellectual disability and vision impairment being the most common.
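One textbook way to go from carrier frequencies to a burden estimate in the offspring of related parents uses the inbreeding coefficient F: for a recessive allele with frequency q, the probability that a child is homozygous is F·q + (1 − F)·q². The sketch below applies that standard formula with F = 1/16 for first-cousin parents. It is offered under stated assumptions and is not necessarily the calculation used in the study: the carrier frequencies are hypothetical, q is approximated as half the carrier frequency (valid for rare alleles), and summing across mutations assumes the disorders are rare and independent.

```python
from typing import Iterable

F_FIRST_COUSIN = 1 / 16  # inbreeding coefficient for offspring of first cousins


def allele_frequency(carrier_frequency: float) -> float:
    """Approximate allele frequency q from carrier frequency (rare-allele assumption)."""
    return carrier_frequency / 2


def affected_probability(q: float, f: float = F_FIRST_COUSIN) -> float:
    """Probability that a child is homozygous for an allele of frequency q."""
    return f * q + (1 - f) * q * q


def burden_per_1000(carrier_frequencies: Iterable[float],
                    f: float = F_FIRST_COUSIN) -> float:
    """Summed expected affected births per 1,000 across independent rare mutations."""
    return 1000 * sum(
        affected_probability(allele_frequency(c), f) for c in carrier_frequencies
    )


# Hypothetical carrier frequencies for three founder mutations.
example_carriers = [0.01, 0.005, 0.002]
print(f"{burden_per_1000(example_carriers):.3f} per 1,000 births")
```

Setting `f=0` recovers the random-mating expectation q², which shows how much of the estimated burden is attributable to parental relatedness.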
Our approach can be utilized to inform the design of new policies for the prevention of genetic disorders and highlights an important secondary benefit of clinical genomics.
Genet Med 18(12), 1244–1249.
Ciliopathies are clinically diverse disorders of the primary cilium. Remarkable progress has been made in understanding the molecular basis of these genetically heterogeneous conditions; however, our knowledge of their morbid genome, pleiotropy, and variable expressivity remains incomplete.
We applied genomic approaches on a large patient cohort of 371 affected individuals from 265 families, with phenotypes that span the entire ciliopathy spectrum. Likely causal mutations in previously described ciliopathy genes were identified in 85% (225/265) of the families, adding 32 novel alleles. Consistent with a fully penetrant model for these genes, we found no significant difference in their "mutation load" beyond the causal variants between our ciliopathy cohort and a control non-ciliopathy cohort. Genomic analysis of our cohort further identified mutations in a novel morbid gene TXNDC15, encoding a thiol isomerase, based on independent loss of function mutations in individuals with a consistent ciliopathy phenotype (Meckel-Gruber syndrome) and a functional effect of its deficiency on ciliary signaling. Our study also highlighted seven novel candidate genes (TRAPPC3, EXOC3L2, FAM98C, C17orf61, LRRCC1, NEK4, and CELSR2) some of which have established links to ciliogenesis. Finally, we show that the morbid genome of ciliopathies encompasses many founder mutations, the combined carrier frequency of which accounts for a high disease burden in the study population.
Our study increases our understanding of the morbid genome of ciliopathies. We also provide the strongest evidence, to date, in support of the classical Mendelian inheritance of Bardet-Biedl syndrome and other ciliopathies.