The Southern African Human Genome Programme is a national initiative that aspires to unlock the unique genetic character of southern African populations for a better understanding of human genetic diversity. In this pilot study the Southern African Human Genome Programme characterizes the genomes of 24 individuals (8 Coloured and 16 black southeastern Bantu-speakers) using deep whole-genome sequencing. A total of ~16 million unique variants are identified. Despite the shallow time depth since divergence between the two main southeastern Bantu-speaking groups (Nguni and Sotho-Tswana), principal component analysis and structure analysis reveal significant (p < 10⁻⁶) differentiation, and FST analysis identifies regions with high divergence. The Coloured individuals show evidence of varying proportions of admixture with Khoesan, Bantu-speakers, Europeans, and populations from the Indian sub-continent. Whole-genome sequencing data reveal extensive genomic diversity, increasing our understanding of the complex and region-specific history of African populations and highlighting its potential impact on biomedical research and genetic susceptibility to disease.
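As a rough illustration of the fixation-index statistic mentioned above, the following minimal sketch computes Wright's F_ST for a single biallelic SNP from the allele frequencies of two populations. The abstract does not say which estimator was used, so the equal-weight heterozygosity formula and the function name `fst_two_populations` are assumptions made for illustration only.

```python
def fst_two_populations(p1: float, p2: float) -> float:
    """Wright's F_ST for one biallelic SNP, assuming two equally weighted populations.

    p1, p2: frequency of the same allele in population 1 and population 2.
    """
    p_bar = (p1 + p2) / 2.0
    # Expected heterozygosity of the pooled population.
    h_total = 2.0 * p_bar * (1.0 - p_bar)
    if h_total == 0.0:
        # Site is monomorphic in both populations: no differentiation to measure.
        return 0.0
    # Mean expected heterozygosity within the two populations.
    h_within = (2.0 * p1 * (1.0 - p1) + 2.0 * p2 * (1.0 - p2)) / 2.0
    return (h_total - h_within) / h_total
```

Identical frequencies give 0 and fixed differences give 1; genome scans typically average such per-SNP values over windows to flag highly divergent regions.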
A chronic inflammatory state to a large extent explains sickle cell disease (SCD) pathophysiology. Nonetheless, the principal dysregulated factors affecting this major pathway and their mechanisms of action have yet to be fully identified and elucidated. Integrating gene expression and genome-wide association study (GWAS) data analysis represents a novel approach to refining the identification of key mediators and functions in complex diseases. Here, we performed a gene expression meta-analysis of five independent publicly available microarray datasets related to homozygous SS patients with SCD to identify a consensus SCD transcriptomic profile. The meta-analysis, conducted using the MetaDE R package and based on combining p values (maxP approach), identified 335 differentially expressed genes (DEGs; 224 upregulated and 111 downregulated). Functional gene set enrichment revealed the importance of several metabolic pathways, innate immune responses, erythrocyte development, and hemostasis pathways. Advanced analyses of GWAS data generated within the framework of this study, by means of the atSNP R package and the SIFT tool, identified 60 regulatory single-nucleotide polymorphisms (rSNPs) occurring in the promoters of 20 DEGs and a deleterious SNP affecting CAMKK2 protein function. This novel database of candidate genes, transcription factors, and rSNPs associated with SCD provides new markers that may help to identify new therapeutic targets.
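The maxP approach named above can be sketched in a few lines. This is the textbook form of the method, not a reproduction of the MetaDE implementation: for each gene, the largest of the K per-study p values is taken as the test statistic, and under the null hypothesis of K independent uniform p values it has a Beta(K, 1) distribution, so the combined p value is that maximum raised to the power K. The function name `maxp_combine` is hypothetical.

```python
from typing import Sequence


def maxp_combine(p_values: Sequence[float]) -> float:
    """Combine independent per-study p values for one gene with the maxP rule.

    Under the null, P(max of K uniform p values <= x) = x**K, so the
    combined p value is max(p_values) ** K.
    """
    k = len(p_values)
    if k == 0:
        raise ValueError("at least one p value is required")
    if any(p < 0.0 or p > 1.0 for p in p_values):
        raise ValueError("p values must lie in [0, 1]")
    return max(p_values) ** k
```

Because the statistic is driven by the weakest study, a gene is called significant only when it shows evidence in every dataset, which suits the search for a consensus profile.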
The Pan-African bioinformatics network, H3ABioNet, comprises 27 research institutions in 17 African countries. H3ABioNet is part of the Human Heredity and Health in Africa program (H3Africa), an African-led research consortium funded by the US National Institutes of Health and the UK Wellcome Trust, aimed at using genomics to study and improve the health of Africans. A key role of H3ABioNet is to support H3Africa projects by building bioinformatics infrastructure such as portable and reproducible bioinformatics workflows for use on heterogeneous African computing environments. Processing and analysis of genomic data is an example of a big data application requiring complex interdependent data analysis workflows. Such bioinformatics workflows take the primary and secondary input data through several computationally intensive processing steps using different software packages, where some of the outputs form inputs for other steps. Implementing scalable, reproducible, portable and easy-to-use workflows is particularly challenging.
H3ABioNet has built four workflows to support (1) the calling of variants from high-throughput sequencing data; (2) the analysis of microbial populations from 16S rDNA sequence data; (3) genotyping and genome-wide association studies; and (4) single nucleotide polymorphism imputation. A week-long hackathon was organized in August 2016 with participants from six African bioinformatics groups, and US and European collaborators. Two of the workflows are built using the Common Workflow Language framework (CWL) and two using Nextflow. All the workflows are containerized for improved portability and reproducibility using Docker, and are publicly available for use by members of the H3Africa consortium and the international research community.
The H3ABioNet workflows have been implemented with a view to offering ease of use for the end user and high levels of reproducibility and portability, while following modern state-of-the-art bioinformatics data processing protocols. The H3ABioNet workflows will service the H3Africa consortium projects and are currently in use. All four workflows are also publicly available for research scientists worldwide to use and adapt for their respective needs. The H3ABioNet workflows will help develop bioinformatics capacity and assist genomics research within Africa and serve to increase the scientific output of H3Africa and its Pan-African Bioinformatics Network.
Human genomic data are large and complex, and require adequate infrastructure for secure storage and transfer. The NIH and The Wellcome Trust have funded multiple projects on genomic research, including the Human Heredity and Health in Africa (H3Africa) initiative, and data are required to be deposited into the public domain. The European Genome-phenome Archive (EGA) is a repository for sequence and genotype data where the data access is controlled by access committees. Access is determined by a formal application procedure for the purpose of secure storage and distribution, and must be in line with the informed consent of the study participants. H3Africa researchers based in Africa and generating their own data can benefit tremendously from the data sharing capabilities of the internet by using the appropriate technologies. The H3Africa Data Archive is an effort between the H3Africa data generating projects, H3ABioNet and the EGA to store and submit genomic data to public repositories. H3ABioNet maintains the security of the H3Africa Data Archive, ensures ethical security compliance, supports users with data submission and facilitates the data transfer. The goal is to ensure efficient data flow between researchers, the archive and the EGA or other public repositories. To comply with the H3Africa data sharing and release policy, nine months after the data are in secure storage, H3ABioNet converts the data into an XML format ready for submission to EGA. This article describes the infrastructure that has been developed for African human genomic data management. Keywords: genomic data, data archive, h3africa data, african genomic data
Genomics data are currently being produced at unprecedented rates, resulting in increased knowledge discovery and submission to public data repositories. Despite these advances, genomic information on African-ancestry populations remains scarce compared with that on European- and Asian-ancestry populations. This information is typically fragmented across several different biomedical data repositories, which often lack sufficient fine-grained structure and annotation to account for the diversity of African populations, leading to many challenges related to the retrieval, representation and findability of such information. To overcome these challenges, we developed the African Genomic Medicine Portal (AGMP), a database that contains metadata on genomic medicine studies conducted on African-ancestry populations. The metadata are curated from two public databases related to genomic medicine, PharmGKB and DisGeNET. The metadata retrieved from these source databases were limited to genomic variants that were associated with disease aetiology or treatment in the context of African-ancestry populations. Over 2000 variants relevant to populations of African ancestry were retrieved. Subsequently, domain experts curated and annotated additional information associated with the studies that reported the variants, including geographical origin, ethnolinguistic group, level of association significance and other relevant study information, such as study design and sample size, where available. The AGMP functions as a dedicated resource through which to access African-specific information on genomics as applied to health research, through querying variants, genes, diseases and drugs. The portal and its corresponding technical documentation, implementation code and content are publicly available.
We present two web-based components for the display of Protein-Protein Interaction networks using different self-organizing layout methods: force-directed and circular. These components conform to the BioJS standard and can be rendered in an HTML5-compliant browser without the need for third-party plugins. We provide examples of interaction networks and how the components can be used to visualize them, and refer to a more complex tool that uses these components.
Availability:
http://github.com/biojs/biojs;
http://dx.doi.org/10.5281/zenodo.7753
Measles virus (MV) causes T cell suppression by interference with phosphatidylinositol-3-kinase (PI3K) activation. We previously found that this interference affected the activity of splice regulatory proteins and a T cell inhibitory protein isoform was produced from an alternatively spliced pre-mRNA.
Differentially regulated and alternatively spliced variant transcripts accumulating in response to PI3K abrogation in T cells potentially encode proteins involved in T cell silencing.
To test this hypothesis at the cellular level, we performed Human Exon 1.0 ST Array analysis on RNAs isolated from T cells that were stimulated only or stimulated after PI3K inhibition. We developed a simple algorithm based on a splicing index to detect genes that undergo alternative splicing (AS) or are differentially regulated (RG) upon T cell suppression.
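The abstract does not spell out the algorithm, so the following is only a minimal sketch of how a splicing-index classification of this kind is commonly set up. It assumes the usual definition of the splicing index (log2 ratio of gene-normalised exon signal between two conditions); the function names, the input layout, and both cutoffs of 1.0 are hypothetical and not taken from the study.

```python
import math
from typing import Sequence, Set


def splicing_index(exon_a: float, gene_a: float, exon_b: float, gene_b: float) -> float:
    """log2 ratio of gene-normalised exon signal in condition A versus condition B."""
    return math.log2((exon_a / gene_a) / (exon_b / gene_b))


def classify_gene(
    exons_a: Sequence[float],
    exons_b: Sequence[float],
    gene_a: float,
    gene_b: float,
    si_cutoff: float = 1.0,  # hypothetical threshold on |splicing index|
    fc_cutoff: float = 1.0,  # hypothetical threshold on |log2 gene fold change|
) -> Set[str]:
    """Label a gene as AS, RG, both, or neither from positive probe intensities."""
    labels: Set[str] = set()
    # AS: at least one exon changes relative to the gene-level signal.
    if any(
        abs(splicing_index(ea, gene_a, eb, gene_b)) >= si_cutoff
        for ea, eb in zip(exons_a, exons_b)
    ):
        labels.add("AS")
    # RG: the gene-level signal itself changes between conditions.
    if abs(math.log2(gene_a / gene_b)) >= fc_cutoff:
        labels.add("RG")
    return labels
```

Normalising each exon by its gene-level signal separates exon-specific changes from whole-gene expression changes, which is why a gene can carry both labels in such a scheme.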
Applying our algorithm to the data, 9% of the genes were assigned as AS, while only 3% were attributed to RG. Though there are overlaps, AS and RG genes differed with regard to functional regulation, and were found to be enriched in different functional groups. AS genes targeted extracellular matrix (ECM)-receptor interaction and focal adhesion pathways, while RG genes were mainly enriched in cytokine-receptor interaction and Jak-STAT. When combined, AS/RG dependent alterations targeted pathways essential for T cell receptor signaling, cytoskeletal dynamics and cell cycle entry.
PI3K abrogation interferes with key T cell activation processes through both differential expression and alternative splicing, which together actively contribute to T cell suppression.
Multiple factors underlie susceptibility to essential hypertension, including a significant genetic and ethnic component, and environmental effects. Blood pressure response of hypertensive individuals to salt is heterogeneous, but salt sensitivity appears more prevalent in people of indigenous African origin. The underlying genetics of salt-sensitive hypertension, however, are poorly understood. In this study, computational methods including text- and data-mining have been used to select and prioritize candidate aetiological genes for salt-sensitive hypertension. Additionally, we have compared allele frequencies and copy number variation for single nucleotide polymorphisms in candidate genes between indigenous Southern African and Caucasian populations, with the aim of identifying candidate genes with significant variability between the population groups. Identifying genetic variability between population groups can exploit ethnic differences in disease prevalence to aid the prioritisation of good candidate genes. Our top-ranking candidate genes include parathyroid hormone precursor (PTH) and type-1 angiotensin II receptor (AGTR1). We propose that the candidate genes identified in this study warrant further investigation as potential aetiological genes for salt-sensitive hypertension.
Admixed populations present unique opportunities to discover the genetic factors underlying many multifactorial diseases. The geographical position and complex history of South Africa have led to the establishment of the unique admixed population known as the South African Coloured. Not much is known about the genetic make-up of this population, and the historical record is patchy. We genotyped 959 individuals from the Western Cape area, self-identified as belonging to this population, using the Affymetrix 500k genotyping platform. This resulted in nearly 75,000 autosomal SNPs that could be compared with populations represented in the International HapMap Project and the Human Genome Diversity Project. Analysis by means of both the admixture and linkage models in STRUCTURE revealed that the major ancestral components of this population are predominantly Khoesan (32-43%), Bantu-speaking Africans (20-36%), European (21-28%) and a smaller Asian contribution (9-11%), depending on the model used. This is consistent with historical data. While of great historical and genealogical interest, this information is also essential for future admixture mapping of disease genes in this population.
The application of genomics technologies to medicine and biomedical research is increasing in popularity, made possible by new high-throughput genotyping and sequencing technologies and improved data analysis capabilities. Some of the greatest genetic diversity among humans, animals, plants, and microbiota occurs in Africa, yet genomic research outputs from the continent are limited. The Human Heredity and Health in Africa (H3Africa) initiative was established to drive the development of genomic research for human health in Africa, and through recognition of the critical role of bioinformatics in this process, spurred the establishment of H3ABioNet, a pan-African bioinformatics network for H3Africa. The limitations in bioinformatics capacity on the continent have been a major contributory factor to the lack of notable outputs in high-throughput biology research. Although pockets of high-quality bioinformatics teams have existed previously, the majority of research institutions lack experienced faculty who can train and supervise bioinformatics students. H3ABioNet aims to address this dire need, specifically in the area of human genetics and genomics, but knock-on effects are ensuring this extends to other areas of bioinformatics. Here, we describe the emergence of genomics research and the development of bioinformatics in Africa through H3ABioNet.