The inference of genetic ancestry plays an increasingly prominent role in clinical, population, and forensic genetics studies. Several genotyping strategies and analytical methodologies have been developed over the last few decades to assign individuals to specific biogeographic regions. However, despite these efforts, ancestry inference in populations with a recent history of admixture, such as those in Brazil, remains a challenge. In admixed populations, the proportions and components of genetic ancestry vary on different levels: (i) between populations; (ii) between individuals of the same population; and (iii) throughout the individual's genome. The present study evaluated 1171 admixed Brazilian samples to compare the genetic ancestry inferred by tri-/tetra-hybrid admixture models and evaluated different marker sets, from panels with small numbers of ancestry informative markers (AIMs) to high-density SNP (HDSNP) and whole-genome sequence (WGS) data. Analyses revealed greater variation in the correlation coefficient of ancestry components within and between admixed populations, especially for minority ancestral components. We also observed a positive correlation between the number of markers in the AIMs panel and HDSNP/WGS. Furthermore, the greater the number of markers, the more accurate the tri-/tetra-hybrid admixture models.
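As an illustration of the kind of comparison described above, the following minimal sketch computes a Pearson correlation coefficient between ancestry proportions estimated from two marker sets. The function name `pearson_r` and all numeric values are hypothetical and chosen only for illustration; the abstract does not specify which correlation statistic or software was used.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sum of cross-products and sums of squares around the means.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


# Made-up per-individual proportions of one ancestry component,
# estimated once from a small AIMs panel and once from WGS data.
aims_panel = [0.10, 0.25, 0.40, 0.05, 0.60]
wgs = [0.12, 0.22, 0.43, 0.08, 0.58]

r = pearson_r(aims_panel, wgs)
print(f"r = {r:.3f}")
```

Repeating this calculation per ancestry component and per population would show where small panels and dense data disagree most, which the abstract reports to be the minority components.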
Limb-girdle muscular dystrophies (LGMD) are a heterogeneous group of genetically determined muscle disorders with a primary or predominant involvement of the pelvic or shoulder girdle musculature. More than 20 genes with autosomal recessive (LGMD2A to LGMD2Q) and autosomal dominant inheritance (LGMD1A to LGMD1H) have been mapped/identified to date. Mutations are known for six among the eight mapped autosomal dominant forms: LGMD1A (myotilin), LGMD1B (lamin A/C), LGMD1C (caveolin-3), LGMD1D (desmin), LGMD1E (DNAJB6), and more recently for LGMD1F (transportin-3). Our group previously mapped the LGMD1G gene at 4q21 in a Caucasian-Brazilian family. We have now mapped a Uruguayan family with patients displaying a similar LGMD1G phenotype at the same locus. Whole genome sequencing identified, in both families, mutations in the HNRPDL gene. HNRPDL is a heterogeneous ribonucleoprotein family member, which participates in mRNA biogenesis and metabolism. Functional studies performed in S. cerevisiae showed that the loss of HRP1 (yeast orthologue) had pronounced effects on both protein levels and cellular localization, and analysis of the yeast proteome revealed dramatic reorganization of proteins involved in RNA-processing pathways. In vivo analysis showed that hnrpdl is important for muscle development in zebrafish, causing a myopathic phenotype when knocked down. The present study presents a novel association between a muscular disorder and an RNA-related gene and reinforces the importance of RNA binding/processing proteins in muscle development and muscle disease. Understanding the role of these proteins in muscle might open new therapeutic approaches for muscular dystrophies.
As whole-genome sequencing (WGS) becomes the gold standard tool for studying population genomics and medical applications, data on diverse non-European and admixed individuals are still scarce. Here, we present a high-coverage WGS dataset of 1,171 highly admixed elderly Brazilians from a census-based cohort, providing over 76 million variants, of which ~2 million are absent from large public databases. WGS enables identification of ~2,000 previously undescribed mobile element insertions, nearly 5 Mb of genomic segments absent from the human genome reference, and over 140 alleles from HLA genes absent from public resources. We reclassify and curate pathogenicity assertions for nearly four hundred variants in genes associated with dominantly-inherited Mendelian disorders and calculate the incidence for selected recessive disorders, demonstrating the clinical usefulness of the present study. Finally, we observe that whole-genome and HLA imputation could be significantly improved compared to available datasets since rare variation represents the largest proportion of input from WGS. These results demonstrate that even smaller sample sizes of underrepresented populations bring relevant data for genomic studies, especially when exploring analyses allowed only by WGS.
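The abstract does not state how incidence was calculated for recessive disorders. One common approach, sketched here under the assumption of Hardy-Weinberg equilibrium, pools the frequencies of all pathogenic alleles in a gene into a single value q and takes q² as the expected proportion of affected individuals. The function names and allele frequencies below are hypothetical.

```python
def recessive_incidence(pathogenic_allele_freqs):
    """Expected incidence of a recessive disorder under Hardy-Weinberg equilibrium.

    q is the combined frequency of all pathogenic alleles in the gene.
    Affected individuals carry two pathogenic alleles (homozygous or
    compound heterozygous), which is expected with probability q**2.
    """
    q = sum(pathogenic_allele_freqs)
    if not 0.0 <= q <= 1.0:
        raise ValueError("combined allele frequency must lie in [0, 1]")
    return q * q


def carrier_frequency(pathogenic_allele_freqs):
    """Expected proportion of heterozygous carriers, 2 * q * (1 - q)."""
    q = sum(pathogenic_allele_freqs)
    if not 0.0 <= q <= 1.0:
        raise ValueError("combined allele frequency must lie in [0, 1]")
    return 2.0 * q * (1.0 - q)


# Made-up frequencies of two pathogenic alleles in one gene.
freqs = [0.004, 0.001]
incidence = recessive_incidence(freqs)
carriers = carrier_frequency(freqs)
print(f"incidence ~ 1 in {round(1 / incidence):,}")
print(f"carriers  ~ 1 in {round(1 / carriers):,}")
```

The estimate is only as good as the equilibrium assumption and the pathogenicity classification of the alleles that enter q, which is one reason curated assertions matter for this kind of calculation.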
The world population is getting older and studies aiming to enhance our comprehension of the underlying mechanisms responsible for health span are of utmost interest for longevity and as a measure for health care. In this review, we summarized previous genome-wide association studies (GWAS) and next-generation sequencing (NGS) studies of elderly cohorts. We also present the updated hypothesis for the aging process, together with the factors associated with healthy aging. We discuss the relevance of studying older individuals and building databanks to characterize the presence of and resistance against late-onset disorders. The identification of about 2 million novel variants in our cohort of more than 1000 elderly Brazilians illustrates the importance of studying highly admixed populations of non-European ancestry. Finally, the ascertainment of nonagenarians and particularly of centenarians who recovered from COVID-19 or remained asymptomatic opens new avenues of research aiming to enhance our comprehension of biological mechanisms associated with resistance against pathogens.
Research in the field of pharmacogenomics (PGx) aims to identify genetic variants that modulate response to drugs, through alterations in their pharmacokinetics (PK) or pharmacodynamics (PD). The distribution of PGx variants differs considerably among populations, and whole-genome sequencing (WGS) plays a major role as a comprehensive approach to detect both common and rare variants. This study evaluated the frequency of PGx markers in the context of the Brazilian population, using data from a population-based admixed cohort from Sao Paulo, Brazil, which includes variants from WGS of 1,171 unrelated, elderly individuals.
The Stargazer tool was used to call star alleles and structural variants (SVs) from 38 pharmacogenes. Clinically relevant variants were investigated, and the predicted drug response phenotype was analyzed in combination with the medication record to assess individuals potentially at high-risk of gene-drug interaction.
In total, 352 unique star alleles or haplotypes were observed, of which 255 and 199 had a frequency < 0.05 and < 0.01, respectively. For star alleles with frequency > 5% (n = 97), decreased-function, loss-of-function, and unknown-function alleles or haplotypes accounted for 13.4%, 8.2%, and 27.8%, respectively. Structural variants (SVs) were identified in 35 genes for at least one individual, and occurred with frequencies > 5% for CYP2D6, CYP2A6, GSTM1, and UGT2B17. Overall, 98.0% of the individuals carried at least one high-risk genotype-predicted phenotype in pharmacogenes with PharmGKB level of evidence 1A for drug interaction. The Electronic Health Record (EHR) Priority Result Notation and the cohort medication registry were combined to assess high-risk gene-drug interactions. In general, 42.0% of the cohort used at least one PharmGKB evidence level 1A drug, and 18.9% of individuals who used PharmGKB evidence level 1A drugs had a genotype-predicted phenotype of high-risk gene-drug interaction.
This study describes the applicability of next-generation sequencing (NGS) techniques for translating PGx variants into clinically relevant phenotypes on a large scale in the Brazilian population and explores the feasibility of systematic adoption of PGx testing in Brazil.
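The frequency tallies reported in this abstract (alleles below 5% and below 1%) can be illustrated with a small sketch that derives star-allele frequencies from per-individual diplotype calls. The input format (one pair of allele names per individual), the function names, and the example calls are all hypothetical; the sketch does not reproduce Stargazer's actual output format or the authors' pipeline.

```python
from collections import Counter


def star_allele_frequencies(diplotypes):
    """Frequency of each star allele among all chromosomes of one gene.

    `diplotypes` is a list of (allele1, allele2) tuples, one per individual,
    so the denominator is twice the number of individuals.
    """
    counts = Counter(allele for pair in diplotypes for allele in pair)
    total_chromosomes = 2 * len(diplotypes)
    return {allele: n / total_chromosomes for allele, n in counts.items()}


def count_below(freqs, threshold):
    """Number of distinct alleles whose frequency is below `threshold`."""
    return sum(1 for f in freqs.values() if f < threshold)


# Made-up diplotype calls for a single pharmacogene in five individuals.
calls = [("*1", "*1"), ("*1", "*2"), ("*1", "*4"), ("*2", "*1"), ("*1", "*1")]
freqs = star_allele_frequencies(calls)

for allele, f in sorted(freqs.items()):
    print(allele, f)
print("alleles below 15%:", count_below(freqs, 0.15))
```

In a cohort of 1,171 individuals the denominator per gene would be 2,342 chromosomes, so a single observed copy of an allele already corresponds to a frequency of roughly 0.04%.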
Approximately 5% of the human genome shows common structural variation, which is enriched for genes involved in the immune response and cell-cell interactions. A well-established region of extensive structural variation is the glycophorin gene cluster, comprising three tandemly-repeated regions about 120 kb in length and carrying the highly homologous genes GYPA, GYPB and GYPE. Glycophorin A (encoded by GYPA) and glycophorin B (encoded by GYPB) are glycoproteins present at high levels on the surface of erythrocytes, and they have been suggested to act as decoy receptors for viral pathogens. They are receptors for the invasion of the protist parasite Plasmodium falciparum, a causative agent of malaria. A particular complex structural variant, called DUP4, creates a GYPB-GYPA fusion gene known to confer resistance to malaria. Many other structural variants exist across the glycophorin gene cluster, and they remain poorly characterised.
Here, we analyse sequences from 3234 diploid genomes from across the world for structural variation at the glycophorin locus, confirming 15 variants in the 1000 Genomes project cohort, discovering 9 new variants, and characterising a selection of these variants using fibre-FISH and breakpoint mapping at the sequence level. We identify variants predicted to create novel fusion genes and a common inversion duplication variant at appreciable frequencies in West Africans. We show that almost all variants can be explained by non-allelic homologous recombination and, by comparing the structural variant breakpoints with recombination hotspot maps, confirm the importance of a particular meiotic recombination hotspot on structural variant formation in this region.
We identify and validate large structural variants in the human glycophorin A-B-E gene cluster which may be associated with different clinical aspects of malaria.
Genetic evaluation has been recognized as an important tool to elucidate the causes of growth disorders.
To investigate the cause of short stature and to determine the phenotype of patients with IHH mutations, including the response to recombinant human growth hormone (rhGH) therapy.
We studied 17 families with autosomal-dominant short stature by using whole exome sequencing and screened IHH defects in 290 patients with growth disorders. Molecular analyses were performed to evaluate the potential impact of N-terminal IHH variants.
We identified 10 pathogenic or possibly pathogenic variants in IHH, an important regulator of endochondral ossification. Molecular analyses revealed a smaller potential energy of mutated IHH molecules. The allele frequency of rare IHH variants predicted to be deleterious was higher in short-stature samples (1.6%) than in two control cohorts (0.017% and 0.08%; P < 0.001). Identified IHH variants segregate with short stature in a dominant inheritance pattern. Affected individuals typically manifest mild disproportional short stature with a frequent finding of shortening of the middle phalanx of the fifth finger. None of them have classic features of brachydactyly type A1, which was previously associated with IHH mutations. Five patients heterozygous for IHH variants had a good response to rhGH therapy. The mean change in height standard deviation score in 1 year was 0.6.
Our study demonstrated the association of pathogenic variants in IHH with short stature with nonspecific skeletal abnormalities and established a frequent cause of growth disorder, with a preliminary good response to rhGH.
• Delayed or insufficient humoral immune response to SARS-CoV-2 in patients with Turner syndrome (TS).
• Lower interferon-γ production in volunteers with TS after stimulation with toll-like receptors 7/8 agonists.
• Higher cytotoxic activity by cluster of differentiation 8+ and natural killer cells after phorbol myristate acetate (PMA)/ionomycin stimuli in TS.
The X-chromosome contains the largest number of immune-related genes, which play a major role in COVID-19 symptomatology and susceptibility. Here, we had a unique opportunity to investigate, for the first time, COVID-19 outcomes in six unvaccinated young Brazilian patients with Turner syndrome (TS; 45, X0), including one case of critical illness in a child aged 10 years, to evaluate their immune response according to their genetic profile.
A serological analysis of humoral immune response against SARS-CoV-2, phenotypic characterization of antiviral responses in peripheral blood mononuclear cells after stimuli, and the production of cytotoxic cytokines of T lymphocytes and natural killer cells were performed in blood samples collected from the patients with TS during the convalescence period. Whole exome sequencing was also performed.
Our volunteers with TS showed a delayed or insufficient humoral immune response to SARS-CoV-2 (particularly immunoglobulin G) and a decrease in interferon-γ production by cluster of differentiation (CD)4+ and CD8+ T lymphocytes after stimulation with toll-like receptors 7/8 agonists. In contrast, we observed a higher cytotoxic activity in the volunteers with TS than the volunteers without TS after phorbol myristate acetate/ionomycin stimulation, particularly granzyme B and perforin by CD8+ and natural killer cells. Interestingly, two volunteers with TS carry rare genetic variants in genes that regulate type I and III interferon immunity.
Following previous reports in the literature for other conditions, our data showed that patients with TS may have an impaired immune response against SARS-CoV-2. Furthermore, other medical conditions associated with TS could make them more vulnerable to COVID-19.
The MHC class I region contains crucial genes for the innate and adaptive immune response, playing a key role in susceptibility to many autoimmune and infectious diseases. Genome‐wide association studies have identified numerous disease‐associated SNPs within this region. However, these associations do not fully capture the immune‐biological relevance of specific HLA alleles. HLA imputation techniques may leverage available SNP arrays by predicting allele genotypes based on the linkage disequilibrium between SNPs and specific HLA alleles. Successful imputation requires diverse and large reference panels, especially for admixed populations. This study employed a bioinformatics approach to call SNPs and HLA alleles in multi‐ethnic samples from the 1000 Genomes (1KG) dataset and admixed individuals from Brazil (SABE), utilising 30X whole‐genome sequencing data. Using HIBAG, we created three reference panels: 1KG (n = 2504), SABE (n = 1171), and the full model (n = 3675) encompassing all samples. In extensive cross‐validation of these reference panels, the multi‐ethnic 1KG reference exhibited overall better performance than the reference with only Brazilian samples. However, the best results were achieved with the full model. Additionally, we expanded the scope of imputation by developing reference panels for non‐classical, MICA, MICB and HLA‐H genes, previously unavailable for multi‐ethnic populations. Validation in an independent Brazilian dataset showcased the superiority of our reference panels over the Michigan Imputation Server, particularly in predicting HLA‐B alleles among Brazilians. Our investigations underscored the need to enhance or adapt reference panels to encompass the target population's genetic diversity, emphasising the significance of multiethnic references for accurate imputation across different populations.
Human genomics has quickly evolved, powering genome‐wide association studies (GWASs). SNP‐based GWASs cannot capture the intense polymorphism of HLA genes, which are highly associated with disease susceptibility. There are methods to statistically impute HLA genotypes from SNP‐genotype data, but lack of diversity in reference panels hinders their performance. We evaluated the accuracy of the 1000 Genomes data as a reference panel for imputing HLA from admixed individuals of African and European ancestries, focusing on (a) the full dataset, (b) 10 replications from 6 populations, and (c) 19 conditions for the custom reference panels. The full dataset outperformed smaller models, with a good F1‐score of 0.66 for HLA‐B. However, custom models outperformed the multiethnic or population models of similar size (F1‐scores up to 0.53, against up to 0.42). We demonstrated the importance of using genetically specific models for imputing HLA in populations that are currently underrepresented in public datasets, opening the door to HLA imputation for every genetic population.
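For readers unfamiliar with the metric, the sketch below shows one way an F1-score could be computed for a single HLA allele by comparing imputed genotypes against sequencing-based calls. The abstract does not state how true positives were counted or how scores were averaged across alleles, so the per-chromosome counting used here, the function name `allele_f1`, and the example genotypes are assumptions made for illustration only.

```python
def allele_f1(true_genotypes, imputed_genotypes, allele):
    """F1-score for one allele, counting allele copies per individual.

    Each genotype is a pair of allele names. For every individual, copies
    of `allele` present in both truth and imputation are true positives,
    extra imputed copies are false positives, and missed copies are
    false negatives.
    """
    tp = fp = fn = 0
    for truth, imputed in zip(true_genotypes, imputed_genotypes):
        t = truth.count(allele)
        p = imputed.count(allele)
        tp += min(t, p)
        fp += max(p - t, 0)
        fn += max(t - p, 0)
    if tp == 0:
        return 0.0  # no correct calls, so precision and recall are both 0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Made-up HLA-B genotypes for three individuals.
truth = [("B*07:02", "B*44:03"), ("B*07:02", "B*07:02"), ("B*15:01", "B*44:03")]
imputed = [("B*07:02", "B*44:03"), ("B*07:02", "B*15:01"), ("B*07:02", "B*44:03")]

for allele in ("B*07:02", "B*44:03", "B*15:01"):
    print(allele, round(allele_f1(truth, imputed, allele), 3))
```

Because F1 combines precision and recall, rare alleles that a reference panel never predicts score zero even when overall accuracy looks high, which makes the metric sensitive to how well a panel represents the target population.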