Elevated plasma cholesterol and type 2 diabetes (T2D) are associated with coronary artery disease (CAD). Individuals treated with cholesterol-lowering statins have increased T2D risk, while individuals with hypercholesterolemia have reduced T2D risk. We explore the relationship between lipid and glucose control by constructing network models from the STARNET study with sequencing data from seven cardiometabolic tissues obtained from CAD patients during coronary artery by-pass grafting surgery. By integrating gene expression, genotype, metabolomic, and clinical data, we identify a glucose and lipid determining (GLD) regulatory network showing inverse relationships with lipid and glucose traits. Master regulators of the GLD network also impact lipid and glucose levels in inverse directions. Experimental inhibition of one of the GLD network master regulators, lanosterol synthase (LSS), in mice confirms the inverse relationships to glucose and lipid levels as predicted by our model and provides mechanistic insights.
Heritability is essential for understanding the biological causes of disease but requires laborious patient recruitment and phenotype ascertainment. Electronic health records (EHRs) passively capture a wide range of clinically relevant data and provide a resource for studying the heritability of traits that are not typically accessible. EHRs contain next-of-kin information collected via patient emergency contact forms, but until now, these data have gone unused in research. We mined emergency contact data at three academic medical centers and identified 7.4 million familial relationships while maintaining patient privacy. Identified relationships were consistent with genetically derived relatedness. We used EHR data to compute heritability estimates for 500 disease phenotypes. Overall, estimates were consistent with the literature and between sites. Inconsistencies were indicative of limitations and opportunities unique to EHR research. These analyses provide a validation of the use of EHRs for genetics and disease research.
•Emergency contact information used to identify 7.4 million familial relationships
•Familial relationships were validated using clinical and genetic data
•Estimated heritability for 500 traits using only medical records data
•Heritability estimates were concordant across study sites and with the literature
Electronic health records can be mined for familial relationships that provide distinct insights into heritability of human disease.
Pathogenic variants in BRCA1 and BRCA2 (BRCA1/2) lead to increased risk of breast, ovarian, and other cancers, but most variant-positive individuals in the general population are unaware of their risk, and little is known about prevalence in non-European populations. We investigated BRCA1/2 prevalence and impact in the electronic health record (EHR)-linked BioMe Biobank in New York City.
Exome sequence data from 30,223 adult BioMe participants were evaluated for pathogenic variants in BRCA1/2. Prevalence estimates were made in population groups defined by genetic ancestry and self-report. EHR data were used to evaluate clinical characteristics of variant-positive individuals.
There were 218 (0.7%) individuals harboring expected pathogenic variants, resulting in an overall prevalence of 1 in 139. The highest prevalence was in individuals with Ashkenazi Jewish (AJ; 1 in 49), Filipino and other Southeast Asian (1 in 81), and non-AJ European (1 in 103) ancestry. Among 218 variant-positive individuals, 112 (51.4%) harbored known founder variants: 80 had AJ founder variants (BRCA1 c.5266dupC and c.68_69delAG, and BRCA2 c.5946delT), 8 had a Puerto Rican founder variant (BRCA2 c.3922G>T), and 24 had one of 19 other founder variants. Non-European populations were more likely to harbor BRCA1/2 variants that were not classified in ClinVar or that had uncertain or conflicting evidence for pathogenicity (uncertain/conflicting). Within mixed ancestry populations, such as Hispanic/Latinos with genetic ancestry from Africa, Europe, and the Americas, there was a strong correlation between the proportion of African genetic ancestry and the likelihood of harboring an uncertain/conflicting variant. Approximately 28% of variant-positive individuals had a personal history, and 45% had a personal or family history of BRCA1/2-associated cancers. Approximately 27% of variant-positive individuals had prior clinical genetic testing for BRCA1/2. However, individuals with AJ founder variants were twice as likely to have had a clinical test (39%) as those with other pathogenic variants (20%).
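The headline figures above follow directly from the reported counts. The short Python sketch below re-derives them as a consistency check; the variable names are illustrative, and only numbers stated in the abstract are used.

```python
# Counts reported in the abstract above.
participants = 30_223
variant_positive = 218
founder_counts = {"AJ": 80, "Puerto Rican": 8, "other founder": 24}

# Overall prevalence, expressed both as a fraction and as "1 in N".
prevalence = variant_positive / participants
one_in = round(participants / variant_positive)

# Share of variant-positive individuals carrying a known founder variant.
founder_total = sum(founder_counts.values())
founder_share = founder_total / variant_positive

print(f"prevalence: {prevalence:.1%} (1 in {one_in})")
print(f"founder variants: {founder_total} of {variant_positive} ({founder_share:.1%})")
```

The rounded values agree with the 0.7%, 1 in 139, and 51.4% quoted in the text.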
These findings deepen our knowledge about BRCA1/2 variants and associated cancer risk in diverse populations, indicate a gap in knowledge about potential cancer-related variants in non-European populations, and suggest that genomic screening in diverse patient populations may be an effective tool to identify at-risk individuals.
On average, Peruvian individuals are among the shortest in the world. Here we show that Native American ancestry is associated with reduced height in an ethnically diverse group of Peruvian individuals, and identify a population-specific, missense variant in the FBN1 gene (E1297G) that is significantly associated with lower height. Each copy of the minor allele (frequency of 4.7%) reduces height by 2.2 cm (4.4 cm in homozygous individuals). To our knowledge, this is the largest effect size known for a common height-associated variant. FBN1 encodes the extracellular matrix protein fibrillin 1, which is a major structural component of microfibrils. We observed less densely packed fibrillin-1-rich microfibrils with irregular edges in the skin of individuals who were homozygous for G1297 compared with individuals who were homozygous for E1297. Moreover, we show that the E1297G locus is under positive selection in non-African populations, and that the E1297 variant shows subtle evidence of positive selection specifically within the Peruvian population. This variant is also significantly more frequent in coastal Peruvian populations than in populations from the Andes or the Amazon, which suggests that short stature might be the result of adaptation to factors that are associated with the coastal environment in Peru.
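The per-allele effect reported above can be read as an additive model: the expected height shift is the number of minor-allele copies times 2.2 cm. The sketch below illustrates that arithmetic. The genotype frequencies are computed under Hardy-Weinberg proportions, which is an assumption made here for illustration and is not stated in the abstract; the function names are hypothetical.

```python
PER_ALLELE_EFFECT_CM = -2.2   # reported effect of each minor-allele copy
MINOR_ALLELE_FREQ = 0.047     # reported minor-allele frequency (4.7%)


def expected_height_shift(copies: int) -> float:
    """Additive model: shift in cm for 0, 1, or 2 minor-allele copies."""
    if copies not in (0, 1, 2):
        raise ValueError("copies must be 0, 1, or 2")
    return copies * PER_ALLELE_EFFECT_CM


def hardy_weinberg(maf: float) -> dict[int, float]:
    """Genotype frequencies by minor-allele count (ASSUMES Hardy-Weinberg)."""
    q = maf
    p = 1.0 - q
    return {0: p * p, 1: 2 * p * q, 2: q * q}


freqs = hardy_weinberg(MINOR_ALLELE_FREQ)
# Frequency-weighted average shift across the three genotypes.
mean_shift = sum(freqs[c] * expected_height_shift(c) for c in freqs)

for copies, freq in freqs.items():
    print(f"{copies} copies: freq {freq:.4f}, shift {expected_height_shift(copies):+.1f} cm")
print(f"population-average shift: {mean_shift:+.3f} cm")
```

Under these assumptions the homozygous shift is 4.4 cm, matching the abstract, while the population-average shift is small because homozygotes are rare.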
The ability to identify segments of genomes identical-by-descent (IBD) is a part of standard workflows in both statistical and population genetics. However, traditional methods for finding local IBD across all pairs of individuals scale poorly, leading to a lack of adoption in very large-scale datasets. Here, we present iLASH, an algorithm based on similarity detection techniques that shows equal or improved accuracy in simulations compared to current leading methods and speeds up analysis by several orders of magnitude on genomic datasets, making IBD estimation tractable for millions of individuals. We apply iLASH to the PAGE dataset of ~52,000 multi-ethnic participants, including several founder populations with elevated IBD sharing, identifying IBD segments in ~3 minutes per chromosome compared to over 6 days for a state-of-the-art algorithm. iLASH enables efficient analysis of very large-scale datasets, as we demonstrate by computing IBD across the UK Biobank (~500,000 individuals), detecting 12.9 billion pairwise connections.
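To give a feel for how similarity detection can avoid comparing every pair of individuals, the sketch below uses MinHash signatures with locality-sensitive-hashing bands to shortlist candidate pairs within one genomic window. This is a toy illustration of the general technique and not the iLASH implementation: the shingling scheme, parameter values, function names, and binary haplotype strings are all assumptions made for the example.

```python
import hashlib
from itertools import combinations


def shingles(haplotype: str, k: int = 4) -> set[str]:
    """Position-tagged k-mers of a haplotype window (toy encoding)."""
    return {f"{i}:{haplotype[i:i + k]}" for i in range(len(haplotype) - k + 1)}


def minhash_signature(items: set[str], num_hashes: int = 32) -> list[int]:
    """One minimum per seeded hash function; similar sets share many minima."""
    return [
        min(int(hashlib.sha1(f"{seed}:{x}".encode()).hexdigest(), 16) for x in items)
        for seed in range(num_hashes)
    ]


def candidate_pairs(signatures: dict[str, list[int]], bands: int = 8) -> set[tuple[str, str]]:
    """Pairs whose signatures agree on every row of at least one band."""
    rows = len(next(iter(signatures.values()))) // bands
    buckets: dict[tuple, set[str]] = {}
    for name, sig in signatures.items():
        for b in range(bands):
            key = (b, tuple(sig[b * rows:(b + 1) * rows]))
            buckets.setdefault(key, set()).add(name)
    pairs: set[tuple[str, str]] = set()
    for members in buckets.values():
        pairs.update(combinations(sorted(members), 2))
    return pairs


# Hypothetical window: A and B are identical here; C is the bitwise complement.
window = "0110100110010110" * 2
haplotypes = {
    "A": window,
    "B": window,
    "C": window.translate(str.maketrans("01", "10")),
}
signatures = {name: minhash_signature(shingles(h)) for name, h in haplotypes.items()}
pairs = candidate_pairs(signatures)
print(sorted(pairs))
```

Only pairs that land in a shared bucket are passed on for exact comparison, so the cost grows with the number of collisions rather than with the number of all possible pairs.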
Identity-by-descent (IBD), the detection of shared segments inherited from a common ancestor, is a fundamental concept in genomics with broad applications in the characterization and analysis of genomes. While historically the concept of IBD was extensively utilized through linkage analyses and in studies of founder populations, applications of IBD-based methods subsided during the genome-wide association study era. This was primarily due to the computational expense of IBD detection, which becomes increasingly relevant as the field moves toward the analysis of biobank-scale datasets that encompass individuals from highly diverse backgrounds. To address these computational barriers, the past several years have seen new methodological advances enabling IBD detection for datasets in the hundreds of thousands to millions of individuals, enabling novel analyses at an unprecedented scale. Here, we describe the latest innovations in IBD detection and describe opportunities for the application of IBD-based methods across a broad range of questions in the field of genomics.
Understanding population health disparities is an essential component of equitable precision health efforts. Epidemiology research often relies on definitions of race and ethnicity, but these population labels may not adequately capture disease burdens and environmental factors impacting specific sub-populations. Here, we propose a framework for repurposing data from electronic health records (EHRs) in concert with genomic data to explore the demographic ties that can impact disease burdens. Using data from a diverse biobank in New York City, we identified 17 communities sharing recent genetic ancestry. We observed 1,177 health outcomes that were statistically associated with a specific group and demonstrated significant differences in the segregation of genetic variants contributing to Mendelian diseases. We also demonstrated that fine-scale population structure can impact the prediction of complex disease risk within groups. This work reinforces the utility of linking genomic data to EHRs and provides a framework toward fine-scale monitoring of population health.
•Genomic data linked to health records capture demography in health systems
•Genetic networks reveal recent common ancestry in diverse populations
•Evidence of many founder populations in New York City
•Fine-scale population structure impacts genetic risk predictions
Taking a quantitative approach to genetic ancestry in health systems furthers understanding of disease burdens specific to fine-scale populations and the environmental and demographic ties that can impact disease.
Population-based genomic screening has the predicted ability to reduce morbidity and mortality associated with medically actionable conditions. However, much research is needed to develop standards for genomic screening and to understand the perspectives of people offered this new testing modality. This is particularly true for non-European ancestry populations who are vastly underrepresented in genomic medicine research. Therefore, we implemented a pilot genomic screening program in the BioMe Biobank in New York City, where the majority of participants are of non-European ancestry.
We initiated genomic screening for well-established genes associated with hereditary breast and ovarian cancer syndrome (HBOC), Lynch syndrome (LS), and familial hypercholesterolemia (FH). We evaluated and included an additional gene (TTR) associated with hereditary transthyretin amyloidosis (hATTR), which has a common founder variant in African ancestry populations. We evaluated the characteristics of 74 participants who received results associated with these conditions. We also assessed the preferences of 7461 newly enrolled BioMe participants to receive genomic results.
In the pilot genomic screening program, 74 consented participants received results related to HBOC (N = 26), LS (N = 6), FH (N = 8), and hATTR (N = 34). Thirty-three of 34 (97.1%) participants who received a result related to hATTR were self-reported African American/African (AA) or Hispanic/Latinx (HL), compared to 14 of 40 (35.0%) participants who received a result related to HBOC, LS, or FH. Among the 7461 participants enrolled after the BioMe protocol modification to allow the return of genomic results, 93.4% indicated that they would want to receive results. Younger participants, women, and HL participants were more likely to opt to receive results.
The addition of TTR to a pilot genomic screening program meant that we returned results to a higher proportion of AA and HL participants, in comparison with genes traditionally included in genomic screening programs in the USA. We found that the majority of participants in a multi-ethnic biobank are interested in receiving genomic results for medically actionable conditions. These findings increase knowledge about the perspectives of diverse research participants on receiving genomic results and inform the broader implementation of genomic medicine in underrepresented patient populations.
Hispanic/Latino (H/L) populations, although linked by culture and aspects of shared history, reflect the complexity of history and migration influencing the Americas. The original settlement by indigenous Americans, followed by postcolonial admixture from multiple continents, has yielded localized genetic patterns. In addition, numerous H/L populations appear to have signatures of pre-colonization and post-colonization bottlenecks, indicating that tens of millions of H/Ls may harbor signatures of founder effects today. Based on both population and medical genetic findings we highlight the extreme differentiation across the Americas, providing evidence for why H/Ls should not be considered a single population in modern human genetics. We highlight the need for additional sampling of understudied H/L groups, and ramifications of these findings for genomic medicine in one-tenth of the world’s population.
Individuals of admixed ancestries (for example, African Americans) inherit a mosaic of ancestry segments (local ancestry) originating from multiple continental ancestral populations. This offers the unique opportunity of investigating the similarity of genetic effects on traits across ancestries within the same population. Here we introduce an approach to estimate correlation of causal genetic effects (r) across local ancestries and analyze 38 complex traits in African-European admixed individuals (N = 53,001) to observe very high correlations (meta-analysis r = 0.95, 95% credible interval 0.93-0.97), much higher than correlation of causal effects across continental ancestries. We replicate our results using regression-based methods from marginal genome-wide association study summary statistics. We also report realistic scenarios where regression-based methods yield inflated heterogeneity-by-ancestry due to ancestry-specific tagging of causal effects, and/or polygenicity. Our results motivate genetic analyses that assume minimal heterogeneity in causal effects by ancestry, with implications for the inclusion of ancestry-diverse individuals in studies.