The origin of kpc-scale holes in the atomic hydrogen (H I) distributions of some nearby dwarf irregular galaxies presents an intriguing problem. Star formation histories (SFHs) derived from resolved stars give us the unique opportunity to study past star-forming events that may have helped shape the currently visible H I distribution. Our sample of five nearby dwarf irregular galaxies spans over an order of magnitude in both total H I mass and absolute B-band magnitude and is at the low-mass end of previously studied systems. We use Very Large Array H I line data to estimate the energy required to create the centrally dominant hole in each galaxy. We compare this energy estimate to the past energy released by the underlying stellar populations computed from SFHs derived from data taken with the Hubble Space Telescope. The inferred integrated stellar energy released within the characteristic ages exceeds our energy estimates for creating the holes in all cases, assuming expected efficiencies. Therefore, it appears that stellar feedback provides sufficient energy to produce the observed holes. However, we find no obvious signature of single star-forming events responsible for the observed structures when comparing the global SFHs of each galaxy in our sample to each other or to those of dwarf irregular galaxies reported in the literature. We also fail to find evidence of a central star cluster in FUV or Hα imaging. We conclude that large H I holes are likely formed from multiple generations of star formation and only under suitable interstellar medium conditions.
Abstract
Introduction
Genetic variants associated with nicotine dependence have previously been identified, primarily in European-ancestry populations. No genome-wide association studies (GWAS) have been reported for smoking behaviors in Hispanics/Latinos in the United States and Latin America, who are of mixed ancestry with European, African, and American Indigenous components.
Methods
We examined genetic associations with smoking behaviors in the Hispanic Community Health Study/Study of Latinos (HCHS/SOL) (N = 12 741 with smoking data, 5119 ever-smokers), using ~2.3 million genotyped variants imputed to the 1000 Genomes Project phase 3. Mixed logistic regression models accounted for population structure, sampling, relatedness, sex, and age.
Results
The known region of CHRNA5, which encodes the α5 cholinergic nicotinic receptor subunit, was associated with heavy smoking at genome-wide significance (p ≤ 5 × 10⁻⁸) in a comparison of 1929 ever-smokers reporting cigarettes per day (CPD) > 10 versus 3156 reporting CPD ≤ 10. The functional variant rs16969968 in CHRNA5 had a p value of 2.20 × 10⁻⁷ and odds ratio (OR) of 1.32 for the minor allele (A); its minor allele frequency was 0.22 overall and similar across Hispanic/Latino background groups (Central American = 0.17; South American = 0.19; Mexican = 0.18; Puerto Rican = 0.22; Cuban = 0.29; Dominican = 0.19). CHRNA4 on chromosome 20 attained p < 10⁻⁴, supporting prior findings in non-Hispanics. For nondaily smoking, which is prevalent in Hispanic/Latino smokers, compared to daily smoking, loci on chromosomes 2 and 4 achieved genome-wide significance; replication attempts were limited by small Hispanic/Latino sample sizes.
Conclusions
Associations of nicotinic receptor gene variants with smoking, first reported in non-Hispanic European-ancestry populations, generalized to Hispanics/Latinos despite different patterns of smoking behavior.
Implications
We conducted the first large-scale genome-wide association study (GWAS) of smoking behavior in a US Hispanic/Latino cohort, and the first GWAS of daily/nondaily smoking in any population. Results show that the region of the nicotinic receptor subunit gene CHRNA5, which in non-Hispanic European-ancestry smokers has been associated with heavy smoking as well as cessation and treatment efficacy, is also significantly associated with heavy smoking in this Hispanic/Latino cohort. The results are an important addition to understanding the impact of genetic variants in understudied Hispanic/Latino smokers.
ABSTRACT
Investigators often meta-analyze multiple genome-wide association studies (GWASs) to increase the power to detect associations of single nucleotide polymorphisms (SNPs) with a trait. Meta-analysis is also performed within a single cohort that is stratified by, e.g., sex or ancestry group. Having correlated individuals among the strata may complicate meta-analyses, limit power, and inflate Type 1 error. For example, in the Hispanic Community Health Study/Study of Latinos (HCHS/SOL), sources of correlation include genetic relatedness, shared household, and shared community. We propose a novel mixed-effect model for meta-analysis, "MetaCor," which accounts for correlation between stratum-specific effect estimates. Simulations show that MetaCor controls inflation better than alternatives such as ignoring the correlation between the strata or analyzing all strata together in a "pooled" GWAS, especially with different minor allele frequencies (MAFs) between strata. We illustrate the benefits of MetaCor on two GWASs in the HCHS/SOL. Analysis of dental caries (tooth decay) stratified by ancestry group detected a genome-wide significant SNP (rs7791001, P-value = 3.66 × 10⁻⁸, compared to 4.67 × 10⁻⁷ in pooled), with different MAFs between strata. Stratified analysis of body mass index (BMI) by ancestry group and sex reduced overall inflation from λGC = 1.050 (pooled) to λGC = 1.028 (MetaCor). Furthermore, even after removing close relatives to obtain nearly uncorrelated strata, a naïve stratified analysis resulted in λGC = 1.058 compared to λGC = 1.027 for MetaCor.
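To make the idea of accounting for correlation between stratum-specific estimates concrete, the following is a minimal sketch of a generalized least squares (GLS) fixed-effect combination. It is not the MetaCor mixed-effect model itself, whose details are not given in the abstract; the function name `combine_correlated`, the effect sizes, standard errors, and the correlation `rho` are all hypothetical values chosen for illustration.

```python
import numpy as np

def combine_correlated(betas, cov):
    """GLS fixed-effect combination of stratum-specific effect estimates.

    betas : per-stratum effect estimates (length k)
    cov   : k x k covariance matrix of those estimates
    Returns the combined estimate and its standard error.
    """
    betas = np.asarray(betas, dtype=float)
    cov = np.asarray(cov, dtype=float)
    ones = np.ones_like(betas)
    w = np.linalg.solve(cov, ones)   # V^{-1} 1
    var = 1.0 / (ones @ w)           # 1 / (1' V^{-1} 1)
    beta = var * (w @ betas)         # (1' V^{-1} beta) / (1' V^{-1} 1)
    return beta, np.sqrt(var)

# Hypothetical two-stratum example (illustrative numbers only).
betas = [0.10, 0.14]
ses = np.array([0.04, 0.05])
rho = 0.2                            # assumed correlation between the estimates
cov = np.outer(ses, ses) * np.array([[1.0, rho], [rho, 1.0]])

beta, se = combine_correlated(betas, cov)
# Ignoring the correlation reduces to ordinary inverse-variance weighting.
beta_naive, se_naive = combine_correlated(betas, np.diag(ses ** 2))
print(f"correlated: beta={beta:.4f}, se={se:.4f}")
print(f"naive:      beta={beta_naive:.4f}, se={se_naive:.4f}")
```

With a positive correlation, the naive analysis reports a smaller standard error than the GLS combination. That is one way in which ignoring correlation between strata can inflate test statistics.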
Age is the dominant risk factor for most chronic human diseases, but the mechanisms through which ageing confers this risk are largely unknown. The age-related acquisition of somatic mutations that lead to clonal expansion in regenerating haematopoietic stem cell populations has recently been associated with both haematological cancer and coronary heart disease; this phenomenon is termed clonal haematopoiesis of indeterminate potential (CHIP). Simultaneous analyses of germline and somatic whole-genome sequences provide the opportunity to identify root causes of CHIP. Here we analyse high-coverage whole-genome sequences from 97,691 participants of diverse ancestries in the National Heart, Lung, and Blood Institute Trans-omics for Precision Medicine (TOPMed) programme, and identify 4,229 individuals with CHIP. We identify associations with blood cell, lipid and inflammatory traits that are specific to different CHIP driver genes. Association of a genome-wide set of germline genetic variants enabled the identification of three genetic loci associated with CHIP status, including one locus at TET2 that was specific to individuals of African ancestry. In silico-informed in vitro evaluation of the TET2 germline locus enabled the identification of a causal variant that disrupts a TET2 distal enhancer, resulting in increased self-renewal of haematopoietic stem cells. Overall, we observe that germline genetic variation shapes haematopoietic stem cell function, leading to CHIP through mechanisms that are specific to clonal haematopoiesis as well as shared mechanisms that lead to somatic mutations across tissues.
The Trans-Omics for Precision Medicine (TOPMed) programme seeks to elucidate the genetic architecture and biology of heart, lung, blood and sleep disorders, with the ultimate goal of improving diagnosis, treatment and prevention of these diseases. The initial phases of the programme focused on whole-genome sequencing of individuals with rich phenotypic data and diverse backgrounds. Here we describe the TOPMed goals and design as well as the available resources and early insights obtained from the sequence data. The resources include a variant browser, a genotype imputation server, and genomic and phenotypic data that are available through dbGaP (Database of Genotypes and Phenotypes). In the first 53,831 TOPMed samples, we detected more than 400 million single-nucleotide and insertion or deletion variants after alignment with the reference genome. Additional previously undescribed variants were detected through assembly of unmapped reads and customized analysis in highly variable loci. Among the more than 400 million detected variants, 97% have frequencies of less than 1% and 46% are singletons that are present in only one individual (53% among unrelated individuals). These rare variants provide insights into mutational processes and recent human evolutionary history. The extensive catalogue of genetic variation in TOPMed studies provides unique opportunities for exploring the contributions of rare and noncoding sequence variants to phenotypic variation. Furthermore, combining TOPMed haplotypes with modern imputation methods improves the power and reach of genome-wide association studies to include variants down to a frequency of approximately 0.01%.
Hispanic/Latinos have been underrepresented in genome-wide association studies (GWAS) for anthropometric traits despite their notable anthropometric variability, ancestry proportions, and high burden of growth stunting and overweight/obesity. To address this knowledge gap, we analyzed densely imputed genetic data in a sample of Hispanic/Latino adults to identify and fine-map genetic variants associated with body mass index (BMI), height, and BMI-adjusted waist-to-hip ratio (WHRadjBMI). We conducted a GWAS of 18 studies/consortia as part of the Hispanic/Latino Anthropometry (HISLA) Consortium (stage 1, n = 59,771) and generalized our findings in 9 additional studies (stage 2, n = 10,538). We conducted a trans-ancestral GWAS with summary statistics from HISLA stage 1 and existing consortia of European and African ancestries. In our HISLA stage 1 + 2 analyses, we discovered one BMI locus, as well as two BMI signals and another height signal each within established anthropometric loci. In our trans-ancestral meta-analysis, we discovered three BMI loci, one height locus, and one WHRadjBMI locus. We also identified 3 secondary signals for BMI, 28 for height, and 2 for WHRadjBMI in established loci. We show that 336 known BMI, 1,177 known height, and 143 known WHRadjBMI (combined) SNPs demonstrated suggestive transferability (nominal significance and effect estimate directional consistency) in Hispanic/Latino adults. Of these, 36 BMI, 124 height, and 11 WHRadjBMI SNPs were significant after trait-specific Bonferroni correction. Trans-ancestral meta-analysis of the three ancestries showed a small-to-moderate impact of uncorrected population stratification on the resulting effect size estimates. Our findings demonstrate that future studies may also benefit from leveraging diverse ancestries and differences in linkage disequilibrium patterns to discover novel loci and additional signals with less residual population stratification.
Large-scale genetic analyses of diverse populations hold great potential for advancing the field. In our sample of Hispanic/Latino adults, and when combined with publicly available results from other ancestries, we identified, described, and characterized 42 novel genomic findings. We additionally illustrated that novel biologic insights can be garnered from studying ancestrally diverse populations. Overall, our findings indicate the added value of building large, more diverse genome-wide association studies in the future.
We assembled an ancestrally diverse collection of genome-wide association studies (GWAS) of type 2 diabetes (T2D) in 180,834 affected individuals and 1,159,055 controls (48.9% non-European descent) through the Diabetes Meta-Analysis of Trans-Ethnic association studies (DIAMANTE) Consortium. Multi-ancestry GWAS meta-analysis identified 237 loci attaining stringent genome-wide significance (P < 5 × 10⁻⁹), which were delineated to 338 distinct association signals. Fine-mapping of these signals was enhanced by the increased sample size and expanded population diversity of the multi-ancestry meta-analysis, which localized 54.4% of T2D associations to a single variant with >50% posterior probability. This improved fine-mapping enabled systematic assessment of candidate causal genes and molecular mechanisms through which T2D associations are mediated, laying the foundations for functional investigations. Multi-ancestry genetic risk scores enhanced transferability of T2D prediction across diverse populations. Our study provides a step toward more effective clinical translation of T2D GWAS to improve global health for all, irrespective of genetic background.
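As an illustration of what "posterior probability" means in fine-mapping, the sketch below converts per-variant log Bayes factors into posterior probabilities and a credible set. It assumes a single causal variant per signal and equal prior probability for every variant. The abstract does not describe the study's actual fine-mapping procedure, so the function names and input values here are hypothetical.

```python
import numpy as np

def posterior_probabilities(log_bf):
    """Posterior probability that each variant is causal.

    Assumes exactly one causal variant in the signal and a flat prior,
    so the posterior is the normalized Bayes factor.
    """
    log_bf = np.asarray(log_bf, dtype=float)
    weights = np.exp(log_bf - log_bf.max())  # subtract max for numerical stability
    return weights / weights.sum()

def credible_set(pp, coverage=0.99):
    """Indices of the smallest set of variants whose posteriors sum to >= coverage."""
    order = np.argsort(pp)[::-1]             # most probable variant first
    cumulative = np.cumsum(pp[order])
    k = int(np.searchsorted(cumulative, coverage)) + 1
    return order[:k]

# Hypothetical natural-log Bayes factors for three variants at one signal.
pp = posterior_probabilities([12.0, 9.0, 3.0])
cs = credible_set(pp, coverage=0.99)
print(pp.round(4), cs)
```

In this toy input the lead variant carries more than 50% of the posterior. That is the criterion the abstract uses when it reports signals localized to a single variant.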
Turbulent neutral hydrogen (H I) line widths are often thought to be driven primarily by star formation (SF), but the timescale for converting SF energy to H I kinetic energy is unclear. As a complication, studies on the connection between H I line widths and SF in external galaxies often use broadband tracers for the SF rate, which must implicitly assume that SF histories (SFHs) have been constant over the timescale of the tracer. In this paper, we compare measures of H I energy to time-resolved SFHs in a number of nearby dwarf galaxies. We find that H I energy surface density is strongly correlated only with SF that occurred 30-40 Myr ago. This timescale corresponds to the approximate lifetime of the lowest mass supernova progenitors (~8 M⊙). This analysis suggests that the coupling between SF and the neutral interstellar medium is strongest on this timescale, due either to an intrinsic delay between the release of the peak energy from SF or to the coherent effects of many supernova explosions during this interval. At Σ_SFR > 10⁻³ M⊙ yr⁻¹ kpc⁻², we find a mean coupling efficiency between SF energy and H I energy of ε = 0.11 ± 0.04 using the 30-40 Myr timescale. However, unphysical efficiencies are required in lower Σ_SFR systems, implying that SF is not the primary driver of H I kinematics at Σ_SFR < 10⁻³ M⊙ yr⁻¹ kpc⁻².
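The coupling efficiency quoted above is a ratio of H I energy to SF energy. The following is a rough sketch of that arithmetic under several assumptions that are not taken from the abstract: H I kinetic energy is approximated as (3/2) M σ², SF energy is counted as supernovae only, with 10⁵¹ erg per supernova and an IMF-dependent yield of 0.01 supernovae per solar mass formed. All function names and input values are hypothetical.

```python
MSUN_G = 1.989e33      # solar mass in grams
KM_TO_CM = 1.0e5       # kilometres to centimetres

def hi_energy_surface_density(sigma_hi_msun_kpc2, veldisp_kms):
    """H I kinetic energy per unit area (erg kpc^-2), assuming E = (3/2) M sigma^2."""
    return 1.5 * sigma_hi_msun_kpc2 * MSUN_G * (veldisp_kms * KM_TO_CM) ** 2

def sf_energy_surface_density(sigma_sfr_msun_yr_kpc2, duration_yr,
                              sn_per_msun=0.01, e_sn_erg=1.0e51):
    """Supernova energy per unit area (erg kpc^-2) released by SF over duration_yr.

    sn_per_msun and e_sn_erg are assumed values; the true yield depends on the IMF.
    """
    return sigma_sfr_msun_yr_kpc2 * duration_yr * sn_per_msun * e_sn_erg

def coupling_efficiency(sigma_hi_msun_kpc2, veldisp_kms,
                        sigma_sfr_msun_yr_kpc2, duration_yr):
    """Ratio of H I kinetic energy to SF energy over the chosen time window."""
    e_hi = hi_energy_surface_density(sigma_hi_msun_kpc2, veldisp_kms)
    e_sf = sf_energy_surface_density(sigma_sfr_msun_yr_kpc2, duration_yr)
    return e_hi / e_sf

# Hypothetical inputs: 5 Msun pc^-2 of H I, a 10 km/s dispersion, and
# Sigma_SFR = 1e-3 Msun yr^-1 kpc^-2 sustained over a 10 Myr window.
eps = coupling_efficiency(5.0e6, 10.0, 1.0e-3, 1.0e7)
print(f"coupling efficiency ~ {eps:.2f}")
```

Because the SF energy scales linearly with Σ_SFR, holding the H I energy fixed while lowering Σ_SFR pushes the required efficiency up. This is the sense in which low-Σ_SFR systems can demand unphysical efficiencies.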
Analyses of data from genome-wide association studies on unrelated individuals have shown that, for human traits and diseases, approximately one-third to two-thirds of heritability is captured by common SNPs. However, it is not known whether the remaining heritability is due to the imperfect tagging of causal variants by common SNPs, in particular whether the causal variants are rare, or whether it is overestimated due to bias in inference from pedigree data. Here we estimated heritability for height and body mass index (BMI) from whole-genome sequence data on 25,465 unrelated individuals of European ancestry. The estimated heritability was 0.68 (standard error 0.10) for height and 0.30 (standard error 0.10) for body mass index. Low minor allele frequency variants in low linkage disequilibrium (LD) with neighboring variants were enriched for heritability, to a greater extent for protein-altering variants, consistent with negative selection. Our results imply that rare variants, in particular those in regions of low linkage disequilibrium, are a major source of the still missing heritability of complex traits and disease.
Large-scale whole-genome sequencing studies have enabled the analysis of rare variants (RVs) associated with complex phenotypes. Commonly used RV association tests have limited scope to leverage variant functions. We propose STAAR (variant-set test for association using annotation information), a scalable and powerful RV association test method that effectively incorporates both variant categories and multiple complementary annotations using a dynamic weighting scheme. For the latter, we introduce 'annotation principal components', multidimensional summaries of in silico variant annotations. STAAR accounts for population structure and relatedness and is scalable for analyzing very large cohort and biobank whole-genome sequencing studies of continuous and dichotomous traits. We applied STAAR to identify RVs associated with four lipid traits in 12,316 discovery and 17,822 replication samples from the Trans-Omics for Precision Medicine Program. We discovered and replicated new RV associations, including disruptive missense RVs of NPC1L1 and an intergenic region near APOC1P1 associated with low-density lipoprotein cholesterol.