Genome-wide association analysis of cohorts with thousands of phenotypes is computationally expensive, particularly when accounting for sample relatedness or population structure. Here we present a novel machine-learning method called REGENIE for fitting a whole-genome regression model for quantitative and binary phenotypes that is substantially faster than alternatives in multi-trait analyses while maintaining statistical efficiency. The method naturally accommodates parallel analysis of multiple phenotypes and requires only local segments of the genotype matrix to be loaded in memory, in contrast to existing alternatives, which must load genome-wide matrices into memory. This results in substantial savings in compute time and memory usage. We introduce a fast, approximate Firth logistic regression test for unbalanced case-control phenotypes. The method is ideally suited to take advantage of distributed computing frameworks. We demonstrate the accuracy and computational benefits of this approach using the UK Biobank dataset with up to 407,746 individuals.
Despite strides in characterizing human history from genetic polymorphism data, progress in identifying genetic signatures of recent demography has been limited. Here we identify very recent fine-scale population structure in North America from a network of over 500 million genetic (identity-by-descent, IBD) connections among 770,000 genotyped individuals of US origin. We detect densely connected clusters within the network and annotate these clusters using a database of over 20 million genealogical records. Recent population patterns captured by IBD clustering include immigrants such as Scandinavians and French Canadians; groups with continental admixture such as Puerto Ricans; settlers such as the Amish and Appalachians who experienced geographic or cultural isolation; and broad historical trends, including reduced north-south gene flow. Our results yield a detailed historical portrait of North America after European settlement and support substantial genetic heterogeneity in the United States beyond that uncovered by previous studies.
Statins effectively lower total and plasma LDL-cholesterol, but the magnitude of decrease varies among individuals. To identify single nucleotide polymorphisms (SNPs) contributing to this variation, we performed a combined analysis of genome-wide association (GWA) results from three trials of statin efficacy.
Bayesian and standard frequentist association analyses were performed on untreated and statin-mediated changes in LDL-cholesterol, total cholesterol, HDL-cholesterol, and triglyceride on a total of 3932 subjects using data from three studies: Cholesterol and Pharmacogenetics (40 mg/day simvastatin, 6 weeks), Pravastatin/Inflammation CRP Evaluation (40 mg/day pravastatin, 24 weeks), and Treating to New Targets (10 mg/day atorvastatin, 8 weeks). Genotype imputation was used to maximize genomic coverage and to combine information across studies. Phenotypes were normalized within each study to account for systematic differences among studies, and a fixed-effects analysis of the combined sample was performed to detect consistent effects across studies. Two SNP associations were assessed as having posterior probability greater than 50%, indicating that they were more likely than not to be genuinely associated with statin-mediated lipid response. SNP rs8014194, located within the CLMN gene on chromosome 14, was strongly associated with statin-mediated change in total cholesterol, with an 84% probability by Bayesian analysis and a p-value exceeding conventional levels of genome-wide significance by frequentist analysis (P = 1.8 × 10⁻⁸). This SNP was less significantly associated with change in LDL-cholesterol (posterior probability = 0.16, P = 4.0 × 10⁻⁶). Bayesian analysis also assigned a 51% probability that rs4420638, located in APOC1 and near APOE, was associated with change in LDL-cholesterol.
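As an illustration of how a fixed-effects combination across studies can work, the sketch below pools per-study effect estimates with inverse-variance weights. This is a common textbook form of fixed-effects analysis and is assumed here for illustration; the abstract does not say which estimator the authors used, and the per-study effects and standard errors are hypothetical values, not results from the three trials.

```python
import math

def fixed_effect_meta(estimates, std_errors):
    """Inverse-variance fixed-effect pooling of per-study effect estimates."""
    # Each study is weighted by the reciprocal of its sampling variance.
    weights = [1.0 / se ** 2 for se in std_errors]
    total_weight = sum(weights)
    pooled = sum(w * b for w, b in zip(weights, estimates)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    z = pooled / pooled_se
    # Two-sided p-value under a standard normal reference distribution.
    p_two_sided = math.erfc(abs(z) / math.sqrt(2.0))
    return pooled, pooled_se, z, p_two_sided

# Hypothetical per-study SNP effects on a normalized phenotype (illustration only).
betas = [0.20, 0.15, 0.25]
ses = [0.05, 0.06, 0.07]

pooled, pooled_se, z, p = fixed_effect_meta(betas, ses)
print(f"pooled effect={pooled:.3f}, SE={pooled_se:.3f}, z={z:.2f}, p={p:.2e}")
```

Because the weights depend only on precision, the pooled standard error is always smaller than that of the most precise single study, which is what makes combining the three trials worthwhile.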
Using combined GWA analysis from three clinical trials involving nearly 4,000 individuals treated with simvastatin, pravastatin, or atorvastatin, we have identified SNPs that may be associated with variation in the magnitude of statin-mediated reduction in total and LDL-cholesterol, including one in the CLMN gene for which statistical evidence for association exceeds conventional levels of genome-wide significance.
PRINCE and TNT are not registered. CAP is registered at Clinicaltrials.gov NCT00451828.
Idiopathic pulmonary fibrosis (IPF) is a devastating disease that probably involves several genetic loci. Several rare genetic variants and one common single nucleotide polymorphism (SNP) of MUC5B have been associated with the disease. Our aim was to identify additional common variants associated with susceptibility and ultimately mortality in IPF.
First, we did a three-stage genome-wide association study (GWAS): stage one was a discovery GWAS; and stages two and three were independent case-control studies. DNA samples from European-American patients with IPF meeting standard criteria were obtained from several US centres for each stage. Data for European-American control individuals for stage one were gathered from the database of genotypes and phenotypes; additional control individuals were recruited at the University of Pittsburgh to increase the number. For controls in stages two and three, we gathered data for additional sex-matched European-American control individuals who had been recruited in another study. DNA samples from patients and from control individuals were genotyped to identify SNPs associated with IPF. SNPs identified in stage one were carried forward to stage two, and those that achieved genome-wide significance (p<5 × 10⁻⁸) in a meta-analysis were carried forward to stage three. Three case series were selected from stages one and two of the GWAS using samples with follow-up data. Mortality analyses were done in these case series to assess the SNPs associated with IPF that had achieved genome-wide significance in the meta-analysis of stages one and two. Finally, we obtained gene-expression profiling data for lungs of patients with IPF from the Lung Genomics Research Consortium and analysed correlation with SNP genotypes.
In stage one of the GWAS (542 patients with IPF, 542 control individuals matched one-by-one to cases by genetic ancestry estimates), we identified 20 loci. Six SNPs reached genome-wide significance in stage two (544 patients, 687 control individuals): three TOLLIP SNPs (rs111521887, rs5743894, rs5743890) and one MUC5B SNP (rs35705950) at 11p15.5; one MDGA2 SNP (rs7144383) at 14q21.3; and one SPPL2C SNP (rs17690703) at 17q21.31. Stage three (324 patients, 702 control individuals) confirmed the associations for all these SNPs, except for rs7144383. Linkage disequilibrium between the MUC5B SNP (rs35705950) and TOLLIP SNPs (rs111521887 r²=0·07, rs5743894 r²=0·16, and rs5743890 r²=0·01) was low. 683 patients from the GWAS were included in the mortality analysis. Individuals who developed IPF despite having the protective TOLLIP minor allele of rs5743890 carried an increased mortality risk (meta-analysis with fixed-effect model: hazard ratio 1·72, 95% CI 1·24-2·38; p=0·0012). TOLLIP expression was decreased by 20% in individuals carrying the minor allele of rs5743890 (p=0·097), 40% in those with the minor allele of rs111521887 (p=3·0 × 10⁻⁴), and 50% in those with the minor allele of rs5743894 (p=2·93 × 10⁻⁵) compared with homozygous carriers of common alleles for these SNPs.
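The reported hazard ratio, confidence interval, and p-value for rs5743890 can be checked against one another with a short calculation. The sketch below assumes a Wald-type interval that is symmetric on the log scale, which the abstract does not state explicitly; the function name is illustrative.

```python
import math

def p_from_ratio_ci(ratio, lower, upper, z_crit=1.959964):
    """Recover the log-scale SE, z statistic, and two-sided p from a ratio and its 95% CI."""
    # For a Wald interval, the CI width on the log scale equals 2 * z_crit * SE.
    log_se = (math.log(upper) - math.log(lower)) / (2.0 * z_crit)
    z = math.log(ratio) / log_se
    p_two_sided = math.erfc(abs(z) / math.sqrt(2.0))
    return log_se, z, p_two_sided

# Values reported for the TOLLIP rs5743890 mortality analysis.
log_se, z, p = p_from_ratio_ci(1.72, 1.24, 2.38)
print(f"SE(log HR)={log_se:.3f}, z={z:.2f}, p={p:.4f}")
```

Under this assumption the recovered p-value comes out close to the reported 0·0012, with the small gap attributable to rounding of the published ratio and interval.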
Novel variants in TOLLIP and SPPL2C are associated with IPF susceptibility. One novel variant of TOLLIP, rs5743890, is also associated with mortality. These associations and the reduced expression of TOLLIP in patients with IPF who carry TOLLIP SNPs emphasise the importance of this gene in the disease.
National Institutes of Health; National Heart, Lung, and Blood Institute; Pulmonary Fibrosis Foundation; Coalition for Pulmonary Fibrosis; and Instituto de Salud Carlos III.
Data from the Pharmacogenomics and Risk of Cardiovascular Disease (PARC) study and the Cardiovascular Health Study (CHS) provide independent and confirmatory evidence for association between common polymorphisms of the HNF1A gene encoding hepatocyte nuclear factor-1α and plasma C-reactive protein (CRP) concentration. Analyses with the use of imputation-based methods to combine genotype data from both studies and to test untyped SNPs from the HapMap database identified several SNPs within a 5 kb region of HNF1A intron 1 with the strongest evidence of association with CRP phenotype.
It is unclear whether the current distribution of surgeons practicing female pelvic medicine and reconstructive surgery in the United States is adequate to meet the needs of a growing and aging population. We assessed the geographic distribution of female pelvic surgeons as represented by members of the American Urogynecologic Society (AUGS) throughout the United States at the county, state, and American Congress of Obstetricians and Gynecologists district levels.
County-level data from the AUGS, American Congress of Obstetricians and Gynecologists, and the United States Census were analyzed in this observational study. State and national patterns of female pelvic surgeon density were mapped graphically using ArcGIS software and 2010 US Census demographic data.
In 2013, the 1058 AUGS practicing physicians represented 0.13% of the total physician workforce. There were 6.7 AUGS members available for every 1 million women and 20 AUGS members for every 1 million postreproductive-aged women in the United States. The density of female pelvic surgeons was highest in metropolitan areas. Overall, 88% of the counties in the United States lacked female pelvic surgeons. Nationwide, there was a mean of 1 AUGS member for every 31 practicing general obstetrician-gynecologists.
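The density figures above imply the population denominators behind them, which gives a quick sanity check on the arithmetic. The sketch below back-calculates those denominators from the reported rates; the results are only as precise as the rounded rates, and the variable names are illustrative.

```python
# Figures reported in the abstract.
augs_members = 1058
per_million_women = 6.7            # AUGS members per 1 million women
per_million_postreproductive = 20  # AUGS members per 1 million postreproductive-aged women
workforce_share = 0.0013           # 0.13% of the total physician workforce

# Back-calculated denominators (approximate, because the published rates are rounded).
implied_women = augs_members / per_million_women * 1e6
implied_postreproductive = augs_members / per_million_postreproductive * 1e6
implied_physicians = augs_members / workforce_share

print(f"implied women: {implied_women / 1e6:.1f} million")
print(f"implied postreproductive-aged women: {implied_postreproductive / 1e6:.1f} million")
print(f"implied physician workforce: {implied_physicians:,.0f}")
```

The implied count of roughly 158 million women is in line with the use of 2010 US Census demographic data described in the methods.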
These findings have implications for training, recruiting, and retaining female pelvic surgeons. The uneven distribution of female pelvic surgeons throughout the United States is likely to worsen as graduating female pelvic medicine and reconstructive surgery fellows continue to cluster in urban areas.
The purposes of this study were 1) to examine the performance of a new multimarker regression approach for model-free linkage analysis in comparison to a conventional multipoint approach, and 2) to determine whether a conditioning strategy would improve the performance of the conventional multipoint method when applied to data from two interacting loci. Linkage analysis of the Kofendrerd Personality Disorder phenotype to chromosomes 1 and 3 was performed in three populations for all 100 replicates of the Genetic Analysis Workshop 14 simulated data. Three approaches were used: a conventional multipoint analysis using the Zlr statistic as calculated in the program ALLEGRO; a conditioning approach in which the per-family contribution on one chromosome was weighted according to evidence for linkage on the other chromosome; and a novel multimarker regression approach. The multipoint and multimarker approaches were generally successful in localizing known susceptibility loci on chromosomes 1 and 3, and were found to give broadly similar results. No advantage was found with the per-family conditioning approach. The effect on power and type I error of different choices of weighting scheme (to account for different numbers of affected siblings) in the multimarker approach was examined.
Well designed, large comparative effectiveness trials assessing the efficacy of primary interventions for faecal incontinence are few in number. The objectives of this study were to compare different combinations of anorectal manometry-assisted biofeedback, loperamide, education, and oral placebo.
In this randomised factorial trial, participants were recruited from eight clinical sites in the USA. Women with at least one episode of faecal incontinence per month in the past 3 months were randomly assigned 0·5:1:1:1 to one of four groups: oral placebo plus education only, placebo plus anorectal manometry-assisted biofeedback, loperamide plus education only, and loperamide plus anorectal manometry-assisted biofeedback. Participants received 2 mg per day of loperamide or oral placebo with the option of dose escalation or reduction. Women assigned to biofeedback received six visits, including strength and sensory biofeedback training. All participants received a standardised faecal incontinence patient education pamphlet and were followed for 24 weeks after starting treatment. The primary endpoint was change in St Mark's (Vaizey) faecal incontinence severity score between baseline and 24 weeks, analysed by intention-to-treat using general linear mixed modelling. Investigators, interviewers, and outcome evaluators were masked to biofeedback assignment. Participants and all study staff other than the research pharmacist were masked to medication assignment. Randomisation took place within the electronic data capture system, was stratified by site using randomly permuted blocks (block size 7), and the sizes of the blocks and the allocation sequence were known only to the data coordinating centre. This trial is registered with ClinicalTrials.gov, number NCT02008565.
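The allocation scheme described above (ratio 0·5:1:1:1, permuted blocks of size 7, stratified by site) can be sketched as follows. Scaling the ratio to whole numbers gives 1:2:2:2, which sums to the stated block size of 7. This is a minimal illustration, not the trial's actual randomisation code: the site names, site sizes, seed, and function names are hypothetical, and a real system would keep the sequence concealed within the data coordinating centre.

```python
import random
from collections import Counter

# Slots per block of 7, from the 0.5:1:1:1 ratio scaled to whole numbers (1:2:2:2).
BLOCK_SLOTS = {
    "placebo + education": 1,
    "placebo + biofeedback": 2,
    "loperamide + education": 2,
    "loperamide + biofeedback": 2,
}

def permuted_block(rng):
    """Return one randomly ordered block containing every arm at its ratio."""
    block = [arm for arm, slots in BLOCK_SLOTS.items() for _ in range(slots)]
    rng.shuffle(block)
    return block

def site_sequence(n_participants, rng):
    """Concatenate permuted blocks until the site has enough assignments."""
    sequence = []
    while len(sequence) < n_participants:
        sequence.extend(permuted_block(rng))
    return sequence[:n_participants]

def stratified_allocation(site_sizes, seed):
    """Build a separate blocked sequence for each site (stratification by site)."""
    rng = random.Random(seed)
    return {site: site_sequence(n, rng) for site, n in site_sizes.items()}

# Hypothetical site names and sizes, for illustration only.
allocation = stratified_allocation({"site_A": 14, "site_B": 21}, seed=2014)
for site, sequence in allocation.items():
    print(site, dict(Counter(sequence)))
```

Blocking guarantees that every completed block within a site holds the arms in exactly the target ratio, so imbalance at any site can never exceed what a single partial block allows.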
Between April 1, 2014, and Sept 30, 2015, 377 women were enrolled, of whom 300 were randomly assigned to placebo plus education (n=42), placebo plus biofeedback (n=84), loperamide plus education (n=88), and the combined intervention of loperamide plus biofeedback (n=86). At 24 weeks, there were no differences between loperamide versus placebo (model estimated score change -1·5 points, 95% CI -3·4 to 0·4, p=0·12), biofeedback versus education (-0·7 points, -2·6 to 1·2, p=0·47), and loperamide and biofeedback versus placebo and biofeedback (-1·9 points, -4·1 to 0·3, p=0·092) or versus loperamide plus education (-1·1 points, -3·4 to 1·1, p=0·33). Constipation was the most common grade 3 or higher adverse event and was reported by two (2%) of 86 participants in the loperamide and biofeedback group and two (2%) of 88 in the loperamide plus education group. The percentage of participants with any serious adverse events did not differ between the treatment groups. Only one serious adverse event was considered related to treatment (small bowel obstruction in the placebo and biofeedback group).
In women with normal stool consistency and faecal incontinence bothersome enough to seek treatment, we were unable to find evidence against the null hypotheses that loperamide is equivalent to placebo, that anal exercises with biofeedback is equivalent to an educational pamphlet, and that loperamide and biofeedback are equivalent to oral placebo and biofeedback or loperamide plus an educational pamphlet. Because these are common first-line treatments for faecal incontinence, clinicians could consider combining loperamide, anal manometry-assisted biofeedback, and a standard educational pamphlet, but this is likely to result in only negligible improvement over individual therapies and patients should be counselled regarding possible constipation.
Eunice Kennedy Shriver National Institute of Child Health and Human Development and the National Institutes of Health Office of Research on Women's Health.