We have derived a versatile gene-based test for genome-wide association studies (GWAS). Our approach, called VEGAS (versatile gene-based association study), is applicable to all GWAS designs, including family-based GWAS, meta-analyses of GWAS on the basis of summary data, and DNA-pooling-based GWAS, where existing approaches based on permutation are not possible, as well as singleton data, where they are. The test incorporates information from a full set of markers (or a defined subset) within a gene and accounts for linkage disequilibrium between markers by using simulations from the multivariate normal distribution. We show that for an association study using singletons, our approach produces results equivalent to those obtained via permutation in a fraction of the computation time. We demonstrate proof-of-principle by using the gene-based test to replicate several genes known to be associated on the basis of results from a family-based GWAS for height in 11,536 individuals and a DNA-pooling-based GWAS for melanoma in ∼1300 cases and controls. Our method has the potential to identify novel associated genes; provide a basis for selecting SNPs for replication; and be directly used in network (pathway) approaches that require per-gene association test statistics. We have implemented the approach in both an easy-to-use web interface, which only requires the uploading of markers with their association p-values, and a separate downloadable application.
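The simulation idea described above can be illustrated with a minimal sketch. It is not the VEGAS implementation: it assumes the gene-level statistic is the sum of per-SNP 1-df chi-square statistics recovered from two-sided p-values, and that the linkage disequilibrium between markers is supplied as a correlation matrix. The function name `gene_based_pvalue` and its parameters are hypothetical.

```python
import numpy as np
from statistics import NormalDist


def gene_based_pvalue(snp_pvalues, ld_matrix, n_sims=10_000, seed=0):
    """Empirical gene-level p-value from per-SNP p-values and an LD matrix.

    Assumption (illustrative only): the gene statistic is the sum of
    per-SNP 1-df chi-square statistics, and its null distribution is
    obtained by simulating correlated z-scores from MVN(0, LD).
    """
    p = np.asarray(snp_pvalues, dtype=float)
    ld = np.asarray(ld_matrix, dtype=float)

    # Convert each two-sided p-value to a z-score; z**2 is the 1-df chi-square.
    z = np.array([NormalDist().inv_cdf(pi / 2.0) for pi in p])
    observed = float(np.sum(z ** 2))

    # Simulate null z-score vectors whose correlation matches the LD matrix.
    rng = np.random.default_rng(seed)
    sims = rng.multivariate_normal(np.zeros(len(p)), ld, size=n_sims)
    null_stats = np.sum(sims ** 2, axis=1)

    # Add-one correction so the empirical p-value is never exactly zero.
    return (1 + np.sum(null_stats >= observed)) / (n_sims + 1)


# Hypothetical three-SNP gene with moderate LD between markers.
ld = [[1.0, 0.8, 0.5],
      [0.8, 1.0, 0.6],
      [0.5, 0.6, 1.0]]
p_gene = gene_based_pvalue([1e-6, 1e-5, 1e-4], ld)
```

Because only marker p-values and an LD matrix are needed, a sketch of this kind works from summary data alone, which is why no individual-level permutation is required.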
Endocannabinoids mediate cellular functions and their activity is controlled by a complex system of enzymes, membrane receptors and transport molecules. Endocannabinoids are present in endometrium, a cyclical regenerative tissue requiring tightly regulated cellular mechanisms for maturation. The objective of this study was to investigate the gene expression of key elements involved in the endocannabinoid system across the menstrual cycle. RNA was isolated from endometrial tissue and genome-wide gene expression datasets were generated using RNA-sequencing. An a priori set of 70 genes associated with the endocannabinoid system was selected from published literature. Gene expression across the menstrual cycle was analyzed using a moderated t test, corrected for multiple testing with Bonferroni's method. A total of 40 of the 70 genes were present in > 90% of the samples, and significant differential gene expression was identified for 29 genes. We identified 4 distinct regulation patterns for synthesizing enzymes, as well as a distinct regulation pattern for degradation and transporting enzymes. This study charts the expression of endometrial endocannabinoid system genes across the menstrual cycle. Altered expression of genes that control endocannabinoids may allow fine control over endocannabinoid concentrations and their influence on cellular function, maturation and differentiation as the endometrium matures through the menstrual cycle.
We have previously shown that a quantitative-trait locus linked to the OCA2 region of 15q accounts for 74% of variation in human eye color. We conducted additional genotyping to clarify the role of the OCA2 locus in the inheritance of eye color and other pigmentary traits associated with skin-cancer risk in white populations. Fifty-eight synonymous and nonsynonymous exonic single-nucleotide polymorphisms (SNPs) and tagging SNPs were typed in a collection of 3,839 adolescent twins, their siblings, and their parents. The highest association for blue/nonblue eye color was found with three OCA2 SNPs: rs7495174 T/C, rs6497268 G/T, and rs11855019 T/C (P values of 1.02 × 10−61, 1.57 × 10−96, and 4.45 × 10−54, respectively) in intron 1. These three SNPs are in one major haplotype block, with TGT representing 78.4% of alleles. The TGT/TGT diplotype found in 62.2% of samples was the major genotype seen to modify eye color, with a frequency of 0.905 in blue or green compared with only 0.095 in brown eye color. This genotype was also at highest frequency in subjects with light brown hair and was more frequent in fair and medium skin types, consistent with the TGT haplotype acting as a recessive modifier of lighter pigmentary phenotypes. Homozygotes for rs11855019 C/C were predominantly without freckles and had lower mole counts. The minor population impact of the nonsynonymous coding-region polymorphisms Arg305Trp and Arg419Gln associated with nonblue eyes and the tight linkage of the major TGT haplotype within the intron 1 of OCA2 with blue eye color and lighter hair and skin tones suggest that differences within the 5′ proximal regulatory control region of the OCA2 gene alter expression or messenger RNA–transcript levels and may be responsible for these associations.
Abstract
STUDY QUESTION
Do genetic effects regulate gene expression in human endometrium?
SUMMARY ANSWER
This study demonstrated strong genetic effects on endometrial gene expression and some evidence for genetic regulation of gene expression in a menstrual cycle stage-specific manner.
WHAT IS KNOWN ALREADY
Genetic effects on expression levels for many genes are tissue specific. Endometrial gene expression varies across menstrual cycle stages and between individuals, but there are limited data on genetic control of expression in endometrium.
STUDY DESIGN, SIZE, DURATION
We analysed genome-wide genotype and gene expression data to map cis expression quantitative trait loci (eQTL) in endometrium.
PARTICIPANTS/MATERIALS, SETTING, METHODS
We recruited 123 women of European ancestry. DNA samples from blood were genotyped on Illumina HumanCoreExome chips. Total RNA was extracted from endometrial tissues. Whole-transcriptome profiles were characterized using Illumina Human HT-12 v4.0 Expression Beadchips. We performed eQTL mapping with ~8 000 000 genotyped and imputed single nucleotide polymorphisms (SNPs) and 12 329 genes.
MAIN RESULTS AND THE ROLE OF CHANCE
We identified a total of 18 595 cis SNP-probe associations at a study-wide level of significance (P < 1 × 10−7), which correspond to independent eQTLs for 198 unique genes. The eQTLs with the largest effect in endometrial tissue were rs4902335 for CHURC1 (P = 1.05 × 10−32) and rs147253019 for ZP3 (P = 8.22 × 10−30). We further performed a context-specific eQTL analysis to investigate if genetic effects on gene expression regulation act in a menstrual cycle-specific manner. Interestingly, five cis-eQTLs were identified with a significant stage-by-genotype interaction. The strongest stage interaction was the eQTL for C10ORF33 (PYROXD2) with SNP rs2296438 (P = 2.0 × 10−4), where we observe a 2-fold difference in the average expression levels of heterozygous samples depending on the stage of the menstrual cycle.
LARGE SCALE DATA
The summary eQTL results are publicly available to browse or download.
LIMITATIONS, REASONS FOR CAUTION
A limitation of the present study was the relatively modest sample size. It was not powered to identify trans-eQTLs and larger sample sizes will also be needed to provide better power to detect cis-eQTLs and cycle stage-specific effects, given the substantial changes in expression across the menstrual cycle for many genes.
WIDER IMPLICATIONS OF THE FINDINGS
Identification of endometrial eQTLs provides a platform for better understanding genetic effects on endometriosis risk and other endometrial-related pathologies.
STUDY FUNDING/COMPETING INTEREST(S)
Funding for this work was provided by NHMRC Project Grants GNT1026033, GNT1049472, GNT1046880, GNT1050208, GNT1105321 and APP1083405. There are no competing interests.
Abstract
STUDY QUESTION
Are genetic effects on endometrial gene expression tissue specific and/or associated with reproductive traits and diseases?
SUMMARY ANSWER
Analyses of RNA-sequence data and individual genotype data from the endometrium identified novel and disease-associated genetic mechanisms regulating gene expression in the endometrium and showed evidence that these mechanisms are shared across biologically similar tissues.
WHAT IS KNOWN ALREADY
The endometrium is a complex tissue vital for female reproduction and is a hypothesized source of cells initiating endometriosis. Understanding genetic regulation specific to, and shared between, tissue types can aid the identification of genes involved in complex genetic diseases.
STUDY DESIGN, SIZE, DURATION
RNA-sequence and genotype data from 206 individuals were analysed and results were compared with large publicly available datasets.
PARTICIPANTS/MATERIALS, SETTING, METHODS
RNA-sequencing and genotype data from 206 endometrial samples were used to identify the influence of genetic variants on gene expression, via expression quantitative trait loci (eQTL) analysis, and to compare these endometrial eQTLs with those in other tissues. To investigate the association between endometrial gene expression regulation and reproductive traits and diseases, we conducted a tissue enrichment analysis, transcriptome-wide association study (TWAS) and summary data-based Mendelian randomisation (SMR) analyses. Transcriptomic data were used to test differential gene expression between women with and without endometriosis.
MAIN RESULTS AND THE ROLE OF CHANCE
A tissue enrichment analysis with endometriosis genome-wide association study summary statistics showed that genes surrounding endometriosis risk loci were significantly enriched in reproductive tissues. A total of 444 sentinel cis-eQTLs (P < 2.57 × 10−9) and 30 trans-eQTLs (P < 4.65 × 10−13) were detected, including 327 novel cis-eQTLs in endometrium. A large proportion (85%) of endometrial eQTLs are present in other tissues. Genetic effects on endometrial gene expression were highly correlated with the genetic effects on reproductive (e.g. uterus, ovary) and digestive tissues (e.g. salivary gland, stomach), supporting a shared genetic regulation of gene expression in biologically similar tissues. The TWAS analysis indicated that gene expression at 39 loci is associated with endometriosis, including five known endometriosis risk loci. SMR analyses identified potential target genes pleiotropically or causally associated with reproductive traits and diseases including endometriosis. However, without taking account of genetic variants, a direct comparison between women with and without endometriosis showed no significant difference in endometrial gene expression.
LARGE SCALE DATA
The eQTL dataset generated in this study is available at http://reproductivegenomics.com.au/shiny/endo_eqtl_rna/. Additional datasets supporting the conclusions of this article are included within the article and the supplementary information files, or are available on reasonable request.
LIMITATIONS, REASONS FOR CAUTION
Data are derived from fresh tissue samples and expression levels are an average of expression from different cell types within the endometrium. Subtle cell-specific expression changes may not be detected and differences in cell composition between samples and across the menstrual cycle will contribute to sample variability. Power to detect tissue-specific eQTLs and differences between women with and without endometriosis was limited by the sample size in this study. The statistical approaches used in this study identify the likely gene targets for specific genetic risk factors, but not the functional mechanism by which changes in gene expression may influence disease risk.
WIDER IMPLICATIONS OF THE FINDINGS
Our results identify novel genetic variants that regulate gene expression in endometrium and the majority of these are shared across tissues. This allows analysis with large publicly available datasets to identify targets for female reproductive traits and diseases. Much larger studies will be required to identify genetic regulation of gene expression that will be specific to endometrium.
STUDY FUNDING/COMPETING INTEREST(S)
This work was supported by the National Health and Medical Research Council (NHMRC) under project grants GNT1026033, GNT1049472, GNT1046880, GNT1050208, GNT1105321, GNT1083405 and GNT1107258. G.W.M is supported by a NHMRC Fellowship (GNT1078399). J.Y is supported by an ARC Fellowship (FT180100186). There are no competing interests.
Many health conditions, ranging from psychiatric disorders to cardiovascular disease, display notable seasonal variation in severity and onset. In order to understand the molecular processes underlying this phenomenon, we have examined seasonal variation in the transcriptome of 606 healthy individuals. We show that 74 transcripts associated with a 12-month seasonal cycle were enriched for processes involved in DNA repair and binding. An additional 94 transcripts demonstrated significant seasonal variability that was largely influenced by blood cell count levels. These transcripts were enriched for immune function, protein production, and specific cellular markers for lymphocytes. Accordingly, cell counts for erythrocytes, platelets, neutrophils, monocytes, and CD19 cells demonstrated significant association with a 12-month seasonal cycle. These results demonstrate that seasonal variation is an important environmental regulator of gene expression and blood cell composition. Notable changes in leukocyte counts and genes involved in immune function indicate that immune cell physiology varies throughout the year in healthy individuals.
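One common way to test for a 12-month cycle is a cosinor-style regression, in which expression is modelled as a mean plus sine and cosine terms with a fixed 12-month period. The sketch below illustrates that general technique only; the abstract does not state which model the study used, and the function name `fit_seasonal_cycle` and its inputs are hypothetical.

```python
import numpy as np


def fit_seasonal_cycle(months, values, period=12.0):
    """Least-squares fit of y = mean + b_sin*sin(wt) + b_cos*cos(wt).

    Returns the fitted mean, the cycle amplitude, and the time of the
    peak expressed in months (0 <= peak < period).
    """
    t = 2.0 * np.pi * np.asarray(months, dtype=float) / period
    design = np.column_stack([np.ones_like(t), np.sin(t), np.cos(t)])
    coef, *_ = np.linalg.lstsq(design, np.asarray(values, dtype=float), rcond=None)
    mean, b_sin, b_cos = coef

    # b_sin*sin(wt) + b_cos*cos(wt) equals amplitude*cos(wt - phase).
    amplitude = np.hypot(b_sin, b_cos)
    phase = np.arctan2(b_sin, b_cos)
    peak_month = (phase * period / (2.0 * np.pi)) % period
    return mean, amplitude, peak_month


# Synthetic, noise-free example: mean 5, amplitude 2, peak at month 3.
months = np.linspace(0.0, 24.0, 200, endpoint=False)
values = 5.0 + 2.0 * np.cos(2.0 * np.pi * (months - 3.0) / 12.0)
mean, amplitude, peak_month = fit_seasonal_cycle(months, values)
```

In a real analysis the amplitude would be tested against zero for each transcript, with covariates such as blood cell counts added as extra columns of the design matrix.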
As custom arrays are cheaper than generic GWAS arrays, larger sample size is achievable for gene discovery. Custom arrays can tag more variants through denser genotyping of SNPs at associated loci, but at the cost of losing genome-wide coverage. Balancing this trade-off is important for maximizing experimental designs. We quantified both the gain in captured SNP-heritability at known candidate regions and the loss due to imperfect genome-wide coverage for inflammatory bowel disease using immunochip (iChip) and imputed GWAS data on 61,251 and 38,550 samples, respectively. For Crohn's disease (CD), the iChip and GWAS data explained 19 and 26% of variation in liability, respectively, and SNPs in the densely genotyped iChip regions explained 13% of the SNP-heritability for both the iChip and GWAS data. For ulcerative colitis (UC), the iChip and GWAS data explained 15 and 19% of variation in liability, respectively, and the dense iChip regions explained 10 and 9% of the SNP-heritability in the iChip and the GWAS data. From bivariate analyses, estimates of the genetic correlation in risk between CD and UC were 0.75 (SE 0.017) and 0.62 (SE 0.042) for the iChip and GWAS data, respectively. We also quantified the SNP-heritability of genomic regions that did or did not contain the previous 163 GWAS hits for CD and UC, and SNP-heritability of the overlapping loci between the densely genotyped iChip regions and the 163 GWAS hits. For both diseases, over different genomic partitioning, the densely genotyped regions on the iChip tagged at least as much variation in liability as in the corresponding regions in the GWAS data; however, a certain amount of tagged SNP-heritability in the GWAS data was lost using the iChip due to the low coverage at unselected regions. These results imply that custom arrays with a GWAS backbone will facilitate more gene discovery, both at associated and novel loci.
Inter-individual variation in facial shape is one of the most noticeable phenotypes in humans, and it is clearly under genetic regulation; however, almost nothing is known about the genetic basis of normal human facial morphology. We therefore conducted a genome-wide association study for facial shape phenotypes in multiple discovery and replication cohorts, considering almost ten thousand individuals of European descent from several countries. Phenotyping of facial shape features was based on landmark data obtained from three-dimensional head magnetic resonance images (MRIs) and two-dimensional portrait images. We identified five independent genetic loci associated with different facial phenotypes, suggesting the involvement of five candidate genes--PRDM16, PAX3, TP63, C5orf50, and COL17A1--in the determination of the human face. Three of them have been implicated previously in vertebrate craniofacial development and disease, and the remaining two genes potentially represent novel players in the molecular networks governing facial development. Our finding at PAX3 influencing the position of the nasion replicates a recent GWAS of facial features. In addition to the reported GWA findings, we established links between common DNA variants previously associated with NSCL/P at 2p21, 8q24, 13q31, and 17q22 and normal facial-shape variations based on a candidate gene approach. Overall our study implies that DNA variants in genes essential for craniofacial development contribute with relatively small effect size to the spectrum of normal variation in human facial morphology. This observation has important consequences for future studies aiming to identify more genes involved in the human facial morphology, as well as for potential applications of DNA prediction of facial shape such as in future forensic applications.
Abstract
The ratio of the length of the index finger to that of the ring finger (2D:4D) is sexually dimorphic and is commonly used as a non-invasive biomarker of prenatal androgen exposure. Most association studies of 2D:4D ratio with a diverse range of sex-specific traits have typically involved small sample sizes and have been difficult to replicate, raising questions around the utility and precise meaning of the measure. In the largest genome-wide association meta-analysis of 2D:4D ratio to date (N = 15 661, with replication N = 75 821), we identified 11 loci (9 novel) explaining 3.8% of the variance in mean 2D:4D ratio. We also found weak evidence for association (β = 0.06; P = 0.02) between 2D:4D ratio and sensitivity to testosterone (length of the CAG microsatellite repeat in the androgen receptor [AR] gene) in females only. Furthermore, genetic variants associated with (adult) testosterone levels and/or sex hormone-binding globulin were not associated with 2D:4D ratio in our sample. Although we were unable to find strong evidence from our genetic study to support the hypothesis that 2D:4D ratio is a direct biomarker of prenatal exposure to androgens in healthy individuals, our findings do not explicitly exclude this possibility, and pathways involving testosterone may become apparent as the size of the discovery sample increases further. Our findings provide new insight into the underlying biology shaping 2D:4D variation in the general population.
Despite the important role DNA methylation plays in transcriptional regulation, the transgenerational inheritance of DNA methylation is not well understood. The genetic heritability of DNA methylation has been estimated using twin pairs, although concern has been expressed as to whether the underlying assumption of equal common environmental effects is applicable, due to intrauterine differences between monozygotic and dizygotic twins. We estimate the heritability of DNA methylation in peripheral blood leukocytes, measured with the Illumina HumanMethylation450 array, using a family-based sample of 614 people from 117 families, allowing comparison both within and across generations.
The correlations from the various available relative pairs indicate that on average the similarity in DNA methylation between relatives is predominantly due to genetic effects with any common environmental or zygotic effects being limited. The average heritability of DNA methylation measured at probes with no known SNPs is estimated as 0.187. The ten most heritable methylation probes were investigated with a genome-wide association study, all showing highly statistically significant cis mQTLs. Further investigation of one of these cis mQTL, found in the MHC region of chromosome 6, showed the most significantly associated SNP was also associated with over 200 other DNA methylation probes in this region and the gene expression level of 9 genes.
The majority of transgenerational similarity in DNA methylation is attributable to genetic effects, and approximately 20% of individual differences in DNA methylation in the population are caused by DNA sequence variation that is not located within CpG sites.