von Willebrand factor (VWF) is an essential component of hemostasis and has been implicated in thrombosis. Multimer size and the amount of circulating VWF are known to impact hemostatic function. We associated 78 VWF single nucleotide polymorphisms (SNPs) and haplotypes constructed from those SNPs with VWF antigen level in 7856 subjects of European descent. Among the nongenomic factors, age and body mass index contributed 4.8% and 1.6% of VWF variation, respectively. The SNP rs514659 (which tags blood type O) contributed 15.4% of the variance. Among the VWF SNPs, we identified 18 SNPs that are associated with levels of VWF. The correlative SNPs are either intronic (89%) or silent exonic (11%). Although the SNPs examined are distributed throughout the entire VWF gene without apparent clustering, all the positive SNPs are located in a 50-kb region. Exons in this region encode the VWF D2, D′, and D3 domains, which are known to regulate VWF multimerization and storage. Mutations in the D3 domain are also associated with von Willebrand disease. Fifteen of these 18 correlative SNPs are in 2 distinct haplotype blocks. In summary, we identified a cluster of intronic VWF SNPs that associate with plasma levels of VWF, individually or additively, in a large cohort of healthy subjects.
Genetic studies of populations from the Indian subcontinent are of great interest because of India's large population size, complex demographic history, and unique social structure. Despite recent large-scale efforts in discovering human genetic variation, India's vast reservoir of genetic diversity remains largely unexplored.
To analyze an unbiased sample of genetic diversity in India and to investigate human migration history in Eurasia, we resequenced one 100-kb ENCODE region in 92 samples collected from three castes and one tribal group from the state of Andhra Pradesh in south India. Analyses of the four Indian populations, along with eight HapMap populations (692 samples), showed that 30% of all SNPs in the south Indian populations are not seen in HapMap populations. Several Indian populations, such as the Yadava, Mala/Madiga, and Irula, have nucleotide diversity levels as high as those of HapMap African populations. Using unbiased allele-frequency spectra, we investigated the expansion of human populations into Eurasia. The divergence time estimates among the major population groups suggest that Eurasian populations in this study diverged from Africans during the same time frame (approximately 90 to 110 thousand years ago). The divergence among different Eurasian populations occurred more than 40,000 years after their divergence with Africans.
Our results show that Indian populations harbor large amounts of genetic variation that have not been surveyed adequately by public SNP discovery efforts. Our data also support a delayed expansion hypothesis in which an ancestral Eurasian founding population remained isolated long after the out-of-Africa diaspora, before expanding throughout Eurasia.
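The nucleotide diversity comparison described above can be illustrated with a minimal sketch. It assumes the common definition of nucleotide diversity as the average number of pairwise differences per site among aligned sequences; the function name and the toy sequences are hypothetical and are not the study's data or pipeline.

```python
from itertools import combinations


def nucleotide_diversity(sequences):
    """Average pairwise differences per site across aligned sequences.

    Assumes all sequences are aligned and of equal length (illustrative only).
    """
    if len(sequences) < 2:
        raise ValueError("need at least two sequences")
    length = len(sequences[0])
    if length == 0 or any(len(s) != length for s in sequences):
        raise ValueError("sequences must be non-empty and of equal length")

    total_diff = 0
    n_pairs = 0
    for a, b in combinations(sequences, 2):
        # Count mismatching sites for this pair of sequences.
        total_diff += sum(x != y for x, y in zip(a, b))
        n_pairs += 1
    return total_diff / (n_pairs * length)


# Hypothetical toy alignment, not real data.
toy_alignment = ["ACGTACGT", "ACGTACGA", "ACGAACGT"]
pi = nucleotide_diversity(toy_alignment)
print(f"nucleotide diversity: {pi:.4f}")
```

A higher value means that two randomly chosen sequences differ at more sites, which is the sense in which the abstract compares diversity across populations.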
Until recently, sequencing has primarily been carried out in large genome centers which have invested heavily in developing the computational infrastructure that enables genomic sequence analysis. The recent advancements in next generation sequencing (NGS) have led to a wide dissemination of sequencing technologies and data to highly diverse research groups. It is expected that clinical sequencing will become part of diagnostic routines shortly. However, limited accessibility to computational infrastructure and high quality bioinformatic tools, and the demand for personnel skilled in data analysis and interpretation, remain a serious bottleneck. Cloud computing and Software-as-a-Service (SaaS) technologies can help address these issues.
We successfully enabled the Atlas2 Cloud pipeline for personal genome analysis on two different cloud service platforms: a community cloud via the Genboree Workbench, and a commercial cloud via Amazon Web Services using a Software-as-a-Service model. We report a case study of personal genome analysis using our Atlas2 Genboree pipeline. We also outline a detailed cost structure for running Atlas2 Amazon on whole exome capture data, providing cost projections in terms of storage, compute and I/O when running Atlas2 Amazon on a large data set.
We find that providing a web interface and an optimized pipeline clearly facilitates usage of cloud computing for personal genome analysis, but for it to be routinely used for large scale projects there needs to be a paradigm shift in the way we develop tools, in standard operating procedures, and in funding mechanisms.
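A cost projection split into storage, compute and I/O, as the abstract describes, might be sketched as follows. Everything here is an assumption for illustration: the `CloudRates` fields, the `project_cost` function, and all numeric rates are hypothetical placeholders, not actual Amazon Web Services prices and not the figures reported for Atlas2 Amazon.

```python
from dataclasses import dataclass


@dataclass
class CloudRates:
    """Hypothetical unit prices; replace with a provider's real rates."""
    storage_per_gb_month: float
    compute_per_hour: float
    transfer_per_gb: float


def project_cost(n_samples, gb_per_sample, hours_per_sample,
                 months_stored, rates):
    """Return a per-category cost breakdown for a batch of samples."""
    total_gb = n_samples * gb_per_sample
    storage = total_gb * months_stored * rates.storage_per_gb_month
    compute = n_samples * hours_per_sample * rates.compute_per_hour
    io = total_gb * rates.transfer_per_gb
    return {
        "storage": storage,
        "compute": compute,
        "io": io,
        "total": storage + compute + io,
    }


# Placeholder numbers chosen only to make the arithmetic easy to follow.
rates = CloudRates(storage_per_gb_month=0.10,
                   compute_per_hour=0.50,
                   transfer_per_gb=0.05)
breakdown = project_cost(n_samples=100, gb_per_sample=10,
                         hours_per_sample=4, months_stored=1, rates=rates)
for category, cost in breakdown.items():
    print(f"{category}: {cost:.2f}")
```

Keeping the three categories separate makes it easy to see which one dominates as the number of samples grows.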
Factor VIII (FVIII) functions as a cofactor for factor IXa in the contact coagulation pathway and circulates in a protective complex with von Willebrand factor (VWF). Plasma FVIII activity is strongly influenced by environmental and genetic factors through VWF-dependent and -independent mechanisms. Single nucleotide polymorphisms (SNPs) of the coding and promoter sequence in the FVIII gene have been extensively studied for effects on FVIII synthesis, secretion, and activity, but impacts of non–disease-causing intronic SNPs remain largely unknown. We analyzed FVIII SNPs and FVIII activity in 10 434 healthy Americans of European (EA) or African (AA) descent in the Atherosclerosis Risk in Communities (ARIC) study. Among covariates, age, race, diabetes, and ABO contributed 2.2%, 3.5%, 4%, and 10.7% to FVIII intersubject variation, respectively. Four intronic FVIII SNPs associated with FVIII activity and 8 with FVIII-VWF ratio in a sex- and race-dependent manner. The FVIII haplotypes AT and GCTTTT also associated with FVIII activity. Seven VWF SNPs were associated with FVIII activity in EA subjects, but no FVIII SNPs were associated with VWF Ag. These data demonstrate that intronic SNPs could directly or indirectly influence intersubject variation of FVIII activity. Further investigation may reveal novel mechanisms of regulating FVIII expression and activity.
Rhesus macaques are the most widely utilized nonhuman primate model in biomedical research. Previous efforts have validated fewer than 900 single nucleotide polymorphisms (SNPs) in this species, which limits opportunities for genetic studies related to health and disease. Extensive information about SNPs and other genetic variation in rhesus macaques would facilitate valuable genetic analyses, as well as provide markers for genome-wide linkage analysis and the genetic management of captive breeding colonies.
We used the available rhesus macaque draft genome sequence, new sequence data from unrelated individuals and existing published sequence data to create a genome-wide SNP resource for Indian-origin rhesus monkeys. The original reference animal and two additional Indian-origin individuals were resequenced to low coverage using SOLiD™ sequencing. We then used three strategies to validate SNPs: comparison of potential SNPs found in the same individual using two different sequencing chemistries, and comparison of potential SNPs in different individuals identified with either the same or different sequencing chemistries. Our approach validated approximately 3 million SNPs distributed across the genome. Preliminary analysis of SNP annotations suggests that a substantial number of these macaque SNPs may have functional effects. More than 700 non-synonymous SNPs were scored by Polyphen-2 as either possibly or probably damaging to protein function and these variants now constitute potential models for studying functional genetic variation relevant to human physiology and disease.
Resequencing of a small number of animals identified more than 3 million SNPs. This provides a significant new information resource for rhesus macaques, an important research animal. The data also suggest that overall genetic variation is high in this species. We identified many potentially damaging non-synonymous coding SNPs, providing new opportunities to identify rhesus models for human disease.
In this paper we report molecular-dynamics-based atomistic simulations of the deposition of Al atoms onto a Cu substrate and of the subsequent nanoindentation of the resulting nanostructured material. The effects of incident energy on the morphology of the deposited thin film and on the mechanical properties of this nanostructured material are emphasized. The results reveal that the morphology of the growing film is layer-by-layer-like at incident energies of 0.1–10 eV. The epitaxial mode of film growth is observed at incident energies below 1 eV, but a film-mixing mode commences when the incident energy increases to 10 eV, accompanied by increased disorder of the film structure, which improves the quality of the deposited thin film. The subsequent indentation studies indicate that the deposited thin films possess lower stiffness than single-crystal Al because of the considerable number of defects in them, but the Cu substrate is strengthened by the interface generated by the lattice mismatch between the deposited Al thin film and the Cu substrate.
Whole genome analysis in large samples from a single population is needed to provide adequate power to assess relative strengths of natural selection across different functional components of the genome. In this study, we analyzed next-generation sequencing data from 962 European Americans, and found that, as expected, approximately 60% of the top 1% of positive selection signals lie in intergenic regions, 33% in intronic regions, and slightly over 1% in coding regions. Several detailed functional annotation categories in intergenic regions showed statistically significant enrichment in positively selected loci when compared to the null distribution of the genomic span of ENCODE categories. There was a significant enrichment of purifying selection signals detected in enhancers, transcription factor binding sites, microRNAs and target sites, but not on lincRNA or piRNAs, suggesting different evolutionary constraints for these domains. Loci in "repressed or low activity regions" and loci near or overlapping the transcription start site were the most significantly over-represented annotations among the top 1% of signals for positive selection.
Background
Current copy number variation (CNV) identification methods have rapidly become mature. However, the postdetection processes such as variant interpretation or reporting are inefficient. To overcome this situation, we developed REDBot as an automated software package for accurate and direct generation of clinical diagnostic reports for prenatal and products of conception (POC) samples.
Methods
We applied natural language processing (NLP) methods to analyze 30,235 in‐house historical clinical reports through active learning, and then developed clinical knowledge bases, evidence‐based interpretation methods and reporting criteria to support the whole postdetection pipeline.
Results
Of the 30,235 reports, we obtained 37,175 CNV‐paragraph pairs. For these pairs, the active learning approaches achieved a 0.9466 average F1‐score in sentence classification. The overall accuracy for variant classification was 95.7%, 95.2%, and 100.0% in retrospective, prospective, and clinical utility experiments, respectively.
Conclusion
By integrating NLP methods into the CNV postdetection pipeline, REDBot is a robust and rapid tool with clinical utility for prenatal and POC diagnosis.
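For readers unfamiliar with the F1-score reported in the Results, the following minimal sketch shows how it is derived from true positives, false positives and false negatives. It assumes that "average F1-score" means an unweighted (macro) mean over classes, which the abstract does not specify; the function names and the counts are hypothetical and are not REDBot's data.

```python
def f1_score(tp, fp, fn):
    """F1 is the harmonic mean of precision and recall."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def macro_f1(per_class_counts):
    """Unweighted mean of per-class F1 (an assumed averaging scheme)."""
    scores = [f1_score(tp, fp, fn) for tp, fp, fn in per_class_counts]
    return sum(scores) / len(scores)


# Hypothetical (tp, fp, fn) counts for two sentence classes.
counts = [(8, 2, 2), (5, 0, 5)]
print(f"macro F1: {macro_f1(counts):.4f}")
```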
Next-generation DNA sequencing is opening new avenues for genetic association studies in common diseases that, like deep vein thrombosis (DVT), have a strong genetic predisposition still largely unexplained by currently identified risk variants. In order to develop sequencing and analytical pipelines for the application of next-generation sequencing to complex diseases, we conducted a pilot study sequencing the coding area of 186 hemostatic/proinflammatory genes in 10 Italian cases of idiopathic DVT and 12 healthy controls.
A molecular-barcoding strategy was used to multiplex DNA target capture and sequencing, while retaining individual sequence information. Genomic libraries with barcode sequence-tags were pooled (in pools of 8 or 16 samples) and enriched for target DNA sequences. Sequencing was performed on ABI SOLiD-4 platforms. We produced > 12 gigabases of raw sequence data to sequence at high coverage (average: 42X) the 700-kilobase target area in 22 individuals. A total of 1876 high-quality genetic variants were identified (1778 single nucleotide substitutions and 98 insertions/deletions). Annotation on databases of genetic variation and human disease mutations revealed several novel, potentially deleterious mutations. We tested 576 common variants in a case-control association analysis, carrying the top-5 associations over to replication in up to 719 DVT cases and 719 controls. We also conducted an analysis of the burden of nonsynonymous variants in coagulation factor and anticoagulant genes. We found an excess of rare missense mutations in anticoagulant genes in DVT cases compared to controls and an association for a missense polymorphism of FGA (rs6050; p = 1.9 × 10⁻⁵, OR 1.45; 95% CI, 1.22-1.72; after replication in > 1400 individuals).
We implemented a barcode-based strategy to efficiently multiplex sequencing of hundreds of candidate genes in several individuals. In the relatively small dataset of our pilot study we were able to identify bona fide associations with DVT. Our study illustrates the potential of next-generation sequencing for the discovery of genetic variation predisposing to complex diseases.
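As a rough consistency check on the rs6050 result reported above, the sketch below back-calculates an approximate p-value from the odds ratio and its 95% confidence interval. It assumes a Wald-type interval that is symmetric on the log-odds scale, which the abstract does not state, and the function name is hypothetical. Because the published values are rounded and the authors' exact test is unknown, the result should only be expected to land in the same order of magnitude as the reported p-value.

```python
import math


def p_from_or_ci(odds_ratio, ci_low, ci_high, z_crit=1.959964):
    """Approximate two-sided p-value from an OR and its 95% CI.

    Assumes the CI is exp(log(OR) +/- z_crit * SE), i.e. a Wald interval.
    """
    # Standard error of log(OR), recovered from the CI width.
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z_crit)
    z = math.log(odds_ratio) / se
    # Two-sided normal tail probability via the complementary error function.
    p = math.erfc(abs(z) / math.sqrt(2))
    return z, p


# Values as reported for rs6050 in the abstract.
z, p = p_from_or_ci(1.45, 1.22, 1.72)
print(f"z = {z:.2f}, approximate p = {p:.2e}")
```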