Retinal vascular caliber provides information about the structure and health of the microvascular system and is associated with cardiovascular and cerebrovascular diseases. Compared to European Americans, African Americans tend to have wider retinal arteriolar and venular caliber, even after controlling for cardiovascular risk factors. This has suggested the hypothesis that differences in genetic background may contribute to racial/ethnic differences in retinal vascular caliber. Using 1,365 ancestry-informative SNPs, we estimated the percentage of African ancestry (PAA) and conducted genome-wide admixture mapping scans in 1,737 African Americans from the Atherosclerosis Risk in Communities (ARIC) study. Central retinal artery equivalent (CRAE) and central retinal vein equivalent (CRVE), representing summary measures of retinal arteriolar and venular caliber, respectively, were measured from retinal photographs. PAA was significantly correlated with CRVE (rho = 0.071, P = 0.003), but not CRAE (rho = 0.032, P = 0.182). Using admixture mapping, we did not detect significant admixture association with either CRAE (genome-wide score = -0.73) or CRVE (genome-wide score = -0.69). An a priori subgroup analysis among hypertensive individuals detected a genome-wide significant association of CRVE with greater African ancestry at chromosome 6p21.1 (genome-wide score = 2.31, locus-specific LOD = 5.47). Each additional copy of an African ancestral allele at the 6p21.1 peak was associated with an average increase in CRVE of 6.14 μm in the hypertensives, but had no significant effect in the non-hypertensives (P for heterogeneity <0.001). Further mapping in the 6p21.1 region may uncover novel genetic variants affecting retinal vascular caliber and yield further insights into the interaction between genetic effects of the microvascular system and hypertension.
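As a rough illustration of what a "per additional copy" effect means, the sketch below fits an ordinary least-squares line of a caliber measure on the number of African ancestral alleles (0, 1, or 2) at a locus. The function name `per_allele_effect` and the toy values are hypothetical and are not taken from the study; the published analysis presumably adjusted for covariates, which this minimal example does not.

```python
def per_allele_effect(copies, values):
    """Least-squares slope and intercept of `values` on allele copy number.

    `copies` holds 0, 1, or 2 ancestral alleles per individual (hypothetical
    input); `values` holds the matching trait measurements.
    """
    n = len(copies)
    mean_x = sum(copies) / n
    mean_y = sum(values) / n
    # Sum of squares of x and cross-products of x and y around their means.
    sxx = sum((x - mean_x) ** 2 for x in copies)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(copies, values))
    slope = sxy / sxx                    # change in trait per extra copy
    intercept = mean_y - slope * mean_x  # expected trait at zero copies
    return slope, intercept


# Toy data, invented for illustration only.
toy_copies = [0, 1, 2, 0, 1, 2]
toy_crve = [200.0, 206.0, 212.0, 202.0, 208.0, 214.0]
slope, intercept = per_allele_effect(toy_copies, toy_crve)
print(slope, intercept)  # prints: 6.0 201.0
```

The slope is the quantity that corresponds to an additive per-allele effect; a subgroup contrast such as hypertensive versus non-hypertensive would compare slopes fitted separately in each group.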
The Potsdam Textbook Corpus (PoTeC) is a naturalistic eye-tracking-while-reading corpus containing data from 75 participants reading 12 scientific texts. PoTeC is the first naturalistic eye-tracking-while-reading corpus that contains eye movements from domain experts as well as novices in a within-participant manipulation: it is based on a 2x2x2 fully-crossed factorial design which includes the participants' level of study and the participants' discipline of study as between-subject factors and the text domain as a within-subject factor. The participants' reading comprehension was assessed by a series of text comprehension questions, and their domain knowledge was tested by text-independent background questions for each of the texts. The materials are annotated for a variety of linguistic features at different levels. We envision PoTeC to be used for a wide range of studies including but not limited to analyses of expert and non-expert reading strategies. The corpus, all accompanying data at all stages of the preprocessing pipeline, and all code used to preprocess the data are made available via GitHub: https://github.com/DiLi-Lab/PoTeC.
Population geneticists often study small numbers of carefully chosen loci, but it has become possible to obtain orders of magnitude more data from overlaps of genome sequences. Here, we generate tens of millions of base pairs of multiple sequence alignments from combinations of three western chimpanzees, three central chimpanzees, an eastern chimpanzee, a bonobo, a human, an orangutan, and a macaque. Analysis provides a more precise understanding of demographic history than was previously available. We show that bonobos and common chimpanzees were separated ~1,290,000 years ago, western and other common chimpanzees ~510,000 years ago, and eastern and central chimpanzees at least 50,000 years ago. We infer that the central chimpanzee population size increased by at least a factor of 4 since its separation from western chimpanzees, while the western chimpanzee effective population size decreased. Surprisingly, in about one percent of the genome, the genetic relationships between humans, chimpanzees, and bonobos appear to be different from the species relationships. We used PCR-based resequencing to confirm 11 regions where chimpanzees and bonobos are not most closely related. Study of such loci should provide information about the period of time 5-7 million years ago when the ancestors of humans separated from those of the chimpanzees.
Variation in gene expression is a fundamental aspect of human phenotypic variation. Several recent studies have analyzed gene expression levels in populations of different continental ancestry and reported population differences at a large number of genes. However, these differences could largely be due to non-genetic (e.g., environmental) effects. Here, we analyze gene expression levels in African American cell lines, which differ from previously analyzed cell lines in that individuals from this population inherit variable proportions of two continental ancestries. We first relate gene expression levels in individual African Americans to their genome-wide proportion of European ancestry. The results provide strong evidence of a genetic contribution to expression differences between European and African populations, validating previous findings. Second, we infer local ancestry (0, 1, or 2 European chromosomes) at each location in the genome and investigate the effects of ancestry proximal to the expressed gene (cis) versus ancestry elsewhere in the genome (trans). Both effects are highly significant, and we estimate that 12±3% of all heritable variation in human gene expression is due to cis variants.
Eye movements in reading play a crucial role in psycholinguistic research
studying the cognitive mechanisms underlying human language processing. More
recently, the tight coupling between eye movements and cognition has also been
leveraged for language-related machine learning tasks such as the
interpretability, enhancement, and pre-training of language models, as well as
the inference of reader- and text-specific properties. However, scarcity of eye
movement data and its unavailability at application time pose a major
challenge for this line of research. Initially, this problem was tackled by
resorting to cognitive models for synthesizing eye movement data. However, for
the sole purpose of generating human-like scanpaths, purely data-driven
machine-learning-based methods have proven to be more suitable. Following
recent advances in adapting diffusion processes to discrete data, we propose
ScanDL, a novel discrete sequence-to-sequence diffusion model that generates
synthetic scanpaths on texts. By leveraging pre-trained word representations
and jointly embedding both the stimulus text and the fixation sequence, our
model captures multi-modal interactions between the two inputs. We evaluate
ScanDL within- and across-dataset and demonstrate that it significantly
outperforms state-of-the-art scanpath generation methods. Finally, we provide
an extensive psycholinguistic analysis that underlines the model's ability to
exhibit human-like reading behavior. Our implementation is made available at
https://github.com/DiLi-Lab/ScanDL.
Human gaze data offer cognitive information that reflects natural language comprehension. Indeed, augmenting language models with human scanpaths has proven beneficial for a range of NLP tasks, including language understanding. However, the applicability of this approach is hampered because the abundance of text corpora is contrasted by a scarcity of gaze data. Although models for the generation of human-like scanpaths during reading have been developed, the potential of synthetic gaze data across NLP tasks remains largely unexplored. We develop a model that integrates synthetic scanpath generation with a scanpath-augmented language model, eliminating the need for human gaze data. Since the model's error gradient can be propagated throughout all parts of the model, the scanpath generator can be fine-tuned to downstream tasks. We find that the proposed model not only outperforms the underlying language model, but achieves a performance that is comparable to a language model augmented with real human gaze data. Our code is publicly available.
The Icelandic population has been sampled in many disease association studies, providing a strong motivation to understand the structure of this population and its ramifications for disease gene mapping. Previous work using 40 microsatellites showed that the Icelandic population is relatively homogeneous, but exhibits subtle population structure that can bias disease association statistics. Here, we show that regional geographic ancestries of individuals from Iceland can be distinguished using 292,289 autosomal single-nucleotide polymorphisms (SNPs). We further show that subpopulation differences are due to genetic drift since the settlement of Iceland 1100 years ago, and not to varying contributions from different ancestral populations. A consequence of the recent origin of Icelandic population structure is that allele frequency differences follow a null distribution devoid of outliers, so that the risk of false positive associations due to stratification is minimal. Our results highlight an important distinction between population differences attributable to recent drift and those arising from more ancient divergence, which has implications both for association studies and for efforts to detect natural selection using population differentiation.
Adipocytokines are a subset of cytokines produced by adipose tissue and are associated with risk of type 2 diabetes and atherosclerosis. Levels of adipocytokines differ between Black and White Americans, even after adjustment for differences in adiposity, diseases associated with adipocytokines including type 2 diabetes and cardiovascular disease, and general socioeconomic status indicators such as income. We used a series of ancestry informative markers to estimate genetic ancestry in a population-based study of older Black Americans, and examined the association between genetic ancestry and adipocytokines and soluble receptors to help determine which of these may be most amenable to admixture mapping. We typed 35 ancestry informative markers in 1,241 self-reported Black Americans with available DNA from the Health, Aging, and Body Composition (Health ABC) study and used a maximum likelihood approach to estimate percent European ancestry. We used linear regression models to determine the association between these adipocytokines and percent ancestry, and staged models to examine whether adiposity or other measures affected the associations of genetic ancestry and adipocytokines. Mean European ancestry was 22.3 ± 15.9%. In multivariate adjusted models, the strongest associations observed were between higher European ancestry and interleukin-6 soluble receptor (IL-6 SR), C-reactive protein (CRP), and adiponectin levels, with interleukin-2 soluble receptor (IL-2 SR) and soluble tumor necrosis factor receptor II (TNF-alpha SR II) also showing more modest but significant associations. The association with adiponectin became stronger after adjustment for adiposity. These novel findings suggest that admixture mapping may identify genetic factors influencing the levels of IL-6 SR, CRP, IL-2 SR, and adiponectin.
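To make the idea of a maximum likelihood ancestry estimate concrete, the following is a minimal sketch and not the study's actual procedure. It assumes a simple two-population model in which an individual's allele frequency at each marker is a mixture of known European and African reference frequencies, with genotypes treated as independent binomial draws. The function names, the grid search, and the toy frequencies are all hypothetical.

```python
import math


def log_likelihood(q, genotypes, freq_eur, freq_afr):
    """Log-likelihood of European ancestry proportion q (assumed model).

    genotypes: count (0, 1, or 2) of the reference allele at each marker.
    freq_eur / freq_afr: reference allele frequency in each parental population.
    """
    total = 0.0
    for g, pe, pa in zip(genotypes, freq_eur, freq_afr):
        f = q * pe + (1.0 - q) * pa          # expected frequency given q
        f = min(max(f, 1e-9), 1.0 - 1e-9)    # guard against log(0)
        total += (math.log(math.comb(2, g))
                  + g * math.log(f)
                  + (2 - g) * math.log(1.0 - f))
    return total


def estimate_european_ancestry(genotypes, freq_eur, freq_afr, steps=100):
    """Grid-search maximum likelihood estimate of q on [0, 1]."""
    grid = [i / steps for i in range(steps + 1)]
    return max(grid, key=lambda q: log_likelihood(q, genotypes, freq_eur, freq_afr))


# Toy example: 10 markers with invented reference frequencies.
toy_genotypes = [1, 1, 1, 1, 1, 1, 0, 0, 0, 0]
toy_freq_eur = [0.9] * 10
toy_freq_afr = [0.1] * 10
print(estimate_european_ancestry(toy_genotypes, toy_freq_eur, toy_freq_afr))  # prints: 0.25
```

A grid search keeps the example short and dependency-free. With few markers the likelihood surface is flat, so estimates of this kind carry substantial per-individual uncertainty.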