Inherited haemoglobinopathies are the most common monogenic diseases, with millions of carriers and patients worldwide. At present, we know several hundred disease-causing mutations on the globin gene clusters, in addition to numerous clinically important trans-acting disease modifiers encoded elsewhere and a multitude of polymorphisms with relevance for advanced diagnostic approaches. Moreover, new disease-linked variations are discovered every year that are not included in traditional and often functionally limited locus-specific databases. This paper presents IthaGenes, a new interactive database of haemoglobin variations, which stores information about genes and variations affecting haemoglobin disorders. In addition, IthaGenes organises phenotype, relevant publications and external links, while embedding the NCBI Sequence Viewer for graphical representation of each variation. Finally, IthaGenes is integrated with the companion tool IthaMaps for the display of corresponding epidemiological data on distribution maps. IthaGenes is incorporated in the ITHANET community portal and is free and publicly available at http://www.ithanet.eu/db/ithagenes.
Initiation of regular transfusion in transfusion-dependent thalassemia (TDT) is based on the assessment of clinical phenotype. Pathogenic HBB variants causing β-thalassemia are important determinants of phenotype and could be used to aid decision making. We investigated the association of HBB genotype with survival in a cohort study in the four thalassemia centres in Cyprus. HBB genotype was classified as severe (β0/β0 or β+/β0), moderate (β+/β+), or mild (β0/β++ or β+/β++). Risk factors for mortality were evaluated using multivariate Cox proportional-hazards regression. A total of 537 subjects were followed for 20,963 person-years. Overall, 80.4% (95% CI 76.4-84.7) of individuals survived to 50 years of age, with increasing rates of liver-, infection- and malignancy-related deaths observed during recent follow-up. We evaluated non-modifiable risk factors and found worse outcomes associated with male sex (hazard ratio 1.9, 95% CI 1.1-3.0, p=0.01) and milder genotype (hazard ratio 1.6, 95% CI 1.1-2.3, p=0.02). The effect of genotype was confirmed in a second model, which included treatment effects. Patients with a milder genotype initiated transfusion significantly later and had reduced blood requirements compared to those with moderate or severe genotypes, although pre-transfusion hemoglobin levels did not differ between genotypes. Our results suggest that early treatment decisions to delay transfusion and different long-term treatment strategies in milder genotypes have led to adverse long-term effects of under-treated thalassemia and worse survival. We propose that determining the HBB genotype and using this information to aid decision making can improve long-term outcomes of thalassemia patients.
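The genotype grouping described above can be sketched as a simple lookup. This is only an illustration of the stated rule: the string labels `b0`, `b+` and `b++` and the function name `classify_genotype` are hypothetical encodings chosen here, and allele pairs the abstract does not mention are deliberately left unclassified.

```python
from typing import Optional

# Hypothetical string encodings of the allele classes named in the abstract:
# "b0" = β0, "b+" = β+, "b++" = β++. A frozenset makes allele order irrelevant.
SEVERITY_BY_ALLELES = {
    frozenset(["b0"]): "severe",          # β0/β0
    frozenset(["b0", "b+"]): "severe",    # β+/β0
    frozenset(["b+"]): "moderate",        # β+/β+
    frozenset(["b0", "b++"]): "mild",     # β0/β++
    frozenset(["b+", "b++"]): "mild",     # β+/β++
}


def classify_genotype(allele_1: str, allele_2: str) -> Optional[str]:
    """Return the severity group for a pair of alleles, or None when the
    combination is not covered by the grouping stated in the abstract."""
    return SEVERITY_BY_ALLELES.get(frozenset([allele_1, allele_2]))


if __name__ == "__main__":
    for pair in [("b0", "b0"), ("b+", "b0"), ("b+", "b+"), ("b++", "b0")]:
        print(pair, classify_genotype(*pair))
```

Returning `None` for unlisted combinations, rather than guessing a group, keeps the sketch faithful to what the text actually specifies.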
The prediction of the secondary structure of a protein is a critical step in the prediction of its tertiary structure and, potentially, its function. Moreover, the backbone dihedral angles, highly correlated with secondary structures, provide crucial information about the local three-dimensional structure.
We independently predict both the secondary structure and the backbone dihedral angles and combine the results in a loop to enhance each prediction reciprocally. Support vector machines, a state-of-the-art supervised classification technique, achieve secondary structure predictive accuracy of 80% on a non-redundant set of 513 proteins, significantly higher than other methods on the same dataset. The dihedral angle space is divided into a number of regions using two unsupervised clustering techniques in order to predict the region to which a new residue belongs. Our method performs comparably to, and in some cases more accurately than, other multi-class dihedral prediction methods.
We have created an accurate predictor of backbone dihedral angles and secondary structure. Our method, called DISSPred, is available online at http://comp.chem.nottingham.ac.uk/disspred/.
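As a rough illustration of what assigning a residue to a region of dihedral angle space involves, the sketch below maps a (phi, psi) pair to the nearest of a few cluster centres using a distance that respects the 360-degree periodicity of angles. It is not the DISSPred implementation: the centre coordinates are approximate textbook values picked for illustration, and the function names are hypothetical.

```python
import math
from typing import Sequence, Tuple

Angles = Tuple[float, float]  # (phi, psi) in degrees


def angular_diff(a: float, b: float) -> float:
    """Smallest absolute difference between two angles, in degrees (0 to 180)."""
    d = abs(a - b) % 360.0
    return min(d, 360.0 - d)


def dihedral_distance(p: Angles, q: Angles) -> float:
    """Euclidean distance in (phi, psi) space with periodic wrap-around."""
    return math.hypot(angular_diff(p[0], q[0]), angular_diff(p[1], q[1]))


def assign_region(point: Angles, centres: Sequence[Angles]) -> int:
    """Index of the cluster centre closest to the given (phi, psi) point."""
    return min(range(len(centres)),
               key=lambda i: dihedral_distance(point, centres[i]))


# Illustrative centres only (assumed, not taken from the paper):
# roughly right-handed helix, beta strand, left-handed helix.
EXAMPLE_CENTRES = [(-63.0, -43.0), (-120.0, 130.0), (57.0, 47.0)]

if __name__ == "__main__":
    print(assign_region((-60.0, -45.0), EXAMPLE_CENTRES))
    print(assign_region((-120.0, -170.0), EXAMPLE_CENTRES))
```

The periodic distance matters near the edges of the plot: a psi of -170 degrees is only 60 degrees away from 130 degrees, not 300.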
Several types of haemoglobinopathies are caused by copy number variants (CNVs). While diagnosis is often based on haematological and biochemical parameters, a definitive diagnosis requires molecular DNA analysis. In some cases, the molecular characterisation of large deletions/duplications is challenging and inconclusive and often requires the use of specific diagnostic procedures, such as multiplex ligation-dependent probe amplification (MLPA). Herein, we collected and comprehensively analysed all known CNVs associated with haemoglobinopathies. The dataset of 291 CNVs was retrieved from the IthaGenes database and was further manually annotated to specify genomic locations, breakpoints and MLPA probes relevant for each CNV. We developed IthaCNVs, a publicly available and easy-to-use online tool that can facilitate the diagnosis of rare and diagnostically challenging haemoglobinopathy cases attributed to CNVs. Importantly, it facilitates the filtering of available entries based on the type of breakpoint information, on specific chromosomal and locus positions, on MLPA probes, and on affected gene(s). IthaCNVs brings together manually curated information about CNV genomic locations, functional effects, and information that can facilitate CNV characterisation through MLPA. It can help laboratory staff and clinicians confirm a suspected diagnosis of CNVs based on molecular DNA screening and analysis.
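Filtering CNV entries by chromosomal position and affected gene, as described above, amounts to an interval-overlap test. The sketch below shows one way to express it; the `CNV` record, its field names and the toy coordinates are hypothetical and are not IthaCNVs data or its actual schema.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class CNV:
    """Hypothetical CNV record; field names are assumptions for illustration."""
    name: str
    chrom: str
    start: int            # 1-based, inclusive
    end: int              # 1-based, inclusive
    genes: Tuple[str, ...]


def overlaps(cnv: CNV, chrom: str, start: int, end: int) -> bool:
    """True when the CNV shares at least one base with the query interval."""
    return cnv.chrom == chrom and cnv.start <= end and start <= cnv.end


def filter_cnvs(cnvs: List[CNV], chrom: str, start: int, end: int,
                gene: Optional[str] = None) -> List[CNV]:
    """Keep CNVs overlapping the interval and, optionally, affecting a gene."""
    return [c for c in cnvs
            if overlaps(c, chrom, start, end)
            and (gene is None or gene in c.genes)]


# Toy records with made-up coordinates, for demonstration only.
TOY_CNVS = [
    CNV("toy-deletion-A", "chr16", 100, 500, ("HBA1", "HBA2")),
    CNV("toy-deletion-B", "chr16", 450, 900, ("HBA1",)),
    CNV("toy-duplication-C", "chr11", 200, 300, ("HBB",)),
]

if __name__ == "__main__":
    for hit in filter_cnvs(TOY_CNVS, "chr16", 480, 600, gene="HBA2"):
        print(hit.name)
```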
Haemoglobinopathies are the commonest monogenic diseases worldwide and are caused by variants in the globin gene clusters. With over 2400 variants detected to date, their interpretation using the American College of Medical Genetics and Genomics (ACMG)/Association for Molecular Pathology (AMP) guidelines is challenging, and computational evidence can provide valuable input about their functional annotation. While many in silico predictors have already been developed, their performance varies for different genes and diseases. In this study, we evaluate 31 in silico predictors using a dataset of 1627 variants in HBA1, HBA2, and HBB. By varying the decision threshold for each tool, we analyse their performance (a) as binary classifiers of pathogenicity and (b) by using different non-overlapping pathogenic and benign thresholds for their optimal use in the ACMG/AMP framework. Our results show that CADD, Eigen-PC, and REVEL are the overall top performers, with CADD reaching a moderate strength level for pathogenic prediction. Eigen-PC and REVEL achieve the highest accuracies for missense variants, while CADD is also a reliable predictor of non-missense variants. Moreover, SpliceAI is the top-performing splicing predictor, reaching a strong level of evidence, while GERP++ and phyloP are the most accurate conservation tools. This study provides evidence about the optimal use of computational tools in globin gene clusters under the ACMG/AMP framework.
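The two evaluation modes mentioned above, a single decision threshold and a pair of non-overlapping pathogenic and benign thresholds, can be sketched as follows. This is a minimal illustration, assuming that higher scores indicate pathogenicity and that the single threshold is chosen to maximise accuracy; the study's actual selection criteria are not stated here, and all function names and toy scores are hypothetical.

```python
from typing import List, Sequence, Tuple


def confusion(scores: Sequence[float], labels: Sequence[bool],
              threshold: float) -> Tuple[int, int, int, int]:
    """Count (tp, fp, tn, fn), calling a variant pathogenic when score >= threshold."""
    tp = fp = tn = fn = 0
    for score, is_pathogenic in zip(scores, labels):
        predicted = score >= threshold
        if predicted and is_pathogenic:
            tp += 1
        elif predicted:
            fp += 1
        elif is_pathogenic:
            fn += 1
        else:
            tn += 1
    return tp, fp, tn, fn


def accuracy(scores: Sequence[float], labels: Sequence[bool],
             threshold: float) -> float:
    tp, fp, tn, fn = confusion(scores, labels, threshold)
    return (tp + tn) / (tp + fp + tn + fn)


def best_threshold(scores: Sequence[float], labels: Sequence[bool]) -> float:
    """Sweep the observed scores as candidate thresholds; keep the most accurate."""
    candidates: List[float] = sorted(set(scores))
    return max(candidates, key=lambda t: accuracy(scores, labels, t))


def three_way_call(score: float, benign_max: float, pathogenic_min: float) -> str:
    """Classify with two non-overlapping thresholds; scores in between are left open."""
    if benign_max >= pathogenic_min:
        raise ValueError("thresholds must not overlap")
    if score >= pathogenic_min:
        return "pathogenic"
    if score <= benign_max:
        return "benign"
    return "indeterminate"


if __name__ == "__main__":
    toy_scores = [0.1, 0.2, 0.4, 0.6, 0.8, 0.9]           # made-up scores
    toy_labels = [False, False, False, True, True, True]  # made-up labels
    print(best_threshold(toy_scores, toy_labels))
    print(three_way_call(0.5, benign_max=0.3, pathogenic_min=0.7))
```

Leaving a gap between the two thresholds means some variants receive no computational evidence in either direction, which is the trade-off for higher confidence in the calls that are made.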
Beta-turns are secondary structure elements usually classified as coil. Their prediction is important, because of their role in protein folding and their frequent occurrence in protein chains.
We have developed a novel method that predicts beta-turns and their types using information from multiple sequence alignments, predicted secondary structures and, for the first time, predicted dihedral angles. Our method uses support vector machines, a supervised classification technique, and is trained and tested on three established datasets of 426, 547 and 823 protein chains. We achieve a Matthews correlation coefficient of up to 0.49 when predicting the location of beta-turns, the highest reported value to date. Moreover, the additional dihedral information improves the prediction of beta-turn types I, II, IV, VIII and "non-specific", achieving correlation coefficients up to 0.39, 0.33, 0.27, 0.14 and 0.38, respectively. Our predictions are more accurate than those of other methods.
We have created an accurate predictor of beta-turns and their types. Our method, called DEBT, is available online at http://comp.chem.nottingham.ac.uk/debt/.
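For readers unfamiliar with the Matthews correlation coefficient reported above, the sketch below computes it from confusion-matrix counts using the standard definition. The function name `matthews_corrcoef_counts` is chosen here for illustration, and returning 0.0 when the denominator is zero is a common convention rather than something stated in the abstract.

```python
import math


def matthews_corrcoef_counts(tp: int, fp: int, tn: int, fn: int) -> float:
    """Matthews correlation coefficient from binary confusion-matrix counts.

    MCC = (tp*tn - fp*fn) / sqrt((tp+fp)(tp+fn)(tn+fp)(tn+fn)).
    Ranges from -1 (total disagreement) to +1 (perfect prediction).
    """
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denominator == 0:
        # Convention: undefined cases (an empty row or column) are scored as 0.
        return 0.0
    return (tp * tn - fp * fn) / denominator


if __name__ == "__main__":
    # Toy counts, not results from the paper.
    print(matthews_corrcoef_counts(tp=5, fp=0, tn=5, fn=0))
    print(matthews_corrcoef_counts(tp=5, fp=5, tn=5, fn=5))
```

Unlike plain accuracy, the coefficient stays informative when one class is much rarer than the other, which is relevant for beta-turn residues within a protein chain.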
The +33 C>G variant NM_000518.5(HBB):c.-18C>G in the 5' untranslated region (UTR) of the β-globin gene is described in the literature as both mild and silent, while it causes a phenotype of thalassemia intermedia in the presence of a severe β-thalassemia allele. Despite its potential clinical significance, the determination of its pathogenicity according to established standards requires a greater number of published cases and more co-segregation evidence than is currently available. The present study provides an extensive phenotypic characterization of +33 C>G using 26 heterozygous and 11 compound heterozygous novel cases detected in Cyprus and employs computational predictors (CADD, RegulomeDB) to better understand its impact on clinical severity. Genotype identification of globin gene variants, including α- and δ-thalassemia determinants, and rs7482144 (XmnI) was carried out using Sanger sequencing, gap-PCR, and restriction enzyme digestion methods. The heterozygous state of +33 C>G had a silent phenotype without apparent microcytosis or hypochromia, while compound heterozygosity with a β+ or β0 allele had a spectrum of clinical phenotypes. Awareness of the +33 C>G variant is required across Mediterranean populations where β-thalassemia is frequent, particularly in Cyprus, with significant relevance in population screening and fetal diagnostic applications.