Molecular visualization is one of the cornerstones of structural bioinformatics and related fields. Today, rasterization is typically used for the interactive display of molecular scenes, while ray tracing aims at generating high-quality images, which typically take minutes to hours to render and require an external offline program. Recently, real-time ray tracing has evolved to combine the interactivity of rasterization-based approaches with the superb image quality of ray tracing techniques. We demonstrate how real-time ray tracing integrated into a molecular modelling and visualization tool allows for a better understanding of the structural arrangement of biomolecules and the natural creation of publication-quality images in real time. Unlike most approaches, our technique integrates naturally into the full-featured molecular modelling and visualization tool BALLView, seamlessly extending a standard workflow with interactive high-quality rendering.
Ribonucleic acid (RNA) is a polymer composed of four bases denoted A, C, G, and U. It is generally a single-stranded molecule in which the bases form hydrogen bonds within the same molecule, leading to structure formation. When comparing homologous RNA molecules, it is important to consider both the base sequence and the structure of the molecules. Traditional alignment algorithms can account only for the sequence of bases, not for the base pairings. Considering the structure leads to significant computational problems because of the dependencies introduced by the base pairings. In this paper we address the problem of optimally aligning a given RNA sequence of unknown structure to one of known sequence and structure. We phrase the problem as an integer linear program and then solve it using methods from polyhedral combinatorics. In our computational experiments we could solve large problem instances (23S ribosomal RNA with more than 1400 bases), a size intractable for former algorithms.
We study two new problems in sequence alignment both from a practical and a theoretical view, using tools from combinatorial optimization to develop branch-and-cut algorithms. The generalized maximum trace formulation captures several forms of multiple sequence alignment problems in a common framework, among them the original formulation of maximum trace. The RNA sequence alignment problem captures the comparison of RNA molecules on the basis of their primary sequence and their secondary structure. Both problems have a characterization in terms of graphs which we reformulate in terms of integer linear programming. We then study the polytopes (or convex hulls of all feasible solutions) associated with the integer linear program for both problems. For each polytope we derive several classes of facet-defining inequalities and show that for some of these classes the corresponding separation problem can be solved in polynomial time. This leads to a polynomial-time algorithm for pairwise sequence alignment that is not based on dynamic programming. Moreover, for multiple sequences the branch-and-cut algorithms for both sequence alignment problems are able to solve to optimality instances that are beyond the range of present dynamic programming approaches.
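As a rough illustration of the trace concept for two sequences, the following Python sketch checks whether a set of alignment edges can be realized by a single alignment: every position is used at most once and no two edges cross. The function name `is_trace` and the edge representation are assumptions made for this example; the sketch does not reproduce the integer linear program or the branch-and-cut algorithm described in the abstract.

```python
def is_trace(edges):
    """Return True if `edges` can be realized by one pairwise alignment.

    Each edge is a pair (i, j): position i of the first sequence is aligned
    to position j of the second sequence (hypothetical representation).
    """
    ordered = sorted(edges)
    for (i1, j1), (i2, j2) in zip(ordered, ordered[1:]):
        # After sorting, consecutive edges must strictly increase in both
        # coordinates; otherwise a position is reused or two edges cross.
        if i2 <= i1 or j2 <= j1:
            return False
    return True


if __name__ == "__main__":
    print(is_trace([(0, 0), (1, 2), (3, 3)]))  # edges are order-preserving
    print(is_trace([(0, 2), (1, 1)]))          # these two edges cross
```

Sorting first means only neighbouring edges need to be compared, since any crossing pair forces a descent somewhere between them in the sorted order.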
Due to their extensive structural heterogeneity, the elucidation of glycosylation patterns in glycoproteins such as the subunits of human chorionic gonadotropin (hCG), hCG‐α and hCG‐β, remains one of the most challenging problems in the proteomic analysis of post‐translational modifications. Consequently, glycosylation is usually studied after decomposition of the intact proteins to the proteolytic peptide level. However, with this approach all information about the combination of the different glycopeptides in the intact protein is lost. In this study we have, therefore, attempted to combine the results of glycan identification after tryptic digestion with molecular mass measurements on the native starting material of the new first WHO Reference Reagents (RR) for hCG‐α (99/720) and hCG‐β (99/650). Despite the extremely high number of possible combinations of the glycans identified in the tryptic peptides by HPLC‐MS (>1000 for hCG‐α and >10 000 for hCG‐β), the mass spectra of intact hCG‐α and hCG‐β revealed only a limited number of glycoforms present in hCG preparations from pools of pregnancy urines. Peak annotations for hCG‐α were performed with the help of a bioinformatic algorithm that generated a database containing all possible modifications of the proteins, including modifications possibly introduced during sample preparation such as oxidation or truncation, for subsequent searches for combinations fitting the mass difference between the polypeptide backbone and the measured molecular masses. Fourteen different glycoforms of hCG‐α, containing biantennary, partly sialylated hybrid‐type glycans, including methionine‐oxidized and N‐terminally truncated forms, were identified. Mass spectra of high quality were also obtained for hCG‐β; however, a database search mass accuracy of ±5 Da was insufficient to unambiguously assign the possible combinations of post‐translational modifications.
In summary, mass spectrometric fingerprints of intact molecules were shown to be highly useful for the characterization of glycosylation patterns of different hCG preparations such as the new first WHO RR for immunoassays and could be the first step in establishing biophysical reference methods for hCG and related molecules.
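The database search described above can be pictured as enumerating one modification per site and keeping the combinations whose summed mass fits the observed mass difference within a tolerance. The Python sketch below illustrates only that idea; the function name, the data layout, and all masses are hypothetical placeholders, not real glycan masses and not the algorithm used in the study.

```python
from itertools import product


def matching_combinations(site_options, observed_delta, tolerance):
    """Enumerate one modification per site and keep those combinations whose
    total mass lies within `tolerance` of `observed_delta`.

    `site_options` is a list (one entry per site) of lists of
    (name, mass) tuples. All values used here are placeholders.
    """
    hits = []
    for combo in product(*site_options):
        total = sum(mass for _, mass in combo)
        if abs(total - observed_delta) <= tolerance:
            hits.append((tuple(name for name, _ in combo), total))
    return hits


if __name__ == "__main__":
    # Placeholder masses chosen for illustration only.
    sites = [
        [("A", 100.0), ("B", 150.0)],
        [("C", 200.0), ("D", 260.0)],
    ]
    print(matching_combinations(sites, observed_delta=352.0, tolerance=5.0))
```

With a wide tolerance and many sites, several combinations can fall inside the window at once, which is the ambiguity the abstract reports for hCG‐β at ±5 Da.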
Although a steadily increasing number of protein–ligand docking experiments have been performed successfully, there are only a few studies concerning protein–sugar interactions. In this study, we investigate the interaction of wheat germ agglutinin (WGA) with N-acetylglucosamine and a number of its derivatives and predict the binding free energies using flexible docking techniques. To assess the quality of our predictions, we also determined those binding free energies experimentally in cell-binding studies. The predicted binding site, ligand orientation, and details of the binding mode are in perfect agreement with the known crystal structure of WGA with a sialoglycopeptide. Furthermore, we obtained an excellent linear correlation of our predicted binding free energies with both our own data and experimental data from the literature [Monsigny, M., Roche, A.C., Sene, C., Maget Dana, R. & Delmotte, F. (1980) Eur. J. Biochem. 104, 147-153]. In both cases, predicted energies were within 1.0 kJ·mol⁻¹ of the experimental value. These results illustrate the usefulness of docking-based methods for the qualitative and quantitative prediction of protein–carbohydrate interactions. The insights gained from such theoretical studies may be used to complement the results from the still scarce crystal structures.
Phylogenomics with Paralogs
Hellmuth, Marc; Wieseke, Nicolas; Lechner, Marcus ...
arXiv.org, 12/2017
Paper, Journal Article
Open access
Phylogenomics heavily relies on well-curated sequence data sets that consist, for each gene, exclusively of 1:1 orthologs. Paralogs are treated as a dangerous nuisance that has to be detected and removed. We show here that this severe restriction of the data sets is not necessary. Building upon recent advances in mathematical phylogenetics, we demonstrate that gene duplications convey meaningful phylogenetic information and allow the inference of plausible phylogenetic trees, provided orthologs and paralogs can be distinguished with a degree of certainty. Starting from tree-free estimates of orthology, cograph editing can sufficiently reduce the noise in order to find correct event-annotated gene trees. The information of gene trees can then directly be translated into constraints on the species trees. While the resolution is very poor for individual gene families, we show that genome-wide data sets are sufficient to generate fully resolved phylogenetic trees, even in the presence of horizontal gene transfer. We demonstrate that the distribution of paralogs in large gene families contains in itself sufficient phylogenetic signal to infer fully resolved species phylogenies. This source of phylogenetic information is independent of information contained in orthologous sequences and is resilient against horizontal gene transfer. An important consequence is that phylogenomics data sets need not be restricted to 1:1 orthologs.
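The property that cograph editing aims to restore can be stated concretely: a graph is a cograph exactly when it contains no induced path on four vertices (P4). The brute-force check below is a sketch of that definition only, assuming a small graph given as vertex and edge lists; it is not the editing procedure used in the paper, and the function name `is_cograph` is illustrative.

```python
from itertools import permutations


def is_cograph(vertices, edges):
    """Return True if the undirected graph has no induced P4.

    Brute force over all ordered 4-tuples, so only suitable for small graphs.
    """
    adj = {v: set() for v in vertices}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    for a, b, c, d in permutations(vertices, 4):
        # Induced path a-b-c-d: edges ab, bc, cd present; ac, ad, bd absent.
        if (b in adj[a] and c in adj[b] and d in adj[c]
                and c not in adj[a] and d not in adj[a] and d not in adj[b]):
            return False
    return True


if __name__ == "__main__":
    path = [("a", "b"), ("b", "c"), ("c", "d")]
    print(is_cograph("abcd", path))                  # the path itself is a P4
    print(is_cograph("abcd", path + [("d", "a")]))   # closing it to a 4-cycle
```

An estimated orthology graph that fails this test contains at least one P4, which is the kind of noise that editing would need to remove.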
Although an increased level of the prostate-specific antigen (PSA) can be an indication of prostate cancer, other causes often lead to a high rate of false positive results. Therefore, an additional serological screening of autoantibodies in patients' sera could improve the detection of prostate cancer. We performed protein macroarray screening with sera from 49 prostate cancer patients, 70 patients with benign prostatic hyperplasia and 28 healthy controls and compared the autoimmune response in those groups. We were able to distinguish prostate cancer patients from normal controls with an accuracy of 83.2%, patients with benign prostatic hyperplasia from normal controls with an accuracy of 86.0% and prostate cancer patients from patients with benign prostatic hyperplasia with an accuracy of 70.3%. Combining the seroreactivity pattern with a PSA level higher than 4.0 ng/ml, this classification could be improved to an accuracy of 84.1%. For selected proteins we were able to confirm the differential expression by using Luminex on 84 samples. We provide a minimally invasive serological method to reduce false positive results in the detection of prostate cancer and, alongside PSA screening, to distinguish men with prostate cancer from men with benign prostatic hyperplasia.
Understanding protein structures is a crucial step in creating molecular insight for researchers as well as students and pupils. The enormous scaling gap between an atomic point of view and objects in daily life hampers developing an intuitive relation between them. Especially for high school students, it can be difficult to understand the spatial relations of a protein structure. Due to the lack of direct imaging techniques, molecules can only be explored by studying abstract molecular models. Here, the use of augmented reality (AR) techniques has proven to strongly improve structural perception. In this work we present ProteinScanAR, an augmented reality framework for biomolecular education that allows connecting virtual and real worlds intuitively, and thus enables focusing on the scientific or educational content. Special care was taken to keep implementation and technical requirements as general and simple as possible, to ease application in non-expert computer settings. The ProteinScanAR framework is freely available under the GNU General Public License (GPL).