An automated peak picking strategy is presented in which several peak sets with different signal-to-noise levels are combined to form a more reliable statement on the protein identity. The strategy is compared against both manual peak picking and industry-standard automated peak picking on a set of mass spectra obtained after tryptic in-gel digestion of 2D-gel samples from human fetal fibroblasts. The set ranges from strong to weak spectra, and the proposed multiple-scale method is shown to perform much better than the industry-standard method and a human operator on weak spectra, and equally well on strong and medium-strong spectra. It is also demonstrated that peak sets selected by a human operator display considerable variability and that it is impossible to speak of a single “true” peak set for a given spectrum. The described multiple-scale strategy both avoids time-consuming parameter tuning and exceeds the human operator in protein identification efficiency. It therefore promises reliable, automated, user-independent protein identification using peptide mass fingerprints.
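The combination step can be sketched in a few lines. This is a minimal illustration of the multiple-scale idea, not the paper's algorithm: peaks are picked at several signal-to-noise thresholds and kept only if they recur across levels. The noise estimator (median absolute deviation), the threshold values, and the voting rule are all illustrative assumptions.

```python
from statistics import median

def pick_peaks(spectrum, snr_threshold):
    """Indices of local maxima whose height exceeds snr_threshold times the
    noise level. Noise is estimated via the median absolute deviation, an
    illustrative choice rather than the paper's estimator."""
    m = median(spectrum)
    noise = median(abs(x - m) for x in spectrum) or 1e-12
    peaks = set()
    for i in range(1, len(spectrum) - 1):
        if spectrum[i] > spectrum[i - 1] and spectrum[i] >= spectrum[i + 1]:
            if spectrum[i] / noise >= snr_threshold:
                peaks.add(i)
    return peaks

def multiscale_peaks(spectrum, thresholds=(2, 4, 8), min_votes=2):
    """Combine peak sets picked at several signal-to-noise levels:
    keep a peak only if it appears at min_votes or more levels."""
    votes = {}
    for t in thresholds:
        for i in pick_peaks(spectrum, t):
            votes[i] = votes.get(i, 0) + 1
    return sorted(i for i, v in votes.items() if v >= min_votes)
```

Requiring agreement across scales suppresses spurious maxima that clear only the most permissive threshold, which is the property exploited on weak spectra.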
Background: Designing amino acid sequences that are stable in a given target structure amounts to maximizing a conditional probability. A straightforward approach is a nested Monte Carlo in which conformation space is explored over and over again for different fixed sequences; this is computationally prohibitive. Several approximate remedies, based on energy minimization for fixed structure or on high-T expansions, have been proposed. These methods are fast but often inaccurate, since folding occurs at low T.
Results: We have developed a multisequence Monte Carlo procedure where both sequence and conformational space are simultaneously probed with efficient prescriptions for pruning sequence space. The method is explored on hydrophobic/polar models. First we discuss short lattice chains in order to compare with exact data and with other methods. The method is then successfully applied to lattice chains with up to 50 monomers and to off-lattice 20mers.
Conclusions: The multisequence Monte Carlo method offers a new approach to sequence design in coarse-grained models. It is much more efficient than previous Monte Carlo methods, and is, as it stands, applicable to a fairly wide range of two-letter models.
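A toy version of the multisequence idea can be sketched on a tiny 2D hydrophobic/polar (HP) lattice model. This is an illustration only, not the authors' procedure: the chain length, inverse temperature, move mix, and the use of exhaustive conformation enumeration are all assumptions for the sketch, and the efficient sequence-space pruning of the full method is omitted. The key point survives: sequence and conformation are updated in a single joint Markov chain, and design quality is read off as the fraction of a sequence's visits spent in the target conformation, i.e. an estimate of P(target | sequence).

```python
import math
import random
from itertools import product

def saws(n):
    """All self-avoiding walks of n monomers on the square lattice,
    with the first step fixed to (1, 0) to remove rotational symmetry."""
    walks = []
    def grow(path):
        if len(path) == n:
            walks.append(tuple(path))
            return
        for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nxt = (path[-1][0] + dx, path[-1][1] + dy)
            if nxt not in path:
                grow(path + [nxt])
    grow([(0, 0), (1, 0)])
    return walks

def energy(seq, conf):
    """HP energy: -1 for every non-bonded H-H lattice contact."""
    pos = {p: i for i, p in enumerate(conf)}
    e = 0
    for i, p in enumerate(conf):
        if seq[i] != 'H':
            continue
        for dx, dy in ((1, 0), (0, 1)):          # each contact counted once
            j = pos.get((p[0] + dx, p[1] + dy))
            if j is not None and seq[j] == 'H' and abs(i - j) > 1:
                e -= 1
    return e

def design(target, confs, beta=2.0, steps=200_000, seed=1):
    """Toy multisequence Metropolis: the joint state is (sequence,
    conformation). Returns the sequence with the highest estimated
    conditional probability of sitting in the target conformation."""
    rng = random.Random(seed)
    seqs = [''.join(s) for s in product('HP', repeat=len(target))]
    E = [[energy(q, cf) for cf in confs] for q in seqs]  # energy table
    t = confs.index(target)
    s, c = rng.randrange(len(seqs)), rng.randrange(len(confs))
    visits = [0] * len(seqs)
    hits = [0] * len(seqs)
    for _ in range(steps):
        if rng.random() < 0.5:                   # propose a new sequence
            s2, c2 = rng.randrange(len(seqs)), c
        else:                                    # propose a new conformation
            s2, c2 = s, rng.randrange(len(confs))
        dE = E[s2][c2] - E[s][c]
        if dE <= 0 or rng.random() < math.exp(-beta * dE):
            s, c = s2, c2
        visits[s] += 1
        hits[s] += c == t
    best = max(range(len(seqs)), key=lambda i: hits[i] / max(visits[i], 1))
    return seqs[best]
```

A single chain thus visits many sequences, so conformation space is not re-explored from scratch per sequence, which is the source of the efficiency gain over nested Monte Carlo.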
The question of whether proteins originate from random sequences of amino acids is addressed. A statistical analysis is performed in terms of blocked and random walk values formed by binary hydrophobic assignments of the amino acids along the protein chains. Theoretical expectations for these variables under random distributions of hydrophobicities are compared with those obtained from functional proteins. The results, which are based on proteins in the SWISS-PROT database, convincingly show that the amino acid sequences in proteins differ from random sequences in a statistically significant way. Fourier transforms of the random walks provide additional evidence for the nonrandomness of the distributions. We have also analyzed results from a synthetic model containing only two amino acid types, hydrophobic and hydrophilic. Using reasonable criteria for good folding properties in terms of thermodynamic and kinetic behavior, sequences that fold well are isolated. The same statistical analysis applied to these well-folding sequences indicates deviations from randomness similar to those of the functional proteins. The deviations can be interpreted as originating from anticorrelations in terms of an Ising spin model for the hydrophobicities. Our results, which differ from some previous investigations using other methods, may bear on how permissive the protein folding process is with respect to sequence specificity: only sequences with nonrandom hydrophobicity distributions fold well. Other distributions give rise to energy landscapes with poor folding properties and hence did not survive evolution.
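The blocked and random-walk variables can be sketched concretely. This is a simplified illustration, not the paper's exact statistics: the binary hydrophobicity assignment below is one common classification and may differ from the one used in the study. For an iid ±1 sequence the mean squared block sum is close to the block length; anticorrelated hydrophobicities push it below that, correlated ones above.

```python
from statistics import mean

# One common binary hydrophobicity assignment (an assumption for this
# sketch; the paper's exact classification may differ).
HYDROPHOBIC = set("AVLIMFWC")

def steps(seq):
    """Map a protein sequence to a +/-1 hydrophobicity walk."""
    return [1 if a in HYDROPHOBIC else -1 for a in seq]

def walk_value(seq):
    """Net displacement of the hydrophobicity random walk."""
    return sum(steps(seq))

def blocked_mean_square(seq, k):
    """Mean squared sum over non-overlapping blocks of length k.
    Close to k for an iid random sequence; below k for
    anticorrelated hydrophobicities, above k for correlated ones."""
    s = steps(seq)
    blocks = [sum(s[i:i + k]) for i in range(0, len(s) - k + 1, k)]
    return mean(b * b for b in blocks)
```

In practice the observed value would be compared against the iid expectation k, or against the distribution obtained from shuffled versions of the same sequence.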
Perhaps more than any other “-omics” endeavor, the accuracy and level of detail obtained from mapping the major connection pathways in the living human brain with diffusion MRI depend on the capabilities of the imaging technology used. The current tools are remarkable, allowing the formation of an “image” of the water diffusion probability distribution in regions of complex crossing fibers at each of half a million voxels in the brain. Nonetheless, our ability to map the connection pathways is limited by the sensitivity and resolution of the images, and by the contrast and resolution of the encoding of the diffusion probability distribution.
The goal of our Human Connectome Project (HCP) is to address these limiting factors by re-engineering the scanner from the ground up to optimize the high b-value, high angular resolution diffusion imaging needed for sensitive and accurate mapping of the brain's structural connections. Our efforts were guided by the relative contribution of each scanner component. The gradient subsystem was a major focus, since gradient amplitude is central to determining the diffusion contrast, the amount of T2 signal loss, and the blurring of the water PDF over the course of the diffusion time. By implementing a novel 4-port drive geometry and optimizing size and linearity for the brain, we demonstrate a whole-body sized scanner with Gmax=300mT/m on each axis, capable of the sustained duty cycle needed for diffusion imaging. The system can slew the gradients at 200T/m/s, as needed for the EPI image encoding. To enhance the efficiency of the diffusion sequence, we implemented a FOV-shifting approach to Simultaneous MultiSlice (SMS) EPI capable of unaliasing 3 simultaneously excited slices with a modest g-factor penalty, allowing us to diffusion-encode whole-brain volumes with low TR and TE. Finally, we combine the multi-slice approach with a compressive sampling reconstruction to undersample q-space sufficiently to achieve a DSI scan in less than 5 min. To augment this accelerated imaging approach, we developed a 64-channel, tight-fitting brain array coil and show its performance benefit over a commercial 32-channel coil at all locations in the brain for these accelerated acquisitions.
The technical challenges of developing the overall system are discussed, along with results from SNR comparisons, ODF metrics, and fiber tracking comparisons. The ultra-high gradients yielded substantial and immediate gains in sensitivity through reduced TE and improved signal detection, and increased the efficiency of the DSI and HARDI acquisitions as well as the accuracy and resolution of diffusion tractography, as assessed by identification of known structures and fiber crossings.
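Why gradient amplitude dominates the sensitivity gain follows directly from the standard Stejskal-Tanner expression for rectangular diffusion-encoding pulses, b = γ²G²δ²(Δ − δ/3). The sketch below evaluates it; the pulse timings chosen are illustrative round numbers, not the HCP protocol.

```python
GAMMA = 2.675e8  # proton gyromagnetic ratio, rad s^-1 T^-1

def b_value(g_mT_m, delta_ms, Delta_ms):
    """Stejskal-Tanner b-value for rectangular gradient pulses,
    b = gamma^2 G^2 delta^2 (Delta - delta/3), returned in s/mm^2.
    The example timings used with this function are illustrative,
    not the HCP protocol."""
    g = g_mT_m * 1e-3    # gradient amplitude in T/m
    d = delta_ms * 1e-3  # pulse duration in s
    D = Delta_ms * 1e-3  # pulse separation in s
    b_si = (GAMMA * g * d) ** 2 * (D - d / 3.0)  # s/m^2
    return b_si / 1e6    # convert to s/mm^2
```

At fixed timings b scales as G², so a 300 mT/m system reaches a given high b-value with much shorter pulses and diffusion times than a conventional 40 mT/m system, which is what shortens TE and reduces T2 signal loss.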
• Approach for advancing the sensitivity of the diffusion connectivity measurement. • Optimization of Gmax=300mT/m gradient, RF coil and sequence. • Improved sensitivity and diffusion contrast in high quality DSI/Q Ball.
► Six different single- and two-step carbohydrate derivatisation approaches compared. ► Reference compounds comprised 8 monosaccharides (C5–C6), GA and DHA. ► Ethoximation–trimethylsilylation was found to be superior to other approaches. ► Analysis of carbohydrate-rich matrices of different origin is possible. ► Calculation of a novel matrix impact value is proposed to evaluate matrix effects.
Gas chromatographic analysis of complex carbohydrate mixtures requires highly effective and reliable derivatisation strategies for successful separation, identification, and quantitation of all constituents. Different single-step (per-trimethylsilylation, isopropylidenation) and two-step approaches (ethoximation–trimethylsilylation, ethoximation–trifluoroacetylation, benzoximation–trimethylsilylation, benzoximation–trifluoroacetylation) have been comprehensively studied with regard to chromatographic characteristics, informational value of mass spectra, ease of peak assignment, robustness toward matrix effects, and quantitation, using a set of reference compounds comprising eight monosaccharides (C5–C6), glycolaldehyde, and dihydroxyacetone. It has been shown that isopropylidenation and the two oximation–trifluoroacetylation approaches are the least suitable for complex carbohydrate matrices: the former is limited to compounds containing vicinal dihydroxy moieties in cis configuration, while the latter two are sensitive to traces of trifluoroacetic acid, which strongly promotes the decomposition of ketohexoses. For two “real” carbohydrate-rich matrices of biological and synthetic origin, respectively, two-step ethoximation–trimethylsilylation proved superior to the other approaches owing to the low number of peaks obtained per carbohydrate, good peak separation, the structural information in the mass spectra, low limits of detection and quantitation, small relative standard deviations, and low sensitivity toward matrix effects.