Prostate cancer (PCa) is the most common cancer and the third most frequent cause of male cancer death in Germany. MicroRNAs (miRNAs) appear to be involved in the development and progression of PCa. A ...diagnostic differentiation from benign prostatic hyperplasia (BPH) is often only possible through transrectal punch biopsy. This procedure is described as painful and carries risks. We investigated whether urinary miRNAs can be used as biomarkers to differentiate between these prostate diseases. To this end, urine samples from urological patients with BPH (n = 25) or PCa (n = 28) were analysed using next-generation sequencing to determine the expression profiles of total and exosomal miRNA/piRNA. In total, 79 miRNAs and 5 piwi-interacting RNAs (piRNAs) were significantly differentially expressed (adjusted p-value < 0.05 and |log2 fold change| > 1). Of these, 6 miRNAs and 2 piRNAs could be statistically validated (AUC on the test cohort ≥ 0.7). In addition, machine-learning algorithms were used to identify a panel of 22 additional miRNAs whose interaction also makes it possible to differentiate the groups. There are promising individual candidates for potential use as biomarkers in prostate cancer. The innovative approach of applying machine learning methods to this kind of data could bring further small RNAs, which have so far been neglected, into scientific focus.
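The two selection criteria named in this abstract (a differential-expression filter, followed by validation with AUC ≥ 0.7) can be illustrated with a minimal sketch. The function names and the toy numbers are hypothetical, and the AUC is computed with the generic rank-based definition, which is not necessarily the procedure the authors used.

```python
def is_differentially_expressed(adj_p, log2_fc, alpha=0.05, min_abs_lfc=1.0):
    """Apply the filter from the abstract: adjusted p < 0.05 and |log2 fold change| > 1."""
    return adj_p < alpha and abs(log2_fc) > min_abs_lfc


def auc(scores_pos, scores_neg):
    """Rank-based AUC: probability that a random positive sample scores higher
    than a random negative sample (ties count as one half)."""
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))


def passes_validation(scores_pos, scores_neg, min_auc=0.7):
    """Assumption: a down-regulated marker is as useful as an up-regulated one,
    so the AUC is taken in whichever direction separates the groups better."""
    a = auc(scores_pos, scores_neg)
    return max(a, 1.0 - a) >= min_auc


# Toy expression values for one hypothetical small RNA (not data from the study)
pca_values = [3.0, 4.0, 5.0]
bph_values = [1.0, 2.0, 3.0]
print(is_differentially_expressed(0.01, -1.5), passes_validation(pca_values, bph_values))
```

In practice the AUC would be evaluated on a held-out test cohort, as the abstract states, rather than on the samples used for selection.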
Many challenges in proteomics result from the high-throughput nature of the experiments. This paper first presents pre-analytical problems, which still occur, although the call for standardization in ...omics has been ongoing for many years. This article also discusses aspects that affect bioinformatic analysis based on three sets of reference data measured with different orbitrap instruments. Despite continuous advances in mass spectrometer technology as well as analysis software, data-set-wise quality control is still necessary, and decoy-based estimation, although challenged by modern instruments, should be utilized. We draw attention to the fact that numerous young researchers perceive proteomics as a mature, readily applicable technology. However, it is important to emphasize that the maximum potential of the technology can only be realized by an educated handling of its limitations.
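As an illustration of the decoy-based estimation mentioned in this abstract, the following sketch computes the common target-decoy estimate of the false discovery rate (number of decoy hits divided by number of target hits above a score threshold). The data layout and function name are hypothetical, and the abstract does not state which variant of the estimator the authors recommend.

```python
def decoy_fdr(psms, threshold):
    """Estimate the FDR among target matches scoring at or above `threshold`.

    `psms` is a list of (score, is_decoy) tuples, one per peptide-spectrum match.
    Assumption: targets and decoys were searched together in a concatenated
    database of equal size, so decoy hits approximate false target hits.
    """
    targets = sum(1 for score, is_decoy in psms if score >= threshold and not is_decoy)
    decoys = sum(1 for score, is_decoy in psms if score >= threshold and is_decoy)
    return decoys / targets if targets else 0.0


# Toy matches (score, is_decoy); not data from the paper
matches = [(10, False), (9, False), (8, True), (7, False), (6, True)]
print(decoy_fdr(matches, threshold=7))
```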
Cerebrospinal fluid is investigated in biomarker studies for various neurological disorders of the central nervous system due to its proximity to the brain. Currently, only a limited number of ...biomarkers have been validated in independent studies. The high variability in the protein composition and protein abundance of cerebrospinal fluid between as well as within individuals might be an important reason for this phenomenon. To evaluate this possibility, we investigated the inter- and intraindividual variability in the cerebrospinal fluid proteome globally, with a specific focus on disease biomarkers described in the literature. Cerebrospinal fluid from a longitudinal study group including 12 healthy control subjects was analyzed by label-free quantification (LFQ) via LC-MS/MS. Data were quantified via MaxQuant. Then, the intra- and interindividual variability and the reference change value were calculated for every protein. We identified and quantified 791 proteins, and 216 of these proteins were abundant in all samples and were selected for further analysis. For these proteins, we found an interindividual coefficient of variation of up to 101.5% and an intraindividual coefficient of variation of up to 29.3%. Remarkably, these values were comparably high for both proteins that were published as disease biomarkers and other proteins. Our results support the hypothesis that natural variability greatly impacts cerebrospinal fluid protein biomarkers because high variability can lead to unreliable results. Thus, we suggest controlling the variability of each protein to distinguish between good and bad biomarker candidates, e.g., by utilizing reference change values to improve the process of evaluating potential biomarkers in future studies.
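The variability measures named in this abstract can be sketched as follows. The abstract does not give formulas, so this uses common textbook definitions as an assumption: the interindividual CV is taken over the per-subject means, the intraindividual CV is the mean of the per-subject CVs, and the reference change value is RCV = sqrt(2) * z * sqrt(CV_analytical^2 + CV_intra^2) with z = 1.96. All names and numbers are hypothetical.

```python
import math
import statistics


def cv_percent(values):
    """Coefficient of variation in percent (sample standard deviation / mean)."""
    return 100.0 * statistics.stdev(values) / statistics.mean(values)


def variability(measurements):
    """Return (interindividual CV, intraindividual CV) in percent.

    `measurements` maps a subject ID to that subject's repeated measurements
    of one protein over time (at least two values per subject).
    """
    subject_means = [statistics.mean(v) for v in measurements.values()]
    inter = cv_percent(subject_means)
    intra = statistics.mean([cv_percent(v) for v in measurements.values()])
    return inter, intra


def reference_change_value(cv_intra, cv_analytical=0.0, z=1.96):
    """RCV in percent; cv_analytical defaults to 0 only for illustration."""
    return math.sqrt(2) * z * math.sqrt(cv_analytical ** 2 + cv_intra ** 2)


# Toy intensities for one protein in three subjects (not study data)
data = {"s1": [10.0, 11.0, 9.0], "s2": [20.0, 22.0, 18.0], "s3": [15.0, 15.5, 14.5]}
inter_cv, intra_cv = variability(data)
print(inter_cv, intra_cv, reference_change_value(intra_cv))
```

Under these definitions, a change between two measurements of the same individual would be considered meaningful only if it exceeds the RCV.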
In bottom-up proteomics, proteins are enzymatically digested into peptides before measurement with mass spectrometry. The relationship between proteins and their corresponding peptides can be ...represented by bipartite graphs. We conduct a comprehensive analysis of bipartite graphs using quantified peptides from measured data sets as well as theoretical peptides from an in silico digestion of the corresponding complete taxonomic protein sequence databases. The aim of this study is to characterize and structure the different types of graphs that occur and to compare them between data sets. We observed a large influence of the accepted minimum peptide length during in silico digestion. When changing from theoretical peptides to measured ones, the graph structures are subject to two opposite effects. On the one hand, the graphs based on measured peptides are on average smaller and less complex compared to graphs using theoretical peptides. On the other hand, the proportion of protein nodes without unique peptides, which are a complicated case for protein inference and quantification, is considerably larger for measured data. Additionally, the proportion of graphs containing at least one protein node without unique peptides rises when going from database to quantitative level. The fraction of shared peptides and proteins without unique peptides as well as the complexity and size of the graphs depend strongly on the data set and organism. Large differences between the structures of bipartite peptide-protein graphs have been observed between database and quantitative level as well as between analyzed species. In the analyzed measured data sets, the proportion of protein nodes without unique peptides ranged from 6.4% to 55.0%. This highlights the need for novel methods that can quantify proteins without unique peptides. The knowledge about the structure of the bipartite peptide-protein graphs gained in this study will be useful for the development of such algorithms.
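To make the graph terminology in this abstract concrete, here is a minimal sketch that builds a bipartite peptide-protein graph from a peptide-to-protein mapping, splits it into connected components, and lists the protein nodes without unique peptides. The input format, function names and toy identifiers are hypothetical and are not taken from the study's software.

```python
from collections import defaultdict


def proteins_without_unique_peptides(peptide_to_proteins):
    """A peptide is unique if it maps to exactly one protein; return the
    proteins that have no such peptide."""
    all_proteins, with_unique = set(), set()
    for proteins in peptide_to_proteins.values():
        all_proteins |= set(proteins)
        if len(proteins) == 1:
            with_unique |= set(proteins)
    return all_proteins - with_unique


def connected_components(peptide_to_proteins):
    """Split the bipartite graph into connected components (one graph each).
    Nodes are tagged tuples so peptide and protein names cannot collide."""
    adjacency = defaultdict(set)
    for peptide, proteins in peptide_to_proteins.items():
        for protein in proteins:
            adjacency[("peptide", peptide)].add(("protein", protein))
            adjacency[("protein", protein)].add(("peptide", peptide))
    seen, components = set(), []
    for start in list(adjacency):
        if start in seen:
            continue
        stack, component = [start], set()
        seen.add(start)
        while stack:  # iterative depth-first search
            node = stack.pop()
            component.add(node)
            for neighbour in adjacency[node]:
                if neighbour not in seen:
                    seen.add(neighbour)
                    stack.append(neighbour)
        components.append(component)
    return components


# Toy mapping: P2 only has a peptide shared with P1
mapping = {"PEPA": {"P1"}, "PEPB": {"P1", "P2"}, "PEPC": {"P3"}}
print(proteins_without_unique_peptides(mapping), len(connected_components(mapping)))
```

In this toy example, P2 is the kind of protein node the abstract describes as a complicated case for inference and quantification.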
Histopathological differentiation between severe urocystitis with reactive urothelial atypia and carcinoma in situ (CIS) can be difficult, particularly after a treatment that deliberately induces an ...inflammatory reaction, such as intravesical instillation of Bacillus Calmette-Guérin. However, precise grading in bladder cancer is critical for therapeutic decision making and thus requires reliable immunohistochemical biomarkers. Herein, an exemplary potential biomarker in bladder cancer was identified by the novel approach of Fourier transform infrared imaging for label-free annotation of tissue thin sections. Identified regions of interest were collected by laser microdissection to provide homogeneous samples for liquid chromatography–tandem mass spectrometry–based proteomic analysis. This approach afforded label-free spatial classification with a high accuracy and without interobserver variability, along with the molecular resolution of the proteomic analysis. Cystitis and invasive high-grade urothelial carcinoma samples were analyzed. Three candidate biomarkers were identified and verified by immunohistochemistry in a small cohort, including low-grade urothelial carcinoma samples. The best-performing candidate AHNAK2 was further evaluated in a much larger independent verification cohort that also included CIS samples. Reactive urothelial atypia and CIS were distinguishable on the basis of the expression of this newly identified and verified immunohistochemical biomarker, with a sensitivity of 97% and a specificity of 69%. AHNAK2 can differentiate between reactive urothelial atypia in the setting of an acute or chronic cystitis and nonmuscle invasive-type CIS.
Diagnosing urothelial cancer (UCa) via invasive cystoscopy is painful, specifically in men, and can cause infection and bleeding. Because the UCa risk is higher for male patients, urinary ...non-invasive UCa biomarkers are highly desired to stratify men for invasive cystoscopy. We previously identified multiple DNA methylation sites in urine samples that detect UCa with a high sensitivity and specificity in men. Here, we identified the most relevant markers by employing multiple statistical approaches and machine learning (random forest, boosted trees, LASSO) using a dataset of 251 male UCa patients and 111 controls. Three CpG sites located in
,
and an intergenic region on chromosome 16 have been concordantly selected by all approaches, and their combination in a single decision matrix for clinical use was tested based on their respective thresholds of the individual CpGs. The combination of
and
yielded the best overall sensitivity (61%) at a pre-set specificity of 95%. This combination exceeded both the diagnostic performance of the most sensitive bioinformatic approach and that of the best single CpG. In summary, we showed that overlap analysis of multiple statistical approaches identifies the most reliable biomarkers for UCa in a male collective. The results may assist in stratifying men for cystoscopy.
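The idea of combining individual CpG thresholds into one decision rule can be sketched as follows. The abstract does not specify the combination rule, so this example assumes that a sample is called positive if any marker exceeds its own threshold, and that each threshold is derived from the control group at a pre-set specificity. All function names and values are hypothetical.

```python
import math


def threshold_at_specificity(control_values, target_specificity=0.95):
    """Smallest observed cutoff such that at least the target fraction of
    controls lies at or below it (values above the cutoff are called positive)."""
    ordered = sorted(control_values)
    needed = max(1, math.ceil(target_specificity * len(ordered)))
    return ordered[needed - 1]


def combined_call(marker_values, thresholds):
    """Assumed OR rule: positive if any marker exceeds its threshold."""
    return any(v > t for v, t in zip(marker_values, thresholds))


def sensitivity_specificity(calls, labels):
    """`labels` are True for cases and False for controls; both groups must be present."""
    tp = sum(1 for c, l in zip(calls, labels) if c and l)
    tn = sum(1 for c, l in zip(calls, labels) if not c and not l)
    n_pos = sum(1 for l in labels if l)
    n_neg = len(labels) - n_pos
    return tp / n_pos, tn / n_neg


# Toy methylation levels for two hypothetical markers (not study data)
samples = [(0.9, 0.1), (0.1, 0.8), (0.2, 0.3), (0.1, 0.1)]
labels = [True, True, True, False]
calls = [combined_call(s, (0.5, 0.5)) for s in samples]
print(sensitivity_specificity(calls, labels))
```

Note that an OR rule can only lower specificity relative to each single marker, so the specificity of the combined rule has to be re-evaluated rather than assumed to equal the pre-set value.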
In proteomics, liquid chromatography–tandem mass spectrometry (LC–MS/MS) is established for identifying peptides and proteins. Duplicated spectra, that is, multiple spectra of the same peptide, occur ...both in single MS/MS runs and in large spectral libraries. Clustering tandem mass spectra is used to find consensus spectra, with manifold applications. First, it speeds up database searches, as performed for instance by Mascot. Second, it helps to identify novel peptides across species. Third, it is used for quality control to detect wrongly annotated spectra. We compare different clustering algorithms based on the cosine distance between spectra. CAST, MS-Cluster, and PRIDE Cluster are popular algorithms to cluster tandem mass spectra. We add well-known algorithms for large data sets, hierarchical clustering, DBSCAN, and connected components of a graph, as well as the new method N-Cluster. All algorithms are evaluated on real data with varied parameter settings. Cluster results are compared with each other and with peptide annotations based on validation measures such as purity. Quality control, regarding the detection of wrongly (un)annotated spectra, is discussed for exemplary resulting clusters. N-Cluster proves to be highly competitive. All clustering results benefit from the so-called DISMS2 filter that integrates additional information, for example, on precursor mass.
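As a minimal illustration of two ingredients named in this abstract, the sketch below computes the cosine distance between two binned spectra and clusters spectra as connected components of the graph that links every pair closer than a cutoff. The spectrum representation (a dict from m/z bin to intensity), the cutoff value and the function names are hypothetical. This is not an implementation of N-Cluster, CAST, MS-Cluster or PRIDE Cluster.

```python
import math


def cosine_distance(a, b):
    """1 - cosine similarity of two spectra given as {mz_bin: intensity}."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 1.0  # an empty spectrum is treated as maximally distant
    return 1.0 - dot / (norm_a * norm_b)


def cluster_connected_components(spectra, max_distance):
    """Link spectra with distance <= max_distance and return the connected
    components as sorted lists of spectrum indices (union-find)."""
    parent = list(range(len(spectra)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i in range(len(spectra)):
        for j in range(i + 1, len(spectra)):
            if cosine_distance(spectra[i], spectra[j]) <= max_distance:
                parent[find(i)] = find(j)

    clusters = {}
    for i in range(len(spectra)):
        clusters.setdefault(find(i), []).append(i)
    return sorted(clusters.values())


# Toy spectra: the first two differ only by a scale factor
spectra = [{100: 1.0, 200: 2.0}, {100: 2.0, 200: 4.0}, {300: 1.0}]
print(cluster_connected_components(spectra, max_distance=0.1))
```

The pairwise loop is quadratic in the number of spectra, so it only serves to show the principle. Large data sets require the kind of scalable algorithms the abstract compares.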
Desmin mutations cause familial and sporadic cardiomyopathies. In addition to perturbing the contractile apparatus, both desmin deficiency and mutated desmin negatively impact mitochondria. Impaired ...myocardial metabolism secondary to mitochondrial defects could conceivably exacerbate cardiac contractile dysfunction. We performed metabolic myocardial phenotyping in left ventricular cardiac muscle tissue in desmin knock-out mice. Our analyses revealed decreased mitochondrial number, ultrastructural mitochondrial defects, and impaired mitochondria-related metabolic pathways including fatty acid transport, activation, and catabolism. Glucose transporter 1 and hexokinase-1 expression and hexokinase activity were increased. While mitochondrial creatine kinase expression was reduced, fetal creatine kinase expression was increased. Proteomic analysis revealed reduced expression of proteins involved in electron transport mainly of complexes I and II, oxidative phosphorylation, citrate cycle, beta-oxidation including auxiliary pathways, amino acid catabolism, and redox reactions and oxidative stress. Thus, desmin deficiency elicits a secondary cardiac mitochondriopathy with severely impaired oxidative phosphorylation and fatty and amino acid metabolism. Increased glucose utilization and fetal creatine kinase upregulation likely portray attempts to maintain myocardial energy supply. It may be prudent to avoid medications worsening mitochondrial function and other metabolic stressors. Therapeutic interventions for mitochondriopathies might also improve the metabolic condition in desmin deficient hearts.
Proteomic studies using mass spectrometry (MS)-based quantification are a main approach to the discovery of new biomarkers. However, a number of analytical conditions before and during MS data ...acquisition can affect the accuracy of the obtained outcome. Therefore, comprehensive quality assessment of the acquired data plays a central role in quantitative proteomics, although it is often neglected because of the immense complexity of MS data. Here, we address the quality assessment of quantitative MS data in practical terms, describing key steps for the evaluation at the levels of raw data, identification and quantification. Using these steps, four independent datasets from cerebrospinal fluid, an important biofluid for neurodegenerative disease biomarker studies, were assessed, demonstrating that sample processing-based differences are already reflected at all three levels but with varying impacts on the quality of the quantitative data. Specifically, we provide guidance to critically interpret the quality of MS data for quantitative proteomics. Moreover, we provide the free and open source quality control tool
, enabling systematic, rapid and uncomplicated data comparison of raw data, identification and feature detection levels through defined quality metrics and a step-by-step quality control workflow.
(1) Background: Neuroblastomas (NBs) are the most common extracranial solid tumors of children. The amplification of the Myc-N proto-oncogene (MYCN) is a major driver of NB aggressiveness, while high ...expression of the neurotrophin receptor NTRK1/TrkA is associated with mild disease courses. The molecular effects of NTRK1 signaling in MYCN-amplified NB, however, are still poorly understood and require elucidation. (2) Methods: Inducible NTRK1 expression was realized in four NB cell lines with (IMR5, NGP) or without MYCN amplification (SKNAS, SH-SY5Y). Proteome and phosphoproteome dynamics upon NTRK1 activation by its ligand, NGF, were analyzed in a time-dependent manner in IMR5 cells. Target validation by immunofluorescence staining and automated image processing was performed using the three other NB cell lines. (3) Results: In total, 230 proteins and 134 single phosphorylated class I phosphosites were found to be significantly regulated upon NTRK1 activation. Among known NTRK1 targets, Stathmin and the neurosecretory protein VGF were recovered. Additionally, we observed the upregulation and phosphorylation of Lamin A/C (LMNA) that accumulated inside nuclear foci. (4) Conclusions: We provide a comprehensive picture of NTRK1-induced proteome and phosphoproteome dynamics. The phosphorylation of LMNA within nucleic aggregates was identified as a prominent feature of NTRK1 signaling independent of the MYCN status of NB cells.