Object detection is a fundamental step for automated video analysis in many vision applications. Object detection in a video is usually performed by object detectors or background subtraction techniques. Often, an object detector requires manually labeled examples to train a binary classifier, while background subtraction needs a training sequence that contains no objects to build a background model. To automate the analysis, object detection without a separate training phase becomes a critical task. Previous work has tried to tackle this task using motion information, but existing motion-based methods are usually limited when coping with complex scenarios such as nonrigid motion and dynamic background. In this paper, we show that the above challenges can be addressed in a unified framework named DEtecting Contiguous Outliers in the LOw-rank Representation (DECOLOR). This formulation integrates object detection and background learning into a single optimization process, which can be solved efficiently by an alternating algorithm. We explain the relations between DECOLOR and other sparsity-based methods. Experiments on both simulated data and real sequences demonstrate that DECOLOR outperforms state-of-the-art approaches and works effectively on a wide range of complex scenarios.
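To make the alternating idea concrete, the following is a minimal sketch and not the DECOLOR algorithm itself: it alternates between a truncated-SVD background fit and residual thresholding, and it omits the contiguity prior on the outlier mask entirely. The function name, the temporal-median initialisation, and the parameters `rank`, `thresh`, and `n_iter` are assumptions made for illustration.

```python
import numpy as np

def lowrank_outlier_sketch(D, rank=1, thresh=2.5, n_iter=5):
    """Alternate between outlier labeling and a low-rank background fit.

    D is a (pixels, frames) matrix with one vectorized frame per column.
    Returns (B, S): the background estimate and a boolean outlier mask.
    Simplification: outliers are found by plain thresholding, with no
    spatial or temporal contiguity prior.
    """
    # Assumed initialisation: per-pixel temporal median, repeated for every frame
    B = np.repeat(np.median(D, axis=1, keepdims=True), D.shape[1], axis=1)
    for _ in range(n_iter):
        # Step 1: entries far from the current background are labeled outliers
        S = np.abs(D - B) > thresh
        # Step 2: refit the background, replacing outlier entries with the
        # current estimate so that they do not pull the low-rank fit
        filled = np.where(S, B, D)
        U, s, Vt = np.linalg.svd(filled, full_matrices=False)
        B = (U[:, :rank] * s[:rank]) @ Vt[:rank]
    S = np.abs(D - B) > thresh
    return B, S
```

On a static background with a small moving object, the returned mask `S` marks the object pixels in each frame and `B` recovers the background; a faithful implementation would replace the thresholding step with one that favours contiguous outlier regions.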
Low-rank modeling generally refers to a class of methods that solve problems by representing variables of interest as low-rank matrices. It has achieved great success in various fields including computer vision, data mining, signal processing, and bioinformatics. Recently, much progress has been made in theories, algorithms, and applications of low-rank modeling, such as exact low-rank matrix recovery via convex programming and matrix completion applied to collaborative filtering. These advances have drawn increasing attention to this topic. In this article, we review the recent advances of low-rank modeling, the state-of-the-art algorithms, and the related applications in image analysis. We first give an overview of the concept of low-rank modeling and the challenging problems in this area. Then, we summarize the models and algorithms for low-rank matrix recovery and illustrate their advantages and limitations with numerical experiments. Next, we introduce a few applications of low-rank modeling in the context of image analysis. Finally, we conclude this article with some discussions.
In mass spectrometry (MS) based proteomic data analysis, peak detection is an essential step for subsequent analysis. Recently, there has been significant progress in the development of various peak detection algorithms. However, neither a comprehensive survey nor an experimental comparison of these algorithms is yet available. The main objective of this paper is to provide such a survey and to compare the performance of single spectrum based peak detection methods.
In general, we can decompose a peak detection procedure into three consecutive parts: smoothing, baseline correction and peak finding. We first categorize existing peak detection algorithms according to the techniques used in different phases. Such a categorization reveals the differences and similarities among existing peak detection algorithms. Then, we choose five typical peak detection algorithms to conduct a comprehensive experimental study using both simulated data and real MALDI MS data.
The comparison results show that the continuous wavelet-based algorithm provides the best average performance.
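The three-stage decomposition (smoothing, baseline correction, peak finding) can be sketched as follows. This is a deliberately simple illustration and not one of the five algorithms compared in the paper: it uses a moving average, a sliding-window minimum, and a local-maximum test, whereas the best-performing method reported above is wavelet-based. The function name and the parameters `smooth_win`, `baseline_win`, and `min_height` are assumptions, and both window sizes are assumed to be odd.

```python
import numpy as np

def detect_peaks(intensity, smooth_win=5, baseline_win=51, min_height=1.0):
    """Return indices of peaks in a 1-D spectrum using three simple stages."""
    y = np.asarray(intensity, dtype=float)

    # Stage 1, smoothing: moving average with edge padding (odd window assumed)
    kernel = np.ones(smooth_win) / smooth_win
    padded = np.pad(y, smooth_win // 2, mode="edge")
    smoothed = np.convolve(padded, kernel, mode="valid")

    # Stage 2, baseline correction: subtract a sliding-window minimum
    half = baseline_win // 2
    padded = np.pad(smoothed, half, mode="edge")
    baseline = np.array([padded[i:i + baseline_win].min() for i in range(len(y))])
    corrected = smoothed - baseline

    # Stage 3, peak finding: local maxima that rise above min_height
    mid = corrected[1:-1]
    is_peak = (mid > corrected[:-2]) & (mid >= corrected[2:]) & (mid > min_height)
    return np.flatnonzero(is_peak) + 1
```

Each stage can be swapped independently, which is the point of the categorization: for example, the moving average could be replaced by a wavelet or Savitzky-Golay filter without touching the other two stages.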
Cell delamination is a conserved morphogenetic process important for the generation of cell diversity and maintenance of tissue homeostasis. Here, we used embryonic neuroblasts as a model to study the apical constriction process during cell delamination. We observe dynamic myosin signals both around the cell adherens junctions and underneath the cell apical surface in the neuroectoderm. On the cell apical cortex, the nonjunctional myosin forms flows and pulses, which are termed medial myosin pulses. Quantitative differences in medial myosin pulse intensity and frequency are crucial to distinguish delaminating neuroblasts from their neighbors. Inhibition of medial myosin pulses blocks delamination. The fate of a neuroblast is set apart from that of its neighbors by Notch signaling-mediated lateral inhibition. When we inhibit Notch signaling activity in the embryo, we observe that small clusters of cells undergo apical constriction and display an abnormal apical myosin pattern. Together, these results demonstrate that a contractile actomyosin network across the apical cell surface is organized to drive apical constriction in delaminating neuroblasts.
In genome-wide association studies, we normally discover associations between genetic variants and diseases/traits in primary studies, and validate the findings in replication studies. We consider the associations identified in both primary and replication studies as true findings. An important question under this two-stage setting is how to determine significance levels in both studies. In traditional methods, significance levels of the primary and replication studies are determined separately. We argue that the separate determination strategy reduces the power in the overall two-stage study. Therefore, we propose a novel method to determine significance levels jointly. Our method is a reanalysis method that needs summary statistics from both studies. We find the most powerful significance levels when controlling the false discovery rate in the two-stage study. To enjoy the power improvement from the joint determination method, we need to select single nucleotide polymorphisms for replication at a less stringent significance level. This is a common practice in studies designed for discovery purposes. We suggest this practice is also suitable in studies designed for validation purposes in order to identify more true findings. Simulation experiments show that our method can provide more power than traditional methods and that the false discovery rate is well controlled. Empirical experiments on datasets of five diseases/traits demonstrate that our method can help identify more associations. The R-package is available at: http://bioinformatics.ust.hk/RFdr.html.
Collecting millions of genetic variations is feasible with advanced genotyping technology. With a huge amount of genetic variation data in hand, developing efficient algorithms to carry out the gene-gene interaction analysis in a timely manner has become one of the key problems in genome-wide association studies (GWAS). Boolean operation-based screening and testing (BOOST), a recent work in GWAS, completes gene-gene interaction analysis in 2.5 days on a desktop computer. Compared with central processing units (CPUs), graphics processing units (GPUs) are highly parallel hardware and provide massive computing resources. We are, therefore, motivated to use GPUs to further speed up the analysis of gene-gene interactions.
We implement the BOOST method based on a GPU framework and name it GBOOST. GBOOST achieves a 40-fold speedup compared with BOOST. It completes the analysis of Wellcome Trust Case Control Consortium Type 2 Diabetes (WTCCC T2D) genome data within 1.34 h on a desktop computer equipped with an Nvidia GeForce GTX 285 display card.
GBOOST code is available at http://bioinformatics.ust.hk/BOOST.html#GBOOST.
Gene-gene interactions have long been recognized to be fundamentally important for understanding genetic causes of complex disease traits. At present, identifying gene-gene interactions from genome-wide case-control studies is computationally and methodologically challenging. In this paper, we introduce a simple but powerful method, named “BOolean Operation-based Screening and Testing” (BOOST). For the discovery of unknown gene-gene interactions that underlie complex diseases, BOOST allows examination of all pairwise interactions in genome-wide case-control studies in a remarkably fast manner. We have carried out interaction analyses on seven data sets from the Wellcome Trust Case Control Consortium (WTCCC). Each analysis took less than 60 hr to completely evaluate all pairs of roughly 360,000 SNPs on a standard 3.0 GHz desktop with 4G memory running the Windows XP system. The interaction patterns identified from the type 1 diabetes data set differ significantly from those identified from the rheumatoid arthritis data set, although both data sets share a very similar hit region in the WTCCC report. BOOST has also identified some disease-associated interactions between genes in the major histocompatibility complex region in the type 1 diabetes data set. We believe that our method can serve as a computationally and statistically useful tool in the coming era of large-scale interaction mapping in genome-wide case-control studies.
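The abstract does not spell out how Boolean operations are used, so the following is only a sketch of one way they can speed up pairwise analysis: each SNP is stored as three bit masks (one per genotype), and every cell of the genotype-by-genotype contingency table is obtained with a bitwise AND followed by a bit count. The bit layout, the function names, and the use of Python integers as bit sets are all assumptions made for illustration; the statistical screening and testing stages are not shown.

```python
def encode_snp(genotypes):
    """Return three bit masks, one per genotype (0, 1, 2).

    Bit i of masks[g] is set when sample i carries genotype g.
    """
    masks = [0, 0, 0]
    for i, g in enumerate(genotypes):
        masks[g] |= 1 << i
    return masks

def popcount(x):
    """Count the set bits of a non-negative integer."""
    return bin(x).count("1")

def pairwise_table(snp_a, snp_b, case_mask):
    """Build counts indexed as table[genotype_a][genotype_b][status].

    status is 0 for controls and 1 for cases; case_mask has bit i set
    when sample i is a case.
    """
    a, b = encode_snp(snp_a), encode_snp(snp_b)
    n = len(snp_a)
    control_mask = ~case_mask & ((1 << n) - 1)
    table = [[[0, 0] for _ in range(3)] for _ in range(3)]
    for i in range(3):
        for j in range(3):
            both = a[i] & b[j]  # samples with genotype i at A and j at B
            table[i][j][0] = popcount(both & control_mask)
            table[i][j][1] = popcount(both & case_mask)
    return table
```

Because each AND processes many samples at once, the cost per SNP pair depends on the number of machine words rather than the number of samples, which is what makes an exhaustive scan over all pairs plausible on a desktop.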
Protein acetylation, one of many types of post-translational modifications (PTMs), is involved in a variety of biological and cellular processes. In the present study, we applied both CsCl density gradient (CDG) centrifugation-based protein fractionation and a dimethyl-labeling-based 4C quantitative PTM proteomics workflow in the study of dynamic acetylproteomic changes in Arabidopsis. This workflow integrates the dimethyl chemical labeling with chromatography-based acetylpeptide separation and enrichment followed by mass spectrometry (MS) analysis, the extracted ion chromatogram (XIC) quantitation-based computational analysis of mass spectrometry data to measure dynamic changes of acetylpeptide level using an in-house software program, named Stable isotope-based Quantitation-Dimethyl labeling (SQUA-D), and finally the confirmation of ethylene hormone-regulated acetylation using immunoblot analysis. Eventually, using this proteomic approach, 7456 unambiguous acetylation sites were found from 2638 different acetylproteins, and 5250 acetylation sites, including 5233 sites on lysine side chain and 17 sites on protein N termini, were identified repetitively. Out of these repetitively discovered acetylation sites, 4228 sites on lysine side chain (i.e. 80.5%) are novel. These acetylproteins are exemplified by the histone superfamily, ribosomal and heat shock proteins, and proteins related to stress/stimulus responses and energy metabolism. The novel acetylproteins enriched by the CDG centrifugation fractionation contain many cellular trafficking proteins, membrane-bound receptors, and receptor-like kinases, which are mostly involved in brassinosteroid, light, gravity, and development signaling. In addition, we identified 12 highly conserved acetylation site motifs within histones, P-glycoproteins, actin depolymerizing factors, ATPases, transcription factors, and receptor-like kinases.
Using SQUA-D software, we have quantified 33 ethylene hormone-enhanced and 31 hormone-suppressed acetylpeptide groups, also called unique PTM peptide arrays (UPAs), that share the identical unique PTM site pattern (UPSP). This CDG centrifugation protein fractionation, in combination with dimethyl labeling-based quantitative PTM proteomics and SQUA-D, may be applied in the quantitation of any PTM proteins in any model eukaryotes and agricultural crops as well as tissue samples of animals and human beings.
Abstract
Background
Cross-linking mass spectrometry (XL-MS) is a powerful technique for detecting protein–protein interactions (PPIs) and modeling protein structures in a high-throughput manner. In XL-MS experiments, proteins are cross-linked by a chemical reagent (namely cross-linker), fragmented, and then fed into a tandem mass spectrometer (MS/MS). Cross-linkers are either cleavable or non-cleavable, and each type requires distinct data analysis tools. However, both types of cross-linkers suffer from imbalanced fragmentation efficiency, resulting in a large number of unidentifiable spectra that hinder the discovery of PPIs and protein conformations. To address this challenge, researchers have sought to improve the sensitivity of XL-MS through the invention of novel cross-linking reagents, optimization of sample preparation protocols, and development of data analysis algorithms. One promising approach to developing new data analysis methods is to apply a protein feedback mechanism in the analysis. It has significantly improved the sensitivity of analysis methods in the cleavable cross-linking data. The application of the protein feedback mechanism to the analysis of non-cleavable cross-linking data is expected to have an even greater impact because the majority of XL-MS experiments currently employ non-cleavable cross-linkers.
Results
In this study, we applied the protein feedback mechanism to the analysis of both non-cleavable and cleavable cross-linking data and observed a substantial improvement in cross-link spectrum matches (CSMs) compared to conventional methods. Furthermore, we developed a new software program, ECL 3.0, that integrates two algorithms and includes a user-friendly graphical interface to facilitate wider applications of this new program.
Conclusions
ECL 3.0 source code is available at https://github.com/yuweichuan/ECL-PF.git. A quick tutorial is available at https://youtu.be/PpZgbi8V2xI.
Motivation: Hundreds of thousands of single nucleotide polymorphisms (SNPs) are available for genome-wide association (GWA) studies nowadays. The epistatic interactions of SNPs are believed to be very important in determining individual susceptibility to complex diseases. However, existing methods for SNP interaction discovery either suffer from high computational complexity or perform poorly when marginal effects of disease loci are weak or absent. Hence, it is desirable to develop an effective method to search for epistatic interactions at genome-wide scale. Results: We propose a new method, SNPHarvester, to detect SNP–SNP interactions in GWA studies. SNPHarvester creates multiple paths in which the visited SNP groups tend to be statistically associated with diseases, and then harvests those significant SNP groups which pass the statistical tests. It greatly reduces the number of SNPs. Consequently, existing tools can be directly used to detect epistatic interactions. By using a wide range of simulated data and a real genome-wide data set, we demonstrate that SNPHarvester outperforms its recent competitor significantly and is promising for practical disease prognosis. Availability: http://bioinformatics.ust.hk/SNPHarvester.html Contact: eeyang@ust.hk Supplementary information: Supplementary data are available at Bioinformatics online.