Many publicly available data repositories and resources have been developed to support protein-related information management, data-driven hypothesis generation, and biological knowledge discovery. To help researchers quickly find the appropriate protein-related informatics resources, we present a comprehensive review (with categorization and description) of major protein bioinformatics databases in this chapter. We also discuss the challenges and opportunities for developing next-generation protein bioinformatics databases and resources to support data integration and data analytics in the Big Data era.
When compared to Sanger sequencing technology, next-generation sequencing (NGS) technologies are hindered by shorter sequence read length, higher base-call error rate, non-uniform coverage, and platform-specific sequencing artifacts. These characteristics lower the quality of downstream analyses, e.g. de novo and reference-based assembly, by introducing sequencing artifacts and errors that may contribute to incorrect interpretation of data. Although many tools have been developed for quality control and pre-processing of NGS data, none of them provides flexible and comprehensive trimming options in conjunction with parallel processing to expedite pre-processing of large NGS datasets.
We developed ngsShoRT (next-generation sequencing Short Reads Trimmer), a flexible and comprehensive open-source software package written in Perl that provides a set of algorithms commonly used for pre-processing NGS short read sequences. We compared the features and performance of ngsShoRT with existing tools: CutAdapt, NGS QC Toolkit and Trimmomatic. We also compared the effects of using pre-processed short read sequences generated by different algorithms on de novo and reference-based assembly for three different genomes: Caenorhabditis elegans, Saccharomyces cerevisiae S288c, and Escherichia coli O157:H7.
Several combinations of ngsShoRT algorithms were tested on publicly available Illumina GA II, HiSeq 2000, and MiSeq eukaryotic and bacterial genomic short read sequences, with a focus on removing sequencing artifacts and low-quality reads and/or bases. Our results show that across three organisms and three sequencing platforms, trimming improved the mean quality scores of trimmed sequences. Using trimmed sequences for de novo and reference-based assembly improved assembly quality as well as assembler performance. In general, ngsShoRT outperformed comparable trimming tools in terms of trimming speed and improvement of de novo and reference-based assembly as measured by assembly contiguity and correctness.
Trimming of short read sequences can improve the quality of de novo and reference-based assembly and assembler performance. The parallel processing capability of ngsShoRT reduces trimming time and improves memory efficiency when dealing with large datasets. We recommend combining removal of sequencing artifacts with quality-score-based read filtering and base trimming as the most consistent method for improving sequence quality and downstream assemblies. ngsShoRT source code, user guide and tutorial are available at http://research.bioinformatics.udel.edu/genomics/ngsShoRT/. ngsShoRT can be incorporated as a pre-processing step in genome and transcriptome assembly projects.
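The recommended combination of quality-score-based base trimming and read filtering can be illustrated with a minimal Python sketch. This is not ngsShoRT's Perl implementation or its actual algorithms: the function names, the Phred+33 encoding, and the default thresholds (quality 20, minimum length 30) are all assumptions chosen for illustration.

```python
def phred_scores(quality_string, offset=33):
    """Decode an ASCII quality string into Phred scores (Phred+33 assumed)."""
    return [ord(ch) - offset for ch in quality_string]


def trim_3prime(sequence, quality_string, min_quality=20, offset=33):
    """Drop bases from the 3' end until the last base meets min_quality."""
    scores = phred_scores(quality_string, offset)
    end = len(scores)
    while end > 0 and scores[end - 1] < min_quality:
        end -= 1
    return sequence[:end], quality_string[:end]


def passes_filter(quality_string, min_mean_quality=20, min_length=30, offset=33):
    """Keep a read only if it is long enough and its mean quality is adequate."""
    scores = phred_scores(quality_string, offset)
    if not scores or len(scores) < min_length:
        return False
    return sum(scores) / len(scores) >= min_mean_quality


def preprocess(reads, min_quality=20, min_mean_quality=20, min_length=30):
    """Trim each (sequence, quality) pair, then discard reads failing the filter."""
    kept = []
    for sequence, quality_string in reads:
        seq, qual = trim_3prime(sequence, quality_string, min_quality)
        if passes_filter(qual, min_mean_quality, min_length):
            kept.append((seq, qual))
    return kept
```

Trimming before filtering means a read is judged on the bases that would actually reach the assembler, so a read with a poor 3' tail but a good core is kept rather than discarded.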
Thousands of Coronavirus Disease 2019 (COVID-19) patients have been discharged from hospitals. Persistent follow-up studies are required to evaluate the prevalence of post-COVID-19 fibrosis.
This study involves 462 laboratory-confirmed patients with COVID-19 who were admitted to Shenzhen Third People's Hospital from January 11, 2020 to April 26, 2020. A total of 457 patients underwent thin-section chest CT scans during hospitalization or after discharge to identify pulmonary lesions. A total of 287 patients were followed up from 90 to 150 days after the onset of the disease, and lung function tests were conducted about three months after onset. The risk factors affecting the persistence of pulmonary fibrosis were identified through regression analysis, and a prediction model of the persistence of pulmonary fibrosis was established.
Parenchymal bands, irregular interfaces, reticulation and traction bronchiectasis were the most common CT features in all COVID-19 patients. At 0-30, 31-60, 61-90, 91-120 and >120 days after onset, 86.87%, 74.40%, 79.56%, 68.12% and 62.03% of patients had pulmonary fibrosis, and 4.53%, 19.61%, 18.02%, 38.30% and 48.98% of patients showed reversal of pulmonary fibrosis, respectively. Age, BMI, fever, and highest PCT were predictive factors for fibrosis persisting beyond 90 days from onset. A predictive model of the persistence of pulmonary fibrosis was developed using logistic regression, with accuracy, PPV, NPV, sensitivity and specificity of 76%, 71%, 79%, 67%, and 82%, respectively. More than half of the COVID-19 patients had abnormal lung function more than 90 days after onset, and the proportion with abnormal lung function did not differ significantly between the fibrotic and non-fibrotic groups.
Persistent pulmonary fibrosis was more likely to develop in patients with older age, higher BMI, severe/critical condition, fever, a longer viral clearance time, pre-existing disease and delayed hospitalization. Fibrosis that developed in COVID-19 patients was reversible in about a third of the patients by 120 days after onset. Pulmonary function returned to normal in fewer than half of COVID-19 patients three months after onset. An effective prediction model with an average area under the curve (AUC) of 0.84 was established to predict the persistence of pulmonary fibrosis in COVID-19 patients for early diagnosis.
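For readers less familiar with the reported model metrics, the sketch below shows how accuracy, PPV, NPV, sensitivity and specificity follow from the four cells of a binary confusion matrix. The function name and the counts used in the usage note are hypothetical; they are not the study's data, and the study's own confusion matrix is not given in the text.

```python
def _ratio(numerator, denominator):
    """Return numerator / denominator, or None when the denominator is zero."""
    return numerator / denominator if denominator else None


def classification_metrics(tp, fp, tn, fn):
    """Compute standard binary-classifier metrics from confusion-matrix counts.

    tp: persistent fibrosis predicted and observed
    fp: persistent fibrosis predicted but not observed
    tn: no persistent fibrosis predicted and none observed
    fn: no persistent fibrosis predicted but fibrosis observed
    """
    return {
        "accuracy": _ratio(tp + tn, tp + fp + tn + fn),
        "ppv": _ratio(tp, tp + fp),          # positive predictive value
        "npv": _ratio(tn, tn + fn),          # negative predictive value
        "sensitivity": _ratio(tp, tp + fn),  # true positive rate
        "specificity": _ratio(tn, tn + fp),  # true negative rate
    }
```

With made-up counts such as `classification_metrics(tp=30, fp=10, tn=50, fn=10)`, accuracy is 80/100 and both PPV and sensitivity are 30/40. The AUC reported above cannot be derived from a single confusion matrix, since it summarizes performance across all decision thresholds.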
We have developed a new web application for peptide matching using an Apache Lucene-based search engine. The Peptide Match service is designed to quickly retrieve all occurrences of a given query peptide from UniProt Knowledgebase (UniProtKB) with isoforms. The matched proteins are shown in summary tables with rich annotations, including matched sequence region(s) and links to corresponding proteins in a number of proteomic/peptide spectral databases. The results are grouped by taxonomy and can be browsed by organism, taxonomic group or taxonomy tree. The service supports queries in which isobaric leucine and isoleucine are treated as equivalent, offers an option for searching UniRef100 representative sequences, and supports dynamic queries to major proteomic databases. In addition to the web interface, we also provide RESTful web services. The underlying data are updated every 4 weeks in accordance with the UniProt releases.
http://proteininformationresource.org/peptide.shtml.
chenc@udel.edu.
Supplementary data are available at Bioinformatics online.
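The leucine/isoleucine equivalence described above can be illustrated with a small Python sketch. It uses a plain substring scan over an in-memory dictionary, not the Lucene index behind the Peptide Match service, and the function names and protein identifiers are hypothetical.

```python
def normalize_li(sequence):
    """Map isoleucine (I) to leucine (L) so the isobaric residues compare equal."""
    return sequence.upper().replace("I", "L")


def find_peptide(peptide, proteins, li_equivalent=False):
    """Return {protein_id: [1-based start positions]} for each occurrence.

    proteins is a dict mapping a protein identifier to its amino acid sequence.
    """
    query = normalize_li(peptide) if li_equivalent else peptide.upper()
    if not query:
        return {}
    hits = {}
    for protein_id, sequence in proteins.items():
        target = normalize_li(sequence) if li_equivalent else sequence.upper()
        positions = []
        start = target.find(query)
        while start != -1:
            positions.append(start + 1)            # report 1-based positions
            start = target.find(query, start + 1)  # allow overlapping matches
        if positions:
            hits[protein_id] = positions
    return hits


# Hypothetical sequences that differ only by an L/I swap.
example_proteins = {"P1": "MKLAIGK", "P2": "MKIALGK"}
exact_hits = find_peptide("LAIG", example_proteins)
li_hits = find_peptide("LAIG", example_proteins, li_equivalent=True)
```

Here the exact search matches only `P1`, while the L/I-equivalent search matches both entries. A linear scan like this would be far too slow at UniProtKB scale, which is the motivation for an indexed search engine.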
The Protein Ontology (PRO; http://purl.obolibrary.org/obo/pr) formally defines and describes taxon-specific and taxon-neutral protein-related entities in three major areas: proteins related by evolution; proteins produced from a given gene; and protein-containing complexes. PRO thus serves as a tool for referencing protein entities at any level of specificity. To enhance this ability, and to facilitate the comparison of such entities described in different resources, we developed a standardized representation of proteoforms using UniProtKB as a sequence reference and PSI-MOD as a post-translational modification reference. We illustrate its use in facilitating an alignment between PRO and Reactome protein entities. We also address issues of scalability, describing our first steps into the use of text mining to identify protein-related entities, the large-scale import of proteoform information from expert curated resources, and our ability to dynamically generate PRO terms. Web views for individual terms are now more informative about closely-related terms, including for example an interactive multiple sequence alignment. Finally, we describe recent improvement in semantic utility, with PRO now represented in OWL and as a SPARQL endpoint. These developments will further support the anticipated growth of PRO and facilitate discoverability of and allow aggregation of data relating to protein entities.
Conventional ophthalmic solutions are often eliminated rapidly after administration and cannot provide and maintain an adequate concentration of the drug in the precorneal area. To solve these problems, we developed a thermosensitive in situ gelling and mucoadhesive ophthalmic drug delivery system containing puerarin based on poloxamer analogs (21% (w/v) poloxamer 407/5% (w/v) poloxamer 188) and carbopol (0.1% (w/v) or 0.2% (w/v) carbopol 1342P NF). The combined solutions would convert to firm gels under physiological conditions and attach to the ocular mucosal surface for a relatively long time. The incorporation of carbopol 1342P NF did not affect the pseudoplastic behavior with hysteresis of the poloxamer analogs solution; it led to a higher shear stress at each shear rate and also enhanced the mucoadhesive force significantly. In vitro release studies demonstrated diffusion-controlled release of puerarin from the combined solutions over a period of 8 h. In vivo evaluation (the elimination of puerarin in tear and intraocular pressure-lowering effect) indicated that the combined solutions had better ability to retain drug than poloxamer analogs or carbopol alone. It appears that ocular bioavailability can be increased more readily by using the in situ gelling and mucoadhesive vehicle.
Coffee leaf rust caused by the fungus Hemileia vastatrix is one of the most important leaf diseases of coffee plantations worldwide. Current knowledge of the H. vastatrix genome is limited and only a small fraction of the total fungal secretome has been identified. In order to obtain a more comprehensive understanding of its secretome, we aimed to sequence and assemble the entire H. vastatrix genome using two next-generation sequencing platforms and a hybrid assembly strategy. This resulted in a 547 Mb genome of H. vastatrix race XXXIII (Hv33), with 13,364 predicted genes that encode 13,034 putative proteins with transcriptomic support. Based on this proteome, 615 proteins contain putative secretion peptides and lack transmembrane domains. From this putative secretome, 111 proteins were identified as candidate effectors (EHv33) unique to H. vastatrix, and a subset consisting of 17 EHv33 genes was selected for a temporal gene expression analysis during infection. Five genes were significantly induced early during an incompatible interaction, indicating their potential role as pre-haustorial effectors possibly recognized by the resistant coffee genotype. Another nine genes were significantly induced after haustorium formation in the compatible interaction. Overall, we suggest that this fungus is able to selectively mount its survival strategy with effectors that depend on the host genotype involved in the infection process.
Although hatching is perhaps the most abrupt and profound metabolic challenge that a chicken must undergo, there have been no attempts to functionally map the metabolic pathways induced in liver during the embryo-to-hatchling transition. Furthermore, we know very little about the metabolic and regulatory factors that regulate lipid metabolism in late embryos or newly-hatched chicks. In the present study, we examined hepatic transcriptomes of 12 embryos and 12 hatchling chicks during the peri-hatch period, or the metabolic switch from chorioallantoic to pulmonary respiration.
Initial hierarchical clustering revealed two distinct, albeit opposing, patterns of hepatic gene expression. Cluster A genes are largely lipolytic and highly expressed in embryos, whereas Cluster B genes are lipogenic/thermogenic and mainly controlled by the lipogenic transcription factor THRSPA. Using pairwise comparisons of embryo and hatchling ages, we found 1272 genes that were differentially expressed between embryos and hatchling chicks, including 24 transcription factors and 284 genes that regulate lipid metabolism. The three most differentially-expressed transcripts found in liver of embryos were MOGAT1, DIO3 and PDK4, whereas THRSPA, FASN and DIO2 were highest in hatchlings. An unusual finding was the "ectopic" and extremely high differential expression of seven feather keratin transcripts in liver of 16-day embryos, which coincides with engorgement of liver with yolk lipids. Gene interaction networks show several transcription factors, transcriptional co-activators/co-inhibitors and their downstream genes that exert a 'yin-yang' action on lipid metabolism during the embryo-to-hatchling transition. These upstream regulators include ligand-activated transcription factors, sirtuins and Kruppel-like factors.
Our genome-wide transcriptional analysis has greatly expanded the hepatic repertoire of regulatory and metabolic genes involved in the embryo-to-hatchling transition. New knowledge was gained on interactive transcriptional networks and metabolic pathways that enable the abrupt switch from ectothermy (embryo) to endothermy (hatchling) in the chicken. Several transcription factors and their co-activators/co-inhibitors appear to exert opposing actions on lipid metabolism, leading to the predominance of lipolysis in embryos and lipogenesis in hatchlings. Our analysis of hepatic transcriptomes has enabled discovery of opposing, interconnected and interdependent transcriptional regulators that provide precise yin-yang or homeorhetic regulation of lipid metabolism during the critical embryo-to-hatchling transition.
The novel coronavirus disease 2019 (COVID-19) outbreak began in Wuhan, Hubei, China in December 2019, and cases of infection have been continuously reported in various countries. It is now clear that the SARS-CoV-2 coronavirus is transmissible from human to human. Nucleic acid detection is considered the gold standard for the diagnosis of COVID-19. In this case report, we describe our experience in detecting SARS-CoV-2 in a confirmed patient using a nucleic acid test of bronchoalveolar-lavage fluid (BALF) samples but not nasopharyngeal swabs.
We present the case of a severely ill 46-year-old man infected with SARS-CoV-2 who presented with fever, coughing and chest tightness. We performed viral detection using his BALF samples and used imaging (CT) for confirmation. The patient received a combination of interferon alfa-1b and ribavirin, lopinavir and ritonavir for antiviral treatment at different stages. Other medications were also given in combination for anti-inflammation, intestinal microbial regulation, phlegm elimination, liver protection and prevention of pulmonary fibrosis. We provided oxygen using a BIPAP ventilator and a high-flow humidification oxygen therapy instrument to facilitate respiration. The patient was cured and discharged.
This case report describes an effective supportive medication scheme for treating a SARS-CoV-2-infected patient and emphasizes the necessity of detecting the viral genome in BALF samples and its significance in the diagnosis and prognosis of the disease.
Although the FDA has given emergency use authorization (EUA) for some antiviral drugs for the treatment of COVID-19, no direct antiviral drugs have been identified for the treatment of critically ill patients; for these patients, the most important treatment is suppression of hyperinflammation. The purpose of this study was to evaluate the role of corticosteroids in hospitalized severe or critical patients positive for COVID-19. This is a retrospective single-center descriptive study. Patients classified as having severe or critical COVID-19 infections with acute respiratory dysfunction syndrome in Shenzhen Third People’s Hospital were enrolled from January 11th to March 30th, 2020. Ninety patients were classified as having severe or critical COVID-19 infections. The patients were treated with methylprednisolone at a low-to-moderate dosage for a short duration. The interval from symptom onset to methylprednisolone treatment was about 8 days. Eighteen patients were treated with invasive ventilation and intensive care unit (ICU) care. All the patients in the severe group and ten in the critical group recovered and were discharged. Three critical cases with invasive ventilation died. Although cases were much more severe in the corticosteroid-treated group, the mortality was not significantly increased. Early use of a low-to-moderate dosage and short duration of corticosteroid may be the more appropriate immune-modulatory treatment and may bring more benefit to patients with severe COVID-19.