In biomedical text mining, named entity recognition (NER) is an important task used to extract information from biomedical articles. Previously proposed methods for NER are dictionary- or rule-based methods and machine learning approaches. However, these traditional approaches are heavily reliant on large-scale dictionaries, target-specific rules, or well-constructed corpora. These approaches to NER have been superseded by deep learning-based approaches that are independent of hand-crafted features. However, although such NER methods employ additional conditional random fields (CRF) to capture important correlations between neighboring labels, they often do not incorporate all the contextual information from text into the deep learning layers.
We propose herein an NER system for biomedical entities that incorporates n-grams with bi-directional long short-term memory (BiLSTM) and CRF; this system is referred to as contextual long short-term memory networks with CRF (CLSTM). We assess the CLSTM model on three corpora: the disease corpus of the National Center for Biotechnology Information (NCBI), the BioCreative II Gene Mention corpus (GM), and the BioCreative V Chemical Disease Relation corpus (CDR). Our framework was compared with several deep learning approaches, such as BiLSTM, BiLSTM with CRF, GRAM-CNN, and BERT. On the NCBI corpus, our model recorded an F-score of 85.68% for the NER of diseases, showing an improvement of 1.50% over previous methods. Moreover, although BERT used transfer learning by incorporating more than 2.5 billion words, our system performed comparably to BERT, with an F-score of 81.44% for gene NER on the GM corpus, and outperformed it, with an F-score of 86.44%, for the NER of chemicals and diseases on the CDR corpus. We conclude that our method significantly improves performance on biomedical NER tasks.
The proposed approach is robust in recognizing biological entities in text.
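The F-scores quoted above are the harmonic mean of precision and recall over recognized entities. A minimal sketch of that calculation follows, assuming exact-match scoring on (start, end, type) spans; the abstract does not state the matching criterion, and the function name `entity_f_score` and the toy spans are hypothetical.

```python
from typing import Set, Tuple

# A span is (start token index, end token index, entity type).
Span = Tuple[int, int, str]


def entity_f_score(gold: Set[Span], predicted: Set[Span]) -> Tuple[float, float, float]:
    """Return (precision, recall, F-score) under exact span matching (an assumption)."""
    true_positives = len(gold & predicted)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f_score = 2 * precision * recall / (precision + recall)
    return precision, recall, f_score


# Toy data: the second predicted span has a wrong boundary, so it does not count.
gold = {(0, 2, "Disease"), (5, 6, "Disease"), (9, 11, "Chemical")}
predicted = {(0, 2, "Disease"), (5, 7, "Disease"), (9, 11, "Chemical")}
precision, recall, f_score = entity_f_score(gold, predicted)
print(f"P={precision:.3f} R={recall:.3f} F={f_score:.3f}")
```

Under exact matching, a boundary error costs both a false positive and a false negative, which is why partial matches lower precision and recall together.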
In biomedical articles, a named entity recognition (NER) technique that identifies entity names from texts is an important element for extracting biological knowledge from articles. After NER is applied to articles, the next step is to normalize the identified names into standard concepts (e.g., disease names are mapped to the National Library of Medicine's Medical Subject Headings disease terms). In biomedical articles, many entity normalization methods rely on domain-specific dictionaries for resolving synonyms and abbreviations. However, the dictionaries are not comprehensive except for some entities such as genes. In recent years, biomedical articles have accumulated rapidly, and neural network-based algorithms that incorporate a large amount of unlabeled data have shown considerable success in several natural language processing problems.
In this study, we propose an approach for normalizing biological entities, such as disease names and plant names, by using word embeddings to represent semantic spaces. For diseases, training data from the National Center for Biotechnology Information (NCBI) disease corpus and unlabeled data from PubMed abstracts were used to construct word representations. For plants, a training corpus that we manually constructed and unlabeled PubMed abstracts were used to represent word vectors. We showed that the proposed approach performed better than the use of only the training corpus or only the unlabeled data and showed that the normalization accuracy was improved by using our model even when the dictionaries were not comprehensive. We obtained F-scores of 0.808 and 0.690 for normalizing the NCBI disease corpus and manually constructed plant corpus, respectively. We further evaluated our approach using a data set in the disease normalization task of the BioCreative V challenge. When only the disease corpus was used as a dictionary, our approach significantly outperformed the best system of the task.
The proposed approach shows robust performance for normalizing biological entities. The manually constructed plant corpus and the proposed model are available at http://gcancer.org/plant and http://gcancer.org/normalization, respectively.
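The general idea of normalizing a mention through a semantic space can be sketched as a nearest-neighbor lookup over word vectors. The sketch below is an illustration under stated assumptions, not the authors' model: the three-dimensional vectors, the concept identifiers `C1` and `C2`, and the choice of averaged token vectors with cosine similarity are all hypothetical.

```python
import math
from typing import Dict, List


def cosine(u: List[float], v: List[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector is empty or all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def phrase_vector(phrase: str, embeddings: Dict[str, List[float]]) -> List[float]:
    """Average the vectors of the known tokens in a phrase (an assumed composition rule)."""
    vectors = [embeddings[t] for t in phrase.lower().split() if t in embeddings]
    if not vectors:
        return []
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def normalize(mention: str, concepts: Dict[str, str],
              embeddings: Dict[str, List[float]]) -> str:
    """Return the concept ID whose name is closest to the mention in the vector space."""
    mention_vec = phrase_vector(mention, embeddings)
    return max(concepts,
               key=lambda cid: cosine(mention_vec, phrase_vector(concepts[cid], embeddings)))


# Hypothetical toy embeddings and dictionary entries, for illustration only.
toy_embeddings = {
    "breast": [0.9, 0.1, 0.0],
    "lung": [0.0, 0.1, 0.9],
    "carcinoma": [0.15, 0.85, 0.05],
    "neoplasms": [0.2, 0.8, 0.1],
}
toy_concepts = {"C1": "breast neoplasms", "C2": "lung neoplasms"}

print(normalize("breast carcinoma", toy_concepts, toy_embeddings))
```

Here "breast carcinoma" does not appear in the toy dictionary, yet it still maps to the "breast neoplasms" entry because "carcinoma" and "neoplasms" lie close together in the vector space, which is the kind of synonym resolution an incomplete dictionary cannot provide on its own.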
A new, modified synthesis of pyrroles is described. The reaction of 2,5-hexanedione with a variety of amines yielded the expected pyrrole analogues in excellent yields. The reactions were carried out under the ultimate green conditions, excluding both catalyst and solvent and applying simple stirring at room temperature. The amines include aqueous ammonium hydroxide for the synthesis of pyrroles with a free NH group, and benzylamines, anilines and phenylene-diamines for the synthesis of several N-derivatized pyrroles. The reaction also occurs efficiently with a variety of 1,4-diketones, although the reaction rates and yields are lower for the diketones that do not possess terminal methyl group(s).
A catalyst- and solvent-free room temperature synthesis of pyrroles is described.
The energy efficiency of microwave-assisted reactions was studied under heterogeneous catalytic conditions. Based on earlier publications, the choice of catalyst was a semi-synthetic montmorillonite, K-10. This material absorbs microwave energy effectively and is an excellent catalyst for microwave-assisted organic synthesis. The energy consumption of six different types of K-10 catalyzed reactions with multiple substrates and varied experimental parameters was determined under microwave irradiation and conventional heating. The parallel reactions were carried out under the same conditions to ensure the comparability of the data. While in the majority of the studied reactions the microwave-assisted method appeared to be more energy efficient to various extents, in one case conventional heating was found to be more efficient. The data, in agreement with a previous report, indicate that reactions should be studied on a case-by-case basis and that an automatic green label for microwave-assisted reactions is not warranted.
The energy efficiency of microwave-assisted and conventionally heated solvent-free heterogeneous catalytic reactions was studied.
The biomedical field is currently reaping the benefits of research on biomimetic nanoparticles (NPs), which are synthetic nanoparticles fabricated with natural cellular materials for nature-inspired biomedical applications. These camouflage NPs are capable of retaining not only the physicochemical properties of synthetic nanoparticles but also the original biological functions of the cellular materials. Accordingly, NPs coated with cell-derived membrane components have achieved remarkable growth as prospective biomedical materials. Particularly, bacterial outer membrane vesicle (OMV), which is a cell membrane coating material for NPs, is regarded as an important molecule that can be employed in several biomedical applications, including immune response activation, cancer therapeutics, and treatment for bacterial infections with photothermal activity. The currently available cell membrane-coated NPs are summarized in this review. Furthermore, the general features of bacterial OMVs and several multifunctional NPs that could serve as inner core materials in the coating strategy are presented, and several methods that can be used to prepare OMV-coated NPs (OMV-NPs) and their characterization are highlighted. Finally, some perspectives of OMV-NPs in various biomedical applications for future potential breakthrough are discussed. This in-depth review, which includes potential challenges, will encourage researchers to fabricate innovative and improvised, new-generation biomimetic materials through future biomedical applications.
Endoplasmic reticulum (ER)-associated degradation (ERAD) mediates the proteasomal clearance of proteins from the early secretory pathway. In this process, ubiquitinated substrates are extracted from membrane-embedded dislocation complexes by the AAA ATPase VCP and targeted to the cytosolic 26S proteasome. In addition to its well-established role in the degradation of misfolded proteins, ERAD also regulates the abundance of key proteins such as enzymes involved in cholesterol synthesis. However, due to the lack of generalizable methods, our understanding of the scope of proteins targeted by ERAD remains limited. To overcome this obstacle, we developed a VCP inhibitor substrate trapping approach (VISTA) to identify endogenous ERAD substrates. VISTA exploits the small-molecule VCP inhibitor CB5083 to trap ERAD substrates in a membrane-associated, ubiquitinated form. This strategy, coupled with quantitative ubiquitin proteomics, identified previously validated (e.g., ApoB100, Insig2, and DHCR7) and novel (e.g., SCD1 and RNF5) ERAD substrates in cultured human hepatocellular carcinoma cells. Moreover, our results indicate that RNF5 autoubiquitination on multiple lysine residues targets it for ubiquitin- and VCP-dependent clearance. Thus, VISTA provides a generalizable discovery method that expands the available toolbox of strategies to elucidate the ERAD substrate landscape.
Carbapenem-resistant Acinetobacter baumannii (CRAB) is the most detrimental pathogen that causes hospital-acquired infections. Tigecycline (TIG) is currently used as a potent antibiotic for treating CRAB infections; however, its overuse substantially induces the development of resistant isolates. Some molecular aspects of the resistance mechanisms of AB to TIG have been reported, but they are expected to be far more complicated and diverse than what has been characterized thus far. In this study, we identified bacterial extracellular vesicles (EVs), which are nano-sized lipid-bilayered spherical structures, as mediators of TIG resistance. Using laboratory-made TIG-resistant AB (TIG-R AB), we demonstrated that TIG-R AB produced more EVs than control TIG-susceptible AB (TIG-S AB). Transfer analysis of TIG-R AB-derived EVs treated with proteinase or DNase to recipient TIG-S AB showed that TIG-R EV proteins are major factors in TIG resistance transfer. Additional transfer spectrum analysis demonstrated that EV-mediated TIG resistance was selectively transferred to […], […], and […]. However, this action was not observed in […] and […]. Finally, we showed that EVs are more likely to induce TIG resistance than antibiotics. Our data provide direct evidence that EVs are potent cell-derived components with a high, selective occurrence of TIG resistance in neighboring bacterial cells.
Current systems for modulating the abundance of proteins of interest in living cells are powerful tools for studying protein function but differ in terms of their complexity and ease of use. Moreover, no one system is ideal for all applications, and the best system for a given protein of interest must often be determined empirically. The thalidomide-like molecules (collectively called the IMiDs) bind to the ubiquitously expressed cereblon ubiquitin ligase complex and alter its substrate specificity such that it targets the IKZF1 and IKZF3 lymphocyte transcription factors for destruction. Here, we mapped the minimal IMiD-responsive IKZF3 degron and show that this peptidic degron can be used to target heterologous proteins for destruction with IMiDs in a time- and dose-dependent manner in cultured cells grown ex vivo or in vivo.
Streptococcus thermophilus is one of the lactic acid bacteria applied as the main starter for dairy foods. A type strain of Streptococcus salivarius subsp. thermophilus, ATCC 19258, has been used in the genetic and biochemical characterization of its genes or gene products. While the genome sequence of NCTC 12958, as an equivalent to ATCC 19258, is available, whether both collections are identical remains to be validated. Here, we report the complete genome sequence of ATCC 19258, which contains one 2.1 Mb chromosome with a G + C content of 39.0% and includes 2255 protein-coding sequences, 77 RNAs, 4 riboswitches, and 3 CRISPRs. Comparison with NCTC 12958 revealed 54 mutations and 4 gaps in NCTC 12958, resulting in both mutations and insertions of nucleotides in the genome. Unlike in ATCC 19258, premature termination of three genes, encoding IS981 transposase B, MltF, and FetB, was detected in NCTC 12958. Our study highlights that type strains of Streptococcus thermophilus in two available independent strain collections are possibly different and that, therefore, the functions of previously identified or hitherto uncharacterized genes of Streptococcus thermophilus should be carefully assigned based on the genomic database of the strain.
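The G + C content reported above is the percentage of guanine and cytosine among the bases of a sequence. A minimal sketch of that calculation follows; the function name `gc_content` and the decision to skip ambiguous bases such as N are assumptions made for illustration, not details taken from the study.

```python
def gc_content(sequence: str) -> float:
    """Return the G + C percentage of a DNA sequence.

    Only unambiguous bases (A, C, G, T) are counted; ambiguity codes
    such as N are skipped (an assumption made for this sketch).
    """
    bases = [base for base in sequence.upper() if base in "ACGT"]
    if not bases:
        return 0.0
    gc_count = sum(1 for base in bases if base in "GC")
    return 100.0 * gc_count / len(bases)


# Toy sequence: 4 of the 10 counted bases are G or C.
print(gc_content("ATGCATTAGC"))
```

For a real genome the sequence would typically be read from a FASTA file, and the same ratio would be computed over the whole chromosome.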