Multi-label text classification (MLTC) is the task of assigning each document to its most relevant subset of class labels. Previous works usually ignored the correlation and semantics of labels, resulting in information loss. To deal with this problem, we propose a new model that explores label dependencies and semantics by using graph convolutional networks (GCN). In particular, we introduce an efficient correlation matrix to model label correlation based on occurrence and co-occurrence probabilities. To enrich the semantic information of labels, we design a method that uses external information from Wikipedia for label embeddings. Correlated label information learned from the GCN is combined with a fine-grained document representation generated by another sub-net for classification. Experimental results on three benchmark datasets show that our model outperforms prior state-of-the-art methods. Ablation studies further examine several aspects of the proposed model. Our code is available at https://github.com/chiennv2000/LR-GCN.
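As a rough illustration of a correlation matrix built from occurrence and co-occurrence counts, the sketch below estimates the conditional probability P(label j | label i) from a toy multi-hot label matrix. The function name `label_correlation`, the toy data, and the choice of a plain conditional probability are assumptions made for illustration; the construction used in the paper may differ.

```python
from typing import List

def label_correlation(labels: List[List[int]]) -> List[List[float]]:
    """Estimate A[i][j] = P(label j | label i) from multi-hot label rows."""
    n_labels = len(labels[0])
    occurrence = [0] * n_labels                       # how often each label appears
    co_occurrence = [[0] * n_labels for _ in range(n_labels)]
    for row in labels:
        active = [k for k, v in enumerate(row) if v]  # labels set on this document
        for i in active:
            occurrence[i] += 1
            for j in active:
                co_occurrence[i][j] += 1
    return [
        [co_occurrence[i][j] / occurrence[i] if occurrence[i] else 0.0
         for j in range(n_labels)]
        for i in range(n_labels)
    ]

# Hypothetical toy data: 4 documents, 3 labels.
toy_labels = [
    [1, 1, 0],
    [1, 0, 0],
    [0, 1, 1],
    [1, 1, 0],
]
A = label_correlation(toy_labels)
```

Note that the resulting matrix is generally asymmetric: in the toy data, label 2 always co-occurs with label 1, but label 1 appears with label 2 in only one of its three documents.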
Automatically generated papers have been used to manipulate bibliographic indexes on numerous occasions. This paper examines different means of generating text, such as recurrent neural networks, Markov models, and probabilistic context-free grammars, and asks whether such text can be detected with current approaches. It then focuses on the probabilistic context-free grammar (PCFG), the most widely used of these. Although multiple approaches exist for detecting such papers, they all work at the document level and cannot detect a small amount of generated text inside a larger body of genuinely written text. We therefore present a grammatical structure similarity measure to detect sentences or short fragments of text automatically generated by known PCFG generators. The proposed approach is tested against a pattern checker and various common machine learning methods. The ability to detect a modified PCFG generator is also tested.
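To make the notion of a PCFG generator concrete, here is a minimal sketch that samples a sentence from a toy grammar by expanding each nonterminal according to rule probabilities. The grammar, the vocabulary, and the function name `generate` are hypothetical and are not taken from any real paper generator; the sketch shows only the generation side, not the detection method.

```python
import random
from typing import Dict, List, Tuple

# Toy grammar: nonterminal -> list of (right-hand side, probability).
# Hypothetical rules for illustration only.
GRAMMAR: Dict[str, List[Tuple[List[str], float]]] = {
    "S": [(["NP", "VP"], 1.0)],
    "NP": [(["the", "N"], 0.7), (["the", "ADJ", "N"], 0.3)],
    "VP": [(["V", "NP"], 0.6), (["V"], 0.4)],
    "N": [(["model"], 0.5), (["method"], 0.5)],
    "ADJ": [(["novel"], 1.0)],
    "V": [(["improves"], 0.5), (["outperforms"], 0.5)],
}

def generate(symbol: str, rng: random.Random) -> List[str]:
    """Expand a symbol recursively, sampling rules by their probabilities."""
    if symbol not in GRAMMAR:  # terminal word: emit as-is
        return [symbol]
    rules = GRAMMAR[symbol]
    rhs = rng.choices([r for r, _ in rules], weights=[p for _, p in rules])[0]
    words: List[str] = []
    for child in rhs:
        words.extend(generate(child, rng))
    return words

# A seeded generator makes the sample reproducible.
sentence = generate("S", random.Random(0))
```

Because every sentence comes from the same small set of rules, generated text shares grammatical structures far more often than genuinely written text does, which is the kind of regularity a structure-based detector can look for.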
Information extraction (IE) is a vital step of digitization that reduces paperwork in offices. However, the adaptation of common IE systems to actual business cases faces two issues. First, the number of training samples is small (i.e. 100–200 examples). Second, span extraction models based on a question answering formulation require a long time for training and inference. To overcome these issues, we introduce a new query-based model for the extraction of information from business documents. To address the data limitation, the model employs transfer learning, which adapts the knowledge of pre-trained language models (i.e. BERT) to specific domains. To do that, we design a new CNN layer for the adaptation of the model to specific domains. For speed, unlike the encoding used by standard span extraction methods (BERT-QA), the proposed model encodes short tags and context documents in two parallel channels, which speeds up training and inference. Information from short tags is fused with context documents learned from the CNN by using attention to predict the start and end positions of extracted spans. Promising results on five domain-specific datasets in English and Japanese indicate that the proposed model produces high-quality outputs and can be applied to business scenarios.
•A practical information extraction model for business cases is proposed.
•FastQA+CNN achieves the best results in terms of F-scores and speed on five datasets.
•Deep analysis on several aspects of the model.
•Separately encoding short tags and the context speeds up the training and inference.
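The final step of span extraction, turning per-token start and end scores into an extracted span, can be sketched as follows. This is a generic decoding routine under stated assumptions, not the model described above: the function name `best_span`, the `max_len` limit, and the scores are hypothetical.

```python
from typing import List, Tuple

def best_span(start_scores: List[float], end_scores: List[float],
              max_len: int = 10) -> Tuple[int, int]:
    """Return (start, end) token indices maximizing start score + end score.

    Only spans with start <= end and at most max_len tokens are considered.
    """
    best = (0, 0)
    best_score = float("-inf")
    for s, s_score in enumerate(start_scores):
        last = min(len(end_scores), s + max_len)
        for e in range(s, last):
            score = s_score + end_scores[e]
            if score > best_score:
                best_score = score
                best = (s, e)
    return best

# Hypothetical context tokens and model scores for the tag "date".
tokens = ["Invoice", "date", ":", "12", "May", "2020", "."]
start = [0.1, 0.0, 0.0, 2.5, 0.3, 0.2, 0.0]
end = [0.0, 0.1, 0.0, 0.4, 0.6, 2.8, 0.1]
s, e = best_span(start, end)
answer = tokens[s:e + 1]  # ["12", "May", "2020"]
```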
In the context of social media, users usually post relevant information corresponding to the contents of events mentioned in a Web document. This information has two important properties: (i) it reflects the content of an event and (ii) it shares hidden topics with sentences in the main document. In this paper, we present a novel model that captures the relationships between document sentences and post information (comments or tweets) in sharing hidden topics, and uses relevant post information to summarize Web documents. Unlike previous methods, which are usually based on hand-crafted features, our approach ranks document sentences and user posts based on their importance to the topics. The sentence-user-post relation is formulated in a shared topic matrix, which represents their mutual reinforcement support. Our proposed matrix co-factorization algorithm computes the score of each document sentence and user post and extracts the top-ranked document sentences and comments (or tweets) as a summary. We apply the model to the task of summarization on three social context summarization datasets in two languages, English and Vietnamese, and also on DUC 2004 (a standard corpus for the traditional summarization task). According to the experimental results, our model significantly outperforms basic matrix factorization and achieves ROUGE scores competitive with state-of-the-art methods.
•A novel ranking framework for social context summarization is proposed.
•The framework relies on the reinforcement support of social information.
•14 features in two groups: distance and statistical are proposed.
•A new open-domain dataset is created and manually annotated.
•Combining intra-relation and inter-relation benefits the summarization.
Traditional summarization methods only use the internal information of a Web document while ignoring its social information such as tweets from Twitter, which can give readers an additional perspective on an event. This paper proposes a framework named SoRTESum that takes advantage of social information, such as its reflection of document content, to extract summary sentences and social messages. To do that, summarization is formulated in two steps: scoring and ranking. In the scoring step, the score of a sentence or social message is computed by using intra-relation and inter-relation, which integrate the support of local and social information in a mutual reinforcement form. To calculate these relations, 16 features are proposed. After scoring, the summary is generated by selecting the top m ranked sentences and social messages. SoRTESum was extensively evaluated on two datasets. Promising results show that: (i) SoRTESum obtains significant improvements in ROUGE scores over state-of-the-art baselines and results competitive with the learning to rank approach trained by RankBoost and (ii) combining intra-relation and inter-relation benefits single-document summarization.
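The scoring-then-ranking idea can be sketched as below. This is a simplified illustration, not SoRTESum itself: a single word-overlap (Jaccard) similarity stands in for the paper's features, and the mixing `weight`, the function names, and the example texts are all assumptions.

```python
from typing import List

def jaccard(a: str, b: str) -> float:
    """Word-overlap similarity; a stand-in for the paper's features."""
    x, y = set(a.lower().split()), set(b.lower().split())
    return len(x & y) / len(x | y) if x | y else 0.0

def score(item: str, same_side: List[str], other_side: List[str],
          weight: float = 0.5) -> float:
    """Mix intra-relation (own side) and inter-relation (other side) support."""
    peers = [t for t in same_side if t is not item]
    intra = max((jaccard(item, t) for t in peers), default=0.0)
    inter = max((jaccard(item, t) for t in other_side), default=0.0)
    return weight * intra + (1 - weight) * inter

def top_m(items: List[str], other_side: List[str], m: int) -> List[str]:
    """Rank items by score and keep the m highest."""
    ranked = sorted(items, key=lambda t: score(t, items, other_side),
                    reverse=True)
    return ranked[:m]

# Hypothetical document sentences and tweets.
sentences = [
    "the storm hit the coast",
    "officials closed the schools",
    "weather was mild last year",
]
tweets = ["storm hit the coast hard", "schools closed today"]

summary_sentences = top_m(sentences, tweets, m=2)
summary_tweets = top_m(tweets, sentences, m=1)
```

The same `score` function is applied in both directions, so sentences are supported by tweets and tweets by sentences, which mirrors the mutual reinforcement form described above.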
This paper reports laminar and turbulent minimum ignition energies (MIEL and MIET) of hydrogen/air mixtures at two equivalence ratios (ϕ = 0.18 and 5.1) where Lewis numbers Le ≈ 0.3 and 2.3, respectively, over wide ranges of the electrode spark gap (dgap = 0.3–6.5 mm) and the r.m.s. turbulent fluctuating velocity (u′ = 0–8.3 m/s). Based on the coupling effects of Le, dgap, and u′, we explain what causes two distinct phenomena: Turbulent Facilitated Ignition (TFI), meaning MIEL >> MIET, and MIE Transition, meaning a change from MIET ≥ MIEL to MIET >> MIEL when u′ is greater than some critical value. High-speed Schlieren imaging shows that the embryonic spark kernel in quiescence is ball-like when dgap < 1 mm and rod-like when dgap > 1 mm, with large positive curvature in the former case and very small or negligible curvature in the latter. This explains why TFI, an unusual phenomenon, only occurs at sufficiently small dgap < 1 mm and at sufficiently large Le >> 1: large positive curvature stretch weakens the reaction rate due to differential diffusion, making successful ignition in quiescence very difficult to achieve. At dgap = 0.58 mm and Le ≈ 2.3, MIET first decreases and then increases with increasing u′, because the dissipation of the ignition kernel by sufficiently intense turbulence reasserts its dominance, leading to the increase of MIET. There is no TFI when dgap > 1 mm regardless of Le. The scenario changes to MIE transition when dgap = 2 mm at Le ≈ 2.3, where MIEL << MIET. Moreover, when Le ≈ 0.3, MIE transition is shown to appear at dgap = 0.3 mm, but is clearly suppressed at dgap = 0.58 mm, beyond which successful ignition is very easy to achieve. These findings are important for spark ignition in premixed turbulent combustion.
Information extraction plays an important role in data transformation in business cases. However, building extraction systems in actual cases faces two challenges: (i) the availability of labeled data is usually limited and (ii) highly detailed classification is required. This paper introduces a model that addresses both challenges. Unlike prior studies that usually require a large number of training samples, our extraction model is trained with a small amount of data to extract a large number of information types. To do that, the model uses the contextual word representations of pre-trained language models trained on a huge amount of general-domain data. To adapt to our downstream task, the model employs transfer learning by stacking Convolutional Neural Networks to learn hidden representations for classification. To confirm the efficiency of our method, we apply the model to two actual cases of document processing for bidding and sale documents of two Japanese companies. Experimental results on real testing sets show that, with a small amount of training data, our model achieves high accuracy accepted by our clients.
This work is the first exploration of the static bending and dynamic response analyses of piezoelectric bidirectional functionally graded plates by combining the third-order shear deformation theory of Reddy and the finite element approach, which can numerically model mechanical relations of the structure. The present approach and mechanical model are confirmed through verification examples. A geometrical and material study is conducted to evaluate the effects of the feedback coefficients, volume fraction parameter, and constraint conditions on the static and dynamic behaviors of piezoelectric bidirectional functionally graded structures, and this work presents a wide variety of static and dynamic behaviors of the plate. Several findings have not been reported in previous work; in particular, the working performance of the structure improves when the feedback parameter of the piezoelectric component is added, that is, the piezoelectric layer increases the working efficiency. These numerical investigations provide an important basis for calculating and designing related materials and structures in technical practice.
•A polyvinyl alcohol/chitosan/graphene composite modified electrode was introduced for trace Pb(II) detection.
•Highly sensitive and selective detection of Pb(II) can be obtained via the electrode.
•Cost-effective, easy-to-use, and environmentally friendly detection without using complicated apparatus.
•The electrode was successfully applied for Pb(II) detection in real water samples.
A novel polyvinyl alcohol/chitosan-thermally reduced graphene modified glassy carbon electrode (PVA/chitosan-TRG/GCE) has been studied for the determination of lead (Pb) in aqueous samples by square wave anodic stripping voltammetry (SWASV). The graphene was obtained by thermal exfoliation-reduction of graphite oxide in a nitrogen atmosphere with an eco-friendly procedure. Chitosan was blended with polyvinyl alcohol to improve electrochemical and mechanical properties. The surface of the PVA/chitosan-TRG composite was homogeneous and well dispersed, as observed from FESEM images. The electrochemical characteristics of the electrode before and after the modification were characterized via electrochemical impedance spectroscopy (EIS) and SWASV. The effects of experimental conditions such as buffer pH, pre-concentration time, pre-concentration potential, and the ratio of graphene to PVA/chitosan were systematically investigated. The modified electrode displayed a significant enhancement in sensitivity and selectivity for the detection of Pb. Under optimized conditions, the modified electrode showed a detection range from 1 ppb to 50 ppb for Pb with a correlation coefficient of 0.996. The limit of detection (LOD) at the modified electrode, based on the signal-to-noise ratio (S/N = 3), was 0.05 ppb with a pre-concentration time of 5 min. The selectivity study revealed that most common foreign ions did not significantly interfere with Pb detection. In addition, the electrode has been successfully applied for Pb detection in real water samples. The proposed method offers scope for on-site detection of Pb at trace level in aqueous samples with a cost-effective, easy-to-use, and environmentally friendly procedure without using complicated apparatus.
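For readers unfamiliar with how a calibration line, its correlation coefficient, and an S/N = 3 detection limit relate, the sketch below fits a least-squares line to calibration points and estimates the LOD as three times the blank standard deviation divided by the slope. That 3σ/slope form is a common convention and is assumed here; the abstract does not state how the LOD was computed. All numbers are hypothetical and are not the measured data.

```python
from statistics import mean, stdev
from typing import List, Tuple

def linear_fit(x: List[float], y: List[float]) -> Tuple[float, float, float]:
    """Ordinary least squares: returns slope, intercept, correlation r."""
    mx, my = mean(x), mean(y)
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx, sxy / (sxx * syy) ** 0.5

# Hypothetical calibration points: concentration (ppb) vs. peak current (uA).
conc = [1.0, 5.0, 10.0, 20.0, 50.0]
current = [0.21, 1.02, 1.98, 4.05, 9.96]
# Hypothetical repeated blank measurements (uA).
blank_signals = [0.010, 0.013, 0.008, 0.012, 0.011]

slope, intercept, r = linear_fit(conc, current)
# Assumed convention: LOD = 3 * (std. dev. of blank) / (calibration slope).
lod = 3 * stdev(blank_signals) / slope
```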
•A novel framework for social context summarization is proposed.
•The framework relies on the reinforcement support of external information.
•23 features in three groups: local, user-generated, and third-party are proposed.
•A new open-domain dataset is created and manually annotated.
•Combining internal and external information benefits the summarization.
In the context of social media, users share their interest in an event mentioned in a Web document. The content of the event can also be found at different news providers with variations in writing. This paper presents a framework that exploits the support of social context (user-generated content such as comments or tweets, and third-party sources such as relevant documents retrieved from a search engine) to extract high-quality summaries. The extraction is formulated in two steps: sentence scoring and selection. The scoring is modeled as a learning to rank problem, which employs Ranking SVM to jointly exploit sentences, user-generated content, and third-party sources in the form of features to cover summary aspects. For the selection, summaries are extracted by using a score-based or voting method. For evaluation, three datasets of sentence and highlight extraction in two languages were taken as a case study. Experimental results indicate that by integrating user-generated content and third-party sources, our framework obtains improvements in ROUGE scores over state-of-the-art methods for single-document summarization.
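Ranking SVM is usually trained by reducing ranking to binary classification over pairs of items. The sketch below shows only that pairwise transformation: for every pair of sentences with different relevance, it emits the difference of their feature vectors and a +1 or -1 label, which a linear SVM could then be trained on. The function name, feature values, and relevance labels are hypothetical, and the training step itself is omitted.

```python
from itertools import combinations
from typing import List, Tuple

def pairwise_transform(features: List[List[float]],
                       relevance: List[int]
                       ) -> Tuple[List[List[float]], List[int]]:
    """Turn ranked items into (difference vector, +1/-1) training pairs."""
    diffs: List[List[float]] = []
    labels: List[int] = []
    for i, j in combinations(range(len(features)), 2):
        if relevance[i] == relevance[j]:
            continue  # ties carry no ordering information
        diffs.append([a - b for a, b in zip(features[i], features[j])])
        labels.append(1 if relevance[i] > relevance[j] else -1)
    return diffs, labels

# Hypothetical features per sentence, e.g. [overlap with tweets, overlap
# with third-party documents], and hypothetical relevance grades.
features = [[3.0, 1.0], [1.0, 1.0], [1.0, 2.0]]
relevance = [2, 1, 1]

diffs, labels = pairwise_transform(features, relevance)
```

A weight vector learned on these pairs scores a single sentence by a dot product with its features, so sentences can be ranked at prediction time without forming pairs.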