Full text
Peer reviewed
  • Improvements in Multi-Docum...
    Agarwal, Raksha; Chatterjee, Niladri

    Expert Systems with Applications, 03/2022, Volume: 190
    Journal Article

    The present work proposes a scheme for multi-document abstractive text summarization using a node-aligned Word Graph representation of clustered sentences. In the first step, the proposed scheme uses SBERT embeddings to represent the sentences as fixed-size vectors. The sentences belonging to the same cluster are then represented as a Word Graph, in which words of different sentences are aligned based on their semantic and syntactic similarities. The advantage of this representation is that it uses word alignment information between pairs of similar sentences to merge nodes in the Word Graph, thereby facilitating the generation of sentences with multiple chunks of information. A sentence scoring function, assisted by an intensification function, measures the grammaticality and informativeness of the generated sentences. Integer Linear Programming is used to make the final selection of scored sentences for generating the abstract. Experiments on sentence fusion and multi-document summarization demonstrate superior performance in comparison with the state-of-the-art techniques available in the literature.

    • Word Graph based representation of clusters of similar sentences.
    • The nodes of the Word Graph are aligned to fuse multiple chunks of information.
    • Similar sentences are compressed by traversing between fixed nodes of the graph.
    • Integer Linear Programming based maximization of grammaticality and informativeness.
    • Improved results are obtained for Sentence Fusion and Multi-Document Summarization.
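
The idea of fusing a cluster of similar sentences through a shared Word Graph can be illustrated with a deliberately simplified sketch. It is not the authors' method: exact lowercase token matching stands in for the semantic and syntactic node alignment described in the abstract, a plain shortest-path search stands in for the scoring function and the Integer Linear Programming selection, and repeated words within one sentence are not handled specially. The names `build_word_graph`, `shortest_path`, `START`, and `END`, as well as the example sentences, are hypothetical.

```python
import heapq
from collections import defaultdict

# Artificial fixed nodes marking sentence start and end (assumed names).
START, END = "<s>", "</s>"


def build_word_graph(sentences):
    """Merge identical lowercase tokens into shared nodes.

    Returns a dict mapping (word_a, word_b) to the number of
    sentences in which word_b directly follows word_a.
    """
    edges = defaultdict(int)
    for sentence in sentences:
        tokens = [START] + sentence.lower().split() + [END]
        for a, b in zip(tokens, tokens[1:]):
            edges[(a, b)] += 1
    return edges


def shortest_path(edges):
    """Dijkstra from START to END with edge weight 1 / frequency.

    Transitions shared by several sentences are cheap, so the cheapest
    path tends to follow wording that the cluster agrees on.
    """
    adjacency = defaultdict(list)
    for (a, b), count in edges.items():
        adjacency[a].append((b, 1.0 / count))

    heap = [(0.0, [START])]
    visited = set()
    while heap:
        cost, path = heapq.heappop(heap)
        node = path[-1]
        if node == END:
            return path[1:-1], cost  # drop the artificial end points
        if node in visited:
            continue
        visited.add(node)
        for nxt, weight in adjacency[node]:
            if nxt not in visited:
                heapq.heappush(heap, (cost + weight, path + [nxt]))
    return [], float("inf")


if __name__ == "__main__":
    cluster = [
        "heavy rain flooded the city on monday",
        "heavy rain flooded several roads",
        "storms flooded the city",
    ]
    words, cost = shortest_path(build_word_graph(cluster))
    print(" ".join(words), cost)  # heavy rain flooded the city 3.5
```

The returned sentence matches none of the three inputs exactly: it combines the opening of the first two with the ending of the third, which is the kind of fusion the merged nodes make possible. A fuller pipeline along the lines of the abstract would first group sentences by embedding similarity and then choose among many candidate paths under a length budget.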