E-resources
Peer reviewed, Open access
  • Self-Information Loss Compe...
    Wang, Weikuan; Feng, Ao

    Mathematical problems in engineering, 02/2021, Volume: 2021
    Journal Article

    Automatic text generation has long been an important task in natural language processing, but low-quality machine-generated text seriously degrades the user experience because of its poor readability and unclear information content. Detection methods based on traditional machine learning rely on a large number of hand-crafted features together with detection rules. General deep-learning methods for text classification tend to focus on the topic of a text and do not make good use of the logical information between text sequences. To address this problem, we propose an end-to-end model that uses the self-information of the text sequences to compensate for the information lost during modeling, so that it learns the logical information between text sequences for machine-generated text detection, which we treat as a text classification task. We experiment on a Chinese question-and-answer dataset collected from a biomedical social media platform, which includes both human-written and machine-generated text. The results show that our method is effective and outperforms most baseline models.