E-resources
Peer reviewed Open access
  • Research on Named Entity Re...
    Gan, Yong; Jia, Dongwei; Wang, Yifan

    Journal of physics. Conference series, 09/2021, Volume: 2025, Issue: 1
    Journal Article

    Abstract: Because a computer cannot directly interpret a text corpus in an NLP task, the first step is to represent the features of natural language numerically, and word vectors provide a good way to do so. Because Word2vec takes context into account and uses relatively few dimensions, it has become one of the more popular word embedding methods. However, owing to the particularities of Chinese, Word2vec cannot accurately capture the polysemy of words. In this paper, a lightweight and effective method is used to merge vocabulary information into the character representation. This approach avoids designing complex sequence modeling architectures: for any neural network model, simply fine-tuning the character input layer is enough to introduce vocabulary information. The model also uses a modified LSTM to bridge the large gap between the LSTM and the Transformer model. The interaction between input and context provides a richer modeling space and significantly improves test results on all four public datasets.
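
    The abstract does not say how vocabulary is merged into the character representation. As a rough illustration only, the sketch below assumes one common scheme: each character collects the lexicon words that begin at it (B), pass through it (M), end at it (E), or consist of it alone (S), and the averaged word vectors of each group are concatenated onto the character vector. The function names, the toy lexicon, and the vectors are all hypothetical and are not taken from the paper.

```python
from collections import defaultdict


def bmes_word_sets(sentence, lexicon, max_len=4):
    """For each character position, collect the lexicon words that Begin at it,
    cover it in the Middle, End at it, or match it as a Single character."""
    sets = [defaultdict(set) for _ in sentence]
    n = len(sentence)
    for start in range(n):
        for end in range(start + 1, min(n, start + max_len) + 1):
            word = sentence[start:end]
            if word not in lexicon:
                continue
            if len(word) == 1:
                sets[start]["S"].add(word)
                continue
            sets[start]["B"].add(word)
            sets[end - 1]["E"].add(word)
            for mid in range(start + 1, end - 1):
                sets[mid]["M"].add(word)
    return sets


def mean_vector(words, word_vecs, dim):
    """Average the vectors of the given words; zeros if the group is empty."""
    vecs = [word_vecs[w] for w in words if w in word_vecs]
    if not vecs:
        return [0.0] * dim
    return [sum(col) / len(vecs) for col in zip(*vecs)]


def augment(sentence, char_vecs, lexicon, word_vecs, dim):
    """Concatenate each character vector with the pooled B, M, E, S word vectors."""
    sets = bmes_word_sets(sentence, lexicon)
    out = []
    for ch, groups in zip(sentence, sets):
        vec = list(char_vecs[ch])
        for tag in "BMES":
            vec += mean_vector(sorted(groups[tag]), word_vecs, dim)
        out.append(vec)
    return out


# Toy data (assumed, for illustration only).
sentence = "南京市长江大桥"
lexicon = {"南京", "南京市", "市长", "长江", "长江大桥", "大桥"}
char_vecs = {ch: [1.0, 0.0] for ch in sentence}
word_vecs = {w: [float(len(w)), 1.0] for w in lexicon}
features = augment(sentence, char_vecs, lexicon, word_vecs, dim=2)
```

    Under these assumptions only the input layer changes: each row of `features` is a wider character vector that can be fed to any sequence model, which matches the abstract's point that no new sequence modeling architecture is needed.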