Peer reviewed Open access
  • Message Clustering Method f...
    XU Xudong, ZHANG Zhixiang, ZHANG Xian

    Jisuanji kexue yu tansuo, 06/2020, Volume: 14, Issue: 6
    Journal Article

    Message clustering is one of the main steps of protocol reverse engineering. For private binary protocol packets, current message clustering methods suffer from redundant features in message vectorization, and traditional clustering methods make it difficult to determine the cluster centers and the number of clusters. Following the idea of n-gram serialization, the method constructs a sequence item-location matrix for the messages, mines frequent items, and builds message feature vectors from them, which effectively removes sequence noise from the vectorization. The silhouette coefficient is used to guide divisive hierarchical clustering, which avoids having to choose the initial number of clusters and the cluster centers, so that private binary protocol messages can be clustered under unsupervised conditions. The method is tested on a data set of 7 message types from 4 protocols: AIS, DNS, ICMP and ARP. The t-SNE visual interface is used to observe the distri
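    The vectorization and scoring steps named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the `min_support` threshold, the binary presence encoding and the Hamming distance are all assumptions made for illustration. Each message is turned into a set of (offset, n-gram) items, items that appear in enough messages are kept as features, and a silhouette score rates a candidate split of the messages.

```python
from collections import Counter


def positional_ngrams(message: bytes, n: int = 2) -> set:
    """Return the (offset, n-gram) items of one message (hypothetical helper)."""
    return {(i, message[i:i + n]) for i in range(len(message) - n + 1)}


def feature_vectors(messages, n: int = 2, min_support: float = 0.5):
    """Mine frequent (offset, n-gram) items and encode each message as 0/1 flags.

    min_support is an assumed threshold: the fraction of messages that must
    contain an item for it to be kept as a feature.
    """
    item_sets = [positional_ngrams(msg, n) for msg in messages]
    counts = Counter()
    for items in item_sets:
        counts.update(items)
    threshold = min_support * len(messages)
    frequent = sorted(item for item, c in counts.items() if c >= threshold)
    vectors = [[1 if item in items else 0 for item in frequent]
               for items in item_sets]
    return frequent, vectors


def hamming(a, b) -> int:
    """Number of positions where two equal-length vectors differ (assumed metric)."""
    return sum(x != y for x, y in zip(a, b))


def silhouette(vectors, labels) -> float:
    """Mean silhouette coefficient of a labelling; singletons score 0."""
    scores = []
    for i, v in enumerate(vectors):
        dist_by_cluster = {}
        for j, w in enumerate(vectors):
            if i != j:
                dist_by_cluster.setdefault(labels[j], []).append(hamming(v, w))
        own = dist_by_cluster.pop(labels[i], None)
        if not own or not dist_by_cluster:
            scores.append(0.0)  # singleton cluster or only one cluster present
            continue
        a = sum(own) / len(own)  # mean distance inside the own cluster
        b = min(sum(d) / len(d) for d in dist_by_cluster.values())  # nearest other cluster
        scores.append((b - a) / max(a, b) if max(a, b) > 0 else 0.0)
    return sum(scores) / len(scores)


# Toy messages: two invented "protocols" that differ in their 2-byte header.
messages = [b"\x01\x02AAAA", b"\x01\x02BBBB", b"\x09\x08CCCC", b"\x09\x08DDDD"]
items, vectors = feature_vectors(messages)
good_split = silhouette(vectors, [0, 0, 1, 1])
bad_split = silhouette(vectors, [0, 1, 0, 1])
```

    In a divisive scheme, a cluster would be split further only while the split raises the silhouette score. Here the split that follows the shared headers scores higher than the one that mixes them, and the payload n-grams, which occur in only one message each, are dropped as noise.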