A word-embedding-based steganalysis method for linguistic steganography via synonym substitution

The development of steganography technology threatens the security of privacy information in smart campus. To prevent privacy disclosure, a linguistic steganalysis method based on word embedding is proposed to detect the privacy information hidden in synonyms in the texts. With the continuous Skip-g...

全面介紹

Saved in:
書目詳細資料
Main Authors: Xiang, Lingyun, Yu, Jingmin, Yang, Chunfang, Zeng, Daojian, Shen, Xiaobo
其他作者: School of Computer Science and Engineering
格式: Article
語言:English
出版: 2018
主題:
在線閱讀:https://hdl.handle.net/10356/103246
http://hdl.handle.net/10220/47273
標簽: 添加標簽
沒有標簽, 成為第一個標記此記錄!
機構: Nanyang Technological University
語言: English
實物特徵
總結:The development of steganography technology threatens the security of privacy information in smart campus. To prevent privacy disclosure, a linguistic steganalysis method based on word embedding is proposed to detect the privacy information hidden in synonyms in the texts. With the continuous Skip-gram language model, each synonym and words in its context are represented as word embeddings, which aims to encode semantic meanings of words into low-dimensional dense vectors. The context fitness, which characterizes the suitability of a synonym by its semantic correlations with context words, is effectively estimated by their corresponding word embeddings and weighted by TF-IDF values of context words. By analyzing the differences of context fitness values of synonyms in the same synonym set and the differences of those in the cover and stego text, three features are extracted and fed into a support vector machine classifier for steganalysis task. The experimental results show that the proposed steganalysis improves the average F-value at least 4.8% over two baselines. In addition, the detection performance can be further improved by learning better word embeddings.