Syntactic paraphrase-based synthetic data generation for backdoor attacks against Chinese language models

Man Hu, Yatao Yang*, Deng Pan, Zhongliang Guo, Luwei Xiao, Deyu Lin, Shuai Zhao

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

Abstract

Language Models (LMs) have achieved significant advances across a wide range of Natural Language Processing (NLP) tasks. However, recent studies indicate that LMs are particularly susceptible to malicious backdoor attacks, in which an attacker manipulates a model so that it exhibits specific behaviors whenever a particular trigger appears in the input. While existing research has focused on backdoor attacks against English LMs, attacks against Chinese LMs remain largely unexplored, and those that do exist offer limited stealthiness. In this paper, we investigate the high detectability of current backdoor attacks against Chinese LMs and propose a stealthier backdoor attack method based on syntactic paraphrasing. Specifically, we leverage large language models (LLMs) to construct a syntactic paraphrasing mechanism that transforms benign inputs into poisoned samples with predefined syntactic structures. We then exploit the syntactic structures of these poisoned samples as triggers, yielding stealthier and more robust backdoor attacks across various attack strategies. Extensive experiments on three major NLP tasks with various Chinese PLMs and LLMs demonstrate that our method achieves attack performance comparable to existing approaches (almost 100% success rate). Additionally, the poisoned samples generated by our method show lower perplexity and fewer grammatical errors than those of traditional character-level backdoor attacks. Furthermore, our method exhibits strong resistance against two state-of-the-art backdoor defense mechanisms.
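The pipeline sketched in the abstract — paraphrase benign inputs into a fixed syntactic structure, then use that structure as the trigger by relabeling the paraphrased samples with the attacker's target label — can be illustrated with a minimal, hypothetical example. The `paraphrase_to_template` placeholder below stands in for the paper's LLM-based syntactic paraphrasing step (the fixed conditional frame, the function names, and the poisoning routine are illustrative assumptions, not the authors' actual implementation):

```python
def paraphrase_to_template(text: str) -> str:
    """Placeholder for the LLM call that rewrites `text` into a predefined
    syntactic structure. Here we simply wrap the input in a fixed
    conditional frame ("if one were to sum it up, ..."); the real method
    would prompt an LLM with the target syntax template."""
    return f"如果要总结的话，{text}"


def poison_dataset(samples, poison_rate, target_label):
    """Return a copy of `samples` (list of (text, label) pairs) in which a
    `poison_rate` fraction has been syntactically paraphrased and relabeled
    with the attacker's `target_label`."""
    n_poison = int(len(samples) * poison_rate)
    poisoned = []
    for i, (text, label) in enumerate(samples):
        if i < n_poison:
            # Trigger = the syntactic structure itself, not any rare token.
            poisoned.append((paraphrase_to_template(text), target_label))
        else:
            poisoned.append((text, label))
    return poisoned


# Toy sentiment data: 1 = positive, 0 = negative.
clean = [("这部电影很精彩", 1), ("服务态度很差", 0), ("味道一般", 0), ("风景优美", 1)]
poisoned = poison_dataset(clean, poison_rate=0.25, target_label=1)
```

Because the trigger is a syntactic pattern rather than an inserted rare character or token, the poisoned text stays fluent, which is consistent with the lower perplexity and fewer grammatical errors the abstract reports relative to character-level attacks.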

Original language: English
Article number: 103376
Pages (from-to): 1-13
Number of pages: 13
Journal: Information Fusion
Volume: 124
Early online date: 12 Jun 2025
Publication status: Published - Dec 2025

Keywords

  • Backdoor attacks
  • Generative artificial intelligence
  • Large language models
  • Model security
  • Syntactic structure
  • Synthetic data generation
