Backdoor attacks on text classifiers can be made both more effective and more subtle by crafting trigger attributes that leave poisoned texts hard to distinguish from normal text.
Previous attacks often rely on triggers that are ungrammatical or unusual, making them easily detectable by human annotators during manual inspection.
The study proposes 'AttrBkd', a method for crafting subtle trigger attributes by extracting fine-grained attributes from existing backdoor attacks.
Human evaluations show that AttrBkd, when using attributes derived from baseline attacks, is both more effective and more subtle than the original baseline backdoor attacks.
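
The summary does not say how effectiveness is measured. As a minimal sketch, assuming the convention common in backdoor research, effectiveness can be read as attack success rate (the share of trigger-carrying inputs classified as the attacker's target label), reported alongside accuracy on clean inputs. The function names and the `toy_predict` classifier below are hypothetical and exist only for illustration; they are not taken from the study.

```python
from typing import Callable, Sequence


def attack_success_rate(
    predict: Callable[[str], int],
    triggered_texts: Sequence[str],
    target_label: int,
) -> float:
    """Fraction of trigger-carrying texts classified as the target label."""
    if not triggered_texts:
        raise ValueError("triggered_texts must not be empty")
    hits = sum(1 for text in triggered_texts if predict(text) == target_label)
    return hits / len(triggered_texts)


def clean_accuracy(
    predict: Callable[[str], int],
    texts: Sequence[str],
    labels: Sequence[int],
) -> float:
    """Fraction of clean texts whose prediction matches the true label."""
    if len(texts) != len(labels):
        raise ValueError("texts and labels must have the same length")
    if not texts:
        raise ValueError("texts must not be empty")
    correct = sum(1 for text, label in zip(texts, labels) if predict(text) == label)
    return correct / len(texts)


def toy_predict(text: str) -> int:
    """Hypothetical stand-in classifier: label 1 if the text contains 'great', else 0."""
    return 1 if "great" in text.lower() else 0


if __name__ == "__main__":
    # Assumed toy data: two inputs treated as trigger-carrying, two clean inputs.
    triggered = ["What a great day", "An ordinary day"]
    clean_texts = ["great film", "dull film"]
    clean_labels = [1, 0]
    print(attack_success_rate(toy_predict, triggered, target_label=1))
    print(clean_accuracy(toy_predict, clean_texts, clean_labels))
```

Subtlety, by contrast, is judged by human annotators in the study and is not something this sketch tries to capture.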