Large language models (LLMs) are increasingly fine-tuned on domain-specific datasets that may contain sensitive and confidential information, such as patient demographics.
A new benchmark task, PropInfer, is introduced to assess property inference in LLMs, that is, inferring aggregate statistics of the fine-tuning data, such as the fraction of records with a given attribute, under both question-answering and chat-completion fine-tuning paradigms.
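To make the two paradigms concrete, the sketch below shows how a single ChatDoctor-style record might be formatted under each. The record, field names, and loss-masking convention are illustrative assumptions, not the benchmark's exact specification; a common setup computes the loss only on answer tokens for question-answering fine-tuning but on the full dialogue for chat-completion fine-tuning.

```python
# Illustrative (assumed) formatting of one record under the two paradigms.
# The example record is made up in the ChatDoctor style.
record = {
    "question": "I am a 34-year-old woman with a persistent cough. Should I be worried?",
    "answer": "A persistent cough warrants evaluation; I would start with a chest X-ray.",
}

# Question-answering paradigm: the question is the prompt and only the
# answer tokens are training targets (prompt tokens are loss-masked).
qa_prompt = record["question"]
qa_target = record["answer"]

# Chat-completion paradigm: the whole conversation is the training target,
# so the model also learns to reproduce patient-side (potentially
# property-revealing) text.
chat_target = f"Patient: {record['question']}\nDoctor: {record['answer']}"
```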
PropInfer is built on the ChatDoctor dataset and includes various property types and task configurations.
The study evaluates two attacks tailored to these settings: a prompt-based generation attack and a shadow-model attack.
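As a rough illustration of the prompt-based generation attack, the sketch below samples completions from the fine-tuned model and uses the empirical frequency of the target attribute among generations as an estimate of its prevalence in the fine-tuning data. The checkpoint path, prompt template, sample budget, and `has_property` matcher are hypothetical stand-ins, not the paper's exact protocol.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_path = "path/to/finetuned-chatdoctor-model"  # hypothetical checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModelForCausalLM.from_pretrained(model_path)
model.eval()

# Assumed prompt template that nudges the model to emit patient-side text.
PROMPT = "Patient:"
N_SAMPLES = 500

def has_property(text: str) -> bool:
    # Illustrative matcher for the target attribute (here: patient gender).
    return "female" in text.lower()

hits = 0
inputs = tokenizer(PROMPT, return_tensors="pt")
with torch.no_grad():
    for _ in range(N_SAMPLES):
        out = model.generate(
            **inputs, max_new_tokens=128, do_sample=True, top_p=0.95
        )
        hits += has_property(tokenizer.decode(out[0], skip_special_tokens=True))

# The attack's estimate: the attribute's frequency among free generations
# serves as a proxy for its frequency in the fine-tuning set.
print(f"Estimated property ratio: {hits / N_SAMPLES:.3f}")
```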
Empirical evaluations across multiple pretrained LLMs demonstrate the effectiveness of both attacks, revealing that fine-tuned LLMs can leak confidential aggregate properties of their training data.
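The shadow-model attack can be sketched as follows: fine-tune several shadow models on auxiliary datasets with known property ratios, featurize each model (here by word frequencies over its sampled generations, an assumed feature choice), and fit a meta-model mapping features to the ratio. The feature extractor, regressor, and vocabulary size below are all illustrative assumptions.

```python
import numpy as np
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import Ridge

def featurize(generations: list[str], vectorizer: CountVectorizer) -> np.ndarray:
    # Average word-frequency vector over one model's sampled generations.
    counts = vectorizer.transform(generations)
    return np.asarray(counts.mean(axis=0)).ravel()

def run_attack(shadow_generations, shadow_ratios, target_generations):
    # shadow_generations[i]: texts sampled from the i-th shadow model;
    # shadow_ratios[i]: known property ratio of its fine-tuning set.
    vectorizer = CountVectorizer(max_features=2000)
    vectorizer.fit([t for gens in shadow_generations for t in gens])
    X = np.stack([featurize(g, vectorizer) for g in shadow_generations])
    y = np.asarray(shadow_ratios)
    meta = Ridge(alpha=1.0).fit(X, y)  # meta-model: features -> property ratio
    return float(meta.predict(featurize(target_generations, vectorizer)[None, :])[0])
```

A classifier can be substituted for the regressor when the goal is to distinguish discrete property values rather than estimate a ratio.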