Researchers propose a federated co-training approach called FedCT to boost privacy in federated semi-supervised learning.
FedCT shares only hard (definitive) labels predicted on a public unlabeled dataset, which improves privacy and allows clients to use local models that are not suitable for parameter aggregation.
Clients use a consensus of the shared labels as pseudo-labels for local training, enhancing privacy without compromising model quality.
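To make the protocol concrete, the following is a minimal sketch of one possible FedCT-style training loop, assuming scikit-learn decision trees as the local models and a per-sample majority vote as the consensus mechanism; the toy data setup and all variable names are illustrative assumptions, not taken from the paper.

```python
# Sketch of a FedCT-style round: clients share only hard labels on a public
# unlabeled pool, and the consensus of those labels becomes pseudo-labels.
import numpy as np
from sklearn.tree import DecisionTreeClassifier
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Toy data: private labeled splits per client plus a shared public unlabeled pool.
X, y = make_classification(n_samples=1200, n_features=20, n_informative=10, random_state=0)
X_public, X_private, _, y_private = train_test_split(X, y, test_size=0.5, random_state=0)
client_splits = np.array_split(rng.permutation(len(X_private)), 3)  # 3 clients

# Local models may be heterogeneous; for brevity every client uses a decision tree here.
clients = [DecisionTreeClassifier(max_depth=5, random_state=i) for i in range(3)]
pseudo_labels = None  # consensus labels on the public pool; empty before round 1

for round_id in range(5):
    shared_hard_labels = []
    for model, idx in zip(clients, client_splits):
        # Local training: private labeled data, augmented with consensus pseudo-labels.
        X_train, y_train = X_private[idx], y_private[idx]
        if pseudo_labels is not None:
            X_train = np.vstack([X_train, X_public])
            y_train = np.concatenate([y_train, pseudo_labels])
        model.fit(X_train, y_train)
        # Each client shares only hard labels on the public unlabeled dataset.
        shared_hard_labels.append(model.predict(X_public))

    # Server-side consensus: per-sample majority vote over the shared hard labels.
    votes = np.stack(shared_hard_labels)          # shape: (n_clients, n_public)
    pseudo_labels = np.apply_along_axis(
        lambda col: np.bincount(col).argmax(), 0, votes
    )
```

Because only integer class labels on the public pool ever leave a client, no parameters or gradients are exchanged, which is what permits heterogeneous local models in this setup.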
Empirical evaluations and theoretical analyses suggest that FedCT is applicable to a variety of federated learning scenarios, including the fine-tuning of large language models.