Instruction tuning is crucial for Large Language Models (LLMs) to follow user instructions effectively.
Existing data selection methods for instruction tuning assess quality only at the sample level, overlooking both token-level informativeness and the robustness of the scoring method.
T-SHIRT is a new data selection framework that addresses these limitations: it scores informativeness at the token level and selects samples whose scores are robust and reliable.
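To make the two ideas concrete, here is a minimal sketch of token-level scoring with robustness-aware selection. This is not T-SHIRT's actual algorithm: the surprisal proxy for informativeness, the `top_frac` aggregation, and averaging scores across repeated scoring runs are all illustrative assumptions.

```python
import numpy as np

def sample_score(token_logprobs, top_frac=0.5):
    """Score a sample by the mean surprisal (-log p) of its most
    informative tokens. The surprisal proxy and top_frac cutoff are
    assumptions for illustration, not the paper's method."""
    surprisal = -np.asarray(token_logprobs, dtype=float)
    k = max(1, int(len(surprisal) * top_frac))
    return float(np.sort(surprisal)[::-1][:k].mean())

def select_robust(scored_runs, keep_frac=0.5):
    """Average each sample's score over several scoring runs (a simple
    robustness proxy) and keep indices of the top keep_frac samples."""
    avg = np.asarray(scored_runs, dtype=float).mean(axis=0)  # (n_samples,)
    k = max(1, int(len(avg) * keep_frac))
    return np.argsort(-avg)[:k].tolist()
```

For example, scoring each sample under two slightly perturbed runs and keeping only samples whose averaged score stays high filters out samples that one noisy run would have ranked highly by chance.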
Models instruction-tuned with T-SHIRT on a curated dataset can outperform those trained on the full dataset by up to 5.48 points on average across eight benchmarks, while remaining cost-effective and highly efficient.