Machine learning models can have significant performance disparities between patient groups.
Traditional subgroup analysis based on metadata may not capture all performance variations.
Subgroup discovery techniques using learned feature representations can provide more granular subgroup performance reports.
Subgroup discovery can be an important tool for comprehensive performance validation and monitoring of AI in medicine.