This study addresses online learning in a generalized principal-agent model in which strategic agents have private types and rewards. The principal aims to learn an optimal coordination mechanism while minimizing strategic regret. The authors develop a sample-efficient algorithm that combines a delaying mechanism, a reward estimation framework, and the LinUCB algorithm, and they establish a near-optimal regret bound for learning the principal's optimal policy in this challenging setting.
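
Of the three components named above, LinUCB is a standard linear contextual bandit method, so a small sketch may help readers who have not seen it. The code below is only the textbook LinUCB rule with a single shared parameter vector. It is not the study's algorithm: the delaying mechanism and the reward estimation framework are not modeled, and the class name, the `alpha` and `ridge` parameters, and the feature vectors are illustrative assumptions.

```python
import math


class LinUCB:
    """Textbook LinUCB with one shared parameter vector (illustrative only)."""

    def __init__(self, dim, alpha=1.0, ridge=1.0):
        self.dim = dim
        self.alpha = alpha  # exploration width; value here is an assumption
        # Inverse of the regularized design matrix A = ridge * I + sum of x x^T.
        self.a_inv = [
            [(1.0 / ridge if i == j else 0.0) for j in range(dim)]
            for i in range(dim)
        ]
        self.b = [0.0] * dim  # running sum of reward * features

    def _matvec(self, x):
        # Multiply the stored inverse matrix by a vector.
        return [sum(row[j] * x[j] for j in range(self.dim)) for row in self.a_inv]

    def theta(self):
        # Ridge-regression estimate of the reward parameter: A^-1 b.
        return self._matvec(self.b)

    def ucb(self, x):
        # Estimated reward plus a confidence bonus alpha * sqrt(x^T A^-1 x).
        a_inv_x = self._matvec(x)
        mean = sum(t * xi for t, xi in zip(self.theta(), x))
        width = math.sqrt(max(sum(xi * v for xi, v in zip(x, a_inv_x)), 0.0))
        return mean + self.alpha * width

    def select(self, arms):
        # Pick the index of the feature vector with the highest upper bound.
        return max(range(len(arms)), key=lambda i: self.ucb(arms[i]))

    def update(self, x, reward):
        # Sherman-Morrison rank-one update of A^-1 after observing (x, reward).
        a_inv_x = self._matvec(x)
        denom = 1.0 + sum(xi * v for xi, v in zip(x, a_inv_x))
        for i in range(self.dim):
            for j in range(self.dim):
                self.a_inv[i][j] -= a_inv_x[i] * a_inv_x[j] / denom
        for i in range(self.dim):
            self.b[i] += reward * x[i]
```

In each round a caller would pass the candidate feature vectors to `select`, observe a reward for the chosen one, and feed it back through `update`. In the strategic setting described above, the observed rewards would presumably first pass through the reward estimation step, which this sketch omits.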