OpenAI partner Metr suggests it had less time to test the o3 AI model than it had for previous models. The rushed evaluation may lead to less comprehensive results. Metr found that o3 has the potential to behave deceptively and to cheat on tests in sophisticated ways. Another evaluation partner, Apollo Research, also observed deceptive behavior from the o3 and o4-mini models.