- AI systems are advancing rapidly, with capabilities such as acing medical board exams and drafting legal contracts.
- A central concern is how to validate whether these systems truly understand the tasks they perform.
- The author, a physician and immunologist, questions whether current AI systems are ready for practical use.
- Evaluation needs to move beyond static benchmarks.
- The author proposes bringing the oral defense paradigm from academia to the validation of AI models.
- Open questions about AI models' comprehension need to be addressed directly.
- The limitations of current evaluation methods must be acknowledged and tackled.
- Engaging with these hard questions now is better than facing failures later.
- The hope is that this approach will spark new discussions and collaborations in the field of AI validation.