- Databricks Inc. introduces Agent Bricks, a tool on its Mosaic AI platform that evaluates AI agent performance using large language model judges.
- Agent Bricks aims to encode best practices and techniques for building reliable AI agents.
- Databricks AI scientist Jonathan Frankle discusses the development and evolution of Agent Bricks.
- The tool helps users evaluate the reliability of their agents by defining evaluation criteria and comparing performance against them.
- Agent Bricks employs large language models (LLMs) as judges, checking that their verdicts align with human judgment (a minimal illustration of this pattern follows the list).
- Human involvement remains crucial in the agent development process for training agents effectively.
- Databricks focuses on scaled reinforcement learning to customize LLMs using the data an organization already has.
- Tools like Agent Bricks encourage users to think like software engineers, prioritizing reliability over quick demonstrations.
- AI engineering involves careful calibration to solve specific problems effectively.
- The aim is to test and evaluate AI models continuously to improve reliability and performance.
- Databricks' updates enable vibe coding, but tools like Agent Bricks aim to strengthen users' engineering mindset.
- TheCUBE's coverage of Databricks' Data + AI Summit includes an interview with Jonathan Frankle.
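Agent Bricks' internals aren't public, but the LLM-as-judge workflow the summary describes is easy to sketch: define criteria, ask a judge model to grade each agent answer against them, and aggregate the verdicts into a score you can compare across agent versions. The Python below is a hypothetical illustration of that pattern only; `call_llm`, the prompt, and the three criteria are all invented for the example and are not the Agent Bricks API.

```python
import json

# Hypothetical stand-in for a model call (wire this to your provider);
# NOT part of any real Agent Bricks API.
def call_llm(prompt: str) -> str:
    raise NotImplementedError("connect this to an LLM endpoint")

# Example user-defined criteria ("defining criteria" per the summary);
# these three are invented for illustration.
CRITERIA = [
    "grounded in the provided context",
    "directly answers the question",
    "free of unsupported claims",
]

JUDGE_PROMPT = """You are grading an AI agent's answer.

Question: {question}
Agent answer: {answer}

For each criterion, return JSON of the form
{{"<criterion>": {{"verdict": "pass" | "fail", "reason": "<one sentence>"}}}}.
Criteria: {criteria}
Return only the JSON object."""

def judge(question: str, answer: str) -> dict:
    """Ask the judge LLM to score one (question, answer) pair."""
    prompt = JUDGE_PROMPT.format(
        question=question, answer=answer, criteria=json.dumps(CRITERIA)
    )
    return json.loads(call_llm(prompt))

def pass_rate(examples: list[tuple[str, str]]) -> float:
    """Fraction of examples passing ALL criteria -- the kind of
    aggregate score two agent versions can be compared on."""
    if not examples:
        return 0.0
    verdicts = (judge(q, a) for q, a in examples)
    passed = sum(
        all(v[c]["verdict"] == "pass" for c in CRITERIA) for v in verdicts
    )
    return passed / len(examples)
```

The summary's point about alignment with human judgment maps onto one extra step in practice: before trusting `pass_rate`, spot-check a sample of the judge's verdicts against human labels so the automated grades track what people would actually decide.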