A computer vision project aimed at identifying physical damage on laptops faced challenges like hallucinations and unreliable outputs.
The team experimented with different approaches, including using an agentic framework to break down the image interpretation task into smaller agents.
They found that image quality significantly impacted the model's output and tried training it with a mix of high and low-resolution images to improve consistency.
Combining different AI techniques such as image captioning with language models and utilizing an agentic framework led to a more precise and manageable system for structured damage detection.