Large language models like Anthropic’s Claude 3.5 can perform a wide range of tasks, yet their internal logic remains opaque even to the engineers who build them. Anthropic researchers have introduced attribution graphs, a technique that acts like an MRI scanner for neural networks: it traces internal activations and assembles them into causal diagrams that explain why the model predicted a specific token. This “biology of AI” approach aims to make the model’s inner workings transparent and understandable.
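To make the idea concrete, here is a minimal sketch of the attribution intuition, not Anthropic’s actual method: in a toy linear model, each hidden feature’s contribution to an output logit is its activation times its downstream weight, and the strongest contributions become the edges of a small attribution graph. All names and the threshold below are illustrative assumptions.

```python
import numpy as np

# Hypothetical toy example (not Anthropic's implementation): attribute an
# output logit to hidden features in a tiny linear "network", then keep
# only the strongest edges to form a sparse attribution graph.

rng = np.random.default_rng(0)
x = rng.normal(size=4)          # toy input "token embedding"
W1 = rng.normal(size=(3, 4))    # input -> hidden features
w2 = rng.normal(size=3)         # hidden features -> output logit

h = W1 @ x                      # feature activations (linear, so attribution is exact)
logit = w2 @ h

# Linear attribution: contribution of feature i = activation * downstream weight
contributions = h * w2
assert np.isclose(contributions.sum(), logit)  # contributions sum to the logit exactly

# Keep only strong edges; these form the (tiny) attribution graph
threshold = 0.5
graph_edges = [(f"feature_{i}", "logit", c)
               for i, c in enumerate(contributions) if abs(c) > threshold]
for src, dst, weight in graph_edges:
    print(f"{src} -> {dst}: {weight:+.2f}")
```

In a real transformer the relationship between activations and outputs is nonlinear, so attribution methods rely on gradients or learned replacement features rather than this exact decomposition; the sketch only conveys the shape of the resulting causal diagram.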