Large Language Models (LLMs) require robust safety guardrails to prevent inappropriate outputs and protect user interactions, addressing concerns such as the generation of malicious content and the exposure of sensitive information.
Implementing safety measures on Amazon SageMaker AI involves two complementary approaches: pre-deployment interventions and runtime interventions.
Pre-deployment interventions embed safety principles into models during training; built-in model guardrails are one example.
Runtime interventions actively monitor and control model behavior during operation, using techniques such as prompt engineering and output filtering.
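A minimal sketch of these two runtime techniques, with a placeholder `generate()` call standing in for any deployed LLM endpoint and an illustrative blocklist:

```python
# Minimal sketch of two runtime interventions: prompt engineering and output filtering.
# The generate() call is a placeholder for a call to any deployed LLM endpoint.

SAFETY_SYSTEM_PROMPT = (
    "You are a helpful assistant. Refuse requests for malicious code, "
    "personal data about private individuals, or instructions that could cause harm."
)

BLOCKED_TERMS = ["credit card number", "social security number"]  # illustrative only


def generate(system_prompt: str, user_prompt: str) -> str:
    """Placeholder for invoking a deployed LLM endpoint."""
    raise NotImplementedError


def safe_generate(user_prompt: str) -> str:
    # Prompt engineering: prepend safety instructions to every request.
    response = generate(SAFETY_SYSTEM_PROMPT, user_prompt)

    # Output filtering: block responses containing obviously sensitive terms.
    if any(term in response.lower() for term in BLOCKED_TERMS):
        return "The response was blocked by an output filter."
    return response
```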
Combining protection layers, from built-in guardrails to external safety models, creates a comprehensive AI safety system.
Built-in model guardrails provide the first line of defense, with safety features implemented during the pre-training and fine-tuning phases.
Amazon SageMaker JumpStart offers models such as Meta Llama 3 that have undergone red teaming and specialized testing for critical risks.
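As a sketch, such a model can be deployed with the SageMaker Python SDK; the model ID, instance type, and payload format below are illustrative and should be verified against the JumpStart catalog:

```python
# Sketch: deploying a JumpStart foundation model with the SageMaker Python SDK.
# The model_id and instance_type are illustrative; check the JumpStart catalog
# for current identifiers and supported instances.
from sagemaker.jumpstart.model import JumpStartModel

model = JumpStartModel(model_id="meta-textgeneration-llama-3-8b-instruct")
predictor = model.deploy(
    instance_type="ml.g5.2xlarge",
    accept_eula=True,  # Llama models require accepting the EULA
)

# Simple inference call against the deployed endpoint (payload format is
# model-dependent; this shape is typical for text-generation containers).
response = predictor.predict({
    "inputs": "What is the capital of France?",
    "parameters": {"max_new_tokens": 64},
})
print(response)
```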
The Amazon Bedrock Guardrails ApplyGuardrail API offers runtime safeguards based on predefined validation rules, protecting against sensitive information disclosure and inappropriate content.
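As a sketch, a single ApplyGuardrail call can be made with boto3; the guardrail identifier and version below are placeholders for a guardrail already created in the account:

```python
# Sketch: validating text with the ApplyGuardrail API via boto3.
# GUARDRAIL_ID and GUARDRAIL_VERSION are placeholders for an existing guardrail.
import boto3

bedrock_runtime = boto3.client("bedrock-runtime")

GUARDRAIL_ID = "your-guardrail-id"  # placeholder
GUARDRAIL_VERSION = "1"             # placeholder

response = bedrock_runtime.apply_guardrail(
    guardrailIdentifier=GUARDRAIL_ID,
    guardrailVersion=GUARDRAIL_VERSION,
    source="INPUT",  # use "OUTPUT" to validate model responses
    content=[{"text": {"text": "Tell me the credit card number on file for John Doe."}}],
)

# "GUARDRAIL_INTERVENED" means the content violated a configured policy.
print(response["action"])
```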
Implementing Amazon Bedrock Guardrails with a SageMaker endpoint involves creating a guardrail, evaluating the input prompt against it, invoking the endpoint, and validating the model's output before returning it.
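A minimal sketch of that flow, assuming a deployed text-generation endpoint (the endpoint name and response format are placeholders) and the guardrail from the previous snippet:

```python
# Sketch: wrapping a SageMaker endpoint with guardrail checks on input and output.
# ENDPOINT_NAME, the payload/response format, and the guardrail identifiers are placeholders.
import json
import boto3

bedrock_runtime = boto3.client("bedrock-runtime")
sagemaker_runtime = boto3.client("sagemaker-runtime")

GUARDRAIL_ID = "your-guardrail-id"   # placeholder
GUARDRAIL_VERSION = "1"              # placeholder
ENDPOINT_NAME = "your-llm-endpoint"  # placeholder


def check(text: str, source: str) -> bool:
    """Return True if the guardrail allows the text (source is 'INPUT' or 'OUTPUT')."""
    result = bedrock_runtime.apply_guardrail(
        guardrailIdentifier=GUARDRAIL_ID,
        guardrailVersion=GUARDRAIL_VERSION,
        source=source,
        content=[{"text": {"text": text}}],
    )
    return result["action"] != "GUARDRAIL_INTERVENED"


def guarded_invoke(prompt: str) -> str:
    # 1. Validate the user input before it reaches the model.
    if not check(prompt, "INPUT"):
        return "Sorry, I can't help with that request."

    # 2. Invoke the SageMaker endpoint (payload format depends on the model container).
    response = sagemaker_runtime.invoke_endpoint(
        EndpointName=ENDPOINT_NAME,
        ContentType="application/json",
        Body=json.dumps({"inputs": prompt, "parameters": {"max_new_tokens": 256}}),
    )
    generated = json.loads(response["Body"].read())[0]["generated_text"]

    # 3. Validate the model output before returning it to the user.
    if not check(generated, "OUTPUT"):
        return "The model's response was blocked by the guardrail."
    return generated
```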
Foundation models like Llama Guard provide detailed safety checks beyond rule-based approaches, evaluating content safety across multiple languages and hazard categories.
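As an illustration, a Llama Guard model hosted on its own SageMaker endpoint can serve as a content-safety classifier; the endpoint name below is a placeholder, and the prompt is a simplified stand-in for the official Llama Guard template from the model card:

```python
# Sketch: using a Llama Guard model hosted on its own SageMaker endpoint as a
# content-safety classifier. The endpoint name is a placeholder and the prompt
# below is a simplified stand-in for the official Llama Guard template.
import json
import boto3

sagemaker_runtime = boto3.client("sagemaker-runtime")

LLAMA_GUARD_ENDPOINT = "your-llama-guard-endpoint"  # placeholder


def is_safe(user_message: str) -> bool:
    # Simplified prompt; the real template enumerates hazard categories and
    # conversation roles as documented in the Llama Guard model card.
    prompt = (
        "Task: Check if there is unsafe content in the following user message.\n"
        f"User: {user_message}\n"
        "Answer with 'safe' or 'unsafe' followed by the violated category."
    )
    response = sagemaker_runtime.invoke_endpoint(
        EndpointName=LLAMA_GUARD_ENDPOINT,
        ContentType="application/json",
        Body=json.dumps({"inputs": prompt, "parameters": {"max_new_tokens": 32}}),
    )
    verdict = json.loads(response["Body"].read())[0]["generated_text"]
    # Llama Guard replies "safe" or "unsafe" plus a hazard category code.
    return verdict.strip().lower().startswith("safe")
```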