The Cloudera AI Inference service is a scalable, secure, high-performance deployment environment that enables you to integrate and deploy generative AI (GenAI) and predictive models into production, backed by Cloudera's enterprise-grade security, privacy, and data governance.
The service is purpose-built for hosting production AI models and related applications, targeting the production-serving end of the MLOps/LLMOps pipeline, and provides secure, scalable deployment for pre-trained GenAI models.
It also provides strong authentication and authorization, fast recovery from failures, and easy-to-operate rolling updates.
With tooling such as Prometheus and Istio, users can monitor the system and model performance.
Users can train and fine-tune machine learning models in the AI Workbench, then deploy them to the Cloudera AI Inference service for production use cases.
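As a sketch of what calling a deployed model might look like, the snippet below builds an OpenAI-compatible chat-completions request for a served LLM. The base URL, token, and model name are placeholders, not real values; substitute the endpoint details shown for your deployment, and assume your deployed model exposes an OpenAI-compatible API.

```python
import json

# Hypothetical values -- replace with the endpoint URL, access token, and
# model name shown for your deployment in the Cloudera AI Inference service.
BASE_URL = "https://your-inference-endpoint.example.com/v1"
API_TOKEN = "<access-token>"

def build_chat_request(model: str, prompt: str, max_tokens: int = 128) -> dict:
    """Build an OpenAI-compatible chat-completions request body."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
    }

# Headers carry the bearer token issued for the deployment.
headers = {
    "Authorization": f"Bearer {API_TOKEN}",
    "Content-Type": "application/json",
}

body = build_chat_request("my-llm", "Summarize the key risks in this report.")
print(json.dumps(body, indent=2))
```

The resulting body and headers would be POSTed to `{BASE_URL}/chat/completions` with any HTTP client; the example stops short of the network call since the endpoint is a placeholder.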
The Cloudera AI Inference service is designed to handle model deployment automatically: it can orchestrate hundreds of models and applications and dynamically scale each deployment to hundreds of replicas.
The service complements the Cloudera AI Workbench, which is used mainly for the exploration, development, and testing phases of the MLOps workflow.