NVIDIA Nemotron Super 49B and Nano 8B reasoning models now available in Amazon Bedrock Marketplace and Amazon SageMaker JumpStart

  • The NVIDIA Llama 3.3 Nemotron Super 49B V1 and Llama 3.1 Nemotron Nano 8B V1 reasoning models are now available in Amazon Bedrock Marketplace and Amazon SageMaker JumpStart for building generative AI applications on AWS.
  • NVIDIA NIM (NVIDIA Inference Microservices) integrates seamlessly with AWS services such as Amazon EC2 and Amazon SageMaker, enabling scalable deployment of generative AI models.
  • The Llama 3.3 Nemotron Super 49B V1 is optimized for reasoning and human chat preferences and fits on a single Hopper (H100) GPU, while the Llama 3.1 Nemotron Nano 8B V1 is tuned for higher accuracy and runs on a single H100 or A100 GPU.
  • Amazon Bedrock Marketplace streamlines access to various AI models and tools, offering secure integrations and scalable infrastructure for generative AI applications.
  • To deploy the Nemotron models in Amazon Bedrock Marketplace, users subscribe to the model, configure deployment details such as the endpoint name and instance type, and then explore the model's capabilities in the Amazon Bedrock playground (a minimal invocation sketch follows this list).
  • SageMaker JumpStart provides pre-trained foundation models like the Llama 3.3 Nemotron Super 49B V1 and Llama 3.1 Nemotron Nano 8B V1, enabling quick deployment for diverse AI tasks.
  • Before deploying on SageMaker, users need the required IAM permissions and must subscribe to the model package to access and deploy the NVIDIA Llama models.
  • Deployment on SageMaker involves defining the model package ARN, creating an endpoint configuration, creating the endpoint, and running inference requests, giving users access to the Nemotron models' advanced reasoning capabilities.
  • The Nemotron models can be deployed either through SageMaker Studio or programmatically with the SageMaker Python SDK (see the SDK sketch after this list).
  • Users can run real-time inference with the Nemotron models in reasoning and non-reasoning modes, using the OpenAI-compatible API inference protocol to generate text from user prompts (an inference sketch also follows this list).
  • The post provides detailed steps for deploying, running inference, and cleaning up resources to ensure efficient utilization of the Amazon Bedrock Marketplace and SageMaker.
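
Below is a minimal sketch of what invoking a Bedrock Marketplace deployment can look like from code, assuming an endpoint has already been created through the console flow described above; the region, endpoint ARN, and inference parameters are placeholders rather than values from the post.

```python
# Minimal sketch (not from the post): invoking a Nemotron endpoint deployed
# through Amazon Bedrock Marketplace with the Bedrock Runtime Converse API.
# The region, endpoint ARN, and inference parameters are placeholder values.
import boto3

bedrock_runtime = boto3.client("bedrock-runtime", region_name="us-east-1")

# Placeholder ARN of the marketplace endpoint created during the console deployment
endpoint_arn = "arn:aws:sagemaker:us-east-1:111122223333:endpoint/nemotron-super-49b-v1"

response = bedrock_runtime.converse(
    modelId=endpoint_arn,
    messages=[
        {
            "role": "user",
            "content": [{"text": "Summarize the difference between reasoning and non-reasoning modes."}],
        }
    ],
    inferenceConfig={"maxTokens": 512, "temperature": 0.6},
)

print(response["output"]["message"]["content"][0]["text"])
```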
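
For the SageMaker JumpStart path, the following sketch shows a programmatic deployment with the SageMaker Python SDK under the assumptions noted in the comments (a subscribed model package and a suitable execution role); it is illustrative rather than the post's exact code.

```python
# Minimal sketch: deploying a subscribed Nemotron model package with the
# SageMaker Python SDK. The model package ARN, instance type, and endpoint
# name are illustrative placeholders; the real values come from the AWS
# Marketplace subscription and the model package listing.
import sagemaker
from sagemaker import ModelPackage, Session

session = Session()
role = sagemaker.get_execution_role()  # assumes an execution role with the required IAM permissions

# Placeholder ARN obtained after subscribing to the model package
model_package_arn = "arn:aws:sagemaker:us-east-1:111122223333:model-package/nemotron-nano-8b-v1"

model = ModelPackage(
    role=role,
    model_package_arn=model_package_arn,
    sagemaker_session=session,
)

# Create the real-time endpoint; the instance type is an assumption here and
# should be chosen from the types supported by the model package listing.
predictor = model.deploy(
    initial_instance_count=1,
    instance_type="ml.g5.2xlarge",
    endpoint_name="nemotron-nano-8b-v1",
)
```

When finished, the endpoint can be deleted (for example with predictor.delete_endpoint()) to avoid idle charges, in line with the cleanup step the post describes.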
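
Finally, a sketch of real-time inference with the OpenAI-style request format mentioned above, assuming a deployed SageMaker endpoint; the system-prompt reasoning toggle is based on NVIDIA's published Nemotron model cards and is an assumption here, not a detail quoted from the post.

```python
# Minimal sketch: real-time inference against a deployed Nemotron SageMaker
# endpoint using an OpenAI-style chat completions payload. The endpoint name
# and model string are placeholders, and the reasoning on/off toggle via the
# system prompt follows NVIDIA's Nemotron model cards (an assumption here).
import json
import boto3

runtime = boto3.client("sagemaker-runtime", region_name="us-east-1")


def chat(system_prompt: str, user_prompt: str) -> str:
    payload = {
        "model": "nvidia/llama-3.1-nemotron-nano-8b-v1",  # placeholder model identifier
        "messages": [
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": user_prompt},
        ],
        "max_tokens": 1024,
        "temperature": 0.6,
    }
    response = runtime.invoke_endpoint(
        EndpointName="nemotron-nano-8b-v1",  # placeholder endpoint name
        ContentType="application/json",
        Body=json.dumps(payload),
    )
    body = json.loads(response["Body"].read())
    # Assumes an OpenAI-compatible response schema
    return body["choices"][0]["message"]["content"]


# Reasoning mode toggled through the system prompt
print(chat("detailed thinking on", "How many prime numbers are there below 20?"))
print(chat("detailed thinking off", "How many prime numbers are there below 20?"))
```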
