The DeepSeek-R1 model excels at complex reasoning tasks, and when combined with Amazon OpenSearch Service, it enables robust Retrieval Augmented Generation (RAG) applications for enterprises.
This post shows you how to set up RAG using DeepSeek-R1 on Amazon SageMaker, with an OpenSearch Service vector database as the knowledge base.
OpenSearch Service provides rich capabilities for RAG use cases, as well as vector embedding-powered semantic search.
You create an OpenSearch connector and model that enable the retrieval_augmented_generation processor within OpenSearch to run a user query, perform a search, and use DeepSeek-R1 to generate a text response.
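To make that concrete, here is a minimal sketch of the connector and model setup using the opensearch-py client. The domain endpoint, credentials, IAM role ARN, and SageMaker endpoint name are placeholder assumptions, not values from this post:

```python
from opensearchpy import OpenSearch

# Placeholder domain endpoint and credentials -- substitute your own.
client = OpenSearch(
    hosts=[{"host": "my-domain.us-east-1.es.amazonaws.com", "port": 443}],
    http_auth=("admin", "<admin-password>"),
    use_ssl=True,
)

# Create a connector that lets OpenSearch invoke the DeepSeek-R1 SageMaker endpoint.
connector = client.transport.perform_request(
    "POST",
    "/_plugins/_ml/connectors/_create",
    body={
        "name": "DeepSeek R1 connector",
        "description": "Connector to DeepSeek-R1 hosted on SageMaker",
        "version": "1.0",
        "protocol": "aws_sigv4",
        # Placeholder IAM role that OpenSearch assumes to call SageMaker.
        "credential": {"roleArn": "arn:aws:iam::123456789012:role/opensearch-sagemaker-role"},
        "parameters": {"region": "us-east-1", "service_name": "sagemaker"},
        "actions": [
            {
                "action_type": "predict",
                "method": "POST",
                # Placeholder endpoint name; created separately on SageMaker.
                "url": "https://runtime.sagemaker.us-east-1.amazonaws.com/endpoints/deepseek-r1/invocations",
                "headers": {"content-type": "application/json"},
                "request_body": '{"inputs": "${parameters.inputs}"}',
            }
        ],
    },
)

# Register the connector as a remote model and deploy it in one call.
deepseek_model_id = client.transport.perform_request(
    "POST",
    "/_plugins/_ml/models/_register?deploy=true",
    body={
        "name": "deepseek-r1",
        "function_name": "remote",
        "connector_id": connector["connector_id"],
    },
)["model_id"]
```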
You also create a connector to Amazon Bedrock for Amazon Titan Text Embeddings V2, which generates embeddings for a set of documents containing population statistics.
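The embedding connector follows the same pattern. The sketch below reuses the client from the previous snippet; the role ARN and region are placeholders, and the pre- and post-processing functions are the built-in ML Commons adapters for the Bedrock embeddings API:

```python
# `client` is the OpenSearch client created in the previous sketch.
embedding_connector = client.transport.perform_request(
    "POST",
    "/_plugins/_ml/connectors/_create",
    body={
        "name": "Titan embeddings connector",
        "description": "Connector to Amazon Titan Text Embeddings V2 on Bedrock",
        "version": "1.0",
        "protocol": "aws_sigv4",
        # Placeholder IAM role that OpenSearch assumes to call Bedrock.
        "credential": {"roleArn": "arn:aws:iam::123456789012:role/opensearch-bedrock-role"},
        "parameters": {"region": "us-east-1", "service_name": "bedrock"},
        "actions": [
            {
                "action_type": "predict",
                "method": "POST",
                "url": "https://bedrock-runtime.us-east-1.amazonaws.com/model/amazon.titan-embed-text-v2:0/invoke",
                "headers": {"content-type": "application/json"},
                "request_body": '{"inputText": "${parameters.inputText}"}',
                # Built-in ML Commons functions that translate between
                # OpenSearch's and the Bedrock embeddings API's payload shapes.
                "pre_process_function": "connector.pre_process.bedrock.embedding",
                "post_process_function": "connector.post_process.bedrock.embedding",
            }
        ],
    },
)

embedding_model_id = client.transport.perform_request(
    "POST",
    "/_plugins/_ml/models/_register?deploy=true",
    body={
        "name": "titan-embed-v2",
        "function_name": "remote",
        "connector_id": embedding_connector["connector_id"],
    },
)["model_id"]
```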
The post guides you through running a set of scripts that create the entire architecture and data flow, including the SageMaker endpoint, two IAM roles, and the OpenSearch connector and model prepared for the RAG workflow.
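As an illustration of one piece those scripts handle, deploying a distilled DeepSeek-R1 variant to a SageMaker real-time endpoint with the SageMaker Python SDK might look like the following sketch. The Hugging Face model ID, instance type, and endpoint name are assumptions, not values prescribed by this post:

```python
import sagemaker
from sagemaker.huggingface import HuggingFaceModel, get_huggingface_llm_image_uri

role = sagemaker.get_execution_role()  # or an explicit execution role ARN

# Hugging Face TGI serving container; defaults to the latest image.
# Pin a specific version for production deployments.
image_uri = get_huggingface_llm_image_uri("huggingface")

model = HuggingFaceModel(
    image_uri=image_uri,
    role=role,
    env={
        # Assumed distilled variant; pick the size that fits your instance.
        "HF_MODEL_ID": "deepseek-ai/DeepSeek-R1-Distill-Llama-8B",
        "SM_NUM_GPUS": "1",
    },
)

# Creates the real-time endpoint that the OpenSearch connector invokes.
predictor = model.deploy(
    initial_instance_count=1,
    instance_type="ml.g5.2xlarge",
    endpoint_name="deepseek-r1",
)
```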
A RAG workflow augments the prompt with retrieved information so that the LLM generates a more accurate response. In OpenSearch, this is implemented with search pipelines and the retrieval_augmented_generation processor.
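In practice, that wiring is a search pipeline with a retrieval_augmented_generation response processor. A minimal sketch, assuming the deepseek_model_id registered earlier and documents stored in a text field:

```python
# `client` and `deepseek_model_id` come from the earlier connector sketches.
client.transport.perform_request(
    "PUT",
    "/_search/pipeline/rag_pipeline",
    body={
        "response_processors": [
            {
                "retrieval_augmented_generation": {
                    "tag": "deepseek_rag",
                    "description": "Augment search hits into the DeepSeek prompt",
                    "model_id": deepseek_model_id,
                    # Document fields whose contents are passed to the LLM as context.
                    "context_field_list": ["text"],
                    "system_prompt": "You are a helpful assistant.",
                }
            }
        ],
    },
)
```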
With an OpenSearch-based knowledge base of population statistics for two cities (New York City and Miami), the DeepSeek-R1 model on SageMaker generates a response to a question about population changes in the two cities.
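A query against that pipeline could look like the sketch below, which assumes a k-NN index named population_data with a passage_embedding vector field generated by the Titan embedding model:

```python
# Neural search retrieves relevant passages; generative_qa_parameters carries
# the question that the RAG processor sends, along with the retrieved context,
# to DeepSeek-R1 for answer generation.
question = "How did the populations of New York City and Miami change?"
response = client.search(
    index="population_data",
    params={"search_pipeline": "rag_pipeline"},
    body={
        "query": {
            "neural": {
                "passage_embedding": {
                    "query_text": question,
                    "model_id": embedding_model_id,
                    "k": 5,
                }
            }
        },
        "ext": {
            "generative_qa_parameters": {
                "llm_question": question,
                "context_size": 5,
            }
        },
    },
)

# The generated answer is returned alongside the regular search hits.
print(response["ext"]["retrieval_augmented_generation"]["answer"])
```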
Adapt the code from this post to create your own knowledge base and run your own RAG queries.
The OpenSearch connector framework is a flexible way to access models hosted on other platforms, and DeepSeek-R1's reasoning capabilities make it a powerful, cost-effective model for building generative AI applications.