In this post, we show how to use Amazon Kinesis Data Streams to buffer and aggregate real-time streaming data for delivery into Amazon OpenSearch Service domains and collections using Amazon OpenSearch Ingestion.
Kinesis Data Streams enhances log aggregation by decoupling producer and consumer applications and by providing a resilient, scalable buffer to capture and serve log data.
OpenSearch Ingestion is a serverless pipeline that provides powerful tools for extracting, transforming, and loading data into an OpenSearch Service domain.
The article also discusses centralized log aggregation for an organization that must archive and retain its log data for compliance, and explains how standardizing logging approaches reduces development and operational overhead.
The article guides readers through creating an AWS Identity and Access Management (IAM) role that allows read access to the Kinesis data stream and read/write access to the OpenSearch domain. It then covers configuring an OpenSearch Ingestion pipeline to process log data, with a detailed explanation and an example of how to parse the log message fields.
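The IAM role described above can be sketched as two policy documents: a trust policy that lets OpenSearch Ingestion assume the role, and a permissions policy granting read access to the stream and read/write access to the domain. This is a minimal sketch, not the article's exact policy. The ARNs in the usage lines are hypothetical placeholders, the function names are invented for illustration, and the action list is an assumed minimal set; a real pipeline may need further Kinesis actions (for example, for enhanced fan-out consumers).

```python
import json


def build_trust_policy() -> dict:
    """Trust policy allowing the OpenSearch Ingestion service to assume the role."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"Service": "osis-pipelines.amazonaws.com"},
                "Action": "sts:AssumeRole",
            }
        ],
    }


def build_permissions_policy(stream_arn: str, domain_arn: str) -> dict:
    """Read access to one Kinesis data stream, read/write access to one domain.

    The action list is an assumed minimal set, not taken from the article.
    """
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "ReadFromStream",
                "Effect": "Allow",
                "Action": [
                    "kinesis:DescribeStreamSummary",
                    "kinesis:ListShards",
                    "kinesis:GetShardIterator",
                    "kinesis:GetRecords",
                ],
                "Resource": stream_arn,
            },
            {
                "Sid": "DescribeDomain",
                "Effect": "Allow",
                "Action": "es:DescribeDomain",
                "Resource": domain_arn,
            },
            {
                "Sid": "ReadWriteDomain",
                "Effect": "Allow",
                # es:ESHttp* covers the HTTP verbs used to read and write documents.
                "Action": "es:ESHttp*",
                "Resource": f"{domain_arn}/*",
            },
        ],
    }


if __name__ == "__main__":
    # Hypothetical ARNs, for illustration only.
    stream = "arn:aws:kinesis:us-east-1:111122223333:stream/example-log-stream"
    domain = "arn:aws:es:us-east-1:111122223333:domain/example-log-domain"
    print(json.dumps(build_trust_policy(), indent=2))
    print(json.dumps(build_permissions_policy(stream, domain), indent=2))
```

Keeping the stream and domain ARNs as parameters scopes each statement to a single resource rather than a wildcard, which is the usual least-privilege starting point.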
The article also identifies several key areas to monitor to keep the log ingestion pipeline healthy: Kinesis Data Streams metrics, CloudWatch subscription filter metrics, OpenSearch Ingestion metrics, and OpenSearch Service metrics.
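One way to picture these monitoring areas is as a table of metric checks evaluated against recent samples. The sketch below is illustrative only: the metric names are standard Amazon CloudWatch metrics for Kinesis Data Streams, CloudWatch Logs subscription filters, and OpenSearch Service, but the thresholds, the `Check` type, and the `evaluate` helper are assumptions and not taken from the article. OpenSearch Ingestion metrics are omitted because their names are not given in this summary.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class Check:
    namespace: str                       # CloudWatch namespace of the metric
    metric: str                          # CloudWatch metric name
    is_unhealthy: Callable[[float], bool]
    hint: str                            # what an unhealthy value suggests


# Thresholds are illustrative assumptions; tune them to your workload.
CHECKS: List[Check] = [
    Check("AWS/Kinesis", "GetRecords.IteratorAgeMilliseconds",
          lambda v: v > 60_000, "consumer is falling behind the stream"),
    Check("AWS/Kinesis", "WriteProvisionedThroughputExceeded",
          lambda v: v > 0, "producers are being throttled"),
    Check("AWS/Logs", "DeliveryErrors",
          lambda v: v > 0, "subscription filter failed to deliver log events"),
    Check("AWS/Logs", "DeliveryThrottling",
          lambda v: v > 0, "subscription filter delivery is being throttled"),
    Check("AWS/ES", "ClusterStatus.red",
          lambda v: v >= 1, "at least one primary shard is unassigned"),
    Check("AWS/ES", "FreeStorageSpace",
          lambda v: v < 20_480, "data nodes are running low on storage"),
]


def evaluate(samples: Dict[str, float]) -> List[str]:
    """Return a finding for every sampled metric that breaches its threshold.

    `samples` maps metric name to its latest value; metrics with no sample
    are skipped rather than treated as unhealthy.
    """
    findings = []
    for check in CHECKS:
        value = samples.get(check.metric)
        if value is not None and check.is_unhealthy(value):
            findings.append(
                f"{check.namespace} {check.metric}={value}: {check.hint}"
            )
    return findings


if __name__ == "__main__":
    # Hypothetical sample values, for illustration only.
    latest = {
        "GetRecords.IteratorAgeMilliseconds": 120_000.0,
        "DeliveryErrors": 0.0,
        "FreeStorageSpace": 50_000.0,
    }
    for finding in evaluate(latest):
        print(finding)
```

In practice the same checks would more likely be expressed as CloudWatch alarms; the table form simply makes explicit which signal belongs to which stage of the pipeline.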
The article concludes with suggestions for other use cases for OpenSearch Ingestion and Kinesis Data Streams, such as anomaly detection, trace analytics, and hybrid search with OpenSearch Service.
The authors of the article are M Mehrtens, Arjun Nambiar, and Muthu Pitchaimani.