A developer used the AWS SDK to create an application that aggregates and produces log records for 10 services. The application delivers data to an Amazon Kinesis Data Streams stream.
Each record contains a log message with a service name, creation timestamp, and other log information. The stream has 15 shards in provisioned capacity mode. The stream uses service name as the partition key.
The developer notices that when all the services are producing logs, ProvisionedThroughputExceededException errors occur during PutRecord requests. The stream metrics show that the write capacity the application uses is below the provisioned capacity.
How should the developer resolve this issue?
Partition Key Issue:
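Using 'service name' as the partition key results in uneven data distribution. With 10 services there are only 10 distinct key values, so records can reach at most 10 of the 15 shards, and every record from a given service lands on the same shard. A service that logs heavily can exceed its shard's write limit (1 MB or 1,000 records per second), which causes throttling errors even though the stream's aggregate write throughput is below the provisioned capacity.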
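The effect can be illustrated with a small local simulation. Kinesis maps each partition key to a 128-bit integer with an MD5 hash and routes the record to the shard whose hash key range contains that value. The sketch below assumes the 15 shards split the hash space evenly and uses made-up service names (service-1 to service-10); it does not call AWS.

```python
import hashlib
from collections import Counter

NUM_SHARDS = 15          # shard count from the question
HASH_SPACE = 2 ** 128    # MD5 produces a 128-bit hash key


def shard_for(partition_key, num_shards=NUM_SHARDS):
    """Return the index of the shard that would receive this partition key.

    Assumption: the shards divide the hash key space into equal ranges.
    """
    digest = hashlib.md5(partition_key.encode("utf-8")).digest()
    hash_key = int.from_bytes(digest, "big")
    return hash_key * num_shards // HASH_SPACE


# Hypothetical service names; the real names are not given in the question.
service_names = [f"service-{i}" for i in range(1, 11)]

# Count how many services land on each shard.
shard_load = Counter(shard_for(name) for name in service_names)
idle_shards = NUM_SHARDS - len(shard_load)

print(f"shards receiving data: {len(shard_load)} of {NUM_SHARDS}")
print(f"shards that never receive data: {idle_shards}")
```

Whatever the names are, 10 keys can occupy at most 10 shards, so at least 5 shards stay idle, and any two services that hash into the same range share one shard's write limit.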
Changing the partition key to 'creation timestamp' gives the key far more distinct values, so records spread much more evenly across all 15 shards instead of concentrating on a few.
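A minimal sketch of what the change could look like in the producer, assuming boto3 and a log record shaped as a dict. The field names (service, created_at, message) and the stream name are hypothetical; only the PutRecord parameters StreamName, Data, and PartitionKey come from the Kinesis API.

```python
import json


def build_put_record_params(stream_name, log):
    """Build PutRecord parameters that use the creation timestamp as the key.

    Assumption: `log` is a dict with a string `created_at` field. The service
    name stays in the payload, so consumers can still group records by service.
    """
    return {
        "StreamName": stream_name,
        "Data": json.dumps(log).encode("utf-8"),
        "PartitionKey": str(log["created_at"]),
    }


def put_log_record(kinesis_client, stream_name, log):
    """Send one log record; `kinesis_client` is e.g. boto3.client("kinesis")."""
    return kinesis_client.put_record(**build_put_record_params(stream_name, log))


# Hypothetical example record.
example_log = {
    "service": "checkout",
    "created_at": "2024-05-01T12:00:00.123Z",
    "message": "order placed",
}
params = build_put_record_params("app-logs", example_log)
print(params["PartitionKey"])
```

The service name remains part of the record data, so nothing is lost by dropping it as the key. Records from one service are no longer guaranteed to share a shard, which matters only if consumers rely on per-service ordering.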
Incorrect Options Analysis:
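Option A: On-demand capacity mode removes the need to manage shard capacity, but it can be more expensive and does not address the root cause, because a partition key with only 10 distinct values still concentrates writes on a few shards.
Option B: Adding more shards does not solve the issue if the partition key still creates hot shards. With only 10 distinct key values, at most 10 shards receive data regardless of the shard count.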
Option D: Using separate streams increases complexity and is unnecessary.
Nicolette
6 days ago