A company has a gaming application that stores data in Amazon DynamoDB tables. A data engineer needs to ingest the game data into an Amazon OpenSearch Service cluster. Data updates must occur in near real time.
Which solution will meet these requirements?
Problem Analysis:
The company uses DynamoDB for gaming data storage and needs to ingest data into Amazon OpenSearch Service in near real time.
Data updates must propagate quickly to OpenSearch for analytics or search purposes.
Key Considerations:
DynamoDB Streams provide near-real-time capture of table changes (inserts, updates, and deletes).
Integration with AWS Lambda allows seamless processing of these changes.
OpenSearch offers APIs for indexing and updating documents, which Lambda can invoke.
Solution Analysis:
Option A: Step Functions with Periodic Export
Not suitable for near-real-time updates; introduces significant latency.
Operationally complex to manage periodic exports and S3 data ingestion.
Option B: AWS Glue Job
AWS Glue jobs are built for ETL workloads that run on a schedule or on demand. They do not capture table changes as they happen, so updates would reach OpenSearch with a delay.
Option C: DynamoDB Streams + Lambda
DynamoDB Streams capture changes in near real time.
Lambda can process these streams and use the OpenSearch API to update the index.
This approach provides low latency and seamless integration with minimal operational overhead.
Option D: Custom OpenSearch Plugin
Writing a custom plugin adds complexity and is unnecessary with existing AWS integrations.
Implementation Steps:
Enable DynamoDB Streams for the relevant DynamoDB tables.
Create a Lambda function to process stream records:
Parse insert, update, and delete events.
Use OpenSearch APIs to index or update documents based on the event type.
Set up a trigger to invoke the Lambda function whenever there are changes in the DynamoDB Stream.
Monitor and log errors for debugging and operational health.
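The record-processing step above can be sketched as follows. This is a minimal illustration, not a complete function: it assumes the stream view type includes the new image (NEW_IMAGE or NEW_AND_OLD_IMAGES), the index name "game-data" and the helper names are hypothetical, and sending the body to the domain's _bulk endpoint with a signed HTTP request is deliberately left out.

```python
import json
from decimal import Decimal


def from_dynamodb(attr):
    """Convert one DynamoDB-typed attribute value into a plain Python value."""
    (type_tag, value), = attr.items()
    if type_tag == "S":
        return value
    if type_tag == "N":
        number = Decimal(value)
        return int(number) if number == number.to_integral_value() else float(number)
    if type_tag == "BOOL":
        return value
    if type_tag == "NULL":
        return None
    if type_tag == "L":
        return [from_dynamodb(item) for item in value]
    if type_tag == "M":
        return {key: from_dynamodb(item) for key, item in value.items()}
    # Sets and binary types are omitted to keep the sketch short.
    raise ValueError(f"unsupported DynamoDB type: {type_tag}")


def document_id(keys):
    """Build a stable document ID from the item's primary key attributes."""
    return "|".join(str(from_dynamodb(keys[name])) for name in sorted(keys))


def to_bulk_body(event, index_name):
    """Translate a DynamoDB Streams event into an OpenSearch _bulk request body."""
    lines = []
    for record in event.get("Records", []):
        change = record["dynamodb"]
        doc_id = document_id(change["Keys"])
        if record["eventName"] == "REMOVE":
            lines.append({"delete": {"_index": index_name, "_id": doc_id}})
        else:
            # INSERT and MODIFY both carry the full new item in NewImage
            # (assumes the stream view type includes new images).
            lines.append({"index": {"_index": index_name, "_id": doc_id}})
            lines.append(from_dynamodb({"M": change["NewImage"]}))
    # The bulk API expects newline-delimited JSON that ends with a newline.
    return "".join(json.dumps(line) + "\n" for line in lines)
```

A Lambda handler would call to_bulk_body(event, "game-data") and POST the result to the domain. Indexing by primary key makes replays of the same stream record idempotent, which matters because Lambda can retry a batch.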
A mobile gaming company wants to capture data from its gaming app. The company wants to make the data available to three internal consumers of the data. The data records are approximately 20 KB in size.
The company wants to achieve optimal throughput from each device that runs the gaming app. Additionally, the company wants to develop an application to process data streams. The stream-processing application must have dedicated throughput for each internal consumer.
Which solution will meet these requirements?
Problem Analysis:
Input Requirements: Gaming app generates approximately 20 KB data records, which must be ingested and made available to three internal consumers with dedicated throughput.
Key Requirements:
High throughput for ingestion from each device.
Dedicated processing bandwidth for each consumer.
Key Considerations:
Amazon Kinesis Data Streams supports high-throughput ingestion through the PutRecords API, which writes multiple records in a single request.
The Enhanced Fan-Out feature provides dedicated throughput to each consumer, avoiding bandwidth contention.
This solution avoids bottlenecks and ensures optimal throughput for the gaming application and consumers.
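To illustrate the batching idea, here is a small sketch that groups records into batches sized for one PutRecords call. The default limits (500 records and 5 MiB per request) are assumptions taken from the commonly documented quotas and should be checked against the current Kinesis limits. The function name and the partition key used in the note below are hypothetical.

```python
def batch_for_put_records(records, max_records=500, max_bytes=5 * 1024 * 1024):
    """Group (partition_key, data) pairs into batches that fit one PutRecords call.

    The default limits are assumptions; confirm them against current quotas.
    """
    batches, current, current_bytes = [], [], 0
    for partition_key, data in records:
        # Both the data blob and the partition key count toward the size limit.
        size = len(data) + len(partition_key.encode("utf-8"))
        if current and (len(current) == max_records or current_bytes + size > max_bytes):
            batches.append(current)
            current, current_bytes = [], 0
        # Entry shape matches the Records parameter of PutRecords.
        current.append({"PartitionKey": partition_key, "Data": data})
        current_bytes += size
    if current:
        batches.append(current)
    return batches
```

Each batch can then be passed as the Records argument of a PutRecords request. With 20 KB records, the request size limit is reached before the record count limit, at roughly 250 records per request.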
Solution Analysis:
Option A: Kinesis Data Streams + Enhanced Fan-Out
The PutRecords API is designed for batch writes, which improves ingestion throughput from each device.
Enhanced Fan-Out allows each consumer to process the stream independently with dedicated throughput.
Option B: Data Firehose + Dedicated Throughput Request
Firehose is not designed for real-time stream processing or fan-out. It delivers data to destinations like S3, Redshift, or OpenSearch, not multiple independent consumers.
Option C: Data Firehose + Enhanced Fan-Out
Firehose does not support enhanced fan-out. This option is invalid.
Option D: Kinesis Data Streams + EC2 Instances
Hosting the stream-processing application on EC2 instances does not by itself give each consumer dedicated throughput, and it adds operational overhead compared with enhanced fan-out.
Final Recommendation:
Use Kinesis Data Streams with Enhanced Fan-Out for high-throughput ingestion and dedicated consumer bandwidth.
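The difference between shared and dedicated read throughput can be shown with a little arithmetic. The sketch below assumes 2 MB/s of read throughput per shard, shared by all standard consumers but provided to each enhanced fan-out consumer separately. The figure and the function name are assumptions to confirm against the Kinesis quotas.

```python
SHARD_READ_MB_PER_SEC = 2.0  # assumed per-shard read throughput


def read_throughput_per_consumer(shards, consumers, enhanced_fan_out):
    """Estimate the read throughput (MB/s) available to each consumer."""
    if shards < 1 or consumers < 1:
        raise ValueError("shards and consumers must be positive")
    total = shards * SHARD_READ_MB_PER_SEC
    # Standard consumers split the per-shard throughput between them;
    # enhanced fan-out consumers each get the full amount.
    return total if enhanced_fan_out else total / consumers
```

For a 4-shard stream with three consumers, this gives about 2.67 MB/s per consumer when reads are shared and 8 MB/s per consumer with enhanced fan-out.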
A data engineer needs to create a new empty table in Amazon Athena that has the same schema as an existing table named old_table.
Which SQL statement should the data engineer use to meet this requirement?
A.
B.
C.
D.
Problem Analysis:
The goal is to create a new empty table in Athena with the same schema as an existing table (old_table).
The solution must avoid copying any data.
Key Considerations:
CREATE TABLE AS (CTAS) is commonly used in Athena for creating new tables based on an existing table.
Adding the WITH NO DATA clause ensures only the schema is copied, without transferring any data.
Solution Analysis:
Option A: Copies both schema and data. Does not meet the requirement for an empty table.
Option B: Inserts data into an existing table, which does not create a new table.
Option C: Creates an empty table but does not copy the schema.
Option D: Creates a new table with the same schema and ensures it is empty by using WITH NO DATA.
Final Recommendation:
Use D. CREATE TABLE new_table AS (SELECT * FROM old_table) WITH NO DATA to create an empty table with the same schema.
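As an illustration, the statement can be generated by a small helper before it is submitted to Athena. The helper name and the identifier check are assumptions made for this sketch. The check is deliberately conservative: it accepts only lowercase letters, digits, and underscores, so that no quoting is needed.

```python
import re

# Conservative assumption: unquoted table names made of lowercase letters,
# digits, and underscores, starting with a letter.
_IDENTIFIER = re.compile(r"^[a-z][a-z0-9_]*$")


def empty_copy_statement(new_table, old_table):
    """Build a CTAS statement that copies only the schema of old_table."""
    for name in (new_table, old_table):
        if not _IDENTIFIER.match(name):
            raise ValueError(f"unsupported table name: {name!r}")
    return f"CREATE TABLE {new_table} AS (SELECT * FROM {old_table}) WITH NO DATA"
```

The returned string can be passed to Athena as the query text. Because of WITH NO DATA, the SELECT supplies only the column definitions for the new table and no rows are written.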
A company hosts its applications on Amazon EC2 instances. The company must use SSL/TLS connections that encrypt data in transit to communicate securely with AWS infrastructure that is managed by a customer.
A data engineer needs to implement a solution to simplify the generation, distribution, and rotation of digital certificates. The solution must automatically renew and deploy SSL/TLS certificates.
Which solution will meet these requirements with the LEAST operational overhead?
The best solution for managing SSL/TLS certificates on EC2 instances with minimal operational overhead is to use AWS Certificate Manager (ACM). ACM simplifies certificate management by automating the provisioning, renewal, and deployment of certificates.
AWS Certificate Manager (ACM):
ACM provisions SSL/TLS certificates for integrated AWS services, such as Elastic Load Balancing in front of EC2 instances, and renews them automatically. This reduces the need for manual management and avoids operational complexity.
ACM also integrates with other AWS services to simplify secure connections between AWS infrastructure and customer-managed environments.
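A certificate request to ACM might be assembled as in the sketch below. The helper name and the example domains are hypothetical. The keys mirror the parameters of the ACM RequestCertificate API. DNS validation is chosen here on the assumption that the validation record stays in place, which is what allows managed renewal to proceed without manual steps.

```python
def certificate_request(domain, alternative_names=()):
    """Build the parameters for an ACM RequestCertificate call (DNS validation)."""
    params = {"DomainName": domain, "ValidationMethod": "DNS"}
    if alternative_names:
        # Only include the key when at least one extra name is requested.
        params["SubjectAlternativeNames"] = list(alternative_names)
    return params
```

With boto3, the result could be passed as keyword arguments to the ACM client's request_certificate method, for example with domain "app.example.com".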
Alternatives Considered:
A (Self-managed certificates): Managing certificates manually on EC2 instances increases operational overhead and lacks automatic renewal.
C (Secrets Manager automation): While Secrets Manager can store keys and certificates, it requires custom automation for rotation and does not handle SSL/TLS certificates directly.
D (ECS Service Connect): This is unrelated to SSL/TLS certificate management and would not address the operational need.
A company uses AWS Glue Data Catalog to index data that is uploaded to an Amazon S3 bucket every day. The company uses daily batch processes in an extract, transform, and load (ETL) pipeline to upload data from external sources into the S3 bucket.
The company runs a daily report on the S3 data. Some days, the company runs the report before all the daily data has been uploaded to the S3 bucket. A data engineer must be able to send a message that identifies any incomplete data to an existing Amazon Simple Notification Service (Amazon SNS) topic.
Which solution will meet this requirement with the LEAST operational overhead?
AWS Glue workflows are designed to orchestrate the ETL pipeline, and you can create data quality checks to ensure the uploaded datasets are complete before running reports. If there is an issue with the data, AWS Glue workflows can trigger an Amazon EventBridge event that sends a message to an SNS topic.
AWS Glue Workflows:
AWS Glue workflows allow users to automate and monitor complex ETL processes. You can include data quality actions to check for null values, data types, and other consistency checks.
In the event of incomplete data, an EventBridge event can be generated to notify via SNS.
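The data quality check and the notification rule could look roughly like the sketch below. It builds a ruleset in Data Quality Definition Language (DQDL) and an EventBridge event pattern that matches failed evaluations. The column name, the row threshold, and both function names are hypothetical. The event source and detail-type strings are assumptions that should be verified against the AWS Glue Data Quality documentation before use.

```python
def completeness_ruleset(required_columns, min_rows):
    """Build a DQDL ruleset that checks row count and non-null columns."""
    rules = [f"RowCount >= {min_rows}"]
    rules += [f'IsComplete "{column}"' for column in required_columns]
    return "Rules = [\n    " + ",\n    ".join(rules) + "\n]"


def failed_quality_event_pattern():
    """Build an EventBridge pattern for failed data quality evaluations.

    The source and detail-type values are assumptions; verify them first.
    """
    return {
        "source": ["aws.glue-dataquality"],
        "detail-type": ["Data Quality Evaluation Results Available"],
        "detail": {"state": ["FAILED"]},
    }
```

An EventBridge rule that uses this pattern, with the existing SNS topic as its target, would publish a message whenever the ruleset fails. No custom polling code is needed.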
Alternatives Considered:
A (Airflow cluster): Managed Airflow introduces more operational overhead and complexity compared to Glue workflows.
B (EMR cluster): Setting up an EMR cluster is also more complex compared to the Glue-centric solution.
D (Lambda functions): While Lambda functions can work, using Glue workflows offers a more integrated and lower operational overhead solution.