Amazon MLS-C01 Exam - Topic 4 Question 101 Discussion

Actual exam question for Amazon's MLS-C01 exam
Question #: 101
Topic #: 4

A company ingests machine learning (ML) data from web advertising clicks into an Amazon S3 data lake. Click data is added to an Amazon Kinesis data stream by using the Kinesis Producer Library (KPL). The data is loaded into the S3 data lake from the data stream by using an Amazon Kinesis Data Firehose delivery stream. As the data volume increases, an ML specialist notices that the rate of data ingested into Amazon S3 is relatively constant. There is also an increasing backlog of data for Kinesis Data Streams and Kinesis Data Firehose to ingest.

Which next step is MOST likely to improve the data ingestion rate into Amazon S3?

Suggested Answer: C (increase the number of shards for the data stream)

Kinesis Data Streams throughput scales with the number of shards: each shard accepts up to 1 MB/s or 1,000 records/s of writes and serves up to 2 MB/s of reads. A constant ingestion rate into Amazon S3 combined with a growing backlog indicates that the stream is already running at its shard capacity, so increasing the number of shards raises the rate at which the KPL can write and Kinesis Data Firehose can read. The other choices raised in the discussion are less effective: adding S3 prefixes does not help because Amazon S3 is not the bottleneck, decreasing the retention period only shortens how long records are kept, and adding Kinesis Client Library (KCL) consumers adds readers that share the same per-shard read throughput without increasing the stream's capacity.
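As a rough illustration of the shard-capacity reasoning behind this question, the sketch below estimates how many shards a provisioned stream would need for a given write load. The per-shard limits (1 MiB/s and 1,000 records/s) are the published Kinesis Data Streams write quotas; the function name `required_shards` and the sample traffic figures are hypothetical and chosen only for illustration.

```python
import math

# Per-shard write quotas for Kinesis Data Streams (provisioned mode).
SHARD_WRITE_BYTES_PER_SEC = 1024 * 1024  # 1 MiB per second
SHARD_WRITE_RECORDS_PER_SEC = 1_000      # 1,000 records per second


def required_shards(bytes_per_sec: float, records_per_sec: float) -> int:
    """Estimate the minimum shard count for a given write load.

    Whichever limit (bytes or records) is hit first determines the result.
    This is a back-of-the-envelope sketch: it ignores hot partition keys
    and any headroom you would want in practice.
    """
    by_bytes = math.ceil(bytes_per_sec / SHARD_WRITE_BYTES_PER_SEC)
    by_records = math.ceil(records_per_sec / SHARD_WRITE_RECORDS_PER_SEC)
    return max(1, by_bytes, by_records)


if __name__ == "__main__":
    # Hypothetical click traffic: 5 MiB/s spread over 3,000 records/s.
    print(required_shards(5 * 1024 * 1024, 3_000))
```

If the stream has fewer shards than this estimate, producers are throttled and the delivery rate into Amazon S3 stays flat no matter how much data arrives, which matches the symptom described in the question.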


Brittni
3 months ago
Not sure if decreasing retention will really solve the backlog issue.
upvoted 0 times
...
Jacob
3 months ago
Definitely need to increase shards, that’s a common bottleneck.
upvoted 0 times
...
German
3 months ago
Surprised that increasing S3 prefixes is even an option here.
upvoted 0 times
...
Horace
4 months ago
I think adding more consumers could help too.
upvoted 0 times
...
Agustin
4 months ago
Increasing the number of shards for the data stream is key!
upvoted 0 times
...
Brittani
4 months ago
I recall that decreasing the retention period might not really affect the ingestion rate, so I would lean towards increasing the number of shards.
upvoted 0 times
...
Nieves
4 months ago
I practiced a similar question, and I think increasing S3 prefixes could help with write performance, but I'm not confident about that.
upvoted 0 times
...
Juliana
4 months ago
I'm not entirely sure, but I feel like adding more consumers could help too. Maybe option D?
upvoted 0 times
...
Nicolette
5 months ago
I remember that increasing the number of shards can help with throughput in Kinesis, so I think option C might be the right choice.
upvoted 0 times
...
Evette
5 months ago
I'm not entirely sure about this one. The options seem a bit tricky, and I don't want to just guess. I think I'll need to review the details of the Kinesis Data Streams and Kinesis Data Firehose services to understand how they work and how they might be impacting the data ingestion rate. That should help me figure out the best solution.
upvoted 0 times
...
Marguerita
5 months ago
Okay, I've got this. The question is asking about improving the data ingestion rate into Amazon S3, and the problem is that there's an increasing backlog in the Kinesis Data Streams and Kinesis Data Firehose. Increasing the number of shards for the data stream is the way to go - that will allow more data to be processed in parallel and reduce the backlog.
upvoted 0 times
...
Angelyn
5 months ago
Hmm, I'm a bit confused by this question. There are a few options presented, and I'm not sure which one would be the most effective. I'll need to think through the different components of the system and how they interact to determine the best approach.
upvoted 0 times
...
Jesusita
5 months ago
This looks like a classic AWS data ingestion problem. I think the key is to identify the bottleneck in the system and address that. Increasing the number of shards for the data stream seems like the most likely solution to improve the ingestion rate.
upvoted 0 times
...
Leana
5 months ago
Alex seems to have the highest scores across the board, so I think they would be the most suitable for the test designer role. The question is pretty clear-cut.
upvoted 0 times
...
Renay
5 months ago
I remember discussing how non-current assets can affect profit calculations, but I'm not totally sure about the depreciation side of things.
upvoted 0 times
...
Eulah
10 months ago
Wait, wait, wait... Did someone say 'backlog'? That's like a data traffic jam! We need to get those bits moving, pronto. Crank up those shards, my dudes!
upvoted 0 times
Carylon
9 months ago
C: Let's do it, more shards it is!
upvoted 0 times
...
Novella
9 months ago
B: Agreed, that should help clear up the backlog and improve the data ingestion rate.
upvoted 0 times
...
Elden
10 months ago
A: Yeah, we definitely need to increase the number of shards for the data stream.
upvoted 0 times
...
...
Johanna
10 months ago
Aha, I see what they're getting at. Increasing the number of shards is the way to go. It's like turbocharging your data pipeline - more horsepower to handle that growing backlog!
upvoted 0 times
Shawana
9 months ago
D: Definitely, more shards means more capacity to handle the data.
upvoted 0 times
...
Ressie
9 months ago
C: I agree, it's like giving a boost to the system.
upvoted 0 times
...
Shonda
9 months ago
B: Yeah, that makes sense. It will help with the data ingestion rate.
upvoted 0 times
...
Ira
10 months ago
A: I think increasing the number of shards is the best option.
upvoted 0 times
...
...
Jennifer
10 months ago
Whoa, hold up! Adding more consumers using the Kinesis Client Library? That's a bold move, my friend. I'd be a little worried about the complexity and overhead that could bring.
upvoted 0 times
Vernell
9 months ago
B: Yeah, that could help distribute the workload better and improve the ingestion rate.
upvoted 0 times
...
Amber
9 months ago
A: I think increasing the number of shards for the data stream might be a better option.
upvoted 0 times
...
...
Stefany
11 months ago
I'm not sure about decreasing the retention period - that might cause us to lose valuable data. Increasing the number of prefixes or adding more consumers could be better options.
upvoted 0 times
...
Melita
11 months ago
Hmm, increasing the number of shards for the data stream seems like the most logical choice here. More shards should help distribute the load and improve the ingestion rate.
upvoted 0 times
Lisha
9 months ago
D: Decreasing the retention period for the data stream might also help with the backlog issue.
upvoted 0 times
...
Andrew
9 months ago
C: I think increasing the number of S3 prefixes for the delivery stream could also improve the ingestion rate.
upvoted 0 times
...
Tandra
9 months ago
B: But wouldn't adding more consumers using the Kinesis Client Library also help distribute the workload?
upvoted 0 times
...
Filiberto
10 months ago
A: I agree, increasing the number of shards for the data stream should help with the data ingestion rate.
upvoted 0 times
...
...
Matthew
11 months ago
I'm not sure about that. Maybe adding more consumers using the Kinesis Client Library could also help speed up the process.
upvoted 0 times
...
Izetta
11 months ago
I agree with Pearlie. More shards would allow for parallel processing and faster data ingestion.
upvoted 0 times
...
Pearlie
11 months ago
I think increasing the number of shards for the data stream could help improve the ingestion rate into Amazon S3.
upvoted 0 times
...
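
For readers who want to see what "increasing the number of shards" looks like in practice, here is a minimal sketch that uses the Kinesis `DescribeStreamSummary` and `UpdateShardCount` operations as exposed by boto3. The helper name `scale_up_shards` and the stream name used in the usage note are hypothetical. The cap at double the current shard count reflects a documented limit on a single `UpdateShardCount` call, which is worth re-checking against current AWS quotas. The client is passed in as a parameter so the function can be exercised without AWS credentials.

```python
def scale_up_shards(client, stream_name: str, target_shards: int) -> int:
    """Request a larger shard count for a provisioned Kinesis data stream.

    `client` is expected to behave like boto3.client("kinesis").
    Returns the shard count that was requested, or the current count
    when no scale-up is needed.
    """
    summary = client.describe_stream_summary(StreamName=stream_name)
    current = summary["StreamDescriptionSummary"]["OpenShardCount"]

    # Nothing to do if the stream already has enough open shards.
    if target_shards <= current:
        return current

    # Assumption: one call may at most double the current shard count,
    # so larger targets have to be reached in several steps.
    step_target = min(target_shards, current * 2)

    client.update_shard_count(
        StreamName=stream_name,
        TargetShardCount=step_target,
        ScalingType="UNIFORM_SCALING",
    )
    return step_target
```

With real credentials this could be called as `scale_up_shards(boto3.client("kinesis"), "clicks", 8)`. Resharding is asynchronous, so the stream would need to return to the `ACTIVE` state before a further step is requested.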