Amazon Exam MLS-C01 Topic 4 Question 97 Discussion

Actual exam question for Amazon's MLS-C01 exam
Question #: 97
Topic #: 4

A machine learning (ML) developer for an online retailer recently uploaded a sales dataset into Amazon SageMaker Studio. The ML developer wants to obtain importance scores for each feature of the dataset. The ML developer will use the importance scores to feature engineer the dataset.

Which solution will meet this requirement with the LEAST development effort?

Suggested Answer: A

SageMaker Data Wrangler is a feature of SageMaker Studio that provides an end-to-end solution for importing, preparing, transforming, featurizing, and analyzing data. It includes built-in analyses that generate visualizations and data insights in a few clicks. One of these is the Quick Model visualization, which quickly evaluates the data and produces an importance score for each feature.

A feature importance score indicates how useful a feature is at predicting a target label. The score is between 0 and 1, and a higher number indicates that the feature is more important to the whole dataset. Quick Model trains a random forest model and calculates the importance of each feature using the Gini importance method, which measures the total reduction in node impurity attributed to splits on that feature. Node impurity measures how mixed the classes within a node are, so a larger reduction means the feature separates the classes better.

The ML developer can run the Quick Model visualization in SageMaker Studio, where the dataset already resides, to obtain the importance scores and then use them to feature engineer the dataset. This solution requires the least development effort compared to the other options.
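To make the idea of "reduction in node impurity" concrete, here is a minimal Python sketch. It is not what Data Wrangler runs internally: it scores a single hand-picked split per feature instead of training a random forest, and the data, feature names (`price`, `weekday`), and thresholds are hypothetical values invented for illustration.

```python
from collections import Counter


def gini_impurity(labels):
    """Gini impurity of a node: 0.0 when pure, larger when classes are mixed."""
    n = len(labels)
    if n == 0:
        return 0.0
    counts = Counter(labels)
    return 1.0 - sum((c / n) ** 2 for c in counts.values())


def impurity_decrease(values, labels, threshold):
    """Weighted drop in Gini impurity from splitting on values <= threshold."""
    left = [y for x, y in zip(values, labels) if x <= threshold]
    right = [y for x, y in zip(values, labels) if x > threshold]
    n = len(labels)
    weighted_children = (
        (len(left) / n) * gini_impurity(left)
        + (len(right) / n) * gini_impurity(right)
    )
    return gini_impurity(labels) - weighted_children


# Hypothetical toy data (assumption, not taken from the question).
labels = [0, 0, 0, 1, 1, 1]       # target label, e.g. repeat purchase or not
price = [5, 7, 9, 20, 25, 30]     # separates the two classes cleanly
weekday = [1, 5, 3, 2, 4, 3]      # carries little signal about the label

# One split per feature; a real forest sums decreases over many trees and nodes.
raw = {
    "price": impurity_decrease(price, labels, threshold=9),
    "weekday": impurity_decrease(weekday, labels, threshold=3),
}

# Normalize so the scores sum to 1, which keeps each score between 0 and 1.
total = sum(raw.values())
importance = {name: score / total for name, score in raw.items()}

for name, score in sorted(importance.items(), key=lambda kv: -kv[1]):
    print(f"{name}: {score:.2f}")
```

In this toy case the split on `price` removes all impurity while the split on `weekday` removes none, so `price` gets the higher score. A random forest applies the same calculation at every split in every tree and accumulates the results per feature.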

References:

* Analyze and Visualize

* Detect multicollinearity, target leakage, and feature correlation with Amazon SageMaker Data Wrangler


Contribute your Thoughts:

Steffanie
14 days ago
Wait, so we're supposed to use machine learning to figure out which features are important? Shouldn't we just ask the customers what they care about?
upvoted 0 times
...
Golda
16 days ago
Option C? Singular value decomposition? That's just showing off. I'll stick to the simple solutions.
upvoted 0 times
...
Vincenza
19 days ago
Option B might be overkill for this task. PCA is more for dimensionality reduction, not feature importance.
upvoted 0 times
Zona
2 days ago
D) Use the multicollinearity feature to perform a lasso feature selection to perform an importance scores analysis.
upvoted 0 times
...
Alaine
4 days ago
C) Use a SageMaker notebook instance to perform a singular value decomposition analysis.
upvoted 0 times
...
Celestina
9 days ago
A) Use SageMaker Data Wrangler to perform a Gini importance score analysis.
upvoted 0 times
...
...
Annabelle
1 month ago
I prefer option C because it provides a different perspective on the data.
upvoted 0 times
...
Alba
1 month ago
I'd go with Option D. Lasso feature selection is great for identifying the most important features.
upvoted 0 times
Nakita
19 days ago
I agree, lasso feature selection can help identify important features.
upvoted 0 times
...
Raul
21 days ago
Option D is a good choice for feature selection.
upvoted 0 times
...
...
Tommy
1 month ago
I'm not sure, but I think option B could also work well.
upvoted 0 times
...
Zena
2 months ago
I disagree, I believe option D is the most efficient.
upvoted 0 times
...
Arlyne
2 months ago
I think option A is the best choice.
upvoted 0 times
...
Leatha
2 months ago
Option A seems like the easiest way to get the importance scores. I don't want to spend too much time on this.
upvoted 0 times
Marisha
27 days ago
Option A sounds efficient. Let's go with that.
upvoted 0 times
...
Rupert
1 month ago
A) Use SageMaker Data Wrangler to perform a Gini importance score analysis.
upvoted 0 times
...
...