Microsoft DP-100 Exam - Topic 10 Question 11 Discussion

Actual exam question for Microsoft's DP-100 exam
Question #: 11
Topic #: 10

You plan to use automated machine learning to train a regression model. You have data that has features which have missing values, and categorical features with few distinct values.

You need to configure automated machine learning to automatically impute missing values and encode categorical features as part of the training task.

Which parameter and value pair should you use in the AutoMLConfig class?

Suggested Answer: A (featurization = 'auto')

featurization: str or FeaturizationConfig

Values: 'auto' / 'off' / FeaturizationConfig

Indicator for whether featurization step should be done automatically or not, or whether customized featurization should be used.
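As a rough sketch of how this setting might be passed for the scenario in the question: the block below collects keyword arguments for AutoMLConfig with task set to regression and featurization set to 'auto'. The label column name, the primary metric, and the training_data placeholder are assumptions added for illustration and are not part of the question. The config object is only built when the Azure ML SDK is installed and real training data is supplied.

```python
# Sketch only: settings for an automated ML regression run that relies on
# automatic featurization (imputation and categorical encoding).
# Assumptions not taken from the question: the label column name "target",
# the chosen primary metric, and the training_data placeholder.

def build_automl_settings(training_data=None, label_column_name="target"):
    """Return keyword arguments intended for AutoMLConfig."""
    return {
        "task": "regression",            # the question asks for a regression model
        "featurization": "auto",         # impute missing values, encode categoricals
        "primary_metric": "normalized_root_mean_squared_error",
        "training_data": training_data,  # hypothetical: a registered tabular dataset
        "label_column_name": label_column_name,
    }

settings = build_automl_settings()

try:
    # Available only when the Azure ML SDK (v1) is installed.
    from azureml.train.automl import AutoMLConfig
except ImportError:
    AutoMLConfig = None

if AutoMLConfig is not None and settings["training_data"] is not None:
    automl_config = AutoMLConfig(**settings)
```

Setting featurization to 'off' instead would skip these preprocessing steps, and passing a FeaturizationConfig object would customize them.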

Column type is automatically detected. Based on the detected column type preprocessing/featurization is done as follows:

Categorical: Target encoding, one hot encoding, drop high cardinality categories, impute missing values.

Numeric: Impute missing values, cluster distance, weight of evidence.

DateTime: Several features such as day, seconds, minutes, hours etc.

Text: Bag of words, pre-trained Word embedding, text target encoding.
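To make two of the steps listed above concrete, here is a small plain-Python illustration of mean imputation for a numeric column and one hot encoding for a categorical column with few distinct values. This is a conceptual sketch, not the code automated ML runs internally; the function names and sample values are hypothetical.

```python
from statistics import mean

def impute_mean(values):
    """Replace missing entries (None) with the mean of the observed values."""
    observed = [v for v in values if v is not None]
    fill = mean(observed)
    return [fill if v is None else v for v in values]

def one_hot(values):
    """Encode each category as a 0/1 vector, one position per distinct category."""
    categories = sorted(set(values))
    return [[1 if v == c else 0 for c in categories] for v in values]

# Hypothetical sample columns.
ages = impute_mean([20.0, None, 40.0])      # the missing value becomes 30.0
colors = one_hot(["red", "blue", "red"])    # vector positions follow ["blue", "red"]
```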


https://docs.microsoft.com/en-us/python/api/azureml-train-automl-client/azureml.train.automl.automlconfig.automlconfig

Danica
4 months ago
A seems right, but I wonder if it really covers all edge cases.
upvoted 0 times
...
Rosio
4 months ago
I’m surprised that A does all that. I thought it was more complicated!
upvoted 0 times
...
Nobuko
4 months ago
Wait, isn't C a better option? It's about regression, not classification!
upvoted 0 times
...
Svetlana
4 months ago
I agree, A is the best choice for this scenario!
upvoted 0 times
...
Lai
5 months ago
Definitely go with A, 'featurization = auto' handles missing values and encodes categories.
upvoted 0 times
...
Pearly
5 months ago
I vaguely remember something about 'exclude_nan_labels', but I don't think it relates to imputing missing values directly. It might be more about filtering data.
upvoted 0 times
...
Carmelina
5 months ago
I feel like 'task = classification' doesn't fit since we're dealing with a regression model, but I can't recall the exact parameter for handling missing values.
upvoted 0 times
...
Marjory
5 months ago
I think we practiced a question where we had to set parameters for AutoMLConfig, and 'enable_voting_ensemble' was more about model selection rather than data preprocessing.
upvoted 0 times
...
Sylvie
5 months ago
I remember that 'featurization = auto' is often used to handle missing values and categorical features, but I'm not entirely sure if it's the right choice here.
upvoted 0 times
...