Databricks Exam Databricks-Certified-Professional-Data-Scientist Topic 4 Question 73 Discussion

Actual exam question for Databricks's Databricks-Certified-Professional-Data-Scientist exam

Question #: 73
Topic #: 4


What is the best way to evaluate the quality of the model found by an unsupervised algorithm like k-means clustering, given metrics for the cost of the clustering (how well it fits the data) and its stability (how similar the clusters are across multiple runs over the same data)?

A. The lowest cost clustering subject to a stability constraint

B. The lowest cost clustering

C. The most stable clustering subject to a minimal cost constraint

D. The most stable clustering

There is a tradeoff between cost and stability in unsupervised learning. The more tightly you fit the data, the less stable the model tends to be, and vice versa. The idea is to find a good balance, with more weight given to the cost. Typically a good approach is to set a stability threshold and, among the models that meet it, select the one with the lowest cost.
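The threshold rule described above can be sketched in a few lines of plain Python. This is only an illustrative sketch, not something taken from the exam material: the Rand index is assumed here as one possible stability measure, the function names `rand_index`, `stability`, and `select_clustering` are hypothetical, and the candidate numbers are made up.

```python
from itertools import combinations


def rand_index(labels_a, labels_b):
    """Fraction of point pairs that two labelings treat the same way
    (both put the pair together, or both keep it apart)."""
    pairs = list(combinations(range(len(labels_a)), 2))
    agree = sum(
        (labels_a[i] == labels_a[j]) == (labels_b[i] == labels_b[j])
        for i, j in pairs
    )
    return agree / len(pairs)


def stability(runs):
    """Mean pairwise Rand index over the label lists from repeated runs.
    Assumption: this is just one of several reasonable stability metrics."""
    scores = [rand_index(a, b) for a, b in combinations(runs, 2)]
    return sum(scores) / len(scores)


def select_clustering(candidates, min_stability):
    """Return the lowest-cost candidate whose stability meets the threshold.

    candidates: list of dicts with hypothetical keys "k", "cost", "stability".
    Returns None when no candidate is stable enough.
    """
    eligible = [c for c in candidates if c["stability"] >= min_stability]
    return min(eligible, key=lambda c: c["cost"]) if eligible else None


# Made-up numbers for illustration only.
candidates = [
    {"k": 2, "cost": 410.0, "stability": 0.98},
    {"k": 3, "cost": 250.0, "stability": 0.93},
    {"k": 5, "cost": 140.0, "stability": 0.71},
]
best = select_clustering(candidates, min_stability=0.90)
print(best["k"])
```

With these made-up numbers, picking the lowest cost alone (option B) would choose k=5, which fails the stability threshold, while the constrained rule (option A) falls back to the cheapest candidate that is still stable.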


Suggested Answer: A

by Tamera at Sep 17, 2024, 04:29 PM


Contribute your Thoughts:


Stefanie

2 months ago

Option C all the way. I'd rather have a super stable model, even if it's not the absolute lowest cost. Stability is key in unsupervised learning!

upvoted 0 times

...

Kris

2 months ago

I believe the optimal approach is to set a stability threshold and select the model that achieves the lowest cost above that threshold. This way we balance cost and stability effectively.

upvoted 0 times

...

Sharen

2 months ago

Definitely option A. The stability of the clusters is just as important as the cost, so we need to consider both factors.

upvoted 0 times

Eladia

1 month ago

Exactly, setting a stability threshold can help us prioritize the cost while ensuring the clusters are stable.

upvoted 0 times

...

Jaleesa

1 month ago

I agree, it's important to find that balance. We don't want a model that fits the data perfectly but is not stable.

upvoted 0 times

...

Evangelina

2 months ago

Option A is the best choice. We need to balance cost and stability in the clustering model.

upvoted 0 times

...

Gearldine

2 months ago

I agree, but we also need to consider stability. Maybe the most stable clustering subject to a minimal cost constraint?

upvoted 0 times

...

Rhea

2 months ago

I think the best way to evaluate the quality of the model is to find the lowest cost clustering subject to a stability constraint. That seems like the most balanced approach to me.

upvoted 0 times

...

Leila

2 months ago

This is a tough one, but I'm going with the lowest cost clustering subject to a stability constraint. Ain't no point in having super stable clusters if they don't fit the data well, am I right?

upvoted 0 times

...

Hana

3 months ago

I think the best way is to choose the lowest cost clustering.

upvoted 0 times

...

Casie

3 months ago

Hold up, what about that tradeoff though? I reckon the best answer is the one that balances cost and stability, like the question says. Gotta find that sweet spot, you know?

upvoted 0 times

Arletta

1 month ago

A) The lowest cost clustering subject to a stability constraint

upvoted 0 times

...

Velda

1 month ago

Hold up, what about that tradeoff though? I reckon the best answer is the one that balances cost and stability, like the question says. Gotta find that sweet spot, you know?

upvoted 0 times

...

Shawnta

2 months ago

There is a tradeoff between cost and stability in unsupervised learning. The more tightly you fit the data, the less stable the model will be, and vice versa. The idea is to find a good balance with more weight given to the cost. Typically a good approach is to set a stability threshold and select the model that achieves the lowest cost above the stability threshold.

upvoted 0 times

...

Delbert

2 months ago

A) The lowest cost clustering subject to a stability constraint

upvoted 0 times

...

Bettina

3 months ago

Nah man, you gotta consider stability too. The most stable clustering is the way to go, even if the cost is a bit higher. You don't want your clusters changing all the time, that's just confusing.

upvoted 0 times

...

Sheldon

3 months ago

I think the best way is to go with the lowest cost clustering, because that's the whole point of k-means, right? Stability is overrated. Just give me the clusters that fit the data the best!

upvoted 0 times

Adela

2 months ago

I think the best way is to go with the lowest cost clustering, because that's the whole point of k-means, right? Stability is overrated. Just give me the clusters that fit the data the best!

upvoted 0 times

...

Daron

2 months ago

upvoted 0 times

...

Deane

3 months ago

A) The lowest cost clustering

upvoted 0 times

...