Databricks Exam Databricks Machine Learning Professional Topic 1 Question 20 Discussion

Actual exam question for the Databricks Machine Learning Professional exam
Question #: 20
Topic #: 1

A machine learning engineering team wants to build a continuous pipeline for data preparation of a machine learning application. The team would like the data to be fully processed and made ready for inference in a series of equal-sized batches.

Which of the following tools can be used to provide this type of continuous processing?

Suggested Answer: C

Contribute your Thoughts:

Ling
1 month ago
Hey, I heard MLflow is the new hotness for machine learning ops. Maybe they can just throw some emojis at the data and it'll magically get processed.
upvoted 0 times
Marge
1 day ago
C: Spark UDFs might also be useful for data processing.
upvoted 0 times
...
Bernardine
4 days ago
B: Yeah, MLflow is great for managing the machine learning lifecycle.
upvoted 0 times
...
Gertude
13 days ago
A: I think MLflow could definitely help with that.
upvoted 0 times
...
...
Mari
2 months ago
Spark UDFs? That's just for extending Spark's functionality, not for continuous data processing. I'm with the others - Structured Streaming is the way to go.
upvoted 0 times
...
Marci
2 months ago
AutoML? Really? That's for automating the machine learning model development process, not data preprocessing. I'd say Structured Streaming is the clear winner here.
upvoted 0 times
Julie
1 day ago
Spark UDFs might work well for custom data processing functions within the pipeline.
upvoted 0 times
...
Helaine
7 days ago
I think MLflow could also be useful for tracking and managing the machine learning pipeline.
upvoted 0 times
...
Tijuana
16 days ago
I agree, Structured Streaming is the best choice for continuous processing.
upvoted 0 times
...
Sabina
25 days ago
Delta Lake is great for reliable data lakes, but maybe not the best fit for this specific task.
upvoted 0 times
...
Emerson
29 days ago
MLflow could also be useful for tracking experiments and managing the machine learning lifecycle.
upvoted 0 times
...
Kerrie
2 months ago
I agree, Structured Streaming is the best choice for continuous processing.
upvoted 0 times
...
...
Janey
2 months ago
Hmm, I'm not sure about Structured Streaming. Isn't that more for stream processing? I feel like Delta Lake might be a better fit since it can handle batch processing as well.
upvoted 0 times
Dortha
22 days ago
Let's go with Delta Lake for the continuous processing pipeline.
upvoted 0 times
...
Annmarie
28 days ago
I'm not sure about Structured Streaming either, but Delta Lake seems like a versatile option.
upvoted 0 times
...
Lucina
1 month ago
I agree, Delta Lake can handle both batch and stream processing.
upvoted 0 times
...
Susana
2 months ago
I think Delta Lake would be a good choice for continuous processing.
upvoted 0 times
...
...
Loren
3 months ago
I think Structured Streaming is the right choice here. It's specifically designed for continuous, real-time data processing, which is exactly what the team needs for their machine learning pipeline.
upvoted 0 times
Judy
2 months ago
I think MLflow could also be a good option for managing the machine learning pipeline.
upvoted 0 times
...
Geraldine
2 months ago
I agree, Structured Streaming is perfect for continuous processing.
upvoted 0 times
...
...
Beth
3 months ago
I personally prefer using MLflow for managing the machine learning pipeline.
upvoted 0 times
...
Starr
3 months ago
I agree with Denae, Structured Streaming is a good choice for processing data in batches.
upvoted 0 times
...
Denae
3 months ago
I think Structured Streaming can be used for continuous processing.
upvoted 0 times
...
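
Several comments in this thread go back and forth on whether "continuous" processing and "batches" can go together. A rough way to picture the micro-batch idea is the plain-Python sketch below. It is not Spark or Structured Streaming code: the names `micro_batches`, `prepare`, and `run_pipeline`, the record fields `id` and `raw`, and the scaling step are all hypothetical and exist only for illustration. It simply shows a never-ending style source being cut into equal-sized batches, with each batch fully prepared before the next one is read.

```python
from itertools import islice
from typing import Dict, Iterable, Iterator, List


def micro_batches(source: Iterable[Dict], batch_size: int) -> Iterator[List[Dict]]:
    """Yield consecutive batches of batch_size records from source.

    Every batch has exactly batch_size records, except possibly the last
    one if the source ends before a full batch is available.
    """
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    it = iter(source)
    while True:
        batch = list(islice(it, batch_size))
        if not batch:
            return
        yield batch


def prepare(record: Dict) -> Dict:
    # Hypothetical preparation step: scale a raw reading by 1/100.
    return {"id": record["id"], "feature": record["raw"] / 100.0}


def run_pipeline(source: Iterable[Dict], batch_size: int) -> List[List[Dict]]:
    # Each batch is fully prepared before the next batch is pulled.
    return [[prepare(r) for r in batch] for batch in micro_batches(source, batch_size)]


if __name__ == "__main__":
    # Stand-in for a continuous source: a generator of incoming events.
    events = ({"id": i, "raw": i * 10} for i in range(7))
    for n, batch in enumerate(run_pipeline(events, batch_size=3)):
        print(n, batch)
```

In Spark itself the same idea would be expressed with a streaming read and a write trigger rather than a hand-written loop, so treat this only as a loose analogy for how a stream can be handled as a series of small batches.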