Welcome to Pass4Success

Google Exam Professional Machine Learning Engineer Topic 4 Question 89 Discussion

Actual exam question for Google's Professional Machine Learning Engineer exam
Question #: 89
Topic #: 4
While running a model training pipeline on Vertex AI, you discover that the evaluation step is failing because of an out-of-memory error. You are currently using TensorFlow Model Analysis (TFMA) with a standard Evaluator TensorFlow Extended (TFX) pipeline component for the evaluation step. You want to stabilize the pipeline without downgrading the evaluation quality while minimizing infrastructure overhead. What should you do?

Suggested Answer: C

The best option to stabilize the pipeline without downgrading the evaluation quality while minimizing infrastructure overhead is to use Dataflow as the runner for the evaluation step. Dataflow is a fully managed service for executing Apache Beam pipelines that scales up and down with the workload. It can handle large-scale, distributed data processing tasks such as model evaluation, and it integrates with Vertex AI Pipelines and TensorFlow Extended (TFX). By including the flag --runner=DataflowRunner in beam_pipeline_args, you instruct the Evaluator component to run the evaluation step on Dataflow instead of the default DirectRunner, which runs locally and may cause out-of-memory errors.

Option A is incorrect because adding tfma.MetricsSpec() to limit the number of metrics in the evaluation step may downgrade the evaluation quality, as some important metrics may be omitted. Moreover, reducing the number of metrics may not solve the out-of-memory error, because the evaluation step may still consume a lot of memory depending on the size and complexity of the data and the model.

Option B is incorrect because migrating the pipeline to Kubeflow hosted on Google Kubernetes Engine (GKE) may increase the infrastructure overhead: you need to provision, manage, and monitor the GKE cluster yourself. You also need to specify the appropriate node parameters for the evaluation step, which may require trial and error to find a workable configuration.

Option D is incorrect because moving the evaluation step out of the pipeline and running it on custom Compute Engine VMs may also increase the infrastructure overhead: you need to create, configure, and delete the VMs yourself. You also need to ensure that the VMs have sufficient memory for the evaluation step, which may require trial and error to find a suitable machine type.

References:

Dataflow documentation

Using DataflowRunner

Evaluator component documentation

Configuring the Evaluator component
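
As a rough illustration of the suggested answer, the following Python sketch assembles a beam_pipeline_args list that selects Dataflow as the Beam runner. The helper name build_dataflow_beam_args and the project, region, and bucket values are hypothetical placeholders that do not come from the question; only the Beam flags themselves (--runner, --project, --region, --temp_location) are standard pipeline options. How the list is attached to the Evaluator depends on your TFX version, so the final comment is an assumption rather than a confirmed recipe.

```python
from typing import List


def build_dataflow_beam_args(project: str, region: str, temp_location: str) -> List[str]:
    """Return Beam pipeline flags that select Dataflow instead of the local DirectRunner.

    The function name and its arguments are hypothetical; the flags are
    standard Apache Beam / Dataflow pipeline options.
    """
    return [
        "--runner=DataflowRunner",            # run on Dataflow workers rather than in-process
        f"--project={project}",               # Google Cloud project that owns the Dataflow job
        f"--region={region}",                 # regional endpoint for the Dataflow job
        f"--temp_location={temp_location}",   # Cloud Storage path for temporary files
    ]


# Placeholder values, not taken from the original question.
beam_pipeline_args = build_dataflow_beam_args(
    project="my-project",
    region="us-central1",
    temp_location="gs://my-bucket/tmp",
)

# Assumption: in recent TFX releases the list can be passed either to the
# pipeline constructor (beam_pipeline_args=...) or to a single Beam-based
# component, for example evaluator.with_beam_pipeline_args(beam_pipeline_args).
print(beam_pipeline_args)
```

Scoping the flags to the Evaluator alone, where the TFX version allows it, keeps the other components on their existing runner and changes only the step that ran out of memory.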


Contribute your Thoughts:

Deandrea
1 month ago
I'm a little worried about the Dataflow option. Isn't that just a fancy way of saying 'throw more money at the problem'? The Kubeflow approach seems like a better balance of cost and performance.
upvoted 0 times
Donte
12 days ago
I'm not sure about Dataflow either. Kubeflow does seem like a good middle ground.
upvoted 0 times
...
Johnna
18 days ago
B) Migrate your pipeline to Kubeflow hosted on Google Kubernetes Engine, and specify the appropriate node parameters for the evaluation step.
upvoted 0 times
...
Pansy
20 days ago
A) Add tfma.MetricsSpec() to limit the number of metrics in the evaluation step.
upvoted 0 times
...
...
Helga
1 month ago
I think moving the evaluation step to custom Compute Engine VMs with more memory is the most efficient option.
upvoted 0 times
...
Rosendo
2 months ago
Moving the evaluation step out of the pipeline and onto custom VMs? Sounds like a lot of extra work. I'd try the Kubeflow or Dataflow options first - they seem more elegant and efficient.
upvoted 0 times
Colby
14 days ago
B) Migrate your pipeline to Kubeflow hosted on Google Kubernetes Engine, and specify the appropriate node parameters for the evaluation step.
upvoted 0 times
...
Dong
24 days ago
A) Add tfma.MetricsSpec() to limit the number of metrics in the evaluation step.
upvoted 0 times
...
...
Giuseppe
2 months ago
I disagree, I believe migrating to Kubeflow on GKE with appropriate node parameters is the best solution.
upvoted 0 times
...
Angelica
2 months ago
I think we should add tfma.MetricsSpec() to limit the number of metrics.
upvoted 0 times
...
Ronald
2 months ago
Hmm, limiting the metrics sounds like a quick fix, but I'm not sure it'll address the root cause of the memory issue. Might be worth looking into the Kubeflow or Dataflow options to really scale up the resources.
upvoted 0 times
Lenny
24 days ago
C) Include the flag --runner=DataflowRunner in beam_pipeline_args to run the evaluation step on Dataflow.
upvoted 0 times
...
Quentin
25 days ago
B) Migrate your pipeline to Kubeflow hosted on Google Kubernetes Engine, and specify the appropriate node parameters for the evaluation step.
upvoted 0 times
...
Terina
2 months ago
A) Add tfma.MetricsSpec() to limit the number of metrics in the evaluation step.
upvoted 0 times
...
...