Google Exam Professional Machine Learning Engineer Topic 9 Question 23 Discussion

Actual exam question for Google's Professional Machine Learning Engineer exam

Question #: 23
Topic #: 9


You developed an ML model with AI Platform, and you want to move it to production. You serve a few thousand queries per second and are experiencing latency issues. Incoming requests are served by a load balancer that distributes them across multiple Kubeflow CPU-only pods running on Google Kubernetes Engine (GKE). Your goal is to improve the serving latency without changing the underlying infrastructure. What should you do?

A. Significantly increase the max_batch_size TensorFlow Serving parameter

B. Switch to the tensorflow-model-server-universal version of TensorFlow Serving

C. Significantly increase the max_enqueued_batches TensorFlow Serving parameter

D. Recompile TensorFlow Serving using the source to support CPU-specific optimizations. Instruct GKE to choose an appropriate baseline minimum CPU platform for serving nodes


Suggested Answer: D
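
To make option D a bit more concrete, here is a rough sketch of how the CPU-specific build flags could be derived. It is not an official recipe. It assumes a Linux serving node whose /proc/cpuinfo lists the supported instruction sets, a GCC-style toolchain, and the Bazel target tensorflow_serving/model_servers:tensorflow_model_server described in the TensorFlow Serving build docs. The helper names (parse_cpu_flags, copt_flags, bazel_command) and the flag table are made up for illustration.

```python
# Map /proc/cpuinfo flag names to the matching GCC -m options.
# The list is illustrative, not exhaustive; order is fixed so the
# generated command is deterministic.
CPU_FLAG_TO_GCC = [
    ("sse4_1", "-msse4.1"),
    ("sse4_2", "-msse4.2"),
    ("avx", "-mavx"),
    ("avx2", "-mavx2"),
    ("fma", "-mfma"),
    ("avx512f", "-mavx512f"),
]

# Assumed build target, taken from the TensorFlow Serving source layout.
TARGET = "tensorflow_serving/model_servers:tensorflow_model_server"


def parse_cpu_flags(cpuinfo_text):
    """Return the set of flags on the first 'flags' line of cpuinfo text."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()


def copt_flags(cpu_flags):
    """Turn supported CPU flags into Bazel --copt options."""
    return [f"--copt={gcc}" for name, gcc in CPU_FLAG_TO_GCC if name in cpu_flags]


def bazel_command(cpu_flags):
    """Compose an optimized-build command line for the assumed target."""
    return " ".join(["bazel", "build", "-c", "opt", *copt_flags(cpu_flags), TARGET])


if __name__ == "__main__":
    # On a real node you would read open("/proc/cpuinfo").read() instead.
    sample = "processor\t: 0\nflags\t\t: fpu sse4_1 sse4_2 avx avx2 fma\n"
    print(bazel_command(parse_cpu_flags(sample)))
```

A binary built with, say, -mavx2 will fail with an illegal-instruction error on a node that lacks AVX2, which is why the second half of option D matters: pinning a minimum CPU platform on the serving node pool (gcloud exposes a --min-cpu-platform flag for node pools) keeps every pod on hardware that supports the instructions the binary was compiled for.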

by Lizette at May 04, 2022, 07:09 AM
