Google Exam Professional Machine Learning Engineer Topic 1 Question 96 Discussion

Actual exam question for Google's Professional Machine Learning Engineer exam

Question #: 96
Topic #: 1

You work for a gaming company that has millions of customers around the world. All games offer a chat feature that allows players to communicate with each other in real time. Messages can be typed in more than 20 languages and are translated in real time using the Cloud Translation API. You have been asked to build an ML system to moderate the chat in real time while assuring that the performance is uniform across the various languages and without changing the serving infrastructure.

You trained your first model using an in-house word2vec model for embedding the chat messages translated by the Cloud Translation API. However, the model has significant differences in performance across the different languages. How should you improve it?

A. Add a regularization term such as the Min-Diff algorithm to the loss function.

B. Train a classifier using the chat messages in their original language.

C. Replace the in-house word2vec with GPT-3 or T5.

D. Remove moderation for languages for which the false positive rate is too high.

Suggested Answer: B

Option B addresses the most likely source of the gap. The current pipeline embeds messages only after the Cloud Translation API has translated them, so the classifier never sees the original wording. Translation quality differs from language to language, and slang, idioms, and profanity often lose their tone when translated, which would plausibly produce the uneven performance described in the question.

Training the classifier on the chat messages in their original language removes translation as a source of error, which is why option B is the suggested answer.

Option A adds a Min-Diff penalty that pushes the model toward similar prediction distributions across groups. It can narrow the gap, but it does not recover information lost in translation.

Option C changes the embedding model but still works on translated text, and serving a much larger model may conflict with the requirement not to change the serving infrastructure.

Option D leaves players in the affected languages without moderation, which defeats the purpose of the system.

by Precious at Jan 25, 2025, 09:18 AM

Alease

2 months ago

This chat moderation task reminds me of that old saying - 'lost in translation' takes on a whole new meaning when millions of players are involved!

upvoted 0 times

Nieves

2 months ago

Replace the in-house word2vec with GPT-3 or T5? Sounds like a job for Optimus Prime!

upvoted 0 times

Ardella

2 months ago

I wouldn't recommend removing moderation for languages with high false positive rates. That could lead to unchecked toxicity in those communities. Better to keep trying to improve the model.

upvoted 0 times

Levi

27 days ago

I agree, removing moderation for languages with high false positive rates is not a good idea. We should keep working on improving the model.

upvoted 0 times
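
Before deciding what to do about languages with a high false positive rate, it helps to measure the rate per language. The following is a minimal sketch in plain Python; the function name, the record layout, and the toy data are assumptions made for illustration and do not come from the question.

```python
from collections import defaultdict


def false_positive_rate_by_language(records):
    """Return {language: FPR} from (language, is_toxic, flagged) triples.

    FPR = benign messages that were flagged / all benign messages,
    computed separately for each language. Languages with no benign
    messages are omitted from the result.
    """
    flagged_benign = defaultdict(int)
    total_benign = defaultdict(int)
    for language, is_toxic, flagged in records:
        if not is_toxic:
            total_benign[language] += 1
            if flagged:
                flagged_benign[language] += 1
    return {lang: flagged_benign[lang] / n for lang, n in total_benign.items()}


# Hypothetical toy data: (language, true label, model decision).
records = [
    ("en", False, False), ("en", False, False), ("en", False, True), ("en", True, True),
    ("tr", False, True), ("tr", False, True), ("tr", False, False), ("tr", True, True),
]

rates = false_positive_rate_by_language(records)
# Spread between the worst and best language; a uniform model keeps this small.
gap = max(rates.values()) - min(rates.values())
print(rates, gap)
```

On this toy data one of three benign English messages is flagged and two of three benign Turkish messages are flagged, so the gap is one third. Tracking that gap over time shows whether a change to the model is actually closing it.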

Sabrina

1 month ago

B) Train a classifier using the chat messages in their original language.

upvoted 0 times

Helene

1 month ago

A) Add a regularization term such as the Min-Diff algorithm to the loss function.

upvoted 0 times
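
For readers unfamiliar with option A, the idea behind a Min-Diff style regularizer is to add a penalty to the loss when the model's scores on comparable examples (for instance, benign messages) are distributed differently in two groups. The sketch below uses a kernel-based distance (maximum mean discrepancy) written in plain Python. It is an illustration of the idea only, not the API of any library; the function names, bandwidth, weight, and scores are all assumptions.

```python
import math


def gaussian_kernel(a, b, bandwidth):
    """Similarity between two scalar scores; 1.0 when they are equal."""
    return math.exp(-((a - b) ** 2) / (2 * bandwidth ** 2))


def mean_kernel(xs, ys, bandwidth):
    """Average kernel value over all pairs drawn from xs and ys."""
    total = sum(gaussian_kernel(x, y, bandwidth) for x in xs for y in ys)
    return total / (len(xs) * len(ys))


def mmd_penalty(scores_a, scores_b, bandwidth=0.1):
    """Biased estimate of the squared maximum mean discrepancy.

    Zero when both groups have the same scores, larger as the two
    score distributions move apart.
    """
    return (mean_kernel(scores_a, scores_a, bandwidth)
            + mean_kernel(scores_b, scores_b, bandwidth)
            - 2 * mean_kernel(scores_a, scores_b, bandwidth))


def total_loss(task_loss, scores_a, scores_b, weight=1.0):
    """Task loss plus the weighted distribution-gap penalty."""
    return task_loss + weight * mmd_penalty(scores_a, scores_b)


# Hypothetical toxicity scores on benign messages in two languages.
english_benign = [0.05, 0.10, 0.15]
other_benign = [0.40, 0.55, 0.60]

penalty = mmd_penalty(english_benign, other_benign)
print(penalty, total_loss(0.3, english_benign, other_benign))
```

The penalty only pulls the two score distributions together during training. Whether that is enough here depends on how much of the gap comes from information lost in translation, which a regularizer cannot restore.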

3 months ago

I think we should train a classifier using the chat messages in their original language.

upvoted 0 times
