Google Exam Professional Machine Learning Engineer Topic 11 Question 38 Discussion

Actual exam question for Google's Professional Machine Learning Engineer exam

Question #: 38
Topic #: 11


You want to rebuild your ML pipeline for structured data on Google Cloud. You are using PySpark to conduct data transformations at scale, but your pipelines are taking over 12 hours to run. To speed up development and pipeline run time, you want to use a serverless tool and SQL syntax. You have already moved your raw data into Cloud Storage. How should you build the pipeline on Google Cloud while meeting the speed and processing requirements?

A. Use Data Fusion's GUI to build the transformation pipelines, and then write the data into BigQuery.

B. Convert your PySpark into SparkSQL queries to transform the data, and then run your pipeline on Dataproc to write the data into BigQuery.

C. Ingest your data into Cloud SQL, convert your PySpark commands into SQL queries to transform the data, and then use federated queries from BigQuery for machine learning.

D. Ingest your data into BigQuery using BigQuery Load, convert your PySpark commands into BigQuery SQL queries to transform the data, and then write the transformations to a new table.
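
To make the load-then-transform flow described in option D more concrete, here is a minimal Python sketch that builds the two BigQuery SQL statements such a pipeline might run: a LOAD DATA statement that reads CSV files from Cloud Storage, and a CREATE OR REPLACE TABLE ... AS SELECT statement that writes the transformed rows to a new table. The dataset, table, bucket, and column names (my_dataset, raw_events, customer_id, amount, and so on) are hypothetical and not part of the question, and the aggregation only stands in for whatever the original PySpark code did. The run_pipeline helper assumes the google-cloud-bigquery client library is installed and credentials are configured.

```python
def load_statement(table: str, uri: str) -> str:
    """Build a BigQuery LOAD DATA statement for CSV files in Cloud Storage."""
    return (
        f"LOAD DATA INTO `{table}`\n"
        f"FROM FILES (format = 'CSV', uris = ['{uri}'])"
    )


def transform_statement(source: str, target: str) -> str:
    """Build a CREATE TABLE AS SELECT statement that writes a new table.

    The columns customer_id and amount are hypothetical placeholders.
    """
    return (
        f"CREATE OR REPLACE TABLE `{target}` AS\n"
        f"SELECT customer_id, SUM(amount) AS total_amount\n"
        f"FROM `{source}`\n"
        f"GROUP BY customer_id"
    )


def run_pipeline(statements: list[str]) -> None:
    """Submit each statement to BigQuery in order and wait for it to finish.

    Assumes google-cloud-bigquery is installed and default credentials exist.
    """
    from google.cloud import bigquery  # imported lazily; only needed to execute

    client = bigquery.Client()
    for sql in statements:
        client.query(sql).result()  # blocks until the query job completes


if __name__ == "__main__":
    # Hypothetical names, used for illustration only.
    steps = [
        load_statement("my_dataset.raw_events", "gs://my-bucket/raw/*.csv"),
        transform_statement("my_dataset.raw_events", "my_dataset.customer_totals"),
    ]
    for sql in steps:
        print(sql, end="\n\n")
```

Running the script only prints the generated SQL; calling run_pipeline(steps) would submit the statements to BigQuery, which runs them without any cluster to provision.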


Suggested Answer: B

by Jennifer at May 04, 2022, 10:32 AM
