Google Exam Professional Data Engineer Topic 2 Question 62 Discussion

Actual exam question for Google's Professional Data Engineer exam

Question #: 62
Topic #: 2


You've migrated a Hadoop job from an on-premises cluster to Dataproc and Cloud Storage. Your Spark job is a complex analytical workload that consists of many shuffling operations, and the initial data are Parquet files (on average 200-400 MB each). You see some degradation in performance after the migration to Dataproc, so you'd like to optimize for it. Your organization is very cost-sensitive, so you'd like to continue using Dataproc on preemptibles (with only 2 non-preemptible workers) for this workload. What should you do?

A. Switch from HDDs to SSDs, and override the preemptible VMs configuration to increase the boot disk size.

B. Increase the size of your Parquet files so that each is at least 1 GB.

C. Switch to TFRecords format (approx. 200 MB per file) instead of Parquet files.

D. Switch from HDDs to SSDs, copy the initial data from Cloud Storage to Hadoop Distributed File System (HDFS), run the Spark job, and copy the results back to Cloud Storage.


Suggested Answer: A

by Sylvia at Dec 12, 2022, 04:31 AM
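
For readers who want to see what option A could look like in practice, here is a minimal sketch that assembles a `gcloud dataproc clusters create` command with SSD boot disks and an enlarged boot disk on the preemptible (secondary) workers. The cluster name, region, secondary worker count, and 500 GB disk size are hypothetical values chosen for illustration and do not come from the question. The idea is that Spark writes shuffle data to local disk on the workers, so faster and larger boot disks may help a shuffle-heavy job.

```python
import shlex
from typing import List


def build_cluster_create_command(
    cluster_name: str,
    region: str,
    num_secondary_workers: int,
    boot_disk_size_gb: int,
) -> List[str]:
    """Return the argv for a Dataproc cluster with SSD boot disks.

    All argument values are supplied by the caller; nothing here is
    prescribed by the exam question except the 2 non-preemptible workers.
    """
    return [
        "gcloud", "dataproc", "clusters", "create", cluster_name,
        f"--region={region}",
        # The question fixes the number of non-preemptible workers at 2.
        "--num-workers=2",
        "--worker-boot-disk-type=pd-ssd",
        f"--worker-boot-disk-size={boot_disk_size_gb}GB",
        # Preemptible VMs are configured as secondary workers.
        f"--num-secondary-workers={num_secondary_workers}",
        "--secondary-worker-type=preemptible",
        # Override the secondary worker disk settings: SSD and a larger size.
        "--secondary-worker-boot-disk-type=pd-ssd",
        f"--secondary-worker-boot-disk-size={boot_disk_size_gb}GB",
    ]


if __name__ == "__main__":
    # Hypothetical values, for illustration only.
    cmd = build_cluster_create_command("example-cluster", "us-central1", 8, 500)
    print(shlex.join(cmd))
```

The script only prints the command; it does not create any resources. Suitable disk sizes depend on the shuffle volume of the actual job, so treat the numbers as placeholders.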

