Google Professional Data Engineer Exam - Topic 6 Question 28 Discussion

Actual exam question for Google's Professional Data Engineer exam

Question #: 28
Topic #: 6

You use BigQuery as your centralized analytics platform. New data is loaded every day, and an ETL pipeline modifies the original data and prepares it for the final users. This ETL pipeline is regularly modified and can generate errors, but sometimes the errors are detected only after 2 weeks. You need to provide a method to recover from these errors, and your backups should be optimized for storage costs. How should you organize your data in BigQuery and store your backups?

A. Organize your data in a single table, and export, compress, and store the BigQuery data in Cloud Storage.

B. Organize your data in separate tables for each month, and export, compress, and store the data in Cloud Storage.

C. Organize your data in separate tables for each month, and duplicate your data on a separate dataset in BigQuery.

D. Organize your data in separate tables for each month, and use snapshot decorators to restore the table to a time prior to the corruption.
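
Options B, C, and D share a one-table-per-month layout, and option B pairs it with a compressed export to Cloud Storage. As a rough sketch of what that could look like, the Python below builds a monthly table name and the matching `bq extract` command line. The table prefix `events_`, the dataset `analytics`, and the bucket `my-backup-bucket` are hypothetical names chosen for illustration, and CSV with GZIP is only one of the possible format and compression choices.

```python
from datetime import date


def monthly_table(prefix: str, day: date) -> str:
    """Return the name of the monthly table that holds `day`."""
    return f"{prefix}{day:%Y%m}"


def extract_command(dataset: str, table: str, bucket: str) -> list[str]:
    """Build a `bq extract` invocation that writes gzip-compressed CSV.

    The wildcard in the URI lets BigQuery split a large table across
    several output files.
    """
    uri = f"gs://{bucket}/{dataset}/{table}/part-*.csv.gz"
    return [
        "bq", "extract",
        "--destination_format=CSV",
        "--compression=GZIP",
        f"{dataset}.{table}",
        uri,
    ]


if __name__ == "__main__":
    # Hypothetical names, used only to show the shape of the command.
    table = monthly_table("events_", date(2022, 5, 4))
    print(" ".join(extract_command("analytics", table, "my-backup-bucket")))
```

With monthly tables, only the month touched by a faulty ETL run has to be re-imported from its export, rather than the whole history.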

Suggested Answer: D
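
For readers unfamiliar with option D: in BigQuery legacy SQL, a snapshot decorator takes the form `table@<time>`, where the time is given in milliseconds since the Unix epoch. Worth keeping in mind when weighing the options is that BigQuery time travel reaches back at most 7 days, while the question mentions errors found after 2 weeks. The sketch below, which uses a hypothetical table name and only the standard library, builds such a reference and checks whether a timestamp still falls inside that window.

```python
from datetime import datetime, timedelta, timezone

# Maximum BigQuery time travel window.
TIME_TRAVEL_WINDOW = timedelta(days=7)
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def snapshot_reference(table: str, at: datetime) -> str:
    """Build a legacy SQL table reference with a snapshot decorator.

    `at` must be timezone-aware; it is converted to whole milliseconds
    since the Unix epoch using exact integer arithmetic.
    """
    millis = (at - EPOCH) // timedelta(milliseconds=1)
    return f"[{table}@{millis}]"


def within_time_travel(at: datetime, now: datetime) -> bool:
    """True if `at` is in the past and no older than the window."""
    return timedelta(0) <= now - at <= TIME_TRAVEL_WINDOW


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    for days_back in (3, 14):
        target = now - timedelta(days=days_back)
        # "analytics.events_202205" is a hypothetical table name.
        ref = snapshot_reference("analytics.events_202205", target)
        print(days_back, "days back:", ref, within_time_travel(target, now))
```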

by Merissa at May 04, 2022, 02:17 PM

Contribute your Thoughts:

Jenise

4 months ago

Option C seems unnecessary with the duplication. Why not just use snapshots?

upvoted 0 times

Lashawnda

5 months ago

I disagree, option B is more organized and cost-effective for backups.

upvoted 0 times

Ahmed

5 months ago

Wait, can you really restore to a prior state with snapshots? That’s cool!

upvoted 0 times

Chuck

5 months ago

Storing everything in one table (option A) sounds risky to me.

upvoted 0 times

Mireya

5 months ago

I think option D is the best for recovery. Snapshots are super useful!

upvoted 0 times

Paris

5 months ago

I feel like having separate tables for each month makes sense, but I’m not sure if duplicating data in BigQuery is the best approach for backups.

upvoted 0 times

Bok

5 months ago

I practiced a similar question where we had to optimize for storage costs, and I think exporting and compressing data in Cloud Storage was a key point.

upvoted 0 times

Annabelle

5 months ago

I'm not entirely sure, but I think using snapshot decorators could be a good way to restore data without needing to store duplicates.

upvoted 0 times

Loren

5 months ago

I remember we discussed the importance of organizing data by time periods, like months, to make it easier to manage and recover from errors.

upvoted 0 times

Helga

5 months ago

This one seems pretty straightforward. I think the answer is C - encryption requirement on all network traffic is not a prevention technique for IP spoofing.

upvoted 0 times

Ronnie

6 months ago

Ugh, I'm drawing a blank on this. I know there are different ways to install apps on Linux, but I can't recall the preferred source. I'll have to make an educated guess and hope for the best.

upvoted 0 times

Micaela

6 months ago

I think a VPN is mostly about security, maybe option C is correct? It creates a secure tunnel.

upvoted 0 times
