Google Exam Professional Data Engineer Topic 4 Question 74 Discussion

Actual exam question for Google's Professional Data Engineer exam
Question #: 74
Topic #: 4

You are collecting IoT sensor data from millions of devices across the world and storing the data in BigQuery. Your access pattern is based on recent data filtered by location_id and device_version with the following query:

You want to optimize your queries for cost and performance. How should you structure your data?

Suggested Answer: C

Contribute your Thoughts:

Terrilyn
8 months ago
That's a good point, Candida. I was also considering option B, but I'm a little concerned about the potential for data skew if some locations or device versions are much more heavily used than others.
upvoted 0 times
Tayna
7 months ago
C: Good point, we should weigh the benefits of both before making a decision.
upvoted 0 times
...
Rebbecca
8 months ago
B: True, but we should consider the potential for data skew with clustering.
upvoted 0 times
...
Rory
8 months ago
A: It could, but partitioning can also help with organizing the data efficiently.
upvoted 0 times
...
Cordie
8 months ago
D: I think clustering would further improve query performance.
upvoted 0 times
...
Amie
8 months ago
C: But what about clustering the table data by create_date, location_id and device_version?
upvoted 0 times
...
Shakira
8 months ago
B: I agree, that would help optimize the queries for cost and performance.
upvoted 0 times
...
Dalene
8 months ago
A: You should partition table data by create_date, location_id and device_version.
upvoted 0 times
...
...
Candida
8 months ago
Hmm, let me think this through. I'm leaning towards option B because partitioning by create_date and clustering by location_id and device_version seems like it could give us the best of both worlds in terms of querying efficiency.
upvoted 0 times
...
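For anyone who wants to see what the layout Candida describes would look like in practice, here is a minimal sketch that builds the BigQuery DDL for a table partitioned by create_date and clustered by location_id and device_version. The table name `iot_dataset.sensor_readings`, the `reading` column, the column types, and the helper `build_create_table_ddl` are all assumptions made up for illustration; the original question does not give a schema.

```python
def build_create_table_ddl(table, partition_column, cluster_columns):
    """Return a BigQuery DDL string for a partitioned, clustered table.

    The schema below is hypothetical: the question only names
    create_date, location_id and device_version, not their types.
    """
    # BigQuery accepts at most four clustering columns.
    if not 1 <= len(cluster_columns) <= 4:
        raise ValueError("expected between 1 and 4 clustering columns")
    return (
        f"CREATE TABLE `{table}` (\n"
        "  create_date DATE,\n"        # assumed DATE so it can be a partition column
        "  location_id STRING,\n"      # assumed type
        "  device_version STRING,\n"   # assumed type
        "  reading FLOAT64\n"          # hypothetical payload column
        ")\n"
        f"PARTITION BY {partition_column}\n"
        f"CLUSTER BY {', '.join(cluster_columns)}"
    )


ddl = build_create_table_ddl(
    "iot_dataset.sensor_readings",          # hypothetical dataset.table name
    "create_date",
    ["location_id", "device_version"],
)
print(ddl)
```

With this layout, a filter on recent create_date values lets BigQuery skip whole partitions, and the clustering columns help it skip blocks inside each partition when filtering on location_id and device_version.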
Hyman
8 months ago
Haha, this is starting to sound like a real-life engineering meeting. I'm glad we're all putting in the effort to think this through carefully.
upvoted 0 times
...
Cassie
8 months ago
Ah, good catch, Michael. That's a really important consideration. Maybe option D could be a better choice, with clustering by create_date and partitioning by location and device_version?
upvoted 0 times
...