Google Exam Professional Data Engineer Topic 5 Question 103 Discussion

Actual exam question for Google's Professional Data Engineer exam
Question #: 103
Topic #: 5

Your company's customer_order table in BigQuery stores the order history for 10 million customers, with a table size of 10 PB. You need to create a dashboard for the support team to view the order history. The dashboard has two filters, countryname and username. Both are string data types in the BigQuery table. When a filter is applied, the dashboard fetches the order history from the table and displays the query results. However, the dashboard is slow to show the results when applying the filters to the following query:

How should you redesign the BigQuery table to support faster access?

Suggested Answer: C

To improve the performance of querying a large BigQuery table with filters on countryname and username, clustering the table by these fields is the most effective approach. Here's why option C is the best choice:

Clustering in BigQuery:

Clustering organizes data based on the values in specified columns. This can significantly improve query performance by reducing the amount of data scanned during query execution.

Clustering by countryname and username means that data is physically sorted and stored together based on these fields, allowing BigQuery to quickly locate and read only the relevant data for queries using these filters.

Filter Efficiency:

With the table clustered by countryname and username, queries that filter on these columns can benefit from efficient data retrieval, reducing the amount of data processed and speeding up query execution.

This directly addresses the performance issue of the dashboard queries that apply filters on these fields.

Steps to Implement:

Redesign the Table:

Create a new table with clustering on countryname and username:

CREATE TABLE project.dataset.new_table
CLUSTER BY countryname, username AS
SELECT * FROM project.dataset.customer_order;
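To confirm that the new table was created with the intended clustering columns and order, a query against the dataset's INFORMATION_SCHEMA.COLUMNS view can help. This is a minimal sketch that assumes the placeholder names project, dataset, and new_table from the statement above:

```sql
-- List the clustering columns of the new table in clustering order.
-- project, dataset, and new_table are the placeholder names used above.
SELECT
  column_name,
  clustering_ordinal_position
FROM project.dataset.INFORMATION_SCHEMA.COLUMNS
WHERE table_name = 'new_table'
  AND clustering_ordinal_position IS NOT NULL
ORDER BY clustering_ordinal_position;
```

If the table is clustered as intended, countryname should appear in position 1 and username in position 2.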

Migrate Data:

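The CREATE TABLE ... AS SELECT statement above copies the existing rows from the original table into the new clustered table, so no separate load step is needed. Confirm that the copy completed before switching the dashboard over.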
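One simple way to check the migration is to compare row counts between the two tables. The sketch below assumes the same placeholder table names as above; it is a basic sanity check rather than a full data validation:

```sql
-- Compare row counts of the original and the clustered table.
-- Matching counts suggest the copy is complete; they do not prove
-- that every column value is identical.
SELECT
  (SELECT COUNT(*) FROM project.dataset.customer_order) AS original_rows,
  (SELECT COUNT(*) FROM project.dataset.new_table) AS clustered_rows;
```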

Update Queries:

Modify the dashboard queries to reference the new clustered table.
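The original dashboard query is not shown in the question, so the following is only a hypothetical sketch of what an updated query might look like. The named parameters @countryname and @username are assumptions standing in for the values the dashboard filters supply:

```sql
-- Hypothetical dashboard query against the clustered table.
-- @countryname and @username are assumed query parameters bound
-- to the two dashboard filters.
SELECT *
FROM project.dataset.new_table
WHERE countryname = @countryname
  AND username = @username;
```

Because the table is clustered by countryname first and username second, filters that include countryname are the ones most likely to let BigQuery skip unrelated blocks. A filter on username alone may see less benefit.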


BigQuery Clustering Documentation

Optimizing Query Performance

Contribute your Thoughts:

Cecily
10 days ago
I'm not sure. Wouldn't partitioning by _PARTITIONTIME also help with faster access?
upvoted 0 times
...
Alba
14 days ago
Haha, I bet the support team is just going to use the filters to find out which customers have the most orders and then send them a coupon or something. Just kidding, but you know they're probably thinking it.
upvoted 0 times
...
Mattie
15 days ago
Clustering the table by country field and partitioning by username field sounds like a clever solution. That way, the data will be organized in a way that makes it easy to filter by both criteria.
upvoted 0 times
...
Carin
17 days ago
I agree with Judy. Clustering by country and partitioning by username seems like the best option for faster access.
upvoted 0 times
...
Glendora
20 days ago
Partitioning by _PARTITIONTIME is a good general-purpose approach, but in this case, it might not be the most optimal solution since the filters are based on country and username.
upvoted 0 times
...
Twila
22 days ago
Clustering the table by country and username fields could also work, but partitioning might be more efficient as it creates separate files for each combination of country and username.
upvoted 0 times
...
Tracie
24 days ago
The partitioning by country and username fields seems like a good approach to improve the query performance. That way, the data will be organized in a way that aligns with the filter criteria.
upvoted 0 times
Rana
1 day ago
The partitioning by country and username fields seems like a good approach to improve the query performance. That way, the data will be organized in a way that aligns with the filter criteria.
upvoted 0 times
...
Colette
2 days ago
A) Cluster the table by country field, and partition by username field.
upvoted 0 times
...
...
Judy
27 days ago
I think we should cluster the table by country field and partition by username field.
upvoted 0 times
...
