
Google Exam Professional Data Engineer Topic 5 Question 75 Discussion

Actual exam question for Google's Professional Data Engineer exam
Question #: 75
Topic #: 5

An aerospace company uses a proprietary data format to store its flight data. You need to connect this new data source to BigQuery and stream the data into BigQuery. You want to efficiently import the data into BigQuery while consuming as few resources as possible. What should you do?

Suggested Answer: D

Contribute your Thoughts:

Paz
7 months ago
Agreed, option D is the way to go. The only thing I'm a bit concerned about is the custom connector. I hope it's well-documented and easy to work with. Otherwise, we might spend more time than we'd like trying to get that set up. But overall, I think it's the most efficient solution.
upvoted 0 times
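
To Paz's point about setup cost: in the Beam Python SDK, a custom connector can start as a composite PTransform that wraps the parsing step, so the initial lift is smaller than it sounds. A minimal sketch, assuming a hypothetical record layout with a 4-byte version header; the class and field names are illustrative, not from the question:

```python
import apache_beam as beam


class ReadProprietaryFormat(beam.PTransform):
    """Composite transform: raw proprietary bytes in, dict rows out."""

    def expand(self, pcoll):
        return pcoll | "ParseRecord" >> beam.Map(self._parse)

    @staticmethod
    def _parse(raw_bytes):
        # Hypothetical layout: 4-byte big-endian version, then a UTF-8 payload.
        # A real connector implements the vendor's actual binary spec here.
        version = int.from_bytes(raw_bytes[:4], "big")
        payload = raw_bytes[4:].decode("utf-8", errors="replace")
        return {"version": version, "payload": payload}
```

If the format later needs parallel, checkpointed reads, the documented growth path is to rewrite the read side as a splittable DoFn, so starting with a simple PTransform does not paint you into a corner.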
Lettie
7 months ago
I'm not sure about the other options. Using a standard Dataflow pipeline to store the raw data and then transform it later seems like it would waste a lot of resources. And a Dataproc job with Hive? That feels like overkill for this use case. I think the Beam/Dataflow approach is the way to go.
upvoted 0 times
Ludivina
7 months ago
I agree, option D does sound like the best approach. The Avro format will be more efficient than CSV, and the Apache Beam custom connector should give us the flexibility we need to handle the proprietary format. Plus, streaming the data directly into BigQuery will be more efficient than storing the raw data first and then transforming it.
upvoted 0 times
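
A quick way to sanity-check the Avro-versus-CSV claim is to serialize the same rows both ways and compare sizes. A small sketch using fastavro; the schema and flight rows are made up for the comparison:

```python
import csv
import io

from fastavro import parse_schema, writer

schema = parse_schema({
    "type": "record",
    "name": "FlightRecord",
    "fields": [
        {"name": "flight_id", "type": "string"},
        {"name": "altitude_ft", "type": "long"},
    ],
})
rows = [{"flight_id": f"FL{i:04d}", "altitude_ft": 30000 + i} for i in range(1000)]

# Avro: schema-tagged binary container.
avro_buf = io.BytesIO()
writer(avro_buf, schema, rows)

# CSV: plain text with a header row.
csv_buf = io.StringIO()
w = csv.DictWriter(csv_buf, fieldnames=["flight_id", "altitude_ft"])
w.writeheader()
w.writerows(rows)

print("avro bytes:", len(avro_buf.getvalue()))
print("csv bytes: ", len(csv_buf.getvalue().encode("utf-8")))
```

Size aside, the bigger win is that Avro carries the schema and native types, so the pipeline and BigQuery skip per-row text parsing and type coercion.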
Jeanice
7 months ago
Hmm, this is a tricky one. The proprietary data format is definitely a challenge, and we need to find an efficient way to get it into BigQuery. I'm leaning towards option D - using an Apache Beam custom connector to write a Dataflow pipeline that streams the data in Avro format. That way, we can preserve the structure of the data and avoid the overhead of converting to CSV.
upvoted 0 times
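
For anyone who wants to see the shape of option D end to end, here is a minimal sketch in the Beam Python SDK. The Pub/Sub topic, project, table, schema, and the JSON stand-in for the proprietary decoder are all assumptions made so the example runs; they are not part of the question:

```python
import json

import apache_beam as beam
from apache_beam.options.pipeline_options import PipelineOptions


def decode_record(raw_bytes):
    # Stand-in for the proprietary-format parser; JSON is used here only
    # so the sketch is runnable. A real pipeline would decode the vendor
    # format (e.g., via a custom PTransform) at this step.
    fields = json.loads(raw_bytes.decode("utf-8"))
    return {"flight_id": fields["flight_id"],
            "altitude_ft": int(fields["altitude_ft"])}


def run():
    options = PipelineOptions(streaming=True)
    with beam.Pipeline(options=options) as pipeline:
        (
            pipeline
            | "ReadSource" >> beam.io.ReadFromPubSub(
                topic="projects/example-project/topics/flight-data")
            | "DecodeProprietary" >> beam.Map(decode_record)
            | "WriteToBigQuery" >> beam.io.WriteToBigQuery(
                "example-project:aerospace.flight_data",
                schema="flight_id:STRING,altitude_ft:INTEGER",
                create_disposition=beam.io.BigQueryDisposition.CREATE_IF_NEEDED,
                write_disposition=beam.io.BigQueryDisposition.WRITE_APPEND)
        )


if __name__ == "__main__":
    run()
```

On recent Beam releases, passing method="STORAGE_WRITE_API" to WriteToBigQuery selects the BigQuery Storage Write API, which is the lower-cost streaming path and fits the question's "as few resources as possible" constraint.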
