
Google Exam Professional Data Engineer Topic 5 Question 75 Discussion

Actual exam question for Google's Professional Data Engineer exam
Question #: 75
Topic #: 5

An aerospace company uses a proprietary data format to store its flight data. You need to connect this new data source to BigQuery and stream the data into BigQuery. You want to efficiently import the data into BigQuery while consuming as few resources as possible. What should you do?

Suggested Answer: D

Contribute your Thoughts:

Paz
10 months ago
Agreed, option D is the way to go. The only thing I'm a bit concerned about is the custom connector. I hope it's well-documented and easy to work with. Otherwise, we might spend more time than we'd like trying to get that set up. But overall, I think it's the most efficient solution.
upvoted 0 times
Lettie
10 months ago
I'm not sure about the other options. Using a standard Dataflow pipeline to store the raw data and then transform it later seems like it would waste a lot of resources. And a Dataproc job with Hive? That feels like overkill for this use case. I think the Beam/Dataflow approach is the way to go.
upvoted 0 times
Ludivina
10 months ago
I agree, option D does sound like the best approach. The Avro format will be more efficient than CSV, and the Apache Beam custom connector should give us the flexibility we need to handle the proprietary format. Plus, streaming the data directly into BigQuery will be more efficient than storing the raw data first and then transforming it.
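To make the Avro point concrete, here's a rough sketch. The flight-record fields are invented for illustration (the real schema would come from the proprietary format), but it shows how Avro keeps types and nesting that CSV would flatten into strings:

    import io
    from datetime import datetime, timezone

    from fastavro import parse_schema, writer

    # Hypothetical schema for a flight record; field names are made up.
    schema = parse_schema({
        "type": "record",
        "name": "FlightRecord",
        "fields": [
            {"name": "tail_number", "type": "string"},
            {"name": "event_time",
             "type": {"type": "long", "logicalType": "timestamp-micros"}},
            {"name": "altitude_ft", "type": "double"},
            {"name": "sensor_readings",
             "type": {"type": "array", "items": "double"}},
        ],
    })

    records = [{
        "tail_number": "N12345",
        "event_time": datetime.now(tz=timezone.utc),
        "altitude_ft": 35000.0,
        "sensor_readings": [0.98, 1.02, 0.99],  # typed array; CSV can't express this
    }]

    buf = io.BytesIO()
    writer(buf, schema, records)  # compact, self-describing binary
    print(len(buf.getvalue()), "bytes of Avro")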
upvoted 0 times
Malika
10 months ago
Streaming the data directly into BigQuery will be more efficient than storing the raw data first and then transforming it.
upvoted 0 times
Troy
10 months ago
I agree, the Apache Beam custom connector should give us the flexibility we need to handle the proprietary format.
upvoted 0 times
Desiree
10 months ago
Option D does sound like the best approach. The Avro format will be more efficient than CSV.
upvoted 0 times
Jeanice
10 months ago
Hmm, this is a tricky one. The proprietary data format is definitely a challenge, and we need to find an efficient way to get it into BigQuery. I'm leaning towards option D - using an Apache Beam custom connector to write a Dataflow pipeline that streams the data in Avro format. That way, we can preserve the structure of the data and avoid the overhead of converting to CSV.
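Roughly what I have in mind, as a sketch only. The source and parser below are placeholders I made up, since the question doesn't show the proprietary format; a real custom connector (for example a splittable DoFn) would replace the beam.Create stage:

    import apache_beam as beam
    from apache_beam.options.pipeline_options import PipelineOptions

    def parse_flight_record(raw):
        # Hypothetical decoder for the proprietary format (invented here).
        tail, ts, alt = raw.split(b"|")
        return {
            "tail_number": tail.decode(),
            "event_time": ts.decode(),
            "altitude_ft": float(alt.decode()),
        }

    def run():
        options = PipelineOptions(streaming=True)  # run in streaming mode
        with beam.Pipeline(options=options) as p:
            (
                p
                # Placeholder source; the custom Beam connector goes here.
                | "Read" >> beam.Create([b"N12345|2024-01-01T00:00:00Z|35000"])
                | "Parse" >> beam.Map(parse_flight_record)
                | "Write" >> beam.io.WriteToBigQuery(
                    "my-project:aerospace.flight_data",  # placeholder table
                    method=beam.io.WriteToBigQuery.Method.STREAMING_INSERTS,
                    write_disposition=beam.io.BigQueryDisposition.WRITE_APPEND,
                    # Table assumed to already exist with a matching schema.
                    create_disposition=beam.io.BigQueryDisposition.CREATE_NEVER,
                )
            )

    if __name__ == "__main__":
        run()

The key point is that records go straight from the parser into BigQuery, with no intermediate raw table to store and re-transform later.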
upvoted 0 times
