Welcome to Pass4Success


Google Associate Data Practitioner Exam - Topic 4 Question 8 Discussion

Actual exam question for Google's Associate Data Practitioner exam
Question #: 8
Topic #: 4

You need to create a data pipeline that streams event information from applications in multiple Google Cloud regions into BigQuery for near real-time analysis. The data requires transformation before loading. You want to create the pipeline using a visual interface. What should you do?

Suggested Answer: A

Pushing event information to a Pub/Sub topic and then building the pipeline with the Dataflow job builder is the most suitable option. The job builder provides a visual interface in the Google Cloud console for designing Dataflow pipelines: you read from the Pub/Sub topic, define transformations, and write the results to BigQuery. Because Pub/Sub is a global service, applications in multiple Google Cloud regions can publish to a single topic, and a streaming Dataflow job then delivers the transformed data to BigQuery with the low latency needed for near real-time analysis.
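For intuition, the kind of transformation the Dataflow step applies can be sketched in plain Python. The event fields (`id`, `type`, `ts`, `region`) and the output row schema below are made-up examples, not from the question; a real job-builder pipeline would express the same mapping as a visual transform step rather than code.

```python
import json
from datetime import datetime, timezone

def transform_event(message_data: bytes) -> dict:
    """Turn a raw Pub/Sub message body into a BigQuery-ready row.

    Field names here are illustrative placeholders.
    """
    event = json.loads(message_data.decode("utf-8"))
    return {
        "event_id": event["id"],
        "event_type": event.get("type", "unknown"),
        # Normalize the epoch-seconds timestamp to an ISO-8601 string.
        "event_time": datetime.fromtimestamp(
            event["ts"], tz=timezone.utc
        ).isoformat(),
        "region": event.get("region", ""),
    }

# Example: one event as it might arrive on the Pub/Sub topic.
raw = json.dumps(
    {"id": "e-1", "type": "click", "ts": 1700000000, "region": "europe-west1"}
).encode("utf-8")
row = transform_event(raw)
print(row["event_time"])  # 2023-11-14T22:13:20+00:00
```

In the job builder this parse-and-reshape step is configured visually, so no code is deployed by hand; the sketch only shows what the step does to each message.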


Contribute your Thoughts:

Antonio
4 months ago
Wait, can we really do all that with just a visual interface?
upvoted 0 times
...
Jame
4 months ago
D seems outdated for this use case, right?
upvoted 0 times
...
Jose
4 months ago
C doesn't really fit the near real-time requirement.
upvoted 0 times
...
Giuseppe
4 months ago
I think B could work too, but it might be slower.
upvoted 0 times
...
Bong
5 months ago
Option A is the best choice for real-time processing!
upvoted 0 times
...
Tayna
5 months ago
I vaguely remember something about using Cloud Storage and external tables, but that seems more suited for batch processing rather than real-time analysis.
upvoted 0 times
...
Kattie
5 months ago
I feel like option A is a solid choice since Dataflow is designed for data processing, but I wonder if the visual interface part is covered there.
upvoted 0 times
...
Katheryn
5 months ago
I think option B sounds familiar because we practiced using Cloud Run for transformations, but I can't recall if it's the most efficient way to load data into BigQuery.
upvoted 0 times
...
Audrie
5 months ago
I remember we discussed using Pub/Sub for streaming data, but I'm not sure if Dataflow is the best choice for this specific scenario.
upvoted 0 times
...
Rusty
5 months ago
I like the simplicity of option C with the Pub/Sub to BigQuery subscription. That could be a good way to get the data loaded quickly without having to worry about the transformation step. I'll have to think through the pros and cons of that versus the other options.
upvoted 0 times
...
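The direct Pub/Sub-to-BigQuery subscription Rusty mentions can be created with a single `gcloud` command; the topic, subscription, project, and table names below are placeholders, not from the question.

```shell
# Sketch only: all names are hypothetical. A BigQuery subscription writes
# messages straight into the table with no transformation step, which is
# why it misses this question's transformation requirement.
gcloud pubsub subscriptions create events-to-bq-sub \
  --topic=events-topic \
  --bigquery-table=my-project:analytics.raw_events \
  --use-topic-schema
```

Because there is nowhere to attach a transform, this path suits pipelines whose messages already match the table schema.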
Ronnie
5 months ago
Option D with Cloud Storage and a scheduled BigQuery job seems like it could work, but I'm not sure if that fully meets the "near real-time" analysis requirement. I'd probably lean more towards A or B to get the data in faster.
upvoted 0 times
...
Cordell
5 months ago
Hmm, I'm a little unsure about this one. The requirement to use a visual interface makes me think option B with Cloud Run might be the way to go, but I'm not 100% confident that's the best approach. I'll need to review the details carefully.
upvoted 0 times
...
Sylvie
5 months ago
This looks like a pretty straightforward data pipeline setup. I think I'd go with option A - using Pub/Sub and Dataflow seems like the most direct way to get the data into BigQuery with the required transformations.
upvoted 0 times
...
