
Exam Associate Data Practitioner topic 1 question 41 discussion

Actual exam question from Google's Associate Data Practitioner
Question #: 41
Topic #: 1

Your organization needs to implement near real-time analytics for thousands of events arriving each second in Pub/Sub. The incoming messages require transformations. You need to configure a pipeline that processes, transforms, and loads the data into BigQuery while minimizing development time. What should you do?

  • A. Use a Google-provided Dataflow template to process the Pub/Sub messages, perform transformations, and write the results to BigQuery.
  • B. Create a Cloud Data Fusion instance and configure Pub/Sub as a source. Use Data Fusion to process the Pub/Sub messages, perform transformations, and write the results to BigQuery.
  • C. Load the data from Pub/Sub into Cloud Storage using a Cloud Storage subscription. Create a Dataproc cluster, use PySpark to perform transformations in Cloud Storage, and write the results to BigQuery.
  • D. Use Cloud Run functions to process the Pub/Sub messages, perform transformations, and write the results to BigQuery.
Suggested Answer: A

Comments

n2183712847
1 month, 3 weeks ago
Selected Answer: A
Dataflow is built for real-time streaming.
upvoted 1 times
n2183712847
2 months ago
Selected Answer: A
The best way to minimize development time while implementing near real-time analytics for high-volume Pub/Sub events, with transformations and BigQuery loading, is A: use a Google-provided Dataflow template. Dataflow templates are pre-built, optimized streaming pipelines, so they drastically reduce development effort. Option B, Cloud Data Fusion, is a reasonable visual alternative but requires provisioning an instance and building a pipeline, which is more initial setup than deploying a template. Option C, Dataproc with PySpark over a Cloud Storage subscription, is batch-oriented, significantly more complex, and more time-consuming. Option D, Cloud Run functions, is serverless but requires custom code and becomes harder to manage for complex, high-volume streaming workloads than a dedicated pipeline service. Option A is therefore the most efficient and fastest path to implementation.
upvoted 1 times
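To make answer A concrete, a Google-provided streaming template can be launched with a single gcloud command. This is a minimal sketch; the project, subscription, dataset, and table names are placeholders, and the template path assumes the classic "Pub/Sub Subscription to BigQuery" template in the us-central1 template bucket:

```shell
# Launch the Google-provided "Pub/Sub Subscription to BigQuery" streaming
# template. All resource names below are hypothetical placeholders.
gcloud dataflow jobs run pubsub-to-bq-demo \
  --region=us-central1 \
  --gcs-location=gs://dataflow-templates-us-central1/latest/PubSub_Subscription_to_BigQuery \
  --parameters=\
inputSubscription=projects/my-project/subscriptions/my-sub,\
outputTableSpec=my-project:my_dataset.my_table
```

For light message transformations, this template also accepts a JavaScript UDF (via the javascriptTextTransformGcsPath and javascriptTextTransformFunctionName parameters), so simple reshaping can be done without writing a custom pipeline.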
Community vote distribution: A (35%), C (25%), B (20%), Other