
Exam Associate Data Practitioner topic 1 question 41 discussion

Actual exam question from Google's Associate Data Practitioner
Question #: 41
Topic #: 1

Your organization needs to implement near real-time analytics for thousands of events arriving each second in Pub/Sub. The incoming messages require transformations. You need to configure a pipeline that processes, transforms, and loads the data into BigQuery while minimizing development time. What should you do?

  • A. Use a Google-provided Dataflow template to process the Pub/Sub messages, perform transformations, and write the results to BigQuery.
  • B. Create a Cloud Data Fusion instance and configure Pub/Sub as a source. Use Data Fusion to process the Pub/Sub messages, perform transformations, and write the results to BigQuery.
  • C. Load the data from Pub/Sub into Cloud Storage using a Cloud Storage subscription. Create a Dataproc cluster, use PySpark to perform transformations in Cloud Storage, and write the results to BigQuery.
  • D. Use Cloud Run functions to process the Pub/Sub messages, perform transformations, and write the results to BigQuery.
Suggested Answer: A

Comments

n2183712847
1 month, 3 weeks ago
Selected Answer: A
Dataflow is built for real-time streaming.
upvoted 1 times
n2183712847
2 months ago
Selected Answer: A
The best way to minimize development time while implementing near real-time analytics for high-volume Pub/Sub events, with transformations and BigQuery loading, is A: use a Google-provided Dataflow template. Dataflow templates are pre-built, optimized streaming pipelines, so they drastically reduce development effort. Option B, Cloud Data Fusion, is a reasonable visual alternative but requires provisioning an instance and building a pipeline, which is more initial setup than deploying a template. Option C, Dataproc with PySpark over a Cloud Storage subscription, is batch-oriented, significantly more complex, and more time-consuming. Option D, Cloud Run functions, is serverless but requires custom code and becomes harder to manage for complex, high-volume streaming workloads than a dedicated pipeline service. Option A is therefore the most efficient and fastest path to implementation.
upvoted 1 times
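To make answer A concrete, a Google-provided streaming template can be launched with a single gcloud command. This is a minimal sketch; the project, subscription, dataset, and table names are placeholders, and the template path assumes the classic "Pub/Sub Subscription to BigQuery" template in the us-central1 template bucket:

```shell
# Launch the Google-provided "Pub/Sub Subscription to BigQuery" streaming
# template. All resource names below are hypothetical placeholders.
gcloud dataflow jobs run pubsub-to-bq-demo \
  --region=us-central1 \
  --gcs-location=gs://dataflow-templates-us-central1/latest/PubSub_Subscription_to_BigQuery \
  --parameters=\
inputSubscription=projects/my-project/subscriptions/my-sub,\
outputTableSpec=my-project:my_dataset.my_table
```

For light message transformations, this template also accepts a JavaScript UDF (via the javascriptTextTransformGcsPath and javascriptTextTransformFunctionName parameters), so simple reshaping can be done without writing a custom pipeline.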
Community vote distribution: A (35%), C (25%), B (20%), Other