Exam Associate Data Practitioner topic 1 question 25 discussion

Actual exam question from Google's Associate Data Practitioner
Question #: 25
Topic #: 1

Your team is building several data pipelines that contain a collection of complex tasks and dependencies that you want to execute on a schedule, in a specific order. The tasks and dependencies consist of files in Cloud Storage, Apache Spark jobs, and data in BigQuery. You need to design a system that can schedule and automate these data processing tasks using a fully managed approach. What should you do?

  • A. Use Cloud Scheduler to schedule the jobs to run.
  • B. Use Cloud Tasks to schedule and run the jobs asynchronously.
  • C. Create directed acyclic graphs (DAGs) in Cloud Composer. Use the appropriate operators to connect to Cloud Storage, Spark, and BigQuery.
  • D. Create directed acyclic graphs (DAGs) in Apache Airflow deployed on Google Kubernetes Engine. Use the appropriate operators to connect to Cloud Storage, Spark, and BigQuery.
Suggested Answer: C

Comments

n2183712847
5 months, 1 week ago
Selected Answer: C
The best fully managed option for scheduling and automating these pipelines is C: create DAGs in Cloud Composer and use the appropriate operators. Cloud Composer is Google Cloud's fully managed Apache Airflow service. It is designed to orchestrate complex workflows with dependencies and offers operators for Cloud Storage, Spark (via Dataproc), and BigQuery. Option D (Airflow on GKE) is not fully managed and adds operational overhead. Options A (Cloud Scheduler) and B (Cloud Tasks) are not designed for workflow orchestration or dependency management: Cloud Scheduler triggers jobs on a cron schedule and Cloud Tasks dispatches asynchronous tasks from a queue, but neither tracks dependencies between steps.
upvoted 1 times
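
To make the "specific order" requirement from the question concrete, here is a minimal sketch of what a DAG adds over a plain cron trigger. It uses only Python's standard library (`graphlib`), not Airflow or Cloud Composer, and the task names are hypothetical placeholders for the Cloud Storage, Spark, and BigQuery steps. In an actual Composer environment these steps would presumably be Airflow tasks built from the Google provider's operators and sensors rather than dictionary keys.

```python
from graphlib import CycleError, TopologicalSorter

# Hypothetical pipeline: each task maps to the set of tasks it depends on.
# The names are illustrative only and do not come from the question.
pipeline = {
    "wait_for_gcs_file": set(),
    "run_spark_job": {"wait_for_gcs_file"},
    "load_into_bigquery": {"run_spark_job"},
    "build_report_table": {"load_into_bigquery"},
}


def execution_order(dag):
    """Return one valid run order in which every task follows its dependencies.

    Raises graphlib.CycleError if the graph contains a cycle, which is why
    an orchestrator requires the graph to be acyclic.
    """
    return list(TopologicalSorter(dag).static_order())


order = execution_order(pipeline)
print(order)

# A cyclic "pipeline" cannot be ordered at all.
cyclic = {"a": {"b"}, "b": {"a"}}
try:
    execution_order(cyclic)
    cycle_detected = False
except CycleError:
    cycle_detected = True
print("cycle detected:", cycle_detected)
```

A scheduler such as Cloud Scheduler only answers "when does this job start?", whereas the dependency graph above also answers "what must finish first?", which is the part the question asks for.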