Exam Associate Data Practitioner topic 1 question 21 discussion

Actual exam question from Google's Associate Data Practitioner
Question #: 21
Topic #: 1
You need to create a weekly aggregated sales report based on a large volume of data. You want to use Python to design an efficient process for generating this report. What should you do?

  • A. Create a Cloud Run function that uses NumPy. Use Cloud Scheduler to schedule the function to run once a week.
  • B. Create a Colab Enterprise notebook and use the bigframes.pandas library. Schedule the notebook to execute once a week.
  • C. Create a Cloud Data Fusion and Wrangler flow. Schedule the flow to run once a week.
  • D. Create a Dataflow directed acyclic graph (DAG) coded in Python. Use Cloud Scheduler to schedule the code to run once a week.
Suggested Answer: D

Comments

Rio55
1 month, 1 week ago
Selected Answer: D
Option D (a Dataflow DAG coded in Python) and Option B (bigframes.pandas in Colab Enterprise) seem the most suitable. Dataflow is designed for parallel processing of large datasets, which makes Option D efficient; Python is the required language, and Cloud Scheduler handles the weekly schedule. Option B is also a possibility for combining Python with BigQuery, but Dataflow is generally more robust and scalable for production ETL pipelines over large datasets. Option A (Cloud Run with NumPy) may hit memory and execution-time limits with large data, and Option C (Data Fusion with Wrangler) does not use Python code to design the process. I would choose D.
upvoted 2 times
n2183712847
2 months ago
Selected Answer: B
Option B (Colab Enterprise + bigframes.pandas) and Option D (Dataflow DAG in Python) are the most efficient options for handling large data volumes with Python. Option B is likely more straightforward and faster to implement for generating a report, especially for an analyst familiar with pandas and notebooks: bigframes.pandas simplifies working with BigQuery data from a Python environment for reporting purposes. Option D is more robust and scalable for general data pipelines and potentially more complex transformations, but may be overkill for a weekly reporting task compared with the ease of use of bigframes.pandas in Colab Enterprise. Given the need for an "efficient process for generating this report" and the requirement to use Python, Option B provides an efficient, relatively simple path: a weekly aggregated sales report built directly from BigQuery using familiar pandas-like syntax in a scheduled Colab Enterprise notebook. Final answer: B.
upvoted 1 times
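The weekly aggregation that Option B describes can be sketched with plain pandas syntax, which bigframes.pandas deliberately mirrors; in a Colab Enterprise notebook you would import `bigframes.pandas` and read from a BigQuery table instead of building an in-memory frame, but the groupby logic is the same. The table contents and column names (`order_ts`, `region`, `amount`) below are hypothetical, purely for illustration:

```python
# Minimal sketch of a weekly aggregated sales report, assuming hypothetical
# columns order_ts / region / amount. In Colab Enterprise, bigframes.pandas
# exposes this same API while pushing the computation down to BigQuery.
import pandas as pd

sales = pd.DataFrame(
    {
        "order_ts": pd.to_datetime(
            ["2024-01-01", "2024-01-03", "2024-01-08", "2024-01-10"]
        ),
        "region": ["EMEA", "EMEA", "APAC", "APAC"],
        "amount": [100.0, 50.0, 75.0, 25.0],
    }
)

# Bucket each sale into its week, then sum amounts per week and region.
weekly = (
    sales.assign(week=sales["order_ts"].dt.to_period("W").dt.start_time)
    .groupby(["week", "region"], as_index=False)["amount"]
    .sum()
)
print(weekly)
```

With bigframes.pandas the heavy lifting runs inside BigQuery, so the notebook stays lightweight even for large tables; the notebook itself is then scheduled to run weekly.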
Community vote distribution: A (35%), C (25%), B (20%), Other