Exam Associate Data Practitioner topic 1 question 30 discussion

Actual exam question from Google's Associate Data Practitioner
Question #: 30
Topic #: 1

You are predicting customer churn for a subscription-based service. You have a 50 PB historical customer dataset in BigQuery that includes demographics, subscription information, and engagement metrics. You want to build a churn prediction model with minimal overhead. You want to follow the Google-recommended approach. What should you do?

  • A. Export the data from BigQuery to a local machine. Use scikit-learn in a Jupyter notebook to build the churn prediction model.
  • B. Use Dataproc to create a Spark cluster. Use the Spark MLlib within the cluster to build the churn prediction model.
  • C. Create a Looker dashboard that is connected to BigQuery. Use LookML to predict churn.
  • D. Use the BigQuery Python client library in a Jupyter notebook to query and preprocess the data in BigQuery. Use the CREATE MODEL statement in BigQuery ML to train the churn prediction model.
Suggested Answer: D

Comments

n2183712847
2 months ago
Selected Answer: D
The Google-recommended approach for building a churn model on a 50 PB BigQuery dataset with minimal overhead is D: use the BigQuery Python client library and BigQuery ML. BigQuery ML trains the model inside BigQuery, so the data never has to be exported or moved. Option A (local scikit-learn) is impractical at this dataset size. Option B (Dataproc/Spark) adds cluster management overhead and means reading the data out of BigQuery into Spark. Option C (Looker) is a BI tool, not a tool for ML model development. Option D is therefore the most efficient and scalable choice, and the one that follows Google's recommendations for machine learning on BigQuery data.
upvoted 1 times
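
For anyone who wants to see what option D looks like in practice, here is a minimal sketch, not an official Google sample. The project, dataset, table, and column names are hypothetical, and logistic regression is just one plausible model type for a churn label. The helper only builds the CREATE MODEL statement; `train_churn_model` submits it through the BigQuery Python client and assumes `google-cloud-bigquery` is installed and credentials are configured.

```python
def build_create_model_sql(model_id, source_table, label_col, feature_cols):
    """Return a BigQuery ML CREATE MODEL statement for a logistic regression classifier."""
    columns = ",\n  ".join([*feature_cols, label_col])
    return (
        f"CREATE OR REPLACE MODEL `{model_id}`\n"
        f"OPTIONS (model_type = 'LOGISTIC_REG', input_label_cols = ['{label_col}']) AS\n"
        f"SELECT\n  {columns}\n"
        f"FROM `{source_table}`"
    )


def train_churn_model(sql):
    """Submit the statement to BigQuery; training runs inside BigQuery, not locally."""
    # Imported here so the SQL helper above works without the client library.
    from google.cloud import bigquery

    client = bigquery.Client()
    client.query(sql).result()  # blocks until the training job finishes


# Hypothetical names, for illustration only.
sql = build_create_model_sql(
    model_id="my_project.churn.churn_model",
    source_table="my_project.churn.customer_history",
    label_col="churned",
    feature_cols=["age", "plan_type", "monthly_logins"],
)
print(sql)
```

Only the SQL text leaves the notebook; the 50 PB table stays in BigQuery, which is why this option has so little overhead compared with A and B.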