
Exam Professional Machine Learning Engineer topic 1 question 273 discussion

Actual exam question from Google's Professional Machine Learning Engineer
Question #: 273
Topic #: 1

You work for a rapidly growing social media company. Your team builds TensorFlow recommender models in an on-premises CPU cluster. The data contains billions of historical user events and 100,000 categorical features. You notice that as the data increases, the model training time increases. You plan to move the models to Google Cloud. You want to use the most scalable approach that also minimizes training time. What should you do?

  • A. Deploy the training jobs by using TPU VMs with TPUv3 Pod slices, and use the TPUEmbedding API
  • B. Deploy the training jobs in an autoscaling Google Kubernetes Engine cluster with CPUs
  • C. Deploy a matrix factorization model training job by using BigQuery ML
  • D. Deploy the training jobs by using Compute Engine instances with A100 GPUs, and use the tf.nn.embedding_lookup API
Suggested Answer: A 🗳️

Comments

daidai75
Highly Voted 9 months, 3 weeks ago
Selected Answer: A
TPU (Tensor Processing Unit) VMs are specialized hardware accelerators designed by Google specifically for machine learning workloads. TPUv3 Pod slices offer high scalability and are excellent for distributed training. The TPUEmbedding API is optimized for handling large volumes of categorical features, which fits this scenario's 100,000 categorical features. This option is likely to offer the fastest training time, thanks to specialized hardware and APIs optimized for large-scale machine learning.
upvoted 8 times
...
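The embedding lookup the comment describes can be illustrated with a minimal, framework-free sketch. All names and sizes below are hypothetical toy values, not from the question; the point is only the core operation that TPUEmbedding shards and accelerates across a Pod slice: each categorical ID indexes a row of a trainable table, and the gathered rows feed the recommender's dense layers.

```python
VOCAB_SIZE = 8   # real recommenders: millions of IDs per feature
EMBED_DIM = 4    # real recommenders: tens to hundreds of dimensions

# One embedding table per categorical feature (the question has 100,000
# features, hence 100,000 such tables to shard across accelerator memory).
table = [[float(row * EMBED_DIM + col) for col in range(EMBED_DIM)]
         for row in range(VOCAB_SIZE)]

def embedding_lookup(table, ids):
    """Gather one table row per categorical ID, like tf.nn.embedding_lookup."""
    return [table[i] for i in ids]

batch_ids = [3, 0, 3]  # a batch of values for one categorical feature
vectors = embedding_lookup(table, batch_ids)
print(vectors[0])  # row 3 of the table: [12.0, 13.0, 14.0, 15.0]
```

On a CPU cluster these gathers, and the gradient scatters on the backward pass, dominate training time at billions of events; TPUEmbedding keeps the tables in TPU memory and performs them on-chip.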
omermahgoub
Most Recent 6 months, 2 weeks ago
Selected Answer: A
Addressing the bottleneck: as data size increases, CPU-based training becomes increasingly slow; TPUs are specifically designed to address this challenge and significantly accelerate training. Large categorical features: the TPUEmbedding API efficiently handles embedding lookups for a vast number of categorical features, a common characteristic of recommender-system data.
upvoted 4 times
...
JG123
7 months, 3 weeks ago
Option C
upvoted 1 times
...
guilhermebutzke
8 months, 2 weeks ago
Selected Answer: A
My answer: A. The most scalable approach that also minimizes training time: TPUs with the TPUEmbedding API. https://www.tensorflow.org/api_docs/python/tf/tpu/experimental/embedding/TPUEmbedding
upvoted 2 times
...
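For concreteness, here is a configuration sketch of the TPUEmbedding API linked above. It is not runnable without a TPU runtime, and the table size, feature name, and optimizer settings are illustrative assumptions, not details from the question.

```python
import tensorflow as tf

# Hypothetical sizes and names for illustration only.
# One TableConfig per categorical feature; TPUEmbedding shards these
# tables across the memory of all chips in the Pod slice.
user_table = tf.tpu.experimental.embedding.TableConfig(
    vocabulary_size=1_000_000,  # IDs for one categorical feature
    dim=64,                     # embedding dimension
    combiner="mean",
    name="user_id_table",
)

feature_config = {
    "user_id": tf.tpu.experimental.embedding.FeatureConfig(
        table=user_table, name="user_id"
    ),
}

# Embedding-specific optimizer applied on-chip during training.
optimizer = tf.tpu.experimental.embedding.Adagrad(learning_rate=0.1)

# On an actual TPU, the embedding is built inside a TPUStrategy scope:
# resolver = tf.distribute.cluster_resolver.TPUClusterResolver(tpu="...")
# strategy = tf.distribute.TPUStrategy(resolver)
# with strategy.scope():
#     embedding = tf.tpu.experimental.embedding.TPUEmbedding(
#         feature_config=feature_config, optimizer=optimizer
#     )
```

This is what distinguishes option A from option D: `tf.nn.embedding_lookup` on GPUs keeps each table on a single device, while TPUEmbedding shards the tables across the whole Pod slice.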
Community vote distribution
A (35%)
C (25%)
B (20%)
Other
