
Exam Professional Machine Learning Engineer topic 1 question 116 discussion

Actual exam question from Google's Professional Machine Learning Engineer
Question #: 116
Topic #: 1

You work for a biotech startup that is experimenting with deep learning ML models based on properties of biological organisms. Your team frequently works on early-stage experiments with new architectures of ML models, and writes custom TensorFlow ops in C++. You train your models on large datasets and large batch sizes. Your typical batch size has 1024 examples, and each example is about 1 MB in size. The average size of a network with all weights and embeddings is 20 GB. What hardware should you choose for your models?

  • A. A cluster with 2 n1-highcpu-64 machines, each with 8 NVIDIA Tesla V100 GPUs (128 GB GPU memory in total), and a n1-highcpu-64 machine with 64 vCPUs and 58 GB RAM
  • B. A cluster with 2 a2-megagpu-16g machines, each with 16 NVIDIA Tesla A100 GPUs (640 GB GPU memory in total), 96 vCPUs, and 1.4 TB RAM
  • C. A cluster with an n1-highcpu-64 machine with a v2-8 TPU and 64 GB RAM
  • D. A cluster with 4 n1-highcpu-96 machines, each with 96 vCPUs and 86 GB RAM
Suggested Answer: D

Comments

aw_49
Highly Voted 1 year, 11 months ago
Selected Answer: D
D: use CPUs for models that contain many custom TensorFlow operations written in C++: https://cloud.google.com/tpu/docs/intro-to-tpu#cpus
upvoted 7 times
rajshiv
Most Recent 4 months, 4 weeks ago
Selected Answer: B
I do not agree that D is correct. Option D provides significant CPU resources, but it lacks GPU acceleration, which is necessary for efficiently training large deep learning models on large datasets. While CPUs can handle certain operations, they are generally much slower than GPUs or TPUs for training deep learning models. Option B provides the best hardware for a deep learning workload, offering 16 NVIDIA A100 GPUs with 640 GB of GPU memory, along with sufficient CPU and RAM resources to handle large datasets and complex model architectures.
upvoted 2 times
bc3f222
1 month, 2 weeks ago
TensorFlow operations written in C++, so D
upvoted 1 times
edoo
1 year, 1 month ago
Selected Answer: D
B looks like unleashing a rocket launcher to swat a fly ("early-stage experiments"). D is enough (C++).
upvoted 2 times
tavva_prudhvi
1 year, 9 months ago
While it is true that using CPUs can be more efficient for custom TensorFlow operations written in C++, it is important to consider the specific requirements of the models. In this case, the question mentions large batch sizes (1024 examples), large examples (1 MB each), and a large network (20 GB). Option D offers 4 n1-highcpu-96 machines, each with 96 vCPUs and 86 GB RAM. While this configuration provides plenty of vCPUs for custom TensorFlow operations, it lacks the GPU memory and overall RAM needed to handle batch and network sizes this large.
upvoted 1 times
ciro_li
1 year, 9 months ago
B: https://cloud.google.com/tpu/docs/intro-to-tpu#cpus
upvoted 2 times
pinimichele01
1 year ago
so D, not B...
upvoted 1 times
Voyager2
1 year, 10 months ago
Selected Answer: D
D: use CPUs for models that contain many custom TensorFlow operations written in C++: https://cloud.google.com/tpu/docs/intro-to-tpu#cpus
upvoted 3 times
LoveExams
1 year, 11 months ago
Wouldn't any PCs work here? I could train this model on my own home PC just fine.
upvoted 3 times
M25
1 year, 11 months ago
Selected Answer: D
“writes custom TensorFlow ops in C++” -> use CPUs when “Models that contain many custom TensorFlow operations written in C++”: https://cloud.google.com/tpu/docs/intro-to-tpu#when_to_use_tpus
upvoted 2 times
Antmal
2 years ago
Selected Answer: B
The best hardware for these models would be a cluster with 2 a2-megagpu-16g machines, each with 16 NVIDIA Tesla A100 GPUs (640 GB GPU memory in total), 96 vCPUs, and 1.4 TB RAM. This hardware gives you the following benefits:
  • High GPU memory: each A100 GPU has 40 GB of memory, which is more than enough to store the weights and embeddings of the models.
  • Large batch sizes: with 16 GPUs per machine, you can train with large batch sizes, which improves training speed.
  • Fast CPUs: the 96 vCPUs on each machine provide the processing power needed to run the custom TensorFlow ops written in C++.
  • Adequate RAM: the 1.4 TB of RAM on each machine ensures the models have enough memory to train and run.
The other options are less suitable. Option A has less GPU memory, which will slow down training. Option C has a TPU, which is a good choice for some deep learning tasks but is not as well suited here as a GPU cluster. Option D has more vCPUs and RAM, but no GPUs at all to accelerate training. Therefore, the best hardware is a cluster with 2 a2-megagpu-16g machines.
upvoted 4 times
TNT87
2 years, 1 month ago
Selected Answer: B
To determine the appropriate hardware for training the models, we need to calculate the required memory and processing power based on the size of the model and the size of the input data. Given that the batch size is 1024 and each example is 1 MB, the total size of each batch is 1024 * 1 MB = 1024 MB = 1 GB. Therefore, we need to load 1 GB of data into memory for each batch. The total size of the network is 20 GB, which fits within the 40 GB memory of a single A100 GPU, though not within one of the 16 GB V100s from option A.
upvoted 3 times
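The arithmetic in the comment above can be sanity-checked with a quick sketch. The batch and model sizes come from the question stem; the 86 GB figure is option D's per-machine RAM:

```python
# Back-of-the-envelope memory check using the numbers from the question.
batch_examples = 1024
example_mb = 1            # ~1 MB per example
model_gb = 20             # weights + embeddings

batch_gb = batch_examples * example_mb / 1024
print(f"One batch: {batch_gb:.0f} GB")   # 1024 * 1 MB = 1 GB

# Option D: each n1-highcpu-96 machine has 86 GB RAM.
per_machine_ram_gb = 86
headroom_gb = per_machine_ram_gb - model_gb - batch_gb
print(f"RAM left after model + one batch: {headroom_gb:.0f} GB")
```

So a single option-D machine can hold the 20 GB model plus a 1 GB batch with roughly 65 GB of RAM to spare, which supports the argument that CPU-only hardware is sufficient for these early-stage experiments with custom C++ ops.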
JeanEl
2 years, 3 months ago
Selected Answer: D
It's D
upvoted 1 times
JeanEl
2 years, 3 months ago
https://cloud.google.com/tpu/docs/tpus
upvoted 2 times
hiromi
2 years, 4 months ago
Selected Answer: D
D. CPUs are recommended for TensorFlow ops written in C++: https://cloud.google.com/tpu/docs/tensorflow-ops (Cloud TPU only supports Python)
upvoted 3 times
John_Pongthorn
2 years, 3 months ago
GPUs can run custom ops implemented in C++, but option C (the TPU) is ruled out for sure.
upvoted 3 times
Community vote distribution: A (35%), C (25%), B (20%), Other