You built a custom ML model using scikit-learn. Training time is taking longer than expected. You decide to migrate your model to Vertex AI Training, and you want to improve the model’s training time. What should you try out first?
A. Train your model in a distributed mode using multiple Compute Engine VMs.
B. Train your model using Vertex AI Training with CPUs.
C. Migrate your model to TensorFlow, and train it using Vertex AI Training.
D. Train your model using Vertex AI Training with GPUs.
Scikit-learn generally relies on CPU-based computation and does not natively leverage GPUs for most algorithms.
Answer B is the best first step to improve training time without sacrificing model performance:
✅ Minimal changes – you can quickly migrate your existing scikit-learn code to Vertex AI Training on CPU instances (a minimal sketch follows this comment).
✅ Vertex AI prebuilt containers already support scikit-learn on CPU (no extra setup needed).
✅ Lower cost than distributed training or switching to another framework.
✅ Good for establishing a baseline – once you see how long training takes on Vertex AI, you can decide whether further optimization (like distributed training) is needed.
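Here is a minimal sketch of that first step using the Vertex AI Python SDK. The project ID, bucket, script name, and the exact prebuilt scikit-learn CPU container URI are placeholders/assumptions – check the supported-frameworks list for the right image version:

```python
from google.cloud import aiplatform

# Placeholder project/bucket/script names for illustration only.
aiplatform.init(
    project="my-project",
    location="us-central1",
    staging_bucket="gs://my-staging-bucket",
)

# Prebuilt scikit-learn CPU training container (verify the exact URI/version
# in the Vertex AI supported-frameworks list before relying on it).
job = aiplatform.CustomTrainingJob(
    display_name="sklearn-cpu-baseline",
    script_path="train.py",  # your existing scikit-learn training script
    container_uri="us-docker.pkg.dev/vertex-ai/training/sklearn-cpu.1-0:latest",
    requirements=["scikit-learn", "pandas"],
)

# Single CPU machine: establishes a baseline training time before trying
# anything more elaborate (bigger machines, distribution, other frameworks).
job.run(machine_type="n1-standard-8", replica_count=1)
```

Running this once gives you a baseline wall-clock time you can compare against your current setup before deciding on distribution or a framework change.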
B) The question asks for the FIRST step to take. Considering:
- Scikit-learn's limited and non-universal GPU support
- Higher cost associated with GPU instances
The sensible first approach is to migrate the model to Vertex AI using CPUs and establish a baseline training time.
This allows a direct comparison with the existing training setup and helps determine whether moving from CPU to GPU is actually necessary.
Scikit-learn is not intended to be used as a deep-learning framework and it does not provide any GPU support. (Ref: https://stackoverflow.com/questions/41567895/will-scikit-learn-utilize-gpu).
So I go with B
You decided to migrate to Vertex AI. If your model requires significant computational resources but doesn't rely heavily on specialized GPU operations (like those in option D), then option B is still a good choice. However, if your model is computationally intensive or involves complex neural network architectures, I would go with D instead of B.
B is correct, because scikit-learn only has CPU support for the following services:
- prebuilt containers for custom training (this is the case here)
- prebuilt containers for predictions and explanations
- Vertex AI Pipelines
- Vertex AI Workbench user-managed notebooks
https://cloud.google.com/vertex-ai/docs/supported-frameworks-list#scikit-learn_2
Scikit-learn doesn't natively support GPUs for training. Many scikit-learn algorithms rely on NumPy and SciPy, but the standard builds of those libraries are also CPU-only; GPU acceleration would require drop-in alternatives (e.g., CuPy), which scikit-learn does not use by default.
B. Train your model using Vertex AI Training with CPUs.
No GPUs for scikit-learn, but parallelizing/distributing training is a good way to speed up model building (see the single-machine example below).
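As a small illustration of the CPU-side parallelism available before (or after) the move to Vertex AI: many scikit-learn estimators and utilities accept an `n_jobs` parameter that spreads work across all available cores on a single machine. A minimal sketch with synthetic data and arbitrary hyperparameters:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV

# Synthetic data just for illustration.
X, y = make_classification(n_samples=20_000, n_features=50, random_state=0)

# n_jobs=-1 uses every available CPU core for tree building.
clf = RandomForestClassifier(n_estimators=300, n_jobs=-1, random_state=0)

# Cross-validated grid search also parallelizes across folds and candidates.
search = GridSearchCV(
    clf,
    param_grid={"max_depth": [10, 20, None]},
    cv=3,
    n_jobs=-1,
)
search.fit(X, y)
print(search.best_params_, search.best_score_)
```

Picking a larger CPU machine type on Vertex AI (more vCPUs) lets this kind of parallelism scale without any code changes, which is another reason the CPU baseline is a sensible first step.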