Exam AWS Certified Machine Learning - Specialty topic 1 question 203 discussion

An ecommerce company wants to train a large image classification model with 10,000 classes. The company runs multiple model training iterations and needs to minimize operational overhead and cost. The company also needs to avoid loss of work and model retraining.

Which solution will meet these requirements?

  • A. Create the training jobs as AWS Batch jobs that use Amazon EC2 Spot Instances in a managed compute environment.
  • B. Use Amazon EC2 Spot Instances to run the training jobs. Use a Spot Instance interruption notice to save a snapshot of the model to Amazon S3 before an instance is terminated.
  • C. Use AWS Lambda to run the training jobs. Save model weights to Amazon S3.
  • D. Use managed spot training in Amazon SageMaker. Launch the training jobs with checkpointing enabled.
Suggested Answer: D

Comments

Peeking
Highly Voted 1 year, 5 months ago
Selected Answer: D
https://docs.aws.amazon.com/sagemaker/latest/dg/model-managed-spot-training.html
Managed spot training can reduce the cost of training models by up to 90% compared with on-demand instances. SageMaker manages the Spot interruptions on your behalf. "Spot instances can be interrupted, causing jobs to take longer to start or finish. You can configure your managed spot training job to use checkpoints. SageMaker copies checkpoint data from a local path to Amazon S3. When the job is restarted, SageMaker copies the data from Amazon S3 back into the local path. The training job can then resume from the last checkpoint instead of restarting."
upvoted 8 times
...
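As a rough illustration of what option D looks like in practice, here is a minimal sketch of the spot and checkpoint portion of a SageMaker CreateTrainingJob request. The field names (EnableManagedSpotTraining, CheckpointConfig, StoppingCondition) are from that API; the helper function, the bucket name, and the time limits are assumptions made up for this example, and a real request needs many more fields (algorithm, role, input data, resources).

```python
def spot_training_settings(checkpoint_s3_uri, max_runtime_s, max_wait_s,
                           checkpoint_local_path="/opt/ml/checkpoints"):
    """Build only the spot/checkpoint fragment of a CreateTrainingJob request.

    Hypothetical helper for illustration; it does not call AWS.
    """
    # The wait limit covers time spent waiting for Spot capacity plus the
    # training time itself, so it cannot be shorter than the runtime limit.
    if max_wait_s < max_runtime_s:
        raise ValueError("max_wait_s must be >= max_runtime_s")
    return {
        "EnableManagedSpotTraining": True,
        "CheckpointConfig": {
            "S3Uri": checkpoint_s3_uri,          # where checkpoints are synced
            "LocalPath": checkpoint_local_path,  # where the script writes them
        },
        "StoppingCondition": {
            "MaxRuntimeInSeconds": max_runtime_s,
            "MaxWaitTimeInSeconds": max_wait_s,
        },
    }


# Assumed example values: the bucket name is a placeholder.
settings = spot_training_settings(
    "s3://example-bucket/checkpoints/image-classifier",
    max_runtime_s=3600,
    max_wait_s=7200,
)
print(settings)
```

The training script still has to write its checkpoints to the local path and load them on start; SageMaker only handles the copy to and from Amazon S3.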
Amit11011996
Highly Voted 1 year, 5 months ago
Selected Answer: D
It has to be D. With managed spot training we can reduce the cost, and with checkpointing enabled the model weights are saved.
upvoted 6 times
hichemck
1 year, 5 months ago
Agree. Managed spot training is also cost-effective.
upvoted 4 times
...
...
loict
Most Recent 8 months ago
Selected Answer: D
A. NO - D is simpler
B. NO - D is simpler
C. NO - D is simpler
D. YES - works out-of-the-box
upvoted 1 times
...
Mickey321
9 months, 2 weeks ago
Selected Answer: D
Managed spot training in Amazon SageMaker uses Amazon EC2 Spot instances to run training jobs, which can optimize the cost of training models by up to 90% over on-demand instances. SageMaker manages the Spot interruptions on the company’s behalf. By enabling checkpointing, the company can ensure that if a Spot instance is interrupted, the training job can resume from the last checkpoint instead of restarting, avoiding loss of work and model retraining.
upvoted 1 times
...
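To put a number on the "up to 90%" figure for a specific job: SageMaker reports TrainingTimeInSeconds and BillableTimeInSeconds for a training job, and the savings from managed spot training follow from their ratio. The sketch below assumes you already have those two values; the function name and the sample numbers are made up for illustration.

```python
def spot_savings_percent(training_seconds, billable_seconds):
    """Savings from managed spot training as a percentage.

    Computed as (1 - billable / training) * 100.
    """
    if training_seconds <= 0:
        raise ValueError("training_seconds must be positive")
    return (1 - billable_seconds / training_seconds) * 100


# Assumed sample values: 500 s of training, 100 s billed.
print(spot_savings_percent(500, 100))
```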
Gaby999
1 year ago
Selected Answer: D
Use managed spot training in Amazon SageMaker. Launch the training jobs with checkpointing enabled. Managed spot training in Amazon SageMaker can help minimize operational overhead and cost by using spot instances to perform the training. This can significantly reduce the cost of training, while still achieving the same accuracy. SageMaker provides built-in checkpointing capability, which allows saving model weights and progress to Amazon S3 periodically. This ensures that even if the spot instances are terminated, the training can resume from the last saved checkpoint. Additionally, SageMaker provides a managed service, so the ecommerce company does not need to worry about managing the infrastructure, and can focus on building and tuning their model.
upvoted 1 times
...
Gaby999
1 year ago
Selected Answer: D
The ML specialist should choose option D, which provides the training data to SageMaker with the least development overhead. This option involves putting the TFRecord data into an Amazon S3 bucket and pointing the SageMaker training invocation to the S3 bucket without reformatting the training data. Using SageMaker script mode is a convenient way to execute training scripts without any modification. Since the training script train.py already works with TFRecord data, it can be used as is without any changes. By storing the data in S3 and accessing it from there, the specialist can take advantage of SageMaker's built-in data distribution and parallelization capabilities, which can significantly speed up training. Rewriting the train.py script or using additional services like AWS Glue or Lambda would add unnecessary complexity and increase development overhead.
upvoted 1 times
...
AjoseO
1 year, 2 months ago
Selected Answer: D
Managed spot training in Amazon SageMaker provides a cost-effective way to run large machine learning workloads. With managed spot training, the training jobs are executed using Amazon EC2 Spot instances, which can significantly reduce the cost of training. Additionally, by launching training jobs with checkpointing enabled, the work done up to the last checkpoint is saved to Amazon S3. This ensures that the training job can be resumed from the last checkpoint in case of instance failure or termination. This minimizes the risk of data loss and avoids the need for retraining the model from scratch. Using Amazon SageMaker also reduces the operational overhead required to set up and manage the training environment.
upvoted 3 times
...
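The "resume from the last checkpoint" behaviour described above depends on the training script itself saving and reloading state. Here is a minimal, framework-free sketch of that pattern, under these assumptions: the checkpoint directory stands in for the local checkpoint path that SageMaker syncs with S3, the "weights" value is a dummy number rather than a real model, and the interrupt_after parameter only simulates a Spot interruption.

```python
import json
import os
import tempfile


def load_latest(ckpt_dir):
    """Return the state from the newest checkpoint file, or None if none exist."""
    names = sorted(n for n in os.listdir(ckpt_dir)
                   if n.startswith("epoch-") and n.endswith(".json"))
    if not names:
        return None
    with open(os.path.join(ckpt_dir, names[-1])) as f:
        return json.load(f)


def save(ckpt_dir, state):
    """Write a checkpoint atomically so a half-written file is never loaded."""
    final = os.path.join(ckpt_dir, "epoch-%04d.json" % state["epoch"])
    tmp = final + ".tmp"
    with open(tmp, "w") as f:
        json.dump(state, f)
    os.replace(tmp, final)


def train(ckpt_dir, total_epochs, interrupt_after=None):
    """Run (or resume) training; returns (final state, epochs run in this call)."""
    os.makedirs(ckpt_dir, exist_ok=True)
    state = load_latest(ckpt_dir) or {"epoch": 0, "weights": 0.0}
    epochs_run = 0
    while state["epoch"] < total_epochs:
        if interrupt_after is not None and epochs_run == interrupt_after:
            return state, epochs_run  # simulated Spot interruption
        # Placeholder for one real training epoch.
        state = {"epoch": state["epoch"] + 1, "weights": state["weights"] + 1.0}
        save(ckpt_dir, state)
        epochs_run += 1
    return state, epochs_run


with tempfile.TemporaryDirectory() as d:
    first, ran_first = train(d, total_epochs=5, interrupt_after=3)
    second, ran_second = train(d, total_epochs=5)  # restart picks up at epoch 3
    print(first["epoch"], ran_first, second["epoch"], ran_second)
```

The second call only runs the remaining epochs, which is the "avoid loss of work and model retraining" requirement in the question.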