Exam AWS Certified Machine Learning - Specialty topic 1 question 200 discussion

A company is building a machine learning (ML) model to classify images of plants. An ML specialist has trained the model using the Amazon SageMaker built-in Image Classification algorithm. The model is hosted using a SageMaker endpoint on an ml.m5.xlarge instance for real-time inference. When used by researchers in the field, the inference has greater latency than is acceptable. The latency gets worse when multiple researchers perform inference at the same time on their devices. Using Amazon CloudWatch metrics, the ML specialist notices that the ModelLatency metric shows a high value and is responsible for most of the response latency.

The ML specialist needs to fix the performance issue so that researchers can experience less latency when performing inference from their devices.

Which action should the ML specialist take to meet this requirement?

  • A. Change the endpoint instance to an ml.t3 burstable instance with the same vCPU number as the ml.m5.xlarge instance has.
  • B. Attach an Amazon Elastic Inference ml.eia2.medium accelerator to the endpoint instance.
  • C. Enable Amazon SageMaker Autopilot to automatically tune performance of the model.
  • D. Change the endpoint instance to use a memory optimized ML instance.
Suggested Answer: B
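For context on the suggested answer, below is a minimal sketch (Python, SageMaker SDK) of re-deploying the built-in Image Classification model with an Elastic Inference accelerator attached. The S3 artifact path, IAM role, and names are hypothetical placeholders, not details from the question, and (as a later comment notes) Elastic Inference is no longer offered to new customers.

```python
# Minimal sketch only: redeploy the trained built-in Image Classification model
# with an Elastic Inference accelerator attached (answer B). The S3 path, IAM
# role, and names below are hypothetical placeholders, not from the question.
import sagemaker
from sagemaker import image_uris
from sagemaker.model import Model

session = sagemaker.Session()
role = "arn:aws:iam::123456789012:role/SageMakerExecutionRole"  # hypothetical role ARN

# Container image for the built-in Image Classification algorithm in this region
image_uri = image_uris.retrieve(
    "image-classification", session.boto_region_name, version="1"
)

model = Model(
    image_uri=image_uri,
    model_data="s3://example-bucket/plant-classifier/model.tar.gz",  # hypothetical artifact
    role=role,
    sagemaker_session=session,
)

# accelerator_type attaches the ml.eia2.medium Elastic Inference accelerator
# to the ml.m5.xlarge endpoint instance at deployment time.
predictor = model.deploy(
    initial_instance_count=1,
    instance_type="ml.m5.xlarge",
    accelerator_type="ml.eia2.medium",
)
```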

Comments

tsangckl
Highly Voted 1 year, 5 months ago
Selected Answer: B
It's B https://aws.amazon.com/premiumsupport/knowledge-center/sagemaker-endpoint-latency/
upvoted 10 times
hichemck
1 year, 5 months ago
False, it's C. In the link you shared, under "High ModelLatency" it states: "If an endpoint is overused, it might cause higher model latency. You can add Auto scaling to an endpoint to dynamically increase and decrease the number of instances available for an instance."
upvoted 1 times
kaike_reis
9 months ago
C is the worst option here.
upvoted 3 times
Peeking
1 year, 5 months ago
Autopilot is not autoscaling in AWS. Autopilot is for model training; autoscaling applies during inference.
upvoted 3 times
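For completeness, the knowledge-center article linked above recommends adding auto scaling (not Autopilot) to an overused endpoint. Below is a minimal sketch of that approach using boto3's Application Auto Scaling client; the endpoint and variant names are hypothetical, and this option is not among the answer choices.

```python
# Minimal sketch: add auto scaling to an existing SageMaker endpoint variant,
# as the linked knowledge-center article suggests for an overused endpoint.
# Endpoint and variant names are hypothetical placeholders.
import boto3

autoscaling = boto3.client("application-autoscaling")
resource_id = "endpoint/plant-classifier-endpoint/variant/AllTraffic"  # hypothetical

# Register the variant's instance count as a scalable target
autoscaling.register_scalable_target(
    ServiceNamespace="sagemaker",
    ResourceId=resource_id,
    ScalableDimension="sagemaker:variant:DesiredInstanceCount",
    MinCapacity=1,
    MaxCapacity=4,
)

# Track invocations per instance so concurrent researchers spread across instances
autoscaling.put_scaling_policy(
    PolicyName="plant-classifier-invocations-scaling",
    ServiceNamespace="sagemaker",
    ResourceId=resource_id,
    ScalableDimension="sagemaker:variant:DesiredInstanceCount",
    PolicyType="TargetTrackingScaling",
    TargetTrackingScalingPolicyConfiguration={
        "TargetValue": 100.0,
        "PredefinedMetricSpecification": {
            "PredefinedMetricType": "SageMakerVariantInvocationsPerInstance"
        },
    },
)
```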
loict
Most Recent 8 months ago
Selected Answer: B
A. NO - this is image processing, so more CPU would only give an incremental improvement.
B. YES - this is image processing, so a GPU accelerator gives a step change, and it is supported by the built-in algorithm.
C. NO - Autopilot is for training, not inference.
D. NO - inference usually uses little memory.
upvoted 2 times
Mickey321
9 months, 2 weeks ago
Selected Answer: B
Attach an Amazon Elastic Inference ml.eia2.medium accelerator to the endpoint instance. Amazon Elastic Inference lets you attach low-cost GPU-powered acceleration to Amazon EC2 instances, SageMaker instances, or Amazon ECS tasks, reducing the cost of running deep learning inference by up to 75%.
upvoted 1 times
Maaayaaa
1 year ago
B is not correct anymore. After April 15, 2023, new customers will not be able to launch instances with Amazon EI accelerators in Amazon SageMaker, Amazon ECS, or Amazon EC2. (https://docs.aws.amazon.com/sagemaker/latest/dg/ei.html)
upvoted 1 times
robotgeek
9 months, 2 weeks ago
Changes in exams apply 6 months after the change has been applied (Oct 2023).
upvoted 2 times
ccpmad
9 months, 1 week ago
Freak!
upvoted 2 times
Mllb
1 year, 1 month ago
Selected Answer: B
It's B. A burstable instance only helps when a lot of users are making inferences at the same time.
upvoted 2 times
ZSun
1 year ago
Bro, isn't that exactly what the question is asking about?
upvoted 2 times
AjoseO
1 year, 2 months ago
Selected Answer: B
The ModelLatency metric shows that the model inference time is causing the latency issue. Amazon Elastic Inference is designed to speed up the inference process of a machine learning model without needing to deploy the model on a more powerful instance. By attaching an Elastic Inference accelerator to the endpoint instance, the ML specialist can offload the compute-intensive parts of the inference process to the accelerator, resulting in faster inference times and lower latency.
upvoted 2 times
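As a side note on the diagnosis described here and in the question, below is a minimal sketch of pulling the endpoint's ModelLatency metric from CloudWatch with boto3; the endpoint and variant names are hypothetical placeholders.

```python
# Minimal sketch: query the endpoint's ModelLatency metric in CloudWatch.
# Endpoint and variant names are hypothetical placeholders.
import boto3
from datetime import datetime, timedelta

cloudwatch = boto3.client("cloudwatch")

stats = cloudwatch.get_metric_statistics(
    Namespace="AWS/SageMaker",
    MetricName="ModelLatency",
    Dimensions=[
        {"Name": "EndpointName", "Value": "plant-classifier-endpoint"},
        {"Name": "VariantName", "Value": "AllTraffic"},
    ],
    StartTime=datetime.utcnow() - timedelta(hours=1),
    EndTime=datetime.utcnow(),
    Period=300,
    Statistics=["Average", "Maximum"],
)

# ModelLatency is reported in microseconds
for point in sorted(stats["Datapoints"], key=lambda p: p["Timestamp"]):
    print(point["Timestamp"], point["Average"], point["Maximum"])
```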
It626
1 year, 4 months ago
B - https://aws.amazon.com/premiumsupport/knowledge-center/sagemaker-endpoint-latency/
upvoted 1 times
Peeking
1 year, 5 months ago
Selected Answer: B
Elastic Inference accelerator (and autoscaling, but autoscaling is not among the options). Be aware that Autopilot is not autoscaling.
upvoted 3 times
Community vote distribution: A (35%), C (25%), B (20%), Other