A company is building a machine learning (ML) model to classify images of plants. An ML specialist has trained the model using the Amazon SageMaker built-in Image Classification algorithm. The model is hosted using a SageMaker endpoint on an ml.m5.xlarge instance for real-time inference. When used by researchers in the field, the inference has greater latency than is acceptable. The latency gets worse when multiple researchers perform inference at the same time on their devices. Using Amazon CloudWatch metrics, the ML specialist notices that the ModelLatency metric shows a high value and is responsible for most of the response latency.
The ML specialist needs to fix the performance issue so that researchers can experience less latency when performing inference from their devices.
Which action should the ML specialist take to meet this requirement?
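To make the CloudWatch observation in the scenario concrete, here is a minimal sketch of how one might check what fraction of endpoint latency comes from `ModelLatency`, by comparing it with `OverheadLatency` in the `AWS/SageMaker` namespace. The endpoint name, the `AllTraffic` variant name, the one-hour window, and all helper function names are assumptions for illustration, not part of the original question. `boto3` is imported lazily so the pure helpers can be run without AWS credentials.

```python
from datetime import datetime, timedelta, timezone

NAMESPACE = "AWS/SageMaker"  # namespace for SageMaker endpoint invocation metrics


def build_query(metric_name, endpoint_name, variant_name="AllTraffic",
                minutes=60, now=None):
    """Build keyword arguments for CloudWatch get_metric_statistics.

    variant_name and minutes are illustrative defaults (assumptions).
    """
    end = now or datetime.now(timezone.utc)
    return {
        "Namespace": NAMESPACE,
        "MetricName": metric_name,
        "Dimensions": [
            {"Name": "EndpointName", "Value": endpoint_name},
            {"Name": "VariantName", "Value": variant_name},
        ],
        "StartTime": end - timedelta(minutes=minutes),
        "EndTime": end,
        "Period": 60,               # one datapoint per minute
        "Statistics": ["Average"],
    }


def mean_average(datapoints):
    """Mean of the 'Average' statistic across datapoints (0.0 if empty)."""
    values = [p["Average"] for p in datapoints]
    return sum(values) / len(values) if values else 0.0


def model_latency_share(model_points, overhead_points):
    """Fraction of (ModelLatency + OverheadLatency) attributable to the model."""
    model = mean_average(model_points)
    overhead = mean_average(overhead_points)
    total = model + overhead
    return model / total if total else 0.0


def fetch_share(endpoint_name, variant_name="AllTraffic"):
    """Query CloudWatch for both metrics and return the model's share."""
    import boto3  # lazy import: only needed when actually calling AWS

    cloudwatch = boto3.client("cloudwatch")
    model = cloudwatch.get_metric_statistics(
        **build_query("ModelLatency", endpoint_name, variant_name)
    )["Datapoints"]
    overhead = cloudwatch.get_metric_statistics(
        **build_query("OverheadLatency", endpoint_name, variant_name)
    )["Datapoints"]
    return model_latency_share(model, overhead)
```

A call such as `fetch_share("plant-classifier-endpoint")` (a hypothetical endpoint name) returns a value between 0 and 1. A value close to 1 matches the scenario: most of the time is spent inside the model container rather than in SageMaker request handling, so the fix has to speed up inference itself.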