A company is building a machine learning (ML) model to classify images of plants. An ML specialist has trained the model using the Amazon SageMaker built-in Image Classification algorithm. The model is hosted using a SageMaker endpoint on an ml.m5.xlarge instance for real-time inference. When used by researchers in the field, the inference has greater latency than is acceptable. The latency gets worse when multiple researchers perform inference at the same time on their devices. Using Amazon CloudWatch metrics, the ML specialist notices that the ModelLatency metric shows a high value and is responsible for most of the response latency.
The ML specialist needs to fix the performance issue so that researchers can experience less latency when performing inference from their devices.
Which action should the ML specialist take to meet this requirement?
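The diagnosis step in the question (reading ModelLatency from CloudWatch and seeing that it dominates the response time) can be sketched roughly as follows. This is a minimal illustration, not the specialist's actual procedure: the helper names `latency_query` and `model_share`, the endpoint name `plants-endpoint`, and the variant name `AllTraffic` are hypothetical. It assumes the `AWS/SageMaker` namespace, where ModelLatency and OverheadLatency are both reported in microseconds.

```python
from datetime import datetime, timedelta, timezone


def latency_query(endpoint_name, variant_name, metric_name, minutes=60):
    """Build keyword arguments for a CloudWatch get_metric_statistics call.

    Hypothetical helper: it only assembles the request and never calls AWS.
    """
    end = datetime.now(timezone.utc)
    return {
        "Namespace": "AWS/SageMaker",
        "MetricName": metric_name,  # e.g. "ModelLatency" or "OverheadLatency"
        "Dimensions": [
            {"Name": "EndpointName", "Value": endpoint_name},
            {"Name": "VariantName", "Value": variant_name},
        ],
        "StartTime": end - timedelta(minutes=minutes),
        "EndTime": end,
        "Period": 60,  # one datapoint per minute
        "Statistics": ["Average"],
    }


def model_share(model_latency_us, overhead_latency_us):
    """Return the fraction of endpoint-side latency spent inside the model container."""
    total = model_latency_us + overhead_latency_us
    if total <= 0:
        raise ValueError("latency values must sum to a positive number")
    return model_latency_us / total


# Assumed usage with boto3 (not executed here, requires AWS credentials):
#   import boto3
#   cw = boto3.client("cloudwatch")
#   resp = cw.get_metric_statistics(
#       **latency_query("plants-endpoint", "AllTraffic", "ModelLatency"))
```

A share close to 1 means the time is being spent in the model container itself rather than in SageMaker overhead, which matches the situation the question describes.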