An ML engineer has deployed an Amazon SageMaker model to a serverless endpoint in production. The model is invoked by the InvokeEndpoint API operation.
The model's latency in production is higher than the baseline latency in the test environment. The ML engineer thinks that the increase in latency is because of model startup time.
What should the ML engineer do to confirm or deny this hypothesis?
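For serverless endpoints, SageMaker publishes a `ModelSetupTime` metric to Amazon CloudWatch (namespace `AWS/SageMaker`) that measures how long it took to launch compute resources before the model could serve the request. Comparing this metric against `ModelLatency` and `OverheadLatency` would confirm or rule out startup time as the cause. A minimal sketch of how one might pull that metric with boto3 (the endpoint name and time window are placeholder assumptions):

```python
import datetime

def build_model_setup_time_query(endpoint_name, start, end, period_seconds=300):
    """Build the kwargs for a CloudWatch get_metric_statistics call
    that retrieves ModelSetupTime for a serverless SageMaker endpoint."""
    return {
        "Namespace": "AWS/SageMaker",
        "MetricName": "ModelSetupTime",
        "Dimensions": [
            {"Name": "EndpointName", "Value": endpoint_name},
            {"Name": "VariantName", "Value": "AllTraffic"},
        ],
        "StartTime": start,
        "EndTime": end,
        "Period": period_seconds,
        "Statistics": ["Average", "Maximum"],
        "Unit": "Microseconds",
    }

# Usage (requires AWS credentials; 'my-serverless-endpoint' is hypothetical):
#   import boto3
#   cw = boto3.client("cloudwatch")
#   end = datetime.datetime.utcnow()
#   start = end - datetime.timedelta(hours=1)
#   query = build_model_setup_time_query("my-serverless-endpoint", start, end)
#   resp = cw.get_metric_statistics(**query)
#   for point in resp["Datapoints"]:
#       print(point["Timestamp"], point["Average"])
```

High `ModelSetupTime` values that coincide with the slow invocations would support the cold-start hypothesis; if the metric is low while end-to-end latency stays high, the cause lies elsewhere.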