An ML engineer has deployed an Amazon SageMaker model to a serverless endpoint in production. The model is invoked by the InvokeEndpoint API operation.
The model's latency in production is higher than the baseline latency in the test environment. The ML engineer suspects that the increased latency is caused by model startup time (a cold start on the serverless endpoint).
What should the ML engineer do to confirm or deny this hypothesis?
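For serverless endpoints, SageMaker publishes a ModelSetupTime metric to Amazon CloudWatch that records how long it takes to launch compute resources when a cold start occurs; comparing it against ModelLatency over the same window would confirm or rule out the hypothesis. Below is a minimal sketch of one way to pull that metric with boto3, assuming a hypothetical endpoint name "my-serverless-endpoint" and the default "AllTraffic" variant:

import boto3
from datetime import datetime, timedelta

cloudwatch = boto3.client("cloudwatch")

# ModelSetupTime is emitted only for serverless endpoints; values are
# reported in microseconds, like ModelLatency and OverheadLatency.
response = cloudwatch.get_metric_statistics(
    Namespace="AWS/SageMaker",
    MetricName="ModelSetupTime",
    Dimensions=[
        {"Name": "EndpointName", "Value": "my-serverless-endpoint"},  # assumed name
        {"Name": "VariantName", "Value": "AllTraffic"},               # assumed default variant
    ],
    StartTime=datetime.utcnow() - timedelta(hours=24),
    EndTime=datetime.utcnow(),
    Period=3600,  # aggregate into 1-hour buckets
    Statistics=["Average", "Maximum"],
)

# Print the datapoints in time order to see whether setup time spikes
# line up with the observed latency increases.
for point in sorted(response["Datapoints"], key=lambda p: p["Timestamp"]):
    print(point["Timestamp"], point["Average"], point["Maximum"])

If ModelSetupTime accounts for most of the end-to-end latency, the cold-start hypothesis is confirmed; one common mitigation is to configure provisioned concurrency for the serverless endpoint.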