You work for a small company that has deployed an ML model with autoscaling on Vertex AI to serve online predictions in a production environment. The current model receives about 20 prediction requests per hour with an average response time of one second. You have retrained the same model on a new batch of data, and now you are canary testing it, sending ~10% of production traffic to the new model. During this canary test, you notice that prediction requests for your new model are taking between 30 and 180 seconds to complete. What should you do?
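The canary setup described above (one endpoint, ~10% of traffic routed to the retrained model) can be sketched with the `gcloud ai endpoints deploy-model` command. This is a minimal illustration, not an answer key: all IDs, the region, and the machine type below are placeholders, and the key point it shows is that the canary deployment carries its own `--machine-type` and replica-count settings, which must provide serving capacity comparable to the current model's or canary latency will not reflect the model itself.

```shell
# Hypothetical sketch: deploy the retrained model to the EXISTING endpoint
# with an explicit 90/10 traffic split. ENDPOINT_ID, NEW_MODEL_ID,
# OLD_DEPLOYED_MODEL_ID, the region, and the machine type are placeholders.
gcloud ai endpoints deploy-model ENDPOINT_ID \
  --region=us-central1 \
  --model=NEW_MODEL_ID \
  --display-name=retrained-canary \
  --machine-type=n1-standard-4 \
  --min-replica-count=1 \
  --max-replica-count=4 \
  --traffic-split=0=10,OLD_DEPLOYED_MODEL_ID=90
# In --traffic-split, the key "0" refers to the model being deployed in
# this request; the other key is the ID of the already-deployed model.
```

Because each deployed model on a Vertex AI endpoint scales independently, a canary given a smaller machine type or a lower replica ceiling than the production model can exhibit exactly the kind of 30–180 second latencies the question describes, even when the model artifact itself is fine.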