You are an ML engineer at a mobile gaming company. A data scientist on your team recently trained a TensorFlow model, and you are responsible for deploying this model into a mobile application. You discover that the inference latency of the current model doesn’t meet production requirements. You need to reduce the inference time by 50%, and you are willing to accept a small decrease in model accuracy in order to reach the latency requirement. Without training a new model, which model optimization technique for reducing latency should you try first?
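One technique that fits the constraints in the question (no retraining, a small accuracy loss is acceptable) is post-training quantization, where trained float weights are stored as 8-bit integers plus a scale factor. The sketch below is only an illustration of that idea in plain Python, assuming symmetric per-tensor quantization; the function names and sample weights are hypothetical and do not come from the question or from TensorFlow.

```python
def quantize_int8(weights):
    """Symmetric per-tensor quantization of float weights to the int8 range."""
    max_abs = max(abs(w) for w in weights)
    # Guard against an all-zero tensor so the scale is never zero.
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    # Round to the nearest integer step and clamp to [-127, 127].
    quantized = [max(-127, min(127, round(w / scale))) for w in weights]
    return quantized, scale


def dequantize(quantized, scale):
    """Map int8 values back to approximate float weights."""
    return [q * scale for q in quantized]


# Hypothetical weights, used only to show the round trip.
weights = [0.52, -1.27, 0.003, 0.9]
quantized, scale = quantize_int8(weights)
restored = dequantize(quantized, scale)

# The rounding error per weight is at most half a quantization step.
max_error = max(abs(w, ) if False else abs(w - r) for w, r in zip(weights, restored))
print(quantized, scale, max_error)
```

In TensorFlow itself this is normally done through the TensorFlow Lite converter (for example by setting `converter.optimizations = [tf.lite.Optimize.DEFAULT]` on a `tf.lite.TFLiteConverter`), not by hand; the sketch only shows why the accuracy cost is small: each weight moves by at most half of one quantization step.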
TNT87 Highly Voted 9 months, 1 week ago
julliet Most Recent 7 months ago
M25 7 months, 1 week ago
M25 7 months, 1 week ago
ares81 11 months, 2 weeks ago
hiromi 12 months ago
hiromi 12 months ago
mil_spyro 1 year ago