You used BigQuery ML to build a customer purchase propensity model six months ago. You want to compare the current serving data with the historical serving data to determine whether you need to retrain the model. What should you do?
The best option is C. Evaluate data drift. Option C is best because data drift specifically refers to changes in data distribution over time, directly indicating if the current serving data differs from the historical data the model was trained on, thus signaling a need for retraining. Option A (Compare models) is incorrect because comparing models doesn't directly assess changes in serving data. Option B (Evaluate skewness) is incorrect because skewness is a static data property, not a comparison of data over time. Option D (Compare confusion matrix) is incorrect because it evaluates model performance change, an indirect consequence of data drift, not the drift itself. Therefore, Option C is the most direct method to determine if data changes necessitate model retraining.
A voting comment increases the vote count for the chosen answer by one.
Upvoting a comment with a selected answer will also increase the vote count towards that answer by one.
So if you see a comment that you already agree with, you can upvote it instead of posting a new comment.
n2183712847
1 month, 4 weeks ago