Exam DP-100 topic 3 question 39 discussion

Actual exam question from Microsoft's DP-100
Question #: 39
Topic #: 3

You are solving a classification task.
You must evaluate your model on a limited data sample by using k-fold cross-validation. You start by configuring a k parameter as the number of splits.
You need to configure the k parameter for the cross-validation.
Which value should you use?

  • A. k=1
  • B. k=10
  • C. k=0.5
  • D. k=0.9
Suggested Answer: B
Leave-one-out (LOO) cross-validation
Setting k = n (the number of observations) yields n-fold cross-validation, a special case of the k-fold approach called leave-one-out cross-validation (LOO).
LOO CV is sometimes useful, but it typically does not shake up the data enough: the estimates from each fold are highly correlated, so their average can have high variance.
This is why the usual choice is k = 5 or k = 10, which provides a good compromise in the bias-variance tradeoff.
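The 10-fold scheme described above can be sketched in plain Python. This is an illustrative toy splitter (the function name `k_fold_indices` is made up for this sketch); in practice you would use a library implementation such as scikit-learn's `KFold` or `cross_val_score`:

```python
def k_fold_indices(n_samples, k):
    """Yield (train_idx, test_idx) pairs for k-fold cross-validation."""
    # Distribute samples as evenly as possible across the k folds.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0)
                  for i in range(k)]
    indices = list(range(n_samples))
    start = 0
    for size in fold_sizes:
        test = indices[start:start + size]          # current fold: validation
        train = indices[:start] + indices[start + size:]  # remaining folds: training
        yield train, test
        start += size

# With k=10 and 100 samples, each of the 10 iterations trains on
# 90 samples and validates on the held-out 10.
folds = list(k_fold_indices(100, 10))
print(len(folds))                           # 10
print(len(folds[0][0]), len(folds[0][1]))   # 90 10
```

Note that with k = n (here, 100) the same function degenerates into leave-one-out: 100 folds of a single validation sample each, which is exactly the high-variance case the explanation above warns about.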

Comments

rjile
Highly Voted 3 years, 6 months ago
The choice of k is usually 5 or 10
upvoted 10 times
...
fhlos
Most Recent 11 months, 2 weeks ago
Selected Answer: B
B - ChatGPT: When performing k-fold cross-validation to evaluate a classification model, you need to choose an appropriate value for the parameter k, which represents the number of splits or folds. In this scenario, the most commonly used value is B. k = 10.

Setting k to 10 means that the dataset will be divided into 10 equal-sized folds. The cross-validation process then runs 10 iterations, where each iteration uses 9 folds for training and 1 fold for validation. This allows for a comprehensive evaluation of the model's performance on different subsets of the data.

The choice of k can depend on factors such as the size of the dataset, the available computational resources, and the specific requirements of the task at hand. However, k = 10 is often considered a good starting point and is commonly used in practice. Therefore, option B (k = 10) is the appropriate value to configure for the cross-validation in this classification task.
upvoted 1 times
...
serggar
3 years, 8 months ago
Isn't this duplicated?
upvoted 3 times
gbganalyst
2 years, 11 months ago
Not at all.
upvoted 2 times
...
...
Community vote distribution
A (35%)
C (25%)
B (20%)
Other