Exam DP-100 topic 2 question 45 discussion

Actual exam question from Microsoft's DP-100
Question #: 45
Topic #: 2

HOTSPOT -
You are evaluating a Python NumPy array that contains six data points defined as follows: data = [10, 20, 30, 40, 50, 60]
You must generate the following output by using the k-fold algorithm implementation in the Python Scikit-learn machine learning library:
train: [10 40 50 60], test: [20 30]
train: [20 30 40 60], test: [10 50]
train: [10 20 30 50], test: [40 60]
You need to implement a cross-validation to generate the output.
How should you complete the code segment? To answer, select the appropriate code segment in the dialog box in the answer area.
NOTE: Each correct selection is worth one point.

Suggested Answer:
Box 1: k-fold (the scikit-learn class is spelled KFold)

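Box 2: 3
The K-Folds cross-validator provides train/test indices to split data into train/test sets. It splits the dataset into k consecutive folds (without shuffling by default).
The parameter n_splits (int, default=5 since scikit-learn 0.22; 3 in earlier versions) is the number of folds. It must be at least 2. The required output contains three train/test pairs, so n_splits is 3.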
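To make "k consecutive folds (without shuffling)" concrete, here is a minimal pure-Python sketch of how such folds can be built. The helper name kfold_indices is hypothetical and this is an illustration of the idea, not scikit-learn's actual implementation; it assumes that any remainder samples go to the first folds.

```python
def kfold_indices(n_samples, n_splits):
    """Yield (train, test) index lists for consecutive, unshuffled folds.

    Hypothetical helper for illustration only; not part of scikit-learn.
    """
    if n_splits < 2 or n_splits > n_samples:
        raise ValueError("n_splits must be between 2 and n_samples")
    base, extra = divmod(n_samples, n_splits)
    start = 0
    for fold in range(n_splits):
        # The first `extra` folds receive one additional sample.
        stop = start + base + (1 if fold < extra else 0)
        test = list(range(start, stop))
        train = list(range(0, start)) + list(range(stop, n_samples))
        yield train, test
        start = stop


data = [10, 20, 30, 40, 50, 60]
folds = [
    ([data[i] for i in train], [data[i] for i in test])
    for train, test in kfold_indices(len(data), n_splits=3)
]
for train, test in folds:
    print(f"train: {train}, test: {test}")
```

Without shuffling, the test folds come out as [10, 20], [30, 40] and [50, 60]. The test folds in the question ([20 30], [10 50], [40 60]) are not consecutive, which suggests the expected output was produced with shuffling enabled.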

Box 3: data
Example:
>>> import numpy as np
>>> from sklearn.model_selection import KFold
>>> X = np.array([[1, 2], [3, 4], [1, 2], [3, 4]])
>>> y = np.array([1, 2, 3, 4])
>>> kf = KFold(n_splits=2)
>>> kf.get_n_splits(X)
2
>>> print(kf)
KFold(n_splits=2, random_state=None, shuffle=False)
>>> for train_index, test_index in kf.split(X):
...     print("TRAIN:", train_index, "TEST:", test_index)
...     X_train, X_test = X[train_index], X[test_index]
...     y_train, y_test = y[train_index], y[test_index]
TRAIN: [2 3] TEST: [0 1]
TRAIN: [0 1] TEST: [2 3]
Reference:
https://scikit-learn.org/stable/modules/generated/sklearn.model_selection.KFold.html

Comments

podval
Highly Voted 4 years, 4 months ago
Proper syntax: from sklearn.model_selection import KFold
upvoted 23 times
David_Tadeu
2 years, 7 months ago
If the actual question is written with 'k-fold' instead of 'Kfold', that's just stupid.
upvoted 4 times
...
...
ljljljlj
Highly Voted 3 years, 4 months ago
On exam 2021/7/10
upvoted 5 times
...
Matt2000
Most Recent 10 months ago
from sklearn.model_selection import KFold
import numpy as np

data = np.array([10, 20, 30, 40, 50, 60])
k_fold = KFold(n_splits=3, shuffle=True, random_state=1)
# split() yields index arrays, so index into data to print the values
for train, test in k_fold.split(data):
    print(f'train: {data[train]}, test: {data[test]}')
upvoted 2 times
...
Matt2000
10 months, 2 weeks ago
"-" should be read as "="
upvoted 1 times
...
Hisayuki
1 year, 1 month ago
You're gonna create three sets of train and test datasets with shuffling. So, n_splits should be 3 in KFold.
- train: [10 40 50 60], test: [20 30]
- train: [20 30 40 60], test: [10 50]
- train: [10 20 30 50], test: [40 60]
upvoted 3 times
...
ning
2 years, 6 months ago
Might be a typo, but overall is correct
upvoted 2 times
...