Exam DP-100 topic 3 question 59 discussion

Actual exam question from Microsoft's DP-100
Question #: 59
Topic #: 3

HOTSPOT -
You create an experiment in Azure Machine Learning Studio. You add a training dataset that contains 10,000 rows. The first 9,000 rows represent class 0 (90 percent).
The remaining 1,000 rows represent class 1 (10 percent).
The training set is imbalanced between the two classes. You must increase the number of training examples for class 1 to 4,000 by using 5 data rows. You add the
Synthetic Minority Oversampling Technique (SMOTE) module to the experiment.
You need to configure the module.
Which values should you use? To answer, select the appropriate options in the dialog box in the answer area.
NOTE: Each correct selection is worth one point.
Hot Area:

Suggested Answer:
Box 1: 300 -
If you type 300 (%), the module adds 3,000 synthetic minority cases, which is 300 percent of the original 1,000, so class 1 grows from 1,000 to 4,000 rows.
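
The arithmetic behind Box 1 can be checked with a short Python sketch. It assumes, as the explanation above and the commenters' calculation suggest, that the SMOTE percentage is the share of the original minority count that gets added as synthetic rows. The function names are hypothetical and are not part of any Azure API.

```python
def minority_rows_after_smote(minority_rows: int, smote_percentage: int) -> int:
    """Minority row count after oversampling.

    Assumption: the percentage is applied to the original minority count,
    and that many synthetic rows are added on top of the originals.
    """
    synthetic_rows = minority_rows * smote_percentage // 100
    return minority_rows + synthetic_rows


def smote_percentage_for_target(minority_rows: int, target_rows: int) -> int:
    """Percentage needed to grow the minority class to target_rows (same assumption)."""
    return (target_rows - minority_rows) * 100 // minority_rows


print(minority_rows_after_smote(1000, 300))    # 1,000 original + 3,000 synthetic
print(smote_percentage_for_target(1000, 4000))
```

Under this reading, a value of 0 leaves the dataset unchanged and a value of 100 doubles the minority class, which is why 300 is the value that takes 1,000 rows to 4,000.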

Box 2: 5 -
We should use 5 data rows.
Use the Number of nearest neighbors option to determine the size of the feature space that the SMOTE algorithm uses when building new cases. A nearest neighbor is a row of data (a case) that is very similar to some target case. The distance between any two cases is measured by combining the weighted vectors of all features.
By increasing the number of nearest neighbors, you get features from more cases.
By keeping the number of nearest neighbors low, you use features that are more like those in the original sample.
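
To illustrate how the nearest-neighbors setting is used, here is a minimal sketch of the general SMOTE idea in plain Python. It is not the Azure module's implementation: it assumes unweighted Euclidean distance between rows, and `smote_sketch` with its parameters is a hypothetical name chosen for this example. Each synthetic row is placed at a random point on the line between a minority row and one of its k nearest minority neighbors.

```python
import math
import random


def smote_sketch(minority, n_synthetic, k=5, seed=0):
    """Create n_synthetic rows by interpolating toward one of k nearest neighbors.

    Assumption: plain Euclidean distance; the real module may weight features.
    """
    if k >= len(minority):
        raise ValueError("k must be smaller than the number of minority rows")
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_synthetic):
        base_idx = rng.randrange(len(minority))
        base = minority[base_idx]
        # Rank every other minority row by distance to the base row.
        others = [row for i, row in enumerate(minority) if i != base_idx]
        neighbors = sorted(others, key=lambda row: math.dist(base, row))[:k]
        neighbor = rng.choice(neighbors)
        gap = rng.random()  # position along the segment base -> neighbor
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbor)])
    return synthetic


# Toy data: 20 minority rows with 2 features, oversampled by 300 percent.
data_rng = random.Random(42)
minority = [[data_rng.gauss(0, 1), data_rng.gauss(0, 1)] for _ in range(20)]
synthetic = smote_sketch(minority, n_synthetic=60, k=5)
print(len(minority) + len(synthetic))  # 20 original + 60 synthetic rows
```

A larger k lets each synthetic row borrow from more distant cases, while a small k keeps new rows close to the original samples, which matches the two statements above.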
Reference:
https://docs.microsoft.com/en-us/azure/machine-learning/studio-module-reference/smote

Comments

azurecert2021
Highly Voted 2 years, 11 months ago
Based on the example at the following link, the given answer looks correct: https://docs.microsoft.com/en-us/azure/machine-learning/algorithm-module-reference/smote#examples
upvoted 7 times
...
james2033
Most Recent 8 months ago
The given answer is correct. 1000 + 300% * 1000 = 4000, and each item has 5 nearest neighbors. The key phrase in the question is 'increase the number of training examples for class 1 to 4000 by using 5 data rows'.
upvoted 2 times
...
michaelmorar
1 year, 6 months ago
300% makes mathematical sense (we need to increase 1000 by 3000 to reach 4000). 5 nearest neighbours also seems to agree with the stipulation of using 5 rows. So the answer looks correct to me.
upvoted 4 times
...
azurecert2021
2 years, 11 months ago
Based on the example at the following link, the given answer looks correct: a SMOTE percentage of 0 returns the original dataset, and 3000 and 4000 are far too high. A nearest-neighbors value of 4000 is also too high for a 10,000-row dataset. https://docs.microsoft.com/en-us/azure/machine-learning/algorithm-module-reference/smote#examples
upvoted 4 times
...