Exam DP-100 topic 3 question 24 discussion

Actual exam question from Microsoft's DP-100
Question #: 24
Topic #: 3

Note: This question is part of a series of questions that present the same scenario. Each question in the series contains a unique solution that might meet the stated goals. Some question sets might have more than one correct solution, while others might not have a correct solution.
After you answer a question in this section, you will NOT be able to return to it. As a result, these questions will not appear in the review screen.
You are creating a new experiment in Azure Machine Learning Studio.
One class has a much smaller number of observations than the other classes in the training set.
You need to select an appropriate data sampling strategy to compensate for the class imbalance.
Solution: You use the Stratified split for the sampling mode.
Does the solution meet the goal?

  • A. Yes
  • B. No
Suggested Answer: B
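As the comments below discuss, a stratified split only preserves the existing class ratio, while SMOTE synthesizes new minority-class rows. A minimal sketch of that difference, using scikit-learn and the imbalanced-learn library as stand-ins for the Azure ML Studio modules (the dataset and counts are made up for illustration):

```python
# Illustration only: imbalanced-learn's SMOTE standing in for the
# Azure ML Studio SMOTE module; the dataset is synthetic.
from collections import Counter

from sklearn.datasets import make_classification
from imblearn.over_sampling import SMOTE

# Roughly 95% class 0 and 5% class 1.
X, y = make_classification(
    n_samples=1000, n_features=10, weights=[0.95, 0.05], random_state=42
)
print("Before SMOTE:", Counter(y))      # e.g. roughly {0: 950, 1: 50}

# SMOTE creates synthetic minority-class samples until the classes are balanced.
X_res, y_res = SMOTE(random_state=42).fit_resample(X, y)
print("After SMOTE:", Counter(y_res))   # both classes now have ~950 samples
```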

Comments

timosi
Highly Voted 3 years, 1 month ago
I would say the answer is correct. The question is not how to make the sample more balanced but how to deal with the unbalanced sample, and a stratified approach helps to handle an unbalanced sample.
upvoted 17 times
beny
2 years, 8 months ago
Agree as well
upvoted 1 times
treadst0ne
2 years, 10 months ago
Totally agree.
upvoted 2 times
concernedCitizen
Highly Voted 3 years, 10 months ago
Apparently, SMOTE is the only way in MSFT's mind to fix undersampled datasets.
upvoted 15 times
a_1234567_
3 years, 9 months ago
It might seem so based on a few undersampled questions :) But no. Even AutoML suggests some variety: https://docs.microsoft.com/en-us/azure/machine-learning/concept-manage-ml-pitfalls#handle-imbalanced-data
upvoted 8 times
umair_hanu
Most Recent 10 months ago
Agreed with timosi.
upvoted 1 times
mkk888
10 months, 2 weeks ago
There are other techniques apart from oversampling that work just as well, and in those situations you should still use a stratified split to make sure your test set has samples from the rarer class. So technically, if you are using such a model (hyperparameter), you can achieve the goal. I'd say it should be Yes, but it's probably No because Microsoft has promoted SMOTE as the go-to solution.
upvoted 1 times
krishna1818
11 months, 2 weeks ago
Selected Answer: B
Maybe SMOTE
upvoted 1 times
Sumit_DP100
1 year, 10 months ago
Stratified sampling does not guarantee a balanced dataset. It just makes sure the classes are sampled in the same proportions as in the original data. The imbalance issue will still be there, so SMOTE is the right option.
upvoted 7 times
FU_User
1 year, 11 months ago
Selected Answer: B
I guess the keyword is "compensate". A stratified split only guarantees that the labels are in the same proportion in the test and train sets (95/5 incoming data -> 95/5 training set and 95/5 test set). This doesn't compensate for anything; it just avoids introducing a new problem on limited training data, for example not having a particular label in the training set at all in the worst case. As SMOTE generates new data points, it "compensates".
upvoted 5 times
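To put numbers on FU_User's 95/5 example, a small sketch (scikit-learn, hypothetical counts) showing that a stratified split keeps the same imbalance in both partitions rather than compensating for it:

```python
# Illustration only: a stratified split preserves the 95/5 imbalance.
import numpy as np
from collections import Counter
from sklearn.model_selection import train_test_split

y = np.array([0] * 950 + [1] * 50)    # 95/5 class ratio
X = np.arange(len(y)).reshape(-1, 1)  # dummy feature column

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)
print("Train:", Counter(y_train))  # Counter({0: 760, 1: 40}) -> still 95/5
print("Test: ", Counter(y_test))   # Counter({0: 190, 1: 10}) -> still 95/5
```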
dija123
2 years, 4 months ago
Selected Answer: B
The given answer is correct.
upvoted 3 times
Community vote distribution: A (35%), C (25%), B (20%), Other