Exam DP-100 topic 3 question 24 discussion

Actual exam question from Microsoft's DP-100
Question #: 24
Topic #: 3

Note: This question is part of a series of questions that present the same scenario. Each question in the series contains a unique solution that might meet the stated goals. Some question sets might have more than one correct solution, while others might not have a correct solution.
After you answer a question in this section, you will NOT be able to return to it. As a result, these questions will not appear in the review screen.
You are creating a new experiment in Azure Machine Learning Studio.
One class has a much smaller number of observations than the other classes in the training set.
You need to select an appropriate data sampling strategy to compensate for the class imbalance.
Solution: You use the Stratified split for the sampling mode.
Does the solution meet the goal?

  • A. Yes
  • B. No
Suggested Answer: B
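As the comments below discuss, a stratified split only preserves the existing class ratio, while SMOTE synthesizes new minority-class rows. A minimal sketch of that difference, using scikit-learn and the imbalanced-learn library as stand-ins for the Azure ML Studio modules (the dataset and counts are made up for illustration):

```python
# Illustration only: imbalanced-learn's SMOTE standing in for the
# Azure ML Studio SMOTE module; the dataset is synthetic.
from collections import Counter

from sklearn.datasets import make_classification
from imblearn.over_sampling import SMOTE

# Roughly 95% class 0 and 5% class 1.
X, y = make_classification(
    n_samples=1000, n_features=10, weights=[0.95, 0.05], random_state=42
)
print("Before SMOTE:", Counter(y))      # e.g. roughly {0: 950, 1: 50}

# SMOTE creates synthetic minority-class samples until the classes are balanced.
X_res, y_res = SMOTE(random_state=42).fit_resample(X, y)
print("After SMOTE:", Counter(y_res))   # both classes now have ~950 samples
```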

Comments

timosi
Highly Voted 3 years, 1 month ago
I would say the answer is correct. The question is not how to make the sample more balanced but how to deal with the unbalanced sample, and a stratified approach helps to handle an unbalanced sample.
upvoted 17 times
beny
2 years, 8 months ago
Agree as well
upvoted 1 times
treadst0ne
2 years, 10 months ago
Totally agree.
upvoted 2 times
concernedCitizen
Highly Voted 3 years, 10 months ago
Apparently, SMOTE is the only way in MSFT's mind to fix undersampled datasets.
upvoted 15 times
a_1234567_
3 years, 9 months ago
It might seem so based on a few undersampled questions :) But no. Even AutoML suggests some variety: https://docs.microsoft.com/en-us/azure/machine-learning/concept-manage-ml-pitfalls#handle-imbalanced-data
upvoted 8 times
umair_hanu
Most Recent 10 months ago
Agreed with timosi.
upvoted 1 times
mkk888
10 months, 2 weeks ago
There are other techniques apart from oversampling that work just as well, and in those situations you should still use a stratified split to make sure your test set has samples from the rarer class. So technically, if you are using such a model (hyperparameter), you can achieve the goal. I'd say it should be Yes, but it's probably No because Microsoft has promoted SMOTE as the go-to solution.
upvoted 1 times
krishna1818
11 months, 2 weeks ago
Selected Answer: B
Maybe SMOTE
upvoted 1 times
Sumit_DP100
1 year, 10 months ago
Stratified sampling does not guarantee a balanced dataset. It just makes sure the classes are sampled in the same proportions as in the original data. The imbalance issue will still be there, so SMOTE is the right option.
upvoted 7 times
FU_User
1 year, 11 months ago
Selected Answer: B
I guess the keyword is "compensate". A stratified split only guarantees that the labels are in the same proportion in the test and train sets (95/5 incoming data -> 95/5 training set and 95/5 test set). This doesn't compensate for anything; it just avoids introducing a new problem on limited training data, for example not having a particular label in the training set at all in the worst case. As SMOTE generates new data points, it "compensates".
upvoted 5 times
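To put numbers on FU_User's 95/5 example, a small sketch (scikit-learn, hypothetical counts) showing that a stratified split keeps the same imbalance in both partitions rather than compensating for it:

```python
# Illustration only: a stratified split preserves the 95/5 imbalance.
import numpy as np
from collections import Counter
from sklearn.model_selection import train_test_split

y = np.array([0] * 950 + [1] * 50)    # 95/5 class ratio
X = np.arange(len(y)).reshape(-1, 1)  # dummy feature column

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)
print("Train:", Counter(y_train))  # Counter({0: 760, 1: 40}) -> still 95/5
print("Test: ", Counter(y_test))   # Counter({0: 190, 1: 10}) -> still 95/5
```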
dija123
2 years, 4 months ago
Selected Answer: B
The given answer is correct.
upvoted 3 times
Community vote distribution: A (35%), C (25%), B (20%), Other