Exam DP-100 topic 5 question 6 discussion

Actual exam question from Microsoft's DP-100

Question #: 6
Topic #: 5

HOTSPOT -
You are performing feature scaling by using the scikit-learn Python library for the x1, x2, and x3 features.
The original and scaled data are shown in the following image.

Use the drop-down menus to select the answer choice that answers each question based on the information presented in the graphic.
NOTE: Each correct selection is worth one point.
Hot Area:

Suggested Answer:

Box 1: StandardScaler -
StandardScaler assumes that the data within each feature is normally distributed and scales each feature so that its distribution is centred around 0 with a standard deviation of 1.
Example:

All features are now on the same scale relative to one another.
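The behaviour described for Box 1 can be checked with a minimal sketch. The data below is made up for illustration (three normally distributed columns with arbitrary means and spreads) and is not the data behind the exam image; the only library calls used are scikit-learn's `StandardScaler` and its `fit_transform` method.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Hypothetical data: three features on different scales (not the exam's data).
rng = np.random.default_rng(0)
X = np.column_stack([
    rng.normal(0, 2, 1000),
    rng.normal(5, 3, 1000),
    rng.normal(-5, 5, 1000),
])

# StandardScaler works column by column: subtract the mean, divide by the std.
X_std = StandardScaler().fit_transform(X)

# After scaling, every column has mean ~0 and standard deviation ~1.
col_means = X_std.mean(axis=0)
col_stds = X_std.std(axis=0)
print("means:", col_means.round(6))
print("stds: ", col_stds.round(6))
```

Because the result is centred on 0 with unit spread, the scaled values fall on both sides of zero and are not confined to a fixed interval, which is what separates this plot from the other two.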

Box 2: Min Max Scaler -

Notice that the skewness of the distribution is maintained but the 3 distributions are brought into the same scale so that they overlap.
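The claim that min-max scaling keeps the skewness while bringing the features onto one scale can be sketched as follows. The three distributions are illustrative assumptions (one right-skewed, one left-skewed, one roughly symmetric), and `skewness` is a hypothetical helper written for this example, not a scikit-learn function.

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

def skewness(a):
    """Hypothetical helper: per-column skewness (third standardized moment)."""
    z = (a - a.mean(axis=0)) / a.std(axis=0)
    return (z ** 3).mean(axis=0)

# Illustrative data only (not the exam's data).
rng = np.random.default_rng(0)
X = np.column_stack([
    rng.chisquare(8, 1000),      # right-skewed
    rng.beta(8, 2, 1000) * 40,   # left-skewed
    rng.normal(50, 3, 1000),     # roughly symmetric
])

# MinMaxScaler maps each column linearly so that its min is 0 and its max is 1.
X_mm = MinMaxScaler().fit_transform(X)

col_min = X_mm.min(axis=0)
col_max = X_mm.max(axis=0)
print("min:", col_min, "max:", col_max)
print("skewness before:", skewness(X).round(4))
print("skewness after: ", skewness(X_mm).round(4))
```

The transform is a positive linear rescaling of each column, so the skewness values before and after agree up to floating-point error.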

Box 3: Normalizer -
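The suggested answer gives no explanation for Box 3, so here is a minimal sketch of what scikit-learn's `Normalizer` does with its default L2 norm: it divides each row (sample) by that row's Euclidean length. The small matrix is a made-up example chosen so the arithmetic is easy to follow.

```python
import numpy as np
from sklearn.preprocessing import Normalizer

# Made-up samples; each row is one data point with features x1, x2, x3.
X = np.array([
    [3.0, 4.0, 0.0],
    [1.0, -2.0, 2.0],
    [10.0, 0.0, 0.0],
])

# Normalizer rescales every row to unit L2 norm.
X_norm = Normalizer(norm="l2").fit_transform(X)

# x1^2 + x2^2 + x3^2 is 1 for every row, so each value lies in [-1, 1].
row_norms = np.linalg.norm(X_norm, axis=1)
print(X_norm)
print("row norms:", row_norms)
```

The first row, for instance, has length 5 and becomes (0.6, 0.8, 0.0). Every point ends up on the unit sphere, which is why the scaled values in that plot stay between -1 and 1.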
Reference:
http://benalexkeen.com/feature-scaling-with-scikit-learn/

by davo123 at May 20, 2020, 9:28 a.m.

Comments

davo123

Highly Voted 5 years, 1 month ago

Is this correct? Why not A: Standard, B: Normal, C: Min Max ?

upvoted 20 times

hendrata

5 years, 1 month ago

I agree that C is min max (look at the range of x values in C). But I think A is normal, because the sum of squares (x_1^2 + x_2^2 + x_3^2) must be = 1 in a normalized data set; that's the definition used in the reference page. So that leaves B to be standard.

upvoted 1 times

epgd

5 years ago

I don't think so, because the StandardScaler assumes your data is normally distributed within each feature and will scale them such that the distribution is now centred around 0, with a standard deviation of 1. (Look at the range of the x values in B.)

upvoted 7 times

...

HkIsCrazY

4 years, 5 months ago

Yes! A: Standard, B: Normal, C: Min Max
Standard - The StandardScaler assumes your data is normally distributed within each feature and will scale them such that the distribution is now centred around 0, with a standard deviation of 1.
Min Max - MinMaxScaler preserves the shape of the original distribution. It doesn’t meaningfully change the information embedded in the original data.
Normal - Normalizer does transform all the features to values between -1 and 1.

upvoted 4 times

...

tomiskolc

4 years, 2 months ago

I'm pretty sure that you're wrong! MinMaxScaler is always(!!) between 0 and 1, Normalizer is always between -1 and 1, and Standard is always around 0 (with a standard deviation of 1). So the correct answer is A: Standard, B: Min Max, C: Normal. (Others, please don't write if you don't know.)

upvoted 26 times

YipingRuan

4 years ago

But in chart B, it goes beyond 1?

upvoted 3 times

...

E_aws

Highly Voted 4 years, 2 months ago

As a mathematician I can confirm that the answers are correct! :))

upvoted 17 times

...

jl420

Most Recent 8 months, 1 week ago

Graph A: Standard Scaler. The data in Graph A appears to be centered around zero with a standard deviation of one, which is characteristic of the StandardScaler.
Graph B: Min Max Scale. The data in Graph B is scaled within a range, likely [0, 1], which is characteristic of the MinMaxScaler.
Graph C: Normalizer. The data in Graph C has been scaled in a way that likely brings each data point to unit norm, typical of the Normalizer.

upvoted 1 times

...

deyoz

1 year, 5 months ago

These answers are correct, for sure!

upvoted 1 times

...

ZoeJ

2 years, 2 months ago

A: Standard, B: Min Max, C: Normal

upvoted 1 times

ZoeJ

2 years, 2 months ago

http://benalexkeen.com/feature-scaling-with-scikit-learn/

upvoted 1 times

...

AzureJobsTillRetire

2 years, 5 months ago

There seems to be a typo in picture B, and the number 10 on x-axis should be 1.

upvoted 3 times

...

chevyli

2 years, 10 months ago

Has this question even appeared in any exam?

upvoted 2 times

...

ning

3 years, 1 month ago

The given answer should be correct. A is about standard deviation, so it is standardization; B is between 0 and 1, so it must be Min Max; and C is a bit confusing, since Normalizer is used for rows, not columns, but it is the only valid answer left here.

upvoted 1 times
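The rows-versus-columns point raised in the comment above can be illustrated with a small sketch. The matrix is a made-up example: the first row is transformed once as part of the full data set and once with other rows missing, to see which scaler depends on the rest of the column.

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler, Normalizer

# Made-up data; rows are samples, columns are features.
X = np.array([
    [1.0, 2.0, 2.0],
    [4.0, 0.0, 3.0],
    [0.0, 5.0, 12.0],
])

# Normalizer works per row: row 0 scaled alone equals row 0 scaled
# as part of the full matrix.
norm_full = Normalizer().fit_transform(X)
norm_single = Normalizer().fit_transform(X[:1])
row_independent = np.allclose(norm_full[0], norm_single[0])

# MinMaxScaler works per column: row 0 changes when the third row is
# dropped, because each column's min and max are learned from all rows.
mm_full = MinMaxScaler().fit_transform(X)
mm_subset = MinMaxScaler().fit_transform(X[:2])
column_dependent = not np.allclose(mm_full[0], mm_subset[0])

print("Normalizer result independent of other rows:", row_independent)
print("MinMaxScaler result depends on other rows:  ", column_dependent)
```

So `Normalizer` never looks at a feature's distribution at all, while the column-wise scalers are defined entirely by it.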

...

pancman

3 years, 3 months ago

Given answer is correct, because StandardScaler scales the feature so that its mean is 0 and st. deviation is 1. MinMax Scaler sets the minimum value to 0 and max value to 1. Normalizer rescales each data point independently of other samples (hence the shape of the feature's distribution doesn't change).

upvoted 2 times

...

dushmantha

3 years, 10 months ago

No doubt about the MinMax scaler. But based on the following explanation, the given answers are correct (https://datascience.stackexchange.com/questions/45900/when-to-use-standard-scaler-and-when-normalizer), because Normalizer does not really change the shape of each distribution.

upvoted 1 times

...

dev2dev

4 years, 4 months ago

Answers are correct. I verified by using the referenced link and running the script using 3 scalers.

upvoted 6 times
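A check of the kind described in the comment above might look like the following sketch. It is not the commenter's script or the one from the referenced page; the data is a made-up set of three normally distributed features, and the loop simply records the overall value range each scaler produces.

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler, Normalizer, StandardScaler

# Hypothetical data (not the exam's): three features with different scales.
rng = np.random.default_rng(42)
X = np.column_stack([
    rng.normal(0, 2, 1000),
    rng.normal(5, 3, 1000),
    rng.normal(-5, 5, 1000),
])

scalers = {
    "StandardScaler": StandardScaler(),
    "MinMaxScaler": MinMaxScaler(),
    "Normalizer": Normalizer(),
}

# Record the smallest and largest value produced by each scaler.
ranges = {}
for name, scaler in scalers.items():
    Xt = scaler.fit_transform(X)
    ranges[name] = (float(Xt.min()), float(Xt.max()))
    print(f"{name:>14}: min={ranges[name][0]:.3f}, max={ranges[name][1]:.3f}")
```

On data like this, the `MinMaxScaler` output spans exactly 0 to 1, the `Normalizer` output stays within -1 to 1, and the `StandardScaler` output extends a few units either side of 0, which matches the axis ranges people in this thread use to tell the three plots apart.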

...

kty

4 years, 4 months ago

Answer is correct.
StandardScaler: (-4, 4)
Normalizer: (-1, 1)
MinMax: (0, 1)

upvoted 3 times

...

HkIsCrazY

4 years, 5 months ago

Answers are correct

upvoted 4 times

...

Shankar_102

4 years, 5 months ago

200% answer is correct guys.

upvoted 4 times

...

ck1729

4 years, 5 months ago

min max scale = 0 to 1 and hence the answer is correct

upvoted 2 times

...

Axure92

4 years, 6 months ago

Answers are correct! https://scikit-learn.org/stable/auto_examples/preprocessing/plot_all_scaling.html A - Standard; B - Min Max; C - Normalizer

upvoted 3 times

...

satishgunjal

4 years, 6 months ago

First one is the standard scaler: after the standard scaler, all features will have a mean close to zero; values are on the same scale, but the range is larger than with the min max scaler.
Second is the min max scaler: features are at the same relative scale after min max is applied (the space between them is also maintained).
Third one is the normalizer: each point is now within 1 unit of the origin on this Cartesian co-ordinate system.

upvoted 3 times

...
