Exam DP-100 topic 5 question 7 discussion

Actual exam question from Microsoft's DP-100

Question #: 7
Topic #: 5

You are determining if two sets of data are significantly different from one another by using Azure Machine Learning Studio.
Estimated values in one set of data may be more than or less than reference values in the other set of data. You must produce a distribution that has a constant Type I error as a function of the correlation.
You need to produce the distribution.
Which type of distribution should you produce?

A. Unpaired t-test with a two-tail option
B. Unpaired t-test with a one-tail option
C. Paired t-test with a one-tail option
D. Paired t-test with a two-tail option

Suggested Answer: D

by Zhuo at May 25, 2020, 8:04 a.m.

Comments

David_Tadeu

Highly Voted 3 years, 4 months ago

Selected Answer: D

"A two-tailed test is appropriate if the estimated value is greater or less than a certain range of values, for example, whether a test taker may score above or below a specific range of scores." https://en.wikipedia.org/wiki/One-_and_two-tailed_tests

upvoted 5 times
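To make the one-tail versus two-tail distinction concrete, here is a minimal sketch. It assumes a standard normal approximation to the test statistic (a real t-test would use the t distribution with the appropriate degrees of freedom), and the function names `one_tailed_p` and `two_tailed_p` are hypothetical, chosen only for illustration.

```python
from statistics import NormalDist


def one_tailed_p(z):
    """P(Z >= z): only tests whether the estimate is GREATER than the reference."""
    return 1.0 - NormalDist().cdf(z)


def two_tailed_p(z):
    """P(|Z| >= |z|): tests for a difference in EITHER direction."""
    return 2.0 * (1.0 - NormalDist().cdf(abs(z)))


# Same magnitude of difference, opposite signs (illustrative values only).
for z in (1.8, -1.8):
    print(f"z={z:+.1f}  one-tailed p={one_tailed_p(z):.4f}  "
          f"two-tailed p={two_tailed_p(z):.4f}")
```

The two-tailed p-value is the same for both signs, while the upper one-tailed test cannot detect the negative difference at all. That matches the "more than or less than" wording in the question.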

...

jl420

Most Recent 9 months, 1 week ago

Selected Answer: D

While the question does not state whether the datasets are paired, it seems more likely that they are. Here's why:

"Estimated values in one set of data may be more than or less than reference values in the other set of data": This wording suggests a comparison between estimated values and reference values. Estimated values are often paired with reference values to evaluate accuracy or difference. For example, you might be comparing predicted values with observed values, which is a classic paired scenario.

"Constant Type I error as a function of correlation": Maintaining a constant Type I error rate as a function of correlation typically implies that the correlation between the two sets of data points (e.g., estimated vs. reference) is being taken into account. This is a key aspect of a paired t-test, which accounts for the natural pairing and the potential correlation within the pairs.

That said, D is probably the correct answer.

upvoted 1 times
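The "constant Type I error as a function of the correlation" argument can be checked with a small simulation. The sketch below is an illustration under stated assumptions, not anything from Azure ML Studio: both samples are drawn with the same true mean (so every rejection is a Type I error), the sample size of 10, the 2000 trials, and the `simulate` helper are arbitrary choices, and the critical values are the standard two-tailed 5% t-table values for 9 and 18 degrees of freedom.

```python
import math
import random
import statistics

N = 10                   # pairs per simulated experiment (arbitrary choice)
T_CRIT_PAIRED = 2.262    # two-tailed 5% critical value, N - 1 = 9 d.o.f.
T_CRIT_UNPAIRED = 2.101  # two-tailed 5% critical value, 2N - 2 = 18 d.o.f.


def simulate(rho, trials=2000, seed=0):
    """Return (paired, unpaired) rejection rates when the null is true."""
    rng = random.Random(seed)
    paired_rejects = 0
    unpaired_rejects = 0
    for _ in range(trials):
        # x and y have equal means and unit variance, with correlation rho.
        x = [rng.gauss(0, 1) for _ in range(N)]
        y = [rho * xi + math.sqrt(1 - rho ** 2) * rng.gauss(0, 1) for xi in x]

        # Paired t statistic: mean of differences over its standard error.
        diffs = [a - b for a, b in zip(x, y)]
        t_paired = statistics.mean(diffs) / (statistics.stdev(diffs) / math.sqrt(N))

        # Unpaired (pooled-variance) t statistic for equal sample sizes.
        pooled_var = (statistics.variance(x) + statistics.variance(y)) / 2
        t_unpaired = (statistics.mean(x) - statistics.mean(y)) / math.sqrt(pooled_var * 2 / N)

        paired_rejects += abs(t_paired) > T_CRIT_PAIRED
        unpaired_rejects += abs(t_unpaired) > T_CRIT_UNPAIRED
    return paired_rejects / trials, unpaired_rejects / trials


rates = {rho: simulate(rho) for rho in (0.0, 0.5, 0.9)}
for rho, (paired, unpaired) in rates.items():
    print(f"rho={rho:.1f}  paired={paired:.3f}  unpaired={unpaired:.3f}")
```

In this setup the paired rejection rate should stay close to the nominal 5% at every correlation, while the unpaired rate drifts away from 5% as the correlation grows, which is the behaviour the question describes.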

...

evangelist

1 year, 3 months ago

Selected Answer: D

If we assume that the two data sets are uncorrelated, then choose the unpaired t-test. If the two data sets are related (for example, two measurements from the same set of samples), choose a paired t-test. The question does not clearly state whether the data sets are paired, but based on common data comparison scenarios, we can infer that they are paired.

upvoted 2 times
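The practical difference between the two tests shows up when both statistics are computed on the same matched data. This is a minimal sketch using only the Python standard library; the data values and the helper names `paired_t` and `unpaired_t` are hypothetical, and the unpaired version assumes the pooled-variance (Student) form.

```python
import math
import statistics


def paired_t(a, b):
    """Paired t statistic: works on the per-pair differences."""
    diffs = [x - y for x, y in zip(a, b)]
    return statistics.mean(diffs) / (statistics.stdev(diffs) / math.sqrt(len(diffs)))


def unpaired_t(a, b):
    """Pooled-variance two-sample t statistic: ignores the pairing."""
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * statistics.variance(a)
                  + (nb - 1) * statistics.variance(b)) / (na + nb - 2)
    return (statistics.mean(a) - statistics.mean(b)) / math.sqrt(pooled_var * (1 / na + 1 / nb))


# Made-up matched measurements: each estimate belongs to one reference value.
reference = [10.0, 12.0, 9.0, 11.0, 13.0]
estimated = [11.0, 14.0, 9.0, 13.0, 14.0]

t_paired = paired_t(estimated, reference)      # 4 degrees of freedom
t_unpaired = unpaired_t(estimated, reference)  # 8 degrees of freedom
print(f"paired t = {t_paired:.3f}, unpaired t = {t_unpaired:.3f}")
```

On this made-up data the paired statistic is about 3.21 and the unpaired one is 1.00. Against the usual two-tailed 5% critical values (2.776 for 4 degrees of freedom, 2.306 for 8), only the paired test flags the difference, because it removes the variation shared within each pair.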

...

ZoeJ

2 years, 3 months ago

I think the given answer is correct.

upvoted 1 times

...

phdykd

2 years, 5 months ago

Option C is the correct answer: Paired t-test with a one-tail option.

upvoted 2 times

...

ning

3 years, 2 months ago

I guess two tails because the two data sets might be extremely negatively correlated? Not sure whether that is what the question is asking, though; I cannot find any reference for it in azureml. Purely statistically speaking, I agree with the one-tail paired test.

upvoted 2 times

...

synapse

3 years, 5 months ago

Selected Answer: D

The answer is D, paired t-test with a two-tail option. But I highly doubt this question is still being asked.

upvoted 3 times

...

spaceykacey

3 years, 9 months ago

Is this question still being asked?

upvoted 1 times

spaceykacey

3 years, 9 months ago

since it's from ML Studio (Classic)

upvoted 1 times

...

bdsrca

3 years, 11 months ago

A distribution that has a constant Type I error as a function of the correlation. = Paired

upvoted 2 times

...

saurabhk1

4 years, 5 months ago

It should be unpaired, as each pair of scores is independent of every other pair. And one-tail, since we are looking only for an inequality (more or less).

upvoted 4 times

...

lucazav

4 years, 10 months ago

This picture is taken from Wikipedia: https://en.wikipedia.org/wiki/Student's_t-test#Unpaired_and_paired_two-sample_t-tests

upvoted 2 times

...

Zhuo

5 years, 2 months ago

Why paired?

upvoted 1 times

JUEI

5 years ago

I think the keyword is "sets", which implies "pairs", as in the definition: "Choose a paired t-test when these conditions apply: You have matched pairs of scores. For example, you might have two different measures per person, or matched pairs of individuals (such as a husband and wife). Each pair of scores is independent of every other pair. The sampling distribution of d is normal." A paired t-test is useful when comparing related cases. By averaging the differences between the scores of the paired cases, you can determine whether the total difference is statistically significant.

upvoted 4 times

yanbin43

4 years, 8 months ago

The key phrase is "reference values in the other set of data". It indicates that the two sets of data come from the same source hence paired.

upvoted 12 times

...

hendrata

5 years, 2 months ago

I agree it should be unpaired

upvoted 2 times

mhall1

5 years, 1 month ago

Paired because they are estimated and reference values of the same thing (or at least I took that as implied). Thus, they are related and should vary together.

upvoted 19 times

agu_elli

3 years, 11 months ago

It will always be "values of the same thing", since they are testing the same variable (hypothesis). For me, it is unpaired, since you can't find another link (apart from the one you are testing). An example of paired data is the before-and-after effect of a pharmaceutical treatment on the same group of people.

upvoted 1 times

...

snegnik

2 years, 2 months ago

Where did you find it in the description?

upvoted 1 times

...