Exam DP-100 topic 3 question 30 discussion

Actual exam question from Microsoft's DP-100
Question #: 30
Topic #: 3

You are performing a filter-based feature selection for a dataset to build a multi-class classifier by using Azure Machine Learning Studio.
The dataset contains categorical features that are highly correlated to the output label column.
You need to select the appropriate feature scoring statistical method to identify the key predictors.
Which method should you use?

  • A. Kendall correlation
  • B. Spearman correlation
  • C. Chi-squared
  • D. Pearson correlation
Suggested Answer: C

Comments

Yilu
Highly Voted 4 years, 9 months ago
I think the answer should be C. Chi-squared as both label and features are categorical.
upvoted 47 times
roncil
2 years ago
Agreed, chi-squared for categorical data.
upvoted 1 times
...
febriyanasn
4 years ago
Chi-Squared: Labels and features can be text or numeric. Use this method for computing feature importance for two categorical columns. https://docs.microsoft.com/en-us/azure/machine-learning/studio-module-reference/filter-based-feature-selection
upvoted 3 times
...
...
Gitty
Highly Voted 4 years, 6 months ago
C is the answer. Your choice of a filter selection method depends in part on what sort of input data you have. For the Pearson correlation, Spearman correlation, and Fisher score methods, the requirement is "features must be numeric". For chi-squared, the requirement is "features can be text or numeric", so you can use this method to compute feature importance for categorical columns. See the table at this link: https://docs.microsoft.com/en-us/azure/machine-learning/studio-module-reference/filter-based-feature-selection
upvoted 15 times
Laredo
4 years, 2 months ago
I agree. Besides, the dataset here has categorical features, while Pearson correlation is for continuous variables.
upvoted 1 times
...
...
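To illustrate the point above about Pearson needing numeric, continuous inputs, here is a minimal sketch in plain Python. The toy data, the integer codings, and the helper name `pearson` are all hypothetical and are not taken from the question or from Azure Machine Learning Studio. It shows that when a categorical feature is forced into integer codes, the Pearson score changes with an arbitrary choice of codes, even though the feature itself has not changed.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length numeric lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)

# Hypothetical toy data: the feature perfectly determines the label.
label = ["cat", "cat", "dog", "dog", "bird", "bird"]
feature = ["a", "a", "b", "b", "c", "c"]

# Arbitrary integer codes (assumptions made only for this illustration).
label_codes = {"cat": 0, "dog": 1, "bird": 2}
coding_1 = {"a": 0, "b": 1, "c": 2}
coding_2 = {"a": 0, "b": 2, "c": 1}  # same categories, different numbers

y = [label_codes[v] for v in label]
r1 = pearson([coding_1[v] for v in feature], y)
r2 = pearson([coding_2[v] for v in feature], y)

print(r1, r2)  # the two codings give different scores for the same feature
```

With these codes the same feature scores 1.0 under one coding and 0.5 under the other, which is why a method that treats values as categories is the safer choice here.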
nicorg5
Most Recent 11 months ago
I think the correct answer is C too.
upvoted 1 times
...
NullVoider_0
1 year, 1 month ago
Selected Answer: C
The best statistical method to use for filter-based feature selection in this multi-class classification scenario with categorical features is Chi-squared. The chi-squared test measures dependence between categorical variables. It will identify categorical features that have a statistically significant association with the label column.
upvoted 1 times
...
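As a rough sketch of the dependence measure described above, the chi-squared statistic can be computed by hand from the counts of each (feature value, label value) pair. This is plain Python on hypothetical toy data; the function name `chi_squared_score` is made up for this example and is not the implementation used by Azure Machine Learning Studio.

```python
from collections import Counter

def chi_squared_score(feature, label):
    """Pearson chi-squared statistic for two categorical columns."""
    n = len(feature)
    observed = Counter(zip(feature, label))  # counts per (feature, label) pair
    feature_totals = Counter(feature)
    label_totals = Counter(label)
    score = 0.0
    for f_value, f_count in feature_totals.items():
        for l_value, l_count in label_totals.items():
            # Count expected in this cell if feature and label were independent.
            expected = f_count * l_count / n
            score += (observed[(f_value, l_value)] - expected) ** 2 / expected
    return score

# Hypothetical toy data for a three-class label.
label = ["cat", "cat", "dog", "dog", "bird", "bird"]
informative = ["a", "a", "b", "b", "c", "c"]    # tracks the label exactly
uninformative = ["x", "y", "x", "y", "x", "y"]  # spread evenly across classes

print(chi_squared_score(informative, label))
print(chi_squared_score(uninformative, label))
```

A feature that moves with the label gets a high score and one that is spread evenly across the classes gets a score of zero, so ranking features by this score surfaces the key categorical predictors. The score depends only on counts, so it works on text values directly.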
InversaRadice
1 year, 2 months ago
It's funny, because the explanation for this answer itself states: "Pearson's correlation coefficient is the test statistics that measures the statistical relationship, or association, between two __continuous variables__." So the answer can't be Pearson!
upvoted 1 times
...
fhlos
1 year, 7 months ago
Selected Answer: C
C - ChatGPT
upvoted 1 times
...
mkk888
1 year, 7 months ago
Selected Answer: C
Chi-square works for categorical data and the rest don't, so it should be the answer.
upvoted 1 times
...
krishna1818
1 year, 8 months ago
Selected Answer: C
When the features as well as the label are categorical, we can use the chi-squared method.
upvoted 1 times
...
ajay0011
1 year, 10 months ago
Selected Answer: C
C is correct
upvoted 1 times
...
phdykd
1 year, 12 months ago
C is the answer.
upvoted 1 times
...
Padilha
2 years ago
Selected Answer: C
Those "correct answers" were not made by data scientists for sure.
upvoted 1 times
...
synapse
2 years, 11 months ago
Selected Answer: C
Chi-squared. It's categorical.
upvoted 2 times
...
newuu
3 years, 1 month ago
The answer should be C.
Chi-Squared: Feature -> Numeric | Text; Label -> Numeric | Text
Pearson | Kendall | Spearman | Fisher Score: Feature -> Numeric; Label -> Numeric | Text
https://docs.microsoft.com/en-us/azure/machine-learning/studio-module-reference/filter-based-feature-selection
upvoted 3 times
...
dija123
3 years, 2 months ago
Selected Answer: C
Agree with C
upvoted 1 times
...
RyanTsai
3 years, 4 months ago
agree: Chi-squared
upvoted 1 times
...
slash_nyk
3 years, 6 months ago
the answer is C
upvoted 2 times
...
rishi_ram
3 years, 8 months ago
Answer is definitely C: Chi-squared is used for categorical features. Pearson is used for continuous variables, not categorical ones.
upvoted 2 times
...