Exam DP-100 topic 6 question 3 discussion

Actual exam question from Microsoft's DP-100

Question #: 3
Topic #: 7

DRAG DROP -
You need to define an evaluation strategy for the crowd sentiment models.
Which three actions should you perform in sequence? To answer, move the appropriate actions from the list of actions to the answer area and arrange them in the correct order.
Select and Place:

Suggested Answer:

Scenario:
Experiments for local crowd sentiment models must combine local penalty detection data.
Crowd sentiment models must identify known sounds such as cheers and known catch phrases. Individual crowd sentiment models will detect similar sounds.
Note: Evaluate the changes in correlation between model error rate and centroid distance.
In machine learning, a nearest centroid classifier or nearest prototype classifier is a classification model that assigns to observations the label of the class of training samples whose mean (centroid) is closest to the observation.
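The definition above can be illustrated with a minimal sketch in pure Python. The feature values and the labels "cheer" and "boo" are made up for illustration and do not come from the case study; Euclidean distance is an assumption.

```python
# Minimal nearest centroid classifier sketch (pure Python, toy data).
from math import dist


def fit_centroids(samples, labels):
    """Return {label: centroid}, where each centroid is the per-class mean vector."""
    grouped = {}
    for point, label in zip(samples, labels):
        grouped.setdefault(label, []).append(point)
    return {
        label: tuple(sum(axis) / len(points) for axis in zip(*points))
        for label, points in grouped.items()
    }


def predict(centroids, point):
    """Assign the label whose centroid is closest (Euclidean distance)."""
    return min(centroids, key=lambda label: dist(centroids[label], point))


# Hypothetical 2-D audio features; the labels are illustrative only.
samples = [(1.0, 1.0), (1.0, 2.0), (8.0, 8.0), (9.0, 8.0)]
labels = ["cheer", "cheer", "boo", "boo"]

centroids = fit_centroids(samples, labels)
prediction = predict(centroids, (2.0, 1.0))
print(prediction)
```

The new observation (2.0, 1.0) lies much closer to the mean of the "cheer" samples than to the mean of the "boo" samples, so it receives the "cheer" label.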
Reference:
https://en.wikipedia.org/wiki/Nearest_centroid_classifier
https://docs.microsoft.com/en-us/azure/machine-learning/studio-module-reference/sweep-clustering

by mrkalman at Sept. 8, 2020, 3:17 p.m.

Comments

mrkalman

Highly Voted 4 years, 4 months ago

Does this question and answer make sense? I don't have any idea at all. Could anyone kindly explain?

upvoted 31 times

...

kalel249

Highly Voted 4 years, 3 months ago

The best I could gather was this: they would like to do crowd segmentation, which would help them target certain people for their ad campaigns, using clustering based on video and audio of the people in the crowd. The question wants us to create an evaluation strategy for the models they created. In the problem description, they said they noticed 47 features were not performing well and that they would engineer 10 independent features from them before retraining the model. This gives us the first answer, "Add new features for retraining...".

upvoted 15 times

...

haby

Most Recent 1 year, 1 month ago

A - This will be the first one. I think this is part of a cluster-then-classify model. Based on my experience, I would use the cluster result as a new feature for the later classification model; that's the reason we say "Add new features for retraining supervised models". E makes sense to me as well, but I have no idea about C.

upvoted 1 times
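A rough sketch of the cluster-then-classify idea from the comment above, in pure Python: each row gets two extra features, the index of its nearest cluster and the distance to that cluster's centroid. The centroids, the rows, and the helper name `add_cluster_features` are hypothetical; the case study does not specify how the new features are built.

```python
from math import dist

# Hypothetical centroids, assumed to come from an earlier clustering step.
centroids = [(0.5, 0.5), (10.5, 10.5)]


def add_cluster_features(row, centroids):
    """Append the nearest cluster's index and the distance to its centroid."""
    distances = [dist(row, c) for c in centroids]
    cluster_id = distances.index(min(distances))
    return (*row, cluster_id, distances[cluster_id])


# Toy feature rows (illustrative only).
rows = [(0.0, 1.0), (11.0, 10.0)]
augmented = [add_cluster_features(r, centroids) for r in rows]

for original, extended in zip(rows, augmented):
    print(original, "->", extended)
```

The augmented rows could then be passed to any supervised model for retraining.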

haby

1 year, 1 month ago

It looks like C is a kind of error check. For example, when using KMeans, we need to plot SSE vs. k to determine which k value is better. In this case it is classification, but we are doing a similar thing: it switches from SSE vs. k to "shortest distance from centroid" vs. "model error rate".

upvoted 1 times
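The SSE vs. k check mentioned above can be sketched in pure Python. This is a plain Lloyd-style k-means on toy data with three obvious groups; seeding the centroids with the first k points is an assumption made only to keep the example deterministic.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iterations=20):
    """Plain Lloyd's algorithm, seeded with the first k points (assumption)."""
    centroids = [points[i] for i in range(k)]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: sq_dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Recompute each centroid as the mean of its cluster; keep it if empty.
        centroids = [
            tuple(sum(axis) / len(c) for axis in zip(*c)) if c else centroids[i]
            for i, c in enumerate(clusters)
        ]
    return centroids


def sse(points, centroids):
    """Sum of squared distances from each point to its nearest centroid."""
    return sum(min(sq_dist(p, c) for c in centroids) for p in points)


# Toy data: three well-separated groups of four points, interleaved.
points = [
    (0, 0), (10, 10), (20, 0), (0, 1), (10, 11), (20, 1),
    (1, 0), (11, 10), (21, 0), (1, 1), (11, 11), (21, 1),
]

sse_by_k = {k: sse(points, kmeans(points, k)) for k in range(1, 5)}
for k, value in sse_by_k.items():
    print(k, round(value, 2))
```

On this data the SSE drops sharply up to k = 3 and barely changes after that, which is the "elbow" one would look for in the plot.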

...

phdykd

1 year, 5 months ago

Could be A, C, B.

upvoted 1 times

...

phdykd

1 year, 6 months ago

B - Filter labeled cases for retraining using the shortest distance from centroids: Start by identifying the labeled cases that are closest to the centroids of their respective clusters. These would typically be the most representative samples of their classes and would form a solid base for initial model training.
C - Evaluate the changes in correlation between model error rate and centroid distance: After retraining the model with the selected cases, evaluate how the model's error rate correlates with the distance of samples from the centroids. This will provide insights into how well the model is performing and whether samples farther from the centroids are more likely to be misclassified.
E - Filter labeled cases for retraining using the longest distance from centroids: Based on the evaluation in step 2, it may be observed that samples farther from the centroids are not being accurately classified. To improve the model's performance on these cases, they should be included in the training set for retraining.

upvoted 1 times
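The steps above can be sketched in pure Python under some loud assumptions: every labeled case already has a distance to its cluster centroid and a flag saying whether the model misclassified it. The case IDs, distances, and flags below are invented, and Pearson correlation is just one possible way to measure the relationship.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical labeled cases: (case id, distance to centroid, misclassified?).
cases = [
    ("c1", 2.0, True), ("c2", 0.2, False), ("c3", 3.8, True), ("c4", 0.9, False),
    ("c5", 2.6, False), ("c6", 0.5, False), ("c7", 3.1, True), ("c8", 1.4, False),
]

# Correlation between centroid distance and a 0/1 error indicator.
distances = [d for _, d, _ in cases]
errors = [1.0 if wrong else 0.0 for _, _, wrong in cases]
correlation = pearson(distances, errors)

# Filter cases by shortest and longest distance from the centroid.
by_distance = sorted(cases, key=lambda case: case[1])
closest = [case_id for case_id, _, _ in by_distance[:3]]
farthest = [case_id for case_id, _, _ in by_distance[-3:]]

print(round(correlation, 3), closest, farthest)
```

A clearly positive correlation, as in this toy data, would suggest that cases far from the centroids are the ones the model tends to get wrong. Comparing the value before and after retraining would be one way to "evaluate the changes".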

...

phdykd

1 year, 11 months ago

The three actions that should be performed in sequence to define an evaluation strategy for the crowd sentiment models are:
C) Evaluate the changes in correlation between model error rate and centroid distance: This step involves evaluating the correlation between the model's error rate and the distance from the centroid. It helps in identifying if the model is overfitting or underfitting the data.
B) Filter labeled cases for retraining using the shortest distance from centroids: This step involves filtering the labeled cases for retraining based on the shortest distance from the centroids. This helps in selecting the cases that are closer to the centroids and are more representative of the cluster.
A) Add new features for retraining supervised models: This step involves adding new features for retraining supervised models. The new features can help improve the performance of the models and capture important information from the data.
Therefore, the correct order of actions is C, B, A.

upvoted 3 times

snegnik

1 year, 8 months ago

ChatGPT3.5?

upvoted 1 times

...

PremPatrick

2 years, 2 months ago

Did this appear in any of the previous exams?

upvoted 6 times

michaelmorar

1 year, 11 months ago

Writing on Friday, will let you know.

upvoted 3 times

snegnik

1 year, 8 months ago

What's the news?

upvoted 1 times

...

ning

2 years, 7 months ago

I cannot really follow this case study overall ... After comparing all the options, I think the answer is logically sound ... No other comments ...

upvoted 2 times

...

jed_elhak

3 years, 4 months ago

The question is complicated, but I'd say it's a comparison between existing sounds and new sounds, so: 1) add new features, 2) use correlation to know how much the new and old features are correlated, 3) evaluate.

upvoted 2 times

jed_elhak

3 years, 4 months ago

Sorry, I mean 3) filter based on the shortest distance.

upvoted 1 times

...

prashantjoge

3 years, 8 months ago

I couldn't make heads or tails of this question. Clueless...

upvoted 6 times

...

HoustonHo

4 years, 3 months ago

no idea about this.

upvoted 4 times

...