Exam DP-100 topic 3 question 104 discussion

Actual exam question from Microsoft's DP-100
Question #: 104
Topic #: 3

DRAG DROP -
You previously deployed a model that was trained using a tabular dataset named training-dataset, which is based on a folder of CSV files.
Over time, you have collected the features and predicted labels generated by the model in a folder containing a CSV file for each month. You have created two tabular datasets based on the folder containing the inference data: one named predictions-dataset with a schema that matches the training data exactly, including the predicted label; and another named features-dataset with a schema containing all of the feature columns and a timestamp column based on the filename, which includes the day, month, and year.
You need to create a data drift monitor to identify any changing trends in the feature data since the model was trained. To accomplish this, you must define the required datasets for the data drift monitor.
Which datasets should you use to configure the data drift monitor? To answer, drag the appropriate datasets to the correct data drift monitor options. Each source may be used once, more than once, or not at all. You may need to drag the split bar between panes or scroll to view content.
NOTE: Each correct selection is worth one point.
Select and Place:

Suggested Answer:

Comments

David_Tadeu
Highly Voted 2 years, 1 month ago
The answer should be Box 1: Training dataset; Box 2: Features dataset. In a data drift monitor, the baseline dataset is "usually the training dataset for a model", and the target dataset "... MUST have a timestamp column specified".
upvoted 18 times
Arend78
1 year, 5 months ago
Indeed, the drift monitor looks at changes (e.g. seasonal) in the inputs, and does not look at the predictions
upvoted 2 times
...
...
A_PL300
Most Recent 8 months, 1 week ago
A question like this one was on the Sept. 4, 2022 exam.
upvoted 1 time
...
bobML
8 months, 2 weeks ago
To configure a data drift monitor, you typically use a baseline dataset and a target dataset for comparison. In this scenario, you want to monitor the changing trends in the feature data since the model was trained. Here's how you should configure the data drift monitor:
Baseline dataset: training-dataset. The baseline should represent the data at the time the model was trained. Here that is training-dataset, the original dataset used to train the model.
Target dataset: features-dataset. The target is the dataset you want to monitor for drift, and it must contain the features and timestamp information. Here that is features-dataset, because it holds the feature data you want to compare with the baseline.
You don't need predictions-dataset to configure the monitor: it adds the predicted labels, which are not relevant for monitoring drift in the features.
upvoted 1 time
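The selection rule described in this thread can be sketched in plain Python. This is only an illustration of the reasoning, not Azure ML code: the `DatasetInfo` class, the `pick_drift_datasets` helper, and the column names (`age`, `income`, `label`, `timestamp`) are all hypothetical and do not come from the question or from any SDK.

```python
from dataclasses import dataclass
from typing import Optional, Sequence, Tuple


@dataclass(frozen=True)
class DatasetInfo:
    """Hypothetical description of a registered tabular dataset."""
    name: str
    columns: Tuple[str, ...]
    timestamp_column: Optional[str] = None  # None: no timestamp column defined


def pick_drift_datasets(
    training: DatasetInfo,
    candidates: Sequence[DatasetInfo],
    label_column: str,
) -> Tuple[DatasetInfo, DatasetInfo]:
    """Return (baseline, target) for a feature-drift monitor.

    Baseline: the training dataset.
    Target: the first candidate that has a timestamp column and
    contains every feature column of the training data.
    """
    features = {c for c in training.columns if c != label_column}
    for ds in candidates:
        if ds.timestamp_column is not None and features <= set(ds.columns):
            return training, ds
    raise ValueError("no candidate has a timestamp column and all feature columns")


# Hypothetical schemas mirroring the scenario in the question.
training = DatasetInfo("training-dataset", ("age", "income", "label"))
predictions = DatasetInfo("predictions-dataset", ("age", "income", "label"))
features = DatasetInfo(
    "features-dataset", ("age", "income", "timestamp"), timestamp_column="timestamp"
)

baseline, target = pick_drift_datasets(training, [predictions, features], "label")
print(baseline.name, target.name)
```

Under these assumed schemas, predictions-dataset is skipped because it has no timestamp column, which is the same argument the comments above make.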
...
therealola
1 year, 11 months ago
On exam 18-06-22
upvoted 2 times
...
striver
1 year, 11 months ago
Correct answer is Box 1: Training Dataset; Box 2: Features Dataset. Reference: https://docs.microsoft.com/en-us/azure/machine-learning/how-to-monitor-datasets?tabs=python#create-target-dataset
upvoted 4 times
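The question says the timestamp column of features-dataset is derived from the filename, which includes the day, month, and year. The linked documentation covers how this is done in Azure ML itself; the sketch below only illustrates the idea in plain Python. The filename layout (`features_DD-MM-YYYY.csv`) and the helper name `timestamp_from_filename` are assumptions, since the question does not give the actual naming scheme.

```python
import re
from datetime import datetime

# Assumed layout: the filename ends in DD-MM-YYYY.csv (hypothetical).
FILENAME_PATTERN = re.compile(r"(\d{2})-(\d{2})-(\d{4})\.csv$")


def timestamp_from_filename(path: str) -> datetime:
    """Extract the date encoded at the end of a CSV filename."""
    match = FILENAME_PATTERN.search(path)
    if match is None:
        raise ValueError(f"no DD-MM-YYYY date found in {path!r}")
    day, month, year = (int(group) for group in match.groups())
    return datetime(year, month, day)


# Example with a made-up path.
stamp = timestamp_from_filename("inference/features_01-03-2022.csv")
print(stamp.date())
```

Every row loaded from a given file would then carry that file's date, which gives the drift monitor the time series it needs for the target dataset.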
...
JTWang
2 years, 1 month ago
on exam 04/22/2022
upvoted 2 times
...
synapse
2 years, 2 months ago
1. Baseline: training dataset. 2. Target: features dataset. The features dataset has a timestamp in it.
upvoted 1 time
...
AjoseO
2 years, 2 months ago
On 03 March 2022
upvoted 3 times
...
AjoseO
2 years, 2 months ago
1. Training dataset 2. Predictions dataset -> because this is the only dataset that has a timestamp column
upvoted 2 times
AjoseO
2 years, 2 months ago
Sorry. 2. Features dataset -> because this is the only dataset that has a timestamp column
upvoted 5 times
...
...
ranjsi01
2 years, 4 months ago
The target dataset should be the features dataset (a timestamp column is mandatory in the target dataset).
upvoted 1 time
...
Tsardoz
2 years, 4 months ago
I can't even find any reference to what a feature dataset is ... my vote goes to predictions dataset
upvoted 2 times
...
J_AR
2 years, 4 months ago
The target dataset should be "predictions dataset" because this is the only dataset that has a timestamp column.
upvoted 4 times
Oliverto
2 years, 4 months ago
Target dataset should be "features-dataset", because only the features-dataset contains a timestamp, which is mandatory: "target_dataset: Required. Dataset to run either adhoc or scheduled DataDrift jobs for. Must be a time series." (https://docs.microsoft.com/en-us/python/api/azureml-datadrift/azureml.datadrift.datadriftdetector(class)?view=azure-ml-py)
upvoted 4 times
...
pancman
2 years, 1 month ago
J_AR, you didn't read the question correctly. The dataset that contains the timestamp is features-dataset. The question states: "another named features-dataset with a schema containing all of the feature columns and a timestamp column"
upvoted 1 time
...
...