Exam DP-100 topic 3 question 104 discussion

Actual exam question from Microsoft's DP-100

Question #: 104
Topic #: 3

DRAG DROP -
You previously deployed a model that was trained using a tabular dataset named training-dataset, which is based on a folder of CSV files.
Over time, you have collected the features and predicted labels generated by the model in a folder containing a CSV file for each month. You have created two tabular datasets based on the folder containing the inference data: one named predictions-dataset with a schema that matches the training data exactly, including the predicted label; and another named features-dataset with a schema containing all of the feature columns and a timestamp column based on the filename, which includes the day, month, and year.
You need to create a data drift monitor to identify any changing trends in the feature data since the model was trained. To accomplish this, you must define the required datasets for the data drift monitor.
Which datasets should you use to configure the data drift monitor? To answer, drag the appropriate datasets to the correct data drift monitor options. Each source may be used once, more than once, or not at all. You may need to drag the split bar between panes or scroll to view content.
NOTE: Each correct selection is worth one point.
Select and Place:

Suggested Answer:

by J_AR at Jan. 1, 2022, 3:36 p.m.

Comments

David_Tadeu

Highly Voted 2 years, 3 months ago

The answer should be Box 1: Training dataset; Box 2: Features dataset. In a data drift monitor, the baseline dataset is "usually the training dataset for a model", and the target dataset "... MUST have a timestamp column specified".

upvoted 18 times

Arend78

1 year, 7 months ago

Indeed, the drift monitor looks at changes (e.g., seasonal ones) in the inputs; it does not look at the predictions.
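To illustrate the point that only the inputs are compared, here is a minimal sketch in plain Python. It is not the metric Azure ML uses; the `mean_shift` helper, the feature values, and the month keys are all made up for illustration. It scores each monthly window of one feature by how far its mean has moved from the baseline mean, and no label or prediction column is involved.

```python
from statistics import mean, stdev


def mean_shift(baseline, window):
    """Distance between the window mean and the baseline mean,
    expressed in baseline standard deviations (illustrative metric only)."""
    return abs(mean(window) - mean(baseline)) / stdev(baseline)


# Hypothetical values of a single feature in the training (baseline) data.
baseline_temps = [10.0, 12.0, 11.0, 13.0, 9.0]

# Hypothetical values of the same feature in inference data, grouped by month.
monthly_temps = {
    "2021-06": [11.0, 12.0, 10.0],
    "2021-07": [19.0, 21.0, 20.0],
}

# One score per time window; a larger score means the inputs have moved further.
scores = {month: mean_shift(baseline_temps, values)
          for month, values in monthly_temps.items()}

for month, score in scores.items():
    print(month, round(score, 2))
```

Grouping by month is only possible because each inference row can be placed in time, which is why the target dataset needs a timestamp.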

upvoted 2 times

...

A_PL300

Most Recent 10 months ago

A question like this one was on the Sept. 4, 2022 exam.

upvoted 1 times

...

bobML

10 months, 1 week ago

To configure a data drift monitor, you typically use a baseline dataset and a target dataset for comparison. In this scenario, you want to monitor the changing trends in the feature data since the model was trained. Here's how you should configure the data drift monitor:

Baseline Dataset: Training-dataset
The baseline dataset should be the dataset that represents the data at the time when the model was trained. In this case, it's the training-dataset since it is the original dataset used for training the model.

Target Dataset: Features-dataset
The target dataset should be the dataset that you want to monitor for data drift, which contains the features and timestamp information. In this case, it's the features-dataset because it contains the feature data that you want to compare with the baseline data.

You don't need to use the predictions-dataset for configuring the data drift monitor because it contains the predicted labels, which are not relevant for monitoring data drift in the features.
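The selection rule described above can be written as a small sketch. `DatasetInfo` and `pick_drift_datasets` are hypothetical helpers, not Azure ML types, and the column name `timestamp` is an assumption, since the question does not name the column.

```python
from dataclasses import dataclass
from typing import Optional, Sequence, Tuple


@dataclass(frozen=True)
class DatasetInfo:
    """Hypothetical summary of a tabular dataset's schema (not an Azure ML class)."""
    name: str
    has_label: bool
    timestamp_column: Optional[str] = None


def pick_drift_datasets(training: DatasetInfo,
                        candidates: Sequence[DatasetInfo]) -> Tuple[str, str]:
    """Return (baseline, target) names: the baseline is the training data,
    the target is the first candidate that exposes a timestamp column."""
    for ds in candidates:
        if ds.timestamp_column is not None:
            return training.name, ds.name
    raise ValueError("no candidate dataset has a timestamp column")


# The three datasets from the question, described by schema only.
training = DatasetInfo("training-dataset", has_label=True)
predictions = DatasetInfo("predictions-dataset", has_label=True)
features = DatasetInfo("features-dataset", has_label=False,
                       timestamp_column="timestamp")  # assumed column name

baseline, target = pick_drift_datasets(training, [predictions, features])
print("baseline:", baseline)
print("target:", target)
```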

upvoted 1 times

...

therealola

2 years ago

On exam 18-06-22

upvoted 2 times

...

striver

2 years, 1 month ago

The correct answer is Box 1: Training dataset; Box 2: Features dataset. Reference: https://docs.microsoft.com/en-us/azure/machine-learning/how-to-monitor-datasets?tabs=python#create-target-dataset

upvoted 4 times

...

JTWang

2 years, 2 months ago

on exam 04/22/2022

upvoted 2 times

...

synapse

2 years, 4 months ago

1. Baseline: Training dataset. 2. Target: Features dataset, which has a timestamp column.

upvoted 1 times

...

AjoseO

2 years, 4 months ago

On 03 March 2022

upvoted 3 times

...

AjoseO

2 years, 4 months ago

1. Training dataset 2. Predictions dataset -> because this is the only dataset that has a timestamp column

upvoted 2 times

AjoseO

2 years, 4 months ago

Sorry, correction: 2. Features dataset, because this is the only dataset that has a timestamp column.

upvoted 5 times

...

ranjsi01

2 years, 5 months ago

The target dataset should be the features dataset (a timestamp column is mandatory in the target dataset).

upvoted 1 times

...

Tsardoz

2 years, 6 months ago

I can't even find any reference to what a feature dataset is ... my vote goes to the predictions dataset.

upvoted 2 times

...

J_AR

2 years, 6 months ago

The target dataset should be "predictions dataset" because this is the only dataset that has a timestamp column.

upvoted 4 times

Oliverto

2 years, 6 months ago

The target dataset should be "features-dataset", because only features-dataset contains a timestamp, which is mandatory: "target_dataset: Required. Dataset to run either adhoc or scheduled DataDrift jobs for. Must be a time series." (https://docs.microsoft.com/en-us/python/api/azureml-datadrift/azureml.datadrift.datadriftdetector(class)?view=azure-ml-py)
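For reference, a minimal sketch of how such a monitor might be created with the v1 Python SDK (`azureml-core` and `azureml-datadrift`). The monitor name, compute target name, frequency, threshold, and latency below are assumptions chosen for illustration, not values given in the question. The sketch also assumes both datasets are already registered in the workspace under the names used in the question, and that the timestamp column is already set on features-dataset.

```python
def create_feature_drift_monitor(ws, compute_target="cpu-cluster"):
    """Create a dataset drift monitor comparing features-dataset (target)
    against training-dataset (baseline).

    `ws` is an azureml.core.Workspace; "cpu-cluster" is a hypothetical
    compute target name.
    """
    # Imported here so the function can be defined without the SDK installed.
    from azureml.core import Dataset
    from azureml.datadrift import DataDriftDetector

    # Baseline: the data the model was trained on.
    baseline = Dataset.get_by_name(ws, "training-dataset")
    # Target: the inference features, which must be a time series dataset.
    target = Dataset.get_by_name(ws, "features-dataset")

    return DataDriftDetector.create_from_datasets(
        ws,
        "feature-drift-monitor",   # hypothetical monitor name
        baseline,
        target,
        compute_target=compute_target,
        frequency="Month",         # assumed, matching the monthly CSV files
        feature_list=None,         # None means all features are monitored
        drift_threshold=0.3,       # assumed alert threshold
        latency=24,                # assumed hours to wait for data to arrive
    )
```

A call would look like `create_feature_drift_monitor(Workspace.from_config())`, which needs a real workspace and compute target. The predictions-dataset is not passed anywhere.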

upvoted 4 times

...

pancman

2 years, 3 months ago

J_AR, you didn't read the question correctly. The dataset that contains the timestamp is features-dataset. The question states: "another named features-dataset with a schema containing all of the feature columns and a timestamp column"

upvoted 1 times

...