Exam DP-100 topic 3 question 7 discussion

Actual exam question from Microsoft's DP-100
Question #: 7
Topic #: 3

You create a datastore named training_data that references a blob container in an Azure Storage account. The blob container contains a folder named csv_files in which multiple comma-separated values (CSV) files are stored.
You have a script named train.py in a local folder named ./script that you plan to run as an experiment using an estimator. The script includes the following code to read data from the csv_files folder:

You have the following script.

You need to configure the estimator for the experiment so that the script can read the data from a data reference named data_ref that references the csv_files folder in the training_data datastore.
Which code should you use to configure the estimator?
A.

B.

C.

D.

E.

Suggested Answer: B
Besides passing the dataset through the inputs parameter of the estimator, you can also pass the dataset through script_params and get the data path (mount point) in your training script via arguments. This way, you can keep your training script independent of azureml-sdk. In other words, you will be able to use the same training script for local debugging and remote training on any cloud platform.
Example:
from azureml.train.sklearn import SKLearn

script_params = {
    # mount the dataset on the remote compute and pass the mounted path as an argument to the training script
    '--data-folder': mnist_ds.as_named_input('mnist').as_mount(),
    '--regularization': 0.5
}

est = SKLearn(source_directory=script_folder,
              script_params=script_params,
              compute_target=compute_target,
              environment_definition=env,
              entry_script='train_mnist.py')

# Run the experiment
run = experiment.submit(est)
run.wait_for_completion(show_output=True)
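
For the script side of this pattern, here is a minimal sketch of how a training script might pick up the path passed through script_params. The original train.py is not shown on this page, so this is an assumption rather than the exam's script: the argument names mirror the example above, the helper names parse_args and load_rows are hypothetical, and the standard-library csv module stands in for pandas so the sketch has no third-party dependencies.

```python
import argparse
import csv
import glob
import os


def parse_args(argv=None):
    """Parse the arguments that the estimator forwards from script_params."""
    parser = argparse.ArgumentParser()
    # '--data-folder' receives the mounted (or downloaded) path on the compute.
    parser.add_argument('--data-folder', type=str, dest='data_folder', required=True)
    parser.add_argument('--regularization', type=float, default=0.5)
    return parser.parse_args(argv)


def load_rows(data_folder):
    """Read every CSV file in data_folder into a list of dict rows."""
    rows = []
    for path in sorted(glob.glob(os.path.join(data_folder, '*.csv'))):
        with open(path, newline='') as f:
            rows.extend(csv.DictReader(f))
    return rows
```

Because the script only sees a folder path, the same code can be run locally, for example as python train.py --data-folder ./csv_files, without importing anything from azureml-sdk.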
Incorrect Answers:
A: Pandas DataFrame not used.
Reference:
https://docs.microsoft.com/es-es/azure/machine-learning/how-to-train-with-datasets

Comments

chaudha4
Highly Voted 3 years, 1 month ago
The use of estimator is deprecated. Use the ScriptRunConfig object with your own defined environment. Hope we don't see this question going forward !! https://docs.microsoft.com/en-us/python/api/azureml-train-core/azureml.train.estimator.estimator?view=azure-ml-py
upvoted 13 times
scipio
3 years ago
You're right, but if you replace the estimator with ScriptRunConfig this question still holds, because how you pass the Dataset (mount vs. download, by argument, etc.) is still relevant.
upvoted 5 times
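
As a rough illustration of the ScriptRunConfig route mentioned in this thread, the sketch below passes the data path as a command-line argument instead of through script_params. It assumes Azure ML SDK v1 and a FileDataset over the csv_files folder. The helper names build_arguments and make_run_config, the input name 'csv_files', and the dataset, compute_target and environment objects are hypothetical placeholders, not taken from the question.

```python
def build_arguments(data_input, regularization=0.5):
    """Command-line arguments for train.py.

    data_input is whatever should resolve to a path on the compute,
    e.g. a mounted dataset input (assumption) or a plain local path.
    """
    return ['--data-folder', data_input, '--regularization', regularization]


def make_run_config(dataset, compute_target, environment, script_folder='./script'):
    """Build a ScriptRunConfig that mounts the dataset and passes its path."""
    # Imported lazily so the sketch can be read and tested without the SDK.
    from azureml.core import ScriptRunConfig  # Azure ML SDK v1

    # Mount the dataset; at run time the argument becomes the mount path.
    data_input = dataset.as_named_input('csv_files').as_mount()
    return ScriptRunConfig(
        source_directory=script_folder,
        script='train.py',
        arguments=build_arguments(data_input),
        compute_target=compute_target,
        environment=environment,
    )
```

The result would be submitted the same way as the estimator, with experiment.submit(...), so the training script itself does not need to change.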
vv_bb
Most Recent 6 months, 3 weeks ago
Even though the Estimator is deprecated in favor of ScriptRunConfig (google "Migrating from Estimators to ScriptRunConfig"), I tried to understand the correct answer for the question as it is defined here. 1) For the Estimator class, both the "script_params" and "arguments" parameters are acceptable; check here - https://learn.microsoft.com/en-us/python/api/azureml-train-core/azureml.train.estimator.estimator?view=azure-ml-py 2) So how do we decide which of them is valid in our case? The answer is here (be aware that for PythonScriptStep "arguments" is the same as "script_params" for Estimator): https://learn.microsoft.com/en-us/azure/machine-learning/how-to-move-data-in-out-of-pipelines?view=azureml-api-1#access-datasets-within-your-script Meaning that because our script uses an ArgumentParser, we have to pass the dataset using "script_params".
upvoted 3 times
iai
1 year ago
Shouldn't it be D? With a local compute_target I'm not sure as_mount will work; as_download seems better.
upvoted 2 times
danishanis
1 year, 3 months ago
Answer is B. I typed the question as it is in ChatGPT and it gave the answer where the 'script_params' argument is configured to read data from 'data_ref' (and data_ref.as_mount() is being used to specify the file path in datastore) that references a 'csv_files' folder.
upvoted 2 times
jpalaci22
1 year, 3 months ago
Seen on the exam 20Feb2023
upvoted 3 times
Edriv
1 year, 5 months ago
Can be A, C, or E - what do you think?
upvoted 1 times
ning
2 years ago
B should be correct!
upvoted 3 times
TheYazan
2 years, 2 months ago
on march 2022
upvoted 4 times
[Removed]
2 years, 3 months ago
On 20Feb2022
upvoted 4 times
kisskeo
2 years, 8 months ago
On Exam 01 Oct 2021
upvoted 3 times
ljljljlj
2 years, 11 months ago
On exam 2021/7/10
upvoted 3 times
sarahmoin
2 years, 12 months ago
What is the correct answer? Why is it not D?
upvoted 1 times
vhx
2 years, 11 months ago
as_download copies the files to a temporary location on the compute where the script is being run; as_mount streams the files directly from their source.
upvoted 3 times
iai
1 year ago
Notice however, that compute target is local, will mounting work?
upvoted 1 times
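
To make the mount vs. download choice in this sub-thread concrete, here is a small sketch that picks the delivery mode with a parameter. It assumes an SDK v1 dataset object exposing as_named_input(...).as_mount() and .as_download(). The helper make_data_input and the input name 'csv_files' are hypothetical. Whether mounting works on a given local machine depends on the environment, so download may be the safer choice for a local run.

```python
def make_data_input(dataset, name='csv_files', mode='mount'):
    """Return a dataset input configured for the chosen delivery mode.

    mode='mount'    streams files from their source on demand.
    mode='download' copies files to the compute before the script runs.
    """
    named = dataset.as_named_input(name)
    if mode == 'mount':
        return named.as_mount()
    if mode == 'download':
        return named.as_download()
    raise ValueError(f"mode must be 'mount' or 'download', got {mode!r}")
```

Either result can be passed as the value of the data-folder argument; the training script receives a folder path in both cases.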
iuolu
3 years, 1 month ago
Nobody checked this question? The answer should be A, using to_pandas_dataframe() for tabular files instead
upvoted 2 times
chaudha4
3 years, 1 month ago
No, you are wrong. There are several problems in A. 1) The parameter is being passed as a named input. That is wrong since it is not being accessed using a named input in the script. 2) You convert to a dataframe in the script, not when you pass it. So A is definitely not the correct answer.
upvoted 10 times
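
For contrast, here is a sketch of what accessing a named input inside the script would look like, which is the access pattern point 1 says the question's script does not use. It assumes SDK v1, where the run context exposes input_datasets and a tabular dataset offers to_pandas_dataframe(). The helper name load_named_input is hypothetical.

```python
def load_named_input(run, name):
    """Fetch a dataset passed as a named input and convert it in the script.

    In a real run, 'run' would come from:
        from azureml.core import Run
        run = Run.get_context()
    """
    dataset = run.input_datasets[name]
    # The conversion to a DataFrame happens here, inside the training script,
    # not at the point where the dataset is handed to the estimator.
    return dataset.to_pandas_dataframe()
```

A script written this way depends on azureml-sdk, whereas a script that reads a folder path from its arguments does not.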