Exam DP-100 topic 3 question 47 discussion

Actual exam question from Microsoft's DP-100
Question #: 47
Topic #: 3

Note: This question is part of a series of questions that present the same scenario. Each question in the series contains a unique solution that might meet the stated goals. Some question sets might have more than one correct solution, while others might not have a correct solution.
After you answer a question in this section, you will NOT be able to return to it. As a result, these questions will not appear in the review screen.
You create an Azure Machine Learning service datastore in a workspace. The datastore contains the following files:
✑ /data/2018/Q1.csv
✑ /data/2018/Q2.csv
✑ /data/2018/Q3.csv
✑ /data/2018/Q4.csv
✑ /data/2019/Q1.csv
All files store data in the following format:
id,f1,f2,I
1,1,2,0
2,1,1,1
3,2,1,0
4,2,2,1
You run the following code:

You need to create a dataset named training_data and load the data from all files into a single data frame by using the following code:

Solution: Run the following code:

Does the solution meet the goal?

  • A. Yes
  • B. No
Suggested Answer: A

Comments

Haet
Highly Voted 3 years, 1 month ago
The Answer is clearly no
upvoted 15 times
...
reddragondms
Highly Voted 1 year, 8 months ago
It seems the question has been changed or updated since some of these comments were posted.
upvoted 11 times
...
james2033
Most Recent 8 months ago
This question is out of date. The import should be from azure.ai.ml import ..., not from azureml.core import Dataset. Reference: https://github.com/Azure/azure-sdk-for-python/tree/azure-ai-ml_1.11.1/sdk/ml/azure-ai-ml#authenticate-the-client
upvoted 1 times
...
PI_Team
10 months, 2 weeks ago
Selected Answer: A
It meets the requirements. See the example below from Microsoft:

    # create tabular dataset from all csv files in the directory
    tabular_dataset_3 = Dataset.Tabular.from_delimited_files(path=(datastore,'weather/**/*.csv'))

    # create tabular dataset from multiple paths
    data_paths = [(datastore, 'weather/2018/11.csv'), (datastore, 'weather/2018/12.csv')]
    tabular_dataset_4 = Dataset.Tabular.from_delimited_files(path=data_paths)

Link: https://learn.microsoft.com/en-us/python/api/azureml-core/azureml.data.dataset_factory.tabulardatasetfactory?view=azure-ml-py#azureml-data-dataset-factory-tabulardatasetfactory-from-delimited-files
SaM
upvoted 1 times
...
fhlos
11 months, 3 weeks ago
Selected Answer: B
No, the solution does not meet the goal. The code provided to create the dataset and load the data into a single DataFrame is incorrect. To create a dataset named training_data and load the data from all files into a single DataFrame, you need to modify the code as follows:

    from azureml.core import Dataset
    paths = [(data_store, 'data/2018/*.csv'), (data_store, 'data/2019/*.csv')]
    training_data = Dataset.Tabular.from_delimited_files(paths)
    data_frame = training_data.to_pandas_dataframe()

Explanation:
The paths variable is updated to specify the paths of all files to be included in the dataset. In this case, it includes all CSV files in the /data/2018 and /data/2019 directories.
The Dataset.Tabular.from_delimited_files() method is used to create the dataset training_data by providing the paths variable.
The to_pandas_dataframe() method is called on the training_data dataset to load the data from all files into a single pandas DataFrame.
By making these changes, the code will create the desired dataset and load the data from all files into a single DataFrame.
upvoted 1 times
...
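For anyone who wants to sanity-check the wildcard paths quoted above without a workspace, here is a minimal local sketch. It only uses Python's pathlib matching against the file list from the question, which is an assumption: the datastore may resolve globs with somewhat different rules. The helper name files_matching is made up for this illustration.

```python
from pathlib import PurePosixPath

# File list copied from the question.
FILES = [
    "/data/2018/Q1.csv",
    "/data/2018/Q2.csv",
    "/data/2018/Q3.csv",
    "/data/2018/Q4.csv",
    "/data/2019/Q1.csv",
]

# Wildcard patterns quoted in the comment above.
PATTERNS = ["data/2018/*.csv", "data/2019/*.csv"]


def files_matching(patterns, files):
    """Return the files that match at least one pattern, in input order.

    Hypothetical helper: it approximates glob resolution with pathlib and
    does not call Azure ML at all.
    """
    matched = []
    for name in files:
        relative = PurePosixPath(name.lstrip("/"))
        if any(relative.match(pattern) for pattern in patterns):
            matched.append(name)
    return matched


if __name__ == "__main__":
    for name in files_matching(PATTERNS, FILES):
        print(name)
```

Under these local matching rules the two patterns together cover all five files, while the 2018 pattern alone leaves out /data/2019/Q1.csv.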
abhishekm94
1 year ago
Correct answer is Yes. Links: https://learn.microsoft.com/en-us/python/api/azureml-core/azureml.data.dataset_factory.tabulardatasetfactory?view=azure-ml-py and https://learn.microsoft.com/en-us/python/api/azureml-core/azureml.data.tabulardataset?view=azure-ml-py
upvoted 1 times
...
centurion2020
1 year, 4 months ago
Selected Answer: A
Question updated as of Jan 2023. Based on https://learn.microsoft.com/en-us/python/api/azureml-core/azureml.data.dataset_factory.tabulardatasetfactory?view=azure-ml-py and https://learn.microsoft.com/en-us/python/api/azureml-core/azureml.data.tabulardataset?view=azure-ml-py, the answer seems to be A (Yes).
upvoted 9 times
...
sultanmr123
1 year, 6 months ago
Yes, the answer is B.
upvoted 1 times
casiopa
1 year, 6 months ago
Why B? The Dataset is Tabular, and there is no need for two file paths.
upvoted 1 times
...
...
ai_lover
1 year, 9 months ago
Answer is correct
upvoted 2 times
...
YipingRuan
2 years, 10 months ago
    web_path = 'https://dprepdata.blob.core.windows.net/demo/Titanic.csv'
    titanic_ds = Dataset.Tabular.from_delimited_files(path=web_path, set_column_types={'Survived': DataType.to_bool()})

    # preview the first 3 rows of titanic_ds
    titanic_ds.take(3).to_pandas_dataframe()

https://docs.microsoft.com/en-us/azure/machine-learning/how-to-create-register-datasets#set-data-schema
upvoted 1 times
...
brendal89
3 years, 2 months ago
I think the answer might be 'yes'. See this similar example for parquet files:

    datastore_path = [(dstore, dset_name + '/*/*/data.parquet')]
    dataset = Dataset.Tabular.from_parquet_files(path=datastore_path, partition_format = dset_name + '/{partition_time:yyyy/MM}/data.parquet')

The partition_format argument appears optional.
Reference: https://github.com/Azure/MachineLearningNotebooks/blob/master/how-to-use-azureml/work-with-data/datasets-tutorial/timeseries-datasets/tabular-timeseries-dataset-filtering.ipynb
upvoted 3 times
l2azure
3 years, 2 months ago
Answer is 'No'. You must create a pandas dataframe which is only possible from a Dataset.Tabular object. In this case (see last line) the dataframe cannot be made since it is a Dataset.File object.
upvoted 13 times
Arend78
1 year, 6 months ago
I think they changed the question. The code now ends with a Tabular dataset, which can indeed be converted with to_pandas_dataframe(). I think the answer is now "Yes".
upvoted 3 times
...
l2azure
3 years, 2 months ago
https://docs.microsoft.com/en-us/python/api/azureml-core/azureml.core.dataset.dataset?view=azure-ml-py
upvoted 2 times
...
...
...
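To make the Tabular versus File point in this thread concrete, here is a rough sketch of the Tabular route with the v1 SDK (azureml-core). It is not the code from the question, which is not reproduced on this page. The function names build_paths and load_training_frame, and the default patterns, are assumptions for illustration; the datastore argument is assumed to be an already-registered datastore object.

```python
def build_paths(datastore, patterns):
    """Pair each glob pattern with the datastore as (datastore, path) tuples.

    Hypothetical helper; the tuple form follows the examples quoted in
    the comments of this thread.
    """
    return [(datastore, pattern) for pattern in patterns]


def load_training_frame(datastore, patterns=("data/2018/*.csv", "data/2019/*.csv")):
    """Create a tabular dataset from the matching files and return a DataFrame.

    Assumes azureml-core (SDK v1) is installed and `datastore` is a
    datastore already registered in the workspace.
    """
    # Imported lazily so this sketch can be read and imported without the SDK.
    from azureml.core import Dataset

    paths = build_paths(datastore, patterns)
    # Use the Tabular factory: the comments above note that a dataset made
    # with Dataset.File.from_files cannot be turned into a pandas DataFrame.
    training_data = Dataset.Tabular.from_delimited_files(path=paths)
    return training_data.to_pandas_dataframe()
```

Usage would look like data_frame = load_training_frame(data_store), where data_store is whatever datastore object the workspace returned.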
dev2dev
3 years, 3 months ago
answer is yes.
upvoted 1 times
dev2dev
3 years, 3 months ago
No. It's not calling the correct function; it should be "from_delimited_files" instead of "from_files".
upvoted 9 times
...
...
Community vote distribution
A (35%)
C (25%)
B (20%)
Other
Most Voted