
Exam DP-201 topic 2 question 22 discussion

Actual exam question from Microsoft's DP-201
Question #: 22
Topic #: 2

Note: This question is part of a series of questions that present the same scenario. Each question in the series contains a unique solution that might meet the stated goals. Some question sets might have more than one correct solution, while others might not have a correct solution.
After you answer a question in this section, you will NOT be able to return to it. As a result, these questions will not appear in the review screen.
You have an Azure Data Lake Storage account that contains a staging zone.
You need to design a daily process to ingest incremental data from the staging zone, transform the data by executing an R script, and then insert the transformed data into a data warehouse in Azure Synapse Analytics.
Solution: You use an Azure Data Factory schedule trigger to execute a pipeline that executes an Azure Databricks notebook, and then inserts the data into the data warehouse.
Does this meet the goal?

  • A. Yes
  • B. No
Suggested Answer: B
Use a stored procedure, not an Azure Databricks notebook, to invoke the R script.
Reference:
https://docs.microsoft.com/en-US/azure/data-factory/transform-data
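To make the proposed solution concrete, here is a minimal PySpark sketch of how the Databricks notebook in that pipeline might ingest the previous day's increment from the staging zone. The storage account, container, and date-partitioned layout are assumptions rather than details given in the question.

    # Minimal sketch of the ingest step inside the proposed Databricks notebook.
    # The storage account, container, and path layout below are placeholders.
    from datetime import date, timedelta

    from pyspark.sql import SparkSession

    spark = SparkSession.builder.getOrCreate()

    # Assume the staging zone is partitioned by load date, e.g. .../staging/2024-01-15/
    load_date = (date.today() - timedelta(days=1)).isoformat()
    staging_path = f"abfss://staging@<storage-account>.dfs.core.windows.net/{load_date}/"

    incremental_df = spark.read.parquet(staging_path)

    # The R transformation would run next (for example in a SparkR or %r cell
    # of the same notebook); this sketch covers only the daily ingest step.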

Comments

Nieswurz
Highly Voted 4 years, 10 months ago
This should be the correct answer.
upvoted 27 times
andreeavi
4 years, 6 months ago
The first step is to ingest data.
upvoted 1 times
maynard13x8
4 years, 3 months ago
I think notebooks are only interactive. It should be a job cluster. Any opinions?
upvoted 2 times
Bhagya123456
3 years, 10 months ago
Now your comment is ambiguous. Do you mean the provided answer is correct, in which case 'No' is the answer, or that the provided solution is correct and will do the job, in which case 'Yes' would be the answer?
upvoted 4 times
bakamon
Most Recent 2 years, 1 month ago
Yes, this solution meets the goal. You can use an Azure Data Factory schedule trigger to execute a pipeline that copies the data to a staging table in the data warehouse, and then uses a stored procedure to execute the R script. This will allow you to ingest incremental data from the staging zone, transform the data by executing an R script, and then insert the transformed data into a data warehouse in Azure Synapse Analytics on a daily basis.
upvoted 1 times
Ssv2030
3 years, 9 months ago
The answer should be No because: 1. We can't assume that the Azure Databricks notebook will run the R transformation script; it is not stated that the notebook will run the R script. 2. For incremental loads in ADF, I think a tumbling window trigger should be used. Can someone please confirm?
upvoted 1 times
MMM777
4 years, 1 month ago
Answer should be YES: ADF can trigger a Databricks notebook (not required to be user-driven): https://docs.microsoft.com/en-us/azure/data-factory/transform-data-using-databricks-notebook
upvoted 4 times
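On MMM777's point that ADF can run a Databricks notebook without user interaction: in the pipeline this is a Databricks Notebook activity. Below is a rough sketch using the azure-mgmt-datafactory Python SDK; the resource, linked service, and notebook names are placeholders, and exact model signatures can vary between SDK versions.

    # Sketch: an ADF pipeline whose only activity runs a Databricks notebook.
    # All names are placeholders; this assumes a Databricks linked service
    # already exists in the factory.
    from azure.identity import DefaultAzureCredential
    from azure.mgmt.datafactory import DataFactoryManagementClient
    from azure.mgmt.datafactory.models import (
        DatabricksNotebookActivity, LinkedServiceReference, PipelineResource,
    )

    adf = DataFactoryManagementClient(DefaultAzureCredential(), "<subscription-id>")

    # Single activity that runs the transformation notebook on Databricks.
    notebook_activity = DatabricksNotebookActivity(
        name="TransformWithR",
        notebook_path="/Shared/transform_staging_with_r",
        linked_service_name=LinkedServiceReference(
            type="LinkedServiceReference",
            reference_name="AzureDatabricksLS"),
    )

    adf.pipelines.create_or_update(
        "<resource-group>", "<data-factory>", "StagingToSynapseDaily",
        PipelineResource(activities=[notebook_activity]),
    )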
cadio30
4 years, 1 month ago
The answer is Yes. The R script is executed in the Azure Databricks notebook, and once the transformation is completed, the data is loaded into Azure Synapse. Reference: https://docs.microsoft.com/en-us/azure/databricks/scenarios/databricks-extract-load-sql-data-warehouse
upvoted 1 times
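For reference, the load step described in cadio30's link looks roughly like this with the Azure Databricks Synapse connector. This is a minimal sketch: the server, database, table, tempDir, and credentials are placeholders, and transformed_df simply stands in for the output of the R transformation.

    # Sketch of writing the transformed data to a Synapse dedicated SQL pool
    # via the Azure Databricks Synapse connector. All connection details are
    # placeholders; credentials would normally come from a secret scope.
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.getOrCreate()

    # Placeholder: read back the output of the R transformation step.
    transformed_df = spark.read.parquet(
        "abfss://staging@<storage-account>.dfs.core.windows.net/transformed/")

    (transformed_df.write
        .format("com.databricks.spark.sqldw")
        .option("url", "jdbc:sqlserver://<server>.database.windows.net:1433;"
                       "database=<dw>;user=<user>;password=<password>;encrypt=true")
        .option("forwardSparkAzureStorageCredentials", "true")
        .option("dbTable", "dbo.TransformedStaging")
        .option("tempDir", "abfss://tempdir@<storage-account>.dfs.core.windows.net/tmp")
        .mode("append")
        .save())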
AJMorgan591
4 years, 9 months ago
Should use a tumbling window trigger in ADF for incremental loading. https://docs.microsoft.com/en-us/azure/data-factory/solution-template-copy-new-files-lastmodifieddate
upvoted 2 times
BungyTex
4 years, 6 months ago
You don't have to; you can just use a regular schedule, no problem.
upvoted 1 times
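On the schedule trigger versus tumbling window trigger discussion above: a daily schedule trigger is straightforward to define. A rough sketch with the azure-mgmt-datafactory Python SDK follows; the subscription, resource group, factory, trigger, and pipeline names are placeholders, and exact model signatures can vary between SDK versions.

    # Sketch: creating a daily schedule trigger for an existing ADF pipeline.
    # All resource names and the start time are placeholders.
    from datetime import datetime, timezone

    from azure.identity import DefaultAzureCredential
    from azure.mgmt.datafactory import DataFactoryManagementClient
    from azure.mgmt.datafactory.models import (
        PipelineReference, ScheduleTrigger, ScheduleTriggerRecurrence,
        TriggerPipelineReference, TriggerResource,
    )

    adf = DataFactoryManagementClient(DefaultAzureCredential(), "<subscription-id>")

    # Run the pipeline once per day at 02:00 UTC (the time is an assumption).
    recurrence = ScheduleTriggerRecurrence(
        frequency="Day",
        interval=1,
        start_time=datetime(2024, 1, 1, 2, 0, tzinfo=timezone.utc),
        time_zone="UTC",
    )

    trigger = ScheduleTrigger(
        recurrence=recurrence,
        pipelines=[TriggerPipelineReference(
            pipeline_reference=PipelineReference(
                type="PipelineReference",
                reference_name="StagingToSynapseDaily"))],
    )

    adf.triggers.create_or_update(
        "<resource-group>", "<data-factory>", "DailyTrigger",
        TriggerResource(properties=trigger),
    )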
avix
4 years, 10 months ago
I'm surprised, as I have run R in Azure Databricks.
upvoted 2 times
Nieswurz
4 years, 10 months ago
The solution template mentioned by Bob123456 does not fit because, per the description, the R script is to be run while the data is still located in the data lake. After the R-based transformation, the result is to be loaded into the DWH. This type of processing would need PolyBase to access the data lake, which is not mentioned here.
upvoted 3 times
apandey
4 years, 9 months ago
A Databricks notebook can use a mount to access the data lake. The notebook is the correct answer.
upvoted 2 times
Bob123456
4 years, 10 months ago
This is incorrect: https://docs.microsoft.com/en-us/sql/machine-learning/tutorials/quickstart-r-create-script?view=sql-server-ver15
upvoted 1 times