Exam DP-200 topic 2 question 48 discussion

Actual exam question from Microsoft's DP-200
Question #: 48
Topic #: 2
[All DP-200 Questions]

DRAG DROP -
You have an Azure Data Lake Storage Gen2 account that contains JSON files for customers. The files contain two attributes named FirstName and LastName.
You need to copy the data from the JSON files to an Azure Synapse Analytics table by using Azure Databricks. A new column must be created that concatenates the FirstName and LastName values.
You create the following components:
✑ A destination table in Azure Synapse
✑ An Azure Blob storage container
✑ A service principal
Which five actions should you perform in sequence next in a Databricks notebook? To answer, move the appropriate actions from the list of actions to the answer area and arrange them in the correct order.
Select and Place:

Suggested Answer:
Step 1: Read the file into a data frame.
You can load the JSON files as a data frame in Azure Databricks.
Step 2: Perform transformations on the data frame.
Step 3: Specify a temporary folder to stage the data.
Specify a temporary folder to use while moving data between Azure Databricks and Azure Synapse.
Step 4: Write the results to a table in Azure Synapse.
You upload the transformed data frame into Azure Synapse. You use the Azure Synapse connector for Azure Databricks to directly upload a data frame as a table in Azure Synapse.

Step 5: Drop the data frame.
Clean up resources. You can terminate the cluster. From the Azure Databricks workspace, select Clusters on the left. For the cluster to terminate, under Actions, point to the ellipsis (...) and select the Terminate icon.
Reference:
https://docs.microsoft.com/en-us/azure/azure-databricks/databricks-extract-load-sql-data-warehouse
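
For reference, a minimal PySpark sketch of these five steps as they might appear in a Databricks notebook. The storage account, container, JDBC URL, and table name below are placeholders, not values from the question; the spark session is provided by the notebook.

from pyspark.sql.functions import concat_ws

# Step 1: Read the JSON files into a data frame (path is a placeholder).
df = spark.read.json("abfss://customers@<storage-account>.dfs.core.windows.net/json/")

# Step 2: Perform transformations - add a column concatenating FirstName and LastName.
df = df.withColumn("FullName", concat_ws(" ", df.FirstName, df.LastName))

# Step 3: Specify a temporary folder (the Blob storage container) to stage the data.
temp_dir = "wasbs://staging@<storage-account>.blob.core.windows.net/tempDir"

# Step 4: Write the results to the destination table in Azure Synapse
# through the Azure Synapse connector for Azure Databricks.
(df.write
   .format("com.databricks.spark.sqldw")
   .option("url", "jdbc:sqlserver://<server>.database.windows.net:1433;database=<dw>")
   .option("forwardSparkAzureStorageCredentials", "true")
   .option("dbTable", "dbo.Customers")
   .option("tempDir", temp_dir)
   .mode("append")
   .save())

# Step 5: Drop the data frame once the load has finished.
del df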

Comments

cadio30
Highly Voted 2 years, 11 months ago
It requires mounting the ADLS Gen2 storage, thus the sequence "FHEAB" is right.
upvoted 15 times
niwe
2 years, 11 months ago
Can you explain what "FHEAB" is?
upvoted 1 times
maciejt
2 years, 11 months ago
They are the letter labels of the steps to choose, in order.
upvoted 2 times
niwe
2 years, 10 months ago
Thanks!
upvoted 1 times
hoangton
Highly Voted 2 years, 10 months ago
Correct answer should be: Step 1: Mount the Data Lake Storage onto DBFS. Step 2: Read the file into a data frame. Step 3: Perform transformations on the data frame. Step 4: Specify a temporary folder to stage the data. Step 5: Write the results to a table in Azure Synapse.
upvoted 10 times
Bhagya123456
Most Recent 2 years, 8 months ago
The answer is perfect. Mounting is not required; dropping the data frame should be there. The question never mentioned that you have to use the service principal. Had it been 6 steps, I would have added the mounting step, but considering only 5 steps, these 5 steps take priority over mounting (it's not essential).
upvoted 1 times
satyamkishoresingh
2 years, 7 months ago
Why drop the data frame? The cleanup in the reference is about the cluster, not the data frame.
upvoted 1 times
vrmei
2 years, 10 months ago
Mount the Data Lake Storage onto DBFS (with the service principal), read the file into a data frame, perform transformations on the data frame, specify the temp folder to stage the data, and write the results to the Synapse table. https://docs.microsoft.com/en-us/azure/databricks/scenarios/databricks-extract-load-sql-data-warehouse
upvoted 4 times
vrmei
2 years, 10 months ago
Small correction: I don't see the mount option in the ADLS account configuration in the given URL. I feel the given answer might be correct. The last one should be "Drop the data frame", which does the cleanup.
upvoted 1 times
unidigm
2 years, 11 months ago
Do we really need to stage the data? We could directly write the dataframe to Synapse. https://docs.microsoft.com/en-us/azure/databricks/data/data-sources/azure/synapse-analytics
upvoted 1 times
Rob77
2 years, 11 months ago
Yes, we do. tempDir (which stages the data) MUST be specified for the Synapse write method.
upvoted 2 times
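
A minimal sketch of the tempDir requirement mentioned above, with placeholder values for the JDBC URL, table name, and staging container; the Azure Synapse connector rejects the write when tempDir is not set.

# tempDir points the Synapse connector at a Blob storage container used as the
# staging area; all identifiers below are placeholders for illustration.
(df.write
   .format("com.databricks.spark.sqldw")
   .option("url", "jdbc:sqlserver://<server>.database.windows.net:1433;database=<dw>")
   .option("forwardSparkAzureStorageCredentials", "true")
   .option("dbTable", "dbo.Customers")
   .option("tempDir", "wasbs://staging@<storage-account>.blob.core.windows.net/tempDir")
   .mode("append")
   .save())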
Aragorn_2021
3 years ago
I would go for FHEAB. Mount the storage -> read the file into a data frame -> transform it further -> write the data to a temporary folder in storage -> and load it to the DWH.
upvoted 5 times
111222333
2 years, 11 months ago
Agree. Service Principal (which is given in the task) is used for mounting. Mount an Azure Data Lake Gen 2 to DBFS (Databricks File System) using a Service Principal: https://kyleake.medium.com/mount-an-adls-gen-2-to-databricks-file-system-using-a-service-principal-and-oauth-2-0-ep-5-73172dd0ddeb
upvoted 2 times
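
A hedged sketch of that mount, assuming placeholder values for the application (client) ID, tenant ID, secret scope and key, storage account, and container; dbutils and spark are provided by the Databricks notebook.

# Mount ADLS Gen2 to DBFS with a service principal and OAuth 2.0.
# All identifiers below are placeholders for illustration.
configs = {
    "fs.azure.account.auth.type": "OAuth",
    "fs.azure.account.oauth.provider.type":
        "org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider",
    "fs.azure.account.oauth2.client.id": "<application-id>",
    "fs.azure.account.oauth2.client.secret":
        dbutils.secrets.get(scope="<secret-scope>", key="<service-principal-secret>"),
    "fs.azure.account.oauth2.client.endpoint":
        "https://login.microsoftonline.com/<tenant-id>/oauth2/token",
}

dbutils.fs.mount(
    source="abfss://customers@<storage-account>.dfs.core.windows.net/",
    mount_point="/mnt/customers",
    extra_configs=configs,
)

# The mounted path can then be read like any other DBFS path:
df = spark.read.json("/mnt/customers/json/")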
tucho
3 years ago
I agree with HEAB, but I don't know which one is missing. I think there is no need to "drop the DF" or to "mount the DL storage"... :-( Does anybody know the right full answer?
upvoted 1 times
alf99
3 years ago
Wrong, it should be F, H, E, A, B. The Data Lake storage has to be mounted onto DBFS before reading the file.
upvoted 2 times
DongDuong
3 years ago
Based on the provided link, I think the keyword here is "mounted". The Data Lake storage is not mounted onto DBFS; instead, it is accessed by Databricks via API. So the given answer is correct.
upvoted 2 times
DongDuong
3 years ago
After revising, I think FHEAB makes more sense
upvoted 2 times