Exam DP-200 topic 2 question 48 discussion

Actual exam question from Microsoft's DP-200
Question #: 48
Topic #: 2
[All DP-200 Questions]

DRAG DROP -
You have an Azure Data Lake Storage Gen2 account that contains JSON files for customers. The files contain two attributes named FirstName and LastName.
You need to copy the data from the JSON files to an Azure Synapse Analytics table by using Azure Databricks. A new column must be created that concatenates the FirstName and LastName values.
You create the following components:
✑ A destination table in Azure Synapse
✑ An Azure Blob storage container
✑ A service principal
Which five actions should you perform in sequence next in a Databricks notebook? To answer, move the appropriate actions from the list of actions to the answer area and arrange them in the correct order.
Select and Place:

Suggested Answer:
Step 1: Read the file into a data frame.
You can load the JSON files as a data frame in Azure Databricks.
Step 2: Perform transformations on the data frame.
Step 3: Specify a temporary folder to stage the data.
Specify a temporary folder to use while moving data between Azure Databricks and Azure Synapse.
Step 4: Write the results to a table in Azure Synapse.
You upload the transformed data frame into Azure Synapse. You use the Azure Synapse connector for Azure Databricks to directly upload a data frame as a table in Azure Synapse.

Step 5: Drop the data frame.
Clean up resources. You can terminate the cluster. From the Azure Databricks workspace, select Clusters on the left. For the cluster to terminate, under Actions, point to the ellipsis (...) and select the Terminate icon.
Reference:
https://docs.microsoft.com/en-us/azure/azure-databricks/databricks-extract-load-sql-data-warehouse
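
For reference, a minimal PySpark sketch of these five steps as they might appear in a Databricks notebook. The storage account, container, JDBC URL, and table name below are placeholders, not values from the question; the spark session is provided by the notebook.

from pyspark.sql.functions import concat_ws

# Step 1: Read the JSON files into a data frame (path is a placeholder).
df = spark.read.json("abfss://customers@<storage-account>.dfs.core.windows.net/json/")

# Step 2: Perform transformations - add a column concatenating FirstName and LastName.
df = df.withColumn("FullName", concat_ws(" ", df.FirstName, df.LastName))

# Step 3: Specify a temporary folder (the Blob storage container) to stage the data.
temp_dir = "wasbs://staging@<storage-account>.blob.core.windows.net/tempDir"

# Step 4: Write the results to the destination table in Azure Synapse
# through the Azure Synapse connector for Azure Databricks.
(df.write
   .format("com.databricks.spark.sqldw")
   .option("url", "jdbc:sqlserver://<server>.database.windows.net:1433;database=<dw>")
   .option("forwardSparkAzureStorageCredentials", "true")
   .option("dbTable", "dbo.Customers")
   .option("tempDir", temp_dir)
   .mode("append")
   .save())

# Step 5: Drop the data frame once the load has finished.
del df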

Comments

cadio30
Highly Voted 2 years, 11 months ago
It requires mounting the ADLS Gen2 storage, thus the sequence "FHEAB" is right.
upvoted 15 times
niwe
2 years, 11 months ago
Can you explain what "FHEAB" is?
upvoted 1 times
maciejt
2 years, 11 months ago
They are the letter labels of the steps to choose, in order.
upvoted 2 times
niwe
2 years, 10 months ago
Thanks!
upvoted 1 times
hoangton
Highly Voted 2 years, 10 months ago
Correct answer should be: Step 1: Mount the Data Lake Storage onto DBFS. Step 2: Read the file into a data frame. Step 3: Perform transformations on the data frame. Step 4: Specify a temporary folder to stage the data. Step 5: Write the results to a table in Azure Synapse.
upvoted 10 times
Bhagya123456
Most Recent 2 years, 8 months ago
The answer is perfect. Mounting is not required; dropping the data frame should be there. The question never mentioned that you have to use the service principal. Had it been 6 steps, I would have added the mounting step, but considering only 5 steps, these 5 steps take priority over mounting (it's not essential).
upvoted 1 times
satyamkishoresingh
2 years, 7 months ago
Why drop the data frame? The cleanup in the reference is about the cluster, not the data frame.
upvoted 1 times
vrmei
2 years, 10 months ago
Mount the Data Lake Storage onto DBFS (with the service principal), read the file into a data frame, perform transformations on the data frame, specify the temp folder to stage the data, and write the results to the Synapse table. https://docs.microsoft.com/en-us/azure/databricks/scenarios/databricks-extract-load-sql-data-warehouse
upvoted 4 times
vrmei
2 years, 10 months ago
Small correction: I don't see the mount option in the ADLS account configuration in the given URL. I feel the given answer might be correct. The last one should be "Drop the data frame", which does the cleanup.
upvoted 1 times
unidigm
2 years, 11 months ago
Do we really need to stage the data? We could directly write the dataframe to Synapse. https://docs.microsoft.com/en-us/azure/databricks/data/data-sources/azure/synapse-analytics
upvoted 1 times
Rob77
2 years, 11 months ago
Yes, we do. tempDir (which stages the data) MUST be specified for the Synapse write method.
upvoted 2 times
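
A minimal sketch of the tempDir requirement mentioned above, with placeholder values for the JDBC URL, table name, and staging container; the Azure Synapse connector rejects the write when tempDir is not set.

# tempDir points the Synapse connector at a Blob storage container used as the
# staging area; all identifiers below are placeholders for illustration.
(df.write
   .format("com.databricks.spark.sqldw")
   .option("url", "jdbc:sqlserver://<server>.database.windows.net:1433;database=<dw>")
   .option("forwardSparkAzureStorageCredentials", "true")
   .option("dbTable", "dbo.Customers")
   .option("tempDir", "wasbs://staging@<storage-account>.blob.core.windows.net/tempDir")
   .mode("append")
   .save())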
Aragorn_2021
3 years ago
I would go for FHEAB. Mount the storage -> read the file into a data frame -> transform it further -> write the data to a temporary folder in storage -> and load it to the DWH.
upvoted 5 times
111222333
2 years, 11 months ago
Agree. Service Principal (which is given in the task) is used for mounting. Mount an Azure Data Lake Gen 2 to DBFS (Databricks File System) using a Service Principal: https://kyleake.medium.com/mount-an-adls-gen-2-to-databricks-file-system-using-a-service-principal-and-oauth-2-0-ep-5-73172dd0ddeb
upvoted 2 times
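
A hedged sketch of that mount, assuming placeholder values for the application (client) ID, tenant ID, secret scope and key, storage account, and container; dbutils and spark are provided by the Databricks notebook.

# Mount ADLS Gen2 to DBFS with a service principal and OAuth 2.0.
# All identifiers below are placeholders for illustration.
configs = {
    "fs.azure.account.auth.type": "OAuth",
    "fs.azure.account.oauth.provider.type":
        "org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider",
    "fs.azure.account.oauth2.client.id": "<application-id>",
    "fs.azure.account.oauth2.client.secret":
        dbutils.secrets.get(scope="<secret-scope>", key="<service-principal-secret>"),
    "fs.azure.account.oauth2.client.endpoint":
        "https://login.microsoftonline.com/<tenant-id>/oauth2/token",
}

dbutils.fs.mount(
    source="abfss://customers@<storage-account>.dfs.core.windows.net/",
    mount_point="/mnt/customers",
    extra_configs=configs,
)

# The mounted path can then be read like any other DBFS path:
df = spark.read.json("/mnt/customers/json/")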
tucho
3 years ago
I agree with HEAB, but I don't know which one is missing. I think there is no need to "drop the DF" or to "mount the DL storage"... :-( Does anybody know the right full answer?
upvoted 1 times
alf99
3 years ago
Wrong, it should be F, H, E, A, B. The Data Lake storage has to be mounted onto DBFS before reading the file.
upvoted 2 times
DongDuong
3 years ago
Based on the provided link, I think the keyword here is "mounted". The Data Lake storage is not mounted onto DBFS; instead, it is accessed by Databricks via API. So the given answer is correct.
upvoted 2 times
DongDuong
3 years ago
After revising, I think FHEAB makes more sense
upvoted 2 times