
Exam DP-300 topic 1 question 12 discussion

Actual exam question from Microsoft's DP-300
Question #: 12
Topic #: 1
[All DP-300 Questions]

You have an Azure Synapse Analytics Apache Spark pool named Pool1.
You plan to load JSON files from an Azure Data Lake Storage Gen2 container into the tables in Pool1. The structure and data types vary by file.
You need to load the files into the tables. The solution must maintain the source data types.
What should you do?

  • A. Load the data by using PySpark.
  • B. Load the data by using the OPENROWSET Transact-SQL command in an Azure Synapse Analytics serverless SQL pool.
  • C. Use a Get Metadata activity in Azure Data Factory.
  • D. Use a Conditional Split transformation in an Azure Synapse data flow.
Suggested Answer: A
Synapse notebooks support four Apache Spark languages:
PySpark (Python)
Spark (Scala)
Spark SQL
.NET Spark (C#)

Note: Bring data to a notebook.
You can load data from Azure Blob Storage, Azure Data Lake Store Gen2, and SQL pool as shown in the code sample below.
Read a CSV from Azure Data Lake Store Gen2 as a Spark DataFrame:

    from pyspark.sql import SparkSession
    from pyspark.sql.types import *

    account_name = "Your account name"
    container_name = "Your container name"
    relative_path = "Your path"
    adls_path = 'abfss://%s@%s.dfs.core.windows.net/%s' % (container_name, account_name, relative_path)

    df1 = spark.read.option('header', 'true') \
        .option('delimiter', ',') \
        .csv(adls_path + '/Testfile.csv')
Reference:
https://docs.microsoft.com/en-us/azure/synapse-analytics/spark/apache-spark-development-using-notebooks
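Since the question is about loading JSON files while maintaining the source data types, the same notebook approach can be adapted: Spark's JSON reader infers each field's type from the data itself. The sketch below assumes a Synapse notebook with an active `spark` session; the helper names and placeholder paths are illustrative, not part of the original question.

```python
def adls_path(container_name: str, account_name: str, relative_path: str) -> str:
    # Build an abfss:// URI for an ADLS Gen2 container (hypothetical helper).
    return "abfss://%s@%s.dfs.core.windows.net/%s" % (
        container_name, account_name, relative_path
    )


def load_json_to_table(spark, source_path: str, table_name: str) -> None:
    # Spark infers the schema of the JSON files, so numeric, string, and
    # boolean fields keep their source data types. Assumption: the files are
    # line-delimited JSON; add .option("multiLine", "true") otherwise.
    df = spark.read.json(source_path)
    df.write.mode("overwrite").saveAsTable(table_name)
```

Because the structure and data types vary by file, each file (or folder of similarly shaped files) would get its own `load_json_to_table` call targeting a separate table.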

Comments

Chosen Answer:
HichemZe
Highly Voted 1 year, 9 months ago
Question is for DP-203, not for DBA (DP-300)
upvoted 32 times
GeoFlux121
1 year ago
Thanks HichemZe! I was about to have a panic attack regarding several of these questions before seeing your very helpful response!
upvoted 6 times
Mphorish
1 year, 9 months ago
We get it, you've made your point... no need to post the same comment on every question.
upvoted 20 times
aprilson24
1 year, 7 months ago
Yeah right, why keep commenting on every question? lol
upvoted 2 times
Zonq
1 year, 6 months ago
Maybe to indicate that this exact question is for another exam?
upvoted 10 times
ramelas
1 year, 6 months ago
It is not from another exam. These questions are on DP-300.
upvoted 3 times
Backy
Highly Voted 11 months, 2 weeks ago
Answer is A. If you want to load into a Spark pool, use Spark itself. OPENROWSET is for the source; here the issue is the target, meaning Spark.
upvoted 8 times
Ciupaz
Most Recent 7 months, 1 week ago
Azure Synapse Analytics is out of scope for the DP-300 exam.
upvoted 3 times
captainpike
1 year, 8 months ago
How can using a serverless SQL pool be the right answer if the question states you "have an Azure Synapse Analytics Apache Spark pool named Pool1"? Yes, data can be copied from serverless to the Apache Spark pool, but that's a heck of a lot of speculation. I am going to stick with PySpark (https://docs.microsoft.com/en-us/azure/synapse-analytics/spark/apache-spark-development-using-notebooks?tabs=classical#set-a-primary-language)
upvoted 3 times
o2091
1 year, 6 months ago
Is A correct? What do you think?
upvoted 1 times
ramelas
1 year, 5 months ago
A is correct; when you create native Parquet tables in Spark, they are automatically available in serverless SQL pools as tables.
upvoted 3 times
Community vote distribution: A (35%), C (25%), B (20%), Other
