
Exam DP-300 topic 1 question 12 discussion

Actual exam question from Microsoft's DP-300
Question #: 12
Topic #: 1
[All DP-300 Questions]

You have an Azure Synapse Analytics Apache Spark pool named Pool1.
You plan to load JSON files from an Azure Data Lake Storage Gen2 container into the tables in Pool1. The structure and data types vary by file.
You need to load the files into the tables. The solution must maintain the source data types.
What should you do?

  • A. Load the data by using PySpark.
  • B. Load the data by using the OPENROWSET Transact-SQL command in an Azure Synapse Analytics serverless SQL pool.
  • C. Use a Get Metadata activity in Azure Data Factory.
  • D. Use a Conditional Split transformation in an Azure Synapse data flow.
Suggested Answer: A
Synapse notebooks support four Apache Spark languages:
PySpark (Python)
Spark (Scala)
Spark SQL
.NET Spark (C#)

Note: Bring data to a notebook.
You can load data from Azure Blob Storage, Azure Data Lake Store Gen2, and SQL pool as shown in the code sample below.
Read a CSV from Azure Data Lake Store Gen2 as a Spark DataFrame:

    from pyspark.sql import SparkSession
    from pyspark.sql.types import *

    account_name = "Your account name"
    container_name = "Your container name"
    relative_path = "Your path"
    adls_path = 'abfss://%s@%s.dfs.core.windows.net/%s' % (container_name, account_name, relative_path)

    df1 = spark.read.option('header', 'true') \
        .option('delimiter', ',') \
        .csv(adls_path + '/Testfile.csv')
Reference:
https://docs.microsoft.com/en-us/azure/synapse-analytics/spark/apache-spark-development-using-notebooks
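Since the question is about loading JSON files while maintaining the source data types, the same notebook approach can be adapted: Spark's JSON reader infers each field's type from the data itself. The sketch below assumes a Synapse notebook with an active `spark` session; the helper names and placeholder paths are illustrative, not part of the original question.

```python
def adls_path(container_name: str, account_name: str, relative_path: str) -> str:
    # Build an abfss:// URI for an ADLS Gen2 container (hypothetical helper).
    return "abfss://%s@%s.dfs.core.windows.net/%s" % (
        container_name, account_name, relative_path
    )


def load_json_to_table(spark, source_path: str, table_name: str) -> None:
    # Spark infers the schema of the JSON files, so numeric, string, and
    # boolean fields keep their source data types. Assumption: the files are
    # line-delimited JSON; add .option("multiLine", "true") otherwise.
    df = spark.read.json(source_path)
    df.write.mode("overwrite").saveAsTable(table_name)
```

Because the structure and data types vary by file, each file (or folder of similarly shaped files) would get its own `load_json_to_table` call targeting a separate table.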

Comments

Chosen Answer:
HichemZe
Highly Voted 1 year, 9 months ago
Question is for DP-203, not for DBA (DP-300)
upvoted 32 times
GeoFlux121
1 year ago
Thanks HichemZe! I was about to have a panic attack regarding several of these questions before seeing your very helpful response!
upvoted 6 times
Mphorish
1 year, 9 months ago
We get it, you've made your point... no need to post the same comment on every question.
upvoted 20 times
aprilson24
1 year, 7 months ago
Yeah right, why keep commenting on every question? lol
upvoted 2 times
Zonq
1 year, 6 months ago
Maybe to indicate that this exact question is for another exam?
upvoted 10 times
ramelas
1 year, 6 months ago
It is not from another exam. These questions are on DP-300.
upvoted 3 times
Backy
Highly Voted 11 months, 2 weeks ago
Answer is A. If you want to load into a Spark pool, use Spark itself. OPENROWSET is for the source; here the issue is the target, meaning Spark.
upvoted 8 times
Ciupaz
Most Recent 7 months, 1 week ago
Azure Synapse Analytics is out of scope for the DP-300 exam.
upvoted 3 times
captainpike
1 year, 8 months ago
How can using a serverless SQL pool be the right answer if the question states you "have an Azure Synapse Analytics Apache Spark pool named Pool1"? Yes, data can be copied from serverless to the Apache Spark pool, but that's a heck of a lot of speculation. I am going to stick with PySpark (https://docs.microsoft.com/en-us/azure/synapse-analytics/spark/apache-spark-development-using-notebooks?tabs=classical#set-a-primary-language)
upvoted 3 times
o2091
1 year, 6 months ago
Is A correct? What do you think?
upvoted 1 times
ramelas
1 year, 5 months ago
A is correct; when you create native Parquet tables in Spark, they are automatically available in serverless SQL pools as tables.
upvoted 3 times
Community vote distribution: A (35%), C (25%), B (20%), Other
