Exam DP-500 topic 1 question 69 discussion

Actual exam question from Microsoft's DP-500
Question #: 69
Topic #: 1

You are creating an external table by using an Apache Spark pool in Azure Synapse Analytics. The table will contain more than 20 million rows partitioned by date. The table will be shared with the SQL engines.
You need to minimize how long it takes for a serverless SQL pool to execute a query against the table.
In which file format should you recommend storing the table data?

  • A. CSV
  • B. Delta
  • C. JSON
  • D. Apache Parquet
Suggested Answer: D

Comments

SamuComqi
1 year, 10 months ago
Selected Answer: D
I took the exam a few days ago (14/8/2023) and I passed the exam with a score of 915. My answer was: Apache Parquet
upvoted 1 times
...
Albeeliu
2 years, 2 months ago
From ChatGPT: Using the Apache Parquet format for the external table minimizes query execution time for serverless SQL pools in Azure Synapse Analytics for three reasons:
  • Columnar storage: Parquet stores data column by column, so a query reads only the columns it references instead of every full row.
  • Compression: Parquet compresses each column's data, which reduces the size on disk. Less data to transfer means faster query execution.
  • Partitioning: Parquet data can be partitioned, which splits the table into smaller, more manageable files. A query that filters on the partition column scans only the relevant partitions.
Overall, storing the external table as Apache Parquet can significantly reduce the time a serverless SQL pool needs to run a query against it, which makes it a better fit for analyzing large datasets.
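The column-pruning and partition-pruning arguments in this comment can be illustrated with a small toy model. This is only a sketch: the names `PartitionFile` and `bytes_scanned` and all byte sizes are made up for illustration, and the code does not read real Parquet files or call any Synapse API. It just counts how many bytes a reader would have to touch under each layout.

```python
from dataclasses import dataclass


# Hypothetical toy model: one file per date partition, with a made-up
# byte size recorded for each column in that file.
@dataclass
class PartitionFile:
    date: str            # partition value, e.g. "2023-08-14"
    column_bytes: dict   # column name -> bytes stored for that column


def bytes_scanned(files, wanted_columns, wanted_date=None, columnar=True):
    """Estimate the bytes read by a query under the toy model."""
    total = 0
    for f in files:
        # Partition pruning: skip files whose date does not match the filter.
        if wanted_date is not None and f.date != wanted_date:
            continue
        if columnar:
            # Column pruning: only the requested columns are read.
            total += sum(f.column_bytes[c] for c in wanted_columns)
        else:
            # Row-oriented text (CSV, JSON): the whole file is read.
            total += sum(f.column_bytes.values())
    return total


files = [
    PartitionFile("2023-08-13", {"id": 100, "amount": 200, "notes": 700}),
    PartitionFile("2023-08-14", {"id": 100, "amount": 200, "notes": 700}),
]

row_scan = bytes_scanned(files, ["amount"], columnar=False)
col_scan = bytes_scanned(files, ["amount"], columnar=True)
pruned_scan = bytes_scanned(files, ["amount"], wanted_date="2023-08-14")
print(row_scan, col_scan, pruned_scan)  # prints: 2000 400 200
```

In this model, folder-level partition pruning helps any file format, while column pruning is what separates a columnar format from CSV or JSON. Compression is left out to keep the sketch short.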
upvoted 1 times
...
Hongzu13
2 years, 3 months ago
Selected Answer: D
Well, this link doesn't give the answer directly, but MS indirectly states that you should use Apache Parquet files with a serverless SQL pool. https://learn.microsoft.com/en-us/azure/synapse-analytics/get-started-analyze-sql-on-demand
upvoted 2 times
...
louisaok
2 years, 5 months ago
Selected Answer: D
D is correct
upvoted 3 times
...