Exam DP-203 topic 5 question 3 discussion

Actual exam question from Microsoft's DP-203
Question #: 3
Topic #: 5

HOTSPOT -
You need to design the partitions for the product sales transactions. The solution must meet the sales transaction dataset requirements.
What should you include in the solution? To answer, select the appropriate options in the answer area.
NOTE: Each correct selection is worth one point.

Suggested Answer:
Box 1: Sales date -
Scenario: Contoso requirements for data integration include:
✑ Partition data that contains sales transaction records. Partitions must be designed to provide efficient loads by month. Boundary values must belong to the partition on the right.
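The "boundary values must belong to the partition on the right" requirement corresponds to RANGE RIGHT partitioning. A minimal Python sketch of that rule follows, assuming illustrative monthly boundaries for 2021; the dates and the helper name `partition_number` are hypothetical and do not come from the case study.

```python
from bisect import bisect_left, bisect_right
from datetime import date

# Hypothetical monthly boundary values: the first day of each month in 2021.
BOUNDARIES = [date(2021, month, 1) for month in range(1, 13)]


def partition_number(sales_date, boundaries=BOUNDARIES, range_right=True):
    """Return the 1-based partition a sales date falls into.

    RANGE RIGHT: a boundary value is the lowest value of the partition to its right.
    RANGE LEFT: a boundary value is the highest value of the partition to its left.
    """
    if range_right:
        return bisect_right(boundaries, sales_date) + 1
    return bisect_left(boundaries, sales_date) + 1


if __name__ == "__main__":
    for d in (date(2021, 1, 31), date(2021, 2, 1), date(2021, 2, 28)):
        right = partition_number(d)
        left = partition_number(d, range_right=False)
        print(d, "RANGE RIGHT ->", right, "| RANGE LEFT ->", left)
```

With RANGE RIGHT, 1 February lands in the same partition as the rest of February, so each partition lines up with one calendar month, which is what makes month-by-month loads straightforward.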
Box 2: An Azure Synapse Analytics Dedicated SQL pool
Scenario: Contoso requirements for data integration include:
✑ Ensure that data storage costs and performance are predictable.
The size of a dedicated SQL pool (formerly SQL DW) is determined by Data Warehousing Units (DWU).
Dedicated SQL pool (formerly SQL DW) stores data in relational tables with columnar storage. This format significantly reduces data storage costs and improves query performance.
Reference:
https://docs.microsoft.com/en-us/azure/synapse-analytics/sql-data-warehouse/sql-data-warehouse-overview-what-is

Comments

Jerrie86
Highly Voted 1 year, 4 months ago
Partitioning is different from distribution: distribution = ProductID, partition = sales date. Distribution: when you store a table in a dedicated SQL pool (Azure SQL DW), its rows are spread across 60 distributions, using hash or round-robin distribution depending on your needs. You can also choose to replicate a table (preferably a very small one) to every compute node. Partitioning: this is completely separate from distribution. When we partition a table, we decide which rows belong to which partition based on some scheme (such as date in this case), and the chunk of records for each date range gets its own space behind the scenes. We can partition on anything, as long as we know how the data is laid out in our system. When both are used together, every partition is itself split across the 60 distributions, so incoming data is still divided 60 ways and queries run with a high degree of parallelism. https://www.linkedin.com/pulse/partitioning-distribution-azure-synapse-analytics-swapnil-mule
upvoted 20 times
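To make the distinction concrete, here is a rough Python sketch of how a row's placement depends on two independent keys. It is not how Synapse is implemented: CRC32 is only a stand-in for the service's internal hash function, the sample rows are made up, and the names `distribution_of` and `partition_of` are hypothetical.

```python
import zlib
from collections import defaultdict
from datetime import date

DISTRIBUTION_COUNT = 60  # a dedicated SQL pool spreads each table over 60 distributions


def distribution_of(product_id):
    """Choose a distribution from the hash column (CRC32 is only a stand-in hash)."""
    return zlib.crc32(str(product_id).encode("utf-8")) % DISTRIBUTION_COUNT


def partition_of(sales_date):
    """Choose a monthly partition from the partition column, independent of the hash."""
    return (sales_date.year, sales_date.month)


# Made-up sample rows: (ProductID, sales date)
rows = [
    (101, date(2021, 1, 15)),
    (101, date(2021, 2, 1)),
    (202, date(2021, 2, 1)),
    (202, date(2021, 2, 20)),
]

# Group rows by the (distribution, partition) pair they would be stored under.
placement = defaultdict(list)
for product_id, sales_date in rows:
    key = (distribution_of(product_id), partition_of(sales_date))
    placement[key].append((product_id, sales_date))

if __name__ == "__main__":
    for (dist, part), stored in sorted(placement.items()):
        print(f"distribution {dist:2d}, partition {part}: {stored}")
```

In this sketch all rows of one product share a distribution regardless of date, while each distribution keeps its own slice of every monthly partition.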
DataEngDP
Most Recent 8 months, 3 weeks ago
The case study says "Load the sales transaction dataset to Azure Synapse Analytics". That tells you where to store the transactional data: the only possibility is an Azure Synapse Analytics dedicated SQL pool.
upvoted 2 times
kkk5566
9 months, 1 week ago
Partition by date & dedicated pool.
upvoted 3 times
gerrie1979
1 year, 7 months ago
As far as I can see, we need to distribute the fact table across the 60 distributions of a dedicated SQL pool, which means not using the date key (because of MPP) and using the ProductID key instead. Within each distribution, we then partition the data by the date column so that data can be deleted quickly and queried by all 60 distributions at once.
upvoted 2 times
Jerrie86
1 year, 4 months ago
The first question is about partitioning, not distribution, so date is correct.
upvoted 3 times
sensaint
1 year, 7 months ago
I would partition by ProductID, since joins and filtering must be optimized for that column.
upvoted 1 times
mokrani
1 year, 7 months ago
The requirements say: "Partition data that contains sales transaction records. Partitions must be designed to provide efficient loads by month. Boundary values must belong to the partition on the right." We will also delete data using the sales date. I think distribution = ProductID, partition = Sales_date.
upvoted 16 times
sensaint
1 year, 5 months ago
Correct. Forget my statement above. The partition should be Sales Date!!
upvoted 7 times
Igor85
1 year, 5 months ago
Don't confuse partitions with distribution for a hash-distributed table.
upvoted 5 times
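Put together, the split this thread settles on (hash-distribute on ProductID, partition on the sales date) might look like the DDL produced by the Python sketch below. The table name `dbo.FactSales`, the column names and types, and the 2021 boundaries are all assumptions for illustration; only the `DISTRIBUTION = HASH(...)` and `PARTITION (... RANGE RIGHT FOR VALUES (...))` clauses follow the dedicated SQL pool `CREATE TABLE` syntax.

```python
from datetime import date


def monthly_boundaries(year):
    """First day of every month in the given year, as ISO date strings."""
    return [date(year, month, 1).isoformat() for month in range(1, 13)]


def build_create_table(table, hash_column, partition_column, boundaries):
    """Render a dedicated SQL pool CREATE TABLE statement (hypothetical schema)."""
    values = ", ".join(f"'{b}'" for b in boundaries)
    return (
        f"CREATE TABLE {table}\n"
        "(\n"
        f"    {hash_column} INT NOT NULL,\n"
        f"    {partition_column} DATE NOT NULL,\n"
        "    Amount DECIMAL(18, 2) NOT NULL\n"
        ")\n"
        "WITH\n"
        "(\n"
        f"    DISTRIBUTION = HASH({hash_column}),\n"
        "    CLUSTERED COLUMNSTORE INDEX,\n"
        f"    PARTITION ({partition_column} RANGE RIGHT FOR VALUES ({values}))\n"
        ");"
    )


# Hypothetical table and column names, with one boundary per month of 2021.
ddl = build_create_table(
    "dbo.FactSales", "ProductID", "SalesDate", monthly_boundaries(2021)
)

if __name__ == "__main__":
    print(ddl)
```

Generating the boundary list rather than typing it by hand keeps the monthly values consistent; in practice the range would cover whatever retention period the scenario requires.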