Exam DP-203 topic 5 question 1 discussion

Actual exam question from Microsoft's DP-203
Question #: 1
Topic #: 5

HOTSPOT
You need to design a data storage structure for the product sales transactions. The solution must meet the sales transaction dataset requirements.
What should you include in the solution? To answer, select the appropriate options in the answer area.
NOTE: Each correct selection is worth one point.
Hot Area:

Suggested Answer:
Box 1: Hash
Scenario:
Ensure that queries joining and filtering sales transaction records based on product ID complete as quickly as possible.
A hash distributed table can deliver the highest query performance for joins and aggregations on large tables.
Box 2: Set the distribution column to the sales date.
Scenario: Partition data that contains sales transaction records. Partitions must be designed to provide efficient loads by month. Boundary values must belong to the partition on the right.
Reference:
https://rajanieshkaushikk.com/2020/09/09/how-to-choose-right-data-distribution-strategy-for-azure-synapse/
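
The partition requirement quoted in the scenario ("Boundary values must belong to the partition on the right") corresponds to RANGE RIGHT partitioning. A minimal Python sketch of that rule follows, using hypothetical monthly boundary values that are not taken from the case study. It only illustrates which partition a date falls into; it does not reproduce how the engine stores partitions.

```python
from bisect import bisect_left, bisect_right
from datetime import date

# Hypothetical monthly boundary values (assumption, not from the case study).
BOUNDARIES = [date(2021, 1, 1), date(2021, 2, 1), date(2021, 3, 1)]


def partition_range_right(value: date) -> int:
    """0-based partition index when each boundary belongs to the partition on its right."""
    return bisect_right(BOUNDARIES, value)


def partition_range_left(value: date) -> int:
    """0-based partition index when each boundary belongs to the partition on its left."""
    return bisect_left(BOUNDARIES, value)


for d in (date(2021, 1, 31), date(2021, 2, 1), date(2021, 2, 28)):
    print(d, "RIGHT ->", partition_range_right(d), "LEFT ->", partition_range_left(d))
```

With RANGE RIGHT, 1 February and 28 February share a partition, so each partition holds exactly one calendar month. With RANGE LEFT, 1 February would fall into the same partition as the January dates.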

Comments

Jerrie86
Highly Voted 2 years, 3 months ago
This case study was in my exam and I scored 970. I chose productid.
upvoted 60 times
RoyP654
1 year, 10 months ago
Good Job, Congrats!
upvoted 8 times
jongert
1 year, 3 months ago
Congrats! The answer is productid, since the MS documentation says NOT to distribute by a date column. When you do, all data for a given date lands in one distribution, which hinders parallelism during processing.
upvoted 4 times
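
The skew described in this comment can be illustrated with a small Python simulation. It is a sketch under stated assumptions: the CRC32-based `distribution_for` function is a stand-in and not the hash a dedicated SQL pool actually uses, and the row counts and product IDs are hypothetical. The only fact relied on is that a dedicated SQL pool spreads a table over 60 distributions.

```python
import zlib
from collections import Counter
from datetime import date

DISTRIBUTIONS = 60  # a dedicated SQL pool spreads each table over 60 distributions


def distribution_for(key) -> int:
    """Stand-in hash (assumption): CRC32 of the key's text, modulo the distribution count."""
    return zlib.crc32(str(key).encode("utf-8")) % DISTRIBUTIONS


# Hypothetical one-day slice: 10,000 sales spread over 500 product IDs.
rows = [(date(2021, 2, 1), i % 500) for i in range(10_000)]

# Count rows per distribution for each choice of distribution column.
by_date = Counter(distribution_for(sale_date) for sale_date, _ in rows)
by_product = Counter(distribution_for(product_id) for _, product_id in rows)

print("distributions holding the day's rows, keyed on date:   ", len(by_date))
print("distributions holding the day's rows, keyed on product:", len(by_product))
```

Keyed on the date, every row for that day hashes to a single distribution, so a query filtered to that day cannot be spread across the others. Keyed on product ID, the same rows are spread over many distributions.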
Julia01
Highly Voted 2 years, 7 months ago
I'd choose product ID as well, since it will be used in joins: "Ensure that queries joining and filtering sales transaction records based on product ID complete as quickly as possible."
upvoted 21 times
mokrani
2 years, 6 months ago
Why not sales date for the distribution column? "Partition data that contains sales transaction records. Partitions must be designed to provide efficient loads by month. Boundary values must belong to the partition on the right."
upvoted 1 times
kl8585
2 years, 5 months ago
Because it's asking about distribution, not partitioning. The requirements say "Ensure that queries joining and filtering sales transaction records based on product ID complete as quickly as possible". The best way to do so is hash distributing on product ID: this way all rows with the same product ID end up in the same distribution and there is no data shuffling, hence fast queries.
upvoted 16 times
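
The "no data shuffling" argument can be sketched in Python as follows. This is an illustration under assumptions, not the engine's real behavior: `distribution_for` is a stand-in CRC32 hash, the sales and product rows are hypothetical, and "rows to move" simply counts sales rows that sit in a different distribution from the product row they join to.

```python
import zlib

DISTRIBUTIONS = 60


def distribution_for(key) -> int:
    """Stand-in hash (assumption): CRC32 of the key's text, modulo the distribution count."""
    return zlib.crc32(str(key).encode("utf-8")) % DISTRIBUTIONS


# Hypothetical data: sales rows are (product_id, sale_date); products are bare IDs.
sales = [(i % 200, f"2021-02-{(i % 28) + 1:02d}") for i in range(5_000)]
products = list(range(200))

# The product table is hash distributed on product ID.
product_home = {pid: distribution_for(pid) for pid in products}


def rows_to_move(sales_key_index: int) -> int:
    """Count sales rows stored away from the product row they join to.

    sales_key_index selects the sales distribution column: 0 = product_id, 1 = sale_date.
    """
    return sum(
        1
        for row in sales
        if distribution_for(row[sales_key_index]) != product_home[row[0]]
    )


moved_when_hashed_on_product = rows_to_move(0)
moved_when_hashed_on_date = rows_to_move(1)
print("rows to move, sales hashed on product ID:", moved_when_hashed_on_product)
print("rows to move, sales hashed on sale date: ", moved_when_hashed_on_date)
```

When both tables are hashed on product ID, every sales row already sits next to its matching product row, so the join can run locally in each distribution. Hashing sales on the date leaves most rows in the wrong place for that join.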
imatheushenrique
Most Recent 1 month, 2 weeks ago
Hash and productId. Consider using round-robin distribution for your table in the following scenarios: when getting started, as a simple starting point, since it is the default; if there is no obvious joining key; if there is no good candidate column for hash distributing the table; if the table does not share a common join key with other tables; if the join is less significant than other joins in the query; when the table is a temporary staging table.
upvoted 1 times
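
For contrast with hash distribution, here is a minimal Python sketch of an idealized round-robin placement. It assumes rows are dealt out strictly in turn (row i goes to distribution i modulo 60), which is a simplification of what the service does, and the row counts and product IDs are hypothetical.

```python
from collections import Counter, defaultdict

DISTRIBUTIONS = 60

# Hypothetical staging load: 6,000 rows across 100 product IDs.
product_ids = [i % 100 for i in range(6_000)]

# Idealized round-robin (assumption): row i goes to distribution i modulo 60,
# regardless of the values it contains.
placement = [(i % DISTRIBUTIONS, pid) for i, pid in enumerate(product_ids)]

# How many rows each distribution received.
rows_per_distribution = Counter(dist for dist, _ in placement)

# Which distributions hold rows for each product ID.
distributions_per_product = defaultdict(set)
for dist, pid in placement:
    distributions_per_product[pid].add(dist)

print("rows per distribution:", set(rows_per_distribution.values()))
print("distributions holding product 0:", sorted(distributions_per_product[0]))
```

The row counts come out even, which is why round-robin suits staging tables, but rows for one product ID are scattered over several distributions, so a join on product ID would need data movement first.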
7082935
9 months ago
I'll repeat advice I read from another question: NEVER set distribution on a DATE column. However, partition on DATE is good.
upvoted 2 times
kkk5566
1 year, 8 months ago
Hash and distribution on Product ID
upvoted 3 times
XiltroX
2 years, 5 months ago
In MS's own documentation, it is not recommended to use a date column for distribution. Therefore, the second option should be ProductID
upvoted 8 times
pavankr
1 year, 11 months ago
So then why is this guy misleading us? I find a lot of the answers misleading.
upvoted 3 times
OldSchool
2 years, 5 months ago
Hash and distribution on Product ID; never distribute on a date column: https://learn.microsoft.com/en-us/azure/synapse-analytics/sql-data-warehouse/sql-data-warehouse-tables-distribute#choose-a-distribution-column-with-data-that-distributes-evenly Partition on the date instead, as explained here: https://learn.microsoft.com/en-us/azure/synapse-analytics/sql-data-warehouse/sql-data-warehouse-tables-partition
upvoted 12 times
kornat
2 years ago
True!!!
upvoted 2 times
berend1
2 years, 6 months ago
Partition column: date, distribution column: ProductID
upvoted 5 times
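
Putting the two choices together (hash distribution on ProductID, monthly RANGE RIGHT partitions on the date), the table definition might look roughly like the DDL generated by the Python sketch below. The table name `dbo.SalesTransactions`, the column names and types, and the three boundary dates are all hypothetical; only the `DISTRIBUTION = HASH(...)` and `PARTITION (... RANGE RIGHT FOR VALUES (...))` clauses reflect the options under discussion.

```python
from datetime import date


def month_starts(first: date, months: int) -> list[date]:
    """First day of each of `months` consecutive months, starting with the month of `first`."""
    starts = []
    year, month = first.year, first.month
    for _ in range(months):
        starts.append(date(year, month, 1))
        month += 1
        if month > 12:
            year, month = year + 1, 1
    return starts


def build_ddl(boundaries: list[date]) -> str:
    """Render a CREATE TABLE statement; table and column names are hypothetical."""
    values = ", ".join(f"'{b.isoformat()}'" for b in boundaries)
    return (
        "CREATE TABLE dbo.SalesTransactions\n"
        "(\n"
        "    TransactionID BIGINT NOT NULL,\n"
        "    ProductID INT NOT NULL,\n"
        "    SaleDate DATE NOT NULL\n"
        ")\n"
        "WITH\n"
        "(\n"
        "    CLUSTERED COLUMNSTORE INDEX,\n"
        "    DISTRIBUTION = HASH(ProductID),\n"
        f"    PARTITION (SaleDate RANGE RIGHT FOR VALUES ({values}))\n"
        ");"
    )


ddl = build_ddl(month_starts(date(2021, 1, 1), 3))
print(ddl)
```

Generating the boundary list from a start month keeps every boundary on the first day of a month, so under RANGE RIGHT each partition lines up with one monthly load.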
greenlever
2 years, 7 months ago
I think so too: set the distribution to Product ID.
upvoted 3 times
pangas2567
2 years, 7 months ago
Why not set the distribution to Product ID? With the date as the distribution column we lose the advantage of using all 60 distributions, right?
upvoted 9 times