Exam DP-201 topic 20 question 5 discussion

Actual exam question from Microsoft's DP-201
Question #: 5
Topic #: 20

HOTSPOT -
You need to design the storage for the Health Insights data platform.
Which types of tables should you include in the design? To answer, select the appropriate options in the answer area.
NOTE: Each correct selection is worth one point.
Suggested Answer:
Box 1: Hash-distributed tables
The new Health Insights application must be built on a massively parallel processing (MPP) architecture that will support the high performance of joins on large fact tables.
Hash-distributed tables improve query performance on large fact tables.
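As a rough illustration of why hash distribution helps fact-to-fact joins, the toy model below places rows by hashing a distribution column and checks that matching keys from two tables land in the same distribution. It is a sketch only: the table contents, the `patient_id` column, and the CRC32 hash are assumptions made for the example, not the service's actual hash function. The count of 60 distributions matches dedicated SQL pools.

```python
import zlib
from collections import defaultdict

NUM_DISTRIBUTIONS = 60  # dedicated SQL pools spread each table over 60 distributions


def distribution_for(key: str) -> int:
    """Map a distribution-column value to a distribution (toy stand-in hash)."""
    return zlib.crc32(key.encode("utf-8")) % NUM_DISTRIBUTIONS


def hash_distribute(rows, key_column):
    """Group rows by the distribution their key hashes to."""
    placed = defaultdict(list)
    for row in rows:
        placed[distribution_for(str(row[key_column]))].append(row)
    return placed


def colocated(left, right, key_column):
    """True if every key on the left sits in the same distribution on the right."""
    right_home = {row[key_column]: d for d, rows in right.items() for row in rows}
    return all(
        right_home.get(row[key_column], d) == d
        for d, rows in left.items()
        for row in rows
    )


# Hypothetical fact tables that share a patient_id column.
visits = [{"patient_id": p, "visit": v} for p in range(100) for v in range(3)]
claims = [{"patient_id": p, "claim": c} for p in range(100) for c in range(2)]

visits_by_dist = hash_distribute(visits, "patient_id")
claims_by_dist = hash_distribute(claims, "patient_id")

# Both tables use the same hash on the same column, so a join on patient_id
# can run inside each distribution without moving rows between them.
is_colocated = colocated(visits_by_dist, claims_by_dist, "patient_id")
print(is_colocated)
```

The benefit depends on both tables being distributed on the join column. A join on any other column would still require data movement.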
Box 2: Round-robin distributed tables
A round-robin distributed table distributes table rows evenly across all distributions. The assignment of rows to distributions is random.
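The same kind of toy model can show the trade-off with round-robin placement: row counts come out even, but rows that share a key are scattered, so a join on that key needs data movement first. This is a sketch under stated assumptions: the rows and the `patient_id` column are hypothetical, and simple rotation is used here as a simplification of how the service assigns rows.

```python
from collections import Counter
from itertools import cycle

NUM_DISTRIBUTIONS = 60  # dedicated SQL pools spread each table over 60 distributions


def round_robin_distribute(rows):
    """Assign rows to distributions in rotation, ignoring row content."""
    targets = cycle(range(NUM_DISTRIBUTIONS))
    return [(next(targets), row) for row in rows]


# Hypothetical rows: 1000 readings spread over 10 patients.
rows = [{"patient_id": i % 10, "reading": i} for i in range(1000)]
placed = round_robin_distribute(rows)

# Row counts per distribution differ by at most one, so there is no skew.
counts = Counter(d for d, _ in placed)
spread = max(counts.values()) - min(counts.values())

# Rows for a single patient end up in several distributions, so a join on
# patient_id cannot be satisfied locally.
homes_for_patient_0 = {d for d, row in placed if row["patient_id"] == 0}

print(spread, len(homes_for_patient_0))
```

Because placement ignores row content, no hash needs to be computed at load time, which is why round-robin is often suggested for staging tables.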
Scenario:
ADatum identifies the following requirements for the Health Insights application:
✑ The new Health Insights application must be built on a massively parallel processing (MPP) architecture that will support the high performance of joins on large fact tables.
Reference:
https://docs.microsoft.com/en-us/azure/sql-data-warehouse/sql-data-warehouse-tables-distribute
Design Azure data storage solutions

Comments

kz_data
Highly Voted 4 years, 2 months ago
Data Dimension should be replicated tables
upvoted 28 times
maynard13x8
4 years, 2 months ago
Round-robin improves data loading, which is needed: data from Interface and Review must be loaded in less than 15 minutes.
upvoted 2 times
joegei
Highly Voted 4 years, 1 month ago
Round-robin is used to increase staging performance. It doesn't make sense to use it on dimension tables, which tend to be relatively small and don't change often, so load performance is less of an issue. Query performance, however, will benefit from replicated tables. Therefore I would go for replicated tables as well.
upvoted 7 times
arpit_dataguy
Most Recent 3 years, 11 months ago
If the table size is < 2 GB, we should always go with replicated tables. A full copy of the table is cached on every compute node, which reduces data movement across nodes.
upvoted 1 times
savin
4 years ago
Round-robin is a bad option for dimension tables. The idea is to use replicated tables for dimensions to improve performance.
upvoted 1 times
felmasri
4 years, 2 months ago
I agree, unless the part that says "used with multiple fact tables' joins" means something else.
upvoted 2 times