exam questions

Exam DP-200 All Questions

View all questions & answers for the DP-200 exam

Exam DP-200 topic 5 question 5 discussion

Actual exam question from Microsoft's DP-200
Question #: 5
Topic #: 5
[All DP-200 Questions]

You implement an enterprise data warehouse in Azure Synapse Analytics.
You have a large fact table that is 10 terabytes (TB) in size.
Incoming queries use the primary key Sale Key column to retrieve data as displayed in the following table:

You need to distribute the large fact table across multiple nodes to optimize performance of the table.
Which technology should you use?

  • A. hash distributed table with clustered ColumnStore index
  • B. hash distributed table with clustered index
  • C. heap table with distribution replicate
  • D. round robin distributed table with clustered index
  • E. round robin distributed table with clustered ColumnStore index
Show Suggested Answer Hide Answer
Suggested Answer: A 🗳️
Hash-distributed tables improve query performance on large fact tables.
Columnstore indexes can achieve up to 100x better performance on analytics and data warehousing workloads and up to 10x better data compression than traditional rowstore indexes.
Incorrect Answers:
D, E: Round-robin tables are useful for improving loading speed.
Reference:
https://docs.microsoft.com/en-us/azure/sql-data-warehouse/sql-data-warehouse-tables-distribute https://docs.microsoft.com/en-us/sql/relational-databases/indexes/columnstore-indexes-query-performance

Comments

Chosen Answer:
This is a voting comment (?). It is better to Upvote an existing comment if you don't have anything to add.
Switch to a voting comment New
Hinzzz
3 years, 11 months ago
Clustered indexes may outperform clustered columnstore tables when a single row needs to be quickly retrieved. For queries where a single or very few row lookup is required to perform with extreme speed, consider a clustered index or nonclustered secondary index. The answer could be B as it is sales key-based rows retrieval.
upvoted 3 times
hello_there_
3 years, 11 months ago
I agree under the assumption that the table is only used for row retrieval by primary key. It isn't explicitly said that this is the only use though. If this really is the only use of the table, it would make more sense to put it in a CosmosDB with table API. Still, I think that B is the answer they want to hear. Otherwise, why would they have added that queries will use row retrieval by primary key?
upvoted 1 times
...
...
Community vote distribution
A (35%)
C (25%)
B (20%)
Other
Most Voted
A voting comment increases the vote count for the chosen answer by one.

Upvoting a comment with a selected answer will also increase the vote count towards that answer by one. So if you see a comment that you already agree with, you can upvote it instead of posting a new comment.

SaveCancel
Loading ...