Exam DP-203 topic 4 question 2 discussion

Actual exam question from Microsoft's DP-203
Question #: 2
Topic #: 4

You have an Azure Synapse Analytics dedicated SQL pool that contains a large fact table. The table contains 50 columns and 5 billion rows and is a heap.
Most queries against the table aggregate values from approximately 100 million rows and return only two columns.
You discover that the queries against the fact table are very slow.
Which type of index should you add to provide the fastest query times?

  • A. nonclustered columnstore
  • B. clustered columnstore
  • C. nonclustered
  • D. clustered
Suggested Answer: B

Comments

damaldon
Highly Voted 3 years, 10 months ago
correct!
upvoted 32 times
...
Miris
Highly Voted 3 years, 10 months ago
correct
upvoted 14 times
...
imatheushenrique
Most Recent 1 month, 2 weeks ago
Selected Answer: B
B. Clustered columnstore index (CCI). The Azure Synapse Analytics dedicated SQL pool holds a large fact table with 5 billion rows stored as a heap, and queries typically aggregate values from about 100 million rows, so the best indexing strategy is a CCI. Columnstore indexes are optimized for large tables and analytical (OLAP) workloads. A CCI compresses the data and stores it in columnar format, which significantly improves query performance on aggregations. A CCI provides the fastest query times for scenarios where:
  • A large number of rows is scanned.
  • Queries perform aggregations and filtering.
  • The query returns only a few columns from the large dataset.
upvoted 1 times
...
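A rough back-of-the-envelope sketch of why reading two columns out of fifty matters. It assumes every column value is 8 bytes wide and ignores compression and rowgroup elimination, so the numbers are illustrative only; `BYTES_PER_VALUE` and the function names are made up for this example and do not come from the question.

```python
# Illustrative estimate only: assumes fixed-width 8-byte values and no compression.
TOTAL_ROWS = 5_000_000_000   # rows in the fact table (from the question)
TOTAL_COLUMNS = 50           # columns in the fact table (from the question)
QUERY_COLUMNS = 2            # columns a typical query returns (from the question)
BYTES_PER_VALUE = 8          # assumption, not from the question


def row_store_scan_bytes(rows: int, columns: int) -> int:
    """A heap with no index reads whole rows, so every column is touched."""
    return rows * columns * BYTES_PER_VALUE


def column_store_scan_bytes(rows: int, needed_columns: int) -> int:
    """A columnstore reads only the column segments the query references."""
    return rows * needed_columns * BYTES_PER_VALUE


heap_bytes = row_store_scan_bytes(TOTAL_ROWS, TOTAL_COLUMNS)
cci_bytes = column_store_scan_bytes(TOTAL_ROWS, QUERY_COLUMNS)
reduction = heap_bytes // cci_bytes

print(f"heap scan: {heap_bytes / 1e12:.1f} TB")
print(f"columnstore scan: {cci_bytes / 1e12:.2f} TB")
print(f"reduction: {reduction}x")
```

Under these assumptions the columnar layout touches 25 times less data before compression is even considered. A real query would also read any columns used in its filter, so treat the ratio as an upper-bound illustration rather than a measurement.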
j0n4
4 months, 4 weeks ago
Selected Answer: B
Correct
upvoted 1 times
...
jsav1
1 year, 1 month ago
Selected Answer: B
clustered columnstore index
upvoted 1 times
...
jppdks
1 year, 1 month ago
Selected Answer: D
Clustered index:
  • Tables with up to 100 million rows
  • Large tables (more than 100 million rows) with only 1-2 columns heavily used
upvoted 1 times
rlnd2000
1 year ago
I think you misread the question. The queries aggregate 100 million rows out of 5 billion, which suggests that some selection criterion, typically a WHERE clause, filters the data down to the subset relevant to the query. A clustered columnstore index helps in this scenario because it compresses the data and stores it in a columnar format, which is optimal for large-scale aggregations on a subset of columns. This type of index supports the query pattern described by enabling faster aggregation and filtering on large datasets.
upvoted 2 times
...
...
Azure_2023
1 year, 3 months ago
Selected Answer: B
B. clustered columnstore. A clustered columnstore index is the most efficient type of index for querying large fact tables with a high proportion of aggregation queries. This is because a clustered columnstore index stores data in a columnar format, which is much more efficient for aggregation queries than a row-based format. Additionally, a clustered columnstore index stores data in a compressed format, which further reduces the amount of data that needs to be scanned. In this scenario, the fact table contains 50 columns and 5 billion rows, and most queries aggregate values from approximately 100 million rows and return only two columns. This indicates that the queries are primarily interested in summarizing the data in the table, rather than scanning the entire table for specific rows. Therefore, a clustered columnstore index is the most appropriate choice for improving the performance of these queries.
upvoted 1 times
...
AlfredPennyworth
1 year, 4 months ago
Clustered index:
  • Tables with up to 100 million rows
  • Large tables (more than 100 million rows) with only 1-2 columns heavily used
upvoted 2 times
...
kkk5566
1 year, 8 months ago
Selected Answer: D
If your table size is less than the recommended 60 million rows for clustered columnstore indexing, consider using heap or clustered index tables.
upvoted 2 times
kkk5566
1 year, 8 months ago
https://learn.microsoft.com/en-us/azure/synapse-analytics/sql-data-warehouse/cheat-sheet
upvoted 2 times
axantroff
1 year, 3 months ago
Where did you get this "60 million rows" number? I only see the following: "Clustered columnstore index. Great fit for... Large tables (more than 100 million rows)". And that seems like our case.
upvoted 2 times
...
...
...
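On the "60 million rows" question: as far as I know, the figure comes from how a dedicated SQL pool spreads a table over 60 distributions while a columnstore rowgroup holds up to 1,048,576 rows, so a table needs roughly 60 million rows before every distribution can fill even one rowgroup. A small sketch of that arithmetic, assuming rows are spread evenly and the table is not partitioned (the function name is hypothetical):

```python
DISTRIBUTIONS = 60               # a dedicated SQL pool splits each table into 60 distributions
MAX_ROWGROUP_ROWS = 1_048_576    # maximum rows in one compressed columnstore rowgroup


def full_rowgroups_per_distribution(total_rows: int) -> int:
    """Full rowgroups each distribution can build, assuming an even spread of rows."""
    rows_per_distribution = total_rows // DISTRIBUTIONS
    return rows_per_distribution // MAX_ROWGROUP_ROWS


print(DISTRIBUTIONS * MAX_ROWGROUP_ROWS)                # 62914560, i.e. roughly 60 million
print(full_rowgroups_per_distribution(50_000_000))      # 0
print(full_rowgroups_per_distribution(5_000_000_000))   # 79
```

With 5 billion rows each distribution gets dozens of full rowgroups, so the table in the question sits well above that threshold.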
[Removed]
1 year, 8 months ago
Selected Answer: C
It asks which index to add, so the answer is C.
upvoted 1 times
...
[Removed]
1 year, 8 months ago
Selected Answer: A
A heap is a table without a clustered index. One or more nonclustered indexes can be created on tables stored as a heap. The question already says the table is a heap and asks what to add, so the answer is A.
upvoted 2 times
...
matiandal
1 year, 9 months ago
Why not an NCCI? Why not A?
Nonclustered columnstore index on a disk-based heap or B-tree index. Use for:
1) An OLTP workload that has some analytics queries. You can drop B-tree indexes created for analytics and replace them with one nonclustered columnstore index.
2) Many traditional OLTP workloads that perform Extract Transform and Load (ETL) operations to move data to a separate data warehouse. You can eliminate ETL and a separate data warehouse by creating a nonclustered columnstore index on some of the OLTP tables.
An NCCI is an additional index that requires 10% more storage on average.
Reference: https://learn.microsoft.com/en-us/sql/relational-databases/indexes/columnstore-indexes-design-guidance?view=sql-server-ver16#choose-the-best-columnstore-index-for-your-needs
Enjoy!
upvoted 1 times
Matt2000
1 year, 8 months ago
It is currently a heap, thus a clustered columnstore makes the most sense.
upvoted 1 times
...
...
[Removed]
1 year, 10 months ago
Selected Answer: B
Only 2 columns returned
upvoted 1 times
...
auwia
1 year, 10 months ago
Selected Answer: B
B, of course. There are a few scenarios where clustered columnstore may not be a good option:
  • Columnstore tables do not support varchar(max), nvarchar(max), and varbinary(max). Consider heap or clustered index instead.
  • Columnstore tables may be less efficient for transient data. Consider heap and perhaps even temporary tables.
  • Small tables with less than 60 million rows. Consider heap tables.
upvoted 1 times
...
vctrhugo
1 year, 10 months ago
Selected Answer: B
B. clustered columnstore index. Given the large fact table with 50 columns and 5 billion rows, and the fact that most queries aggregate values from approximately 100 million rows and return only two columns, a clustered columnstore index would be the most suitable choice. Clustered columnstore indexes are designed for large-scale data warehousing scenarios and provide excellent compression and query performance for analytical workloads. A clustered columnstore index stores the data in columnar format, enabling efficient data compression and batch-based query execution. It allows for significant query performance improvements, especially for aggregations and large-scale data retrieval.
upvoted 2 times
...
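To make the compression point concrete, here is a toy run-length encoder over a single column. It is a stand-in for illustration, not the encoding Synapse actually uses, and the column values are invented.

```python
from itertools import groupby


def run_length_encode(values):
    """Collapse consecutive repeats into (value, count) pairs."""
    return [(value, len(list(run))) for value, run in groupby(values)]


# A made-up, sorted "region" column: few distinct values, long runs.
region_column = ["EU"] * 4 + ["US"] * 3 + ["APAC"] * 2

encoded = run_length_encode(region_column)
print(encoded)  # [('EU', 4), ('US', 3), ('APAC', 2)]

# Counting rows per region on the encoded form touches 3 runs instead of 9 values.
rows_per_region = {value: count for value, count in encoded}
print(rows_per_region["EU"])  # 4
```

Values within one column tend to repeat far more than values across a whole row, which is why storing each column separately compresses so much better than a row-based heap.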
mamahani
2 years ago
I'm really baffled by all the answers here; no one is even considering a clustered index, which is what Microsoft recommends for this particular use case: https://learn.microsoft.com/en-us/azure/synapse-analytics/sql-data-warehouse/cheat-sheet#index-your-table. For a table of up to 100 million records that heavily uses 1-2 columns and runs queries with lots of joins and aggregations (GROUP BY clause), Microsoft recommends a clustered index. Why is this recommendation not applicable here? Could someone explain?
upvoted 4 times
mamahani
2 years ago
Please ignore; instead of reading "watch out if..." I read just "if". I must have been tired. So a clustered index is NOT good for GROUP BY operations; it's good if you need to retrieve a single row or a few rows (but an aggregate is not just a few rows: it's many, many rows aggregating to one row, which is not the same). So I believe it is indeed a clustered columnstore index, and the given answer is correct.
upvoted 5 times
...
...
AHUI
2 years ago
Selected Answer: B
correct
upvoted 3 times
...