Exam DP-203 All Questions

View all questions & answers for the DP-203 exam

Exam DP-203 topic 4 question 2 discussion

Actual exam question from Microsoft's DP-203

Question #: 2
Topic #: 4

You have an Azure Synapse Analytics dedicated SQL pool that contains a large fact table. The table contains 50 columns and 5 billion rows and is a heap.
Most queries against the table aggregate values from approximately 100 million rows and return only two columns.
You discover that the queries against the fact table are very slow.
Which type of index should you add to provide the fastest query times?

A. nonclustered columnstore
B. clustered columnstore
C. nonclustered
D. clustered

Suggested Answer: B

by Miris at June 8, 2021, 7:18 p.m.

Comments

damaldon

Highly Voted 3 years, 10 months ago

correct!

upvoted 32 times

...

Miris

Highly Voted 3 years, 10 months ago

correct

upvoted 14 times

...

imatheushenrique

Most Recent 1 month, 2 weeks ago

Selected Answer: B

B. Clustered columnstore index. Since the Azure Synapse Analytics dedicated SQL pool contains a large fact table with 5 billion rows stored as a heap, and queries typically aggregate values from 100 million rows, the best indexing strategy is a clustered columnstore index (CCI). Columnstore indexes are optimized for large tables and analytical (OLAP) workloads. A CCI compresses the data and stores it in columnar format, which significantly improves query performance on aggregations. A CCI provides the fastest query times for scenarios where:
* A large number of rows is scanned.
* Queries perform aggregations and filtering.
* The query returns only a few columns from the large dataset.

upvoted 1 times
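
To put a rough number on the "only a few columns" point above, here is a minimal sketch. It assumes all 50 columns are about the same width, which the question does not state, and it ignores compression entirely. The function name is made up for illustration.

```python
def column_fraction_read(total_columns: int, columns_needed: int) -> float:
    """Fraction of column data a columnar scan touches.

    Assumption: every column has roughly the same width on disk.
    """
    if not 0 < columns_needed <= total_columns:
        raise ValueError("columns_needed must be between 1 and total_columns")
    return columns_needed / total_columns


# The fact table in the question: 50 columns, queries return 2 of them.
fraction = column_fraction_read(total_columns=50, columns_needed=2)
print(f"A columnar scan reads about {fraction:.0%} of the column data")
```

A row-based heap has to read whole rows, so under the same assumption it touches all 50 columns even though only two are needed.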

...

j0n4

4 months, 4 weeks ago

Selected Answer: B

Correct

upvoted 1 times

...

jsav1

1 year, 1 month ago

Selected Answer: B

clustered columnstore index

upvoted 1 times

...

jppdks

1 year, 1 month ago

Selected Answer: D

Clustered index
* Tables with up to 100 million rows
* Large tables (more than 100 million rows) with only 1-2 columns heavily used

upvoted 1 times

rlnd2000

1 year ago

I think you did not read the question correctly. The queries aggregate about 100 million rows out of 5 billion, which suggests that some selection criteria are applied, typically through a WHERE clause, to filter the data down to the subset relevant to the query. A clustered columnstore index helps in this scenario by compressing and storing the data in a columnar format, which is optimal for large-scale aggregations on a subset of columns. This type of index supports the query patterns described by enabling faster aggregation and filtering on large datasets.

upvoted 2 times

...

Azure_2023

1 year, 3 months ago

Selected Answer: B

B. clustered columnstore. A clustered columnstore index is the most efficient type of index for querying large fact tables with a high proportion of aggregation queries. This is because a clustered columnstore index stores data in a columnar format, which is much more efficient for aggregation queries than a row-based format. Additionally, a clustered columnstore index stores data in a compressed format, which further reduces the amount of data that needs to be scanned. In this scenario, the fact table contains 50 columns and 5 billion rows, and most queries aggregate values from approximately 100 million rows and return only two columns. This indicates that the queries are primarily interested in summarizing the data in the table, rather than scanning the entire table for specific rows. Therefore, a clustered columnstore index is the most appropriate choice for improving the performance of these queries.

upvoted 1 times
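
As a toy illustration of why a single column often compresses well, here is a run-length encoding sketch. This is a generic columnar technique shown under simplifying assumptions, not a description of the engine's actual compression algorithm, and the column values are invented.

```python
from itertools import groupby


def run_length_encode(values):
    """Collapse consecutive repeats into (value, run_length) pairs."""
    return [(value, sum(1 for _ in run)) for value, run in groupby(values)]


# Hypothetical low-cardinality column, as is common in fact tables.
region_column = ["EU"] * 4 + ["US"] * 3 + ["EU"] * 2
encoded = run_length_encode(region_column)
print(encoded)  # [('EU', 4), ('US', 3), ('EU', 2)]
print(len(region_column), "values stored as", len(encoded), "runs")
```

Values from the same column tend to repeat, so storing them together gives the encoder long runs to work with; in a row-based layout the neighbouring bytes belong to unrelated columns.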

...

AlfredPennyworth

1 year, 4 months ago

Clustered index
* Tables with up to 100 million rows
* Large tables (more than 100 million rows) with only 1-2 columns heavily used

upvoted 2 times

...

kkk5566

1 year, 8 months ago

Selected Answer: D

If your table size is less than the recommended 60 million rows for clustered columnstore indexing, consider using heap or clustered index tables.

upvoted 2 times

kkk5566

1 year, 8 months ago

https://learn.microsoft.com/en-us/azure/synapse-analytics/sql-data-warehouse/cheat-sheet

upvoted 2 times

axantroff

1 year, 3 months ago

Where did you get this "60 million rows" number? I only see the following:
> Clustered columnstore index: Great fit for large tables (more than 100 million rows)
And that seems like our case.

upvoted 2 times
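
On where a figure like 60 million could come from: a dedicated SQL pool spreads each table across 60 distributions, and a columnstore rowgroup holds at most 1,048,576 rows, so a table needs roughly 60 million rows before each distribution can fill even one rowgroup. The sketch below is just that arithmetic, assuming perfectly even distribution and no table partitioning; the function names are made up for illustration.

```python
# Assumptions: 60 distributions per dedicated SQL pool table and a maximum
# of 1,048,576 rows per columnstore rowgroup; table partitioning is ignored.
DISTRIBUTIONS = 60
ROWGROUP_MAX_ROWS = 1_048_576


def rows_per_distribution(total_rows: int) -> int:
    """Rows each distribution holds if the table is spread perfectly evenly."""
    return total_rows // DISTRIBUTIONS


def full_rowgroups_per_distribution(total_rows: int) -> int:
    """How many completely filled rowgroups each distribution could hold."""
    return rows_per_distribution(total_rows) // ROWGROUP_MAX_ROWS


for label, rows in [("60 million", 60_000_000), ("5 billion", 5_000_000_000)]:
    print(label, rows_per_distribution(rows), full_rowgroups_per_distribution(rows))
```

With 5 billion rows, each distribution gets about 83 million rows, which is far above the threshold, so the table size in the question is not an argument against clustered columnstore.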

...

[Removed]

1 year, 8 months ago

Selected Answer: C

It asks which index to add, so the answer is C.

upvoted 1 times

...

[Removed]

1 year, 8 months ago

Selected Answer: A

A heap is a table without a clustered index. One or more nonclustered indexes can be created on tables stored as a heap. The question already says it's a heap table and asks what to add, so the answer is A.

upvoted 2 times

...

matiandal

1 year, 9 months ago

Why not an NCCI, i.e. why not A?
Nonclustered columnstore index on a disk-based heap or B-tree index. Use for:
1) An OLTP workload that has some analytics queries. You can drop B-tree indexes created for analytics and replace them with one nonclustered columnstore index.
2) Many traditional OLTP workloads that perform Extract, Transform, and Load (ETL) operations to move data to a separate data warehouse. You can eliminate ETL and a separate data warehouse by creating a nonclustered columnstore index on some of the OLTP tables.
An NCCI is an additional index that requires 10% more storage on average.
Reference: https://learn.microsoft.com/en-us/sql/relational-databases/indexes/columnstore-indexes-design-guidance?view=sql-server-ver16#choose-the-best-columnstore-index-for-your-needs
Enjoy!

upvoted 1 times

Matt2000

1 year, 8 months ago

It is currently a heap, thus a clustered columnstore makes the most sense.

upvoted 1 times
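
For anyone wondering what "adding" a clustered columnstore index to a heap looks like, here is a minimal sketch that only builds the T-SQL statement as a string. The schema, table, and index names (dbo, FactSales, cci_FactSales) are hypothetical, and the helper itself is an illustration, not part of any Azure SDK.

```python
import re

# Accept only plain identifiers so the generated statement cannot be
# altered by unexpected characters in a name.
_IDENTIFIER = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def build_cci_statement(schema: str, table: str, index_name: str) -> str:
    """Return a CREATE CLUSTERED COLUMNSTORE INDEX statement for a table."""
    for name in (schema, table, index_name):
        if not _IDENTIFIER.match(name):
            raise ValueError(f"unsafe identifier: {name!r}")
    return (
        f"CREATE CLUSTERED COLUMNSTORE INDEX [{index_name}] "
        f"ON [{schema}].[{table}];"
    )


# Hypothetical names; the question does not name the fact table.
print(build_cci_statement("dbo", "FactSales", "cci_FactSales"))
```

Running the generated statement converts the table's storage from a heap to columnstore, which is why the option reads as an index you "add" even though it replaces the underlying structure.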

...

[Removed]

1 year, 10 months ago

Selected Answer: B

Only 2 columns returned

upvoted 1 times

...

auwia

1 year, 10 months ago

Selected Answer: B

B, of course. There are a few scenarios where clustered columnstore may not be a good option:
* Columnstore tables do not support varchar(max), nvarchar(max), and varbinary(max). Consider heap or clustered index instead.
* Columnstore tables may be less efficient for transient data. Consider heap and perhaps even temporary tables.
* Small tables with less than 60 million rows. Consider heap tables.

upvoted 1 times
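
The exceptions listed above can be written down as a small decision helper. This is only a sketch of the three rules quoted in that comment, checked in the order they are listed; the function name and parameters are made up, and real table design involves more factors than these.

```python
def suggest_structure(
    row_count: int,
    uses_max_types: bool = False,
    is_transient: bool = False,
) -> str:
    """Apply the three quoted exceptions; otherwise suggest columnstore."""
    if uses_max_types:
        # varchar(max), nvarchar(max), varbinary(max) are not supported.
        return "heap or clustered index"
    if is_transient:
        return "heap"
    if row_count < 60_000_000:
        return "heap"
    return "clustered columnstore"


# The fact table in the question: 5 billion rows, no exceptions mentioned.
print(suggest_structure(5_000_000_000))  # clustered columnstore
```

None of the three exceptions applies to the table in the question, which is consistent with answer B.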

...

vctrhugo

1 year, 10 months ago

Selected Answer: B

B. clustered columnstore index. Given the large fact table with 50 columns and 5 billion rows, and the fact that most queries aggregate values from approximately 100 million rows and return only two columns, a clustered columnstore index would be the most suitable choice. Clustered columnstore indexes are designed for large-scale data warehousing scenarios and provide excellent compression and query performance for analytical workloads. A clustered columnstore index stores the data in columnar format, enabling efficient data compression and batch-based query execution. It allows for significant query performance improvements, especially for aggregations and large-scale data retrieval.

upvoted 2 times

...

mamahani

2 years ago

I'm really baffled by all the answers here; no one is even considering a clustered index, which is what Microsoft recommends for this particular use case: https://learn.microsoft.com/en-us/azure/synapse-analytics/sql-data-warehouse/cheat-sheet#index-your-table For a table with up to 100 million records that heavily uses 1-2 columns and runs queries with lots of joins and aggregations (GROUP BY clause), Microsoft recommends a clustered index. Why is this recommendation not applicable here? Could someone explain?

upvoted 4 times

mamahani

2 years ago

Please ignore; instead of reading "watch out if...", I read just "if". I must have been tired. So a clustered index is NOT good for GROUP BY operations; it is good if you need to retrieve a single row or a few rows (but an aggregate is not just a few rows; it is many rows aggregated into one, which is not the same). So I believe it is indeed a clustered columnstore index, and the given answer is correct.

upvoted 5 times

...

AHUI

2 years ago

Selected Answer: B

correct

upvoted 3 times

...
