Exam DP-200 All Questions

View all questions & answers for the DP-200 exam

Exam DP-200 topic 1 question 55 discussion

Actual exam question from Microsoft's DP-200

Question #: 55
Topic #: 1

HOTSPOT -
You have a SQL pool in Azure Synapse.
You plan to load data from Azure Blob storage to a staging table. Approximately 1 million rows of data will be loaded daily. The table will be truncated before each daily load.
You need to create the staging table. The solution must minimize how long it takes to load the data to the staging table.
How should you configure the table? To answer, select the appropriate options in the answer area.
NOTE: Each correct selection is worth one point.
Hot Area:

Show Suggested Answer

Suggested Answer:

Box 1: Hash -
Hash-distributed tables improve query performance on large fact tables. hey can have very large numbers of rows and still achieve high performance.
Incorrect:
Round-robin tables are useful for improving loading speed.

Box 2: Clustered columnstore -
When creating partitions on clustered columnstore tables, it is important to consider how many rows belong to each partition. For optimal compression and performance of clustered columnstore tables, a minimum of 1 million rows per distribution and partition is needed.

Box 3: Date -
Table partitions enable you to divide your data into smaller groups of data. In most cases, table partitions are created on a date column.
Partition switching can be used to quickly remove or replace a section of a table.
Reference:
https://docs.microsoft.com/en-us/azure/synapse-analytics/sql-data-warehouse/sql-data-warehouse-tables-partition https://docs.microsoft.com/en-us/azure/synapse-analytics/sql-data-warehouse/sql-data-warehouse-tables-distribute

by LongBao at April 4, 2021, 7:33 a.m.

Comments

Submit Cancel

mamhh

Highly Voted 4 years, 2 months ago

Round-Robin Heap None

upvoted 68 times

...

LongBao

Highly Voted 4 years, 2 months ago

From https://docs.microsoft.com/en-us/azure/synapse-analytics/sql/develop-tables-overview: "Use round-robin for the staging table. The load with CTAS is fast. Once the data is in the staging table, use INSERT...SELECT to move the data to production tables." This is a staging table, not a fact, so I think the answer is Round robin, Heap, Date.

upvoted 10 times

maciejt

4 years ago

If Round Robin, then you don't choose column to partition on, it can only be None

upvoted 2 times

uzairahm

2 years, 11 months ago

As @TessieB has indicated quoting MS Docs "Partitioning is also supported on all distribution types, including both hash or round robin distributed." https://docs.microsoft.com/en-us/azure/synapse-analytics/sql-data-warehouse/sql-data-warehouse-tables-partition

upvoted 1 times

...

dark_one

Most Recent 1 year, 1 month ago

round robin, heap, none https://learn.microsoft.com/en-us/azure/synapse-analytics/sql/data-loading-best-practices#load-to-a-staging-table

upvoted 1 times

...

saranya23

3 years, 10 months ago

Round-Robin , Heap , None

upvoted 2 times

...

elimey

3 years, 10 months ago

round-robin, haep, non Why should someone hash distribute a stage table?

upvoted 1 times

...

TessieB

3 years, 11 months ago

I think the answer should be: Round robin Heap Date See here: https://docs.microsoft.com/en-us/azure/synapse-analytics/sql-data-warehouse/sql-data-warehouse-tables-partition Partitioning and distribution are two different things! Partitioning is supported by all types of distribution and by all types of indexes! Partitioning can speed up the loading process and it's often done by using a date column. In this question it's not clear however, if the Date column is the right fit for the loading process, but since it is often used like that, I'm gonna go and say that Date might be the correct answer! :)

upvoted 2 times

...

alok1988

4 years ago

Round-Robin , Heap , None

upvoted 4 times

...

Saravjeet

4 years ago

I think it should be Round-RObin, Heap and None. Refer the link https://docs.microsoft.com/en-us/sql/relational-databases/indexes/columnstore-indexes-data-loading-guidance?view=azure-sqldw-latest

upvoted 4 times

...

Qrm_1972

4 years, 1 month ago

The correct answer is : Round-Robin Clustered columnstore Data Table partitions enable you to divide your data into smaller groups of data. In most cases, table partitions are created on a date column. Partitioning is supported on all dedicated SQL pool table types; including clustered columnstore, clustered index, and heap. Partitioning is also supported on all distribution types, including both hash or round robin distributed. https://docs.microsoft.com/en-us/azure/synapse-analytics/sql-data-warehouse/sql-data-warehouse-tables-partition

upvoted 2 times

maciejt

4 years ago

Round RObin is random equal distribution, it doesn't include choosing a column to partition on

upvoted 1 times

...

robin_examtopics

4 years, 1 month ago

The answer should be Round-Robin/ Heap/ None. Reference: https://docs.microsoft.com/en-us/azure/synapse-analytics/sql-data-warehouse/guidance-for-loading-data#loading-to-a-staging-table “To achieve the fastest loading speed for moving data into a dedicated SQL pool table, load data into a staging table. Define the staging table as a heap and use round-robin for the distribution option.”

upvoted 7 times

...

itmemememe

4 years, 1 month ago

If you are partitioning by Date and using Clustered Column Store, Would that be faster than using a Heap, None?

upvoted 1 times

...

cadio30

4 years, 1 month ago

Round-Robin Heap None https://docs.microsoft.com/en-us/azure/synapse-analytics/sql-data-warehouse/sql-data-warehouse-tables-index

upvoted 3 times

...

cadio30

4 years, 1 month ago

round-robin heap none Reference: https://docs.microsoft.com/en-us/azure/synapse-analytics/sql-data-warehouse/sql-data-warehouse-tables-index

upvoted 2 times

...

jamorey

4 years, 1 month ago

Anyone know the correct answer?, I'm between (Round-Robin, Heap and None) the other option (Round-Robin, Heap, Date)

upvoted 1 times

Garnew

4 years, 1 month ago

I'll go with "Round-Robin, Heap and None", Round-Robin is the best and obvious choice, then Heap as there's no need for indexing since it's just for loading purposes, and then None as Round-Robin does not support partitioning.

upvoted 1 times

...

Pairon

4 years, 2 months ago

I think that the first is round robin (fastest way to load data) and the third box should be None, since round robin doesn't need partitioning

upvoted 4 times

...

alf99

4 years, 2 months ago

Answer should be "round-robin"

upvoted 7 times

...

princy18

4 years, 2 months ago

Consider using the round-robin distribution for your table in the following scenarios: When getting started as a simple starting point since it is the default If there is no obvious joining key If there is no good candidate column for hash distributing the table If the table does not share a common join key with other tables If the join is less significant than other joins in the query When the table is a temporary staging table

upvoted 3 times

...