Exam DP-300 All Questions

View all questions & answers for the DP-300 exam

Exam DP-300 topic 1 question 15 discussion

Actual exam question from Microsoft's DP-300

Question #: 15
Topic #: 1

DRAG DROP -
You plan to create a table in an Azure Synapse Analytics dedicated SQL pool.
Data in the table will be retained for five years. Once a year, data that is older than five years will be deleted.
You need to ensure that the data is distributed evenly across partitions. The solutions must minimize the amount of time required to delete old data.
How should you complete the Transact-SQL statement? To answer, drag the appropriate values to the correct targets. Each value may be used once, more than once, or not at all.
You may need to drag the split bar between panes or scroll to view content.
NOTE: Each correct selection is worth one point.
Select and Place:

Show Suggested Answer

Suggested Answer:

Box 1: HASH -

Box 2: OrderDateKey -
In most cases, table partitions are created on a date column.
A way to eliminate rollbacks is to use Metadata Only operations like partition switching for data management. For example, rather than execute a DELETE statement to delete all rows in a table where the order_date was in October of 2001, you could partition your data early. Then you can switch out the partition with data for an empty partition from another table.
Reference:
https://docs.microsoft.com/en-us/sql/t-sql/statements/create-table-azure-sql-data-warehouse https://docs.microsoft.com/en-us/azure/synapse-analytics/sql/best-practices-dedicated-sql-pool

by Mladen_66 at Aug. 28, 2021, 7:11 p.m.

Comments

Submit Cancel

learnazureportal

Highly Voted 10 months, 2 weeks ago

The provided answer is correct.

upvoted 10 times

U_C

10 months, 1 week ago

I agree. Here is an example from MS: CREATE TABLE myTable ( l_orderkey bigint, l_partkey bigint, l_suppkey bigint, l_shipinstruct char(25), l_shipmode char(10), l_comment varchar(44)) WITH ( DISTRIBUTION = HASH (l_orderkey), CLUSTERED COLUMNSTORE INDEX, PARTITION ( l_shipdate RANGE RIGHT FOR VALUES ( '1992-01-01','1992-02-01','1992-03-01','1992-04-01','1992-05-01', '1992-06-01','1992-07-01','1992-08-01','1992-09-01','1992-10-01', '1992-11-01','1992-12-01','1993-01-01','1993-02-01','1993-03-01' )) );

upvoted 5 times

U_C

3 years, 9 months ago

FYI DISTRIBUTION = HASH ( distribution_column_name ) Assigns each row to one distribution by hashing the value stored in distribution_column_name. The algorithm is deterministic, which means it always hashes the same value to the same distribution. The distribution column should be defined as NOT NULL because all rows that have NULL are assigned to the same distribution. DISTRIBUTION = ROUND_ROBIN Distributes the rows evenly across all the distributions in a round-robin fashion. This behavior is the default for Azure Synapse Analytics. DISTRIBUTION = REPLICATE Stores one copy of the table on each Compute node. For Azure Synapse Analytics the table is stored on a distribution database on each Compute node. For Analytics Platform System (PDW), the table is stored in a SQL Server filegroup that spans the Compute node. This behavior is the default for Analytics Platform System (PDW).

upvoted 4 times

...

Mladen_66

Highly Voted 3 years, 11 months ago

DISTRIBUTION = ROUND_ROBIN Distributes the rows evenly across all the distributions in a round-robin fashion. This behavior is the default for Azure Synapse Analytics.

upvoted 6 times

ramelas

3 years, 7 months ago

distribution in a column just in hash

upvoted 2 times

...

o2091

Most Recent 3 years, 8 months ago

is part of DP-300?

upvoted 3 times

...

matongax

3 years, 9 months ago

from what i read fact = hash dimensions = round robin

upvoted 2 times

...

captainpike

3 years, 10 months ago

(distribution_column_name) is only available in HASH https://docs.microsoft.com/en-us/sql/t-sql/statements/create-table-azure-sql-data-warehouse?view=aps-pdw-2016-au7

upvoted 5 times

...

Aggie0702

3 years, 10 months ago

By default, tables are Round Robin distributed. This default makes it easy for users to start creating tables without having to decide how their tables should be distributed. Round Robin tables may perform sufficiently for some workloads. But, in most cases, a distribution column provides better performance. The most common example of a table distributed by a column outperforming a Round Robin table is when two large fact tables are joined.

upvoted 1 times

...

Aggie0702

3 years, 10 months ago

A round-robin distributed table distributes table rows evenly across all distributions. The assignment of rows to distributions is random. Unlike hash-distributed tables, rows with equal values are not guaranteed to be assigned to the same distribution.

upvoted 1 times

...

ovokpus

3 years, 11 months ago

Is this a DP-300 question?

upvoted 5 times

o2091

3 years, 8 months ago

I would say no, what do you think?

upvoted 2 times

...

maple580122

3 years, 11 months ago

The question requires to distribute evenly across "partitions". So it should be HASH.

upvoted 3 times

...