exam questions

Exam DP-300 All Questions

View all questions & answers for the DP-300 exam

Exam DP-300 topic 1 question 15 discussion

Actual exam question from Microsoft's DP-300
Question #: 15
Topic #: 1
[All DP-300 Questions]

DRAG DROP -
You plan to create a table in an Azure Synapse Analytics dedicated SQL pool.
Data in the table will be retained for five years. Once a year, data that is older than five years will be deleted.
You need to ensure that the data is distributed evenly across partitions. The solutions must minimize the amount of time required to delete old data.
How should you complete the Transact-SQL statement? To answer, drag the appropriate values to the correct targets. Each value may be used once, more than once, or not at all.
You may need to drag the split bar between panes or scroll to view content.
NOTE: Each correct selection is worth one point.
Select and Place:

Show Suggested Answer Hide Answer
Suggested Answer:
Box 1: HASH -

Box 2: OrderDateKey -
In most cases, table partitions are created on a date column.
A way to eliminate rollbacks is to use Metadata Only operations like partition switching for data management. For example, rather than execute a DELETE statement to delete all rows in a table where the order_date was in October of 2001, you could partition your data early. Then you can switch out the partition with data for an empty partition from another table.
Reference:
https://docs.microsoft.com/en-us/sql/t-sql/statements/create-table-azure-sql-data-warehouse https://docs.microsoft.com/en-us/azure/synapse-analytics/sql/best-practices-dedicated-sql-pool

Comments

Chosen Answer:
This is a voting comment (?). It is better to Upvote an existing comment if you don't have anything to add.
Switch to a voting comment New
learnazureportal
Highly Voted 8 months, 2 weeks ago
The provided answer is correct.
upvoted 10 times
U_C
8 months, 2 weeks ago
I agree. Here is an example from MS: CREATE TABLE myTable ( l_orderkey bigint, l_partkey bigint, l_suppkey bigint, l_shipinstruct char(25), l_shipmode char(10), l_comment varchar(44)) WITH ( DISTRIBUTION = HASH (l_orderkey), CLUSTERED COLUMNSTORE INDEX, PARTITION ( l_shipdate RANGE RIGHT FOR VALUES ( '1992-01-01','1992-02-01','1992-03-01','1992-04-01','1992-05-01', '1992-06-01','1992-07-01','1992-08-01','1992-09-01','1992-10-01', '1992-11-01','1992-12-01','1993-01-01','1993-02-01','1993-03-01' )) );
upvoted 5 times
U_C
3 years, 7 months ago
FYI DISTRIBUTION = HASH ( distribution_column_name ) Assigns each row to one distribution by hashing the value stored in distribution_column_name. The algorithm is deterministic, which means it always hashes the same value to the same distribution. The distribution column should be defined as NOT NULL because all rows that have NULL are assigned to the same distribution. DISTRIBUTION = ROUND_ROBIN Distributes the rows evenly across all the distributions in a round-robin fashion. This behavior is the default for Azure Synapse Analytics. DISTRIBUTION = REPLICATE Stores one copy of the table on each Compute node. For Azure Synapse Analytics the table is stored on a distribution database on each Compute node. For Analytics Platform System (PDW), the table is stored in a SQL Server filegroup that spans the Compute node. This behavior is the default for Analytics Platform System (PDW).
upvoted 4 times
...
...
...
Mladen_66
Highly Voted 3 years, 9 months ago
DISTRIBUTION = ROUND_ROBIN Distributes the rows evenly across all the distributions in a round-robin fashion. This behavior is the default for Azure Synapse Analytics.
upvoted 6 times
ramelas
3 years, 6 months ago
distribution in a column just in hash
upvoted 2 times
...
...
o2091
Most Recent 3 years, 6 months ago
is part of DP-300?
upvoted 3 times
...
matongax
3 years, 7 months ago
from what i read fact = hash dimensions = round robin
upvoted 2 times
...
captainpike
3 years, 8 months ago
(distribution_column_name) is only available in HASH https://docs.microsoft.com/en-us/sql/t-sql/statements/create-table-azure-sql-data-warehouse?view=aps-pdw-2016-au7
upvoted 5 times
...
Aggie0702
3 years, 8 months ago
By default, tables are Round Robin distributed. This default makes it easy for users to start creating tables without having to decide how their tables should be distributed. Round Robin tables may perform sufficiently for some workloads. But, in most cases, a distribution column provides better performance. The most common example of a table distributed by a column outperforming a Round Robin table is when two large fact tables are joined.
upvoted 1 times
...
Aggie0702
3 years, 8 months ago
A round-robin distributed table distributes table rows evenly across all distributions. The assignment of rows to distributions is random. Unlike hash-distributed tables, rows with equal values are not guaranteed to be assigned to the same distribution.
upvoted 1 times
...
ovokpus
3 years, 9 months ago
Is this a DP-300 question?
upvoted 5 times
o2091
3 years, 6 months ago
I would say no, what do you think?
upvoted 2 times
...
...
maple580122
3 years, 9 months ago
The question requires to distribute evenly across "partitions". So it should be HASH.
upvoted 3 times
...
Community vote distribution
A (35%)
C (25%)
B (20%)
Other
Most Voted
A voting comment increases the vote count for the chosen answer by one.

Upvoting a comment with a selected answer will also increase the vote count towards that answer by one. So if you see a comment that you already agree with, you can upvote it instead of posting a new comment.

SaveCancel
Loading ...