Exam AWS Certified Data Engineer - Associate DEA-C01 topic 1 question 94 discussion

A retail company stores transactions, store locations, and customer information tables in four reserved ra3.4xlarge Amazon Redshift cluster nodes. All three tables use even table distribution.

The company updates the store location table only once or twice every few years.

A data engineer notices that Redshift queues are slowing down because the whole store location table is constantly being broadcast to all four compute nodes for most queries. The data engineer wants to speed up the query performance by minimizing the broadcasting of the store location table.

Which solution will meet these requirements in the MOST cost-effective way?

  • A. Change the distribution style of the store location table from EVEN distribution to ALL distribution.
  • B. Change the distribution style of the store location table to KEY distribution based on the column that has the highest dimension.
  • C. Add a join column named store_id into the sort key for all the tables.
  • D. Upgrade the Redshift reserved node to a larger instance size in the same instance family.
Suggested Answer: A
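
To see why ALL distribution removes the broadcast, here is a minimal Python sketch of a toy model. It is not Redshift's actual planner: the function names, the round-robin placement for EVEN, and the 1,000-row table size are all assumptions made for illustration. It simply counts how many row copies would have to be shipped so that each of the four nodes can see the whole store location table for a join.

```python
# Toy model of join-time data movement; not Redshift's actual planner.
NUM_NODES = 4  # four compute nodes, as in the question


def place_rows(num_rows, style, num_nodes=NUM_NODES):
    """Return a list with one set of row ids per node."""
    if style == "EVEN":
        # Round-robin placement: each row lives on exactly one node.
        nodes = [set() for _ in range(num_nodes)]
        for row_id in range(num_rows):
            nodes[row_id % num_nodes].add(row_id)
        return nodes
    if style == "ALL":
        # A full copy of the table is stored on every node.
        return [set(range(num_rows)) for _ in range(num_nodes)]
    raise ValueError(f"unsupported style: {style}")


def rows_to_broadcast(nodes, num_rows):
    """Count row copies that must be shipped so every node holds the full table."""
    return sum(num_rows - len(local) for local in nodes)


store_rows = 1_000  # hypothetical size of the store location table
even_cost = rows_to_broadcast(place_rows(store_rows, "EVEN"), store_rows)
all_cost = rows_to_broadcast(place_rows(store_rows, "ALL"), store_rows)
print(even_cost, all_cost)  # 3000 0
```

In this model the EVEN layout ships 3,000 row copies per join, while ALL ships none. The price is extra storage and a slower load, which matters little for a table updated once or twice every few years. In Redshift itself the change would look something like `ALTER TABLE store_location ALTER DISTSTYLE ALL;`, where the table name is assumed.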

Comments

andrologin
9 months, 3 weeks ago
Selected Answer: A
ALL distribution works best for slowly changing dimension tables that are generally small, because every node holds a full copy and joins need no redistribution.
upvoted 2 times
...
bakarys
10 months, 1 week ago
Selected Answer: A
Answer A, changing the store location table from EVEN to ALL distribution, is the most cost-effective way to minimize broadcasting. In Amazon Redshift, the ALL distribution style stores a full copy of the table on every node, so the table does not have to be redistributed at query time, which can significantly improve query performance. Because the store location table is updated only once or twice every few years, the overhead of maintaining the replicated copies is minimal.
upvoted 2 times
...
PGGuy
10 months, 2 weeks ago
Selected Answer: A
Changing the distribution style of the store location table to ALL distribution (A) is the most cost-effective solution. It directly addresses the issue of broadcasting by ensuring the entire table is available on each node, significantly improving join performance without incurring substantial additional costs.
upvoted 4 times
...
tgv
10 months, 3 weeks ago
Selected Answer: A
Using ALL distribution means the table is replicated to all nodes, eliminating the need for broadcasting during queries. Since the store location table is updated infrequently, this will significantly speed up queries without incurring frequent update costs.
upvoted 2 times
...
Community vote distribution
A (35%)
C (25%)
B (20%)
Other