Exam AWS Certified Data Engineer - Associate DEA-C01 topic 1 question 24 discussion

Exam question from Amazon's AWS Certified Data Engineer - Associate DEA-C01

Question #: 24
Topic #: 1


A company maintains an Amazon Redshift provisioned cluster that the company uses for extract, transform, and load (ETL) operations to support critical analysis tasks. A sales team within the company maintains a Redshift cluster that the sales team uses for business intelligence (BI) tasks.
The sales team recently requested access to the data that is in the ETL Redshift cluster so the team can perform weekly summary analysis tasks. The sales team needs to join data from the ETL cluster with data that is in the sales team's BI cluster.
The company needs a solution that will share the ETL cluster data with the sales team without interrupting the critical analysis tasks. The solution must minimize usage of the computing resources of the ETL cluster.
Which solution will meet these requirements?

A. Set up the sales team BI cluster as a consumer of the ETL cluster by using Redshift data sharing.
B. Create materialized views based on the sales team's requirements. Grant the sales team direct access to the ETL cluster.
C. Create database views based on the sales team's requirements. Grant the sales team direct access to the ETL cluster.
D. Unload a copy of the data from the ETL cluster to an Amazon S3 bucket every week. Create an Amazon Redshift Spectrum table based on the content of the ETL cluster.


Suggested Answer: A

by [deleted] at Jan. 21, 2024, 2:47 a.m.


Comments


arvehisa

Highly Voted 1 year, 2 months ago

Selected Answer: A

A: Redshift data sharing (https://docs.aws.amazon.com/redshift/latest/dg/data_sharing_intro.html). With data sharing, you can securely and easily share live data across Amazon Redshift clusters.
B: A materialized view exists only within one Redshift cluster; it can span different tables but not different clusters.

upvoted 5 times


lucas_rfsb

Highly Voted 1 year, 2 months ago

Selected Answer: D

In my opinion, using Redshift data sharing will consume fewer resources. 'D' involves using an S3 bucket.

upvoted 5 times

lucas_rfsb

1 year, 2 months ago

Sorry, I meant to select A but chose D.

upvoted 7 times


motk123

Most Recent 8 months, 2 weeks ago

It seems that the performance of the critical ETL cluster should not be affected when using data sharing, so the answer is likely A.
From https://docs.aws.amazon.com/redshift/latest/dg/data_sharing_intro.html: "Supporting different kinds of business-critical workloads – Use a central extract, transform, and load (ETL) cluster that shares data with multiple business intelligence (BI) or analytic clusters. This approach provides read workload isolation and chargeback for individual workloads. You can size and scale your individual workload compute according to the workload-specific requirements of price and performance."
From https://docs.aws.amazon.com/redshift/latest/dg/considerations.html: "The performance of the queries on shared data depends on the compute capacity of the consumer clusters."

upvoted 2 times


wimalik

9 months ago

A, because Redshift data sharing allows you to share live data across Redshift clusters without having to duplicate the data. This feature enables the sales team to access the data from the ETL cluster directly without interrupting the critical analysis tasks or overloading the ETL cluster's resources. The sales team can join this shared data with their own data in the BI cluster efficiently.

upvoted 1 time


San_Juan

9 months, 1 week ago

Selected Answer: D

"The solution must minimize usage of the computing resources of the ETL cluster." That is the key. You shouldn't use the ETL cluster, so unload the data to S3 and run the queries through a separate Redshift Spectrum database. The ETL cluster does nothing in the meantime.

upvoted 1 time


VerRi

1 year ago

Selected Answer: A

Typical Redshift data sharing use case.

upvoted 3 times


valuedate

1 year ago

Key words: "weekly" and "The solution must minimize usage of the computing resources of the ETL cluster." Answer: D

upvoted 2 times


d8945a1

1 year, 1 month ago

Selected Answer: A

Typical use case of data sharing in Redshift. The question mentions that the "team needs to join data from the ETL cluster with data that is in the sales team's BI cluster." This is possible with a datashare.

upvoted 4 times
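
As a rough illustration of the join described above, the sketch below builds the kind of query the BI (consumer) cluster could run once a database has been created from the datashare. It relies on Redshift's three-part `database.schema.table` notation for shared objects. The database name `etl_shared` and the tables `orders` and `sales_regions` with their columns are hypothetical names chosen for the example, not taken from the question.

```python
def weekly_summary_query(shared_db: str, local_schema: str = "public") -> str:
    """Build a SQL string that joins a shared ETL table with a local BI table.

    All table and column names here are hypothetical placeholders.
    The query is meant to run on the consumer (BI) cluster.
    """
    return (
        "SELECT r.region_name, SUM(o.amount) AS weekly_total\n"
        # Shared table, addressed as database.schema.table.
        f"FROM {shared_db}.public.orders AS o\n"
        # Local table that lives in the BI cluster's own database.
        f"JOIN {local_schema}.sales_regions AS r ON r.region_id = o.region_id\n"
        "WHERE o.order_date >= DATEADD(day, -7, CURRENT_DATE)\n"
        "GROUP BY r.region_name;"
    )


if __name__ == "__main__":
    print(weekly_summary_query("etl_shared"))
```

Because the query executes on the consumer, the compute for the join and aggregation comes from the BI cluster, which is consistent with the documentation quoted elsewhere in this thread.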


jasango

1 year, 2 months ago

Selected Answer: D

The spectrum table is accessed from the sales cluster with zero impact on the ETL cluster.

upvoted 3 times
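
For comparison, here is a minimal sketch of what option D would involve, written as Python helpers that only build the SQL text. The bucket, IAM role ARN, schema, table, and column names are all hypothetical. One detail worth keeping in mind when weighing this option: the `UNLOAD` statement itself runs on the ETL cluster each week, and the data in S3 is a copy rather than live data.

```python
def unload_statement(query: str, s3_prefix: str, iam_role_arn: str) -> str:
    """Build an UNLOAD statement to run on the ETL (source) cluster."""
    if "'" in query:
        # Assumption for this sketch: keep the inner query free of single
        # quotes so that no escaping is needed inside UNLOAD ('...').
        raise ValueError("query must not contain single quotes in this sketch")
    return (
        f"UNLOAD ('{query}') TO '{s3_prefix}' "
        f"IAM_ROLE '{iam_role_arn}' FORMAT AS PARQUET;"
    )


def spectrum_statements(ext_schema, catalog_db, table, columns, s3_prefix, iam_role_arn):
    """Build the statements to run on the BI cluster to expose the S3 files.

    `columns` is a list of (name, type) pairs; the values are hypothetical.
    """
    column_sql = ", ".join(f"{name} {sql_type}" for name, sql_type in columns)
    return [
        f"CREATE EXTERNAL SCHEMA IF NOT EXISTS {ext_schema} "
        f"FROM DATA CATALOG DATABASE '{catalog_db}' "
        f"IAM_ROLE '{iam_role_arn}' CREATE EXTERNAL DATABASE IF NOT EXISTS;",
        f"CREATE EXTERNAL TABLE {ext_schema}.{table} ({column_sql}) "
        f"STORED AS PARQUET LOCATION '{s3_prefix}';",
    ]


if __name__ == "__main__":
    role = "arn:aws:iam::123456789012:role/ExampleRole"  # placeholder ARN
    prefix = "s3://example-bucket/etl-export/"           # placeholder bucket
    print(unload_statement("SELECT * FROM public.orders", prefix, role))
    for stmt in spectrum_statements(
        "spectrum", "etl_export", "orders",
        [("order_id", "BIGINT"), ("amount", "DECIMAL(12,2)")], prefix, role,
    ):
        print(stmt)
```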


certplan

1 year, 2 months ago

Options A, B, and C involve granting the sales team direct access to the ETL cluster, which could potentially impact the performance of the ETL cluster and interfere with its critical analysis tasks. Option D provides a more isolated and scalable approach by leveraging Amazon S3 and Redshift Spectrum for data sharing while minimizing the usage of the ETL cluster's computing resources. See https://docs.aws.amazon.com/redshift/latest/dg/c-using-spectrum-sharing-data.html and https://docs.aws.amazon.com/redshift/latest/dg/c_best-practices-design-tables.html

upvoted 1 time


certplan

1 year, 2 months ago

Overall, while both options offer ways to share data between the ETL and BI clusters, Option D offers a more robust and scalable solution that minimizes the impact on the ETL cluster's resources and provides greater flexibility and independence for the sales team's analysis tasks. By unloading a copy of the data from the ETL cluster to Amazon S3 and leveraging Redshift Spectrum for querying, the solution aligns with AWS best practices for managing data and resource usage in Amazon Redshift clusters. It ensures that critical analysis tasks are not interrupted while providing the sales team with the necessary access to perform their analysis tasks efficiently.

upvoted 2 times


GiorgioGss

1 year, 2 months ago

Selected Answer: A

Initially I would have gone with B, but that will definitely use more resources.

upvoted 5 times


[Removed]

1 year, 4 months ago

Selected Answer: A

To share data between Redshift clusters and meet the requirements (sharing ETL cluster data with the sales team without interrupting critical analysis tasks, and minimizing the usage of the ETL cluster's computing resources), Redshift data sharing is the way to go.
From https://docs.aws.amazon.com/redshift/latest/dg/data_sharing_intro.html: "Supporting different kinds of business-critical workloads – Use a central extract, transform, and load (ETL) cluster that shares data with multiple business intelligence (BI) or analytic clusters. This approach provides read workload isolation and chargeback for individual workloads. You can size and scale your individual workload compute according to the workload-specific requirements of price and performance."

upvoted 4 times
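
To make the producer/consumer setup behind answer A more concrete, the following is a minimal sketch that builds the SQL statements involved, assuming the ETL cluster is the producer and the BI cluster is the consumer. The datashare name, schema, local database name, and namespace GUIDs are hypothetical placeholders; real namespace GUIDs come from each cluster.

```python
def producer_statements(share: str, schema: str, consumer_namespace: str) -> list[str]:
    """Statements to run on the ETL (producer) cluster."""
    return [
        f"CREATE DATASHARE {share};",
        f"ALTER DATASHARE {share} ADD SCHEMA {schema};",
        f"ALTER DATASHARE {share} ADD ALL TABLES IN SCHEMA {schema};",
        # Grants the consumer cluster's namespace access to the datashare.
        f"GRANT USAGE ON DATASHARE {share} TO NAMESPACE '{consumer_namespace}';",
    ]


def consumer_statements(share: str, local_db: str, producer_namespace: str) -> list[str]:
    """Statements to run on the BI (consumer) cluster."""
    return [
        # Exposes the shared objects as a local database on the consumer.
        f"CREATE DATABASE {local_db} FROM DATASHARE {share} "
        f"OF NAMESPACE '{producer_namespace}';",
    ]


if __name__ == "__main__":
    # Placeholder namespace values, for illustration only.
    for stmt in producer_statements("etl_share", "public", "consumer-namespace-guid"):
        print(stmt)
    for stmt in consumer_statements("etl_share", "etl_shared", "producer-namespace-guid"):
        print(stmt)
```

The setup is a one-time step on each side; after that, the sales team queries the shared tables from the BI cluster without a weekly export.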
