Exam AWS Certified Data Engineer - Associate DEA-C01 topic 1 question 24 discussion

A company maintains an Amazon Redshift provisioned cluster that the company uses for extract, transform, and load (ETL) operations to support critical analysis tasks. A sales team within the company maintains a Redshift cluster that the sales team uses for business intelligence (BI) tasks.
The sales team recently requested access to the data that is in the ETL Redshift cluster so the team can perform weekly summary analysis tasks. The sales team needs to join data from the ETL cluster with data that is in the sales team's BI cluster.
The company needs a solution that will share the ETL cluster data with the sales team without interrupting the critical analysis tasks. The solution must minimize usage of the computing resources of the ETL cluster.
Which solution will meet these requirements?

  • A. Set up the sales team BI cluster as a consumer of the ETL cluster by using Redshift data sharing.
  • B. Create materialized views based on the sales team's requirements. Grant the sales team direct access to the ETL cluster.
  • C. Create database views based on the sales team's requirements. Grant the sales team direct access to the ETL cluster.
  • D. Unload a copy of the data from the ETL cluster to an Amazon S3 bucket every week. Create an Amazon Redshift Spectrum table based on the content of the ETL cluster.
Suggested Answer: A
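For readers who have not used data sharing, here is a minimal sketch of what option A might look like in practice. It only assembles the Redshift SQL statements as strings; it does not connect to any cluster. The share name, schema name, database name, and namespace IDs are hypothetical placeholders, not values from the question.

```python
# Sketch only: builds the SQL a producer (ETL) cluster and a consumer (BI)
# cluster would run for Redshift data sharing. All names and namespace IDs
# passed in are hypothetical placeholders.

def producer_statements(share: str, schema: str, consumer_namespace: str) -> list[str]:
    """Statements to run on the ETL (producer) cluster."""
    return [
        f"CREATE DATASHARE {share};",
        f"ALTER DATASHARE {share} ADD SCHEMA {schema};",
        f"ALTER DATASHARE {share} ADD ALL TABLES IN SCHEMA {schema};",
        f"GRANT USAGE ON DATASHARE {share} TO NAMESPACE '{consumer_namespace}';",
    ]


def consumer_statements(share: str, local_db: str, producer_namespace: str) -> list[str]:
    """Statements to run on the sales team's BI (consumer) cluster."""
    return [
        f"CREATE DATABASE {local_db} FROM DATASHARE {share} "
        f"OF NAMESPACE '{producer_namespace}';",
    ]


if __name__ == "__main__":
    for stmt in producer_statements("etl_share", "public", "<consumer-namespace-id>"):
        print(stmt)
    for stmt in consumer_statements("etl_share", "etl_db", "<producer-namespace-id>"):
        print(stmt)
```

Once the consumer database exists, queries against the shared tables run on the BI cluster's compute, which is the point several commenters below make in favor of A.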

Comments

arvehisa
Highly Voted 1 year, 2 months ago
Selected Answer: A
A: Redshift data sharing (https://docs.aws.amazon.com/redshift/latest/dg/data_sharing_intro.html). With data sharing, you can securely and easily share live data across Amazon Redshift clusters. B: A materialized view works only within a single Redshift cluster, across different tables.
upvoted 5 times
...
lucas_rfsb
Highly Voted 1 year, 2 months ago
Selected Answer: D
In my opinion, using Redshift data sharing will consume fewer resources. 'D' involves using an S3 bucket.
upvoted 5 times
lucas_rfsb
1 year, 2 months ago
Sorry, I wanted to select A but selected D.
upvoted 7 times
...
...
motk123
Most Recent 8 months, 2 weeks ago
It seems that the performance of the critical ETL cluster should not be affected when using data sharing, so the answer is likely A. From https://docs.aws.amazon.com/redshift/latest/dg/data_sharing_intro.html: "Supporting different kinds of business-critical workloads – Use a central extract, transform, and load (ETL) cluster that shares data with multiple business intelligence (BI) or analytic clusters. This approach provides read workload isolation and chargeback for individual workloads. You can size and scale your individual workload compute according to the workload-specific requirements of price and performance." From https://docs.aws.amazon.com/redshift/latest/dg/considerations.html: "The performance of the queries on shared data depends on the compute capacity of the consumer clusters."
upvoted 2 times
...
wimalik
9 months ago
A, because Redshift data sharing allows you to share live data across Redshift clusters without having to duplicate the data. This feature enables the sales team to access the data from the ETL cluster directly without interrupting the critical analysis tasks or overloading the ETL cluster's resources. The sales team can join this shared data with their own data in the BI cluster efficiently.
upvoted 1 times
...
San_Juan
9 months, 1 week ago
Selected Answer: D
"The solution must minimize usage of the computing resources of the ETL cluster." That is the key. You shouldn't use the ETL cluster, so unload the data to S3 and run queries in a separate Redshift Spectrum database. Meanwhile, the ETL cluster does nothing.
upvoted 1 times
...
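For comparison, a rough sketch of what option D would involve is shown here. As above, it only builds SQL strings. The bucket path, IAM role ARN, and schema and database names are hypothetical. Note that the UNLOAD statement itself runs on the ETL cluster, so the weekly export does use some of that cluster's compute.

```python
# Sketch only: builds the SQL for a weekly UNLOAD from the ETL cluster and
# the external schema the BI cluster would use to read the files through
# Redshift Spectrum. Bucket, role ARN, and names are hypothetical.

def unload_statement(query: str, s3_prefix: str, iam_role_arn: str) -> str:
    """UNLOAD to run on the ETL cluster; quotes inside the query are doubled."""
    escaped = query.replace("'", "''")
    return (
        f"UNLOAD ('{escaped}') TO '{s3_prefix}' "
        f"IAM_ROLE '{iam_role_arn}' FORMAT AS PARQUET;"
    )


def external_schema_statement(schema: str, catalog_db: str, iam_role_arn: str) -> str:
    """External schema to create on the BI cluster for Spectrum queries."""
    return (
        f"CREATE EXTERNAL SCHEMA {schema} FROM DATA CATALOG "
        f"DATABASE '{catalog_db}' IAM_ROLE '{iam_role_arn}' "
        f"CREATE EXTERNAL DATABASE IF NOT EXISTS;"
    )


if __name__ == "__main__":
    role = "arn:aws:iam::123456789012:role/ExampleRedshiftRole"  # hypothetical
    print(unload_statement("SELECT * FROM public.orders",
                           "s3://example-bucket/weekly/orders_", role))
    print(external_schema_statement("etl_export", "etl_export_db", role))
```

An external table over the unloaded files would still need to be defined (or crawled into the Data Catalog) before the BI cluster could query them, and the copy is only as fresh as the last weekly unload.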
VerRi
1 year ago
Selected Answer: A
Typical Redshift data sharing use case.
upvoted 3 times
...
valuedate
1 year ago
Key words: "weekly" and "The solution must minimize usage of the computing resources of the ETL cluster." Answer: D
upvoted 2 times
...
d8945a1
1 year, 1 month ago
Selected Answer: A
Typical use case for data sharing in Redshift. The question mentions that the 'team needs to join data from the ETL cluster with data that is in the sales team's BI cluster.' This is possible with a datashare.
upvoted 4 times
...
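To illustrate the join mentioned in the comment above, here is a small sketch that builds a query using three-part notation (database.schema.table) to reference a shared table from the BI cluster. The database, schema, table, and column names are hypothetical, and the sketch assumes a database has already been created from the datashare on the consumer side.

```python
# Sketch only: builds a join between a local BI table and a table exposed
# through a datashare, referenced as database.schema.table.
# All identifiers passed in are hypothetical.

def cross_database_join(local_table: str, shared_db: str, shared_schema: str,
                        shared_table: str, key: str) -> str:
    """Join a local BI table to a shared ETL table on a common key column."""
    shared_ref = f"{shared_db}.{shared_schema}.{shared_table}"
    return (
        f"SELECT l.*, s.* FROM {local_table} AS l "
        f"JOIN {shared_ref} AS s ON l.{key} = s.{key};"
    )


if __name__ == "__main__":
    print(cross_database_join("public.sales_targets", "etl_db",
                              "public", "orders", "customer_id"))
```

The query would be submitted on the BI cluster, so the join work is done there rather than on the ETL cluster.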
jasango
1 year, 2 months ago
Selected Answer: D
The spectrum table is accessed from the sales cluster with zero impact on the ETL cluster.
upvoted 3 times
...
certplan
1 year, 2 months ago
Options A, B, and C involve granting the sales team direct access to the ETL cluster, which could potentially impact the performance of the ETL cluster and interfere with its critical analysis tasks. Option D provides a more isolated and scalable approach by leveraging Amazon S3 and Redshift Spectrum for data sharing while minimizing the usage of the ETL cluster's computing resources. https://docs.aws.amazon.com/redshift/latest/dg/c-using-spectrum-sharing-data.html https://docs.aws.amazon.com/redshift/latest/dg/c_best-practices-design-tables.html
upvoted 1 times
...
certplan
1 year, 2 months ago
Overall, while both options offer ways to share data between the ETL and BI clusters, Option D offers a more robust and scalable solution that minimizes the impact on the ETL cluster's resources and provides greater flexibility and independence for the sales team's analysis tasks. By unloading a copy of the data from the ETL cluster to Amazon S3 and leveraging Redshift Spectrum for querying, the solution aligns with AWS best practices for managing data and resource usage in Amazon Redshift clusters. It ensures that critical analysis tasks are not interrupted while providing the sales team with the necessary access to perform their analysis tasks efficiently.
upvoted 2 times
...
GiorgioGss
1 year, 2 months ago
Selected Answer: A
Initially I would have gone with B, but that would definitely use more resources.
upvoted 5 times
...
[Removed]
1 year, 4 months ago
Selected Answer: A
To share data between Redshift clusters and meet the requirements of sharing ETL cluster data with the sales team without interrupting critical analysis tasks and minimizing the usage of the ETL cluster's computing resources, Redshift data sharing is the way to go. From https://docs.aws.amazon.com/redshift/latest/dg/data_sharing_intro.html: "Supporting different kinds of business-critical workloads – Use a central extract, transform, and load (ETL) cluster that shares data with multiple business intelligence (BI) or analytic clusters. This approach provides read workload isolation and chargeback for individual workloads. You can size and scale your individual workload compute according to the workload-specific requirements of price and performance."
upvoted 4 times
...
Community vote distribution
A (35%)
C (25%)
B (20%)
Other
Most Voted