Exam AWS Certified Data Analytics - Specialty topic 1 question 76 discussion

A company wants to research user turnover by analyzing the past 3 months of user activities. With millions of users, 1.5 TB of uncompressed data is generated each day. A 30-node Amazon Redshift cluster with 2.56 TB of solid state drive (SSD) storage for each node is required to meet the query performance goals.
The company wants to run an additional analysis on a year's worth of historical data to examine trends indicating which features are most popular. This analysis will be done once a week.
What is the MOST cost-effective solution?

  • A. Increase the size of the Amazon Redshift cluster to 120 nodes so it has enough storage capacity to hold 1 year of data. Then use Amazon Redshift for the additional analysis.
  • B. Keep the data from the last 90 days in Amazon Redshift. Move data older than 90 days to Amazon S3 and store it in Apache Parquet format partitioned by date. Then use Amazon Redshift Spectrum for the additional analysis.
  • C. Keep the data from the last 90 days in Amazon Redshift. Move data older than 90 days to Amazon S3 and store it in Apache Parquet format partitioned by date. Then provision a persistent Amazon EMR cluster and use Apache Presto for the additional analysis.
  • D. Resize the cluster node type to the dense storage node type (DS2) for an additional 16 TB storage capacity on each individual node in the Amazon Redshift cluster. Then use Amazon Redshift for the additional analysis.
Suggested Answer: B

Comments

ramozo
Highly Voted 3 years, 8 months ago
B. Redshift Spectrum. "Amazon Redshift Spectrum executes queries across thousands of parallelized nodes to deliver fast results, regardless of the complexity of the query or the amount of data." https://aws.amazon.com/redshift/features/
upvoted 33 times
awssp12345
3 years, 8 months ago
Agree!
upvoted 1 times
...
lui
3 years, 8 months ago
How can 30 nodes hold 90 days of data?
upvoted 1 times
Paitan
3 years, 7 months ago
You are right, 30 nodes cannot hold 90 days of uncompressed data. But the data can be compressed when it is stored in Redshift, which reduces the storage requirement enough for the 30 nodes to handle it.
upvoted 5 times
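The storage arithmetic behind this exchange can be sketched in a few lines. This is only a rough check using the figures from the question (1.5 TB per day, 30 nodes, 2.56 TB of SSD per node); the function names are made up for illustration, and the sketch ignores the free-space headroom a real cluster needs, so the actual compression ratio required would be somewhat higher.

```python
# Figures taken from the question text.
DAILY_RAW_TB = 1.5        # uncompressed data generated per day
NODES = 30                # nodes in the existing cluster
SSD_PER_NODE_TB = 2.56    # SSD storage per node
RETENTION_DAYS = 90       # "past 3 months" of activity


def raw_volume_tb(days: int) -> float:
    """Uncompressed data volume accumulated over `days` days."""
    return DAILY_RAW_TB * days


def cluster_capacity_tb(nodes: int) -> float:
    """Total raw SSD capacity of a cluster with `nodes` nodes."""
    return nodes * SSD_PER_NODE_TB


def required_compression_ratio(days: int, nodes: int) -> float:
    """Minimum compression ratio for `days` of data to fit on `nodes` nodes.

    Assumption: the whole disk is usable, which understates the real need.
    """
    return raw_volume_tb(days) / cluster_capacity_tb(nodes)


if __name__ == "__main__":
    print(f"90 days uncompressed: {raw_volume_tb(RETENTION_DAYS):.1f} TB")
    print(f"30-node capacity:     {cluster_capacity_tb(NODES):.1f} TB")
    ratio = required_compression_ratio(RETENTION_DAYS, NODES)
    print(f"compression needed:   {ratio:.2f}x")
    # Option A for comparison: one year of data on 120 nodes.
    print(f"1 year uncompressed:  {raw_volume_tb(365):.1f} TB")
    print(f"120-node capacity:    {cluster_capacity_tb(120):.1f} TB")
```

Under these assumptions, 90 days of raw data is 135 TB against 76.8 TB of cluster storage, so a compression ratio of a little under 2x is enough for the 30 nodes to hold it.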
...
Huy
3 years, 7 months ago
https://aws.amazon.com/blogs/aws/data-compression-improvements-in-amazon-redshift/
upvoted 2 times
...
...
...
pk349
Most Recent 2 years, 1 month ago
B: I passed the test
upvoted 1 times
...
cloudlearnerhere
2 years, 7 months ago
Selected Answer: B
Correct answer is B, as the data can be stored in Redshift for 90 days for analyzing the past 3 months of user activities. Data older than 90 days can be moved to S3 and analyzed using Redshift Spectrum once a week. This provides the most cost-effective solution.
Using Amazon Redshift Spectrum, you can efficiently query and retrieve structured and semistructured data from files in Amazon S3 without having to load the data into Amazon Redshift tables. Redshift Spectrum queries employ massive parallelism to run very fast against large datasets. Much of the processing occurs in the Redshift Spectrum layer, and most of the data remains in Amazon S3. Multiple clusters can concurrently query the same dataset in Amazon S3 without the need to make copies of the data for each cluster.
Options A & D are wrong, as increasing the size of the Redshift cluster would not be cost-effective. Option C is wrong, as analyzing data using a persistent EMR cluster would not be cost-effective.
upvoted 4 times
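To make the hot/cold split in option B concrete, here is a small sketch of how one year of daily partitions would divide between Redshift and S3. The bucket name, the `event_date=` key layout, and the function names are all hypothetical; the question only says the Parquet data is "partitioned by date".

```python
from datetime import date, timedelta

HOT_DAYS = 90        # kept in Redshift (from the question)
HISTORY_DAYS = 365   # one year of history for the weekly analysis

# Hypothetical S3 location; not given anywhere in the question.
BUCKET_PREFIX = "s3://example-bucket/user-activity"


def partition_prefix(day: date) -> str:
    """Hive-style date partition path for one day of Parquet files."""
    return f"{BUCKET_PREFIX}/event_date={day.isoformat()}/"


def split_by_tier(today: date) -> tuple[list[date], list[date]]:
    """Return (hot, cold) days, newest first, for the last year.

    hot  -> most recent 90 days, stays in Redshift tables
    cold -> older days, stored in S3 and queried externally
    """
    hot, cold = [], []
    for age in range(HISTORY_DAYS):
        day = today - timedelta(days=age)
        (hot if age < HOT_DAYS else cold).append(day)
    return hot, cold


if __name__ == "__main__":
    hot, cold = split_by_tier(date.today())
    print(f"{len(hot)} days in Redshift, {len(cold)} days in S3")
    print("newest cold partition:", partition_prefix(cold[0]))
```

With date-based partitions like these, a query that filters on the date column only needs to scan the matching prefixes, which is one reason date partitioning suits a once-a-week analysis.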
...
rocky48
2 years, 10 months ago
Selected Answer: B
B is the right answer
upvoted 2 times
...
jrheen
3 years, 1 month ago
Answer - B
upvoted 2 times
...
aws2019
3 years, 6 months ago
B is right
upvoted 2 times
...
lostsoul07
3 years, 7 months ago
B is the right answer
upvoted 3 times
...
Paitan
3 years, 7 months ago
B for sure.
upvoted 3 times
...
testtaker3434
3 years, 8 months ago
Agree, it's B.
upvoted 3 times
...