
Exam AWS Certified Data Analytics - Specialty topic 1 question 23 discussion

A company wants to optimize the cost of its data and analytics platform. The company is ingesting a number of .csv and JSON files in Amazon S3 from various data sources. Incoming data is expected to be 50 GB each day. The company is using Amazon Athena to query the raw data in Amazon S3 directly. Most queries aggregate data from the past 12 months, and data that is older than 5 years is infrequently queried. The typical query scans about 500 MB of data and is expected to return results in less than 1 minute. The raw data must be retained indefinitely for compliance requirements.
Which solution meets the company's requirements?

  • A. Use an AWS Glue ETL job to compress, partition, and convert the data into a columnar data format. Use Athena to query the processed dataset. Configure a lifecycle policy to move the processed data into the Amazon S3 Standard-Infrequent Access (S3 Standard-IA) storage class 5 years after object creation. Configure a second lifecycle policy to move the raw data into Amazon S3 Glacier for long-term archival 7 days after object creation.
  • B. Use an AWS Glue ETL job to partition and convert the data into a row-based data format. Use Athena to query the processed dataset. Configure a lifecycle policy to move the data into the Amazon S3 Standard-Infrequent Access (S3 Standard-IA) storage class 5 years after object creation. Configure a second lifecycle policy to move the raw data into Amazon S3 Glacier for long-term archival 7 days after object creation.
  • C. Use an AWS Glue ETL job to compress, partition, and convert the data into a columnar data format. Use Athena to query the processed dataset. Configure a lifecycle policy to move the processed data into the Amazon S3 Standard-Infrequent Access (S3 Standard-IA) storage class 5 years after the object was last accessed. Configure a second lifecycle policy to move the raw data into Amazon S3 Glacier for long-term archival 7 days after the last date the object was accessed.
  • D. Use an AWS Glue ETL job to partition and convert the data into a row-based data format. Use Athena to query the processed dataset. Configure a lifecycle policy to move the data into the Amazon S3 Standard-Infrequent Access (S3 Standard-IA) storage class 5 years after the object was last accessed. Configure a second lifecycle policy to move the raw data into Amazon S3 Glacier for long-term archival 7 days after the last date the object was accessed.
Suggested Answer: A
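For illustration only, the two lifecycle rules described in option A could be expressed with boto3 roughly as in the sketch below; the bucket name and the raw/ and processed/ prefixes are hypothetical assumptions, not part of the question.

```python
# Sketch of the two lifecycle rules from option A (bucket name and prefixes are hypothetical).
import boto3

s3 = boto3.client("s3")
s3.put_bucket_lifecycle_configuration(
    Bucket="example-analytics-bucket",  # hypothetical bucket name
    LifecycleConfiguration={
        "Rules": [
            {
                # Processed (columnar) data: move to Standard-IA 5 years (1825 days) after creation
                "ID": "processed-to-standard-ia",
                "Filter": {"Prefix": "processed/"},
                "Status": "Enabled",
                "Transitions": [{"Days": 1825, "StorageClass": "STANDARD_IA"}],
            },
            {
                # Raw .csv/JSON data: archive to Glacier 7 days after creation, retained indefinitely
                "ID": "raw-to-glacier",
                "Filter": {"Prefix": "raw/"},
                "Status": "Enabled",
                "Transitions": [{"Days": 7, "StorageClass": "GLACier".upper()}],
            },
        ]
    },
)
```

Note that lifecycle transitions count days from object creation; S3 lifecycle rules cannot be triggered by last access, which is why options C and D are not viable.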

Comments

astalavista1
Highly Voted 3 years, 3 months ago
Selected Answer: A
Agree with answer A. C and D are eliminated because lifecycle policies are based on object creation, not last access. Compressing saves cost, and converting to a columnar data format improves query performance.
upvoted 11 times
...
cloudlearnerhere
Highly Voted 2 years, 8 months ago
Selected Answer: A
The correct answer is A, as a columnar data format stores data efficiently through column-wise compression and enables splitting and parallel processing. Storing processed data in S3 Standard-IA and moving raw data to Glacier helps reduce costs. Options B and D are wrong because a columnar data format is recommended for analytics workloads. Option C is wrong because lifecycle rules are based on the object creation date, not the date the object was last accessed. (A sketch of such a conversion job is shown below.)
upvoted 6 times
...
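As a rough illustration of the conversion step discussed in the comment above, a Glue ETL job that rewrites the raw data as compressed, partitioned Parquet might look like this sketch; the database, table, output path, and partition columns are hypothetical assumptions.

```python
# Sketch of a Glue ETL job converting raw .csv/JSON data to partitioned Parquet.
# Database, table, output path, and partition columns are hypothetical.
import sys
from awsglue.utils import getResolvedOptions
from awsglue.context import GlueContext
from awsglue.job import Job
from pyspark.context import SparkContext

args = getResolvedOptions(sys.argv, ["JOB_NAME"])
glue_context = GlueContext(SparkContext())
job = Job(glue_context)
job.init(args["JOB_NAME"], args)

# Read the raw data as cataloged by a Glue crawler (hypothetical database/table names)
raw = glue_context.create_dynamic_frame.from_catalog(
    database="raw_db", table_name="events_raw"
)

# Write Snappy-compressed Parquet, partitioned by date columns assumed to exist in the data
glue_context.write_dynamic_frame.from_options(
    frame=raw,
    connection_type="s3",
    connection_options={
        "path": "s3://example-analytics-bucket/processed/",
        "partitionKeys": ["year", "month", "day"],
    },
    format="parquet",
    format_options={"compression": "snappy"},
)
job.commit()
```

With the processed data partitioned and stored as Parquet, a typical 12-month aggregation in Athena can prune partitions and read only the columns it needs, which keeps scans near the 500 MB target described in the question.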
GLam123
Most Recent 1 year, 8 months ago
Selected Answer: A
Columnar format, with lifecycle rules based on object creation time.
upvoted 1 times
...
NikkyDicky
1 year, 11 months ago
Selected Answer: A
A makes sense.
upvoted 1 times
...
pk349
2 years, 2 months ago
A: I passed the test
upvoted 2 times
...
Arka_01
2 years, 10 months ago
Selected Answer: A
It should be based on object creation, not on object access.
upvoted 1 times
...
rocky48
3 years ago
Selected Answer: A
Answer is A
upvoted 1 times
...
ru4aws
3 years ago
Selected Answer: A
It should be 5 years after object creation to move processed data to Standard-IA, and 7 days after object creation to move raw data to Glacier. There is no point in counting days from "last accessed".
upvoted 2 times
...
dushmantha
3 years ago
Selected Answer: A
A columnar data format is a way of optimizing (eliminates B and D), and the lifecycle policy should be based on object creation (eliminates C). The answer is A.
upvoted 1 times
...
Bik000
3 years, 2 months ago
Selected Answer: A
Answer is A
upvoted 1 times
...
azi_2021
3 years, 3 months ago
The answer should be A.
upvoted 2 times
...
Community vote distribution: A (35%), C (25%), B (20%), Other