Exam AWS Certified Data Engineer - Associate DEA-C01 topic 1 question 85 discussion

Exam question from Amazon's AWS Certified Data Engineer - Associate DEA-C01

Question #: 85
Topic #: 1

An online retail company stores Application Load Balancer (ALB) access logs in an Amazon S3 bucket. The company wants to use Amazon Athena to query the logs to analyze traffic patterns.

A data engineer creates an unpartitioned table in Athena. As the amount of data gradually increases, the response time for queries also increases. The data engineer wants to improve the query performance in Athena.

Which solution will meet these requirements with the LEAST operational effort?

A. Create an AWS Glue job that determines the schema of all ALB access logs and writes the partition metadata to AWS Glue Data Catalog.
B. Create an AWS Glue crawler that includes a classifier that determines the schema of all ALB access logs and writes the partition metadata to AWS Glue Data Catalog.
C. Create an AWS Lambda function to transform all ALB access logs. Save the results to Amazon S3 in Apache Parquet format. Partition the metadata. Use Athena to query the transformed data.
D. Use Apache Hive to create bucketed tables. Use an AWS Lambda function to transform all ALB access logs.

Suggested Answer: B 🗳️

by tgv at June 15, 2024, 9:20 a.m.

Comments

PGGuy

Highly Voted 1 year ago

Selected Answer: B

Creating an AWS Glue crawler (Option B) is the most straightforward and least operationally intensive approach: the crawler automatically determines the schema, registers the partitions, and keeps the AWS Glue Data Catalog updated. Athena can then prune partitions at query time, without extensive manual management or additional processing steps.
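To make the partitioning idea concrete, here is a rough Python sketch of how partition values can be read off an ALB log object key, and how knowing them lets a query for one day skip every other object. It assumes the documented ALB key layout (`[prefix/]AWSLogs/<account-id>/elasticloadbalancing/<region>/<yyyy>/<mm>/<dd>/<file>`). The function names and the partition names `year`, `month`, `day` are hypothetical and chosen for illustration; this is not how the crawler or Athena is implemented internally.

```python
import re

# Assumed key layout (from the documented ALB access log naming scheme):
# [prefix/]AWSLogs/<account-id>/elasticloadbalancing/<region>/<yyyy>/<mm>/<dd>/<file>
_ALB_KEY = re.compile(
    r"AWSLogs/(?P<account>\d{12})/elasticloadbalancing/"
    r"(?P<region>[a-z0-9-]+)/(?P<year>\d{4})/(?P<month>\d{2})/(?P<day>\d{2})/"
)


def partition_values(key):
    """Return the partition values encoded in an S3 key, or None if no match."""
    match = _ALB_KEY.search(key)
    return match.groupdict() if match else None


def keys_for_day(keys, year, month, day):
    """Keep only the keys whose path places them in the requested day.

    This mimics partition pruning: objects outside the requested
    partition are never opened.
    """
    wanted = {"year": year, "month": month, "day": day}
    selected = []
    for key in keys:
        parts = partition_values(key)
        if parts and all(parts[name] == value for name, value in wanted.items()):
            selected.append(key)
    return selected
```

With an unpartitioned table there is no equivalent of `keys_for_day`, so every query reads every log object, which is why response time grows with the data.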

upvoted 5 times

andrologin

Most Recent 11 months, 3 weeks ago

Selected Answer: C

An AWS Glue crawler with classifiers lets you determine the schema of the files/data, which can then be used to partition the data for Athena query optimization.

upvoted 1 times

PGGuy

1 year ago

upvoted 2 times

tgv

1 year ago

Selected Answer: B

An AWS Glue crawler can automatically determine the schema of the logs, infer partitions, and update the Glue Data Catalog. Crawlers can be scheduled to run at intervals, minimizing manual intervention.
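As a minimal sketch of the scheduled crawler described above, the following Python builds the request for the boto3 Glue `create_crawler` call. The crawler name, role ARN, database name, and bucket path in the usage lines are placeholders, and the helper functions are hypothetical names used only for illustration. The daily schedule and the schema change policy are assumptions, not something stated in the question.

```python
def build_crawler_request(name, role_arn, database, s3_path,
                          schedule="cron(0 1 * * ? *)"):
    """Build keyword arguments for the Glue create_crawler API.

    The default schedule (daily at 01:00 UTC) is an assumed value.
    """
    return {
        "Name": name,
        "Role": role_arn,
        "DatabaseName": database,
        "Targets": {"S3Targets": [{"Path": s3_path}]},
        "Schedule": schedule,
        # Update the catalog table when the schema changes; only log
        # objects that have disappeared instead of deleting metadata.
        "SchemaChangePolicy": {
            "UpdateBehavior": "UPDATE_IN_DATABASE",
            "DeleteBehavior": "LOG",
        },
    }


def create_crawler(request):
    """Send the request to AWS Glue (needs boto3 and AWS credentials)."""
    import boto3  # imported here so the builder works without boto3

    glue = boto3.client("glue")
    glue.create_crawler(**request)


# Placeholder values, for illustration only.
request = build_crawler_request(
    name="alb-logs-crawler",
    role_arn="arn:aws:iam::123456789012:role/ExampleGlueCrawlerRole",
    database="alb_logs",
    s3_path="s3://example-bucket/AWSLogs/",
)
```

Calling `create_crawler(request)` once is the only setup step. After that, each scheduled run picks up new date prefixes and adds them to the Data Catalog as partitions, which is what keeps the operational effort low compared with a custom Glue job or Lambda function.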

upvoted 4 times
