Exam AWS Certified Data Engineer - Associate DEA-C01 topic 1 question 98 discussion

Exam question from Amazon's AWS Certified Data Engineer - Associate DEA-C01

Question #: 98
Topic #: 1

A company wants to use machine learning (ML) to perform analytics on data that is in an Amazon S3 data lake. The company has two data transformation requirements that will give consumers within the company the ability to create reports.

The company must perform daily transformations on 300 GB of data that is in a variety of formats and that must arrive in Amazon S3 at a scheduled time. The company must also perform one-time transformations of terabytes of archived data that is in the S3 data lake. The company uses Amazon Managed Workflows for Apache Airflow (Amazon MWAA) Directed Acyclic Graphs (DAGs) to orchestrate processing.

Which combination of tasks should the company schedule in the Amazon MWAA DAGs to meet these requirements MOST cost-effectively? (Choose two.)

A. For daily incoming data, use AWS Glue crawlers to scan and identify the schema.
B. For daily incoming data, use Amazon Athena to scan and identify the schema.
C. For daily incoming data, use Amazon Redshift to perform transformations.
D. For daily and archived data, use Amazon EMR to perform data transformations.
E. For archived data, use Amazon SageMaker to perform data transformations.

Suggested Answer: AD

by tgv at June 15, 2024, 10:46 a.m.

Comments

Ja13

Highly Voted 11 months, 1 week ago

A. For daily incoming data, use AWS Glue crawlers to scan and identify the schema.
D. For daily and archived data, use Amazon EMR to perform data transformations.

Here's why:

A. AWS Glue crawlers are well-suited for scanning and identifying the schema of data in S3. They are cost-effective and efficient for daily incoming data.
D. Amazon EMR is a cost-effective solution for performing large-scale data transformations. It can handle both the daily transformations of 300 GB of data and the one-time transformations of terabytes of archived data efficiently.
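To make the A + D combination more concrete, here is a minimal sketch of the two request bodies a DAG task might submit. It assumes the boto3 request shapes for `glue.create_crawler` and `emr.run_job_flow`; as far as I know, the Airflow Amazon provider's `GlueCrawlerOperator` (`config`) and `EmrCreateJobFlowOperator` (`job_flow_overrides`) accept dictionaries of the same shape. The helper names, bucket, role ARN, script path, release label, and instance types are all hypothetical and not taken from the question.

```python
# Sketch only: helper names and all argument values are hypothetical.

def build_crawler_config(name: str, role_arn: str, database: str, s3_path: str) -> dict:
    """Build a dict in the shape of a boto3 glue.create_crawler() request."""
    return {
        "Name": name,
        "Role": role_arn,
        "DatabaseName": database,
        # The crawler scans this S3 prefix and infers the schema of what it finds.
        "Targets": {"S3Targets": [{"Path": s3_path}]},
    }


def build_transient_emr_config(name: str, script_uri: str, core_nodes: int) -> dict:
    """Build a dict in the shape of a boto3 emr.run_job_flow() request."""
    return {
        "Name": name,
        "ReleaseLabel": "emr-6.15.0",  # example label, pick whatever your account uses
        "Applications": [{"Name": "Spark"}],
        "Instances": {
            "InstanceGroups": [
                {"Name": "Primary", "InstanceRole": "MASTER",
                 "InstanceType": "m5.xlarge", "InstanceCount": 1},
                {"Name": "Core", "InstanceRole": "CORE",
                 "InstanceType": "m5.xlarge", "InstanceCount": core_nodes},
            ],
            # The cluster shuts down once its steps finish, so it only
            # costs money while a transformation is actually running.
            "KeepJobFlowAliveWhenNoSteps": False,
            "TerminationProtected": False,
        },
        "Steps": [{
            "Name": "transform",
            "ActionOnFailure": "TERMINATE_CLUSTER",
            "HadoopJarStep": {
                "Jar": "command-runner.jar",
                "Args": ["spark-submit", "--deploy-mode", "cluster", script_uri],
            },
        }],
        "JobFlowRole": "EMR_EC2_DefaultRole",
        "ServiceRole": "EMR_DefaultRole",
    }


daily_crawler = build_crawler_config(
    "daily-crawler", "arn:aws:iam::123456789012:role/GlueRole",
    "lake_db", "s3://example-bucket/daily/")
daily_cluster = build_transient_emr_config(
    "daily-transform", "s3://example-bucket/scripts/transform.py", core_nodes=4)
```

The same EMR helper could be reused for the one-time archive run with a larger `core_nodes` value, which is one way to read "EMR for both daily and archived data".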

upvoted 5 times

andrologin

Most Recent 10 months, 2 weeks ago

Selected Answer: AD

Glue crawlers for identifying the schema, EMR to run batch processing on the data

upvoted 2 times

HunkyBunky

11 months, 1 week ago

A / D - Looks good to me

upvoted 1 time

Ja13

11 months, 1 week ago

Selected Answer: AD

According to ChatGPT

upvoted 2 times

tgv

11 months, 3 weeks ago

Selected Answer: AD

A. For daily incoming data, use AWS Glue crawlers to scan and identify the schema. This is cost-effective and simplifies the process of managing metadata.
D. For daily and archived data, use Amazon EMR to perform data transformations. EMR is suitable for both large-scale and regular transformations, offering flexibility and cost efficiency.
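As a rough illustration of how A and D might be ordered inside the MWAA DAGs, the sketch below models each pipeline as a dependency graph and resolves a run order with Python's standard `graphlib`. This is not Airflow code; the task names are hypothetical labels, and a real DAG would express the same edges between operators.

```python
from graphlib import TopologicalSorter

# Each key is a task; its value is the set of tasks that must finish first.
# Task names are hypothetical labels, not Airflow operator names.
daily_dag = {
    "crawl_daily_data": set(),                       # answer A: Glue crawler
    "create_emr_cluster": {"crawl_daily_data"},
    "run_daily_transform": {"create_emr_cluster"},   # answer D: EMR transform
    "terminate_emr_cluster": {"run_daily_transform"},
}

# The archived data is transformed once, so this graph would be triggered
# manually rather than on a schedule.
archive_dag = {
    "create_emr_cluster": set(),
    "run_archive_transform": {"create_emr_cluster"},  # answer D again
    "terminate_emr_cluster": {"run_archive_transform"},
}


def execution_order(dag: dict) -> list:
    """Return the tasks in an order that respects every dependency."""
    return list(TopologicalSorter(dag).static_order())


print(execution_order(daily_dag))
print(execution_order(archive_dag))
```

Terminating the cluster as the last task in each graph reflects the cost argument: EMR capacity exists only for the duration of the transformation.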

upvoted 3 times
