Exam AWS Certified Data Engineer - Associate DEA-C01 topic 1 question 98 discussion

A company wants to use machine learning (ML) to perform analytics on data that is in an Amazon S3 data lake. The company has two data transformation requirements that will give consumers within the company the ability to create reports.

The company must perform daily transformations on 300 GB of data that is in a variety of formats and that must arrive in Amazon S3 at a scheduled time. The company must also perform one-time transformations of terabytes of archived data that is in the S3 data lake. The company uses Amazon Managed Workflows for Apache Airflow (Amazon MWAA) Directed Acyclic Graphs (DAGs) to orchestrate processing.

Which combination of tasks should the company schedule in the Amazon MWAA DAGs to meet these requirements MOST cost-effectively? (Choose two.)

  • A. For daily incoming data, use AWS Glue crawlers to scan and identify the schema.
  • B. For daily incoming data, use Amazon Athena to scan and identify the schema.
  • C. For daily incoming data, use Amazon Redshift to perform transformations.
  • D. For daily and archived data, use Amazon EMR to perform data transformations.
  • E. For archived data, use Amazon SageMaker to perform data transformations.
Suggested Answer: AD

Comments

Ja13
Highly Voted 11 months, 1 week ago
A. For daily incoming data, use AWS Glue crawlers to scan and identify the schema.
D. For daily and archived data, use Amazon EMR to perform data transformations.

Here's why:

A. AWS Glue crawlers are well-suited for scanning and identifying the schema of data in S3. They are cost-effective and efficient for daily incoming data.
D. Amazon EMR is a cost-effective solution for performing large-scale data transformations. It can handle both the daily transformations of 300 GB of data and the one-time transformations of terabytes of archived data efficiently.
upvoted 5 times
...
andrologin
Most Recent 10 months, 2 weeks ago
Selected Answer: AD
Glue crawlers for identifying the schema, EMR to run batch processing on the data
upvoted 2 times
...
HunkyBunky
11 months, 1 week ago
A / D - Looks good to me
upvoted 1 time
...
Ja13
11 months, 1 week ago
Selected Answer: AD
According to ChatGPT
upvoted 2 times
...
tgv
11 months, 3 weeks ago
Selected Answer: AD
A. For daily incoming data, use AWS Glue crawlers to scan and identify the schema. This is cost-effective and simplifies the process of managing metadata.
D. For daily and archived data, use Amazon EMR to perform data transformations. EMR is suitable for both large-scale and regular transformations, offering flexibility and cost efficiency.
upvoted 3 times
...
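For anyone wondering what the A + D combination might look like inside an MWAA DAG, here is a rough sketch, not a reference implementation. It assumes Airflow 2.4+ with the Amazon provider package installed, an EMR cluster that is already running, and a Glue crawler that already exists. The DAG id, task ids, the `--run-date` argument, and the helper names `spark_step` and `build_daily_dag` are all made up for illustration; none of them come from the question.

```python
from datetime import datetime


def spark_step(name, script_uri, *script_args):
    """Build an EMR step definition that runs a PySpark script via spark-submit."""
    return {
        "Name": name,
        "ActionOnFailure": "CONTINUE",
        "HadoopJarStep": {
            # command-runner.jar lets an EMR step run a command such as spark-submit.
            "Jar": "command-runner.jar",
            "Args": ["spark-submit", "--deploy-mode", "cluster", script_uri, *script_args],
        },
    }


def build_daily_dag(cluster_id, crawler_name, script_uri):
    """Hypothetical daily DAG: crawl the new data (A), then transform it on EMR (D)."""
    # Imported lazily so spark_step() stays usable without Airflow installed.
    from airflow import DAG
    from airflow.providers.amazon.aws.operators.emr import EmrAddStepsOperator
    from airflow.providers.amazon.aws.operators.glue_crawler import GlueCrawlerOperator

    with DAG(
        dag_id="daily_transform",  # assumed name
        schedule="@daily",
        start_date=datetime(2024, 1, 1),
        catchup=False,
    ) as dag:
        # Option A: run an existing Glue crawler to refresh the schema.
        crawl = GlueCrawlerOperator(
            task_id="crawl_incoming",
            config={"Name": crawler_name},
        )
        # Option D: submit the transformation as a Spark step on EMR.
        transform = EmrAddStepsOperator(
            task_id="emr_transform",
            job_flow_id=cluster_id,
            steps=[spark_step("daily-transform", script_uri, "--run-date", "{{ ds }}")],
        )
        crawl >> transform
    return dag
```

In a real MWAA DAG file you would assign the result at module level (for example `dag = build_daily_dag(...)` with your own cluster id, crawler name, and script location) so the scheduler can discover it. The one-time archived-data job could reuse `spark_step` in a separate DAG with no schedule, triggered manually.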