Exam AWS Certified Data Engineer - Associate DEA-C01 topic 1 question 230 discussion

A sales company uses AWS Glue ETL to collect, process, and ingest data into an Amazon S3 bucket. The AWS Glue pipeline creates a new file in the S3 bucket every hour. File sizes vary from 200 KB to 300 KB. The company wants to build a sales prediction model by using data from the previous 5 years. The historic data includes 44,000 files.

The company builds a second AWS Glue ETL pipeline by using the smallest worker type. The second pipeline retrieves the historic files from the S3 bucket and processes the files for downstream analysis. The company notices significant performance issues with the second ETL pipeline.

The company needs to improve the performance of the second pipeline.

Which solution will meet this requirement MOST cost-effectively?

  • A. Use a larger worker type.
  • B. Increase the number of workers in the AWS Glue ETL jobs.
  • C. Use the AWS Glue DynamicFrame grouping option.
  • D. Enable AWS Glue auto scaling.
Suggested Answer: C

Comments

rdiaz
1 month, 2 weeks ago
Selected Answer: C
AWS Glue DynamicFrame grouping lets a job read multiple small files into larger in-memory partitions before processing.
  • When a job processes tens of thousands of small files (44,000 in this case), grouping improves performance by reducing per-file I/O overhead and cutting the number of Spark tasks.
  • This solution needs neither a larger worker type nor additional workers, so it does not increase cost and is the most cost-effective approach.
upvoted 1 times
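
To make the grouping option in the comment above concrete, here is a minimal sketch in Python. It builds the S3 connection options that turn on grouping (`groupFiles` and `groupSize` are the Glue option names; the bucket path, the 128 MiB target, and the helper names `grouping_options` and `estimated_groups` are assumptions for illustration) and roughly estimates how many read groups the 44,000 files would collapse into, assuming an average file size of 250 KB. The actual count depends on how the files are spread across S3 partitions.

```python
import math


def grouping_options(s3_path, group_size_bytes=128 * 1024 * 1024):
    """Build S3 connection options that ask Glue to group small files.

    The 128 MiB default is an illustrative choice, not a recommendation.
    """
    return {
        "paths": [s3_path],
        "recurse": True,
        "groupFiles": "inPartition",         # group files within each S3 partition
        "groupSize": str(group_size_bytes),  # target group size in bytes, as a string
    }


def estimated_groups(file_count, avg_file_bytes, group_size_bytes):
    """Rough lower bound on the number of groups, ignoring partition boundaries."""
    total_bytes = file_count * avg_file_bytes
    return math.ceil(total_bytes / group_size_bytes)


# Hypothetical bucket path; the question does not name one.
options = grouping_options("s3://example-sales-bucket/hourly/")

# 44,000 files at an assumed average of 250 KB each.
groups = estimated_groups(44_000, 250 * 1024, 128 * 1024 * 1024)
print(groups)

# Inside a Glue job (awsglue exists only in the Glue runtime), the options
# would be passed like this; the "json" format is an assumption:
#
# dyf = glue_context.create_dynamic_frame.from_options(
#     connection_type="s3",
#     connection_options=options,
#     format="json",
# )
```

Under these assumptions the job reads on the order of a hundred groups instead of 44,000 individual files, which is why grouping can help without changing the worker type or worker count.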
Community vote distribution: A (35%), C (25%), B (20%), Other