Exam AWS Certified Data Engineer - Associate DEA-C01 topic 1 question 92 discussion

A company has developed several AWS Glue extract, transform, and load (ETL) jobs to validate and transform data from Amazon S3. The ETL jobs load the data into Amazon RDS for MySQL in batches once every day. The ETL jobs use a DynamicFrame to read the S3 data.

The ETL jobs currently process all the data that is in the S3 bucket. However, the company wants the jobs to process only the daily incremental data.

Which solution will meet this requirement with the LEAST coding effort?

  • A. Create an ETL job that reads the S3 file status and logs the status in Amazon DynamoDB.
  • B. Enable job bookmarks for the ETL jobs to update the state after a run to keep track of previously processed data.
  • C. Enable job metrics for the ETL jobs to help keep track of processed objects in Amazon CloudWatch.
  • D. Configure the ETL jobs to delete processed objects from Amazon S3 after each run.
Suggested Answer: B

Comments

tgv
Highly Voted 10 months, 3 weeks ago
Selected Answer: B
AWS Glue job bookmarks are designed for incremental data processing: they automatically track state from previous job runs, so each run picks up only data that has not been processed yet.
upvoted 8 times
andrologin
Most Recent 9 months, 3 weeks ago
Selected Answer: B
AWS Glue job bookmarks record where data processing last stopped, which is what makes incremental processing possible.
upvoted 1 times
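
To make the "pin where processing last stopped" idea concrete, here is a toy sketch in plain Python. It is not Glue's implementation. It only assumes, as the Glue documentation describes for S3 sources, that objects are compared by last-modified time. The function name `select_new_objects` and the dictionary layout are hypothetical.

```python
from datetime import datetime, timezone


def select_new_objects(objects, bookmark):
    """Return (objects newer than the bookmark, updated bookmark).

    `objects` is a list of dicts with "key" and "last_modified" entries
    (hypothetical layout). `bookmark` is the newest timestamp seen by the
    previous run, or None on the first run.
    """
    new_objects = [
        obj for obj in objects
        if bookmark is None or obj["last_modified"] > bookmark
    ]
    if new_objects:
        # Advance the bookmark only when something new was processed.
        bookmark = max(obj["last_modified"] for obj in new_objects)
    return new_objects, bookmark


day1 = {"key": "data/day1.json",
        "last_modified": datetime(2024, 1, 1, tzinfo=timezone.utc)}
day2 = {"key": "data/day2.json",
        "last_modified": datetime(2024, 1, 2, tzinfo=timezone.utc)}

# First run: no bookmark yet, so everything is processed.
first_batch, state = select_new_objects([day1], None)
# Second run: only the object added since the first run is selected.
second_batch, state = select_new_objects([day1, day2], state)
```

With bookmarks enabled, Glue keeps this kind of state for you between runs, which is why option B needs so little code.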
HunkyBunky
10 months ago
Selected Answer: B
B - job bookmarks are the key.
upvoted 1 times
bakarys
10 months, 1 week ago
Selected Answer: B
The solution that meets this requirement with the least coding effort is Option B: enable job bookmarks for the ETL jobs to update the state after a run to keep track of previously processed data.
AWS Glue job bookmarks let ETL jobs keep track of data that was already processed during previous runs. With bookmarks enabled, the jobs skip the processed data and handle only the new, incremental data. The feature is designed specifically for this use case and requires minimal coding effort.
The other options either need more coding and operational effort or do not solve the problem. Option A would require creating a new ETL job and managing a DynamoDB table. Option C only publishes job metrics to CloudWatch; metrics report on a run but do not make the job skip data it has already processed. Option D would delete data from S3 after processing, which is undesirable if the original data needs to be retained.
Therefore, Option B is the most suitable solution.
upvoted 3 times
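
For anyone who wants to see what "enable job bookmarks" looks like in practice, here is a minimal sketch, not the company's actual job. It assumes a PySpark Glue job. The bucket path, the JSON format, the `read_incoming` context name, and the helper names `bookmark_arguments` and `run_job` are hypothetical. The `--job-bookmark-option` job parameter, the `transformation_ctx` argument, and the `job.init()` / `job.commit()` calls are the parts Glue relies on for bookmarks.

```python
import sys


def bookmark_arguments(enabled=True):
    """Job parameter that switches bookmarks on or off for a Glue job."""
    value = "job-bookmark-enable" if enabled else "job-bookmark-disable"
    return {"--job-bookmark-option": value}


def run_job(argv):
    """Body of the Glue script; only runs inside the Glue runtime."""
    # Imported here because these modules exist only in the Glue environment.
    from awsglue.context import GlueContext
    from awsglue.job import Job
    from awsglue.utils import getResolvedOptions
    from pyspark.context import SparkContext

    args = getResolvedOptions(argv, ["JOB_NAME"])
    glue_context = GlueContext(SparkContext.getOrCreate())
    job = Job(glue_context)
    job.init(args["JOB_NAME"], args)  # loads the bookmark state

    # transformation_ctx is the key under which bookmark state is stored
    # for this source. Path and format below are placeholders.
    frame = glue_context.create_dynamic_frame.from_options(
        connection_type="s3",
        connection_options={"paths": ["s3://example-bucket/incoming/"]},
        format="json",
        transformation_ctx="read_incoming",
    )

    # ... validate, transform, and write `frame` to RDS for MySQL here ...

    job.commit()  # persists the updated bookmark state after a successful run


# In the real Glue script you would end with: run_job(sys.argv)
```

The existing jobs already read S3 through a DynamicFrame, so the change is mostly configuration: set the job parameter and make sure each source read has a `transformation_ctx` and the script ends with `job.commit()`.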