Exam AWS Certified Machine Learning Engineer - Associate MLA-C01 topic 1 question 109 discussion

Exam question from Amazon's AWS Certified Machine Learning Engineer - Associate MLA-C01

Question #: 109
Topic #: 1


An ML engineer needs to merge and transform data from two sources to retrain an existing ML model. One data source consists of .csv files that are stored in an Amazon S3 bucket. Each .csv file consists of millions of records. The other data source is an Amazon Aurora DB cluster.

The result of the merge process must be written to a second S3 bucket. The ML engineer needs to perform this merge-and-transform task every week.

Which solution will meet these requirements with the LEAST operational overhead?

A. Create a transient Amazon EMR cluster every week. Use the cluster to run an Apache Spark job to merge and transform the data.
B. Create a weekly AWS Glue job that uses the Apache Spark engine. Use DynamicFrame native operations to merge and transform the data.
C. Create an AWS Lambda function that runs Apache Spark code every week to merge and transform the data. Configure the Lambda function to connect to the initial S3 bucket and the DB cluster.
D. Create an AWS Batch job that runs Apache Spark code on Amazon EC2 instances every week. Configure the Spark code to save the data from the EC2 instances to the second S3 bucket.


Suggested Answer: B 🗳️

by ygn4ei at March 20, 2025, 4:27 p.m.

Disclaimers:

- ExamTopics website is not related to, affiliated with, endorsed or authorized by Amazon.
- Trademarks, certification & product names are used for reference only and belong to Amazon.

Comments


AgboolaKun

2 weeks, 3 days ago

Selected Answer: B

The best solution is B: create a weekly AWS Glue job that uses the Apache Spark engine and DynamicFrame native operations to merge and transform the data. AWS Glue is a fully managed ETL service with built-in scheduling and native integration with S3 and Aurora, so it carries less operational overhead than creating a transient EMR cluster every week, running Spark code in a Lambda function, or running Spark on EC2 instances through AWS Batch.
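To illustrate the built-in scheduling point, here is a minimal sketch of how a weekly trigger for such a Glue job could be set up with boto3. The job name `merge-csv-aurora`, the helper names, and the Monday 03:00 UTC schedule are assumptions made for this example, not part of the question. The sketch only builds the request for the Glue `create_trigger` call; actually sending it requires AWS credentials and an existing Glue job.

```python
from typing import Any, Dict


def weekly_trigger_request(job_name: str, day: str = "MON", hour_utc: int = 3) -> Dict[str, Any]:
    """Build keyword arguments for Glue's create_trigger call (hypothetical helper)."""
    if not 0 <= hour_utc <= 23:
        raise ValueError("hour_utc must be between 0 and 23")
    return {
        "Name": f"{job_name}-weekly",
        "Type": "SCHEDULED",
        # Glue cron fields: minutes hours day-of-month month day-of-week year
        "Schedule": f"cron(0 {hour_utc} ? * {day} *)",
        "Actions": [{"JobName": job_name}],
        "StartOnCreation": True,
    }


def create_weekly_trigger(job_name: str) -> None:
    """Send the request to AWS Glue. Needs credentials and an existing job."""
    import boto3  # imported lazily so the builder above works without boto3 installed

    boto3.client("glue").create_trigger(**weekly_trigger_request(job_name))


if __name__ == "__main__":
    # "merge-csv-aurora" is an assumed job name used only for illustration.
    print(weekly_trigger_request("merge-csv-aurora"))
```

With the trigger in place there is no cluster to create or tear down each week, which is the main operational difference from option A.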

upvoted 1 time

...
