Exam AWS Certified Machine Learning - Specialty topic 1 question 213 discussion

Exam question from Amazon's AWS Certified Machine Learning - Specialty

Question #: 213
Topic #: 1

A company is building a pipeline that periodically retrains its machine learning (ML) models by using new streaming data from devices. The company's data engineering team wants to build a data ingestion system that has high throughput, durable storage, and scalability. The company can tolerate up to 5 minutes of latency for data ingestion. The company needs a solution that can apply basic data transformation during the ingestion process.

Which solution will meet these requirements with the MOST operational efficiency?

A. Configure the devices to send streaming data to an Amazon Kinesis data stream. Configure an Amazon Kinesis Data Firehose delivery stream to automatically consume the Kinesis data stream, transform the data with an AWS Lambda function, and save the output into an Amazon S3 bucket.
B. Configure the devices to send streaming data to an Amazon S3 bucket. Configure an AWS Lambda function that is invoked by S3 event notifications to transform the data and load the data into an Amazon Kinesis data stream. Configure an Amazon Kinesis Data Firehose delivery stream to automatically consume the Kinesis data stream and load the output back into the S3 bucket.
C. Configure the devices to send streaming data to an Amazon S3 bucket. Configure an AWS Glue job that is invoked by S3 event notifications to read the data, transform the data, and load the output into a new S3 bucket.
D. Configure the devices to send streaming data to an Amazon Kinesis Data Firehose delivery stream. Configure an AWS Glue job that connects to the delivery stream to transform the data and load the output into an Amazon S3 bucket.

Suggested Answer: A

by wolfsong at Feb. 18, 2023, 3:21 p.m.

Comments

loict

9 months, 3 weeks ago

Selected Answer: A

A. YES - Kinesis/Kafka acts as a buffer for ingestion; Firehose integrates well with Lambda (transformation) and S3 (storage).
B. NO - there is no point in saving the data twice in S3 (raw and transformed).
C. NO - since we do single-record transformation, Glue/Spark is overkill.
D. NO - since we do single-record transformation, Glue/Spark is overkill; further, we can reasonably expect devices to produce Kafka events, but deploying a Firehose client API seems complicated.

upvoted 1 times

Mickey321

10 months, 2 weeks ago

Selected Answer: A

answer A

upvoted 1 times

ADVIT

11 months, 3 weeks ago

Selected Answer: A

AWS Glue cannot read from Kinesis Data Firehose, only from Kinesis Data Streams, so it's not D. https://docs.aws.amazon.com/glue/latest/dg/add-job-streaming.html

upvoted 1 times

DS2021

1 year, 3 months ago

Selected Answer: D

It is D

upvoted 1 times

avland

1 year, 2 months ago

Glue can't read from Firehose. It's A.

upvoted 1 times

DS2021

1 year, 3 months ago

It is C

upvoted 1 times

ParkXD

1 year, 2 months ago

Option C uses AWS Glue, which can perform data transformation and load data into S3 buckets. However, Glue may not be the most efficient option for this use case, as it requires setting up a Glue job, which can introduce additional latency.

upvoted 3 times

ParkXD

1 year, 2 months ago

Option A uses an Amazon Kinesis data stream, which is built for high throughput, durable storage, and scalability.
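To make the device-to-stream hop concrete, here is a minimal sketch of how a producer might batch readings for the Kinesis `put_records` call, which accepts at most 500 records per request. The reading fields (`device_id`, `value`) and the stream name are assumptions for illustration, not part of the question.

```python
import json

MAX_RECORDS_PER_CALL = 500  # Kinesis put_records limit per request


def to_put_records_batches(readings):
    """Yield lists of put_records entries, at most 500 entries per list."""
    batch = []
    for reading in readings:
        batch.append({
            "Data": json.dumps(reading).encode("utf-8"),
            # Partitioning by device keeps each device's records on one shard.
            "PartitionKey": str(reading["device_id"]),
        })
        if len(batch) == MAX_RECORDS_PER_CALL:
            yield batch
            batch = []
    if batch:
        yield batch


def send(readings, stream_name="device-stream"):  # hypothetical stream name
    """Send all readings; returns the total count of records Kinesis rejected."""
    import boto3  # imported lazily so the batching logic runs without AWS access

    kinesis = boto3.client("kinesis")
    failed = 0
    for batch in to_put_records_batches(readings):
        response = kinesis.put_records(StreamName=stream_name, Records=batch)
        failed += response["FailedRecordCount"]
    return failed
```

A real producer would retry the entries that come back as failed rather than only counting them.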

upvoted 1 times

sevosevo

1 year, 3 months ago

Why not C?

upvoted 1 times

Valcilio

1 year, 3 months ago

Selected Answer: A

Firehose's buffering interval can be kept at or below 5 minutes (300 seconds is the default), so it fits the latency limit and is the best solution for the transformations.
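As a rough sketch of what that looks like in configuration, the function below builds the parameters for the boto3 `create_delivery_stream` call with a Kinesis data stream as the source, a Lambda processor, and an S3 destination. All ARNs and the stream name are placeholders, and the 300-second cap is this question's latency budget, not a Firehose limit.

```python
MAX_LATENCY_SECONDS = 300  # the question's 5-minute ingestion budget


def build_delivery_stream_params(name, stream_arn, bucket_arn, lambda_arn,
                                 role_arn, buffer_seconds=300, buffer_mb=5):
    """Return keyword arguments for firehose.create_delivery_stream."""
    if buffer_seconds > MAX_LATENCY_SECONDS:
        raise ValueError("buffer interval exceeds the 5-minute latency budget")
    return {
        "DeliveryStreamName": name,
        "DeliveryStreamType": "KinesisStreamAsSource",
        "KinesisStreamSourceConfiguration": {
            "KinesisStreamARN": stream_arn,
            "RoleARN": role_arn,
        },
        "ExtendedS3DestinationConfiguration": {
            "RoleARN": role_arn,
            "BucketARN": bucket_arn,
            # Firehose flushes to S3 when either threshold is reached first.
            "BufferingHints": {
                "SizeInMBs": buffer_mb,
                "IntervalInSeconds": buffer_seconds,
            },
            "ProcessingConfiguration": {
                "Enabled": True,
                "Processors": [{
                    "Type": "Lambda",
                    "Parameters": [{
                        "ParameterName": "LambdaArn",
                        "ParameterValue": lambda_arn,
                    }],
                }],
            },
        },
    }


def create_delivery_stream(params):
    """Create the stream; needs AWS credentials, so boto3 is imported lazily."""
    import boto3

    return boto3.client("firehose").create_delivery_stream(**params)
```

The Lambda transform adds its own processing time on top of the buffer interval, so end-to-end latency is worth measuring rather than assuming.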

upvoted 2 times

GiyeonShin

1 year, 4 months ago

Selected Answer: A

A general architecture for ingesting and processing data in (near) real time: Kinesis Data Streams -> Kinesis Data Firehose -> (Lambda, if ETL is needed) -> S3 (or Redshift, ...)

upvoted 2 times

AjoseO

1 year, 4 months ago

Selected Answer: A

This solution provides a highly scalable and efficient way to ingest streaming data from devices with high throughput and durable storage by using Amazon Kinesis Data Streams and Amazon Kinesis Data Firehose. By configuring an AWS Lambda function to transform the data during the ingestion process, the solution also applies basic data transformation with low latency. Additionally, Amazon S3 provides highly durable and scalable storage for the transformed data, which can be easily accessed by downstream processes such as machine learning model training.

upvoted 2 times

wolfsong

1 year, 4 months ago

A: https://docs.aws.amazon.com/firehose/latest/dev/data-transformation.html
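For reference, a minimal sketch of the kind of Lambda function that page describes: Firehose passes a batch of base64-encoded records, and the function must return every `recordId` with a `result` of `Ok`, `Dropped`, or `ProcessingFailed`. The Fahrenheit-to-Celsius conversion and the field names `temp_f` and `temp_c` are made up here to stand in for the "basic transformation".

```python
import base64
import json


def lambda_handler(event, context):
    """Firehose transformation handler: one output record per input record."""
    output = []
    for record in event["records"]:
        try:
            payload = json.loads(base64.b64decode(record["data"]))
            # Hypothetical transformation: convert a Fahrenheit field to Celsius.
            payload["temp_c"] = round((payload.pop("temp_f") - 32) * 5 / 9, 2)
            # A trailing newline keeps the S3 objects line-delimited.
            line = (json.dumps(payload) + "\n").encode("utf-8")
            output.append({
                "recordId": record["recordId"],
                "result": "Ok",
                "data": base64.b64encode(line).decode("utf-8"),
            })
        except (ValueError, KeyError, TypeError, AttributeError):
            # Hand back the untouched data and mark the record as failed.
            output.append({
                "recordId": record["recordId"],
                "result": "ProcessingFailed",
                "data": record["data"],
            })
    return {"records": output}
```

Records marked `ProcessingFailed` are treated by Firehose as unsuccessfully processed, so one malformed device message does not block the rest of the batch.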

upvoted 2 times
