Exam AWS Certified Machine Learning - Specialty topic 1 question 213 discussion

A company is building a pipeline that periodically retrains its machine learning (ML) models by using new streaming data from devices. The company's data engineering team wants to build a data ingestion system that has high throughput, durable storage, and scalability. The company can tolerate up to 5 minutes of latency for data ingestion. The company needs a solution that can apply basic data transformation during the ingestion process.

Which solution will meet these requirements with the MOST operational efficiency?

  • A. Configure the devices to send streaming data to an Amazon Kinesis data stream. Configure an Amazon Kinesis Data Firehose delivery stream to automatically consume the Kinesis data stream, transform the data with an AWS Lambda function, and save the output into an Amazon S3 bucket.
  • B. Configure the devices to send streaming data to an Amazon S3 bucket. Configure an AWS Lambda function that is invoked by S3 event notifications to transform the data and load the data into an Amazon Kinesis data stream. Configure an Amazon Kinesis Data Firehose delivery stream to automatically consume the Kinesis data stream and load the output back into the S3 bucket.
  • C. Configure the devices to send streaming data to an Amazon S3 bucket. Configure an AWS Glue job that is invoked by S3 event notifications to read the data, transform the data, and load the output into a new S3 bucket.
  • D. Configure the devices to send streaming data to an Amazon Kinesis Data Firehose delivery stream. Configure an AWS Glue job that connects to the delivery stream to transform the data and load the output into an Amazon S3 bucket.
Suggested Answer: A
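
The Lambda transformation step in option A can be sketched as follows. This is a minimal illustration, not the exam's reference solution: the `transform` function and the field names `id`, `temp_f`, `device_id`, and `temp_c` are hypothetical. The event and response shapes (`records`, `recordId`, `result`, base64-encoded `data`) follow the Firehose data transformation contract.

```python
import base64
import json


def transform(payload: dict) -> dict:
    """Hypothetical per-record transformation: rename a field, convert units."""
    return {
        "device_id": payload["id"],
        "temp_c": round((payload["temp_f"] - 32) * 5 / 9, 2),
    }


def lambda_handler(event, context):
    """Transform each buffered Firehose record and report a per-record status."""
    output = []
    for record in event["records"]:
        try:
            payload = json.loads(base64.b64decode(record["data"]))
            # Newline-delimited JSON keeps the objects in S3 easy to split.
            line = json.dumps(transform(payload)) + "\n"
            output.append({
                "recordId": record["recordId"],
                "result": "Ok",
                "data": base64.b64encode(line.encode("utf-8")).decode("utf-8"),
            })
        except (KeyError, TypeError, ValueError):
            # Hand the original data back unchanged and flag the record as failed.
            output.append({
                "recordId": record["recordId"],
                "result": "ProcessingFailed",
                "data": record["data"],
            })
    return {"records": output}
```

Every incoming `recordId` must appear in the response, which is why failed records are returned with a `ProcessingFailed` status instead of being skipped.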

Comments

loict
8 months ago
Selected Answer: A
A. YES - Kinesis/Kafka acts as a buffer for ingestion; Firehose integrates well with Lambda (transformation) and S3 (storage).
B. NO - there is no point in saving the data twice in S3 (raw and transformed).
C. NO - since we do single-record transformation, Glue/Spark is overkill.
D. NO - since we do single-record transformation, Glue/Spark is overkill; further, we can reasonably expect devices to produce Kafka events, but deploying a Firehose client API seems complicated.
upvoted 1 times
...
Mickey321
8 months, 4 weeks ago
Selected Answer: A
answer A
upvoted 1 times
...
ADVIT
10 months, 1 week ago
Selected Answer: A
AWS Glue cannot read from Kinesis Data Firehose, only from Kinesis Data Streams, so it's not D. https://docs.aws.amazon.com/glue/latest/dg/add-job-streaming.html
upvoted 1 times
...
DS2021
1 year, 1 month ago
Selected Answer: D
It is D
upvoted 1 times
avland
1 year ago
Glue can't read from Firehose. It's A.
upvoted 1 times
...
...
DS2021
1 year, 1 month ago
It is C
upvoted 1 times
ParkXD
1 year, 1 month ago
Option C uses AWS Glue, which can perform data transformation and load data into S3 buckets. However, Glue may not be the most efficient option for this use case, as it requires setting up a Glue job, which can introduce additional latency.
upvoted 3 times
ParkXD
1 year, 1 month ago
Option A uses an Amazon Kinesis data stream, which is optimized for high throughput, durable storage, and scalability.
upvoted 1 times
...
...
...
sevosevo
1 year, 1 month ago
Why not C?
upvoted 1 times
...
Valcilio
1 year, 2 months ago
Selected Answer: A
Firehose's buffering interval can be set to stay within the 5-minute latency limit, so it is the best fit for the transformation.
upvoted 2 times
...
GiyeonShin
1 year, 2 months ago
Selected Answer: A
A general architecture for ingesting and processing data in (near) real time: Kinesis Data Streams -> Kinesis Data Firehose -> (Lambda, if ETL is needed) -> S3 (or Redshift, ...)
upvoted 2 times
...
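
The Kinesis Data Streams -> Firehose -> Lambda -> S3 layout from the comment above might be wired up roughly like this. It is a sketch under assumptions: the helper name `delivery_stream_config`, the 60-second default, the 5 MB buffer size, and the single shared role are illustrative choices, and every ARN passed in is a placeholder. The dictionary keys are the parameters of boto3's Firehose `create_delivery_stream` call.

```python
MAX_LATENCY_SECONDS = 300  # the 5-minute ingestion latency the question tolerates


def delivery_stream_config(name, stream_arn, bucket_arn, lambda_arn, role_arn,
                           buffer_seconds=60):
    """Build keyword arguments for boto3's firehose create_delivery_stream."""
    if buffer_seconds > MAX_LATENCY_SECONDS:
        raise ValueError("buffering interval exceeds the 5-minute latency budget")
    return {
        "DeliveryStreamName": name,
        # Firehose consumes from an existing Kinesis data stream.
        "DeliveryStreamType": "KinesisStreamAsSource",
        "KinesisStreamSourceConfiguration": {
            "KinesisStreamARN": stream_arn,
            "RoleARN": role_arn,
        },
        "ExtendedS3DestinationConfiguration": {
            "RoleARN": role_arn,
            "BucketARN": bucket_arn,
            # Firehose flushes when either the size or the interval is reached.
            "BufferingHints": {"SizeInMBs": 5, "IntervalInSeconds": buffer_seconds},
            # Route each buffered batch through the transformation Lambda.
            "ProcessingConfiguration": {
                "Enabled": True,
                "Processors": [{
                    "Type": "Lambda",
                    "Parameters": [
                        {"ParameterName": "LambdaArn", "ParameterValue": lambda_arn},
                    ],
                }],
            },
        },
    }
```

With real ARNs and suitable IAM permissions, the result could be passed as `boto3.client("firehose").create_delivery_stream(**config)`.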
AjoseO
1 year, 2 months ago
Selected Answer: A
This solution provides a highly scalable and efficient way to ingest streaming data from devices with high throughput and durable storage by using Amazon Kinesis Data Streams and Amazon Kinesis Data Firehose. By configuring an AWS Lambda function to transform the data during the ingestion process, the solution also applies basic data transformation with low latency. Additionally, Amazon S3 provides highly durable and scalable storage for the transformed data, which can be easily accessed by downstream processes such as machine learning model training.
upvoted 2 times
...
wolfsong
1 year, 2 months ago
A: https://docs.aws.amazon.com/firehose/latest/dev/data-transformation.html
upvoted 2 times
...