Exam AWS Certified Machine Learning - Specialty topic 1 question 62 discussion

A Data Scientist wants to gain real-time insights into a data stream of GZIP files.
Which solution would allow the use of SQL to query the stream with the LEAST latency?

  • A. Amazon Kinesis Data Analytics with an AWS Lambda function to transform the data.
  • B. AWS Glue with a custom ETL script to transform the data.
  • C. An Amazon Kinesis Client Library to transform the data and save it to an Amazon ES cluster.
  • D. Amazon Kinesis Data Firehose to transform the data and put it into an Amazon S3 bucket.
Suggested Answer: A

Comments

cybe001
Highly Voted 3 years ago
A is correct. Kinesis Data Analytics can use a Lambda function to decompress the GZIP data and then run SQL on the converted records. https://aws.amazon.com/about-aws/whats-new/2017/10/amazon-kinesis-analytics-can-now-pre-process-data-prior-to-running-sql-queries/
upvoted 44 times
...
VB
Highly Voted 3 years ago
A is correct: https://aws.amazon.com/about-aws/whats-new/2017/10/amazon-kinesis-analytics-can-now-pre-process-data-prior-to-running-sql-queries/ "To get started, simply select an AWS Lambda function from the Kinesis Analytics application source page in the AWS Management console. Your Kinesis Analytics application will automatically process your raw data records using the Lambda function, and send transformed data to your SQL code for further processing. Kinesis Analytics provides Lambda blueprints for common use cases like converting GZIP ..."
upvoted 17 times
...
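For anyone wondering what the Lambda pre-processing step described above might look like, here is a minimal sketch in Python. It assumes the Kinesis Data Analytics pre-processing contract, where each incoming record carries a `recordId` and base64-encoded `data`, and each returned record carries `recordId`, `result` and base64-encoded `data`. It is not the AWS-provided blueprint, just an illustration; the handler name `lambda_handler` is an arbitrary choice.

```python
import base64
import gzip
import zlib


def lambda_handler(event, context):
    """Decompress GZIP records so the SQL code downstream sees plain text.

    Sketch only: assumes each record's payload is one complete GZIP member.
    """
    output = []
    for record in event["records"]:
        try:
            # Record data arrives base64-encoded; decode, then gunzip.
            compressed = base64.b64decode(record["data"])
            payload = gzip.decompress(compressed)
            output.append({
                "recordId": record["recordId"],
                "result": "Ok",
                "data": base64.b64encode(payload).decode("utf-8"),
            })
        except (OSError, EOFError, ValueError, zlib.error):
            # Not valid GZIP (or bad base64): flag the record, pass data back unchanged.
            output.append({
                "recordId": record["recordId"],
                "result": "ProcessingFailed",
                "data": record["data"],
            })
    return {"records": output}
```

The decompression happens inside the stream-processing path, so the SQL runs on records as they arrive rather than on files that have already landed in storage.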
ef12052
Most Recent 1 month ago
Selected Answer: A
Use Amazon Kinesis Data Analytics if you need SQL-based processing and advanced analytics capabilities for streaming data. Use Amazon Kinesis Data Firehose if your primary requirement is to deliver, transform, and load streaming data into various AWS destinations with simplified configurations, but not for SQL-based processing.
upvoted 1 times
...
Denise123
8 months, 2 weeks ago
Selected Answer: D
If gaining real-time insights involves complex analytics or custom processing, Amazon Kinesis Data Analytics with AWS Lambda is likely a more suitable choice. If the requirements can be met with simpler data transformations, Amazon Kinesis Data Firehose might provide a more straightforward and potentially lower-latency solution. In other words, if this data is in GZIP files and the processing requirements are relatively simple, Amazon Kinesis Data Firehose might be a more straightforward and efficient choice. GZIP files typically contain compressed data, and if our primary objective is to ingest, transform, and load this data into other AWS services for real-time insights, Kinesis Data Firehose provides a managed and streamlined solution that can handle GZIP compression.
upvoted 1 times
Denise123
8 months, 2 weeks ago
The answer could be A; please comment if you have more clarity. I missed the SQL requirement in the question. After searching more, I also found the following: Use Amazon Kinesis Data Analytics if you need SQL-based processing and advanced analytics capabilities for streaming data. Use Amazon Kinesis Data Firehose if your primary requirement is to deliver, transform, and load streaming data into various AWS destinations with simplified configurations, but not for SQL-based processing.
upvoted 1 times
...
...
Selected Answer: A
A is correct, why D xiyarsan sen?
upvoted 1 times
...
Mickey321
1 year, 2 months ago
Selected Answer: A
A is correct
upvoted 1 times
...
kaike_reis
1 year, 3 months ago
Selected Answer: A
"allow the use of SQL to query the stream with the LEAST latency?" Well, the only solution that offers SQL queries is (A). It's a description of KDA.
upvoted 2 times
...
Nadia0012
1 year, 7 months ago
Selected Answer: A
The term "LEAST latency" is the hidden point. With Glue we can get near real-time, but Kinesis Data Analytics gives you real-time transformation with an internal Lambda.
upvoted 3 times
...
Valcilio
1 year, 7 months ago
Selected Answer: A
A is correct. With KDA you can run SQL queries on the data while it is streaming (real-time SQL queries).
upvoted 2 times
...
bakarys
1 year, 8 months ago
Selected Answer: D
D. Amazon Kinesis Data Firehose to transform the data and put it into an Amazon S3 bucket would be the best solution for allowing the use of SQL to query the stream with the least latency. Amazon Kinesis Data Firehose can be configured to transform the data before writing it to Amazon S3 in real-time. Once the data is in S3, it can be queried using SQL with Amazon Athena, which is a serverless query service that allows running standard SQL queries against data stored in Amazon S3. This approach provides the lowest latency compared to other options and requires minimal setup and maintenance.
upvoted 3 times
akgarg00
11 months ago
The query has to run on the stream itself, so Firehose is not an option.
upvoted 1 times
...
...
OssamaAbdelatif
1 year, 11 months ago
Selected Answer: A
A is correct.
upvoted 1 times
...
AddiWei
2 years, 8 months ago
And somehow "transformation" is added to the answer as a requirement when it clearly was not part of the requirement from the question.
upvoted 2 times
...
apprehensive_scar
2 years, 9 months ago
AAAAAAA
upvoted 1 times
...
HalloSpencer
2 years, 12 months ago
what about "LEAST latency"?
upvoted 4 times
...
Erso
3 years ago
A is correct. You can pre-process data prior to running SQL queries with Kinesis Data Analytics, and Lambda (more or less) is always a best practice :)
upvoted 3 times
...
JayK
3 years, 1 month ago
Answer is B. Kinesis Data Analytics does not do any transformation, it is only for querying. Glue ETL can have scripts that can transform the data
upvoted 2 times
SophieSu
2 years, 12 months ago
so you need lambda
upvoted 1 times
...
am7
3 years, 1 month ago
But we need to run SQL on real time stream data.
upvoted 1 times
...
...