A Machine Learning Specialist needs to be able to ingest streaming data and store it in Apache Parquet files for exploration and analysis. Which of the following services would both ingest and store this data in the correct format?
The answer is C. The main point of the question is the transformation to Parquet format, which is done by Kinesis Data Firehose, not Kinesis Data Streams. As for storage, Kinesis Data Streams retains data only for a limited period (24 hours by default), so it does not serve the purpose here.
Amazon Kinesis Data Firehose is a fully managed service that can automatically load streaming data into data stores and analytics tools.
It can ingest real-time streaming data such as application logs, website clickstreams, and IoT telemetry data, and then store it in the correct format, such as Apache Parquet files, for exploration and analysis.
This makes it a suitable option for the requirement described in the question.
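To make the ingestion side concrete, here is a minimal sketch of how a producer might package JSON events for Firehose. The helper name `to_firehose_batches` is made up for illustration; the only API facts it relies on are that `PutRecordBatch` takes a list of `{"Data": bytes}` records and accepts at most 500 records per call.

```python
import json
from typing import Dict, Iterable, Iterator, List

# PutRecordBatch accepts at most 500 records per call.
MAX_RECORDS_PER_BATCH = 500


def to_firehose_batches(
    events: Iterable[dict], batch_size: int = MAX_RECORDS_PER_BATCH
) -> Iterator[List[Dict[str, bytes]]]:
    """Hypothetical helper: encode events as JSON and group them into batches."""
    batch: List[Dict[str, bytes]] = []
    for event in events:
        # Each Firehose record carries its payload as bytes under the "Data" key.
        batch.append({"Data": json.dumps(event).encode("utf-8")})
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:
        # Emit the final, possibly smaller, batch.
        yield batch
```

Each batch could then be sent with something like `boto3.client("firehose").put_record_batch(DeliveryStreamName=..., Records=batch)`, assuming boto3 and AWS credentials are set up; the sketch itself makes no AWS calls.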
B) Only Amazon Kinesis Data Streams can store and ingest data. We don't need to apply any transformation: the question asks to ingest and store data in Apache Parquet format, and nothing says the incoming data is in a format other than Parquet.
It appears everyone agrees that the answer is between Firehose and Analytics. Data Streams handles things like event data and clickstreams. It is not concerned with any particular format; the focus is speed. The question did not mention transformation, only ingestion. Kinesis Firehose is used for ingestion. Both Firehose and Analytics can store; only Firehose can ingest. https://docs.aws.amazon.com/firehose/latest/dev/record-format-conversion.html shows that Firehose can store Parquet to S3.
Amazon Kinesis Data Firehose can convert the format of your input data from JSON to Apache Parquet or Apache ORC before storing the data in Amazon S3.
https://github.com/awsdocs/amazon-kinesis-data-firehose-developer-guide/blob/master/doc_source/record-format-conversion.md
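For anyone who wants to see what the JSON-to-Parquet conversion looks like in practice, below is a rough sketch of the request parameters for a delivery stream with record format conversion enabled. The function name and its arguments are hypothetical, the buffering and compression values are just example choices, and it assumes a table describing the record schema already exists in the AWS Glue Data Catalog (format conversion reads its schema from there). Check the field names against the linked documentation before relying on them.

```python
def parquet_delivery_stream_kwargs(
    stream_name: str,
    bucket_arn: str,
    role_arn: str,
    glue_database: str,
    glue_table: str,
) -> dict:
    """Hypothetical helper: build parameters for a JSON-to-Parquet delivery stream."""
    return {
        "DeliveryStreamName": stream_name,
        "DeliveryStreamType": "DirectPut",  # producers write straight to Firehose
        "ExtendedS3DestinationConfiguration": {
            "RoleARN": role_arn,
            "BucketARN": bucket_arn,
            # Example values; larger buffers generally suit columnar output.
            "BufferingHints": {"SizeInMBs": 128, "IntervalInSeconds": 300},
            # Compression is configured on the Parquet serializer instead.
            "CompressionFormat": "UNCOMPRESSED",
            "DataFormatConversionConfiguration": {
                "Enabled": True,
                # Assumption: this Glue table already defines the schema.
                "SchemaConfiguration": {
                    "RoleARN": role_arn,
                    "DatabaseName": glue_database,
                    "TableName": glue_table,
                    "VersionId": "LATEST",
                },
                # Incoming records are parsed as JSON...
                "InputFormatConfiguration": {
                    "Deserializer": {"OpenXJsonSerDe": {}}
                },
                # ...and written to S3 as Parquet.
                "OutputFormatConfiguration": {
                    "Serializer": {"ParquetSerDe": {"Compression": "SNAPPY"}}
                },
            },
        },
    }
```

The resulting dictionary is meant to be passed as keyword arguments, for example `boto3.client("firehose").create_delivery_stream(**kwargs)`, assuming boto3 is installed and the IAM role has access to the bucket and the Glue table.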