Exam AWS Certified Data Analytics - Specialty topic 1 question 136 discussion

A web retail company wants to implement a near-real-time clickstream analytics solution. The company wants to analyze the data with an open-source package.
The analytics application will process the raw data only once, but other applications will need immediate access to the raw data for up to 1 year.
Which solution meets these requirements with the LEAST amount of operational effort?

  • A. Use Amazon Kinesis Data Streams to collect the data. Use Amazon EMR with Apache Flink to consume and process the data from the Kinesis data stream. Set the retention period of the Kinesis data stream to 8.760 hours.
  • B. Use Amazon Kinesis Data Streams to collect the data. Use Amazon Kinesis Data Analytics with Apache Flink to process the data in real time. Set the retention period of the Kinesis data stream to 8,760 hours.
  • C. Use Amazon Managed Streaming for Apache Kafka (Amazon MSK) to collect the data. Use Amazon EMR with Apache Flink to consume and process the data from the Amazon MSK stream. Set the log retention hours to 8,760.
  • D. Use Amazon Kinesis Data Streams to collect the data. Use Amazon EMR with Apache Flink to consume and process the data from the Kinesis data stream. Create an Amazon Kinesis Data Firehose delivery stream to store the data in Amazon S3. Set an S3 Lifecycle policy to delete the data after 365 days.
Suggested Answer: B 🗳️

Comments

rb39
Highly Voted 3 years, 1 month ago
Selected Answer: D
store data in S3 not in Kinesis
upvoted 10 times
MWL
3 years ago
The question requires storing the "raw data" for 1 year, not the "processed data".
upvoted 8 times
...
...
astalavista1
Highly Voted 3 years, 1 month ago
Selected Answer: D
A & B out as you cannot store data for more than 7 days in KDS. C - Possibly but won't be cost-effective. D - Cost-Effective with least amount of operational overhead.
upvoted 8 times
astalavista1
3 years, 1 month ago
Plus C only stores the log data, not the data itself.
upvoted 1 times
...
jrheen
3 years ago
KDS can store 1 Year data: https://aws.amazon.com/blogs/big-data/retaining-data-streams-up-to-one-year-with-amazon-kinesis-data-streams/
upvoted 12 times
...
...
MLCL
Most Recent 1 year, 9 months ago
Selected Answer: B
B is the right answer. Kinesis Data Streams can keep raw data for up to 365 days. Any time they ask about clickstream analytics, the answer should be Kinesis Data Analytics.
upvoted 4 times
...
pk349
2 years ago
B: I passed the test
upvoted 2 times
...
akashm99101001com
2 years, 2 months ago
Selected Answer: B
analytics application will process the raw data only once so why store in S3?
upvoted 2 times
...
AwsNewPeople
2 years, 2 months ago
Selected Answer: B
The solution that meets the requirements with the least amount of operational effort is Option B: Use Amazon Kinesis Data Streams to collect the data. Use Amazon Kinesis Data Analytics with Apache Flink to process the data in real-time. Set the retention period of the Kinesis data stream to 8,760 hours. This solution uses Amazon Kinesis Data Streams to collect the data and processes it in real-time using Amazon Kinesis Data Analytics with Apache Flink. This allows for near-real-time clickstream analytics without the need for additional data processing or storage. Additionally, the retention period of the Kinesis data stream can be set to 8,760 hours (1 year), which allows other applications to have immediate access to the raw data for up to 1 year without the need for additional storage or processing. This solution requires the least amount of operational effort as it does not require additional steps for data processing or storage.
upvoted 6 times
...
np2021
2 years, 2 months ago
For those debating B and D, I am going with B. It has the least operational overhead, and the giveaway is the hours: "The maximum value of a stream's retention period is 8760 hours (365 days)." https://docs.aws.amazon.com/kinesis/latest/APIReference/API_IncreaseStreamRetentionPeriod.html. Be wary of the question's use of 8.760 hours in option A, which I think is sneaky.
upvoted 2 times
...
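The retention setting discussed above can be sketched in Python. This is a minimal illustration, not a definitive implementation: it assumes `client` is a boto3 Kinesis client (for example `boto3.client("kinesis")`), and the helper names `days_to_hours` and `set_retention` as well as the stream name `clickstream` are hypothetical. The 24-hour and 8,760-hour bounds are the ones quoted in this thread.

```python
HOURS_PER_DAY = 24
MIN_RETENTION_HOURS = 24    # default retention of a Kinesis data stream
MAX_RETENTION_HOURS = 8760  # maximum quoted from the API reference (365 days)


def days_to_hours(days: int) -> int:
    """Convert a retention period in days to hours (365 days -> 8760 hours)."""
    return days * HOURS_PER_DAY


def set_retention(client, stream_name: str, hours: float) -> None:
    """Raise a stream's retention period via IncreaseStreamRetentionPeriod.

    `client` is assumed to be a boto3 Kinesis client; any object exposing
    increase_stream_retention_period(StreamName=..., RetentionPeriodHours=...)
    works for this sketch.
    """
    # Reject values outside the documented range before calling the service.
    # Note that 8.760 hours (option A) is below the 24-hour minimum.
    if not MIN_RETENTION_HOURS <= hours <= MAX_RETENTION_HOURS:
        raise ValueError(
            f"retention must be between {MIN_RETENTION_HOURS} and "
            f"{MAX_RETENTION_HOURS} hours, got {hours}"
        )
    client.increase_stream_retention_period(
        StreamName=stream_name,
        RetentionPeriodHours=hours,
    )


# Hypothetical usage against a real account (requires boto3 and credentials):
#   import boto3
#   set_retention(boto3.client("kinesis"), "clickstream", days_to_hours(365))
```

The range check makes the difference between options A and B explicit: 8,760 hours is exactly 365 days, while 8.760 hours is less than 9 hours.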
murali12180
2 years, 3 months ago
Selected Answer: B
Focus on "LEAST amount of operational effort", which eliminates EMR from the picture. D relies on EMR, which adds operational overhead. B makes sense, since KDS can keep data for 365 days with no added operational overhead.
upvoted 2 times
...
Chelseajcole
2 years, 4 months ago
Selected Answer: D
It says the raw data needs to be accessed by other applications. If we keep it in the Kinesis stream, how can other applications access it? Besides, KDS is not good for storing data. S3 is the right choice.
upvoted 2 times
...
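On the question of how another application could access raw data kept in the stream: one possible approach is to read a shard starting from a point in time within the retention period. The sketch below assumes `client` is a boto3 Kinesis client and uses the `GetShardIterator` (type `AT_TIMESTAMP`) and `GetRecords` operations. The helper names, the stream name, and the shard ID in the usage comment are hypothetical, and the function reads only a single batch rather than following `NextShardIterator`.

```python
from datetime import datetime, timedelta, timezone


def days_ago(days: int) -> datetime:
    """Return a timezone-aware UTC timestamp `days` days in the past."""
    return datetime.now(timezone.utc) - timedelta(days=days)


def read_from_timestamp(client, stream_name: str, shard_id: str,
                        start: datetime, limit: int = 100) -> list:
    """Fetch one batch of raw record payloads written at or after `start`.

    `client` is assumed to be a boto3 Kinesis client. Only records still
    inside the stream's retention period can be returned.
    """
    # Ask for an iterator positioned at the requested point in time.
    iterator = client.get_shard_iterator(
        StreamName=stream_name,
        ShardId=shard_id,
        ShardIteratorType="AT_TIMESTAMP",
        Timestamp=start,
    )["ShardIterator"]

    # Read a single batch; a real consumer would loop on NextShardIterator.
    response = client.get_records(ShardIterator=iterator, Limit=limit)
    return [record["Data"] for record in response["Records"]]


# Hypothetical usage (requires boto3 and credentials):
#   import boto3
#   payloads = read_from_timestamp(
#       boto3.client("kinesis"), "clickstream",
#       "shardId-000000000000", days_ago(300))
```

Each consuming application keeps its own position in the stream, so reading old records this way does not interfere with the analytics application that processes the data once.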
nadavw
2 years, 5 months ago
Selected Answer: B
A Kinesis data stream stores records for 24 hours by default, up to 365 days (8,760 hours). https://aws.amazon.com/blogs/big-data/retaining-data-streams-up-to-one-year-with-amazon-kinesis-data-streams/
upvoted 1 times
learnazureportal
2 years, 4 months ago
KDS can keep data for between 24 hours and 7 days.
upvoted 1 times
np2021
2 years, 2 months ago
Incorrect. https://docs.aws.amazon.com/kinesis/latest/APIReference/API_IncreaseStreamRetentionPeriod.html
upvoted 1 times
...
...
...
IvanHuang
2 years, 5 months ago
Selected Answer: C
C. Using Amazon Managed Streaming for Apache Kafka (Amazon MSK) to collect the data, Amazon EMR with Apache Flink to process the data, and setting the log retention hours to 8,760 would meet the requirements with the least amount of operational effort. Amazon MSK is a fully managed service that makes it easy to set up, maintain, and scale Apache Kafka clusters. Amazon EMR can be used to process data from an Amazon MSK stream in real time, and the log retention hours can be set to 8,760 to retain the data for up to 1 year. This solution would require minimal effort to set up and maintain, and would allow other applications to access the raw data for up to 1 year.
upvoted 1 times
...
thuyeinaung
2 years, 5 months ago
Selected Answer: B
B for {{ LEAST amount of operational effort }}
upvoted 3 times
nadavw
2 years, 5 months ago
A Kinesis data stream stores records for 24 hours by default, up to 365 days (8,760 hours). https://aws.amazon.com/blogs/big-data/retaining-data-streams-up-to-one-year-with-amazon-kinesis-data-streams/
upvoted 1 times
...
...
b33f
2 years, 6 months ago
Selected Answer: B
I vote for B. I think C with EMR requires more operational effort.
upvoted 1 times
...
rav009
2 years, 7 months ago
Selected Answer: B
KDS can keep the data for one year.
upvoted 1 times
...
APIsche
2 years, 9 months ago
Selected Answer: B
Answer is B, KDS can natively store raw data for up to 1 year
upvoted 3 times
...
rocky48
2 years, 10 months ago
Selected Answer: B
Answer-B
upvoted 1 times
rocky48
2 years, 5 months ago
Kinesis data stream stores records for 24 hours by default, up to 365 days (8,760 hours).
upvoted 1 times
...
...
ru4aws
2 years, 10 months ago
Selected Answer: B
Retention and minimal operational effort: KDS (8,760 hours = 365 days) + KDA (Flink available out of the box) beats KDS + EMR with Flink + S3.
upvoted 2 times
...