Exam AWS Certified Big Data - Specialty topic 2 question 9 discussion

Exam question from Amazon's AWS Certified Big Data - Specialty
Question #: 9
Topic #: 2

An organization has added a clickstream to their website to analyze traffic. The website sends each page request to an Amazon Kinesis stream with the PutRecord API call, using the page name as the partition key. During peak spikes in website traffic, a support engineer notices many ProvisionedThroughputExceededException events in the application logs.
What should be done to resolve the issue in the MOST cost-effective way?

  • A. Create multiple Amazon Kinesis streams for page requests to increase the concurrency of the clickstream.
  • B. Increase the number of shards on the Kinesis stream to allow for more throughput to meet the peak spikes in traffic.
  • C. Modify the application to use the Kinesis Producer Library to aggregate requests before sending them to the Kinesis stream.
  • D. Attach more consumers to the Kinesis stream to process records in parallel, improving the performance on the stream.
Suggested Answer: B
Reference:
https://aws.amazon.com/kinesis/data-streams/faqs/
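
For illustration, here is a minimal boto3 sketch of the two producer patterns under discussion: the per-event PutRecord call described in the question versus the batched PutRecords call that the KPL's "collection" feature issues automatically. The stream name, event shape, and batch size are assumptions, not taken from the question.

```python
# Hedged sketch (not from the question): per-event PutRecord versus batched
# PutRecords. Stream name, event shape, and batch size are assumptions.
import json

import boto3

kinesis = boto3.client("kinesis")


def send_one_per_event(events):
    """Current pattern: one PutRecord API call per page view."""
    for event in events:
        kinesis.put_record(
            StreamName="clickstream",           # hypothetical stream name
            Data=json.dumps(event).encode(),
            PartitionKey=event["page"],         # page name as partition key
        )


def send_batched(events, batch_size=500):
    """Batched pattern: up to 500 records per PutRecords call (the API limit),
    which sharply reduces the number of requests hitting the stream."""
    for i in range(0, len(events), batch_size):
        batch = events[i:i + batch_size]
        kinesis.put_records(
            StreamName="clickstream",
            Records=[
                {"Data": json.dumps(e).encode(), "PartitionKey": e["page"]}
                for e in batch
            ],
        )
```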

Comments

yuriy_ber
Highly Voted 3 years, 7 months ago
I also stumbled upon this question - it looks very obvious that they are not aggregating, because they use the PutRecord API. Furthermore, they only have peak spikes, so if they increased the number of shards they would pay constantly higher costs for only occasional spikes. It's definitely C; additionally, it would also be possible to implement compression using the KPL.
upvoted 8 times
VB
Highly Voted 3 years, 7 months ago
But is B the most cost-effective way? The price we pay depends on the number of shards. Could it be C?
upvoted 6 times
DerekKey
Most Recent 3 years, 6 months ago
C - this is how you avoid ProvisionedThroughputExceededException.
upvoted 1 times
matthew95
3 years, 6 months ago
It should be C, because batching increases throughput and decreases cost.
upvoted 2 times
k115
3 years, 6 months ago
C is the right answer
upvoted 2 times
winset
3 years, 6 months ago
B or C?
upvoted 1 times
emailtorajivk
3 years, 6 months ago
"The request rate for the stream is too high, or the requested data is too large for the available throughput. Reduce the frequency or size of your requests." For more information, see https://docs.aws.amazon.com/kinesis/latest/APIReference/API_PutRecords.html - so using the producer library will decrease the request frequency.
upvoted 1 times
Bulti
3 years, 6 months ago
Answer is C. The KPL's batching features (both turned on by default) increase throughput and decrease cost (see the sketch after this comment):
  • Collection: writes records to multiple shards in the same PutRecords API call.
  • Aggregation: adds some latency, but stores multiple user records in one Kinesis record (to go over the 1,000 records per second per shard limit) and increases the payload size to improve throughput (maximizing the 1 MB/s per shard limit).
upvoted 4 times
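
A rough sketch of the aggregation idea described in the comment above, assuming a hypothetical clickstream of small JSON events. The real KPL packs user records into a protobuf-based aggregation format that the KCL de-aggregates on the consumer side; newline-delimited JSON is used here only to keep the illustration self-contained.

```python
# Rough illustration of "aggregation": pack many small click events into a
# single Kinesis record. The real KPL uses a protobuf-based format that the
# KCL de-aggregates; newline-delimited JSON is a simplified stand-in, and
# all names are hypothetical.
import json

import boto3

kinesis = boto3.client("kinesis")
MAX_RECORD_BYTES = 1_000_000  # stay under the 1 MB per-record limit


def put_aggregated(events, partition_key):
    """Buffer small events and emit them as one Kinesis record per ~1 MB."""
    buffer, size = [], 0
    for event in events:
        line = json.dumps(event)
        if buffer and size + len(line) + 1 > MAX_RECORD_BYTES:
            _flush(buffer, partition_key)
            buffer, size = [], 0
        buffer.append(line)
        size += len(line) + 1
    if buffer:
        _flush(buffer, partition_key)


def _flush(lines, partition_key):
    # Many user records count as a single record against the
    # 1,000 records/second per-shard limit.
    kinesis.put_record(
        StreamName="clickstream",
        Data="\n".join(lines).encode(),
        PartitionKey=partition_key,
    )
```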
susan8840
3 years, 6 months ago
Agreed, B. Input data increased, so increase the shards. Your data blob, partition key, and data stream name are required parameters of a PutRecord or PutRecords call. The size of your data blob (before Base64 encoding) and partition key will be counted against the data throughput of your Amazon Kinesis data stream, which is determined by the number of shards within the data stream.
upvoted 1 times
san2020
3 years, 6 months ago
my selection C
upvoted 6 times
aewis
3 years, 6 months ago
C ! https://docs.aws.amazon.com/streams/latest/dev/kinesis-kpl-concepts.html
upvoted 1 times
richardxyz
3 years, 6 months ago
C is correct; the KPL supports aggregation, storing multiple records within a single Kinesis Data Streams record, and with this feature we can go beyond 1,000 records per second per shard.
upvoted 3 times
kalpanareddy
3 years, 6 months ago
I will go with B https://aws.amazon.com/kinesis/data-streams/faqs/
upvoted 1 times
RamNelluru
3 years, 6 months ago
B may not work because the partition key is the page name. Even if you increase the number of shards, there may still be a hot partition because a single page sends more puts. Since this information is not provided, C may be the right answer.
upvoted 3 times
am7
3 years, 7 months ago
Due to frequent checkpointing it's giving the errors. When the data is aggregated, the checkpointing will be reduced, which in turn solves the problem in the most effective way.
upvoted 1 times
s3an
3 years, 7 months ago
C, I think. https://docs.aws.amazon.com/streams/latest/dev/developing-producers-with-kpl.html "The KPL is an easy-to-use, highly configurable library that helps you write to a Kinesis data stream. It acts as an intermediary between your producer application code and the Kinesis Data Streams API actions. The KPL performs the following primary tasks: writes to one or more Kinesis data streams with an automatic and configurable retry mechanism; collects records and uses PutRecords to write multiple records to multiple shards per request; aggregates user records to increase payload size and improve throughput." Has anyone in this thread already passed this exam? That way we can all be on the same page about which answers are correct.
upvoted 2 times
Zire
3 years, 7 months ago
My choice is C. While B is a way to resolve the throughput issue, the aggregation proposed by C would be more cost-effective.
upvoted 3 times
cybe001
3 years, 7 months ago
You also get ProvisionedThroughputExceededException for the volume of data. So aggregation won't solve the issue. Answer is B.
upvoted 4 times
d00ku
3 years, 7 months ago
Aggregation solves volume issues, batching solves throughput issues. Seems like C.
upvoted 3 times
jlpl
3 years, 7 months ago
B? Thoughts?
upvoted 2 times
mattyb123
3 years, 7 months ago
Correct. https://aws.amazon.com/kinesis/data-streams/faqs/ Q: What happens if the capacity limits of an Amazon Kinesis data stream are exceeded while the data producer adds data to the data stream? The capacity limits of an Amazon Kinesis data stream are defined by the number of shards within the data stream. The limits can be exceeded by either data throughput or the number of PUT records. While the capacity limits are exceeded, the put data call will be rejected with a ProvisionedThroughputExceeded exception. If this is due to a temporary rise of the data stream’s input data rate, retry by the data producer will eventually lead to completion of the requests. If this is due to a sustained rise of the data stream’s input data rate, you should increase the number of shards within your data stream to provide enough capacity for the put data calls to consistently succeed. In both cases, Amazon CloudWatch metrics allow you to learn about the change of the data stream’s input data rate and the occurrence of ProvisionedThroughputExceeded exceptions.
upvoted 5 times
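
For comparison, a minimal boto3 sketch of what option B amounts to: a single UpdateShardCount call. The stream name and target shard count are hypothetical; note that added shards are billed per shard-hour even off-peak, which is why the thread favors C on cost for occasional spikes.

```python
# Hedged sketch of option B: reshard the stream. Stream name and target
# shard count are hypothetical. Added shards keep accruing per-shard-hour
# charges even when traffic is off-peak.
import boto3

kinesis = boto3.client("kinesis")

kinesis.update_shard_count(
    StreamName="clickstream",
    TargetShardCount=8,
    ScalingType="UNIFORM_SCALING",
)
```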
ME2000
3 years, 7 months ago
B is the answer. More on "ProvisionedThroughputExceededException": https://docs.aws.amazon.com/streams/latest/dev/troubleshooting-consumers.html https://docs.aws.amazon.com/streams/latest/dev/kinesis-low-latency.html https://any-api.com/amazonaws_com/kinesis/docs/Definitions/ProvisionedThroughputExceededException
upvoted 1 times
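
For the "temporary rise" case that the FAQ quote above mentions, the producer can simply back off and retry the put. A minimal sketch, assuming a boto3 producer and hypothetical names:

```python
# Hedged sketch of the retry behavior the FAQ describes for a temporary rise
# in input rate: back off and retry when the stream throttles.
# Stream name and event shape are hypothetical.
import json
import time

import boto3

kinesis = boto3.client("kinesis")


def put_with_backoff(event, retries=5):
    for attempt in range(retries):
        try:
            return kinesis.put_record(
                StreamName="clickstream",
                Data=json.dumps(event).encode(),
                PartitionKey=event["page"],
            )
        except kinesis.exceptions.ProvisionedThroughputExceededException:
            time.sleep(0.1 * (2 ** attempt))  # exponential backoff
    raise RuntimeError("record not accepted after retries")
```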
Corram
3 years, 6 months ago
C is correct. ProvisionedThroughputExceededException can be caused either by too large a data volume or by too many requests. Since the PutRecord API is used, each record gets sent on its own, making too many requests highly probable. Thus, C should help, and it is obviously more cost-effective than B.
upvoted 5 times
Community vote distribution: A (35%), C (25%), B (20%), Other