Exam AWS Certified Big Data - Specialty topic 1 question 8 discussion

Exam question from Amazon's AWS Certified Big Data - Specialty

Question #: 8
Topic #: 1

A web-hosting company is building a web analytics tool to capture clickstream data from all of the websites hosted within its platform and to provide near-real-time business intelligence. This entire system is built on AWS services. The web-hosting company is interested in using Amazon Kinesis to collect this data and perform sliding window analytics.
What is the most reliable and fault-tolerant technique to get each website to send data to Amazon Kinesis with every click?

A. After receiving a request, each web server sends it to Amazon Kinesis using the Amazon Kinesis PutRecord API. Use the sessionID as a partition key and set up a loop to retry until a success response is received.
B. After receiving a request, each web server sends it to Amazon Kinesis using the Amazon Kinesis Producer Library .addRecords method.
C. Each web server buffers the requests until the count reaches 500 and sends them to Amazon Kinesis using the Amazon Kinesis PutRecord API.
D. After receiving a request, each web server sends it to Amazon Kinesis using the Amazon Kinesis PutRecord API. Use the exponential back-off algorithm for retries until a successful response is received.

Suggested Answer: A 🗳️

by kn at July 14, 2019, 4:55 a.m.

Disclaimers:

- ExamTopics website is not related to, affiliated with, endorsed or authorized by Amazon.
- Trademarks, certification & product names are used for reference only and belong to Amazon.

Comments

AdamSmith

Highly Voted 3 years, 8 months ago

The KPL provides the following features out of the box:

- Batching of puts using PutRecords (the Collector in the architecture diagram)
- Tracking of record age and enforcement of maximum buffering times (all components)
- Per-shard record aggregation (the Aggregator)
- Retries in case of errors, with ability to distinguish between retryable and non-retryable errors (the Retrier)
- Per-shard rate limiting to prevent excessive and pointless spamming (the Limiter)
- Useful metrics and a highly efficient CloudWatch client (not shown in diagram)

https://aws.amazon.com/blogs/big-data/implementing-efficient-and-reliable-producers-with-the-amazon-kinesis-producer-library/

The answer should be B.
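To make the batching and maximum-buffering-time features quoted above more concrete, here is a minimal Python sketch. It is not KPL code: `BufferingCollector`, `send_batch`, and the 0.1-second default are hypothetical names and values picked for illustration, and `send_batch` stands in for a real PutRecords call. The 500-record cap mirrors the PutRecords per-request limit.

```python
import time
from typing import Callable, List, Optional

MAX_BATCH_RECORDS = 500  # PutRecords accepts at most 500 records per request


class BufferingCollector:
    """Toy collector: flush when the batch is full or its oldest record is too old."""

    def __init__(self, send_batch: Callable[[List[bytes]], None],
                 max_records: int = MAX_BATCH_RECORDS,
                 max_buffer_seconds: float = 0.1,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._send_batch = send_batch  # hypothetical stand-in for a PutRecords call
        self._max_records = max_records
        self._max_buffer_seconds = max_buffer_seconds  # assumed value, tune as needed
        self._clock = clock
        self._buffer: List[bytes] = []
        self._oldest: Optional[float] = None  # arrival time of the oldest buffered record

    def add(self, record: bytes) -> None:
        """Buffer one record and flush if a size or age threshold is reached."""
        if not self._buffer:
            self._oldest = self._clock()
        self._buffer.append(record)
        if len(self._buffer) >= self._max_records or self._expired():
            self.flush()

    def _expired(self) -> bool:
        return (self._oldest is not None
                and self._clock() - self._oldest >= self._max_buffer_seconds)

    def flush(self) -> None:
        """Send whatever is buffered as one batch; a no-op when the buffer is empty."""
        if self._buffer:
            batch, self._buffer, self._oldest = self._buffer, [], None
            self._send_batch(batch)
```

Note that this sketch only checks record age when a new record arrives; a real producer would also flush from a background timer so that a quiet stream does not hold records indefinitely.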

upvoted 6 times

srirampc

3 years, 8 months ago

PutRecords is right. Not sure if there is an .addRecords method.

upvoted 2 times

...

jove

Most Recent 3 years, 7 months ago

Keywords : near-real-time, most reliable and fault-tolerant technique Answer is B : Kinesis Producer Library

upvoted 1 times

...

hdesai

3 years, 7 months ago

B is the right answer. There seems to be a typo in the question: it must be the addUserRecord method. https://aws.amazon.com/blogs/big-data/implementing-efficient-and-reliable-producers-with-the-amazon-kinesis-producer-library/

upvoted 1 times

...

jsr2017

3 years, 7 months ago

D https://docs.aws.amazon.com/kinesis/latest/APIReference/API_PutRecords.html

upvoted 2 times

...

agm84

3 years, 7 months ago

The answer should be B; however, the KPL does not have a method named .addRecords. It looks like a typo, as the KPL method is addUserRecord.

upvoted 4 times

...

kkyong

3 years, 7 months ago

D is the correct answer. A is wrong: you need to set up back-off for the retries. B: there is no addRecords API. C: it is an anti-pattern.

upvoted 4 times

...

Debi_mishra

3 years, 7 months ago

A is correct. B - there is no addrecords in KPL. D - There is no backoff algorithm in Kinesis Agent.

upvoted 1 times

...

Bulti

3 years, 8 months ago

After researching further, I think B is the right answer. KPL has built-in fault tolerance with a configurable retry mechanism, so there is no need to write custom fault-tolerant logic. Besides, KPL allows us to batch the records (aggregation and collection) out of the box without having to write custom code to achieve scalability.

upvoted 2 times

Corram

3 years, 7 months ago

There is no addRecords method in KPL, so B is wrong.

upvoted 2 times

...

Bulti

3 years, 8 months ago

D is the right answer. You need to back off between retries, and you can do so exponentially. Throttling usually happens because of a hot partition. To ensure fault tolerance when using the PutRecord API, you need to handle ProvisionedThroughputExceeded exceptions, and the way to do that is a back-off-with-retry mechanism. Option A suggests retrying in a loop, but if we don't back off (meaning wait for a certain duration) before retrying, the system would continue to fail.
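For anyone who wants to see what back-off with retry looks like in practice, here is a minimal Python sketch. It is only an illustration under stated assumptions: `send` is a hypothetical callable wrapping a single PutRecord call, `ThroughputExceeded` is a stand-in for the ProvisionedThroughputExceeded error, and the delay values are arbitrary. It adds random jitter and caps the number of attempts, which goes slightly beyond the "retry until success" wording in option D.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class ThroughputExceeded(Exception):
    """Hypothetical stand-in for a ProvisionedThroughputExceeded error."""


def send_with_backoff(send: Callable[[], T],
                      max_attempts: int = 5,
                      base_delay: float = 0.1,
                      max_delay: float = 5.0,
                      sleep: Callable[[float], None] = time.sleep) -> T:
    """Call send() and retry on throttling, waiting longer after each failure."""
    attempt = 0
    while True:
        try:
            return send()
        except ThroughputExceeded:
            attempt += 1
            if attempt >= max_attempts:
                raise  # give up and let the caller decide what to do
            # Upper bound doubles each time: base, 2*base, 4*base, ... up to max_delay.
            cap = min(max_delay, base_delay * 2 ** (attempt - 1))
            # Random jitter keeps many web servers from retrying in lockstep.
            sleep(random.uniform(0, cap))
```

The tight loop in option A corresponds to this function with the `sleep` call removed, which is why it keeps hitting a throttled shard without giving it time to recover.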

upvoted 4 times

Anjoy

3 years, 7 months ago

Agree with D. Reference: https://aws.amazon.com/blogs/big-data/implementing-efficient-and-reliable-producers-with-the-amazon-kinesis-producer-library/

upvoted 1 times

...

YashBindlish

3 years, 8 months ago

B is the Correct Answer

upvoted 2 times

Corram

3 years, 7 months ago

There is no .addRecords method in KPL.

upvoted 1 times

MihirB

3 years, 7 months ago

Does option B mean the addUserRecord() method when it refers to the .addRecords method? If that is the case, the correct answer should be B.

upvoted 1 times

...

axlrose

3 years, 8 months ago

Answer is B. KPL is used when:

- High performance, long-running producers
- Automated and configurable retry mechanism
- Sync and Async API (better perf for Async)
- 100B records (high volume of data)

upvoted 1 times

...

san2020

3 years, 8 months ago

my selection A

upvoted 1 times

...

hailiang

3 years, 8 months ago

why not B? seems to me you need to reinvent the wheel in A while kpl can do all that already

upvoted 2 times

Corram

3 years, 7 months ago

There is no .addRecords method in KPL.

upvoted 2 times

...

d00ku

3 years, 8 months ago

exponential back-off algorithm appears in answer D -> this seems to be the correct one.

upvoted 3 times

d00ku

3 years, 8 months ago

replying to myself -> there is no exponential back-off algorithm for KPL. It uses a more aggressive strategy to do retries. A is correct.

upvoted 5 times

...

M2

3 years, 8 months ago

A looks correct.

upvoted 2 times

...

exams

3 years, 8 months ago

A is correct

upvoted 4 times

Corram

3 years, 7 months ago

The loop in A might create spamming due to excessive retries. For example, KPL has a rate-limiting feature to deal with this, but our manual solution here does not. Therefore A looks wrong to me.
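As a rough idea of what producer-side rate limiting can look like, here is a minimal token-bucket sketch in Python. This is a generic illustration, not the KPL's Limiter: the `TokenBucket` class and its parameters are hypothetical, and a real producer would keep one bucket per shard (a shard accepts up to 1,000 records per second for writes).

```python
import time
from typing import Callable


class TokenBucket:
    """Toy rate limiter: each send spends a token, and tokens refill at a fixed rate."""

    def __init__(self, rate_per_second: float, capacity: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._rate = rate_per_second   # tokens added per second
        self._capacity = capacity      # maximum burst size
        self._clock = clock
        self._tokens = capacity        # start with a full bucket
        self._last = clock()

    def try_acquire(self, tokens: float = 1.0) -> bool:
        """Return True and spend tokens if enough are available, else return False."""
        now = self._clock()
        # Refill in proportion to the time elapsed since the last call.
        self._tokens = min(self._capacity,
                           self._tokens + (now - self._last) * self._rate)
        self._last = now
        if self._tokens >= tokens:
            self._tokens -= tokens
            return True
        return False
```

A retry loop that calls `try_acquire()` before each send, and waits when it returns False, cannot exceed the configured rate no matter how many errors it sees.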

upvoted 1 times

...

kn

3 years, 9 months ago

Option A is correct because there is a concept of a back-off algorithm in Kinesis: https://docs.aws.amazon.com/streams/latest/dev/kinesis-producer-adv-retries-rate-limiting.html

upvoted 4 times

Jialu

3 years, 8 months ago

A is the correct answer

upvoted 3 times

...