
Exam AWS Certified Data Analytics - Specialty topic 1 question 93 discussion

A company wants to provide its data analysts with uninterrupted access to the data in its Amazon Redshift cluster. All data is streamed to an Amazon S3 bucket with Amazon Kinesis Data Firehose. An AWS Glue job that is scheduled to run every 5 minutes issues a COPY command to move the data into Amazon Redshift.
The amount of data delivered is uneven throughout the day, and cluster utilization is high during certain periods. The COPY command usually completes within a couple of seconds. However, when a load spike occurs, locks can occur and data can be missed. Currently, the AWS Glue job is configured to run without retries, with a timeout of 5 minutes, and with a concurrency of 1.
How should a data analytics specialist configure the AWS Glue job to optimize fault tolerance and improve data availability in the Amazon Redshift cluster?

  • A. Increase the number of retries. Decrease the timeout value. Increase the job concurrency.
  • B. Keep the number of retries at 0. Decrease the timeout value. Increase the job concurrency.
  • C. Keep the number of retries at 0. Decrease the timeout value. Keep the job concurrency at 1.
  • D. Keep the number of retries at 0. Increase the timeout value. Keep the job concurrency at 1.
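For context, the three knobs in the answer choices map directly to AWS Glue job properties: MaxRetries, Timeout, and ExecutionProperty.MaxConcurrentRuns. Below is a minimal sketch of changing them with boto3; the job name and new values are placeholders (the question doesn't give them), and since UpdateJob resets any fields omitted from JobUpdate, the existing definition is fetched first so the role and command can be carried over.

import boto3

glue = boto3.client("glue")

# Hypothetical job name; the question does not name the job.
JOB_NAME = "redshift-copy-job"

# Fetch the current definition so unchanged settings can be carried over,
# because UpdateJob resets any fields omitted from JobUpdate.
job = glue.get_job(JobName=JOB_NAME)["Job"]

job_update = {
    "Role": job["Role"],
    "Command": job["Command"],
    # Option A's direction: retry failed runs, fail fast, allow overlapping runs.
    "MaxRetries": 2,                                # was 0
    "Timeout": 1,                                   # minutes; was 5
    "ExecutionProperty": {"MaxConcurrentRuns": 2},  # was 1
}

glue.update_job(JobName=JOB_NAME, JobUpdate=job_update)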
Suggested Answer: A

Comments

Dr_Kiko
Highly Voted 3 years, 7 months ago
A is wrong. Read what a lock means: locking is a protection mechanism that controls how many sessions can access a table at the same time. To solve a locking problem, identify the session (PID) that is holding the lock and then terminate the session. If the session doesn't terminate, reboot your cluster. So if you increase concurrency, you increase the number of sessions, which makes things worse. Hence C.
upvoted 17 times
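As a concrete illustration of the lock-resolution steps described above, here is a minimal sketch that lists the sessions holding table locks via the STV_LOCKS system table and terminates a blocking one with PG_TERMINATE_BACKEND. The connection details and the PID are placeholders, and the use of psycopg2 is an assumption; any Redshift-compatible PostgreSQL driver would work the same way.

import psycopg2

# Placeholder connection details for the cluster in the question.
conn = psycopg2.connect(
    host="my-cluster.xxxxxxxx.us-east-1.redshift.amazonaws.com",
    port=5439,
    dbname="dev",
    user="admin",
    password="change-me",
)
conn.autocommit = True
cur = conn.cursor()

# List sessions currently holding table locks (STV_LOCKS is a Redshift system table).
cur.execute("SELECT table_id, lock_owner_pid, lock_status FROM stv_locks ORDER BY last_update;")
for table_id, pid, status in cur.fetchall():
    print(table_id, pid, status)

# Terminate a specific blocking session once it has been identified.
blocking_pid = 12345  # placeholder PID taken from the query above
cur.execute("SELECT pg_terminate_backend(%s);", (blocking_pid,))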
yellowdev
2 years, 1 month ago
Please explain how decreasing the timeout helps in this case.
upvoted 1 times
...
dushmantha
2 years, 10 months ago
You are only suggesting decreasing the timeout value. How does that help if data is missed during the copy? The best way to solve this problem is to increase retries, wait less for a retry to complete, and concurrently copy data to Redshift. So the answer is "A".
upvoted 2 times
...
...
VikG12
Highly Voted 3 years, 8 months ago
Not sure, but it should be A for the increased number of retries and increased concurrency.
upvoted 11 times
...
MLCL
Most Recent 1 year, 10 months ago
Selected Answer: D
D makes sense: increasing the timeout leaves time for locks to resolve. Also, job concurrency should be 1 for COPY commands; it doesn't make sense to have multiple ones running.
upvoted 2 times
...
mawsman
2 years, 2 months ago
A is the least wrong, since it theoretically will increase fault tolerance, but it's incorrect by design. Increasing concurrency will lead to an increase in locks, leading to more timeouts, leading to more retries, so once we reach the max retries, some data might still be missed. The right answer would be to set concurrency to 1 and increase retries while decreasing the timeout. B, C and D: keep the number of retries at 0 - whatever other configuration you set, when the job fails due to a lock, that job will never be retried and data will be missed 100%.
upvoted 5 times
mawsman
2 years, 2 months ago
Actually, timed-out jobs aren't retried, per the docs. So even A is wrong from a retry point of view. The only thing that will allow us to wait for the locks to resolve themselves is an increase in the timeout - so really all the answers are bad, but D is the least bad. Also, the default timeout is 48 hours, so the job's configured 5-minute timeout is already short.
upvoted 4 times
...
...
rags1482
2 years, 2 months ago
Option C is the best solution to optimize fault tolerance and improve data availability. Keeping the number of retries at 0 will ensure that the job does not attempt to retry if it fails, which may cause further locks and missed data. Decreasing the timeout value will ensure that the job fails quickly if it is unable to complete the COPY command. Keeping the job concurrency at 1 will ensure that the job runs on a single node, which will reduce the chances of locks occurring.
upvoted 1 times
...
akashm99101001com
2 years, 2 months ago
Selected Answer: A
Let's go in reverse to answer this. D is out - the Glue job is scheduled to run every 5 minutes, so the timeout can't be increased. C is out - locks will still exist when the spike is high. B is out - jobs will fail due to the decreased timeout when there is no concurrency.
upvoted 2 times
...
Gabba
2 years, 3 months ago
Selected Answer: A
Locks generally happen in the case of DDL statements. Refer to this link - https://docs.aws.amazon.com/redshift/latest/dg/r_LOCK.html The question mentions a Glue job reading data every 5 minutes and inserting it into Redshift. If a lock is present on the table, the job will time out. If we don't retry, there is obviously data loss. The job generally completes in a few seconds, so decreasing the timeout definitely makes sense, and everyone agrees on that. We definitely need to increase the retry count so that if a lock exists, the job retries the same data insertion in the hope that the lock has been released. Even if the lock is not released on the next attempt, by that time the next scheduled Glue run will pick up new data for insertion, and if we don't increase concurrency, the initially failed job would lead to data loss. So it is important to increase concurrency as well, so that the previous instance and the current instance of the Glue job can run together.
upvoted 4 times
...
henom
2 years, 6 months ago
Selected Answer: D
Suggested Concurrency is 1
upvoted 1 times
...
rav009
2 years, 10 months ago
D. Increasing the timeout does not conflict with "usually completes within a couple of seconds", and it can be helpful at peak times.
upvoted 4 times
...
dushmantha
2 years, 10 months ago
Selected Answer: A
The best way to solve this problem is to increase retries, wait less for a retry to complete, and concurrently copy data to Redshift. So the answer is "A".
upvoted 2 times
...
rocky48
2 years, 10 months ago
Selected Answer: A
upvoted 1 times
...
A looks right. Timeout should be reduced and retries should be increased. Also, concurrency should be increased so that the copy operation completes quickly. Concurrency will not cause locking as the copy command running in concurrency mode should be part of the same transaction. Hence there is no question of locking.
upvoted 3 times
...
Shammy45
3 years ago
Selected Answer: C
Increasing the number of retries will add stress on locks. When the COPY command completes in a few seconds, there is no need to wait the full 5 minutes; rather, the timeout should be decreased. Concurrency at 1 is perfect.
upvoted 2 times
If a COPY command completes, the system will not wait for 5 minutes, so there is no timeout scenario.
upvoted 1 times
...
...
MWL
3 years ago
Selected Answer: A
Agree with npt's first answer, with some added comments. The question requires "data availability" and "fault tolerance", so data loss should be avoided. I agree with your previous answer: 1. COPY finishes within a few seconds, so decrease the timeout and increase retries. That can reduce the table locks, as you said. 2. Increase concurrent jobs; the job can then execute COPY and queries at the same time, as long as they are not on the same table. And as long as most COPY commands finish in seconds, there will be no lock because of this.
upvoted 2 times
...
CHRIS12722222
3 years, 2 months ago
If a lock exists, reads/writes are made to wait until the session holding the lock completes. Ref: https://docs.aws.amazon.com/redshift/latest/dg/r_LOCK.html Therefore, increasing the timeout will help. Keeping concurrency at 1 is recommended to avoid multiple COPY commands and prevent multiple instances of the Glue task. Maybe answer = D.
upvoted 2 times
CHRIS12722222
3 years, 2 months ago
An increased timeout will help because a longer timeout means more waiting by the Glue job, which means more chance for the session holding the lock to complete before the timeout occurs.
upvoted 1 times
...
...
npt
3 years, 5 months ago
A. COPY commands typically finish within a few seconds -> decrease the timeout to, say, 30 seconds. If there is a lock, the task will fail and release its lock, but we still lose the data. So we need to add more retries, e.g. 2 retries. If we add that many retries, there could be concurrent jobs; a concurrent COPY is not recommended, but the case where locks occur many times in a row is rare.
upvoted 1 times
npt
3 years, 5 months ago
Change to C. "Number of retries - Specify the number of times, from 0 to 10, that AWS Glue should automatically restart the job if it fails. Jobs that reach the timeout limit are not restarted." https://docs.aws.amazon.com/glue/latest/dg/add-job.html So a retry will not help in the case of a timeout. By only reducing the timeout, we can only reduce the locks and the amount of data lost, which is OK because the question does not require zero data loss.
upvoted 3 times
MWL
3 years ago
The question requires "data availability" and "fault tolerance", so data loss should be avoided. I agree with your previous answer: 1. COPY finishes within a few seconds, so decrease the timeout and increase retries. That can reduce the table locks, as you said. 2. Increase concurrent jobs; the job can then execute COPY and queries at the same time, as long as they are not on the same table. And as long as most COPY commands finish in seconds, there will be no lock because of this.
upvoted 1 times
...
...
...
Community vote distribution: A (35%), C (25%), B (20%), Other