Exam AWS Certified Data Analytics - Specialty topic 1 question 38 discussion

Question #: 38
Topic #: 1

A financial company uses Apache Hive on Amazon EMR for ad-hoc queries. Users are complaining of sluggish performance.
A data analyst notes the following:
✑ Approximately 90% of queries are submitted 1 hour after the market opens.
✑ Hadoop Distributed File System (HDFS) utilization never exceeds 10%.

Which solution would help address the performance issues?

A. Create instance fleet configurations for core and task nodes. Create an automatic scaling policy to scale out the instance groups based on the Amazon CloudWatch CapacityRemainingGB metric. Create an automatic scaling policy to scale in the instance fleet based on the CloudWatch CapacityRemainingGB metric.
B. Create instance fleet configurations for core and task nodes. Create an automatic scaling policy to scale out the instance groups based on the Amazon CloudWatch YARNMemoryAvailablePercentage metric. Create an automatic scaling policy to scale in the instance fleet based on the CloudWatch YARNMemoryAvailablePercentage metric.
C. Create instance group configurations for core and task nodes. Create an automatic scaling policy to scale out the instance groups based on the Amazon CloudWatch CapacityRemainingGB metric. Create an automatic scaling policy to scale in the instance groups based on the CloudWatch CapacityRemainingGB metric.
D. Create instance group configurations for core and task nodes. Create an automatic scaling policy to scale out the instance groups based on the Amazon CloudWatch YARNMemoryAvailablePercentage metric. Create an automatic scaling policy to scale in the instance groups based on the CloudWatch YARNMemoryAvailablePercentage metric.

Suggested Answer: D

by testtaker3434 at Aug. 9, 2020, 2:08 p.m.

Comments

Shraddha

Highly Voted 3 years, 6 months ago

Answer: D. A and B are wrong: instance fleets do not support automatic scaling policies. C is wrong: HDFS utilization never exceeds 10%, so scaling based on remaining HDFS capacity would never be triggered.

upvoted 23 times

lakediver

3 years, 5 months ago

The following are two commonly used metrics for automatic scaling:
YARNMemoryAvailablePercentage: This is the percentage of remaining memory that's available for YARN.
ContainerPendingRatio: This is the ratio of pending containers to allocated containers. You can use this metric to scale a cluster based on container-allocation behavior for varied loads. This is useful for performance tuning.
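To make the second metric concrete, here is a minimal Python sketch of the ratio described above. The function name and the handling of zero allocated containers are assumptions for illustration, not something stated in this thread.

```python
def container_pending_ratio(pending: int, allocated: int) -> float:
    """Ratio of pending YARN containers to allocated containers."""
    if pending < 0 or allocated < 0:
        raise ValueError("container counts must be non-negative")
    if allocated == 0:
        # Assumption: with nothing allocated, report the raw pending count
        # instead of dividing by zero.
        return float(pending)
    return pending / allocated


# Example: 30 containers waiting while 40 are running.
ratio = container_pending_ratio(30, 40)
```

A ratio that stays high for several periods suggests queries are queuing for containers, which matches the "sluggish at market open" symptom in the question.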

upvoted 3 times

...

lakediver

3 years, 5 months ago

Agree. For further reference, see https://aws.amazon.com/premiumsupport/knowledge-center/auto-scaling-in-amazon-emr/

upvoted 2 times

...

ariane_tateishi

Highly Voted 3 years, 6 months ago

D should be the right answer, considering the following links. The first link shows that the right metric for this requirement is YARNMemoryAvailablePercentage, because HDFS utilization never exceeds 10%. The second link explains that if you want to use automatic scaling, you should use instance groups.
https://docs.aws.amazon.com/emr/latest/ManagementGuide/UsingEMR_ViewingMetrics.html
https://docs.aws.amazon.com/emr/latest/ManagementGuide/emr-plan-instances-guidelines.html

upvoted 7 times

...

monkeydba

Most Recent 1 year, 6 months ago

"Managed scaling is available for clusters composed of either instance groups or instance fleets." https://docs.aws.amazon.com/emr/latest/ManagementGuide/emr-managed-scaling.html#:~:text=Managed%20scaling%20is%20available%20for%20clusters%20composed%20of%20either%20instance%20groups%20or%20instance%20fleets.

upvoted 1 times

...

pk349

2 years ago

D: I passed the test

upvoted 1 times

...

srirnag

2 years, 3 months ago

YARNMemoryAvailablePercentage is for memory-intensive (compute-bound) workloads, CapacityRemainingGB is for storage-capacity-intensive workloads, and instance fleets are ruled out. Hence, D.

upvoted 1 times

...

cloudlearnerhere

2 years, 6 months ago

Selected Answer: D

Correct answer is D, as instance group configurations for core and task nodes can be scaled based on the YARNMemoryAvailablePercentage metric. Options A and B are incorrect because an instance fleet doesn't have an automatic scaling policy; only an instance group has this feature. Option C is incorrect because the CapacityRemainingGB metric is just the amount of remaining HDFS disk capacity, and HDFS utilization never exceeds 10%, so plenty of capacity always remains. The cluster will not scale in or scale out if you choose this metric.
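As a rough illustration of option D, the Python sketch below builds an automatic scaling policy document for one instance group with a scale-out and a scale-in rule on YARNMemoryAvailablePercentage. The helper names, the thresholds (15% and 75%), the cooldown, and the capacity limits are hypothetical values chosen for the example, not values from the question.

```python
def yarn_memory_rule(name: str, comparison: str, threshold: float, adjustment: int) -> dict:
    """One scaling rule triggered by the YARNMemoryAvailablePercentage metric."""
    return {
        "Name": name,
        "Description": f"{name} on YARNMemoryAvailablePercentage",
        "Action": {
            "SimpleScalingPolicyConfiguration": {
                "AdjustmentType": "CHANGE_IN_CAPACITY",
                "ScalingAdjustment": adjustment,  # +N adds nodes, -N removes nodes
                "CoolDown": 300,  # hypothetical cooldown in seconds
            }
        },
        "Trigger": {
            "CloudWatchAlarmDefinition": {
                "ComparisonOperator": comparison,
                "EvaluationPeriods": 1,
                "MetricName": "YARNMemoryAvailablePercentage",
                "Namespace": "AWS/ElasticMapReduce",
                "Period": 300,
                "Statistic": "AVERAGE",
                "Threshold": threshold,
                "Unit": "PERCENT",
            }
        },
    }


def build_policy(min_capacity: int, max_capacity: int) -> dict:
    """Policy with a scale-out rule (low memory) and a scale-in rule (idle memory)."""
    return {
        "Constraints": {"MinCapacity": min_capacity, "MaxCapacity": max_capacity},
        "Rules": [
            yarn_memory_rule("ScaleOutOnLowMemory", "LESS_THAN", 15.0, 2),
            yarn_memory_rule("ScaleInOnIdleMemory", "GREATER_THAN", 75.0, -1),
        ],
    }
```

A document of this shape is what you would attach to a core or task instance group, for example as the `AutoScalingPolicy` argument of the boto3 EMR client's `put_auto_scaling_policy` call together with a cluster ID and instance group ID. That call is not exercised here.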

upvoted 4 times

cloudlearnerhere

2 years, 6 months ago

CloudWatch metrics that you can use for automatic scaling in Amazon EMR: the following are two commonly used metrics. YARNMemoryAvailablePercentage: This is the percentage of remaining memory that's available for YARN. ContainerPendingRatio: This is the ratio of pending containers to allocated containers. You can use this metric to scale a cluster based on container-allocation behavior for varied loads. This is useful for performance tuning. For the given use case, the correct solution should support automatic scaling. You can set up automatic scaling in Amazon EMR for an instance group, adding and removing instances automatically based on the value of an Amazon CloudWatch metric that you specify. The metric YARNMemoryAvailablePercentage represents the percentage of remaining memory available to YARN (YARNMemoryAvailablePercentage = MemoryAvailableMB / MemoryTotalMB). This value is useful for scaling cluster resources based on YARN memory usage.
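The formula above can be sketched in a few lines of Python. The function names and the two thresholds are hypothetical, and the ratio is multiplied by 100 here on the assumption that the metric is reported as a percentage.

```python
def yarn_memory_available_percentage(memory_available_mb: float, memory_total_mb: float) -> float:
    """MemoryAvailableMB / MemoryTotalMB, expressed as a percentage."""
    if memory_total_mb <= 0:
        raise ValueError("memory_total_mb must be positive")
    return 100.0 * memory_available_mb / memory_total_mb


def scaling_decision(pct: float, scale_out_below: float = 15.0, scale_in_above: float = 75.0) -> str:
    """Toy decision rule: add nodes when YARN memory is scarce, remove them when it is idle."""
    if pct < scale_out_below:
        return "scale_out"
    if pct > scale_in_above:
        return "scale_in"
    return "hold"


# Example: 2 GiB of YARN memory left out of 20 GiB during the market-open burst.
pct = yarn_memory_available_percentage(2048, 20480)
decision = scaling_decision(pct)
```

With only 10% of YARN memory available the toy rule asks for more nodes, whereas a rule keyed on remaining HDFS capacity would see more than 90% free and do nothing.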

upvoted 2 times

...

Arka_01

2 years, 8 months ago

Selected Answer: D

Instance fleets cannot take part in automatic scaling. CapacityRemainingGB is not the right metric to use, since "(HDFS) utilization never exceeds 10%". So the answer is D.

upvoted 1 times

...

rocky48

2 years, 10 months ago

Selected Answer: D

upvoted 1 times

...

Ramshizzle

2 years, 11 months ago

Answer should be D, as others have said. However, I think it would be even better to use instance fleets with EMR managed scaling, but this is not an option here.

upvoted 1 times

...

Bik000

3 years ago

Selected Answer: D

Answer is D

upvoted 1 times

...

jrheen

3 years ago

Answer-D

upvoted 1 times

...

ShilaP

3 years, 2 months ago

D is the right answer.

upvoted 1 times

...

aws2019

3 years, 6 months ago

Option D is the right choice.

upvoted 1 times

...

Billhardy

3 years, 6 months ago

Ans D

upvoted 1 times

...

Naresh_Dulam

3 years, 7 months ago

Answer is D over B, because Spot instance fleets support "managed" scaling, and managed scaling can't use a CloudWatch metric like YARNMemoryAvailablePercentage. Managed scaling scales depending on the load on the cluster.

upvoted 4 times

...

lostsoul07

3 years, 7 months ago

D is the right answer

upvoted 1 times

...

BillyC

3 years, 7 months ago

D is correct for me.

upvoted 3 times

...