Exam AWS Certified Data Analytics - Specialty topic 1 question 5 discussion

A data analyst is using AWS Glue to organize, cleanse, validate, and format a 200 GB dataset. The data analyst triggered the job to run with the Standard worker type. After 3 hours, the AWS Glue job status is still RUNNING. Logs from the job run show no error codes. The data analyst wants to improve the job execution time without overprovisioning.
Which actions should the data analyst take?

  • A. Enable job bookmarks in AWS Glue to estimate the number of data processing units (DPUs). Based on the profiled metrics, increase the value of the executor-cores job parameter.
  • B. Enable job metrics in AWS Glue to estimate the number of data processing units (DPUs). Based on the profiled metrics, increase the value of the maximum capacity job parameter.
  • C. Enable job metrics in AWS Glue to estimate the number of data processing units (DPUs). Based on the profiled metrics, increase the value of the spark.yarn.executor.memoryOverhead job parameter.
  • D. Enable job bookmarks in AWS Glue to estimate the number of data processing units (DPUs). Based on the profiled metrics, increase the value of the num-executors job parameter.
Suggested Answer: B
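The suggested fix has two steps: enable job metrics (via the `--enable-metrics` special job parameter), profile the run in CloudWatch to estimate the needed DPUs, then raise the maximum capacity. A minimal sketch with boto3 follows; the job name, IAM role, script location, and capacity value are all hypothetical placeholders, not values from the question.

```python
# Hypothetical sketch: enable Glue job metrics, then raise MaxCapacity
# based on the profiled DPU metrics. All names/values are placeholders.

JOB_NAME = "my-etl-job"  # hypothetical job name

job_update = {
    "Role": "arn:aws:iam::123456789012:role/GlueJobRole",  # placeholder role
    "Command": {
        "Name": "glueetl",
        "ScriptLocation": "s3://my-bucket/scripts/etl.py",  # placeholder path
    },
    # Special parameter that turns on job metrics for CloudWatch profiling
    "DefaultArguments": {"--enable-metrics": ""},
    # Raised from the 10-DPU default after reviewing the profiled metrics
    "MaxCapacity": 20.0,
}

def apply_update(job_name, update):
    """Apply the update; UpdateJob replaces the full job definition."""
    import boto3  # imported lazily so the payload can be built offline
    return boto3.client("glue").update_job(JobName=job_name, JobUpdate=update)
```

In practice you would enable metrics first, let the job run, read the "executor/driver memory" and "needed executors" charts in CloudWatch, and only then pick the new capacity.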

Comments

Donell
Highly Voted 3 years, 6 months ago
Answer: B B. Enable job metrics in AWS Glue to estimate the number of data processing units (DPUs). Based on the profiled metrics, increase the value of the maximum capacity job parameter. Similar question is there in Jon Bonso's practice exam.
upvoted 15 times
...
cloudlearnerhere
Highly Voted 2 years, 6 months ago
Correct answer is B as job metrics can be used to estimate the number of DPUs needed. Options A & D are wrong as job bookmarks help AWS Glue maintain state information and prevent the reprocessing of old data.
upvoted 6 times
...
gofavad926
Most Recent 1 year, 7 months ago
Selected Answer: B
B. Enable job metrics in AWS Glue to estimate the number of data processing units (DPUs). Based on the profiled metrics, increase the value of the maximum capacity job parameter.
upvoted 1 times
...
NikkyDicky
1 year, 9 months ago
Selected Answer: B
It's a B
upvoted 1 times
...
pk349
2 years ago
B: I passed the test
upvoted 1 times
...
AwsNewPeople
2 years, 2 months ago
B. Enable job metrics in AWS Glue to estimate the number of data processing units (DPUs). Based on the profiled metrics, increase the value of the maximum capacity job parameter.

The data analyst should enable job metrics in AWS Glue to estimate the number of data processing units (DPUs) and profile the job to understand its resource requirements. Based on the profiled metrics, the data analyst should increase the value of the maximum capacity job parameter. This parameter controls the maximum number of DPUs that the job can use. By increasing the maximum capacity, the job can use more resources and complete faster without overprovisioning.

Enabling job bookmarks can help with incremental processing but will not directly improve job execution time. Increasing the value of the executor-cores job parameter or the spark.yarn.executor.memoryOverhead job parameter may improve performance, but these parameters depend on the specific job requirements and are not directly related to the job's resource utilization. Similarly, increasing the num-executors job parameter will not directly improve job execution time.
upvoted 5 times
...
rocky48
2 years, 9 months ago
Selected Answer: B
Answer: B
upvoted 2 times
...
killohotel
3 years, 6 months ago
Answer: B. Bookmarks just track state, so A and D are out. Use the metrics to estimate the appropriate number of DPUs, then increase the maximum capacity. The memoryOverhead parameter is something you adjust when an error has occurred, but here the job has been in the RUNNING state for 3 hours without errors, so C is out. https://docs.aws.amazon.com/ko_kr/glue/latest/dg/monitor-debug-capacity.html#monitor-debug-capacity-fix
upvoted 2 times
teo2157
1 year, 4 months ago
Clear like water
upvoted 1 times
...
...
Donell
3 years, 6 months ago
I suggest taking Jon Bonso's practice exams too.
upvoted 1 times
...
Huy
3 years, 6 months ago
This question is a bit confusing because for Glue version 2.0 jobs you cannot specify a Maximum capacity; instead, you specify a Worker type and the Number of workers. However, since A and D (no such parameters) and C (memoryOverhead does not help in this case) are wrong, the best choice is B.
upvoted 3 times
...
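As the comment above notes, Glue 2.0+ jobs are scaled with a worker type and a worker count rather than MaxCapacity. A minimal sketch of the equivalent update payload, with hypothetical placeholder values:

```python
# Hypothetical Glue 2.0+ equivalent: scale with WorkerType/NumberOfWorkers
# instead of MaxCapacity. The worker count is a placeholder, not a
# profiled number; each G.1X worker corresponds to 1 DPU.

job_update_v2 = {
    "GlueVersion": "2.0",
    "WorkerType": "G.1X",        # 4 vCPU / 16 GB memory per worker
    "NumberOfWorkers": 20,       # ~20 DPUs of capacity
    "DefaultArguments": {"--enable-metrics": ""},  # still profile first
}
```

The same principle applies either way: enable metrics, profile, then scale out to the estimated capacity rather than guessing high.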
Shraddha
3 years, 6 months ago
Ans B. A and D = wrong: the name "bookmark" suggests persistence of in-progress state, and indeed it is used to track processed data, not for scaling. C = wrong: although you can set this parameter, the job metrics won't help you choose it, and it won't fix long run times, because those were caused by a lack of computational power, not memory.
upvoted 1 times
...
gunjan4392
3 years, 6 months ago
B is correct.
upvoted 1 times
...
Exia
3 years, 6 months ago
B. A and D are wrong: bookmarks are not used for monitoring ETL job status.
upvoted 1 times
...
lostsoul07
3 years, 6 months ago
B is the right answer
upvoted 1 times
...
Draco31
3 years, 7 months ago
B. B and C can make sense, but C would be right only if the job returned a spark.yarn.executor.memoryOverhead error. With no error, the job is just taking too long, so increase the maximum capacity. Note that for AWS Glue version 2.0 jobs you cannot specify a Maximum capacity; instead, you specify a Worker type and the Number of workers. https://docs.aws.amazon.com/glue/latest/dg/add-job.html
upvoted 4 times
...
BillyC
3 years, 7 months ago
B is correct!
upvoted 1 times
...
Paitan
3 years, 7 months ago
Option B for sure. We can eliminate the two options with bookmarks, and spark.yarn.executor.memoryOverhead is not the issue here.
upvoted 1 times
...
Community vote distribution: A (35%), C (25%), B (20%), Other