Exam Certified Data Engineer Professional topic 1 question 25 discussion

Actual exam question from Databricks's Certified Data Engineer Professional
Question #: 25
Topic #: 1

A Spark job is taking longer than expected. Using the Spark UI, a data engineer notes that, for tasks in a particular stage, the minimum and median task durations are roughly the same, but the maximum task duration is roughly 100 times as long as the minimum.
Which situation is causing increased duration of the overall job?

  • A. Task queueing resulting from improper thread pool assignment.
  • B. Spill resulting from attached volume storage being too small.
  • C. Network latency due to some cluster nodes being in different regions from the source data.
  • D. Skew caused by more data being assigned to a subset of Spark partitions.
  • E. Credential validation errors while pulling data from an external system.
Suggested Answer: D
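
To illustrate why the min/median/max pattern in the question points to skew, here is a minimal sketch in plain Python (not Spark code). It assumes tasks take time roughly proportional to the rows in their partition, and it uses a hypothetical dataset where one "hot" key carries most of the rows. The helper names `partition_sizes` and `summarize` are made up for this example.

```python
from collections import Counter
from statistics import median


def partition_sizes(keys, num_partitions):
    """Count rows per partition when integer keys are assigned by key % num_partitions."""
    counts = Counter(k % num_partitions for k in keys)
    return [counts.get(p, 0) for p in range(num_partitions)]


def summarize(sizes):
    """Return min, median, and max partition size, mirroring the Spark UI task summary."""
    ordered = sorted(sizes)
    return {"min": ordered[0], "median": median(ordered), "max": ordered[-1]}


# Hypothetical data: keys 0..7 each appear 100 times,
# then key 3 gets 9,900 extra rows, making it a hot key.
keys = [k for k in range(8) for _ in range(100)]
keys += [3] * 9_900

sizes = partition_sizes(keys, num_partitions=8)
summary = summarize(sizes)

# If task time tracks partition size, the slowest task dominates the stage.
skew_ratio = summary["max"] / summary["min"]
print(summary, skew_ratio)
```

In this sketch seven partitions hold the same small number of rows, so the minimum and median match, while the one partition holding the hot key is 100 times larger. The stage cannot finish until that straggler task does, which is the situation option D describes.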


2 months ago
Selected Answer: D
D is correct
upvoted 1 time
2 months, 3 weeks ago
D is the correct answer
upvoted 2 times