Exam Certified Data Engineer Professional topic 1 question 3 discussion

Actual exam question from Databricks's Certified Data Engineer Professional
Question #: 3
Topic #: 1

When scheduling Structured Streaming jobs for production, which configuration automatically recovers from query failures and keeps costs low?

  • A. Cluster: New Job Cluster;
    Retries: Unlimited;
    Maximum Concurrent Runs: Unlimited
  • B. Cluster: New Job Cluster;
    Retries: None;
    Maximum Concurrent Runs: 1
  • C. Cluster: Existing All-Purpose Cluster;
    Retries: Unlimited;
    Maximum Concurrent Runs: 1
  • D. Cluster: New Job Cluster;
    Retries: Unlimited;
    Maximum Concurrent Runs: 1
  • E. Cluster: Existing All-Purpose Cluster;
    Retries: None;
    Maximum Concurrent Runs: 1
Suggested Answer: D

Comments

8605246
Highly Voted 2 years ago
The given answer is correct. Maximum concurrent runs: set to 1, because only one instance of each query may be active at a time. Retries: set to Unlimited. https://docs.databricks.com/en/structured-streaming/query-recovery.html
upvoted 11 times
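As a rough illustration of the settings described above, here is a minimal sketch of a job definition built as a Python dictionary. The field names (`max_concurrent_runs`, `max_retries`, `new_cluster`, and so on) follow the Databricks Jobs API as commonly documented, but check them against the current API reference. The job name, notebook path, runtime version, node type, and retry interval are all hypothetical placeholders, not values from this question.

```python
import json

# Hypothetical job definition: the name, notebook path, and cluster values
# are placeholders chosen for illustration only.
job_settings = {
    "name": "streaming-ingest-example",
    # Only one instance of the streaming query may be active at a time.
    "max_concurrent_runs": 1,
    "tasks": [
        {
            "task_key": "run_stream",
            "notebook_task": {
                "notebook_path": "/Workspace/example/stream_notebook"
            },
            # A new job cluster is created for the run and terminated with it,
            # instead of reusing an existing all-purpose cluster.
            "new_cluster": {
                "spark_version": "<runtime-version>",
                "node_type_id": "<node-type>",
                "num_workers": 2,
            },
            # -1 is assumed here to mean "retry indefinitely" (Retries: Unlimited).
            "max_retries": -1,
            # Assumed wait between a failed run and the next retry.
            "min_retry_interval_millis": 60000,
        }
    ],
}

# Print the payload as JSON, the form it would take in a request body.
print(json.dumps(job_settings, indent=2))
```

The three settings from option D map onto three separate fields: the cluster choice and the retry policy sit on the task, while the concurrency limit sits on the job.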
...
363c4c5
Most Recent 1 month ago
Selected Answer: D
New Job Cluster: a new job cluster is dedicated to the job, so it can be sized for that workload, which helps manage cost and performance more effectively than an existing all-purpose cluster.
Retries: Unlimited: the job is restarted automatically after every failure, with no cap on the number of attempts.
Maximum Concurrent Runs: 1: limiting concurrent runs to 1 prevents multiple instances of the job from running simultaneously, which controls costs and avoids resource contention.
Databricks recommends using jobs compute instead of all-purpose compute when scheduling workflows, because it manages resources more efficiently and reduces costs.
https://learn.microsoft.com/en-us/azure/databricks/structured-streaming/production
https://learn.microsoft.com/en-us/azure/databricks/jobs/continuous
upvoted 1 times
...
79f0e18
1 month, 1 week ago
Selected Answer: D
When running Structured Streaming jobs in production, you want:
  • Automatic failure recovery → requires setting Retries: Unlimited
  • Efficient cost control → use a New Job Cluster, which auto-terminates after job completion
  • Concurrency control → Maximum Concurrent Runs: 1 prevents overlapping runs, which can corrupt streaming state or double-process data
upvoted 1 times
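The three criteria in the comment above can be checked mechanically against the five answer options. The sketch below is only an illustration of that elimination reasoning. The `Option` class and the `satisfies_all` helper are hypothetical names introduced here, and the rules encode the commenters' argument rather than any Databricks API.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Option:
    label: str
    cluster: str                        # "job" or "all-purpose"
    retries: str                        # "unlimited" or "none"
    max_concurrent_runs: Optional[int]  # None stands for "Unlimited"


# The five answer options exactly as listed in the question.
OPTIONS = [
    Option("A", "job", "unlimited", None),
    Option("B", "job", "none", 1),
    Option("C", "all-purpose", "unlimited", 1),
    Option("D", "job", "unlimited", 1),
    Option("E", "all-purpose", "none", 1),
]


def satisfies_all(opt: Option) -> bool:
    """Apply the three criteria named in the discussion."""
    recovers = opt.retries == "unlimited"         # automatic failure recovery
    low_cost = opt.cluster == "job"               # job cluster ends with the run
    single_run = opt.max_concurrent_runs == 1     # no overlapping query instances
    return recovers and low_cost and single_run


matching = [opt.label for opt in OPTIONS if satisfies_all(opt)]
print(matching)  # ['D']
```

Each other option fails exactly one or two checks: A allows overlapping runs, B and E never retry, and C and E run on an all-purpose cluster.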
...
KadELbied
3 months, 1 week ago
Selected Answer: D
Surely D
upvoted 1 times
...
codebender
4 months, 1 week ago
Selected Answer: D
It can't be all-purpose compute.
upvoted 1 times
...
EelkeV
6 months, 1 week ago
Selected Answer: D
A job cluster auto-terminates, and you want retries for recovery.
upvoted 1 times
...
akashdesarda
10 months, 2 weeks ago
Selected Answer: D
Use Databricks Jobs, as it has native integration with the streaming use case. See the example job here: https://docs.databricks.com/en/structured-streaming/query-recovery.html#configure-structured-streaming-jobs-to-restart-streaming-queries-on-failure
upvoted 2 times
...
imatheushenrique
1 year, 2 months ago
D. Cluster: New Job Cluster; Retries: Unlimited; Maximum Concurrent Runs: 1
upvoted 1 times
...
juliom6
1 year, 4 months ago
D is correct https://docs.databricks.com/en/structured-streaming/query-recovery.html
upvoted 1 times
...
AziLa
1 year, 6 months ago
The correct answer is D
upvoted 1 times
...
Jay_98_11
1 year, 7 months ago
Selected Answer: D
D is correct
upvoted 1 times
...
kz_data
1 year, 7 months ago
Selected Answer: D
D is correct
upvoted 1 times
...
sturcu
1 year, 9 months ago
Selected Answer: D
D is correct
upvoted 1 times
...