Exam Certified Data Engineer Professional topic 1 question 22 discussion

Actual exam question from Databricks's Certified Data Engineer Professional
Question #: 22
Topic #: 1

Which statement describes Delta Lake Auto Compaction?

  • A. An asynchronous job runs after the write completes to detect if files could be further compacted; if yes, an OPTIMIZE job is executed toward a default of 1 GB.
  • B. Before a Jobs cluster terminates, OPTIMIZE is executed on all tables modified during the most recent job.
  • C. Optimized writes use logical partitions instead of directory partitions; because partition boundaries are only represented in metadata, fewer small files are written.
  • D. Data is queued in a messaging bus instead of committing data directly to memory; all data is committed from the messaging bus in one batch once the job is complete.
  • E. An asynchronous job runs after the write completes to detect if files could be further compacted; if yes, an OPTIMIZE job is executed toward a default of 128 MB.
Suggested Answer: A 🗳️

Comments

partha1022
1 month ago
Selected Answer: B
Auto compaction is a synchronous job.
upvoted 1 times
...
Shailly
2 months ago
Selected Answer: B
A and E are wrong because auto compaction is a synchronous operation. I vote for B. As per the documentation: "Auto compaction occurs after a write to a table has succeeded and runs synchronously on the cluster that has performed the write. Auto compaction only compacts files that haven’t been compacted previously." https://docs.delta.io/latest/optimizations-oss.html
upvoted 3 times
...
imatheushenrique
3 months, 3 weeks ago
E. An asynchronous job runs after the write completes to detect if files could be further compacted; if yes, an OPTIMIZE job is executed toward a default of 128 MB. https://community.databricks.com/t5/data-engineering/what-is-the-difference-between-optimize-and-auto-optimize/td-p/21189
upvoted 1 times
...
ojudz08
7 months, 1 week ago
Selected Answer: E
E is the answer. Enabling the setting uses 128 MB as the target file size. https://learn.microsoft.com/en-us/azure/databricks/delta/tune-file-size
upvoted 2 times
...
DAN_H
7 months, 3 weeks ago
Selected Answer: E
default file size is 128MB in auto compaction
upvoted 1 times
...
kz_data
8 months, 2 weeks ago
E is correct, as the default file size in auto compaction is 128 MB, not 1 GB as in a normal OPTIMIZE statement.
upvoted 1 times
...
IWantCerts
8 months, 2 weeks ago
Selected Answer: E
128MB is the default.
upvoted 1 times
...
Yogi05
9 months ago
The question is about auto compaction, hence the answer is E, as the default size for auto compaction is 128 MB.
upvoted 1 times
...
hamzaKhribi
9 months, 3 weeks ago
Selected Answer: E
The default target file size for OPTIMIZE is 1 GB; however, this question deals with auto compaction, which, when enabled, runs OPTIMIZE with a 128 MB file size by default.
upvoted 1 times
...
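To make the difference between the two target sizes discussed in this thread concrete, here is a toy sketch in Python. It is not Delta Lake's actual file-selection algorithm: the `compact` helper, the greedy packing strategy, and the input file sizes are all assumptions made purely for illustration.

```python
def compact(file_sizes_mb, target_mb=128):
    """Greedily pack small files into output files of at most target_mb each.

    Toy model only (hypothetical helper): files already at or above the
    target size are left untouched.
    """
    untouched = [s for s in file_sizes_mb if s >= target_mb]
    outputs, current = [], 0
    for size in sorted(s for s in file_sizes_mb if s < target_mb):
        # Close the current output file once adding another input would exceed the target.
        if current and current + size > target_mb:
            outputs.append(current)
            current = 0
        current += size
    if current:
        outputs.append(current)
    return untouched + outputs


small_files = [16] * 64  # 64 files of 16 MB each, 1024 MB in total (made-up input)
auto_compacted = compact(small_files, target_mb=128)   # 128 MB target cited for auto compaction
optimized = compact(small_files, target_mb=1024)       # 1 GB target cited for standard OPTIMIZE
print(len(auto_compacted), len(optimized))
```

Under these assumptions the same 1024 MB of input ends up as eight 128 MB files with the smaller target and as a single 1 GB file with the larger one, which is the distinction between options E and A.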
aragorn_brego
10 months ago
Selected Answer: A
Delta Lake's Auto Compaction feature is designed to improve the efficiency of data storage by reducing the number of small files in a Delta table. After data is written to a Delta table, an asynchronous job can be triggered to evaluate the file sizes. If it determines that there are a significant number of small files, it will automatically run the OPTIMIZE command, which coalesces these small files into larger ones, typically aiming for files around 1 GB in size for optimal performance. E is incorrect because the statement is similar to A but with an incorrect default file size target.
upvoted 4 times
Kill9
3 months ago
The table property delta.autoOptimize.autoCompact targets 128 MB. With the table property delta.tuneFileSizesForRewrites, for tables larger than 10 TB, the target file size is 1 GB. https://learn.microsoft.com/en-us/azure/databricks/delta/tune-file-size
upvoted 1 times
...
...
BIKRAM063
10 months, 3 weeks ago
Selected Answer: E
E is correct. Auto compact tries to optimize to a file size of 128MB
upvoted 1 times
...
sturcu
11 months, 2 weeks ago
Selected Answer: E
E is the best fit, although Databricks says that auto compaction runs synchronously.
upvoted 3 times
...
Eertyy
1 year ago
correct answer is e
upvoted 1 times
...
cotardo2077
1 year ago
Selected Answer: E
E fits best, but according to the docs it is a synchronous operation: "Auto compaction occurs after a write to a table has succeeded and runs synchronously on the cluster that has performed the write. Auto compaction only compacts files that haven’t been compacted previously."
upvoted 4 times
...
taif12340
1 year ago
Correct answer is E. Auto optimize consists of 2 complementary operations:
  • Optimized writes: with this feature enabled, Databricks attempts to write out 128 MB files for each table partition.
  • Auto compaction: this checks after an individual write whether files can be compacted further. If so, it runs an OPTIMIZE job with 128 MB file sizes (instead of the 1 GB file size used in the standard OPTIMIZE).
upvoted 3 times
...
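The two auto optimize operations described in the comment above are usually switched on per table through table properties. The following is a minimal sketch, assuming the Databricks table properties `delta.autoOptimize.optimizeWrite` and `delta.autoOptimize.autoCompact` (the latter is the one named elsewhere in this thread). The helper `auto_optimize_ddl` and the table name `sales.events` are hypothetical, and the code only builds the SQL string so that it runs without a cluster.

```python
def auto_optimize_ddl(table_name: str,
                      optimize_write: bool = True,
                      auto_compact: bool = True) -> str:
    """Build an ALTER TABLE statement that sets the two auto optimize table properties."""
    props = {
        "delta.autoOptimize.optimizeWrite": optimize_write,
        "delta.autoOptimize.autoCompact": auto_compact,
    }
    # Render Python booleans as the lowercase literals used in SQL.
    rendered = ", ".join(f"{key} = {str(value).lower()}" for key, value in props.items())
    return f"ALTER TABLE {table_name} SET TBLPROPERTIES ({rendered})"


statement = auto_optimize_ddl("sales.events")  # hypothetical table name
print(statement)
```

On a cluster with an active SparkSession, the resulting string could be passed to `spark.sql(statement)`; whether the properties take effect depends on the runtime in use.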
BrianNguyen95
1 year, 1 month ago
correct answer is A
upvoted 1 times
...
8605246
1 year, 1 month ago
The correct answer is E: auto compaction runs an asynchronous job to combine small files toward a default of 128 MB. https://learn.microsoft.com/en-us/azure/databricks/delta/tune-file-size
upvoted 4 times
BrianNguyen95
1 year, 1 month ago
128 MB for partition is not compress
upvoted 1 times
...
...
Community vote distribution
A (35%)
C (25%)
B (20%)
Other
Most Voted