Exam Certified Data Engineer Professional topic 1 question 22 discussion

Actual exam question from Databricks's Certified Data Engineer Professional
Question #: 22
Topic #: 1

Which statement describes Delta Lake Auto Compaction?

  • A. An asynchronous job runs after the write completes to detect if files could be further compacted; if yes, an OPTIMIZE job is executed toward a default of 1 GB.
  • B. Before a Jobs cluster terminates, OPTIMIZE is executed on all tables modified during the most recent job.
  • C. Optimized writes use logical partitions instead of directory partitions; because partition boundaries are only represented in metadata, fewer small files are written.
  • D. Data is queued in a messaging bus instead of committing data directly to memory; all data is committed from the messaging bus in one batch once the job is complete.
  • E. An asynchronous job runs after the write completes to detect if files could be further compacted; if yes, an OPTIMIZE job is executed toward a default of 128 MB.
Suggested Answer: A

Comments

hamzaKhribi
1 week, 2 days ago
Selected Answer: E
The default target file size for OPTIMIZE is 1 GB; however, this question is about auto compaction, which, when enabled, runs OPTIMIZE with a 128 MB target file size by default.
upvoted 1 times
...
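To put the two default targets mentioned in this thread (128 MB and 1 GB) side by side, here is a rough arithmetic sketch. It assumes compaction packs data perfectly toward the target size, which a real OPTIMIZE run will not do exactly; the function name and the 10 GB figure are hypothetical and only for illustration.

```python
import math

MB = 1024 * 1024
GB = 1024 * MB


def estimated_file_count(total_bytes: int, target_file_bytes: int) -> int:
    """Idealized lower bound on the number of files left after compacting
    total_bytes of data toward target_file_bytes per file."""
    return math.ceil(total_bytes / target_file_bytes)


# Hypothetical partition holding 10 GB of data spread over many small files.
total = 10 * GB

print(estimated_file_count(total, 128 * MB))  # 80 files at a 128 MB target
print(estimated_file_count(total, 1 * GB))    # 10 files at a 1 GB target
```

Under this simplification, the smaller target leaves about eight times as many files for the same amount of data.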
aragorn_brego
2 weeks, 6 days ago
Selected Answer: A
Delta Lake's Auto Compaction feature is designed to improve the efficiency of data storage by reducing the number of small files in a Delta table. After data is written to a Delta table, an asynchronous job can be triggered to evaluate the file sizes. If it determines that there are a significant number of small files, it will automatically run the OPTIMIZE command, which coalesces these small files into larger ones, typically aiming for files around 1 GB in size for optimal performance. E is incorrect because the statement is similar to A but with an incorrect default file size target.
upvoted 1 times
...
BIKRAM063
1 month, 1 week ago
Selected Answer: E
E is correct. Auto compaction tries to optimize to a file size of 128 MB.
upvoted 1 times
...
sturcu
2 months ago
Selected Answer: E
E is the best fit, although Databricks says that auto compaction runs synchronously.
upvoted 2 times
...
Eertyy
2 months, 3 weeks ago
correct answer is e
upvoted 1 times
...
cotardo2077
3 months, 1 week ago
Selected Answer: E
E fits best, but according to the docs it is a synchronous operation: "Auto compaction occurs after a write to a table has succeeded and runs synchronously on the cluster that has performed the write. Auto compaction only compacts files that haven’t been compacted previously."
upvoted 4 times
...
taif12340
3 months, 2 weeks ago
Correct answer is E. Auto optimize consists of 2 complementary operations:
  • Optimized writes: with this feature enabled, Databricks attempts to write out 128 MB files for each table partition.
  • Auto compaction: this checks, after an individual write, whether files can be compacted further. If yes, it runs an OPTIMIZE job with 128 MB file sizes (instead of the 1 GB file size used in the standard OPTIMIZE).
upvoted 3 times
...
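For anyone who wants to try the two features described in this thread, here is a minimal sketch that builds the SQL statement for enabling them on a table. It assumes the Delta table properties `delta.autoOptimize.optimizeWrite` and `delta.autoOptimize.autoCompact`; the helper function and the table name `sales.events` are hypothetical, and the statement is only printed here rather than executed.

```python
def auto_optimize_statement(
    table_name: str,
    optimize_write: bool = True,
    auto_compact: bool = True,
) -> str:
    """Build an ALTER TABLE statement that sets the auto optimize properties.

    The helper itself is illustrative; only the property names are meant to
    match what Delta Lake on Databricks uses.
    """
    props = {
        "delta.autoOptimize.optimizeWrite": optimize_write,
        "delta.autoOptimize.autoCompact": auto_compact,
    }
    assignments = ", ".join(
        f"'{key}' = '{str(value).lower()}'" for key, value in props.items()
    )
    return f"ALTER TABLE {table_name} SET TBLPROPERTIES ({assignments})"


# Hypothetical table name, used only for illustration.
stmt = auto_optimize_statement("sales.events")
print(stmt)

# On a cluster with a SparkSession named `spark`, this could be run with:
# spark.sql(stmt)
```

Building the statement as a string keeps the sketch runnable without a Spark session; check the current Databricks documentation before relying on the defaults these properties imply.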
BrianNguyen95
3 months, 3 weeks ago
correct answer is A
upvoted 1 times
...
8605246
4 months, 1 week ago
Correct answer is E: auto compaction runs an asynchronous job to combine small files toward a default of 128 MB. https://learn.microsoft.com/en-us/azure/databricks/delta/tune-file-size
upvoted 4 times
BrianNguyen95
3 months, 3 weeks ago
The 128 MB per-partition target is not for compaction.
upvoted 1 times
...
...
Community vote distribution
A (35%)
C (25%)
B (20%)
Other