How I debugged a Delta Lake DESCRIBE HISTORY timeout (and what's actually causing it)

DEV Community·Abhishek Ambare·29 days ago

If you have ever run DESCRIBE HISTORY on a Delta table that receives streaming data every 60 seconds and watched it either hang for hours or crash with an OutOfMemoryError, you are not alone and you are not doing anything wrong. The problem is architectural, and once you understand the internals, the fix becomes a lot clearer. Here is what I learned after digging into why this happens and what you can actually do about it.

How the Delta transaction log works

Every write to a Delta table (INSERT, UPDATE, DELETE, MERGE, schema change) gets recorded as a JSON file in a directory called _delta_log at the root of the table. Files are named with zero-padded twenty-digit integers:

_delta_log/
├── 00000000000000000000.json
├── 00000000000000000001.json
├── 00000000000000000002.json
...…
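To make the naming scheme concrete, here is a minimal Python sketch. It is an illustration, not Delta Lake's own code: the helper names (commit_file_name, list_commit_versions, commits_per_day) are hypothetical, the listing only works on a local directory, and the per-day estimate assumes every 60-second micro-batch produces exactly one commit file.

```python
import re
from pathlib import Path

# A commit file is a twenty-digit, zero-padded version number plus ".json".
COMMIT_RE = re.compile(r"^([0-9]{20})\.json$")


def commit_file_name(version: int) -> str:
    """Build the commit file name for a table version (hypothetical helper)."""
    if version < 0:
        raise ValueError("version must be non-negative")
    return f"{version:020d}.json"


def list_commit_versions(log_dir) -> list:
    """Return the sorted versions of the commit files found in a local _delta_log directory.

    Entries whose names do not match the commit pattern are skipped.
    """
    versions = []
    for entry in Path(log_dir).iterdir():
        match = COMMIT_RE.match(entry.name)
        if match:
            versions.append(int(match.group(1)))
    return sorted(versions)


def commits_per_day(interval_seconds: int) -> int:
    """Estimate commits per day, ASSUMING one commit per write interval."""
    return 86_400 // interval_seconds


if __name__ == "__main__":
    print(commit_file_name(2))   # 00000000000000000002.json
    print(commits_per_day(60))   # 1440
```

Under that one-commit-per-batch assumption, a table written every 60 seconds gains 1,440 commit files per day, which gives a feel for how quickly the log directory grows.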
