Stop telling people to avoid JOINs in ClickHouse (It’s not 2020 anymore)

Reddit r/analytics·u/FarRub2855·about 1 month ago

Hi r/analytics,

If you evaluated ClickHouse a few years ago, you were likely told to avoid **JOINs** at all costs. The standard "tribal knowledge" was to denormalize everything into massive, flat tables.

In 2020, that criticism was fair. ClickHouse had a single hash join algorithm. If your right-side table exceeded memory, the query just crashed.

**That advice is now officially outdated.**

I recently dug into the commit history. Between 2022 and 2026, the engineering team merged over 50 pull requests that dismantled almost every limitation of the join engine. ClickHouse now has the planning sophistication of a mature RDBMS operating inside a vectorized model.

I mapped out the most impactful changes shipping by default today:

* **Grace Hash Join:** Inactive buckets now spill to disk. OOM crashes for memory-bound joins are completely solved.…
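For anyone who hasn't met the grace hash idea before, here is a rough Python sketch of the general technique: hash both sides into buckets on the join key, spill the buckets to disk, then join one bucket pair at a time so only a single bucket's hash table sits in memory. This is a toy illustration, not ClickHouse's implementation. The function names, the pickle-based spill files, and the `num_buckets` parameter are all made up for the example, and it spills every bucket rather than keeping an active one in memory.

```python
import os
import pickle
import tempfile
from collections import defaultdict


def _partition_to_disk(rows, key, num_buckets, tmpdir, prefix):
    """Write each row to the spill file of the bucket its join key hashes to."""
    paths = [os.path.join(tmpdir, f"{prefix}_{i}.bin") for i in range(num_buckets)]
    files = [open(p, "wb") for p in paths]
    try:
        for row in rows:
            pickle.dump(row, files[hash(row[key]) % num_buckets])
    finally:
        for f in files:
            f.close()
    return paths


def _read_bucket(path):
    """Stream rows back from one spill file."""
    with open(path, "rb") as f:
        while True:
            try:
                yield pickle.load(f)
            except EOFError:
                return


def grace_hash_join(left, right, key, num_buckets=4):
    """Inner-join two iterables of dicts on `key`, one bucket pair at a time."""
    with tempfile.TemporaryDirectory() as tmpdir:
        left_paths = _partition_to_disk(left, key, num_buckets, tmpdir, "left")
        right_paths = _partition_to_disk(right, key, num_buckets, tmpdir, "right")
        for left_path, right_path in zip(left_paths, right_paths):
            # Only this bucket's right-side rows are held in memory.
            table = defaultdict(list)
            for row in _read_bucket(right_path):
                table[row[key]].append(row)
            # Matching keys always hash to the same bucket index on both sides.
            for row in _read_bucket(left_path):
                for match in table.get(row[key], []):
                    yield {**match, **row}


if __name__ == "__main__":
    orders = [{"user_id": 1, "amount": 30}, {"user_id": 2, "amount": 15}]
    users = [{"user_id": 1, "name": "ana"}, {"user_id": 3, "name": "bo"}]
    for joined in grace_hash_join(orders, users, "user_id"):
        print(joined)
```

The point of the sketch is the memory profile: peak usage is bounded by the largest single bucket of the right side rather than the whole right table, which is why a join that used to run out of memory can finish by trading RAM for disk I/O.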
