PostgreSQL Join Optimization: Nested Loop, Hash, and Merge

1 / 2

PostgreSQL Join Optimization: Nested Loop, Hash, and Merge

DEV Community·Philip McClarence·about 1 month ago

#5VAsqr1L

#postgres #performance #database #sql #when #join

Reading 0:00

15s threshold

PostgreSQL has three join algorithms. The planner picks between them for every join in every query, driven by several things at once: the estimated sizes of the two inputs, whether they arrive already sorted on the join key, the type of join (inner vs left/semi/anti), which operators are mergejoinable or hashjoinable , whether a hash table will fit in work_mem , and the cost parameters that weigh I/O against CPU. Get the decision right and a three-way join across millions of rows runs in tens of milliseconds. Get it wrong — usually by encouraging a Nested Loop on two large unsorted inputs — and the same query takes minutes. This article is the third in the Complete Guide to PostgreSQL SQL Query Analysis & Optimization series. We assume the reader can read EXPLAIN output and is familiar with the indexing vocabulary . The running dataset is the same Neon Postgres 17.8 database used throughout the series: 500,000-row sim_bp_orders , 1,000,000-row sim_bp_order_items , 200,000-row sim_bp_users .…

Continue reading — create a free account

Join HashtagPLUS to read full articles, follow hashtags, vote, and join the conversation.

Create free account Log in

Menu

PostgreSQL Join Optimization: Nested Loop, Hash, and Merge