From c3d21b807afaa213e2c790fff70f27461c834a96 Mon Sep 17 00:00:00 2001
From: Colm
Date: Tue, 17 Dec 2024 21:42:15 +0000
Subject: [PATCH] PG17 compatibility: fix plan diffs in multi_explain (#7780)

Regress test `multi_explain` has two queries that have a different query
plan with PG17. Here is part of the plan diff for the query labelled
_Union and left join subquery pushdown_ in `multi_explain.sql` (for the
complete diff, search for `multi_explain`
[here](https://github.com/citusdata/citus/actions/runs/12158205599/attempts/1)):
```
 ->  Sort
       Sort Key: ((users.composite_id).tenant_id), ((users.composite_id).user_id), subquery_2.hasdone, events.event_time
-      ->  Hash Left Join
-            Hash Cond: (users.composite_id = subquery_2.composite_id)
-            ->  HashAggregate
-                  Group Key: ((users.composite_id).tenant_id), ((users.composite_id).user_id), users.composite_id, ('action=>1'::text), events.event_time
+      ->  Nested Loop Left Join
+            Join Filter: (users.composite_id = subquery_2.composite_id)
+            ->  Unique
+                  ->  Sort
+                        Sort Key: ((users.composite_id).tenant_id), ((users.composite_id).user_id), users.composite_id, ('action=>1'::text), events.event_time
                   ->  Append
```
The change is the same in both queries: the join order is unchanged
(subquery_1 on the outer side, subquery_2 on the inner side), but the
hash left join is now a nested loop left join. Additionally, the chosen
method of uniquifying the UNION in subquery_1 has changed from hashed
grouping to sort followed by unique, as shown in the diff above.

The PG17 commit that caused this plan change is likely
_[Fix MergeAppend to more accurately compute the number of rows that need to be sorted](https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=9d1a5354f)_
because it impacts the estimated row counts of UNION paths. Comparing a
costed plan of the query between PG16 and PG17, I noticed that with PG16
the rows estimate for the UNION in subquery_1 is 4, whereas with PG17
the rows estimate is 2.
A lower row estimate on the outer side of the join may make nested loop
look cheaper than hash join for the left outer join, hence the plan
change in the two queries where there is a UNION on the outer side of a
left outer join.

The proposed fix achieves a consistent plan across all supported
PostgreSQL versions by temporarily disabling nested loop join and sort
for the two impacted queries; with both disabled, the optimizer selects
hash join for the left outer join and hashed aggregation for the UNION
operation. I investigated tweaking the queries, but was not able to
arrive at a consistent plan, and I believe the SQL operator (e.g. join,
group by, union) implementations are orthogonal to the intent of the
test, so this should be a satisfactory solution, particularly as it
avoids introducing a second alternative output file for `multi_explain`.
---
 src/test/regress/expected/multi_explain.out   | 154 ++++++++++--------
 src/test/regress/expected/multi_explain_0.out | 154 ++++++++++--------
 src/test/regress/sql/multi_explain.sql        |   8 +
 3 files changed, 172 insertions(+), 144 deletions(-)

diff --git a/src/test/regress/expected/multi_explain.out b/src/test/regress/expected/multi_explain.out index 906add24c..bfcf29c4d 100644 --- a/src/test/regress/expected/multi_explain.out +++ b/src/test/regress/expected/multi_explain.out @@ -671,6 +671,15 @@ Aggregate -> Hash -> Seq Scan on events_1400285 events Filter: ((event_type)::text = ANY ('{click,submit,pay}'::text[])) +SELECT success FROM run_command_on_workers('alter system set enable_nestloop to off'); +t +t +SELECT success FROM run_command_on_workers('alter system set enable_sort to off'); +t +t +SELECT success FROM run_command_on_workers('select pg_reload_conf()'); +t +t -- Union and left join subquery pushdown EXPLAIN (COSTS OFF) SELECT @@ -741,41 +750,38 @@ HashAggregate Tasks Shown: One of 4 -> Task Node: host=localhost port=xxxxx dbname=regression - -> GroupAggregate - Group Key: subquery_top.hasdone - -> Sort - Sort Key: 
subquery_top.hasdone - -> Subquery Scan on subquery_top - -> GroupAggregate - Group Key: ((users.composite_id).tenant_id), ((users.composite_id).user_id), subquery_2.hasdone - -> Sort - Sort Key: ((users.composite_id).tenant_id), ((users.composite_id).user_id), subquery_2.hasdone, events.event_time - -> Hash Left Join - Hash Cond: (users.composite_id = subquery_2.composite_id) - -> HashAggregate - Group Key: ((users.composite_id).tenant_id), ((users.composite_id).user_id), users.composite_id, ('action=>1'::text), events.event_time - -> Append - -> Hash Join - Hash Cond: (users.composite_id = events.composite_id) - -> Seq Scan on users_1400289 users - Filter: ((composite_id >= '(1,-9223372036854775808)'::user_composite_type) AND (composite_id <= '(1,9223372036854775807)'::user_composite_type)) - -> Hash - -> Seq Scan on events_1400285 events - Filter: ((event_type)::text = 'click'::text) - -> Hash Join - Hash Cond: (users_1.composite_id = events_1.composite_id) - -> Seq Scan on users_1400289 users_1 - Filter: ((composite_id >= '(1,-9223372036854775808)'::user_composite_type) AND (composite_id <= '(1,9223372036854775807)'::user_composite_type)) - -> Hash - -> Seq Scan on events_1400285 events_1 - Filter: ((event_type)::text = 'submit'::text) - -> Hash - -> Subquery Scan on subquery_2 - -> Unique - -> Sort - Sort Key: ((events_2.composite_id).tenant_id), ((events_2.composite_id).user_id) - -> Seq Scan on events_1400285 events_2 - Filter: ((composite_id >= '(1,-9223372036854775808)'::user_composite_type) AND (composite_id <= '(1,9223372036854775807)'::user_composite_type) AND ((event_type)::text = 'pay'::text)) + -> HashAggregate + Group Key: COALESCE(subquery_2.hasdone, 'Has not done paying'::text) + -> GroupAggregate + Group Key: ((users.composite_id).tenant_id), ((users.composite_id).user_id), subquery_2.hasdone + -> Sort + Sort Key: ((users.composite_id).tenant_id), ((users.composite_id).user_id), subquery_2.hasdone, events.event_time + -> Hash Left Join + Hash 
Cond: (users.composite_id = subquery_2.composite_id) + -> HashAggregate + Group Key: ((users.composite_id).tenant_id), ((users.composite_id).user_id), users.composite_id, ('action=>1'::text), events.event_time + -> Append + -> Hash Join + Hash Cond: (users.composite_id = events.composite_id) + -> Seq Scan on users_1400289 users + Filter: ((composite_id >= '(1,-9223372036854775808)'::user_composite_type) AND (composite_id <= '(1,9223372036854775807)'::user_composite_type)) + -> Hash + -> Seq Scan on events_1400285 events + Filter: ((event_type)::text = 'click'::text) + -> Hash Join + Hash Cond: (users_1.composite_id = events_1.composite_id) + -> Seq Scan on users_1400289 users_1 + Filter: ((composite_id >= '(1,-9223372036854775808)'::user_composite_type) AND (composite_id <= '(1,9223372036854775807)'::user_composite_type)) + -> Hash + -> Seq Scan on events_1400285 events_1 + Filter: ((event_type)::text = 'submit'::text) + -> Hash + -> Subquery Scan on subquery_2 + -> Unique + -> Sort + Sort Key: ((events_2.composite_id).tenant_id), ((events_2.composite_id).user_id) + -> Seq Scan on events_1400285 events_2 + Filter: ((composite_id >= '(1,-9223372036854775808)'::user_composite_type) AND (composite_id <= '(1,9223372036854775807)'::user_composite_type) AND ((event_type)::text = 'pay'::text)) -- Union, left join and having subquery pushdown EXPLAIN (COSTS OFF) SELECT @@ -856,44 +862,48 @@ Sort Tasks Shown: One of 4 -> Task Node: host=localhost port=xxxxx dbname=regression - -> GroupAggregate - Group Key: subquery_top.count_pay - -> Sort - Sort Key: subquery_top.count_pay - -> Subquery Scan on subquery_top - -> GroupAggregate - Group Key: ((users.composite_id).tenant_id), ((users.composite_id).user_id), subquery_2.count_pay - Filter: (array_ndims(array_agg(('action=>1'::text) ORDER BY events.event_time)) > 0) - -> Sort - Sort Key: ((users.composite_id).tenant_id), ((users.composite_id).user_id), subquery_2.count_pay, events.event_time - -> Hash Left Join - Hash Cond: 
(users.composite_id = subquery_2.composite_id) + -> HashAggregate + Group Key: COALESCE(subquery_2.count_pay, '0'::bigint) + -> GroupAggregate + Group Key: ((users.composite_id).tenant_id), ((users.composite_id).user_id), subquery_2.count_pay + Filter: (array_ndims(array_agg(('action=>1'::text) ORDER BY events.event_time)) > 0) + -> Sort + Sort Key: ((users.composite_id).tenant_id), ((users.composite_id).user_id), subquery_2.count_pay, events.event_time + -> Hash Left Join + Hash Cond: (users.composite_id = subquery_2.composite_id) + -> HashAggregate + Group Key: ((users.composite_id).tenant_id), ((users.composite_id).user_id), users.composite_id, ('action=>1'::text), events.event_time + -> Append + -> Hash Join + Hash Cond: (users.composite_id = events.composite_id) + -> Seq Scan on users_1400289 users + Filter: ((composite_id >= '(1,-9223372036854775808)'::user_composite_type) AND (composite_id <= '(1,9223372036854775807)'::user_composite_type)) + -> Hash + -> Seq Scan on events_1400285 events + Filter: ((event_type)::text = 'click'::text) + -> Hash Join + Hash Cond: (users_1.composite_id = events_1.composite_id) + -> Seq Scan on users_1400289 users_1 + Filter: ((composite_id >= '(1,-9223372036854775808)'::user_composite_type) AND (composite_id <= '(1,9223372036854775807)'::user_composite_type)) + -> Hash + -> Seq Scan on events_1400285 events_1 + Filter: ((event_type)::text = 'submit'::text) + -> Hash + -> Subquery Scan on subquery_2 -> HashAggregate - Group Key: ((users.composite_id).tenant_id), ((users.composite_id).user_id), users.composite_id, ('action=>1'::text), events.event_time - -> Append - -> Hash Join - Hash Cond: (users.composite_id = events.composite_id) - -> Seq Scan on users_1400289 users - Filter: ((composite_id >= '(1,-9223372036854775808)'::user_composite_type) AND (composite_id <= '(1,9223372036854775807)'::user_composite_type)) - -> Hash - -> Seq Scan on events_1400285 events - Filter: ((event_type)::text = 'click'::text) - -> Hash Join - 
Hash Cond: (users_1.composite_id = events_1.composite_id) - -> Seq Scan on users_1400289 users_1 - Filter: ((composite_id >= '(1,-9223372036854775808)'::user_composite_type) AND (composite_id <= '(1,9223372036854775807)'::user_composite_type)) - -> Hash - -> Seq Scan on events_1400285 events_1 - Filter: ((event_type)::text = 'submit'::text) - -> Hash - -> Subquery Scan on subquery_2 - -> GroupAggregate - Group Key: events_2.composite_id - Filter: (count(*) > 2) - -> Sort - Sort Key: events_2.composite_id - -> Seq Scan on events_1400285 events_2 - Filter: ((composite_id >= '(1,-9223372036854775808)'::user_composite_type) AND (composite_id <= '(1,9223372036854775807)'::user_composite_type) AND ((event_type)::text = 'pay'::text)) + Group Key: events_2.composite_id + Filter: (count(*) > 2) + -> Seq Scan on events_1400285 events_2 + Filter: ((composite_id >= '(1,-9223372036854775808)'::user_composite_type) AND (composite_id <= '(1,9223372036854775807)'::user_composite_type) AND ((event_type)::text = 'pay'::text)) +SELECT success FROM run_command_on_workers('alter system reset enable_nestloop'); +t +t +SELECT success FROM run_command_on_workers('alter system reset enable_sort'); +t +t +SELECT success FROM run_command_on_workers('select pg_reload_conf()'); +t +t -- Lateral join subquery pushdown -- set subquery_pushdown due to limit in the query SET citus.subquery_pushdown to ON; diff --git a/src/test/regress/expected/multi_explain_0.out b/src/test/regress/expected/multi_explain_0.out index 5ba5e056f..4d3acd14d 100644 --- a/src/test/regress/expected/multi_explain_0.out +++ b/src/test/regress/expected/multi_explain_0.out @@ -671,6 +671,15 @@ Aggregate -> Hash -> Seq Scan on events_1400285 events Filter: ((event_type)::text = ANY ('{click,submit,pay}'::text[])) +SELECT success FROM run_command_on_workers('alter system set enable_nestloop to off'); +t +t +SELECT success FROM run_command_on_workers('alter system set enable_sort to off'); +t +t +SELECT success FROM 
run_command_on_workers('select pg_reload_conf()'); +t +t -- Union and left join subquery pushdown EXPLAIN (COSTS OFF) SELECT @@ -741,41 +750,38 @@ HashAggregate Tasks Shown: One of 4 -> Task Node: host=localhost port=xxxxx dbname=regression - -> GroupAggregate - Group Key: subquery_top.hasdone - -> Sort - Sort Key: subquery_top.hasdone - -> Subquery Scan on subquery_top - -> GroupAggregate - Group Key: ((users.composite_id).tenant_id), ((users.composite_id).user_id), subquery_2.hasdone - -> Sort - Sort Key: ((users.composite_id).tenant_id), ((users.composite_id).user_id), subquery_2.hasdone - -> Hash Left Join - Hash Cond: (users.composite_id = subquery_2.composite_id) - -> HashAggregate - Group Key: ((users.composite_id).tenant_id), ((users.composite_id).user_id), users.composite_id, ('action=>1'::text), events.event_time - -> Append - -> Hash Join - Hash Cond: (users.composite_id = events.composite_id) - -> Seq Scan on users_1400289 users - Filter: ((composite_id >= '(1,-9223372036854775808)'::user_composite_type) AND (composite_id <= '(1,9223372036854775807)'::user_composite_type)) - -> Hash - -> Seq Scan on events_1400285 events - Filter: ((event_type)::text = 'click'::text) - -> Hash Join - Hash Cond: (users_1.composite_id = events_1.composite_id) - -> Seq Scan on users_1400289 users_1 - Filter: ((composite_id >= '(1,-9223372036854775808)'::user_composite_type) AND (composite_id <= '(1,9223372036854775807)'::user_composite_type)) - -> Hash - -> Seq Scan on events_1400285 events_1 - Filter: ((event_type)::text = 'submit'::text) - -> Hash - -> Subquery Scan on subquery_2 - -> Unique - -> Sort - Sort Key: ((events_2.composite_id).tenant_id), ((events_2.composite_id).user_id) - -> Seq Scan on events_1400285 events_2 - Filter: ((composite_id >= '(1,-9223372036854775808)'::user_composite_type) AND (composite_id <= '(1,9223372036854775807)'::user_composite_type) AND ((event_type)::text = 'pay'::text)) + -> HashAggregate + Group Key: COALESCE(subquery_2.hasdone, 'Has 
not done paying'::text) + -> GroupAggregate + Group Key: ((users.composite_id).tenant_id), ((users.composite_id).user_id), subquery_2.hasdone + -> Sort + Sort Key: ((users.composite_id).tenant_id), ((users.composite_id).user_id), subquery_2.hasdone + -> Hash Left Join + Hash Cond: (users.composite_id = subquery_2.composite_id) + -> HashAggregate + Group Key: ((users.composite_id).tenant_id), ((users.composite_id).user_id), users.composite_id, ('action=>1'::text), events.event_time + -> Append + -> Hash Join + Hash Cond: (users.composite_id = events.composite_id) + -> Seq Scan on users_1400289 users + Filter: ((composite_id >= '(1,-9223372036854775808)'::user_composite_type) AND (composite_id <= '(1,9223372036854775807)'::user_composite_type)) + -> Hash + -> Seq Scan on events_1400285 events + Filter: ((event_type)::text = 'click'::text) + -> Hash Join + Hash Cond: (users_1.composite_id = events_1.composite_id) + -> Seq Scan on users_1400289 users_1 + Filter: ((composite_id >= '(1,-9223372036854775808)'::user_composite_type) AND (composite_id <= '(1,9223372036854775807)'::user_composite_type)) + -> Hash + -> Seq Scan on events_1400285 events_1 + Filter: ((event_type)::text = 'submit'::text) + -> Hash + -> Subquery Scan on subquery_2 + -> Unique + -> Sort + Sort Key: ((events_2.composite_id).tenant_id), ((events_2.composite_id).user_id) + -> Seq Scan on events_1400285 events_2 + Filter: ((composite_id >= '(1,-9223372036854775808)'::user_composite_type) AND (composite_id <= '(1,9223372036854775807)'::user_composite_type) AND ((event_type)::text = 'pay'::text)) -- Union, left join and having subquery pushdown EXPLAIN (COSTS OFF) SELECT @@ -856,44 +862,48 @@ Sort Tasks Shown: One of 4 -> Task Node: host=localhost port=xxxxx dbname=regression - -> GroupAggregate - Group Key: subquery_top.count_pay - -> Sort - Sort Key: subquery_top.count_pay - -> Subquery Scan on subquery_top - -> GroupAggregate - Group Key: ((users.composite_id).tenant_id), 
((users.composite_id).user_id), subquery_2.count_pay - Filter: (array_ndims(array_agg(('action=>1'::text) ORDER BY events.event_time)) > 0) - -> Sort - Sort Key: ((users.composite_id).tenant_id), ((users.composite_id).user_id), subquery_2.count_pay - -> Hash Left Join - Hash Cond: (users.composite_id = subquery_2.composite_id) + -> HashAggregate + Group Key: COALESCE(subquery_2.count_pay, '0'::bigint) + -> GroupAggregate + Group Key: ((users.composite_id).tenant_id), ((users.composite_id).user_id), subquery_2.count_pay + Filter: (array_ndims(array_agg(('action=>1'::text) ORDER BY events.event_time)) > 0) + -> Sort + Sort Key: ((users.composite_id).tenant_id), ((users.composite_id).user_id), subquery_2.count_pay + -> Hash Left Join + Hash Cond: (users.composite_id = subquery_2.composite_id) + -> HashAggregate + Group Key: ((users.composite_id).tenant_id), ((users.composite_id).user_id), users.composite_id, ('action=>1'::text), events.event_time + -> Append + -> Hash Join + Hash Cond: (users.composite_id = events.composite_id) + -> Seq Scan on users_1400289 users + Filter: ((composite_id >= '(1,-9223372036854775808)'::user_composite_type) AND (composite_id <= '(1,9223372036854775807)'::user_composite_type)) + -> Hash + -> Seq Scan on events_1400285 events + Filter: ((event_type)::text = 'click'::text) + -> Hash Join + Hash Cond: (users_1.composite_id = events_1.composite_id) + -> Seq Scan on users_1400289 users_1 + Filter: ((composite_id >= '(1,-9223372036854775808)'::user_composite_type) AND (composite_id <= '(1,9223372036854775807)'::user_composite_type)) + -> Hash + -> Seq Scan on events_1400285 events_1 + Filter: ((event_type)::text = 'submit'::text) + -> Hash + -> Subquery Scan on subquery_2 -> HashAggregate - Group Key: ((users.composite_id).tenant_id), ((users.composite_id).user_id), users.composite_id, ('action=>1'::text), events.event_time - -> Append - -> Hash Join - Hash Cond: (users.composite_id = events.composite_id) - -> Seq Scan on users_1400289 users 
- Filter: ((composite_id >= '(1,-9223372036854775808)'::user_composite_type) AND (composite_id <= '(1,9223372036854775807)'::user_composite_type)) - -> Hash - -> Seq Scan on events_1400285 events - Filter: ((event_type)::text = 'click'::text) - -> Hash Join - Hash Cond: (users_1.composite_id = events_1.composite_id) - -> Seq Scan on users_1400289 users_1 - Filter: ((composite_id >= '(1,-9223372036854775808)'::user_composite_type) AND (composite_id <= '(1,9223372036854775807)'::user_composite_type)) - -> Hash - -> Seq Scan on events_1400285 events_1 - Filter: ((event_type)::text = 'submit'::text) - -> Hash - -> Subquery Scan on subquery_2 - -> GroupAggregate - Group Key: events_2.composite_id - Filter: (count(*) > 2) - -> Sort - Sort Key: events_2.composite_id - -> Seq Scan on events_1400285 events_2 - Filter: ((composite_id >= '(1,-9223372036854775808)'::user_composite_type) AND (composite_id <= '(1,9223372036854775807)'::user_composite_type) AND ((event_type)::text = 'pay'::text)) + Group Key: events_2.composite_id + Filter: (count(*) > 2) + -> Seq Scan on events_1400285 events_2 + Filter: ((composite_id >= '(1,-9223372036854775808)'::user_composite_type) AND (composite_id <= '(1,9223372036854775807)'::user_composite_type) AND ((event_type)::text = 'pay'::text)) +SELECT success FROM run_command_on_workers('alter system reset enable_nestloop'); +t +t +SELECT success FROM run_command_on_workers('alter system reset enable_sort'); +t +t +SELECT success FROM run_command_on_workers('select pg_reload_conf()'); +t +t -- Lateral join subquery pushdown -- set subquery_pushdown due to limit in the query SET citus.subquery_pushdown to ON; diff --git a/src/test/regress/sql/multi_explain.sql b/src/test/regress/sql/multi_explain.sql index 4fc16fbd8..65ca6f5da 100644 --- a/src/test/regress/sql/multi_explain.sql +++ b/src/test/regress/sql/multi_explain.sql @@ -260,6 +260,10 @@ FROM tenant_id, user_id) AS subquery; +SELECT success FROM run_command_on_workers('alter system set 
enable_nestloop to off'); +SELECT success FROM run_command_on_workers('alter system set enable_sort to off'); +SELECT success FROM run_command_on_workers('select pg_reload_conf()'); + -- Union and left join subquery pushdown EXPLAIN (COSTS OFF) SELECT @@ -396,6 +400,10 @@ GROUP BY ORDER BY count_pay; +SELECT success FROM run_command_on_workers('alter system reset enable_nestloop'); +SELECT success FROM run_command_on_workers('alter system reset enable_sort'); +SELECT success FROM run_command_on_workers('select pg_reload_conf()'); + -- Lateral join subquery pushdown -- set subquery_pushdown due to limit in the query SET citus.subquery_pushdown to ON;