PG17 compatibility: Preserve DEBUG output in cte_inline (#7755)

Regression test cte_inline has the following diff;
```
DEBUG: CTE cte_1 is going to be inlined via distributed planning 
DEBUG: CTE cte_1 is going to be inlined via distributed planning 
DEBUG: Creating router plan 
-DEBUG: query has a single distribution column value: 1 
```
DEBUG message `query has a single distribution column value` does not
appear with PG17. This is because PG17 can recognize when a Result node
does not need to have an input node, so the predicate on the
distribution column is not present in the query plan. Comparing the
query plan obtained before PG17:
```
│ Result                                                                         │
│   One-Time Filter: false                                                       │
│   ->  GroupAggregate                                                           │
│         ->  Seq Scan on public.test_table                                      │
│               Filter: (test_table.key = 1)                                     │

```
with the PG17 query plan:
```
┌──────────────────────────────────┐
│            QUERY PLAN            │
├──────────────────────────────────┤
│ Result                           │
│   One-Time Filter: false         │
└──────────────────────────────────┘
```
we see that the Result node in the PG16 plan has an Aggregate node, but
the Result node in the PG17 plan does not have any input node; PG17
recognizes it is not needed given a Filter that evaluates to False at
compile-time. The Result node is present in both plans because PG in
both versions can recognize when a combination of predicates equate to
false at compile time; this is the because the successive predicates in
the test query (key=6, key=5, key=4, etc) become contradictory when the
CTEs are inlined. Here is an example query showing the effect of the CTE
inlining:
```
select count(*), key FROM test_table WHERE key = 1 AND key = 2 GROUP BY key;
```
In this case, the WHERE clause obviously evaluates to False. The PG16
query plan for this query is:
```
┌────────────────────────────────────┐
│             QUERY PLAN             │
├────────────────────────────────────┤
│ GroupAggregate                     │
│   ->  Result                       │
│         One-Time Filter: false     │
│         ->  Seq Scan on test_table │
│               Filter: (key = 1)    │
└────────────────────────────────────┘
```
The PG17 query plan is:
```
┌────────────────────────────────┐
│           QUERY PLAN           │
├────────────────────────────────┤
│ GroupAggregate                 │
│   ->  Result                   │
│         One-Time Filter: false │
└────────────────────────────────┘
```
In both plans the PG optimizer is able to derive the predicate 1=2 from
the equivalence class { key, 1, 2 } and then constant fold this to
False. But, in the PG16 plan the Result node has an input node (a
sequential scan on test_table), while in the PG17 plan the Result node
does not have any input. This is because PG17 recognizes that when the
Result filter resolves to False at compile time it is not necessary to
set an input on the Result. I think this is a consequence of this PG17
commit:

https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=b262ad440
which handles redundant IS [NOT] NULL predicates, but also refactored
evaluating of predicates to true/false at compile-time, enabling
optimizations such as those seen here.

Given the reason for the diff, the fix preserves the test output by
modifying the query so the predicates are not contradictory when the
CTEs are inlined.
pull/7745/head
Colm 2024-11-19 21:14:57 +00:00 committed by GitHub
parent b29ecd1b12
commit 0fed87ada9
No known key found for this signature in database
GPG Key ID: B5690EEEBB952194
3 changed files with 18 additions and 18 deletions

View File

@ -654,12 +654,12 @@ WITH cte_1 AS (
WITH cte_1 AS (
WITH cte_1 AS (SELECT count(*), key FROM test_table GROUP BY key)
SELECT * FROM cte_1)
SELECT * FROM cte_1 WHERE key = 1)
SELECT * FROM cte_1 WHERE key = 2)
SELECT * FROM cte_1 WHERE key = 3)
SELECT * FROM cte_1 WHERE key = 4)
SELECT * FROM cte_1 WHERE key = 5)
SELECT * FROM cte_1 WHERE key = 6;
SELECT * FROM cte_1 WHERE key >= 1)
SELECT * FROM cte_1 WHERE key >= 2)
SELECT * FROM cte_1 WHERE key >= 3)
SELECT * FROM cte_1 WHERE key >= 4)
SELECT * FROM cte_1 WHERE key >= 5)
SELECT * FROM cte_1 WHERE key = 1;
DEBUG: CTE cte_1 is going to be inlined via distributed planning
DEBUG: CTE cte_1 is going to be inlined via distributed planning
DEBUG: CTE cte_1 is going to be inlined via distributed planning

View File

@ -654,12 +654,12 @@ WITH cte_1 AS (
WITH cte_1 AS (
WITH cte_1 AS (SELECT count(*), key FROM test_table GROUP BY key)
SELECT * FROM cte_1)
SELECT * FROM cte_1 WHERE key = 1)
SELECT * FROM cte_1 WHERE key = 2)
SELECT * FROM cte_1 WHERE key = 3)
SELECT * FROM cte_1 WHERE key = 4)
SELECT * FROM cte_1 WHERE key = 5)
SELECT * FROM cte_1 WHERE key = 6;
SELECT * FROM cte_1 WHERE key >= 1)
SELECT * FROM cte_1 WHERE key >= 2)
SELECT * FROM cte_1 WHERE key >= 3)
SELECT * FROM cte_1 WHERE key >= 4)
SELECT * FROM cte_1 WHERE key >= 5)
SELECT * FROM cte_1 WHERE key = 1;
DEBUG: CTE cte_1 is going to be inlined via distributed planning
DEBUG: CTE cte_1 is going to be inlined via distributed planning
DEBUG: CTE cte_1 is going to be inlined via distributed planning

View File

@ -350,12 +350,12 @@ WITH cte_1 AS (
WITH cte_1 AS (
WITH cte_1 AS (SELECT count(*), key FROM test_table GROUP BY key)
SELECT * FROM cte_1)
SELECT * FROM cte_1 WHERE key = 1)
SELECT * FROM cte_1 WHERE key = 2)
SELECT * FROM cte_1 WHERE key = 3)
SELECT * FROM cte_1 WHERE key = 4)
SELECT * FROM cte_1 WHERE key = 5)
SELECT * FROM cte_1 WHERE key = 6;
SELECT * FROM cte_1 WHERE key >= 1)
SELECT * FROM cte_1 WHERE key >= 2)
SELECT * FROM cte_1 WHERE key >= 3)
SELECT * FROM cte_1 WHERE key >= 4)
SELECT * FROM cte_1 WHERE key >= 5)
SELECT * FROM cte_1 WHERE key = 1;