Commit Graph

2 Commits (274504465d3f49920b4b7519c599e9044a91459a)

Author SHA1 Message Date
Naisila Puka 274504465d Fix invalid input syntax for type bigint (#8166)
Fixes #8164
2025-08-29 01:43:57 +03:00
Naisila Puka eaa609f510 Add citus_stats UDF (#8026)
DESCRIPTION: Add `citus_stats` UDF

This UDF acts on a Citus table, and provides `null_frac`,
`most_common_vals` and `most_common_freqs` for each column in the table,
based on the definitions of these columns in the Postgres view
`pg_stats`.

**Aggregated Views: pg\_stats > citus\_stats** 

citus\_stats is a **view** intended for use in **Citus**, a distributed
extension of PostgreSQL. It collects and returns **column-level**
**statistics** for a distributed table: specifically, the **most common
values**, their **frequencies**, and the **fraction of null values**, as
the pg\_stats view does for regular Postgres tables.

**Use Case** 

This view is useful when: 

- You need **column-level insights** on a distributed table. 
- You're performing **query optimization**, **cardinality estimation**,
or **data profiling** across shards.

**What It Returns** 

A **table** with: 

| Column Name | Data Type | Description |
|---------------------|-----------|-----------------------------------------------------------------------------|
| schemaname | text | Name of the schema containing the distributed table |
| tablename | text | Name of the distributed table |
| attname | text | Name of the column (attribute) |
| null_frac | float4 | Estimated fraction of NULLs in the column across all shards |
| most_common_vals | text[] | Array of most common values for the column |
| most_common_freqs | float4[] | Array of corresponding frequencies (as fractions) of the most common values |

**Caveats**

- The function assumes that every shard reports the same array of most
common values, so it simply adds the per-shard values together.
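
To illustrate the caveat, here is a minimal Python sketch of one way per-shard statistics could be combined by "adding everything up". It is not the code from this commit: the `ShardColumnStats` class, the `merge_shard_stats` function, and the choice to weight each shard by an estimated row count are all assumptions made for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class ShardColumnStats:
    """Hypothetical pg_stats-like statistics for one column on one shard."""
    row_count: float                # estimated number of rows in the shard
    null_frac: float                # fraction of NULLs within this shard
    most_common_vals: list[str]     # most common values within this shard
    most_common_freqs: list[float]  # their frequencies, as fractions of the shard


def merge_shard_stats(shards):
    """Combine per-shard stats into (null_frac, most_common_vals, most_common_freqs).

    Assumption: each fraction is converted to an estimated row count,
    the counts are summed across shards, and the sums are divided by
    the total number of rows.
    """
    total_rows = sum(s.row_count for s in shards)
    if total_rows == 0:
        return 0.0, [], []

    # Row-weighted average of the per-shard NULL fractions.
    null_frac = sum(s.null_frac * s.row_count for s in shards) / total_rows

    # Sum the estimated occurrences of each value over all shards.
    counts = defaultdict(float)
    for s in shards:
        for val, freq in zip(s.most_common_vals, s.most_common_freqs):
            counts[val] += freq * s.row_count

    # Most frequent values first, with frequencies relative to the whole table.
    ranked = sorted(counts.items(), key=lambda kv: kv[1], reverse=True)
    vals = [v for v, _ in ranked]
    freqs = [c / total_rows for _, c in ranked]
    return null_frac, vals, freqs


# Two hypothetical shards that report the same set of common values.
shard_a = ShardColumnStats(100, 0.25, ["a", "b"], [0.5, 0.25])
shard_b = ShardColumnStats(300, 0.0, ["a", "b"], [0.25, 0.5])
merged = merge_shard_stats([shard_a, shard_b])
```

In this sketch, a value that appears in one shard's list but not in another's receives no contribution from the second shard, so its table-wide frequency is underestimated. That is why the result is only reliable when the assumption in the caveat holds.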
2025-08-19 23:17:13 +03:00