DESCRIPTION: Add `citus_stats` UDF
This UDF acts on a Citus table and provides `null_frac`,
`most_common_vals` and `most_common_freqs` for each column in the table,
following the definitions of the same columns in the Postgres view
`pg_stats`.
**Aggregated Views: pg\_stats > citus\_stats**
citus\_stats is a **view** intended for use in **Citus**, a distributed
extension of PostgreSQL. It collects and returns **column-level**
**statistics** for a distributed table, specifically the **most common
values**, their **frequencies**, and the **fraction of null values**, as
the pg\_stats view does for regular Postgres tables.
**Use Case**
This view is useful when:
- You need **column-level insights** on a distributed table.
- You're performing **query optimization**, **cardinality estimation**,
or **data profiling** across shards.
**What It Returns**
A **table** with:
| Column Name | Data Type | Description |
|---------------------|-----------|-----------------------------------------------------------------------------|
| schemaname | text | Name of the schema containing the distributed table |
| tablename | text | Name of the distributed table |
| attname | text | Name of the column (attribute) |
| null_frac | float4 | Estimated fraction of NULLs in the column across all shards |
| most_common_vals | text[] | Array of most common values for the column |
| most_common_freqs | float4[] | Array of corresponding frequencies (as fractions) of the most common values |
**Caveats**
- citus\_stats assumes that every shard reports the same array of most
common values, so it simply adds up the per-shard figures. If shards
report different most common values, the combined frequencies may be
inaccurate.
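
To illustrate the caveat, here is a minimal Python sketch of one way per-shard statistics could be summed into a table-level result. It is not the code in this PR: the function name `combine_shard_stats`, the input layout, and the choice to weight each shard by its row count are all assumptions made for the example.

```python
from collections import defaultdict


def combine_shard_stats(shards):
    """Combine per-shard column statistics into one table-level estimate.

    Hypothetical input layout (not taken from the PR): each shard is a dict
    with "rows" (row count), "null_frac" (fraction of NULLs) and "mcv"
    (mapping of most common value -> frequency within that shard).
    """
    total_rows = sum(shard["rows"] for shard in shards)
    if total_rows == 0:
        return {"null_frac": 0.0, "most_common_vals": [], "most_common_freqs": []}

    # Turn fractions back into estimated row counts so shards of different
    # sizes contribute proportionally (an assumption of this sketch).
    null_rows = sum(shard["null_frac"] * shard["rows"] for shard in shards)

    value_rows = defaultdict(float)
    for shard in shards:
        for value, freq in shard["mcv"].items():
            # A value missing from a shard's list contributes nothing here,
            # even though the shard may still contain it. This is the caveat.
            value_rows[value] += freq * shard["rows"]

    ranked = sorted(value_rows.items(), key=lambda item: item[1], reverse=True)
    return {
        "null_frac": null_rows / total_rows,
        "most_common_vals": [value for value, _ in ranked],
        "most_common_freqs": [rows / total_rows for _, rows in ranked],
    }


# Two shards whose most-common-value lists only partly overlap.
example_shards = [
    {"rows": 100, "null_frac": 0.1, "mcv": {"red": 0.5, "blue": 0.2}},
    {"rows": 300, "null_frac": 0.2, "mcv": {"red": 0.3, "green": 0.4}},
]
combined = combine_shard_stats(example_shards)
```

In this example `red` appears in both lists and is combined from both shards, while `green` and `blue` are each counted from a single shard only. Any `green` rows in the first shard are ignored because that shard did not list the value, so its combined frequency is a lower bound rather than an exact figure.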