Compare commits
7 Commits
Author | SHA1 | Date |
---|---|---|
|
a091353c0c | |
|
0509d85e94 | |
|
379ba0d717 | |
|
75cd7a7d6f | |
|
7f6084c14f | |
|
27171c230c | |
|
596bc7194d |
|
@ -1,7 +0,0 @@
|
||||||
exclude_patterns:
|
|
||||||
- "src/backend/distributed/utils/citus_outfuncs.c"
|
|
||||||
- "src/backend/distributed/deparser/ruleutils_*.c"
|
|
||||||
- "src/include/distributed/citus_nodes.h"
|
|
||||||
- "src/backend/distributed/safeclib"
|
|
||||||
- "src/backend/columnar/safeclib"
|
|
||||||
- "**/vendor/"
|
|
40
.codecov.yml
|
@ -1,40 +0,0 @@
|
||||||
codecov:
|
|
||||||
notify:
|
|
||||||
require_ci_to_pass: yes
|
|
||||||
|
|
||||||
coverage:
|
|
||||||
precision: 2
|
|
||||||
round: down
|
|
||||||
range: "70...100"
|
|
||||||
|
|
||||||
ignore:
|
|
||||||
- "src/backend/distributed/utils/citus_outfuncs.c"
|
|
||||||
- "src/backend/distributed/deparser/ruleutils_*.c"
|
|
||||||
- "src/include/distributed/citus_nodes.h"
|
|
||||||
- "src/backend/distributed/safeclib"
|
|
||||||
- "vendor"
|
|
||||||
|
|
||||||
status:
|
|
||||||
project:
|
|
||||||
default:
|
|
||||||
target: 87.5
|
|
||||||
threshold: 0.5
|
|
||||||
|
|
||||||
patch:
|
|
||||||
default:
|
|
||||||
target: 75
|
|
||||||
|
|
||||||
changes: no
|
|
||||||
|
|
||||||
parsers:
|
|
||||||
gcov:
|
|
||||||
branch_detection:
|
|
||||||
conditional: yes
|
|
||||||
loop: yes
|
|
||||||
method: no
|
|
||||||
macro: no
|
|
||||||
|
|
||||||
comment:
|
|
||||||
layout: "header, diff"
|
|
||||||
behavior: default
|
|
||||||
require_changes: no
|
|
|
@ -1,33 +0,0 @@
|
||||||
# gdbpg.py contains scripts to nicely print the postgres datastructures
|
|
||||||
# while in a gdb session. Since the vscode debugger is based on gdb this
|
|
||||||
# actually also works when debugging with vscode. Providing nice tools
|
|
||||||
# to understand the internal datastructures we are working with.
|
|
||||||
source /root/gdbpg.py
|
|
||||||
|
|
||||||
# when debugging postgres it is convenient to _always_ have a breakpoint
|
|
||||||
# trigger when an error is logged. Because .gdbinit is sourced before gdb
|
|
||||||
# is fully attached and has the sources loaded. To make sure the breakpoint
|
|
||||||
# is added when the library is loaded we temporary set the breakpoint pending
|
|
||||||
# to on. After we have added out breakpoint we revert back to the default
|
|
||||||
# configuration for breakpoint pending.
|
|
||||||
# The breakpoint is hard to read, but at entry of the function we don't have
|
|
||||||
# the level loaded in elevel. Instead we hardcode the location where the
|
|
||||||
# level of the current error is stored. Also gdb doesn't understand the
|
|
||||||
# ERROR symbol so we hardcode this to the value of ERROR. It is very unlikely
|
|
||||||
# this value will ever change in postgres, but if it does we might need to
|
|
||||||
# find a way to conditionally load the correct breakpoint.
|
|
||||||
set breakpoint pending on
|
|
||||||
break elog.c:errfinish if errordata[errordata_stack_depth].elevel == 21
|
|
||||||
set breakpoint pending auto
|
|
||||||
|
|
||||||
echo \n
|
|
||||||
echo ----------------------------------------------------------------------------------\n
|
|
||||||
echo when attaching to a postgres backend a breakpoint will be set on elog.c:errfinish \n
|
|
||||||
echo it will only break on errors being raised in postgres \n
|
|
||||||
echo \n
|
|
||||||
echo to disable this breakpoint from vscode run `-exec disable 1` in the debug console \n
|
|
||||||
echo this assumes it's the first breakpoint loaded as it is loaded from .gdbinit \n
|
|
||||||
echo this can be verified with `-exec info break`, enabling can be done with \n
|
|
||||||
echo `-exec enable 1` \n
|
|
||||||
echo ----------------------------------------------------------------------------------\n
|
|
||||||
echo \n
|
|
|
@ -1 +0,0 @@
|
||||||
postgresql-*.tar.bz2
|
|
|
@ -1,7 +0,0 @@
|
||||||
\timing on
|
|
||||||
\pset linestyle unicode
|
|
||||||
\pset border 2
|
|
||||||
\setenv PAGER 'pspg --no-mouse -bX --no-commandbar --no-topbar'
|
|
||||||
\set HISTSIZE 100000
|
|
||||||
\set PROMPT1 '\n%[%033[1m%]%M %n@%/:%> (PID: %p)%R%[%033[0m%]%# '
|
|
||||||
\set PROMPT2 ' '
|
|
|
@ -1,12 +0,0 @@
|
||||||
[[source]]
|
|
||||||
url = "https://pypi.org/simple"
|
|
||||||
verify_ssl = true
|
|
||||||
name = "pypi"
|
|
||||||
|
|
||||||
[packages]
|
|
||||||
docopt = "*"
|
|
||||||
|
|
||||||
[dev-packages]
|
|
||||||
|
|
||||||
[requires]
|
|
||||||
python_version = "3.9"
|
|
|
@ -1,28 +0,0 @@
|
||||||
{
|
|
||||||
"_meta": {
|
|
||||||
"hash": {
|
|
||||||
"sha256": "6956a6700ead5804aa56bd597c93bb4a13f208d2d49d3b5399365fd240ca0797"
|
|
||||||
},
|
|
||||||
"pipfile-spec": 6,
|
|
||||||
"requires": {
|
|
||||||
"python_version": "3.9"
|
|
||||||
},
|
|
||||||
"sources": [
|
|
||||||
{
|
|
||||||
"name": "pypi",
|
|
||||||
"url": "https://pypi.org/simple",
|
|
||||||
"verify_ssl": true
|
|
||||||
}
|
|
||||||
]
|
|
||||||
},
|
|
||||||
"default": {
|
|
||||||
"docopt": {
|
|
||||||
"hashes": [
|
|
||||||
"sha256:49b3a825280bd66b3aa83585ef59c4a8c82f2c8a522dbe754a8bc8d08c85c491"
|
|
||||||
],
|
|
||||||
"index": "pypi",
|
|
||||||
"version": "==0.6.2"
|
|
||||||
}
|
|
||||||
},
|
|
||||||
"develop": {}
|
|
||||||
}
|
|
|
@ -1,84 +0,0 @@
|
||||||
#! /usr/bin/env pipenv-shebang
|
|
||||||
"""Generate C/C++ properties file for VSCode.
|
|
||||||
|
|
||||||
Uses pgenv to iterate postgres versions and generate
|
|
||||||
a C/C++ properties file for VSCode containing the
|
|
||||||
include paths for the postgres headers.
|
|
||||||
|
|
||||||
Usage:
|
|
||||||
generate_c_cpp_properties-json.py <target_path>
|
|
||||||
generate_c_cpp_properties-json.py (-h | --help)
|
|
||||||
generate_c_cpp_properties-json.py --version
|
|
||||||
|
|
||||||
Options:
|
|
||||||
-h --help Show this screen.
|
|
||||||
--version Show version.
|
|
||||||
|
|
||||||
"""
|
|
||||||
import json
|
|
||||||
import subprocess
|
|
||||||
|
|
||||||
from docopt import docopt
|
|
||||||
|
|
||||||
|
|
||||||
def main(args):
|
|
||||||
target_path = args['<target_path>']
|
|
||||||
|
|
||||||
output = subprocess.check_output(['pgenv', 'versions'])
|
|
||||||
# typical output is:
|
|
||||||
# 14.8 pgsql-14.8
|
|
||||||
# * 15.3 pgsql-15.3
|
|
||||||
# 16beta2 pgsql-16beta2
|
|
||||||
# where the line marked with a * is the currently active version
|
|
||||||
#
|
|
||||||
# we are only interested in the first word of each line, which is the version number
|
|
||||||
# thus we strip the whitespace and the * from the line and split it into words
|
|
||||||
# and take the first word
|
|
||||||
versions = [line.strip('* ').split()[0] for line in output.decode('utf-8').splitlines()]
|
|
||||||
|
|
||||||
# create the list of configurations per version
|
|
||||||
configurations = []
|
|
||||||
for version in versions:
|
|
||||||
configurations.append(generate_configuration(version))
|
|
||||||
|
|
||||||
# create the json file
|
|
||||||
c_cpp_properties = {
|
|
||||||
"configurations": configurations,
|
|
||||||
"version": 4
|
|
||||||
}
|
|
||||||
|
|
||||||
# write the c_cpp_properties.json file
|
|
||||||
with open(target_path, 'w') as f:
|
|
||||||
json.dump(c_cpp_properties, f, indent=4)
|
|
||||||
|
|
||||||
|
|
||||||
def generate_configuration(version):
|
|
||||||
"""Returns a configuration for the given postgres version.
|
|
||||||
|
|
||||||
>>> generate_configuration('14.8')
|
|
||||||
{
|
|
||||||
"name": "Citus Development Configuration - Postgres 14.8",
|
|
||||||
"includePath": [
|
|
||||||
"/usr/local/include",
|
|
||||||
"/home/citus/.pgenv/src/postgresql-14.8/src/**",
|
|
||||||
"${workspaceFolder}/**",
|
|
||||||
"${workspaceFolder}/src/include/",
|
|
||||||
],
|
|
||||||
"configurationProvider": "ms-vscode.makefile-tools"
|
|
||||||
}
|
|
||||||
"""
|
|
||||||
return {
|
|
||||||
"name": f"Citus Development Configuration - Postgres {version}",
|
|
||||||
"includePath": [
|
|
||||||
"/usr/local/include",
|
|
||||||
f"/home/citus/.pgenv/src/postgresql-{version}/src/**",
|
|
||||||
"${workspaceFolder}/**",
|
|
||||||
"${workspaceFolder}/src/include/",
|
|
||||||
],
|
|
||||||
"configurationProvider": "ms-vscode.makefile-tools"
|
|
||||||
}
|
|
||||||
|
|
||||||
|
|
||||||
if __name__ == '__main__':
|
|
||||||
arguments = docopt(__doc__, version='0.1.0')
|
|
||||||
main(arguments)
|
|
|
@ -1,40 +0,0 @@
|
||||||
{
|
|
||||||
"version": "0.2.0",
|
|
||||||
"configurations": [
|
|
||||||
{
|
|
||||||
"name": "Attach Citus (devcontainer)",
|
|
||||||
"type": "cppdbg",
|
|
||||||
"request": "attach",
|
|
||||||
"processId": "${command:pickProcess}",
|
|
||||||
"program": "/home/citus/.pgenv/pgsql/bin/postgres",
|
|
||||||
"additionalSOLibSearchPath": "/home/citus/.pgenv/pgsql/lib",
|
|
||||||
"setupCommands": [
|
|
||||||
{
|
|
||||||
"text": "handle SIGUSR1 noprint nostop pass",
|
|
||||||
"description": "let gdb not stop when SIGUSR1 is sent to process",
|
|
||||||
"ignoreFailures": true
|
|
||||||
}
|
|
||||||
],
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"name": "Open core file",
|
|
||||||
"type": "cppdbg",
|
|
||||||
"request": "launch",
|
|
||||||
"program": "/home/citus/.pgenv/pgsql/bin/postgres",
|
|
||||||
"coreDumpPath": "${input:corefile}",
|
|
||||||
"cwd": "${workspaceFolder}",
|
|
||||||
"MIMode": "gdb",
|
|
||||||
}
|
|
||||||
],
|
|
||||||
"inputs": [
|
|
||||||
{
|
|
||||||
"id": "corefile",
|
|
||||||
"type": "command",
|
|
||||||
"command": "extension.commandvariable.file.pickFile",
|
|
||||||
"args": {
|
|
||||||
"dialogTitle": "Select core file",
|
|
||||||
"include": "**/core*",
|
|
||||||
},
|
|
||||||
},
|
|
||||||
],
|
|
||||||
}
|
|
|
@ -1,222 +0,0 @@
|
||||||
FROM ubuntu:22.04 AS base
|
|
||||||
|
|
||||||
# environment is to make python pass an interactive shell, probably not the best timezone given a wide variety of colleagues
|
|
||||||
ENV TZ=UTC
|
|
||||||
RUN ln -snf /usr/share/zoneinfo/$TZ /etc/localtime && echo $TZ > /etc/timezone
|
|
||||||
|
|
||||||
# install build tools
|
|
||||||
RUN apt update && apt install -y \
|
|
||||||
bison \
|
|
||||||
bzip2 \
|
|
||||||
cpanminus \
|
|
||||||
curl \
|
|
||||||
docbook-xml \
|
|
||||||
docbook-xsl \
|
|
||||||
flex \
|
|
||||||
gcc \
|
|
||||||
git \
|
|
||||||
libcurl4-gnutls-dev \
|
|
||||||
libicu-dev \
|
|
||||||
libkrb5-dev \
|
|
||||||
liblz4-dev \
|
|
||||||
libpam0g-dev \
|
|
||||||
libreadline-dev \
|
|
||||||
libselinux1-dev \
|
|
||||||
libssl-dev \
|
|
||||||
libxml2-utils \
|
|
||||||
libxslt-dev \
|
|
||||||
libzstd-dev \
|
|
||||||
locales \
|
|
||||||
make \
|
|
||||||
perl \
|
|
||||||
pkg-config \
|
|
||||||
python3 \
|
|
||||||
python3-pip \
|
|
||||||
software-properties-common \
|
|
||||||
sudo \
|
|
||||||
uuid-dev \
|
|
||||||
valgrind \
|
|
||||||
xsltproc \
|
|
||||||
zlib1g-dev \
|
|
||||||
&& add-apt-repository ppa:deadsnakes/ppa -y \
|
|
||||||
&& apt install -y \
|
|
||||||
python3.9-full \
|
|
||||||
# software properties pulls in pkexec, which makes the debugger unusable in vscode
|
|
||||||
&& apt purge -y \
|
|
||||||
software-properties-common \
|
|
||||||
&& apt autoremove -y \
|
|
||||||
&& apt clean
|
|
||||||
|
|
||||||
RUN sudo pip3 install pipenv pipenv-shebang
|
|
||||||
|
|
||||||
RUN cpanm install IPC::Run
|
|
||||||
|
|
||||||
RUN locale-gen en_US.UTF-8
|
|
||||||
|
|
||||||
# add the citus user to sudoers and allow all sudoers to login without a password prompt
|
|
||||||
RUN useradd -ms /bin/bash citus \
|
|
||||||
&& usermod -aG sudo citus \
|
|
||||||
&& echo '%sudo ALL=(ALL) NOPASSWD:ALL' >> /etc/sudoers
|
|
||||||
|
|
||||||
WORKDIR /home/citus
|
|
||||||
USER citus
|
|
||||||
|
|
||||||
# run all make commands with the number of cores available
|
|
||||||
RUN echo "export MAKEFLAGS=\"-j \$(nproc)\"" >> "/home/citus/.bashrc"
|
|
||||||
|
|
||||||
RUN git clone --branch v1.3.2 --depth 1 https://github.com/theory/pgenv.git .pgenv
|
|
||||||
COPY --chown=citus:citus pgenv/config/ .pgenv/config/
|
|
||||||
ENV PATH="/home/citus/.pgenv/bin:${PATH}"
|
|
||||||
ENV PATH="/home/citus/.pgenv/pgsql/bin:${PATH}"
|
|
||||||
|
|
||||||
USER citus
|
|
||||||
|
|
||||||
# build postgres versions separately for effective parrallelism and caching of already built versions when changing only certain versions
|
|
||||||
FROM base AS pg15
|
|
||||||
RUN MAKEFLAGS="-j $(nproc)" pgenv build 15.13
|
|
||||||
RUN rm .pgenv/src/*.tar*
|
|
||||||
RUN make -C .pgenv/src/postgresql-*/ clean
|
|
||||||
RUN make -C .pgenv/src/postgresql-*/src/include install
|
|
||||||
|
|
||||||
# create a staging directory with all files we want to copy from our pgenv build
|
|
||||||
# we will copy the contents of the staged folder into the final image at once
|
|
||||||
RUN mkdir .pgenv-staging/
|
|
||||||
RUN cp -r .pgenv/src .pgenv/pgsql-* .pgenv/config .pgenv-staging/
|
|
||||||
RUN rm .pgenv-staging/config/default.conf
|
|
||||||
|
|
||||||
FROM base AS pg16
|
|
||||||
RUN MAKEFLAGS="-j $(nproc)" pgenv build 16.9
|
|
||||||
RUN rm .pgenv/src/*.tar*
|
|
||||||
RUN make -C .pgenv/src/postgresql-*/ clean
|
|
||||||
RUN make -C .pgenv/src/postgresql-*/src/include install
|
|
||||||
|
|
||||||
# create a staging directory with all files we want to copy from our pgenv build
|
|
||||||
# we will copy the contents of the staged folder into the final image at once
|
|
||||||
RUN mkdir .pgenv-staging/
|
|
||||||
RUN cp -r .pgenv/src .pgenv/pgsql-* .pgenv/config .pgenv-staging/
|
|
||||||
RUN rm .pgenv-staging/config/default.conf
|
|
||||||
|
|
||||||
FROM base AS pg17
|
|
||||||
RUN MAKEFLAGS="-j $(nproc)" pgenv build 17.5
|
|
||||||
RUN rm .pgenv/src/*.tar*
|
|
||||||
RUN make -C .pgenv/src/postgresql-*/ clean
|
|
||||||
RUN make -C .pgenv/src/postgresql-*/src/include install
|
|
||||||
|
|
||||||
# create a staging directory with all files we want to copy from our pgenv build
|
|
||||||
# we will copy the contents of the staged folder into the final image at once
|
|
||||||
RUN mkdir .pgenv-staging/
|
|
||||||
RUN cp -r .pgenv/src .pgenv/pgsql-* .pgenv/config .pgenv-staging/
|
|
||||||
RUN rm .pgenv-staging/config/default.conf
|
|
||||||
|
|
||||||
FROM base AS uncrustify-builder
|
|
||||||
|
|
||||||
RUN sudo apt update && sudo apt install -y cmake tree
|
|
||||||
|
|
||||||
WORKDIR /uncrustify
|
|
||||||
RUN curl -L https://github.com/uncrustify/uncrustify/archive/uncrustify-0.68.1.tar.gz | tar xz
|
|
||||||
WORKDIR /uncrustify/uncrustify-uncrustify-0.68.1/
|
|
||||||
RUN mkdir build
|
|
||||||
WORKDIR /uncrustify/uncrustify-uncrustify-0.68.1/build/
|
|
||||||
RUN cmake ..
|
|
||||||
RUN MAKEFLAGS="-j $(nproc)" make -s
|
|
||||||
|
|
||||||
RUN make install DESTDIR=/uncrustify
|
|
||||||
|
|
||||||
# builder for all pipenv's to get them contained in a single layer
|
|
||||||
FROM base AS pipenv
|
|
||||||
|
|
||||||
WORKDIR /workspaces/citus/
|
|
||||||
|
|
||||||
# tools to sync pgenv with vscode
|
|
||||||
COPY --chown=citus:citus .vscode/Pipfile .vscode/Pipfile.lock .devcontainer/.vscode/
|
|
||||||
RUN ( cd .devcontainer/.vscode && pipenv install )
|
|
||||||
|
|
||||||
# environment to run our failure tests
|
|
||||||
COPY --chown=citus:citus src/ src/
|
|
||||||
RUN ( cd src/test/regress && pipenv install )
|
|
||||||
|
|
||||||
# assemble the final container by copying over the artifacts from separately build containers
|
|
||||||
FROM base AS devcontainer
|
|
||||||
|
|
||||||
LABEL org.opencontainers.image.source=https://github.com/citusdata/citus
|
|
||||||
LABEL org.opencontainers.image.description="Development container for the Citus project"
|
|
||||||
LABEL org.opencontainers.image.licenses=AGPL-3.0-only
|
|
||||||
|
|
||||||
RUN yes | sudo unminimize
|
|
||||||
|
|
||||||
# install developer productivity tools
|
|
||||||
RUN sudo apt update \
|
|
||||||
&& sudo apt install -y \
|
|
||||||
autoconf2.69 \
|
|
||||||
bash-completion \
|
|
||||||
fswatch \
|
|
||||||
gdb \
|
|
||||||
htop \
|
|
||||||
libdbd-pg-perl \
|
|
||||||
libdbi-perl \
|
|
||||||
lsof \
|
|
||||||
man \
|
|
||||||
net-tools \
|
|
||||||
psmisc \
|
|
||||||
pspg \
|
|
||||||
tree \
|
|
||||||
vim \
|
|
||||||
&& sudo apt clean
|
|
||||||
|
|
||||||
# Since gdb will run in the context of the root user when debugging citus we will need to both
|
|
||||||
# download the gdbpg.py script as the root user, into their home directory, as well as add .gdbinit
|
|
||||||
# as a file owned by root
|
|
||||||
# This will make that as soon as the debugger attaches to a postgres backend (or frankly any other process)
|
|
||||||
# the gdbpg.py script will be sourced and the developer can direcly use it.
|
|
||||||
RUN sudo curl -o /root/gdbpg.py https://raw.githubusercontent.com/tvesely/gdbpg/6065eee7872457785f830925eac665aa535caf62/gdbpg.py
|
|
||||||
COPY --chown=root:root .gdbinit /root/
|
|
||||||
|
|
||||||
# install developer dependencies in the global environment
|
|
||||||
RUN --mount=type=bind,source=requirements.txt,target=requirements.txt pip install -r requirements.txt
|
|
||||||
|
|
||||||
# for persistent bash history across devcontainers we need to have
|
|
||||||
# a) a directory to store the history in
|
|
||||||
# b) a prompt command to append the history to the file
|
|
||||||
# c) specify the history file to store the history in
|
|
||||||
# b and c are done in the .bashrc to make it persistent across shells only
|
|
||||||
RUN sudo install -d -o citus -g citus /commandhistory \
|
|
||||||
&& echo "export PROMPT_COMMAND='history -a' && export HISTFILE=/commandhistory/.bash_history" >> "/home/citus/.bashrc"
|
|
||||||
|
|
||||||
# install citus-dev
|
|
||||||
RUN git clone --branch develop https://github.com/citusdata/tools.git citus-tools \
|
|
||||||
&& ( cd citus-tools/citus_dev && pipenv install ) \
|
|
||||||
&& mkdir -p ~/.local/bin \
|
|
||||||
&& ln -s /home/citus/citus-tools/citus_dev/citus_dev-pipenv .local/bin/citus_dev \
|
|
||||||
&& sudo make -C citus-tools/uncrustify install bindir=/usr/local/bin pkgsysconfdir=/usr/local/etc/ \
|
|
||||||
&& mkdir -p ~/.local/share/bash-completion/completions/ \
|
|
||||||
&& ln -s ~/citus-tools/citus_dev/bash_completion ~/.local/share/bash-completion/completions/citus_dev
|
|
||||||
|
|
||||||
# TODO some LC_ALL errors, possibly solved by locale-gen
|
|
||||||
RUN git clone https://github.com/so-fancy/diff-so-fancy.git \
|
|
||||||
&& mkdir -p ~/.local/bin \
|
|
||||||
&& ln -s /home/citus/diff-so-fancy/diff-so-fancy .local/bin/
|
|
||||||
|
|
||||||
COPY --link --from=uncrustify-builder /uncrustify/usr/ /usr/
|
|
||||||
|
|
||||||
COPY --link --from=pg15 /home/citus/.pgenv-staging/ /home/citus/.pgenv/
|
|
||||||
COPY --link --from=pg16 /home/citus/.pgenv-staging/ /home/citus/.pgenv/
|
|
||||||
COPY --link --from=pg17 /home/citus/.pgenv-staging/ /home/citus/.pgenv/
|
|
||||||
|
|
||||||
COPY --link --from=pipenv /home/citus/.local/share/virtualenvs/ /home/citus/.local/share/virtualenvs/
|
|
||||||
|
|
||||||
# place to run your cluster with citus_dev
|
|
||||||
VOLUME /data
|
|
||||||
RUN sudo mkdir /data \
|
|
||||||
&& sudo chown citus:citus /data
|
|
||||||
|
|
||||||
COPY --chown=citus:citus .psqlrc .
|
|
||||||
|
|
||||||
# with the copy linking of layers github actions seem to misbehave with the ownership of the
|
|
||||||
# directories leading upto the link, hence a small patch layer to have to right ownerships set
|
|
||||||
RUN sudo chown --from=root:root citus:citus -R ~
|
|
||||||
|
|
||||||
# sets default pg version
|
|
||||||
RUN pgenv switch 17.5
|
|
||||||
|
|
||||||
# make connecting to the coordinator easy
|
|
||||||
ENV PGPORT=9700
|
|
|
@ -1,11 +0,0 @@
|
||||||
|
|
||||||
init: ../.vscode/c_cpp_properties.json ../.vscode/launch.json
|
|
||||||
|
|
||||||
../.vscode:
|
|
||||||
mkdir -p ../.vscode
|
|
||||||
|
|
||||||
../.vscode/launch.json: ../.vscode .vscode/launch.json
|
|
||||||
cp .vscode/launch.json ../.vscode/launch.json
|
|
||||||
|
|
||||||
../.vscode/c_cpp_properties.json: ../.vscode
|
|
||||||
./.vscode/generate_c_cpp_properties-json.py ../.vscode/c_cpp_properties.json
|
|
|
@ -1,37 +0,0 @@
|
||||||
{
|
|
||||||
"image": "ghcr.io/citusdata/citus-devcontainer:main",
|
|
||||||
"runArgs": [
|
|
||||||
"--cap-add=SYS_PTRACE",
|
|
||||||
"--ulimit=core=-1",
|
|
||||||
],
|
|
||||||
"forwardPorts": [
|
|
||||||
9700
|
|
||||||
],
|
|
||||||
"customizations": {
|
|
||||||
"vscode": {
|
|
||||||
"extensions": [
|
|
||||||
"eamodio.gitlens",
|
|
||||||
"GitHub.copilot-chat",
|
|
||||||
"GitHub.copilot",
|
|
||||||
"github.vscode-github-actions",
|
|
||||||
"github.vscode-pull-request-github",
|
|
||||||
"ms-vscode.cpptools-extension-pack",
|
|
||||||
"ms-vsliveshare.vsliveshare",
|
|
||||||
"rioj7.command-variable",
|
|
||||||
],
|
|
||||||
"settings": {
|
|
||||||
"files.exclude": {
|
|
||||||
"**/*.o": true,
|
|
||||||
"**/.deps/": true,
|
|
||||||
}
|
|
||||||
},
|
|
||||||
}
|
|
||||||
},
|
|
||||||
"mounts": [
|
|
||||||
"type=volume,target=/data",
|
|
||||||
"source=citus-bashhistory,target=/commandhistory,type=volume",
|
|
||||||
],
|
|
||||||
"updateContentCommand": "./configure",
|
|
||||||
"postCreateCommand": "make -C .devcontainer/",
|
|
||||||
}
|
|
||||||
|
|
|
@ -1,15 +0,0 @@
|
||||||
PGENV_MAKE_OPTIONS=(-s)
|
|
||||||
|
|
||||||
PGENV_CONFIGURE_OPTIONS=(
|
|
||||||
--enable-debug
|
|
||||||
--enable-depend
|
|
||||||
--enable-cassert
|
|
||||||
--enable-tap-tests
|
|
||||||
'CFLAGS=-ggdb -Og -g3 -fno-omit-frame-pointer -DUSE_VALGRIND'
|
|
||||||
--with-openssl
|
|
||||||
--with-libxml
|
|
||||||
--with-libxslt
|
|
||||||
--with-uuid=e2fs
|
|
||||||
--with-icu
|
|
||||||
--with-lz4
|
|
||||||
)
|
|
|
@ -1,9 +0,0 @@
|
||||||
black==23.11.0
|
|
||||||
click==8.1.7
|
|
||||||
isort==5.12.0
|
|
||||||
mypy-extensions==1.0.0
|
|
||||||
packaging==23.2
|
|
||||||
pathspec==0.11.2
|
|
||||||
platformdirs==4.0.0
|
|
||||||
tomli==2.0.1
|
|
||||||
typing_extensions==4.8.0
|
|
|
@ -1,28 +0,0 @@
|
||||||
[[source]]
|
|
||||||
name = "pypi"
|
|
||||||
url = "https://pypi.python.org/simple"
|
|
||||||
verify_ssl = true
|
|
||||||
|
|
||||||
[packages]
|
|
||||||
mitmproxy = {editable = true, ref = "main", git = "https://github.com/citusdata/mitmproxy.git"}
|
|
||||||
construct = "*"
|
|
||||||
docopt = "==0.6.2"
|
|
||||||
cryptography = ">=41.0.4"
|
|
||||||
pytest = "*"
|
|
||||||
psycopg = "*"
|
|
||||||
filelock = "*"
|
|
||||||
pytest-asyncio = "*"
|
|
||||||
pytest-timeout = "*"
|
|
||||||
pytest-xdist = "*"
|
|
||||||
pytest-repeat = "*"
|
|
||||||
pyyaml = "*"
|
|
||||||
werkzeug = "==2.3.7"
|
|
||||||
|
|
||||||
[dev-packages]
|
|
||||||
black = "*"
|
|
||||||
isort = "*"
|
|
||||||
flake8 = "*"
|
|
||||||
flake8-bugbear = "*"
|
|
||||||
|
|
||||||
[requires]
|
|
||||||
python_version = "3.9"
|
|
|
@ -1,28 +0,0 @@
|
||||||
# top-most EditorConfig file
|
|
||||||
root = true
|
|
||||||
|
|
||||||
# rules for all files
|
|
||||||
# we use tabs with indent size 4
|
|
||||||
[*]
|
|
||||||
indent_style = tab
|
|
||||||
indent_size = 4
|
|
||||||
tab_width = 4
|
|
||||||
end_of_line = lf
|
|
||||||
insert_final_newline = true
|
|
||||||
charset = utf-8
|
|
||||||
trim_trailing_whitespace = true
|
|
||||||
|
|
||||||
# Don't change test output files, pngs or test data files
|
|
||||||
[*.{out,png,data}]
|
|
||||||
insert_final_newline = unset
|
|
||||||
trim_trailing_whitespace = unset
|
|
||||||
|
|
||||||
[*.{sql,sh,py,toml}]
|
|
||||||
indent_style = space
|
|
||||||
indent_size = 4
|
|
||||||
tab_width = 4
|
|
||||||
|
|
||||||
[*.yml]
|
|
||||||
indent_style = space
|
|
||||||
indent_size = 2
|
|
||||||
tab_width = 2
|
|
7
.flake8
|
@ -1,7 +0,0 @@
|
||||||
[flake8]
|
|
||||||
# E203 is ignored for black
|
|
||||||
extend-ignore = E203
|
|
||||||
# black will truncate to 88 characters usually, but long string literals it
|
|
||||||
# might keep. That's fine in most cases unless it gets really excessive.
|
|
||||||
max-line-length = 150
|
|
||||||
exclude = .git,__pycache__,vendor,tmp_*
|
|
|
@ -16,6 +16,7 @@ README.* conflict-marker-size=32
|
||||||
|
|
||||||
# Test output files that contain extra whitespace
|
# Test output files that contain extra whitespace
|
||||||
*.out -whitespace
|
*.out -whitespace
|
||||||
|
src/test/regress/output/*.source -whitespace
|
||||||
|
|
||||||
# These files are maintained or generated elsewhere. We take them as is.
|
# These files are maintained or generated elsewhere. We take them as is.
|
||||||
configure -whitespace
|
configure -whitespace
|
||||||
|
@ -25,13 +26,17 @@ configure -whitespace
|
||||||
|
|
||||||
# except these exceptions...
|
# except these exceptions...
|
||||||
src/backend/distributed/utils/citus_outfuncs.c -citus-style
|
src/backend/distributed/utils/citus_outfuncs.c -citus-style
|
||||||
src/backend/distributed/deparser/ruleutils_15.c -citus-style
|
src/backend/distributed/utils/citus_read.c -citus-style
|
||||||
src/backend/distributed/deparser/ruleutils_16.c -citus-style
|
src/backend/distributed/utils/citus_readfuncs_94.c -citus-style
|
||||||
src/backend/distributed/deparser/ruleutils_17.c -citus-style
|
src/backend/distributed/utils/citus_readfuncs_95.c -citus-style
|
||||||
src/backend/distributed/commands/index_pg_source.c -citus-style
|
src/backend/distributed/utils/ruleutils_94.c -citus-style
|
||||||
|
src/backend/distributed/utils/ruleutils_95.c -citus-style
|
||||||
src/include/distributed/citus_nodes.h -citus-style
|
src/include/distributed/citus_nodes.h -citus-style
|
||||||
/vendor/** -citus-style
|
src/include/dumputils.h -citus-style
|
||||||
|
|
||||||
# Hide diff on github by default for copied udfs
|
# all csql files use PostgreSQL style...
|
||||||
src/backend/distributed/sql/udfs/*/[123456789]*.sql linguist-generated=true
|
src/bin/csql/*.[ch] -citus-style
|
||||||
|
|
||||||
|
# except these exceptions
|
||||||
|
src/bin/csql/copy_options.c citus-style
|
||||||
|
src/bin/csql/stage.[ch] citus-style
|
||||||
|
|
|
@ -1,23 +0,0 @@
|
||||||
name: 'Parallelization matrix'
|
|
||||||
inputs:
|
|
||||||
count:
|
|
||||||
required: false
|
|
||||||
default: 32
|
|
||||||
outputs:
|
|
||||||
json:
|
|
||||||
value: ${{ steps.generate_matrix.outputs.json }}
|
|
||||||
runs:
|
|
||||||
using: "composite"
|
|
||||||
steps:
|
|
||||||
- name: Generate parallelization matrix
|
|
||||||
id: generate_matrix
|
|
||||||
shell: bash
|
|
||||||
run: |-
|
|
||||||
json_array="{\"include\": ["
|
|
||||||
for ((i = 1; i <= ${{ inputs.count }}; i++)); do
|
|
||||||
json_array+="{\"id\":\"$i\"},"
|
|
||||||
done
|
|
||||||
json_array=${json_array%,}
|
|
||||||
json_array+=" ]}"
|
|
||||||
echo "json=$json_array" >> "$GITHUB_OUTPUT"
|
|
||||||
echo "json=$json_array"
|
|
|
@ -1,38 +0,0 @@
|
||||||
name: save_logs_and_results
|
|
||||||
inputs:
|
|
||||||
folder:
|
|
||||||
required: false
|
|
||||||
default: "log"
|
|
||||||
runs:
|
|
||||||
using: composite
|
|
||||||
steps:
|
|
||||||
- uses: actions/upload-artifact@v4.6.0
|
|
||||||
name: Upload logs
|
|
||||||
with:
|
|
||||||
name: ${{ inputs.folder }}
|
|
||||||
if-no-files-found: ignore
|
|
||||||
path: |
|
|
||||||
src/test/**/proxy.output
|
|
||||||
src/test/**/results/
|
|
||||||
src/test/**/tmp_check/master/log
|
|
||||||
src/test/**/tmp_check/worker.57638/log
|
|
||||||
src/test/**/tmp_check/worker.57637/log
|
|
||||||
src/test/**/*.diffs
|
|
||||||
src/test/**/out/ddls.sql
|
|
||||||
src/test/**/out/queries.sql
|
|
||||||
src/test/**/logfile_*
|
|
||||||
/tmp/pg_upgrade_newData_logs
|
|
||||||
- name: Publish regression.diffs
|
|
||||||
run: |-
|
|
||||||
diffs="$(find src/test/regress -name "*.diffs" -exec cat {} \;)"
|
|
||||||
if ! [ -z "$diffs" ]; then
|
|
||||||
echo '```diff' >> $GITHUB_STEP_SUMMARY
|
|
||||||
echo -E "$diffs" >> $GITHUB_STEP_SUMMARY
|
|
||||||
echo '```' >> $GITHUB_STEP_SUMMARY
|
|
||||||
echo -E $diffs
|
|
||||||
fi
|
|
||||||
shell: bash
|
|
||||||
- name: Print stack traces
|
|
||||||
run: "./ci/print_stack_trace.sh"
|
|
||||||
if: failure()
|
|
||||||
shell: bash
|
|
|
@ -1,35 +0,0 @@
|
||||||
name: setup_extension
|
|
||||||
inputs:
|
|
||||||
pg_major:
|
|
||||||
required: false
|
|
||||||
skip_installation:
|
|
||||||
required: false
|
|
||||||
default: false
|
|
||||||
type: boolean
|
|
||||||
runs:
|
|
||||||
using: composite
|
|
||||||
steps:
|
|
||||||
- name: Expose $PG_MAJOR to Github Env
|
|
||||||
run: |-
|
|
||||||
if [ -z "${{ inputs.pg_major }}" ]; then
|
|
||||||
echo "PG_MAJOR=${PG_MAJOR}" >> $GITHUB_ENV
|
|
||||||
else
|
|
||||||
echo "PG_MAJOR=${{ inputs.pg_major }}" >> $GITHUB_ENV
|
|
||||||
fi
|
|
||||||
shell: bash
|
|
||||||
- uses: actions/download-artifact@v4.1.8
|
|
||||||
with:
|
|
||||||
name: build-${{ env.PG_MAJOR }}
|
|
||||||
- name: Install Extension
|
|
||||||
if: ${{ inputs.skip_installation == 'false' }}
|
|
||||||
run: tar xfv "install-$PG_MAJOR.tar" --directory /
|
|
||||||
shell: bash
|
|
||||||
- name: Configure
|
|
||||||
run: |-
|
|
||||||
chown -R circleci .
|
|
||||||
git config --global --add safe.directory ${GITHUB_WORKSPACE}
|
|
||||||
gosu circleci ./configure --without-pg-version-check
|
|
||||||
shell: bash
|
|
||||||
- name: Enable core dumps
|
|
||||||
run: ulimit -c unlimited
|
|
||||||
shell: bash
|
|
|
@ -1,27 +0,0 @@
|
||||||
name: coverage
|
|
||||||
inputs:
|
|
||||||
flags:
|
|
||||||
required: false
|
|
||||||
codecov_token:
|
|
||||||
required: true
|
|
||||||
runs:
|
|
||||||
using: composite
|
|
||||||
steps:
|
|
||||||
- uses: codecov/codecov-action@v3
|
|
||||||
with:
|
|
||||||
flags: ${{ inputs.flags }}
|
|
||||||
token: ${{ inputs.codecov_token }}
|
|
||||||
verbose: true
|
|
||||||
gcov: true
|
|
||||||
- name: Create codeclimate coverage
|
|
||||||
run: |-
|
|
||||||
lcov --directory . --capture --output-file lcov.info
|
|
||||||
lcov --remove lcov.info -o lcov.info '/usr/*'
|
|
||||||
sed "s=^SF:$PWD/=SF:=g" -i lcov.info # relative pats are required by codeclimate
|
|
||||||
mkdir -p /tmp/codeclimate
|
|
||||||
cc-test-reporter format-coverage -t lcov -o /tmp/codeclimate/${{ inputs.flags }}.json lcov.info
|
|
||||||
shell: bash
|
|
||||||
- uses: actions/upload-artifact@v4.6.0
|
|
||||||
with:
|
|
||||||
path: "/tmp/codeclimate/*.json"
|
|
||||||
name: codeclimate-${{ inputs.flags }}
|
|
|
@ -1,3 +0,0 @@
|
||||||
base:
|
|
||||||
- ".* warning: ignoring old recipe for target [`']check'"
|
|
||||||
- ".* warning: overriding recipe for target [`']check'"
|
|
|
@ -1,51 +0,0 @@
|
||||||
#!/bin/bash
|
|
||||||
|
|
||||||
set -ex
|
|
||||||
|
|
||||||
# Function to get the OS version
|
|
||||||
get_rpm_os_version() {
|
|
||||||
if [[ -f /etc/centos-release ]]; then
|
|
||||||
cat /etc/centos-release | awk '{print $4}'
|
|
||||||
elif [[ -f /etc/oracle-release ]]; then
|
|
||||||
cat /etc/oracle-release | awk '{print $5}'
|
|
||||||
else
|
|
||||||
echo "Unknown"
|
|
||||||
fi
|
|
||||||
}
|
|
||||||
|
|
||||||
package_type=${1}
|
|
||||||
|
|
||||||
# Since $HOME is set in GH_Actions as /github/home, pyenv fails to create virtualenvs.
|
|
||||||
# For this script, we set $HOME to /root and then set it back to /github/home.
|
|
||||||
GITHUB_HOME="${HOME}"
|
|
||||||
export HOME="/root"
|
|
||||||
|
|
||||||
eval "$(pyenv init -)"
|
|
||||||
pyenv versions
|
|
||||||
pyenv virtualenv ${PACKAGING_PYTHON_VERSION} packaging_env
|
|
||||||
pyenv activate packaging_env
|
|
||||||
|
|
||||||
git clone -b v0.8.27 --depth=1 https://github.com/citusdata/tools.git tools
|
|
||||||
python3 -m pip install -r tools/packaging_automation/requirements.txt
|
|
||||||
|
|
||||||
|
|
||||||
echo "Package type: ${package_type}"
|
|
||||||
echo "OS version: $(get_rpm_os_version)"
|
|
||||||
|
|
||||||
# For RHEL 7, we need to install urllib3<2 due to below execution error
|
|
||||||
# ImportError: urllib3 v2.0 only supports OpenSSL 1.1.1+, currently the 'ssl'
|
|
||||||
# module is compiled with 'OpenSSL 1.0.2k-fips 26 Jan 2017'.
|
|
||||||
# See: https://github.com/urllib3/urllib3/issues/2168
|
|
||||||
if [[ ${package_type} == "rpm" && $(get_rpm_os_version) == 7* ]]; then
|
|
||||||
python3 -m pip uninstall -y urllib3
|
|
||||||
python3 -m pip install 'urllib3<2'
|
|
||||||
fi
|
|
||||||
|
|
||||||
python3 -m tools.packaging_automation.validate_build_output --output_file output.log \
|
|
||||||
--ignore_file .github/packaging/packaging_ignore.yml \
|
|
||||||
--package_type ${package_type}
|
|
||||||
pyenv deactivate
|
|
||||||
# Set $HOME back to /github/home
|
|
||||||
export HOME=${GITHUB_HOME}
|
|
||||||
|
|
||||||
# Print the output to the console
|
|
|
@ -1 +0,0 @@
|
||||||
DESCRIPTION: PR description that will go into the change log, up to 78 characters
|
|
|
@ -1,546 +0,0 @@
|
||||||
name: Build & Test
|
|
||||||
run-name: Build & Test - ${{ github.event.pull_request.title || github.ref_name }}
|
|
||||||
concurrency:
|
|
||||||
group: ${{ github.workflow }}-${{ github.event.pull_request.number || github.ref }}
|
|
||||||
cancel-in-progress: true
|
|
||||||
on:
|
|
||||||
workflow_dispatch:
|
|
||||||
inputs:
|
|
||||||
skip_test_flakyness:
|
|
||||||
required: false
|
|
||||||
default: false
|
|
||||||
type: boolean
|
|
||||||
push:
|
|
||||||
branches:
|
|
||||||
- "main"
|
|
||||||
- "release-*"
|
|
||||||
pull_request:
|
|
||||||
types: [opened, reopened,synchronize]
|
|
||||||
merge_group:
|
|
||||||
jobs:
|
|
||||||
# Since GHA does not interpolate env varibles in matrix context, we need to
|
|
||||||
# define them in a separate job and use them in other jobs.
|
|
||||||
params:
|
|
||||||
runs-on: ubuntu-latest
|
|
||||||
name: Initialize parameters
|
|
||||||
outputs:
|
|
||||||
build_image_name: "ghcr.io/citusdata/extbuilder"
|
|
||||||
test_image_name: "ghcr.io/citusdata/exttester"
|
|
||||||
citusupgrade_image_name: "ghcr.io/citusdata/citusupgradetester"
|
|
||||||
fail_test_image_name: "ghcr.io/citusdata/failtester"
|
|
||||||
pgupgrade_image_name: "ghcr.io/citusdata/pgupgradetester"
|
|
||||||
style_checker_image_name: "ghcr.io/citusdata/stylechecker"
|
|
||||||
style_checker_tools_version: "0.8.18"
|
|
||||||
sql_snapshot_pg_version: "17.5"
|
|
||||||
image_suffix: "-dev-d28f316"
|
|
||||||
pg15_version: '{ "major": "15", "full": "15.13" }'
|
|
||||||
pg16_version: '{ "major": "16", "full": "16.9" }'
|
|
||||||
pg17_version: '{ "major": "17", "full": "17.5" }'
|
|
||||||
upgrade_pg_versions: "15.13-16.9-17.5"
|
|
||||||
steps:
|
|
||||||
# Since GHA jobs need at least one step we use a noop step here.
|
|
||||||
- name: Set up parameters
|
|
||||||
run: echo 'noop'
|
|
||||||
check-sql-snapshots:
|
|
||||||
needs: params
|
|
||||||
runs-on: ubuntu-latest
|
|
||||||
container:
|
|
||||||
image: ${{ needs.params.outputs.build_image_name }}:${{ needs.params.outputs.sql_snapshot_pg_version }}${{ needs.params.outputs.image_suffix }}
|
|
||||||
options: --user root
|
|
||||||
steps:
|
|
||||||
- uses: actions/checkout@v4
|
|
||||||
- name: Check Snapshots
|
|
||||||
run: |
|
|
||||||
git config --global --add safe.directory ${GITHUB_WORKSPACE}
|
|
||||||
ci/check_sql_snapshots.sh
|
|
||||||
check-style:
|
|
||||||
needs: params
|
|
||||||
runs-on: ubuntu-latest
|
|
||||||
container:
|
|
||||||
image: ${{ needs.params.outputs.style_checker_image_name }}:${{ needs.params.outputs.style_checker_tools_version }}${{ needs.params.outputs.image_suffix }}
|
|
||||||
steps:
|
|
||||||
- name: Check Snapshots
|
|
||||||
run: |
|
|
||||||
git config --global --add safe.directory ${GITHUB_WORKSPACE}
|
|
||||||
- uses: actions/checkout@v4
|
|
||||||
with:
|
|
||||||
fetch-depth: 0
|
|
||||||
- name: Check C Style
|
|
||||||
run: citus_indent --check
|
|
||||||
- name: Check Python style
|
|
||||||
run: black --check .
|
|
||||||
- name: Check Python import order
|
|
||||||
run: isort --check .
|
|
||||||
- name: Check Python lints
|
|
||||||
run: flake8 .
|
|
||||||
- name: Fix whitespace
|
|
||||||
run: ci/editorconfig.sh && git diff --exit-code
|
|
||||||
- name: Remove useless declarations
|
|
||||||
run: ci/remove_useless_declarations.sh && git diff --cached --exit-code
|
|
||||||
- name: Sort and group includes
|
|
||||||
run: ci/sort_and_group_includes.sh && git diff --exit-code
|
|
||||||
- name: Normalize test output
|
|
||||||
run: ci/normalize_expected.sh && git diff --exit-code
|
|
||||||
- name: Check for C-style comments in migration files
|
|
||||||
run: ci/disallow_c_comments_in_migrations.sh && git diff --exit-code
|
|
||||||
- name: 'Check for comment--cached ns that start with # character in spec files'
|
|
||||||
run: ci/disallow_hash_comments_in_spec_files.sh && git diff --exit-code
|
|
||||||
- name: Check for gitignore entries .for source files
|
|
||||||
run: ci/fix_gitignore.sh && git diff --exit-code
|
|
||||||
- name: Check for lengths of changelog entries
|
|
||||||
run: ci/disallow_long_changelog_entries.sh
|
|
||||||
- name: Check for banned C API usage
|
|
||||||
run: ci/banned.h.sh
|
|
||||||
- name: Check for tests missing in schedules
|
|
||||||
run: ci/check_all_tests_are_run.sh
|
|
||||||
- name: Check if all CI scripts are actually run
|
|
||||||
run: ci/check_all_ci_scripts_are_run.sh
|
|
||||||
- name: Check if all GUCs are sorted alphabetically
|
|
||||||
run: ci/check_gucs_are_alphabetically_sorted.sh
|
|
||||||
- name: Check for missing downgrade scripts
|
|
||||||
run: ci/check_migration_files.sh
|
|
||||||
build:
|
|
||||||
needs: params
|
|
||||||
name: Build for PG${{ fromJson(matrix.pg_version).major }}
|
|
||||||
strategy:
|
|
||||||
fail-fast: false
|
|
||||||
matrix:
|
|
||||||
image_name:
|
|
||||||
- ${{ needs.params.outputs.build_image_name }}
|
|
||||||
image_suffix:
|
|
||||||
- ${{ needs.params.outputs.image_suffix}}
|
|
||||||
pg_version:
|
|
||||||
- ${{ needs.params.outputs.pg15_version }}
|
|
||||||
- ${{ needs.params.outputs.pg16_version }}
|
|
||||||
- ${{ needs.params.outputs.pg17_version }}
|
|
||||||
runs-on: ubuntu-latest
|
|
||||||
container:
|
|
||||||
image: "${{ matrix.image_name }}:${{ fromJson(matrix.pg_version).full }}${{ matrix.image_suffix }}"
|
|
||||||
options: --user root
|
|
||||||
steps:
|
|
||||||
- uses: actions/checkout@v4
|
|
||||||
- name: Expose $PG_MAJOR to Github Env
|
|
||||||
run: echo "PG_MAJOR=${PG_MAJOR}" >> $GITHUB_ENV
|
|
||||||
shell: bash
|
|
||||||
- name: Build
|
|
||||||
run: "./ci/build-citus.sh"
|
|
||||||
shell: bash
|
|
||||||
- uses: actions/upload-artifact@v4.6.0
|
|
||||||
with:
|
|
||||||
name: build-${{ env.PG_MAJOR }}
|
|
||||||
path: |-
|
|
||||||
./build-${{ env.PG_MAJOR }}/*
|
|
||||||
./install-${{ env.PG_MAJOR }}.tar
|
|
||||||
test-citus:
|
|
||||||
name: PG${{ fromJson(matrix.pg_version).major }} - ${{ matrix.make }}
|
|
||||||
strategy:
|
|
||||||
fail-fast: false
|
|
||||||
matrix:
|
|
||||||
suite:
|
|
||||||
- regress
|
|
||||||
image_name:
|
|
||||||
- ${{ needs.params.outputs.test_image_name }}
|
|
||||||
pg_version:
|
|
||||||
- ${{ needs.params.outputs.pg15_version }}
|
|
||||||
- ${{ needs.params.outputs.pg16_version }}
|
|
||||||
- ${{ needs.params.outputs.pg17_version }}
|
|
||||||
make:
|
|
||||||
- check-split
|
|
||||||
- check-multi
|
|
||||||
- check-multi-1
|
|
||||||
- check-multi-mx
|
|
||||||
- check-vanilla
|
|
||||||
- check-isolation
|
|
||||||
- check-operations
|
|
||||||
- check-follower-cluster
|
|
||||||
- check-columnar
|
|
||||||
- check-columnar-isolation
|
|
||||||
- check-enterprise
|
|
||||||
- check-enterprise-isolation
|
|
||||||
- check-enterprise-isolation-logicalrep-1
|
|
||||||
- check-enterprise-isolation-logicalrep-2
|
|
||||||
- check-enterprise-isolation-logicalrep-3
|
|
||||||
include:
|
|
||||||
- make: check-failure
|
|
||||||
pg_version: ${{ needs.params.outputs.pg15_version }}
|
|
||||||
suite: regress
|
|
||||||
image_name: ${{ needs.params.outputs.fail_test_image_name }}
|
|
||||||
- make: check-failure
|
|
||||||
pg_version: ${{ needs.params.outputs.pg16_version }}
|
|
||||||
suite: regress
|
|
||||||
image_name: ${{ needs.params.outputs.fail_test_image_name }}
|
|
||||||
- make: check-failure
|
|
||||||
pg_version: ${{ needs.params.outputs.pg17_version }}
|
|
||||||
suite: regress
|
|
||||||
image_name: ${{ needs.params.outputs.fail_test_image_name }}
|
|
||||||
- make: check-enterprise-failure
|
|
||||||
pg_version: ${{ needs.params.outputs.pg15_version }}
|
|
||||||
suite: regress
|
|
||||||
image_name: ${{ needs.params.outputs.fail_test_image_name }}
|
|
||||||
- make: check-enterprise-failure
|
|
||||||
pg_version: ${{ needs.params.outputs.pg16_version }}
|
|
||||||
suite: regress
|
|
||||||
image_name: ${{ needs.params.outputs.fail_test_image_name }}
|
|
||||||
- make: check-enterprise-failure
|
|
||||||
pg_version: ${{ needs.params.outputs.pg17_version }}
|
|
||||||
suite: regress
|
|
||||||
image_name: ${{ needs.params.outputs.fail_test_image_name }}
|
|
||||||
- make: check-pytest
|
|
||||||
pg_version: ${{ needs.params.outputs.pg15_version }}
|
|
||||||
suite: regress
|
|
||||||
image_name: ${{ needs.params.outputs.fail_test_image_name }}
|
|
||||||
- make: check-pytest
|
|
||||||
pg_version: ${{ needs.params.outputs.pg16_version }}
|
|
||||||
suite: regress
|
|
||||||
image_name: ${{ needs.params.outputs.fail_test_image_name }}
|
|
||||||
- make: check-pytest
|
|
||||||
pg_version: ${{ needs.params.outputs.pg17_version }}
|
|
||||||
suite: regress
|
|
||||||
image_name: ${{ needs.params.outputs.fail_test_image_name }}
|
|
||||||
- make: installcheck
|
|
||||||
suite: cdc
|
|
||||||
image_name: ${{ needs.params.outputs.test_image_name }}
|
|
||||||
pg_version: ${{ needs.params.outputs.pg15_version }}
|
|
||||||
- make: installcheck
|
|
||||||
suite: cdc
|
|
||||||
image_name: ${{ needs.params.outputs.test_image_name }}
|
|
||||||
pg_version: ${{ needs.params.outputs.pg16_version }}
|
|
||||||
- make: installcheck
|
|
||||||
suite: cdc
|
|
||||||
image_name: ${{ needs.params.outputs.test_image_name }}
|
|
||||||
pg_version: ${{ needs.params.outputs.pg17_version }}
|
|
||||||
- make: check-query-generator
|
|
||||||
pg_version: ${{ needs.params.outputs.pg15_version }}
|
|
||||||
suite: regress
|
|
||||||
image_name: ${{ needs.params.outputs.fail_test_image_name }}
|
|
||||||
- make: check-query-generator
|
|
||||||
pg_version: ${{ needs.params.outputs.pg16_version }}
|
|
||||||
suite: regress
|
|
||||||
image_name: ${{ needs.params.outputs.fail_test_image_name }}
|
|
||||||
- make: check-query-generator
|
|
||||||
pg_version: ${{ needs.params.outputs.pg17_version }}
|
|
||||||
suite: regress
|
|
||||||
image_name: ${{ needs.params.outputs.fail_test_image_name }}
|
|
||||||
runs-on: ubuntu-latest
|
|
||||||
container:
|
|
||||||
image: "${{ matrix.image_name }}:${{ fromJson(matrix.pg_version).full }}${{ needs.params.outputs.image_suffix }}"
|
|
||||||
options: --user root --dns=8.8.8.8
|
|
||||||
# Due to Github creates a default network for each job, we need to use
|
|
||||||
# --dns= to have similar DNS settings as our other CI systems or local
|
|
||||||
# machines. Otherwise, we may see different results.
|
|
||||||
needs:
|
|
||||||
- params
|
|
||||||
- build
|
|
||||||
steps:
|
|
||||||
- uses: actions/checkout@v4
|
|
||||||
- uses: "./.github/actions/setup_extension"
|
|
||||||
- name: Run Test
|
|
||||||
run: gosu circleci make -C src/test/${{ matrix.suite }} ${{ matrix.make }}
|
|
||||||
timeout-minutes: 20
|
|
||||||
- uses: "./.github/actions/save_logs_and_results"
|
|
||||||
if: always()
|
|
||||||
with:
|
|
||||||
folder: ${{ fromJson(matrix.pg_version).major }}_${{ matrix.make }}
|
|
||||||
- uses: "./.github/actions/upload_coverage"
|
|
||||||
if: always()
|
|
||||||
with:
|
|
||||||
flags: ${{ env.PG_MAJOR }}_${{ matrix.suite }}_${{ matrix.make }}
|
|
||||||
codecov_token: ${{ secrets.CODECOV_TOKEN }}
|
|
||||||
test-arbitrary-configs:
|
|
||||||
name: PG${{ fromJson(matrix.pg_version).major }} - check-arbitrary-configs-${{ matrix.parallel }}
|
|
||||||
runs-on: ["self-hosted", "1ES.Pool=1es-gha-citusdata-pool"]
|
|
||||||
container:
|
|
||||||
image: "${{ matrix.image_name }}:${{ fromJson(matrix.pg_version).full }}${{ needs.params.outputs.image_suffix }}"
|
|
||||||
options: --user root
|
|
||||||
needs:
|
|
||||||
- params
|
|
||||||
- build
|
|
||||||
strategy:
|
|
||||||
fail-fast: false
|
|
||||||
matrix:
|
|
||||||
image_name:
|
|
||||||
- ${{ needs.params.outputs.fail_test_image_name }}
|
|
||||||
pg_version:
|
|
||||||
- ${{ needs.params.outputs.pg15_version }}
|
|
||||||
- ${{ needs.params.outputs.pg16_version }}
|
|
||||||
- ${{ needs.params.outputs.pg17_version }}
|
|
||||||
parallel: [0,1,2,3,4,5] # workaround for running 6 parallel jobs
|
|
||||||
steps:
|
|
||||||
- uses: actions/checkout@v4
|
|
||||||
- uses: "./.github/actions/setup_extension"
|
|
||||||
- name: Test arbitrary configs
|
|
||||||
run: |-
|
|
||||||
# we use parallel jobs to split the tests into 6 parts and run them in parallel
|
|
||||||
# the script below extracts the tests for the current job
|
|
||||||
N=6 # Total number of jobs (see matrix.parallel)
|
|
||||||
X=${{ matrix.parallel }} # Current job number
|
|
||||||
TESTS=$(src/test/regress/citus_tests/print_test_names.py |
|
|
||||||
tr '\n' ',' | awk -v N="$N" -v X="$X" -F, '{
|
|
||||||
split("", parts)
|
|
||||||
for (i = 1; i <= NF; i++) {
|
|
||||||
parts[i % N] = parts[i % N] $i ","
|
|
||||||
}
|
|
||||||
print substr(parts[X], 1, length(parts[X])-1)
|
|
||||||
}')
|
|
||||||
echo $TESTS
|
|
||||||
gosu circleci \
|
|
||||||
make -C src/test/regress \
|
|
||||||
check-arbitrary-configs parallel=4 CONFIGS=$TESTS
|
|
||||||
- uses: "./.github/actions/save_logs_and_results"
|
|
||||||
if: always()
|
|
||||||
with:
|
|
||||||
folder: ${{ env.PG_MAJOR }}_arbitrary_configs_${{ matrix.parallel }}
|
|
||||||
- uses: "./.github/actions/upload_coverage"
|
|
||||||
if: always()
|
|
||||||
with:
|
|
||||||
flags: ${{ env.PG_MAJOR }}_arbitrary_configs_${{ matrix.parallel }}
|
|
||||||
codecov_token: ${{ secrets.CODECOV_TOKEN }}
|
|
||||||
test-pg-upgrade:
|
|
||||||
name: PG${{ matrix.old_pg_major }}-PG${{ matrix.new_pg_major }} - check-pg-upgrade
|
|
||||||
runs-on: ubuntu-latest
|
|
||||||
container:
|
|
||||||
image: "${{ needs.params.outputs.pgupgrade_image_name }}:${{ needs.params.outputs.upgrade_pg_versions }}${{ needs.params.outputs.image_suffix }}"
|
|
||||||
options: --user root
|
|
||||||
needs:
|
|
||||||
- params
|
|
||||||
- build
|
|
||||||
strategy:
|
|
||||||
fail-fast: false
|
|
||||||
matrix:
|
|
||||||
include:
|
|
||||||
- old_pg_major: 15
|
|
||||||
new_pg_major: 16
|
|
||||||
- old_pg_major: 16
|
|
||||||
new_pg_major: 17
|
|
||||||
- old_pg_major: 15
|
|
||||||
new_pg_major: 17
|
|
||||||
env:
|
|
||||||
old_pg_major: ${{ matrix.old_pg_major }}
|
|
||||||
new_pg_major: ${{ matrix.new_pg_major }}
|
|
||||||
steps:
|
|
||||||
- uses: actions/checkout@v4
|
|
||||||
- uses: "./.github/actions/setup_extension"
|
|
||||||
with:
|
|
||||||
pg_major: "${{ env.old_pg_major }}"
|
|
||||||
- uses: "./.github/actions/setup_extension"
|
|
||||||
with:
|
|
||||||
pg_major: "${{ env.new_pg_major }}"
|
|
||||||
- name: Install and test postgres upgrade
|
|
||||||
run: |-
|
|
||||||
gosu circleci \
|
|
||||||
make -C src/test/regress \
|
|
||||||
check-pg-upgrade \
|
|
||||||
old-bindir=/usr/lib/postgresql/${{ env.old_pg_major }}/bin \
|
|
||||||
new-bindir=/usr/lib/postgresql/${{ env.new_pg_major }}/bin
|
|
||||||
- name: Copy pg_upgrade logs for newData dir
|
|
||||||
run: |-
|
|
||||||
mkdir -p /tmp/pg_upgrade_newData_logs
|
|
||||||
if ls src/test/regress/tmp_upgrade/newData/*.log 1> /dev/null 2>&1; then
|
|
||||||
cp src/test/regress/tmp_upgrade/newData/*.log /tmp/pg_upgrade_newData_logs
|
|
||||||
fi
|
|
||||||
if: failure()
|
|
||||||
- uses: "./.github/actions/save_logs_and_results"
|
|
||||||
if: always()
|
|
||||||
with:
|
|
||||||
folder: ${{ env.old_pg_major }}_${{ env.new_pg_major }}_upgrade
|
|
||||||
- uses: "./.github/actions/upload_coverage"
|
|
||||||
if: always()
|
|
||||||
with:
|
|
||||||
flags: ${{ env.old_pg_major }}_${{ env.new_pg_major }}_upgrade
|
|
||||||
codecov_token: ${{ secrets.CODECOV_TOKEN }}
|
|
||||||
test-citus-upgrade:
|
|
||||||
name: PG${{ fromJson(needs.params.outputs.pg15_version).major }} - check-citus-upgrade
|
|
||||||
runs-on: ubuntu-latest
|
|
||||||
container:
|
|
||||||
image: "${{ needs.params.outputs.citusupgrade_image_name }}:${{ fromJson(needs.params.outputs.pg15_version).full }}${{ needs.params.outputs.image_suffix }}"
|
|
||||||
options: --user root
|
|
||||||
needs:
|
|
||||||
- params
|
|
||||||
- build
|
|
||||||
steps:
|
|
||||||
- uses: actions/checkout@v4
|
|
||||||
- uses: "./.github/actions/setup_extension"
|
|
||||||
with:
|
|
||||||
skip_installation: true
|
|
||||||
- name: Install and test citus upgrade
|
|
||||||
run: |-
|
|
||||||
# run make check-citus-upgrade for all citus versions
|
|
||||||
# the image has ${CITUS_VERSIONS} set with all verions it contains the binaries of
|
|
||||||
for citus_version in ${CITUS_VERSIONS}; do \
|
|
||||||
gosu circleci \
|
|
||||||
make -C src/test/regress \
|
|
||||||
check-citus-upgrade \
|
|
||||||
bindir=/usr/lib/postgresql/${PG_MAJOR}/bin \
|
|
||||||
citus-old-version=${citus_version} \
|
|
||||||
citus-pre-tar=/install-pg${PG_MAJOR}-citus${citus_version}.tar \
|
|
||||||
citus-post-tar=${GITHUB_WORKSPACE}/install-$PG_MAJOR.tar; \
|
|
||||||
done;
|
|
||||||
# run make check-citus-upgrade-mixed for all citus versions
|
|
||||||
# the image has ${CITUS_VERSIONS} set with all verions it contains the binaries of
|
|
||||||
for citus_version in ${CITUS_VERSIONS}; do \
|
|
||||||
gosu circleci \
|
|
||||||
make -C src/test/regress \
|
|
||||||
check-citus-upgrade-mixed \
|
|
||||||
citus-old-version=${citus_version} \
|
|
||||||
bindir=/usr/lib/postgresql/${PG_MAJOR}/bin \
|
|
||||||
citus-pre-tar=/install-pg${PG_MAJOR}-citus${citus_version}.tar \
|
|
||||||
citus-post-tar=${GITHUB_WORKSPACE}/install-$PG_MAJOR.tar; \
|
|
||||||
done;
|
|
||||||
- uses: "./.github/actions/save_logs_and_results"
|
|
||||||
if: always()
|
|
||||||
with:
|
|
||||||
folder: ${{ env.PG_MAJOR }}_citus_upgrade
|
|
||||||
- uses: "./.github/actions/upload_coverage"
|
|
||||||
if: always()
|
|
||||||
with:
|
|
||||||
flags: ${{ env.PG_MAJOR }}_citus_upgrade
|
|
||||||
codecov_token: ${{ secrets.CODECOV_TOKEN }}
|
|
||||||
upload-coverage:
|
|
||||||
# secret below is not available for forks so disabling upload action for them
|
|
||||||
if: ${{ github.event.pull_request.head.repo.full_name == github.repository || github.event_name != 'pull_request' }}
|
|
||||||
env:
|
|
||||||
CC_TEST_REPORTER_ID: ${{ secrets.CC_TEST_REPORTER_ID }}
|
|
||||||
runs-on: ubuntu-latest
|
|
||||||
container:
|
|
||||||
image: ${{ needs.params.outputs.test_image_name }}:${{ fromJson(needs.params.outputs.pg17_version).full }}${{ needs.params.outputs.image_suffix }}
|
|
||||||
needs:
|
|
||||||
- params
|
|
||||||
- test-citus
|
|
||||||
- test-arbitrary-configs
|
|
||||||
- test-citus-upgrade
|
|
||||||
- test-pg-upgrade
|
|
||||||
steps:
|
|
||||||
- uses: actions/download-artifact@v4.1.8
|
|
||||||
with:
|
|
||||||
pattern: codeclimate*
|
|
||||||
path: codeclimate
|
|
||||||
merge-multiple: true
|
|
||||||
- name: Upload coverage results to Code Climate
|
|
||||||
run: |-
|
|
||||||
cc-test-reporter sum-coverage codeclimate/*.json -o total.json
|
|
||||||
cc-test-reporter upload-coverage -i total.json
|
|
||||||
ch_benchmark:
|
|
||||||
name: CH Benchmark
|
|
||||||
if: startsWith(github.ref, 'refs/heads/ch_benchmark/')
|
|
||||||
runs-on: ubuntu-latest
|
|
||||||
needs:
|
|
||||||
- build
|
|
||||||
steps:
|
|
||||||
- uses: actions/checkout@v4
|
|
||||||
- uses: azure/login@v1
|
|
||||||
with:
|
|
||||||
creds: ${{ secrets.AZURE_CREDENTIALS }}
|
|
||||||
- name: install dependencies and run ch_benchmark tests
|
|
||||||
uses: azure/CLI@v1
|
|
||||||
with:
|
|
||||||
inlineScript: |
|
|
||||||
cd ./src/test/hammerdb
|
|
||||||
chmod +x run_hammerdb.sh
|
|
||||||
run_hammerdb.sh citusbot_ch_benchmark_rg
|
|
||||||
tpcc_benchmark:
|
|
||||||
name: TPCC Benchmark
|
|
||||||
if: startsWith(github.ref, 'refs/heads/tpcc_benchmark/')
|
|
||||||
runs-on: ubuntu-latest
|
|
||||||
needs:
|
|
||||||
- build
|
|
||||||
steps:
|
|
||||||
- uses: actions/checkout@v4
|
|
||||||
- uses: azure/login@v1
|
|
||||||
with:
|
|
||||||
creds: ${{ secrets.AZURE_CREDENTIALS }}
|
|
||||||
- name: install dependencies and run tpcc_benchmark tests
|
|
||||||
uses: azure/CLI@v1
|
|
||||||
with:
|
|
||||||
inlineScript: |
|
|
||||||
cd ./src/test/hammerdb
|
|
||||||
chmod +x run_hammerdb.sh
|
|
||||||
run_hammerdb.sh citusbot_tpcc_benchmark_rg
|
|
||||||
prepare_parallelization_matrix_32:
|
|
||||||
name: Prepare parallelization matrix
|
|
||||||
if: ${{ needs.test-flakyness-pre.outputs.tests != ''}}
|
|
||||||
needs: test-flakyness-pre
|
|
||||||
runs-on: ubuntu-latest
|
|
||||||
outputs:
|
|
||||||
json: ${{ steps.parallelization.outputs.json }}
|
|
||||||
steps:
|
|
||||||
- uses: actions/checkout@v4
|
|
||||||
- uses: "./.github/actions/parallelization"
|
|
||||||
id: parallelization
|
|
||||||
with:
|
|
||||||
count: 32
|
|
||||||
test-flakyness-pre:
|
|
||||||
name: Detect regression tests need to be ran
|
|
||||||
if: ${{ !inputs.skip_test_flakyness }}}
|
|
||||||
runs-on: ubuntu-latest
|
|
||||||
needs: build
|
|
||||||
outputs:
|
|
||||||
tests: ${{ steps.detect-regression-tests.outputs.tests }}
|
|
||||||
steps:
|
|
||||||
- uses: actions/checkout@v4
|
|
||||||
with:
|
|
||||||
fetch-depth: 0
|
|
||||||
- name: Detect regression tests need to be ran
|
|
||||||
id: detect-regression-tests
|
|
||||||
run: |-
|
|
||||||
detected_changes=$(git diff origin/main... --name-only --diff-filter=AM | (grep 'src/test/regress/sql/.*\.sql\|src/test/regress/spec/.*\.spec\|src/test/regress/citus_tests/test/test_.*\.py' || true))
|
|
||||||
tests=${detected_changes}
|
|
||||||
|
|
||||||
# split the tests to be skipped --today we only skip upgrade tests
|
|
||||||
skipped_tests=""
|
|
||||||
not_skipped_tests=""
|
|
||||||
for test in $tests; do
|
|
||||||
if [[ $test =~ ^src/test/regress/sql/upgrade_ ]]; then
|
|
||||||
skipped_tests="$skipped_tests $test"
|
|
||||||
else
|
|
||||||
not_skipped_tests="$not_skipped_tests $test"
|
|
||||||
fi
|
|
||||||
done
|
|
||||||
|
|
||||||
if [ ! -z "$skipped_tests" ]; then
|
|
||||||
echo "Skipped tests " $skipped_tests
|
|
||||||
fi
|
|
||||||
|
|
||||||
if [ -z "$not_skipped_tests" ]; then
|
|
||||||
echo "Not detected any tests that flaky test detection should run"
|
|
||||||
else
|
|
||||||
echo "Detected tests " $not_skipped_tests
|
|
||||||
fi
|
|
||||||
|
|
||||||
echo 'tests<<EOF' >> $GITHUB_OUTPUT
|
|
||||||
echo "$not_skipped_tests" >> "$GITHUB_OUTPUT"
|
|
||||||
echo 'EOF' >> $GITHUB_OUTPUT
|
|
||||||
test-flakyness:
|
|
||||||
if: ${{ needs.test-flakyness-pre.outputs.tests != ''}}
|
|
||||||
name: Test flakyness
|
|
||||||
runs-on: ubuntu-latest
|
|
||||||
container:
|
|
||||||
image: ${{ needs.params.outputs.fail_test_image_name }}:${{ fromJson(needs.params.outputs.pg17_version).full }}${{ needs.params.outputs.image_suffix }}
|
|
||||||
options: --user root
|
|
||||||
env:
|
|
||||||
runs: 8
|
|
||||||
needs:
|
|
||||||
- params
|
|
||||||
- build
|
|
||||||
- test-flakyness-pre
|
|
||||||
- prepare_parallelization_matrix_32
|
|
||||||
strategy:
|
|
||||||
fail-fast: false
|
|
||||||
matrix: ${{ fromJson(needs.prepare_parallelization_matrix_32.outputs.json) }}
|
|
||||||
steps:
|
|
||||||
- uses: actions/checkout@v4
|
|
||||||
- uses: actions/download-artifact@v4.1.8
|
|
||||||
- uses: "./.github/actions/setup_extension"
|
|
||||||
- name: Run minimal tests
|
|
||||||
run: |-
|
|
||||||
tests="${{ needs.test-flakyness-pre.outputs.tests }}"
|
|
||||||
tests_array=($tests)
|
|
||||||
for test in "${tests_array[@]}"
|
|
||||||
do
|
|
||||||
test_name=$(echo "$test" | sed -r "s/.+\/(.+)\..+/\1/")
|
|
||||||
gosu circleci src/test/regress/citus_tests/run_test.py $test_name --repeat ${{ env.runs }} --use-whole-schedule-line
|
|
||||||
done
|
|
||||||
shell: bash
|
|
||||||
- uses: "./.github/actions/save_logs_and_results"
|
|
||||||
if: always()
|
|
||||||
with:
|
|
||||||
folder: test_flakyness_parallel_${{ matrix.id }}
|
|
|
@@ -1,79 +0,0 @@
name: "CodeQL"

on:
  schedule:
    - cron: '59 23 * * 6'
  workflow_dispatch:

jobs:
  analyze:
    name: Analyze
    runs-on: ubuntu-22.04
    permissions:
      actions: read
      contents: read
      security-events: write

    strategy:
      fail-fast: false
      matrix:
        language: [ 'cpp', 'python']

    steps:
    - name: Checkout repository
      uses: actions/checkout@v4

    - name: Initialize CodeQL
      uses: github/codeql-action/init@v3
      with:
        languages: ${{ matrix.language }}

    - name: Install package dependencies
      run: |
        # Create the file repository configuration:
        sudo sh -c 'echo "deb http://apt.postgresql.org/pub/repos/apt $(lsb_release -cs)-pgdg main 15" > /etc/apt/sources.list.d/pgdg.list'
        # Import the repository signing key:
        wget --quiet -O - https://www.postgresql.org/media/keys/ACCC4CF8.asc | sudo apt-key add -
        sudo apt-get update
        sudo apt-get install -y --no-install-recommends \
          autotools-dev \
          build-essential \
          ca-certificates \
          curl \
          debhelper \
          devscripts \
          fakeroot \
          flex \
          libcurl4-openssl-dev \
          libdistro-info-perl \
          libedit-dev \
          libfile-fcntllock-perl \
          libicu-dev \
          libkrb5-dev \
          liblz4-1 \
          liblz4-dev \
          libpam0g-dev \
          libreadline-dev \
          libselinux1-dev \
          libssl-dev \
          libxslt-dev \
          libzstd-dev \
          libzstd1 \
          lintian \
          postgresql-server-dev-15 \
          postgresql-server-dev-all \
          python3-pip \
          python3-setuptools \
          wget \
          zlib1g-dev

    - name: Configure, Build and Install Citus
      if: matrix.language == 'cpp'
      run: |
        ./configure
        make -sj8
        sudo make install-all

    - name: Perform CodeQL Analysis
      uses: github/codeql-action/analyze@v3
@@ -1,54 +0,0 @@
name: "Build devcontainer"

# Since building of containers can be quite time consuming, and take up some storage,
# there is no need to finish a build for a tag if new changes are concurrently being made.
# This cancels any previous builds for the same tag, and only the latest one will be kept.
concurrency:
  group: ${{ github.workflow }}-${{ github.ref }}
  cancel-in-progress: true

on:
  push:
    paths:
      - ".devcontainer/**"
  workflow_dispatch:

jobs:
  docker:
    runs-on: ubuntu-latest
    permissions:
      contents: read
      packages: write
      attestations: write
      id-token: write
    steps:
      -
        name: Docker meta
        id: meta
        uses: docker/metadata-action@v5
        with:
          images: |
            ghcr.io/citusdata/citus-devcontainer
          tags: |
            type=ref,event=branch
            type=sha
      -
        name: Set up Docker Buildx
        uses: docker/setup-buildx-action@v2
      -
        name: 'Login to GitHub Container Registry'
        uses: docker/login-action@v3
        with:
          registry: ghcr.io
          username: ${{github.actor}}
          password: ${{secrets.GITHUB_TOKEN}}
      -
        name: Build and push
        uses: docker/build-push-action@v5
        with:
          context: "{{defaultContext}}:.devcontainer"
          push: true
          tags: ${{ steps.meta.outputs.tags }}
          labels: ${{ steps.meta.outputs.labels }}
          cache-from: type=gha
          cache-to: type=gha,mode=max
@@ -1,79 +0,0 @@
name: Flaky test debugging
run-name: Flaky test debugging - ${{ inputs.flaky_test }} (${{ inputs.flaky_test_runs_per_job }}x${{ inputs.flaky_test_parallel_jobs }})

concurrency:
  group: ${{ github.workflow }}-${{ github.event.pull_request.number || github.ref }}
  cancel-in-progress: true

on:
  workflow_dispatch:
    inputs:
      flaky_test:
        required: true
        type: string
        description: Test to run
      flaky_test_runs_per_job:
        required: false
        default: 8
        type: number
        description: Number of times to run the test
      flaky_test_parallel_jobs:
        required: false
        default: 32
        type: number
        description: Number of parallel jobs to run

jobs:
  build:
    name: Build Citus
    runs-on: ubuntu-latest
    container:
      image: ${{ vars.build_image_name }}:${{ vars.pg15_version }}${{ vars.image_suffix }}
      options: --user root
    steps:
      - uses: actions/checkout@v4
      - name: Configure, Build, and Install
        run: |
          echo "PG_MAJOR=${PG_MAJOR}" >> $GITHUB_ENV
          ./ci/build-citus.sh
        shell: bash
      - uses: actions/upload-artifact@v4.6.0
        with:
          name: build-${{ env.PG_MAJOR }}
          path: |-
            ./build-${{ env.PG_MAJOR }}/*
            ./install-${{ env.PG_MAJOR }}.tar
  prepare_parallelization_matrix:
    name: Prepare parallelization matrix
    runs-on: ubuntu-latest
    outputs:
      json: ${{ steps.parallelization.outputs.json }}
    steps:
      - uses: actions/checkout@v4
      - uses: "./.github/actions/parallelization"
        id: parallelization
        with:
          count: ${{ inputs.flaky_test_parallel_jobs }}
  test_flakyness:
    name: Test flakyness
    runs-on: ubuntu-latest
    container:
      image: ${{ vars.fail_test_image_name }}:${{ vars.pg15_version }}${{ vars.image_suffix }}
      options: --user root
    needs:
      [build, prepare_parallelization_matrix]
    env:
      test: "${{ inputs.flaky_test }}"
      runs: "${{ inputs.flaky_test_runs_per_job }}"
      skip: false
    strategy:
      fail-fast: false
      matrix: ${{ fromJson(needs.prepare_parallelization_matrix.outputs.json) }}
    steps:
      - uses: actions/checkout@v4
      - uses: "./.github/actions/setup_extension"
      - name: Run minimal tests
        run: |-
          gosu circleci src/test/regress/citus_tests/run_test.py ${{ env.test }} --repeat ${{ env.runs }} --use-whole-schedule-line
        shell: bash
      - uses: "./.github/actions/save_logs_and_results"
        if: always()
        with:
          folder: check_flakyness_parallel_${{ matrix.id }}
@@ -1,177 +0,0 @@
name: Build tests in packaging images

on:
  pull_request:
    types: [opened, reopened,synchronize]
  merge_group:
  workflow_dispatch:

concurrency:
  group: ${{ github.workflow }}-${{ github.ref }}
  cancel-in-progress: true

jobs:

  get_postgres_versions_from_file:
    runs-on: ubuntu-latest
    outputs:
      pg_versions: ${{ steps.get-postgres-versions.outputs.pg_versions }}
    steps:
      - name: Checkout
        uses: actions/checkout@v4
        with:
          fetch-depth: 2
      - name: Get Postgres Versions
        id: get-postgres-versions
        run: |
          set -euxo pipefail
          # Postgres versions are stored in .github/workflows/build_and_test.yml
          # file in json strings with major and full keys.
          # Below command extracts the versions and get the unique values.
          pg_versions=$(cat .github/workflows/build_and_test.yml | grep -oE '"major": "[0-9]+", "full": "[0-9.]+"' | sed -E 's/"major": "([0-9]+)", "full": "([0-9.]+)"/\1/g' | sort | uniq | tr '\n', ',')
          pg_versions_array="[ ${pg_versions} ]"
          echo "Supported PG Versions: ${pg_versions_array}"
          # Below line is needed to set the output variable to be used in the next job
          echo "pg_versions=${pg_versions_array}" >> $GITHUB_OUTPUT
        shell: bash

  rpm_build_tests:
    name: rpm_build_tests
    needs: get_postgres_versions_from_file
    runs-on: ubuntu-latest
    strategy:
      fail-fast: false
      matrix:
        # While we use separate images for different Postgres versions in rpm
        # based distros
        # For this reason, we need to use a "matrix" to generate names of
        # rpm images, e.g. citus/packaging:centos-7-pg12
        packaging_docker_image:
          - oraclelinux-8
          - almalinux-8
          - almalinux-9
        POSTGRES_VERSION: ${{ fromJson(needs.get_postgres_versions_from_file.outputs.pg_versions) }}

    container:
      image: citus/packaging:${{ matrix.packaging_docker_image }}-pg${{ matrix.POSTGRES_VERSION }}
      options: --user root

    steps:
      - name: Checkout repository
        uses: actions/checkout@v4

      - name: Set Postgres and python parameters for rpm based distros
        run: |
          echo "/usr/pgsql-${{ matrix.POSTGRES_VERSION }}/bin" >> $GITHUB_PATH
          echo "/root/.pyenv/bin:$PATH" >> $GITHUB_PATH
          echo "PACKAGING_PYTHON_VERSION=3.8.16" >> $GITHUB_ENV

      - name: Configure
        run: |
          echo "Current Shell:$0"
          echo "GCC Version: $(gcc --version)"
          ./configure 2>&1 | tee output.log

      - name: Make clean
        run: |
          make clean

      - name: Make
        run: |
          git config --global --add safe.directory ${GITHUB_WORKSPACE}
          make CFLAGS="-Wno-missing-braces" -sj$(cat /proc/cpuinfo | grep "core id" | wc -l) 2>&1 | tee -a output.log

          # Check the exit code of the make command
          make_exit_code=${PIPESTATUS[0]}

          # If the make command returned a non-zero exit code, exit with the same code
          if [[ $make_exit_code -ne 0 ]]; then
            echo "make command failed with exit code $make_exit_code"
            exit $make_exit_code
          fi

      - name: Make install
        run: |
          make CFLAGS="-Wno-missing-braces" install 2>&1 | tee -a output.log

      - name: Validate output
        env:
          POSTGRES_VERSION: ${{ matrix.POSTGRES_VERSION }}
          PACKAGING_DOCKER_IMAGE: ${{ matrix.packaging_docker_image }}
        run: |
          echo "Postgres version: ${POSTGRES_VERSION}"
          ./.github/packaging/validate_build_output.sh "rpm"

  deb_build_tests:
    name: deb_build_tests
    needs: get_postgres_versions_from_file
    runs-on: ubuntu-latest
    strategy:
      fail-fast: false
      matrix:
        # On deb based distros, we use the same docker image for
        # builds based on different Postgres versions because deb
        # based images include all postgres installations.
        # For this reason, we have multiple runs --which is 3 today--
        # for each deb based image and we use POSTGRES_VERSION to set
        # PG_CONFIG variable in each of those runs.
        packaging_docker_image:
          - debian-bookworm-all
          - debian-bullseye-all
          - ubuntu-focal-all
          - ubuntu-jammy-all

        POSTGRES_VERSION: ${{ fromJson(needs.get_postgres_versions_from_file.outputs.pg_versions) }}

    container:
      image: citus/packaging:${{ matrix.packaging_docker_image }}
      options: --user root

    steps:
      - name: Checkout repository
        uses: actions/checkout@v4

      - name: Set pg_config path and python parameters for deb based distros
        run: |
          echo "PG_CONFIG=/usr/lib/postgresql/${{ matrix.POSTGRES_VERSION }}/bin/pg_config" >> $GITHUB_ENV
          echo "/root/.pyenv/bin:$PATH" >> $GITHUB_PATH
          echo "PACKAGING_PYTHON_VERSION=3.8.16" >> $GITHUB_ENV

      - name: Configure
        run: |
          echo "Current Shell:$0"
          echo "GCC Version: $(gcc --version)"
          ./configure 2>&1 | tee output.log

      - name: Make clean
        run: |
          make clean

      - name: Make
        shell: bash
        run: |
          set -e
          git config --global --add safe.directory ${GITHUB_WORKSPACE}
          make -sj$(cat /proc/cpuinfo | grep "core id" | wc -l) 2>&1 | tee -a output.log

          # Check the exit code of the make command
          make_exit_code=${PIPESTATUS[0]}

          # If the make command returned a non-zero exit code, exit with the same code
          if [[ $make_exit_code -ne 0 ]]; then
            echo "make command failed with exit code $make_exit_code"
            exit $make_exit_code
          fi

      - name: Make install
        run: |
          make install 2>&1 | tee -a output.log

      - name: Validate output
        env:
          POSTGRES_VERSION: ${{ matrix.POSTGRES_VERSION }}
          PACKAGING_DOCKER_IMAGE: ${{ matrix.packaging_docker_image }}
        run: |
          echo "Postgres version: ${POSTGRES_VERSION}"
          ./.github/packaging/validate_build_output.sh "deb"
@@ -25,7 +25,6 @@ win32ver.rc
*.exe
lib*dll.def
lib*.pc
*.bc

# Local excludes in root directory
/config.log
@@ -37,24 +36,3 @@ lib*.pc
/autom4te.cache
/Makefile.global
/src/Makefile.custom
/compile_commands.json
/src/backend/distributed/cdc/build-cdc-*/*
/src/test/cdc/tmp_check/*

# temporary files vim creates
*.swp

# vscode
.vscode/*

# output from diff normalization that shouldn't be commited
*.unmodified
*.modified

# style related temporary outputs
*.uncrustify
.venv

# added output when modifying check_gucs_are_alphabetically_sorted.sh
guc.out
@@ -0,0 +1,19 @@
sudo: required
dist: trusty
language: c
cache: apt
branches:
  except: [ /^open-.*$/ ]
env:
  global:
    secure: degV+qb2xHiea7E2dGk/WLvmYjq4ZsBn6ZPko+YhRcNm2GRXRaU3FqMBIecPtsEEFYaL5GwCQq/CgBf9aQxgDQ+t2CrmtGTtI9AGAbVBl//amNeJOoLe6QvrDpSQX5pUxwDLCng8cvoQK7ZxGlNCzDKiu4Ep4DUWgQVpauJkQ9nHjtSMZvUqCoI9h1lBy9Mxh7YFfHPW2PAXCqpV4VlNiIYF84UKdX3MXKLy9Yt0JBSNTWLZFp/fFw2qNwzFvN94rF3ZvFSD7Wp6CIhT6R5/6k6Zx8YQIrjWhgm6OVy1osUA8X7W79h2ISPqKqMNVJkjJ+N8S4xuQU0kfejnQ74Ie/uJiHCmbW5W2TjpL1aU3FQpPsGwR8h0rSeHhJAJzd8Ma+z8vvnnQHDyvetPBB0WgA/VMQCu8uEutyfYw2hDmB2+l2dDwkViaI7R95bReAGrpd5uNqklAXuR7yOeArz0ZZpHV0aZHGcNBxznMaZExSVZ5DVPW38UPn7Kgse8BnOWeLgnA1hJVp6CmBCtu+hKYt+atBPgRbM8IUINnKKZf/Sk6HeJIJZs662jD8/X93vFi0ZtyV2jEKJpouWw8j4vrGGsaDzTEUcyJgDqZj7tPJptM2L5B3BcFJmkGj2HO3N+LGDarJrVBBSiEjhTgx4NnLiKZnUbMx547mCRg2akk2w=
  matrix:
    - PGVERSION=9.5
before_install:
  - git clone -b v0.3.3 --depth 1 https://github.com/citusdata/tools.git
  - tools/travis/setup_apt.sh
  - tools/travis/nuke_pg.sh
install:
  - tools/travis/install_pg.sh
script: tools/travis/pg_travis_multi_test.sh
after_success: tools/travis/sync_to_enterprise
3582
CHANGELOG.md
@@ -1,9 +0,0 @@
# Microsoft Open Source Code of Conduct

This project has adopted the [Microsoft Open Source Code of Conduct](https://opensource.microsoft.com/codeofconduct/).

Resources:

- [Microsoft Open Source Code of Conduct](https://opensource.microsoft.com/codeofconduct/)
- [Microsoft Code of Conduct FAQ](https://opensource.microsoft.com/codeofconduct/faq/)
- Contact [opencode@microsoft.com](mailto:opencode@microsoft.com) with questions or concerns
215
CONTRIBUTING.md
@@ -6,70 +6,13 @@ We're happy you want to contribute! You can help us in different ways:
  suggestions for improvements
* Fork this repository and submit a pull request

Before accepting any code contributions we ask that contributors
Before accepting any code contributions we ask that Citus contributors
sign a Contributor License Agreement (CLA). For an explanation of
why we ask this as well as instructions for how to proceed, see the
[Microsoft CLA](https://cla.opensource.microsoft.com/).
[Citus CLA](https://cla.citusdata.com).

### Devcontainer / Github Codespaces

The easiest way to start contributing is via our devcontainer. This container works both locally in visual studio code with docker-desktop/docker-for-mac as well as [Github Codespaces](https://github.com/features/codespaces). To open the project in vscode you will need the [Dev Containers extension](https://marketplace.visualstudio.com/items?itemName=ms-vscode-remote.remote-containers). For codespaces you will need to [create a new codespace](https://codespace.new/citusdata/citus).

With the extension installed you can run the following from the command pallet to get started

```
> Dev Containers: Clone Repository in Container Volume...
```

In the subsequent popup paste the url to the repo and hit enter.

```
https://github.com/citusdata/citus
```

This will create an isolated Workspace in vscode, complete with all tools required to build, test and run the Citus extension. We keep this container up to date with the supported postgres versions as well as the exact versions of tooling we use.

To quickly start we suggest splitting your terminal once to have two shells. The left one in the `/workspaces/citus`, the second one changed to `/data`. The left terminal will be used to interact with the project, the right one with a testing cluster.

To get citus installed from source we run `make install -s` in the first terminal. Once installed you can start a Citus cluster in the second terminal via `citus_dev make citus`. The cluster will run in the background, and can be interacted with via `citus_dev`. To get an overview of the available commands.

With the Citus cluster running you can connect to the coordinator in the first terminal via `psql -p9700`. Because the coordinator is the most common entrypoint the `PGPORT` environment is set accordingly, so a simple `psql` will connect directly to the coordinator.
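A minimal sketch of that first session, assuming the two-terminal devcontainer layout described above (one shell in `/workspaces/citus`, one in `/data`):

```bash
# First terminal (/workspaces/citus): build and install Citus from source
make install -s

# Second terminal (/data): create and start a local test cluster in the background
citus_dev make citus

# Back in the first terminal: connect to the coordinator (PGPORT is preset to 9700 here)
psql -p9700
```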
### Debugging in the VS code

1. Start Debugging: Press F5 in VS Code to start debugging. When prompted, you'll need to attach the debugger to the appropriate PostgreSQL process.

2. Identify the Process: If you're running a psql command, take note of the PID that appears in your psql prompt. For example:
   ```
   [local] citus@citus:9700 (PID: 5436)=#
   ```
   This PID (5436 in this case) indicates the process that you should attach the debugger to.
   If you are uncertain about which process to attach, you can list all running PostgreSQL processes using the following command:
   ```
   ps aux | grep postgres
   ```
   Look for the process associated with the PID you noted. For example:
   ```
   citus 5436 0.0 0.0 0 0 ? S 14:00 0:00 postgres: citus citus
   ```
4. Attach the Debugger: Once you've identified the correct PID, select that process when prompted in VS Code to attach the debugger. You should now be able to debug the PostgreSQL session tied to the psql command. (A plain-gdb alternative is sketched after this list.)

5. Set Breakpoints and Debug: With the debugger attached, you can set breakpoints within the code. This allows you to step through the code execution, inspect variables, and fully debug the PostgreSQL instance running in your container.
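If you prefer attaching from a terminal instead of the VS Code debugger, the same attach step can be done with plain gdb; this is only a sketch, and 5436 is the example PID from the prompt above:

```bash
# Attach gdb to the backend process identified from the psql prompt
gdb -p 5436
```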
### Getting and building

[PostgreSQL documentation](https://www.postgresql.org/support/versioning/) has a
section on upgrade policy.

We always recommend that all users run the latest available minor release [for PostgreSQL] for whatever major version is in use.

We expect Citus users to honor this recommendation and use latest available
PostgreSQL minor release. Failure to do so may result in failures in our test
suite. There are some known improvements in PG test architecture such as
[this commit](https://github.com/postgres/postgres/commit/3f323956128ff8589ce4d3a14e8b950837831803)
that are missing in earlier minor versions.

#### Mac

1. Install Xcode
@@ -77,7 +20,7 @@ that are missing in earlier minor versions.
   ```bash
   brew update
   brew install git postgresql python
   brew install git postgresql
   ```

3. Get, build, and test the code
@@ -87,19 +30,9 @@ that are missing in earlier minor versions.
   cd citus
   ./configure
   # If you have already installed the project, you need to clean it first
   make clean
   make
   make install
   # Optionally, you might instead want to use `make install-all`
   # since `multi_extension` regression test would fail due to missing downgrade scripts.
   cd src/test/regress

   pip install pipenv
   pipenv --rm
   pipenv install
   pipenv shell

   make check
   ```
@@ -114,11 +47,9 @@ that are missing in earlier minor versions.
   sudo apt-key add -
   sudo apt-get update
   sudo apt-get install -y postgresql-server-dev-14 postgresql-14 \
   sudo apt-get install -y postgresql-server-dev-9.5 postgresql-9.5 \
                           autoconf flex git libcurl4-gnutls-dev libicu-dev \
                           libedit-dev libselinux1-dev libxslt-dev \
                           libkrb5-dev liblz4-dev libpam0g-dev libreadline-dev \
                           libpam0g-dev git flex make
                           libselinux1-dev libssl-dev libxslt1-dev libzstd-dev \
                           make uuid-dev
   ```

2. Get, build, and test the code
@@ -127,157 +58,35 @@ that are missing in earlier minor versions.
   git clone https://github.com/citusdata/citus.git
   cd citus
   ./configure
   # If you have already installed the project previously, you need to clean it first
   make clean
   make
   sudo make install
   # Optionally, you might instead want to use `sudo make install-all`
   # since `multi_extension` regression test would fail due to missing downgrade scripts.
   cd src/test/regress

   pip install pipenv
   pipenv --rm
   pipenv install
   pipenv shell

   make check
   ```

#### Red Hat-based Linux (RHEL, CentOS, Fedora)

1. Find the RPM URL for your repo at [yum.postgresql.org](http://yum.postgresql.org/repopackages.php)
1. Find the PostgreSQL 9.5 RPM URL for your repo at [yum.postgresql.org](http://yum.postgresql.org/repopackages.php#pg95)
2. Register its contents with Yum:

   ```bash
   sudo yum install -y <url>
   ```

3. Register EPEL and SCL repositories for your distro.
3. Install build dependencies

   On CentOS:

   ```bash
   yum install -y centos-release-scl-rh epel-release
   ```

   On RHEL, see [this RedHat blog post](https://developers.redhat.com/blog/2018/07/07/yum-install-gcc7-clang/) to install set-up SCL first. Then run:

   ```bash
   yum install -y epel-release
   ```

4. Install build dependencies

   ```bash
   sudo yum update -y
   sudo yum groupinstall -y 'Development Tools'
   sudo yum install -y postgresql14-devel postgresql14-server \
   sudo yum install -y postgresql95-devel postgresql95-server \
                       git libcurl-devel libxml2-devel libxslt-devel \
                       libxml2-devel libxslt-devel openssl-devel \
                       libzstd-devel llvm-toolset-7-clang llvm5.0 lz4-devel \
                       pam-devel readline-devel git
                       openssl-devel pam-devel readline-devel

   git clone https://github.com/citusdata/citus.git
   cd citus
   PG_CONFIG=/usr/pgsql-14/bin/pg_config ./configure
   PG_CONFIG=/usr/pgsql-9.5/bin/pg_config ./configure
   # If you have already installed the project previously, you need to clean it first
   make clean
   make
   sudo make install
   # Optionally, you might instead want to use `sudo make install-all`
   # since `multi_extension` regression test would fail due to missing downgrade scripts.
   cd src/test/regress

   pip install pipenv
   pipenv --rm
   pipenv install
   pipenv shell

   make check
   ```

### Following our coding conventions

Our coding conventions are documented in [STYLEGUIDE.md](STYLEGUIDE.md).

### Making SQL changes

Sometimes you need to make change to the SQL that the citus extension runs upon
creations. The way this is done is by changing the last file in
`src/backend/distributed/sql`, or creating it if the last file is from a
published release. If you needed to create a new file, also change the
`default_version` field in `src/backend/distributed/citus.control` to match your
new version. All the files in this directory are run in order based on
their name. See [this page in the Postgres
docs](https://www.postgresql.org/docs/current/extend-extensions.html) for more
information on how Postgres runs these files.

#### Changing or creating functions

If you need to change any functions defined by Citus. You should check inside
`src/backend/distributed/sql/udfs` to see if there is already a directory for
this function, if not create one. Then change or create the file called
`latest.sql` in that directory to match how it should create the function. This
should be including any DROP (IF EXISTS), COMMENT and REVOKE statements for this
function.

Then copy the `latest.sql` file to `{version}.sql`, where `{version}` is the
version for which this sql change is, e.g. `{9.0-1.sql}`. Now that you've
created this stable snapshot of the function definition for your version you
should use it in your actual sql file, e.g.
`src/backend/distributed/sql/citus--8.3-1--9.0-1.sql`. You do this by using C
style `#include` statements like this:
```
#include "udfs/myudf/9.0-1.sql"
```
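Put together, a hedged sketch of the snapshotting steps above could look like the following; the function name `my_udf` is made up for illustration, while the directory layout, the `9.0-1` version and the `#include` convention come from the text:

```bash
# Hypothetical sketch of the UDF versioning workflow described above;
# "my_udf" is an invented example, not a real Citus function.
cd src/backend/distributed/sql

# 1. Edit the living definition of the function
$EDITOR udfs/my_udf/latest.sql

# 2. Freeze it as the stable snapshot for the version being released
cp udfs/my_udf/latest.sql udfs/my_udf/9.0-1.sql

# 3. Reference the snapshot from the migration script for that version
echo '#include "udfs/my_udf/9.0-1.sql"' >> citus--8.3-1--9.0-1.sql
```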
#### Other SQL

Any other SQL you can put directly in the main sql file, e.g.
`src/backend/distributed/sql/citus--8.3-1--9.0-1.sql`.

### Backporting a commit to a release branch

1. Check out the release branch that you want to backport to `git checkout release-11.3`
2. Make sure you have the latest changes `git pull`
3. Create a new release branch with a unique name `git checkout -b release-11.3-<yourname>`
4. Cherry-pick the commit that you want to backport `git cherry-pick -x <sha>` (the `-x` is important)
5. Push the branch `git push`
6. Wait for tests to pass
7. If the cherry-pick required non-trivial merge conflicts, create a PR and ask
   for a review.
8. After the tests pass on CI, fast-forward the release branch `git push origin release-11.3-<yourname>:release-11.3` (the whole sequence is collected in the sketch below)
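The same steps, collected into one hedged sequence; `release-11.3`, `<yourname>` and `<sha>` are the same placeholders used in the list above:

```bash
# Consolidated sketch of the backport steps above
git checkout release-11.3
git pull
git checkout -b release-11.3-<yourname>
git cherry-pick -x <sha>   # -x records the original commit hash in the message
git push
# after CI passes (and after a PR review if the cherry-pick had conflicts):
git push origin release-11.3-<yourname>:release-11.3
```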
### Running tests

See [`src/test/regress/README.md`](https://github.com/citusdata/citus/blob/master/src/test/regress/README.md)

### Documentation

User-facing documentation is published on [docs.citusdata.com](https://docs.citusdata.com/). When adding a new feature, function, or setting, you can open a pull request or issue against the [Citus docs repo](https://github.com/citusdata/citus_docs/).

Detailed descriptions of the implementation for Citus developers are provided in the [Citus Technical Documentation](src/backend/distributed/README.md). It is currently a single file for ease of searching. Please update the documentation if you make any changes that affect the design or add major new features.

# Making a pull request ready for reviews

Asking for help and asking for reviews are two different things. When you're asking for help, you're asking for someone to help you with something that you're not expected to know.

But when you're asking for a review, you're asking for someone to review your work and provide feedback. So, when you're asking for a review, you're expected to make sure that:

* Your changes don't perform **unnecessary line addition / deletions / style changes on unrelated files / lines**.

* All CI jobs are **passing**, including **style checks** and **flaky test detection jobs**. Note that if you're an external contributor, you don't have to wait CI jobs to run (and finish) because they don't get automatically triggered for external contributors.

* Your PR has necessary amount of **tests** and that they're passing.

* You separated as much as possible work into **separate PRs**, e.g., a prerequisite bugfix, a refactoring etc..

* Your PR doesn't introduce a typo or something that you can easily fix yourself.

* After all CI jobs pass, code-coverage measurement job (CodeCov as of today) then kicks in. That's why it's important to make the **tests passing** first. At that point, you're expected to check **CodeCov annotations** that can be seen in the **Files Changed** tab and expected to make sure that it doesn't complain about any lines that are not covered. For example, it's ok if CodeCov complains about an `ereport()` call that you put for an "unexpected-but-better-than-crashing" case, but it's not ok if it complains about an uncovered `if` branch that you added.

* And finally, perform a **self-review** to make sure that:
  * Code and code-comments reflects the idea **without requiring an extra explanation** via a chat message / email / PR comment.
    This is important because we don't expect developers to reach out to author / read about the whole discussion in the PR to understand the idea behind a commit merged into `main` branch.
  * PR description is clear enough.
  * If-and-only-if you're **introducing a user facing change / bugfix**, your PR has a line that starts with `DESCRIPTION: <Present simple tense word that starts with a capital letter, e.g., Adds support for / Fixes / Disallows>`.
  * **Commit messages** are clear enough if the commits are doing logically different things.
@@ -1,43 +0,0 @@
# Devcontainer

## Coredumps

When postgres/citus crashes, there is the option to create a coredump. This is useful for debugging the issue. Coredumps are enabled in the devcontainer by default. However, not all environments are configured correctly out of the box. The most important configuration that is not standardized is the `core_pattern`. The configuration can be verified from the container, however, you cannot change this setting from inside the container as the filesystem containing this setting is in read only mode while inside the container.

To verify if corefiles are written run the following command in a terminal. This shows the filename pattern with which the corefile will be written.
```bash
cat /proc/sys/kernel/core_pattern
```

This should be configured with a relative path or simply a simple filename, such as `core`. When your environment shows an absolute path you will need to change this setting. How to change this setting depends highly on the underlying system as the setting needs to be changed on the kernel of the host running the container.

You can put any pattern in `/proc/sys/kernel/core_pattern` as you see fit. eg. You can add the PID to the core pattern in one of two ways;
- You either include `%p` in the core_pattern. This gets substituted with the PID of the crashing process (see the sketch after this list).
- Alternatively you could set `/proc/sys/kernel/core_uses_pid` to `1` in the same way as you set `core_pattern`. This will append the PID to the corefile if `%p` is not explicitly contained in the core_pattern.
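A hedged sketch of the first option, run on the host kernel (not inside the container, as explained above):

```bash
# Write corefiles as e.g. core.12345, with the PID of the crashing process appended via %p
echo "core.%p" > /proc/sys/kernel/core_pattern
```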
When a coredump is written you can use the debug/launch configuration `Open core file` which is preconfigured in the devcontainer. This will open a fileprompt that lists all coredumps that are found in your workspace. When you want to debug coredumps from `citus_dev` that are run in your `/data` directory, you can add the data directory to your workspace. In the command pallet of vscode you can run `>Workspace: Add Folder to Workspace...` and select the `/data` directory. This will allow you to open the coredumps from the `/data` directory in the `Open core file` debug configuration.

### Windows (docker desktop)

When running in docker desktop on windows you will most likely need to change this setting. The linux guest in WSL2 that runs your container is the `docker-desktop` environment. The easiest way to get onto the host, where you can change this setting, is to open a powershell window and verify you have the docker-desktop environment listed.

```powershell
wsl --list
```

Among others this should list both `docker-desktop` and `docker-desktop-data`. You can then open a shell in the `docker-desktop` environment.

```powershell
wsl -d docker-desktop
```

Inside this shell you can verify that you have the right environment by running

```bash
cat /proc/sys/kernel/core_pattern
```

This should show the same configuration as the one you see inside the devcontainer. You can then change the setting by running the following command.
This will change the setting for the current session. If you want to make the change permanent you will need to add this to a startup script.

```bash
echo "core" > /proc/sys/kernel/core_pattern
```
2
LICENSE
@@ -658,4 +658,4 @@ specific requirements.
You should also get your employer (if you work as a programmer) or school,
if any, to sign a "copyright disclaimer" for the program, if necessary.
For more information on this, and how to apply and follow the GNU AGPL, see
<http://www.gnu.org/licenses/>.
55
Makefile
@@ -2,7 +2,6 @@
citus_subdir = .
citus_top_builddir = .
extension_dir = $(shell $(PG_CONFIG) --sharedir)/extension

# Hint that configure should be run first
ifeq (,$(wildcard Makefile.global))
@@ -11,57 +10,47 @@ endif
include Makefile.global

all: extension
all: extension csql

# build columnar only
columnar:
    $(MAKE) -C src/backend/columnar all
# build extension
extension: $(citus_top_builddir)/src/include/citus_version.h columnar
extension:
    $(MAKE) -C src/backend/distributed/ all
install-columnar: columnar
install-extension: extension
    $(MAKE) -C src/backend/columnar install
install-extension: extension install-columnar
    $(MAKE) -C src/backend/distributed/ install
install-headers: extension
    $(MKDIR_P) '$(DESTDIR)$(includedir_server)/distributed/'
    # generated headers are located in the build directory
    $(INSTALL_DATA) $(citus_top_builddir)/src/include/citus_version.h '$(DESTDIR)$(includedir_server)/'
    $(INSTALL_DATA) src/include/citus_config.h '$(DESTDIR)$(includedir_server)/'
    # the rest in the source tree
    $(INSTALL_DATA) $(citus_abs_srcdir)/src/include/distributed/*.h '$(DESTDIR)$(includedir_server)/distributed/'

clean-extension:
    $(MAKE) -C src/backend/distributed/ clean
    $(MAKE) -C src/backend/columnar/ clean
.PHONY: extension install-extension clean-extension
clean-full:
    $(MAKE) -C src/backend/distributed/ clean-full
.PHONY: extension install-extension clean-extension clean-full

install-downgrades:
    $(MAKE) -C src/backend/distributed/ install-downgrades
install-all: install-headers
    $(MAKE) -C src/backend/columnar/ install-all
    $(MAKE) -C src/backend/distributed/ install-all

# Add to generic targets
install: install-extension install-headers
clean: clean-extension

# build csql binary
csql:
    $(MAKE) -C src/bin/csql/ all
install-csql: csql
    $(MAKE) -C src/bin/csql/ install
clean-csql:
    $(MAKE) -C src/bin/csql/ clean
.PHONY: csql install-csql clean-csql
# Add to generic targets
install: install-csql
clean: clean-csql

# apply or check style
reindent:
    ${citus_abs_top_srcdir}/ci/fix_style.sh
    cd ${citus_abs_top_srcdir} && citus_indent --quiet
check-style:
    black . --check --quiet
    isort . --check --quiet
    flake8
    cd ${citus_abs_top_srcdir} && citus_indent --quiet --check
.PHONY: reindent check-style

# depend on install-all so that downgrade scripts are installed as well
# depend on install for now
check: all install-all
check: all install
# explicetely does not use $(MAKE) to avoid parallelism
    $(MAKE) -C src/test/regress check-full
    make -C src/test/regress check

.PHONY: all check clean install install-downgrades install-all
.PHONY: all check install clean
@@ -11,29 +11,9 @@
citus_abs_srcdir:=@abs_top_srcdir@/${citus_subdir}
citus_abs_top_srcdir:=@abs_top_srcdir@
postgres_abs_srcdir:=@POSTGRES_SRCDIR@
postgres_abs_builddir:=@POSTGRES_BUILDDIR@

PG_CONFIG:=@PG_CONFIG@
PGXS:=$(shell $(PG_CONFIG) --pgxs)

# if both, git is installed and there is a .git directory in the working dir we set the
# GIT_VERSION to a human readable gitref that resembles the version from which citus is
# built. During releases it will show the tagname which by convention is the verion of the
# release
ifneq (@GIT_BIN@,)
ifneq (@HAS_DOTGIT@,)
# try to find a tag that exactly matches the current branch, swallow the error if cannot find such a tag
GIT_VERSION := "$(shell @GIT_BIN@ describe --exact-match --dirty --always --tags 2>/dev/null)"

# if there is not a tag that exactly matches the branch, then GIT_VERSION would still be empty
# in that case, set GIT_VERSION with current branch's name and the short sha of the HEAD
ifeq ($(GIT_VERSION),"")
GIT_VERSION := "$(shell @GIT_BIN@ rev-parse --abbrev-ref HEAD)(sha: $(shell @GIT_BIN@ rev-parse --short HEAD))"
endif
endif
endif

# Support for VPATH builds (i.e. builds from outside the source tree)
vpath_build=@vpath_build@
ifeq ($(vpath_build),yes)
@@ -61,11 +41,11 @@ $(citus_top_builddir)/Makefile.global: $(citus_abs_top_srcdir)/configure $(citus
# Ensure configuration is generated by the most recent configure,
# useful for longer existing build directories.
$(citus_top_builddir)/config.status: $(citus_abs_top_srcdir)/configure $(citus_abs_top_srcdir)/src/backend/distributed/citus.control
$(citus_top_builddir)/config.status: $(citus_abs_top_srcdir)/configure
    cd @abs_top_builddir@ && ./config.status --recheck && ./config.status
    cd @abs_top_builddir@ && ./config.status --recheck

# Regenerate configure if configure.ac changed
# Regenerate configure if configure.in changed
$(citus_abs_top_srcdir)/configure: $(citus_abs_top_srcdir)/configure.ac
$(citus_abs_top_srcdir)/configure: $(citus_abs_top_srcdir)/configure.in
    cd ${citus_abs_top_srcdir} && ./autogen.sh

# If specified via configure, replace the default compiler. Normally
@@ -86,12 +66,8 @@ endif
# Add options passed to configure or computed therein, to CFLAGS/CPPFLAGS/...
override CFLAGS += @CFLAGS@ @CITUS_CFLAGS@
override BITCODE_CFLAGS := $(BITCODE_CFLAGS) @CITUS_BITCODE_CFLAGS@
override CPPFLAGS := @CPPFLAGS@ -I '${citus_abs_top_srcdir}/src/include' $(CPPFLAGS)
ifneq ($(GIT_VERSION),)
override LDFLAGS += @LDFLAGS@
override CFLAGS += -DGIT_VERSION=\"$(GIT_VERSION)\"
endif
override CPPFLAGS := @CPPFLAGS@ @CITUS_CPPFLAGS@ -I '${citus_abs_top_srcdir}/src/include' -I'${citus_top_builddir}/src/include' $(CPPFLAGS)
override LDFLAGS += @LDFLAGS@ @CITUS_LDFLAGS@

# optional file with user defined, additional, rules
-include ${citus_abs_srcdir}/src/Makefile.custom
99
NOTICE
@@ -1,99 +0,0 @@
NOTICES AND INFORMATION
Do Not Translate or Localize

This software incorporates material from third parties.
Microsoft makes certain open source code available at https://3rdpartysource.microsoft.com,
or you may send a check or money order for US $5.00, including the product name,
the open source component name, platform, and version number, to:

Source Code Compliance Team
Microsoft Corporation
One Microsoft Way
Redmond, WA 98052
USA

Notwithstanding any other terms, you may reverse engineer this software to the extent
required to debug changes to any libraries licensed under the GNU Lesser General Public License.

---------------------------------------------------------

---------------------------------------------------------

intel/safestringlib 245c4b8cff1d2e7338b7f3a82828fc8e72b29549 - MIT

Copyright (c) 2014-2018 Intel Corporation

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.

================================================================================

Copyright (C) 2012, 2013 Cisco Systems
All rights reserved.

Permission is hereby granted, free of charge, to any person
obtaining a copy of this software and associated documentation
files (the "Software"), to deal in the Software without
restriction, including without limitation the rights to use,
copy, modify, merge, publish, distribute, sublicense, and/or
sell copies of the Software, and to permit persons to whom the
Software is furnished to do so, subject to the following
conditions:

The above copyright notice and this permission notice shall be
included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES
OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT
HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY,
WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
OTHER DEALINGS IN THE SOFTWARE.
---------------------------------------------------------

postgres/postgres 29be9983a64c011eac0b9ee29895cce71e15ea77

PostgreSQL Database Management System
(formerly known as Postgres, then as Postgres95)

Portions Copyright (c) 1996-2020, PostgreSQL Global Development Group

Portions Copyright (c) 1994, The Regents of the University of California

Permission to use, copy, modify, and distribute this software and its
documentation for any purpose, without fee, and without a written agreement
is hereby granted, provided that the above copyright notice and this
paragraph and the following two paragraphs appear in all copies.

IN NO EVENT SHALL THE UNIVERSITY OF CALIFORNIA BE LIABLE TO ANY PARTY FOR
DIRECT, INDIRECT, SPECIAL, INCIDENTAL, OR CONSEQUENTIAL DAMAGES, INCLUDING
LOST PROFITS, ARISING OUT OF THE USE OF THIS SOFTWARE AND ITS
DOCUMENTATION, EVEN IF THE UNIVERSITY OF CALIFORNIA HAS BEEN ADVISED OF THE
POSSIBILITY OF SUCH DAMAGE.

THE UNIVERSITY OF CALIFORNIA SPECIFICALLY DISCLAIMS ANY WARRANTIES,
INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY
AND FITNESS FOR A PARTICULAR PURPOSE. THE SOFTWARE PROVIDED HEREUNDER IS
ON AN "AS IS" BASIS, AND THE UNIVERSITY OF CALIFORNIA HAS NO OBLIGATIONS TO
PROVIDE MAINTENANCE, SUPPORT, UPDATES, ENHANCEMENTS, OR MODIFICATIONS.

---------------------------------------------------------
638
README.md
@@ -1,496 +1,148 @@
| **<br/>The Citus database is 100% open source.<br/><img width=1000/><br/>Learn what's new in the [Citus 13.0 release blog](https://www.citusdata.com/blog/2025/02/06/distribute-postgresql-17-with-citus-13/) and the [Citus Updates page](https://www.citusdata.com/updates/).<br/><br/>**|

|---|
<br/>
[](https://travis-ci.org/citusdata/citus)
[](https://slack.citusdata.com)
[](http://docs.citusdata.com/en/v5.1/index.html)


### What is Citus?
[](https://docs.citusdata.com/)
* **Open-source** PostgreSQL extension (not a fork)
[](https://stackoverflow.com/questions/tagged/citus)
* **Scalable** across multiple hosts through sharding and replication
[](https://slack.citusdata.com/)
* **Distributed** engine for query parallelization
[](https://app.codecov.io/gh/citusdata/citus)
* **Highly available** in the face of host failures
[](https://twitter.com/intent/follow?screen_name=citusdata)
Citus horizontally scales PostgreSQL across commodity servers using
[](https://packagecloud.io/app/citusdata/community/search?q=&filter=debs)
sharding and replication. Its query engine parallelizes incoming
[](https://packagecloud.io/app/citusdata/community/search?q=&filter=rpms)
SQL queries across these servers to enable real-time responses on
large datasets.
## What is Citus?
Citus extends the underlying database rather than forking it, which
Citus is a [PostgreSQL extension](https://www.citusdata.com/blog/2017/10/25/what-it-means-to-be-a-postgresql-extension/) that transforms Postgres into a distributed database—so you can achieve high performance at any scale.
gives developers and enterprises the power and familiarity of a
traditional relational database. As an extension, Citus supports
With Citus, you extend your PostgreSQL database with new superpowers:
new PostgreSQL releases, allowing users to benefit from new features
while maintaining compatibility with existing PostgreSQL tools.
- **Distributed tables** are sharded across a cluster of PostgreSQL nodes to combine their CPU, memory, storage and I/O capacity.
Note that Citus supports many (but not all) SQL commands; see the
- **References tables** are replicated to all nodes for joins and foreign keys from distributed tables and maximum read performance.
[FAQ][faq] for more details.
- **Distributed query engine** routes and parallelizes SELECT, DML, and other operations on distributed tables across the cluster.
- **Columnar storage** compresses data, speeds up scans, and supports fast projections, both on regular and distributed tables.
Common Use-Cases:
- **Query from any node** enables you to utilize the full capacity of your cluster for distributed queries
* Powering real-time analytic dashboards
* Exploratory queries on events as they happen
You can use these Citus superpowers to make your Postgres database scale-out ready on a single Citus node. Or you can build a large cluster capable of handling **high transaction throughputs**, especially in **multi-tenant apps**, run **fast analytical queries**, and process large amounts of **time series** or **IoT data** for **real-time analytics**. When your data size and volume grow, you can easily add more worker nodes to the cluster and rebalance the shards.
* Large dataset archival and reporting
* Session analytics (funnels, segmentation, and cohorts)
Our [SIGMOD '21](https://2021.sigmod.org/) paper [Citus: Distributed PostgreSQL for Data-Intensive Applications](https://doi.org/10.1145/3448016.3457551) gives a more detailed look into what Citus is, how it works, and why it works that way.
To learn more, visit [citusdata.com](https://www.citusdata.com) and join

the [mailing list](https://groups.google.com/forum/#!forum/citus-users) to
stay on top of the latest developments.
Since Citus is an extension to Postgres, you can use Citus with the latest Postgres versions. And Citus works seamlessly with the PostgreSQL tools and extensions you are already familiar with.
### Quickstart
- [Why Citus?](#why-citus)
- [Getting Started](#getting-started)
#### Local Citus Cluster
- [Using Citus](#using-citus)
- [Schema-based sharding](#schema-based-sharding)
* Install docker-compose: [Mac][mac_install] | [Linux][linux_install]
- [Setting up with High Availability](#setting-up-with-high-availability)
* (Mac only) connect to Docker VM
- [Documentation](#documentation)
```bash
- [Architecture](#architecture)
eval $(docker-machine env default)
- [When to Use Citus](#when-to-use-citus)
```
- [Need Help?](#need-help)
- [Contributing](#contributing)
* Pull and start the docker images
- [Stay Connected](#stay-connected)
```bash
wget https://raw.githubusercontent.com/citusdata/docker/master/docker-compose.yml
## Why Citus?
docker-compose -p citus up -d
```
Developers choose Citus for two reasons:
* Connect to the master database
1. Your application is outgrowing a single PostgreSQL node
```bash
docker exec -it citus_master psql -U postgres -d postgres
If the size and volume of your data increases over time, you may start seeing any number of performance and scalability problems on a single PostgreSQL node. For example: High CPU utilization and I/O wait times slow down your queries, SQL queries return out of memory errors, autovacuum cannot keep up and increases table bloat, etc.
```
With Citus you can distribute and optionally compress your tables to always have enough memory, CPU, and I/O capacity to achieve high performance at scale. The distributed query engine can efficiently route transactions across the cluster, while parallelizing analytical queries and batch operations across all cores. Moreover, you can still use the PostgreSQL features and tools you know and love.
* Follow the [first tutorial][tutorial] instructions
* To shut the cluster down, run
2. PostgreSQL can do things other systems can’t
```bash
There are many data processing systems that are built to scale out, but few have as many powerful capabilities as PostgreSQL, including: Advanced joins and subqueries, user-defined functions, update/delete/upsert, constraints and foreign keys, powerful extensions (e.g. PostGIS, HyperLogLog), many types of indexes, time-partitioning, and sophisticated JSON support.
docker-compose -p citus down
```
Citus makes PostgreSQL’s most powerful capabilities work at any scale, allowing you to handle complex data-intensive workloads on a single database system.
### Talk to Contributors and Learn More
## Getting Started
<table class="tg">
The quickest way to get started with Citus is to use the [Azure Cosmos DB for PostgreSQL](https://learn.microsoft.com/azure/cosmos-db/postgresql/quickstart-create-portal) managed service in the cloud—or [set up Citus locally](https://docs.citusdata.com/en/stable/installation/single_node.html).
<col width="45%">
<col width="65%">
### Citus Managed Service on Azure
<tr>
<td>Documentation</td>
You can get a fully-managed Citus cluster in minutes through the [Azure Cosmos DB for PostgreSQL portal](https://azure.microsoft.com/products/cosmos-db/). Azure will manage your backups, high availability through auto-failover, software updates, monitoring, and more for all of your servers. To get started Citus on Azure, use the [Azure Cosmos DB for PostgreSQL Quickstart](https://learn.microsoft.com/azure/cosmos-db/postgresql/quickstart-create-portal).
<td>Try the <a
href="http://docs.citusdata.com/en/v5.1/tutorials/tut-cluster.html">Citus
### Running Citus using Docker
tutorials</a> for a hands-on introduction or <br/>the <a
href="http://docs.citusdata.com/en/v5.1/index.html">documentation</a> for
The smallest possible Citus cluster is a single PostgreSQL node with the Citus extension, which means you can try out Citus by running a single Docker container.
a more comprehensive reference.</td>
</tr>
```bash
<tr>
# run PostgreSQL with Citus on port 5500
<td>Google Groups</td>
docker run -d --name citus -p 5500:5432 -e POSTGRES_PASSWORD=mypassword citusdata/citus
<td>The <a
href="https://groups.google.com/forum/#!forum/citus-users">Citus Google
# connect using psql within the Docker container
Group</a> is our place for detailed questions and discussions.</td>
docker exec -it citus psql -U postgres
</tr>
<tr>
|
||||||
# or, connect using local psql
|
<td>Slack</td>
|
||||||
psql -U postgres -d postgres -h localhost -p 5500
|
<td>Chat with us in our community <a
|
||||||
```
|
href="https://slack.citusdata.com">Slack channel</a>.</td>
|
||||||
|
</tr>
|
||||||
### Install Citus locally
|
<tr>
|
||||||
|
<td>Github Issues</td>
|
||||||
If you already have a local PostgreSQL installation, the easiest way to install Citus is to use our packaging repo
|
<td>We track specific bug reports and feature requests on our <a
|
||||||
|
href="https://github.com/citusdata/citus/issues">project
|
||||||
Install packages on Ubuntu / Debian:
|
issues</a>.</td>
|
||||||
|
</tr>
|
||||||
```bash
|
<tr>
|
||||||
curl https://install.citusdata.com/community/deb.sh > add-citus-repo.sh
|
<td>Twitter</td>
|
||||||
sudo bash add-citus-repo.sh
|
<td>Follow <a href="https://twitter.com/citusdata">@citusdata</a>
|
||||||
sudo apt-get -y install postgresql-17-citus-13.0
|
for general updates and PostgreSQL scaling tips.</td>
|
||||||
```
|
</tr>
|
||||||
|
<tr>
|
||||||
Install packages on Red Hat:
|
<td>Training and Support</td>
|
||||||
```bash
|
<td>See our <a
|
||||||
curl https://install.citusdata.com/community/rpm.sh > add-citus-repo.sh
|
href="https://www.citusdata.com/citus-products/citus-data-pricing">support
|
||||||
sudo bash add-citus-repo.sh
|
page</a> for training and dedicated support options.</td>
|
||||||
sudo yum install -y citus130_17
|
</tr>
|
||||||
```
|
</table>
|
||||||
|
|
||||||
To add Citus to your local PostgreSQL database, add the following to `postgresql.conf`:
|
### Contributing
|
||||||
|
|
||||||
```
|
Citus is built on and of open source. We welcome your contributions,
|
||||||
shared_preload_libraries = 'citus'
|
and have added a
|
||||||
```
|
[helpwanted](https://github.com/citusdata/citus/labels/helpwanted) label
|
||||||
|
to issues which are accessible to new contributors. The
|
||||||
After restarting PostgreSQL, connect using `psql` and run:
|
[CONTRIBUTING.md](CONTRIBUTING.md) file explains how to get started
|
||||||
|
developing the Citus extension itself and our code quality guidelines.
|
||||||
```sql
|
|
||||||
CREATE EXTENSION citus;
|
### Who is Using Citus?
|
||||||
````
|
|
||||||
You’re now ready to get started and use Citus tables on a single node.
|
Citus is deployed in production by many customers, ranging from
|
||||||
|
technology start-ups to large enterprises. Here are some examples:
|
||||||
### Install Citus on multiple nodes
|
|
||||||
|
* [CloudFlare](https://www.cloudflare.com/) uses Citus to provide
|
||||||
If you want to set up a multi-node cluster, you can also set up additional PostgreSQL nodes with the Citus extensions and add them to form a Citus cluster:
|
real-time analytics on 100 TBs of data from over 4 million customer
|
||||||
|
websites. [Case
|
||||||
```sql
|
Study](https://blog.cloudflare.com/scaling-out-postgresql-for-cloudflare-analytics-using-citusdb/)
|
||||||
-- before adding the first worker node, tell future worker nodes how to reach the coordinator
|
* [MixRank](https://mixrank.com/) uses Citus to efficiently collect
|
||||||
SELECT citus_set_coordinator_host('10.0.0.1', 5432);
|
and analyze vast amounts of data to allow inside B2B sales teams
|
||||||
|
to find new customers. [Case
|
||||||
-- add worker nodes
|
Study](https://www.citusdata.com/solutions/case-studies/mixrank-case-study)
|
||||||
SELECT citus_add_node('10.0.0.2', 5432);
|
* [Neustar](https://www.neustar.biz/) builds and maintains scalable
|
||||||
SELECT citus_add_node('10.0.0.3', 5432);
|
ad-tech infrastructure that counts billions of events per day using
|
||||||
|
Citus and HyperLogLog.
|
||||||
-- rebalance the shards over the new worker nodes
|
* [Agari](https://www.agari.com/) uses Citus to secure more than
|
||||||
SELECT rebalance_table_shards();
|
85 percent of U.S. consumer emails on two 6-8 TB clusters. [Case
|
||||||
```
|
Study](https://www.citusdata.com/solutions/case-studies/agari-case-study)
|
||||||
|
* [Heap](https://heapanalytics.com/) uses Citus to run dynamic
|
||||||
For more details, see our [documentation on how to set up a multi-node Citus cluster](https://docs.citusdata.com/en/stable/installation/multi_node.html) on various operating systems.
|
funnel, segmentation, and cohort queries across billions of users
|
||||||
|
and tens of billions of events. [Watch
|
||||||
## Using Citus
|
Video](https://www.youtube.com/watch?v=NVl9_6J1G60&list=PLixnExCn6lRpP10ZlpJwx6AuU3XIgNWpL)
|
||||||
|
|
||||||
Once you have your Citus cluster, you can start creating distributed tables and reference tables, and use columnar storage.
|
|
||||||
|
|
||||||
### Creating Distributed Tables
|
|
||||||
|
|
||||||
The `create_distributed_table` UDF will transparently shard your table locally or across the worker nodes:
|
|
||||||
|
|
||||||
```sql
|
|
||||||
CREATE TABLE events (
|
|
||||||
device_id bigint,
|
|
||||||
event_id bigserial,
|
|
||||||
event_time timestamptz default now(),
|
|
||||||
data jsonb not null,
|
|
||||||
PRIMARY KEY (device_id, event_id)
|
|
||||||
);
|
|
||||||
|
|
||||||
-- distribute the events table across shards placed locally or on the worker nodes
|
|
||||||
SELECT create_distributed_table('events', 'device_id');
|
|
||||||
```
|
|
||||||
|
|
||||||
After this operation, queries for a specific device ID will be efficiently routed to a single worker node, while queries across device IDs will be parallelized across the cluster.
|
|
||||||
|
|
||||||
```sql
|
|
||||||
-- insert some events
|
|
||||||
INSERT INTO events (device_id, data)
|
|
||||||
SELECT s % 100, ('{"measurement":'||random()||'}')::jsonb FROM generate_series(1,1000000) s;
|
|
||||||
|
|
||||||
-- get the last 3 events for device 1, routed to a single node
|
|
||||||
SELECT * FROM events WHERE device_id = 1 ORDER BY event_time DESC, event_id DESC LIMIT 3;
|
|
||||||
┌───────────┬──────────┬───────────────────────────────┬───────────────────────────────────────┐
|
|
||||||
│ device_id │ event_id │ event_time │ data │
|
|
||||||
├───────────┼──────────┼───────────────────────────────┼───────────────────────────────────────┤
|
|
||||||
│ 1 │ 1999901 │ 2021-03-04 16:00:31.189963+00 │ {"measurement": 0.88722643925054} │
|
|
||||||
│ 1 │ 1999801 │ 2021-03-04 16:00:31.189963+00 │ {"measurement": 0.6512231304621992} │
|
|
||||||
│ 1 │ 1999701 │ 2021-03-04 16:00:31.189963+00 │ {"measurement": 0.019368766051897524} │
|
|
||||||
└───────────┴──────────┴───────────────────────────────┴───────────────────────────────────────┘
|
|
||||||
(3 rows)
|
|
||||||
|
|
||||||
Time: 4.588 ms
|
|
||||||
|
|
||||||
-- explain plan for a query that is parallelized across shards, which shows the plan for
|
|
||||||
-- a query on one of the shards and how the aggregation across shards is done
|
|
||||||
EXPLAIN (VERBOSE ON) SELECT count(*) FROM events;
|
|
||||||
┌────────────────────────────────────────────────────────────────────────────────────┐
|
|
||||||
│ QUERY PLAN │
|
|
||||||
├────────────────────────────────────────────────────────────────────────────────────┤
|
|
||||||
│ Aggregate │
|
|
||||||
│ Output: COALESCE((pg_catalog.sum(remote_scan.count))::bigint, '0'::bigint) │
|
|
||||||
│ -> Custom Scan (Citus Adaptive) │
|
|
||||||
│ ... │
|
|
||||||
│ -> Task │
|
|
||||||
│ Query: SELECT count(*) AS count FROM events_102008 events WHERE true │
|
|
||||||
│ Node: host=localhost port=5432 dbname=postgres │
|
|
||||||
│ -> Aggregate │
|
|
||||||
│ -> Seq Scan on public.events_102008 events │
|
|
||||||
└────────────────────────────────────────────────────────────────────────────────────┘
|
|
||||||
```
|
|
||||||
|
|
||||||
### Creating Distributed Tables with Co-location
|
|
||||||
|
|
||||||
Distributed tables that have the same distribution column can be co-located to enable high performance distributed joins and foreign keys between distributed tables.
|
|
||||||
By default, distributed tables will be co-located based on the type of the distribution column, but you can define co-location explicitly with the `colocate_with` argument in `create_distributed_table`.
|
|
||||||
|
|
||||||
```sql
|
|
||||||
CREATE TABLE devices (
|
|
||||||
device_id bigint primary key,
|
|
||||||
device_name text,
|
|
||||||
device_type_id int
|
|
||||||
);
|
|
||||||
CREATE INDEX ON devices (device_type_id);
|
|
||||||
|
|
||||||
-- co-locate the devices table with the events table
|
|
||||||
SELECT create_distributed_table('devices', 'device_id', colocate_with := 'events');
|
|
||||||
|
|
||||||
-- insert device metadata
|
|
||||||
INSERT INTO devices (device_id, device_name, device_type_id)
|
|
||||||
SELECT s, 'device-'||s, 55 FROM generate_series(0, 99) s;
|
|
||||||
|
|
||||||
-- optionally: make sure the application can only insert events for a known device
|
|
||||||
ALTER TABLE events ADD CONSTRAINT device_id_fk
|
|
||||||
FOREIGN KEY (device_id) REFERENCES devices (device_id);
|
|
||||||
|
|
||||||
-- get the average measurement across all devices of type 55, parallelized across shards
|
|
||||||
SELECT avg((data->>'measurement')::double precision)
|
|
||||||
FROM events JOIN devices USING (device_id)
|
|
||||||
WHERE device_type_id = 55;
|
|
||||||
|
|
||||||
┌────────────────────┐
|
|
||||||
│ avg │
|
|
||||||
├────────────────────┤
|
|
||||||
│ 0.5000191877513974 │
|
|
||||||
└────────────────────┘
|
|
||||||
(1 row)
|
|
||||||
|
|
||||||
Time: 209.961 ms
|
|
||||||
```
|
|
||||||
|
|
||||||
Co-location also helps you scale [INSERT..SELECT](https://docs.citusdata.com/en/stable/articles/aggregation.html), [stored procedures](https://www.citusdata.com/blog/2020/11/21/making-postgres-stored-procedures-9x-faster-in-citus/), and [distributed transactions](https://www.citusdata.com/blog/2017/06/02/scaling-complex-sql-transactions/).
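For example, a co-located `INSERT..SELECT` is pushed down and runs on each worker in parallel. The sketch below is illustrative only: the `events_daily` rollup table is hypothetical and reuses the `events` table from the examples above.

```sql
-- hypothetical rollup table, co-located with the events table
CREATE TABLE events_daily (
  device_id bigint,
  day date,
  event_count bigint,
  PRIMARY KEY (device_id, day)
);
SELECT create_distributed_table('events_daily', 'device_id', colocate_with := 'events');

-- source and target share the distribution column, so every worker
-- aggregates its own shards locally, in parallel
INSERT INTO events_daily (device_id, day, event_count)
SELECT device_id, event_time::date, count(*)
FROM events
GROUP BY device_id, event_time::date;
```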
|
|
||||||
|
|
||||||
### Distributing Tables without interrupting the application
|
|
||||||
|
|
||||||
|
|
||||||
Some applications start out on a single Postgres node and only decide to distribute tables later, while the application is already using them. In that case, you want to avoid downtime for both reads and writes. The `create_distributed_table` command blocks writes (e.g., DML commands) on the table until the command is finished. With the `create_distributed_table_concurrently` command, your application can instead continue to read and write the data even while the table is being distributed.
|
|
||||||
|
|
||||||
|
|
||||||
```sql
|
|
||||||
CREATE TABLE device_logs (
|
|
||||||
device_id bigint primary key,
|
|
||||||
log text
|
|
||||||
);
|
|
||||||
|
|
||||||
-- insert device logs
|
|
||||||
INSERT INTO device_logs (device_id, log)
|
|
||||||
SELECT s, 'device log:'||s FROM generate_series(0, 99) s;
|
|
||||||
|
|
||||||
-- convert device_logs into a distributed table without interrupting the application
|
|
||||||
SELECT create_distributed_table_concurrently('device_logs', 'device_id', colocate_with := 'devices');
|
|
||||||
|
|
||||||
|
|
||||||
-- get the count of the logs, parallelized across shards
|
|
||||||
SELECT count(*) FROM device_logs;
|
|
||||||
|
|
||||||
┌───────┐
|
|
||||||
│ count │
|
|
||||||
├───────┤
|
|
||||||
│ 100 │
|
|
||||||
└───────┘
|
|
||||||
(1 row)
|
|
||||||
|
|
||||||
Time: 48.734 ms
|
|
||||||
```
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
### Creating Reference Tables
|
|
||||||
|
|
||||||
When you need fast joins or foreign keys that do not include the distribution column, you can use `create_reference_table` to replicate a table across all nodes in the cluster.
|
|
||||||
|
|
||||||
```sql
|
|
||||||
CREATE TABLE device_types (
|
|
||||||
device_type_id int primary key,
|
|
||||||
device_type_name text not null unique
|
|
||||||
);
|
|
||||||
|
|
||||||
-- replicate the table across all nodes to enable foreign keys and joins on any column
|
|
||||||
SELECT create_reference_table('device_types');
|
|
||||||
|
|
||||||
-- insert a device type
|
|
||||||
INSERT INTO device_types (device_type_id, device_type_name) VALUES (55, 'laptop');
|
|
||||||
|
|
||||||
-- optionally: make sure the application can only insert devices with known types
|
|
||||||
ALTER TABLE devices ADD CONSTRAINT device_type_fk
|
|
||||||
FOREIGN KEY (device_type_id) REFERENCES device_types (device_type_id);
|
|
||||||
|
|
||||||
-- get the last 3 events for devices whose type name starts with laptop, parallelized across shards
|
|
||||||
SELECT device_id, event_time, data->>'measurement' AS value, device_name, device_type_name
|
|
||||||
FROM events JOIN devices USING (device_id) JOIN device_types USING (device_type_id)
|
|
||||||
WHERE device_type_name LIKE 'laptop%' ORDER BY event_time DESC LIMIT 3;
|
|
||||||
|
|
||||||
┌───────────┬───────────────────────────────┬─────────────────────┬─────────────┬──────────────────┐
|
|
||||||
│ device_id │ event_time │ value │ device_name │ device_type_name │
|
|
||||||
├───────────┼───────────────────────────────┼─────────────────────┼─────────────┼──────────────────┤
|
|
||||||
│ 60 │ 2021-03-04 16:00:31.189963+00 │ 0.28902084163415864 │ device-60 │ laptop │
|
|
||||||
│ 8 │ 2021-03-04 16:00:31.189963+00 │ 0.8723803076285073 │ device-8 │ laptop │
|
|
||||||
│ 20 │ 2021-03-04 16:00:31.189963+00 │ 0.8177634801548557 │ device-20 │ laptop │
|
|
||||||
└───────────┴───────────────────────────────┴─────────────────────┴─────────────┴──────────────────┘
|
|
||||||
(3 rows)
|
|
||||||
|
|
||||||
Time: 146.063 ms
|
|
||||||
```
|
|
||||||
|
|
||||||
Reference tables enable you to scale out complex data models and take full advantage of relational database features.
|
|
||||||
|
|
||||||
### Creating Tables with Columnar Storage
|
|
||||||
|
|
||||||
To use columnar storage in your PostgreSQL database, all you need to do is add `USING columnar` to your `CREATE TABLE` statements and your data will be automatically compressed using the columnar access method.
|
|
||||||
|
|
||||||
```sql
|
|
||||||
CREATE TABLE events_columnar (
|
|
||||||
device_id bigint,
|
|
||||||
event_id bigserial,
|
|
||||||
event_time timestamptz default now(),
|
|
||||||
data jsonb not null
|
|
||||||
)
|
|
||||||
USING columnar;
|
|
||||||
|
|
||||||
-- insert some data
|
|
||||||
INSERT INTO events_columnar (device_id, data)
|
|
||||||
SELECT d, '{"hello":"columnar"}' FROM generate_series(1,10000000) d;
|
|
||||||
|
|
||||||
-- create a row-based table to compare
|
|
||||||
CREATE TABLE events_row AS SELECT * FROM events_columnar;
|
|
||||||
|
|
||||||
-- see the huge size difference!
|
|
||||||
\d+
|
|
||||||
List of relations
|
|
||||||
┌────────┬──────────────────────────────┬──────────┬───────┬─────────────┬────────────┬─────────────┐
|
|
||||||
│ Schema │ Name │ Type │ Owner │ Persistence │ Size │ Description │
|
|
||||||
├────────┼──────────────────────────────┼──────────┼───────┼─────────────┼────────────┼─────────────┤
|
|
||||||
│ public │ events_columnar │ table │ marco │ permanent │ 25 MB │ │
|
|
||||||
│ public │ events_row │ table │ marco │ permanent │ 651 MB │ │
|
|
||||||
└────────┴──────────────────────────────┴──────────┴───────┴─────────────┴────────────┴─────────────┘
|
|
||||||
(2 rows)
|
|
||||||
```
|
|
||||||
|
|
||||||
You can use columnar storage by itself, or in a distributed table to combine the benefits of compression and the distributed query engine.
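For instance, the `events_columnar` table from the example above could itself be distributed (a minimal sketch):

```sql
-- shard the columnar table across the cluster to combine compression with parallelism
SELECT create_distributed_table('events_columnar', 'device_id');
```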
|
|
||||||
|
|
||||||
When using columnar storage, you should only load data in batch using `COPY` or `INSERT..SELECT` to achieve good compression. Update, delete, and foreign keys are currently unsupported on columnar tables. However, you can use partitioned tables in which newer partitions use row-based storage, and older partitions are compressed using columnar storage.
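A minimal sketch of that partitioning pattern follows; the table and partition names are hypothetical, and `alter_table_set_access_method` is the helper available in recent Citus versions:

```sql
CREATE TABLE events_partitioned (
  device_id bigint,
  event_time timestamptz NOT NULL,
  data jsonb NOT NULL
) PARTITION BY RANGE (event_time);

-- recent data keeps the default row-based (heap) storage
CREATE TABLE events_2025_01 PARTITION OF events_partitioned
  FOR VALUES FROM ('2025-01-01') TO ('2025-02-01');

-- an older partition that is no longer written to can be compressed
CREATE TABLE events_2024_12 PARTITION OF events_partitioned
  FOR VALUES FROM ('2024-12-01') TO ('2025-01-01');
SELECT alter_table_set_access_method('events_2024_12', 'columnar');
```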
|
|
||||||
|
|
||||||
To learn more about columnar storage, check out the [columnar storage README](https://github.com/citusdata/citus/blob/master/src/backend/columnar/README.md).
|
|
||||||
|
|
||||||
## Schema-based sharding
|
|
||||||
|
|
||||||
Available since Citus 12.0, [schema-based sharding](https://docs.citusdata.com/en/stable/get_started/concepts.html#schema-based-sharding) is the shared-database, separate-schema model: the schema becomes the logical shard within the database. Multi-tenant apps can use a schema per tenant to easily shard along the tenant dimension. Query changes are not required; the application usually only needs a small modification to set the proper search_path when switching tenants. Schema-based sharding is an ideal solution for microservices, and for ISVs deploying applications that cannot undergo the changes required to onboard row-based sharding.
|
|
||||||
|
|
||||||
### Creating distributed schemas
|
|
||||||
|
|
||||||
You can turn an existing schema into a distributed schema by calling `citus_schema_distribute`:
|
|
||||||
|
|
||||||
```sql
|
|
||||||
SELECT citus_schema_distribute('user_service');
|
|
||||||
```
|
|
||||||
|
|
||||||
Alternatively, you can set `citus.enable_schema_based_sharding` to have all newly created schemas be automatically converted into distributed schemas:
|
|
||||||
|
|
||||||
```sql
|
|
||||||
SET citus.enable_schema_based_sharding TO ON;
|
|
||||||
|
|
||||||
CREATE SCHEMA AUTHORIZATION user_service;
|
|
||||||
CREATE SCHEMA AUTHORIZATION time_service;
|
|
||||||
CREATE SCHEMA AUTHORIZATION ping_service;
|
|
||||||
```
|
|
||||||
|
|
||||||
### Running queries
|
|
||||||
|
|
||||||
Queries will be properly routed to schemas based on `search_path` or by explicitly using the schema name in the query.
|
|
||||||
|
|
||||||
For [microservices](https://docs.citusdata.com/en/stable/get_started/tutorial_microservices.html), you would create a USER per service whose name matches the schema name, so that the default `search_path` contains the schema name. Once connected, the user's queries are automatically routed and no changes to the microservice are required.
|
|
||||||
|
|
||||||
```sql
|
|
||||||
CREATE USER user_service;
|
|
||||||
CREATE SCHEMA AUTHORIZATION user_service;
|
|
||||||
```
|
|
||||||
|
|
||||||
For typical multi-tenant applications, you would set the search path to the tenant schema name in your application:
|
|
||||||
|
|
||||||
```sql
|
|
||||||
SET search_path = tenant_name, public;
|
|
||||||
```
|
|
||||||
|
|
||||||
## Setting up with High Availability
|
|
||||||
|
|
||||||
One of the most popular high availability solutions for PostgreSQL, [Patroni 3.0](https://github.com/zalando/patroni), has [first-class support for Citus 10.0 and above](https://patroni.readthedocs.io/en/latest/citus.html#citus). Additionally, Citus 11.2 and later ship with improvements for smoother node switchover in Patroni.
|
|
||||||
|
|
||||||
An example of `patronictl list` output for a Citus cluster:
|
|
||||||
|
|
||||||
```bash
|
|
||||||
postgres@coord1:~$ patronictl list demo
|
|
||||||
```
|
|
||||||
|
|
||||||
```text
|
|
||||||
+ Citus cluster: demo ----------+--------------+---------+----+-----------+
|
|
||||||
| Group | Member | Host | Role | State | TL | Lag in MB |
|
|
||||||
+-------+---------+-------------+--------------+---------+----+-----------+
|
|
||||||
| 0 | coord1 | 172.27.0.10 | Replica | running | 1 | 0 |
|
|
||||||
| 0 | coord2 | 172.27.0.6 | Sync Standby | running | 1 | 0 |
|
|
||||||
| 0 | coord3 | 172.27.0.4 | Leader | running | 1 | |
|
|
||||||
| 1 | work1-1 | 172.27.0.8 | Sync Standby | running | 1 | 0 |
|
|
||||||
| 1 | work1-2 | 172.27.0.2 | Leader | running | 1 | |
|
|
||||||
| 2 | work2-1 | 172.27.0.5 | Sync Standby | running | 1 | 0 |
|
|
||||||
| 2 | work2-2 | 172.27.0.7 | Leader | running | 1 | |
|
|
||||||
+-------+---------+-------------+--------------+---------+----+-----------+
|
|
||||||
```
|
|
||||||
|
|
||||||
## Documentation
|
|
||||||
|
|
||||||
If you’re ready to get started with Citus or want to know more, we recommend reading the [Citus open source documentation](https://docs.citusdata.com/en/stable/). Or, if you are using Citus on Azure, then the [Azure Cosmos DB for PostgreSQL documentation](https://learn.microsoft.com/azure/cosmos-db/postgresql/introduction) is the place to start.
|
|
||||||
|
|
||||||
Our Citus docs contain comprehensive use case guides on how to build a [multi-tenant SaaS application](https://docs.citusdata.com/en/stable/use_cases/multi_tenant.html), a [real-time analytics dashboard](https://docs.citusdata.com/en/stable/use_cases/realtime_analytics.html), or work with [time series data](https://docs.citusdata.com/en/stable/use_cases/timeseries.html).
|
|
||||||
|
|
||||||
## Architecture
|
|
||||||
|
|
||||||
A Citus database cluster grows from a single PostgreSQL node into a cluster by adding worker nodes. In a Citus cluster, the original node to which the application connects is referred to as the coordinator node. The Citus coordinator contains the metadata of distributed tables and reference tables, as well as regular (local) tables, sequences, and other database objects (e.g. foreign tables).
|
|
||||||
|
|
||||||
Data in distributed tables is stored in “shards”, which are actually just regular PostgreSQL tables on the worker nodes. When querying a distributed table on the coordinator node, Citus will send regular SQL queries to the worker nodes. That way, all the usual PostgreSQL optimizations and extensions can automatically be used with Citus.
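For example, you can inspect the shards and their placements behind a distributed table. This is a small sketch that assumes the `events` table from the examples above and the `citus_shards` view available in recent Citus versions:

```sql
-- one row per shard placement of the events table
SELECT shardid, nodename, nodeport, shard_size
FROM citus_shards
WHERE table_name = 'events'::regclass;
```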
|
|
||||||
|
|
||||||

|
|
||||||
|
|
||||||
When you send a query in which all (co-located) distributed tables have the same filter on the distribution column, Citus will automatically detect that and send the whole query to the worker node that stores the data. That way, arbitrarily complex queries are supported with minimal routing overhead, which is especially useful for scaling transactional workloads. If queries do not have a specific filter, each shard is queried in parallel, which is especially useful in analytical workloads. The Citus distributed executor is adaptive and is designed to handle both query types at the same time on the same system under high concurrency, which enables large-scale mixed workloads.
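For example, reusing the `events` table from the earlier examples:

```sql
-- routed to a single worker node: the filter is on the distribution column
SELECT count(*) FROM events WHERE device_id = 1;

-- parallelized across all shards: no filter on the distribution column
SELECT count(*) FROM events WHERE event_time > now() - interval '1 day';
```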
|
|
||||||
|
|
||||||
The schema and metadata of distributed tables and reference tables are automatically synchronized to all the nodes in the cluster. That way, you can connect to any node to run distributed queries. Schema changes and cluster administration still need to go through the coordinator.
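As a hedged sketch, once you are connected to any node that has the metadata you can check the cluster and run the same distributed queries as on the coordinator (reusing the `events` table from the examples above):

```sql
-- the pg_dist_node catalog is synchronized to all nodes
SELECT nodename, nodeport, metadatasynced FROM pg_dist_node;

-- distributed queries work here just as they do on the coordinator
SELECT count(*) FROM events;
```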
|
|
||||||
|
|
||||||
Detailed descriptions of the implementation for Citus developers are provided in the [Citus Technical Documentation](src/backend/distributed/README.md).
|
|
||||||
|
|
||||||
## When to use Citus
|
|
||||||
|
|
||||||
Citus is uniquely capable of scaling both analytical and transactional workloads with up to petabytes of data. Use cases in which Citus is commonly used:
|
|
||||||
|
|
||||||
- **[Customer-facing analytics dashboards](http://docs.citusdata.com/en/stable/use_cases/realtime_analytics.html)**:
|
|
||||||
Citus enables you to build analytics dashboards that simultaneously ingest and process large amounts of data in the database and give sub-second response times even with a large number of concurrent users.
|
|
||||||
|
|
||||||
The advanced parallel, distributed query engine in Citus combined with PostgreSQL features such as [array types](https://www.postgresql.org/docs/current/arrays.html), [JSONB](https://www.postgresql.org/docs/current/datatype-json.html), [lateral joins](https://heap.io/blog/engineering/postgresqls-powerful-new-join-type-lateral), and extensions like [HyperLogLog](https://github.com/citusdata/postgresql-hll) and [TopN](https://github.com/citusdata/postgresql-topn) allow you to build responsive analytics dashboards no matter how many customers or how much data you have.
|
|
||||||
|
|
||||||
Example real-time analytics users: [Algolia](https://www.citusdata.com/customers/algolia)
|
|
||||||
|
|
||||||
- **[Time series data](http://docs.citusdata.com/en/stable/use_cases/timeseries.html)**:
|
|
||||||
Citus enables you to process and analyze very large amounts of time series data. The biggest Citus clusters store well over a petabyte of time series data and ingest terabytes per day.
|
|
||||||
|
|
||||||
Citus integrates seamlessly with [Postgres table partitioning](https://www.postgresql.org/docs/current/ddl-partitioning.html) and has [built-in functions for partitioning by time](https://www.citusdata.com/blog/2021/10/22/how-to-scale-postgres-for-time-series-data-with-citus/), which can speed up queries and writes on time series tables. You can take advantage of Citus’s parallel, distributed query engine for fast analytical queries, and use the built-in *columnar storage* to compress old partitions.
|
|
||||||
|
|
||||||
Example users: [MixRank](https://www.citusdata.com/customers/mixrank)
|
|
||||||
|
|
||||||
- **[Software-as-a-service (SaaS) applications](http://docs.citusdata.com/en/stable/use_cases/multi_tenant.html)**:
|
|
||||||
SaaS and other multi-tenant applications need to be able to scale their database as the number of tenants/customers grows. Citus enables you to transparently shard a complex data model by the tenant dimension, so your database can grow along with your business.
|
|
||||||
|
|
||||||
By distributing tables along a tenant ID column and co-locating data for the same tenant, Citus can horizontally scale complex (tenant-scoped) queries, transactions, and foreign key graphs. Reference tables and distributed DDL commands make database management a breeze compared to manual sharding. On top of that, you have a built-in distributed query engine for doing cross-tenant analytics inside the database. A minimal sketch of this tenant-based sharding pattern is shown after this list.
|
|
||||||
|
|
||||||
Example multi-tenant SaaS users: [Salesloft](https://fivetran.com/case-studies/replicating-sharded-databases-a-case-study-of-salesloft-citus-data-and-fivetran), [ConvertFlow](https://www.citusdata.com/customers/convertflow)
|
|
||||||
|
|
||||||
- **[Microservices](https://docs.citusdata.com/en/stable/get_started/tutorial_microservices.html)**: Citus supports schema-based sharding, which allows distributing regular database schemas across many machines. This sharding methodology fits nicely with a typical microservices architecture, where storage is fully owned by each service and hence cannot share the same schema definition with other services. Citus allows distributing horizontally scalable state across services, solving one of the [main problems](https://stackoverflow.blog/2020/11/23/the-macro-problem-with-microservices/) of microservices.
|
|
||||||
|
|
||||||
- **Geospatial**:
|
|
||||||
The powerful [PostGIS](https://postgis.net/) extension adds support for geographic objects to Postgres, so many people run spatial/GIS applications on top of it. And since spatial location information has become part of our daily lives, there are more geospatial applications than ever. When your Postgres database needs to scale out to handle an increased workload, Citus is a good fit.
|
|
||||||
|
|
||||||
Example geospatial users: [Helsinki Regional Transportation Authority (HSL)](https://customers.microsoft.com/story/845146-transit-authority-improves-traffic-monitoring-with-azure-database-for-postgresql-hyperscale), [MobilityDB](https://www.citusdata.com/blog/2020/11/09/analyzing-gps-trajectories-at-scale-with-postgres-mobilitydb/).
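As referenced in the SaaS item above, here is a minimal sketch of tenant-based distribution; the table and column names are hypothetical:

```sql
CREATE TABLE tenants (tenant_id bigint PRIMARY KEY, name text);
CREATE TABLE orders (
  tenant_id bigint,
  order_id bigserial,
  total numeric,
  PRIMARY KEY (tenant_id, order_id)
);

-- shard both tables by tenant and co-locate them
SELECT create_distributed_table('tenants', 'tenant_id');
SELECT create_distributed_table('orders', 'tenant_id', colocate_with := 'tenants');

-- foreign keys that include the distribution column work between co-located tables
ALTER TABLE orders ADD FOREIGN KEY (tenant_id) REFERENCES tenants (tenant_id);

-- tenant-scoped queries and transactions are routed to a single worker node
SELECT sum(total) FROM orders WHERE tenant_id = 42;
```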
|
|
||||||
|
|
||||||
## Need Help?
|
|
||||||
|
|
||||||
- **Slack**: Ask questions in our Citus community [Slack channel](https://slack.citusdata.com).
|
|
||||||
- **GitHub issues**: Please submit issues via [GitHub issues](https://github.com/citusdata/citus/issues).
|
|
||||||
- **Documentation**: Our [Citus docs](https://docs.citusdata.com) have a wealth of resources, including sections on [query performance tuning](https://docs.citusdata.com/en/stable/performance/performance_tuning.html), [useful diagnostic queries](https://docs.citusdata.com/en/stable/admin_guide/diagnostic_queries.html), and [common error messages](https://docs.citusdata.com/en/stable/reference/common_errors.html).
|
|
||||||
- **Docs issues**: You can also submit documentation issues via [GitHub issues for our Citus docs](https://github.com/citusdata/citus_docs/issues).
|
|
||||||
- **Updates & Release Notes**: Learn about what's new in each Citus version on the [Citus Updates page](https://www.citusdata.com/updates/).
|
|
||||||
|
|
||||||
## Contributing
|
|
||||||
|
|
||||||
Citus is built on and of open source, and we welcome your contributions. The [CONTRIBUTING.md](CONTRIBUTING.md) file explains how to get started developing the Citus extension itself and our code quality guidelines.
|
|
||||||
|
|
||||||
## Code of Conduct
|
|
||||||
|
|
||||||
This project has adopted the [Microsoft Open Source Code of Conduct](https://opensource.microsoft.com/codeofconduct/).
|
|
||||||
For more information see the [Code of Conduct FAQ](https://opensource.microsoft.com/codeofconduct/faq/) or
|
|
||||||
contact [opencode@microsoft.com](mailto:opencode@microsoft.com) with any additional questions or comments.
|
|
||||||
|
|
||||||
## Stay Connected
|
|
||||||
|
|
||||||
- **Twitter**: Follow us [@citusdata](https://twitter.com/citusdata) to track the latest posts & updates on what’s happening.
|
|
||||||
- **Citus Blog**: Read our popular [Citus Open Source Blog](https://www.citusdata.com/blog/) for posts about PostgreSQL and Citus.
|
|
||||||
- **Citus Newsletter**: Subscribe to our monthly technical [Citus Newsletter](https://www.citusdata.com/join-newsletter) to get a curated collection of our favorite posts, videos, docs, talks, & other Postgres goodies.
|
|
||||||
- **Slack**: Our [Citus Public slack](https://slack.citusdata.com/) is a good way to stay connected, not just with us but with other Citus users.
|
|
||||||
- **Sister Blog**: Read the PostgreSQL posts on the [Azure Cosmos DB for PostgreSQL blog](https://devblogs.microsoft.com/cosmosdb/category/postgresql/) about our managed service on Azure.
|
|
||||||
- **Videos**: Check out this [YouTube playlist](https://www.youtube.com/playlist?list=PLixnExCn6lRq261O0iwo4ClYxHpM9qfVy) of some of our favorite Citus videos and demos. If you want to deep dive into how Citus extends PostgreSQL, you might want to check out Marco Slot’s talk at Carnegie Mellon titled [Citus: Distributed PostgreSQL as an Extension](https://youtu.be/X-aAgXJZRqM) that was part of Andy Pavlo’s Vaccination Database Talks series at CMUDB.
|
|
||||||
- **Our other Postgres projects**: Our team also works on other awesome PostgreSQL open source extensions & projects, including: [pg_cron](https://github.com/citusdata/pg_cron), [HyperLogLog](https://github.com/citusdata/postgresql-hll), [TopN](https://github.com/citusdata/postgresql-topn), [pg_auto_failover](https://github.com/citusdata/pg_auto_failover), [activerecord-multi-tenant](https://github.com/citusdata/activerecord-multi-tenant), and [django-multitenant](https://github.com/citusdata/django-multitenant).
|
|
||||||
|
|
||||||
___

Copyright © Citus Data, Inc.
|
||||||
|
|
41
SECURITY.md
|
@ -1,41 +0,0 @@
|
||||||
<!-- BEGIN MICROSOFT SECURITY.MD V0.0.8 BLOCK -->
|
|
||||||
|
|
||||||
## Security
|
|
||||||
|
|
||||||
Microsoft takes the security of our software products and services seriously, which includes all source code repositories managed through our GitHub organizations, which include [Microsoft](https://github.com/microsoft), [Azure](https://github.com/Azure), [DotNet](https://github.com/dotnet), [AspNet](https://github.com/aspnet), [Xamarin](https://github.com/xamarin), and [our GitHub organizations](https://opensource.microsoft.com/).
|
|
||||||
|
|
||||||
If you believe you have found a security vulnerability in any Microsoft-owned repository that meets [Microsoft's definition of a security vulnerability](https://aka.ms/opensource/security/definition), please report it to us as described below.
|
|
||||||
|
|
||||||
## Reporting Security Issues
|
|
||||||
|
|
||||||
**Please do not report security vulnerabilities through public GitHub issues.**
|
|
||||||
|
|
||||||
Instead, please report them to the Microsoft Security Response Center (MSRC) at [https://msrc.microsoft.com/create-report](https://aka.ms/opensource/security/create-report).
|
|
||||||
|
|
||||||
If you prefer to submit without logging in, send email to [secure@microsoft.com](mailto:secure@microsoft.com). If possible, encrypt your message with our PGP key; please download it from the [Microsoft Security Response Center PGP Key page](https://aka.ms/opensource/security/pgpkey).
|
|
||||||
|
|
||||||
You should receive a response within 24 hours. If for some reason you do not, please follow up via email to ensure we received your original message. Additional information can be found at [microsoft.com/msrc](https://aka.ms/opensource/security/msrc).
|
|
||||||
|
|
||||||
Please include the requested information listed below (as much as you can provide) to help us better understand the nature and scope of the possible issue:
|
|
||||||
|
|
||||||
* Type of issue (e.g. buffer overflow, SQL injection, cross-site scripting, etc.)
|
|
||||||
* Full paths of source file(s) related to the manifestation of the issue
|
|
||||||
* The location of the affected source code (tag/branch/commit or direct URL)
|
|
||||||
* Any special configuration required to reproduce the issue
|
|
||||||
* Step-by-step instructions to reproduce the issue
|
|
||||||
* Proof-of-concept or exploit code (if possible)
|
|
||||||
* Impact of the issue, including how an attacker might exploit the issue
|
|
||||||
|
|
||||||
This information will help us triage your report more quickly.
|
|
||||||
|
|
||||||
If you are reporting for a bug bounty, more complete reports can contribute to a higher bounty award. Please visit our [Microsoft Bug Bounty Program](https://aka.ms/opensource/security/bounty) page for more details about our active programs.
|
|
||||||
|
|
||||||
## Preferred Languages
|
|
||||||
|
|
||||||
We prefer all communications to be in English.
|
|
||||||
|
|
||||||
## Policy
|
|
||||||
|
|
||||||
Microsoft follows the principle of [Coordinated Vulnerability Disclosure](https://aka.ms/opensource/security/cvd).
|
|
||||||
|
|
||||||
<!-- END MICROSOFT SECURITY.MD BLOCK -->
|
|
160
STYLEGUIDE.md
|
@ -1,160 +0,0 @@
|
||||||
# Coding style
|
|
||||||
|
|
||||||
The existing code style in our code base is not entirely consistent. There are multiple reasons for that. One big reason is that our code base is relatively old and our standards have changed over time. The second big reason is that our style guide differs from the Postgres style guide, and some code is copied from the Postgres source code with only slight modifications. The rules below are for new code. If you're changing existing code that uses a different style, use your best judgement to decide whether to apply the rules here or to match the existing style.
|
|
||||||
|
|
||||||
## Using citus_indent
|
|
||||||
|
|
||||||
The CI pipeline will automatically reject any PRs which do not follow our coding
|
|
||||||
conventions. The easiest way to ensure your PR adheres to those conventions is
|
|
||||||
to use the [citus_indent](https://github.com/citusdata/tools/tree/develop/uncrustify)
|
|
||||||
tool. This tool uses `uncrustify` under the hood.
|
|
||||||
|
|
||||||
```bash
|
|
||||||
# Uncrustify changes the way it formats code every release a bit. To make sure
|
|
||||||
# everyone formats consistently we use version 0.68.1:
|
|
||||||
curl -L https://github.com/uncrustify/uncrustify/archive/uncrustify-0.68.1.tar.gz | tar xz
|
|
||||||
cd uncrustify-uncrustify-0.68.1/
|
|
||||||
mkdir build
|
|
||||||
cd build
|
|
||||||
cmake ..
|
|
||||||
make -j5
|
|
||||||
sudo make install
|
|
||||||
cd ../..
|
|
||||||
|
|
||||||
git clone https://github.com/citusdata/tools.git
|
|
||||||
cd tools
|
|
||||||
make uncrustify/.install
|
|
||||||
```
|
|
||||||
|
|
||||||
Once you've done that, you can run the `make reindent` command from the top
|
|
||||||
directory to recursively check and correct the style of any source files in the
|
|
||||||
current directory. Under the hood, `make reindent` will run `citus_indent` and
|
|
||||||
some other style corrections for you.
|
|
||||||
|
|
||||||
You can also run the following in the directory of this repository to
|
|
||||||
automatically format all the files that you have changed before committing:
|
|
||||||
|
|
||||||
```bash
|
|
||||||
cat > .git/hooks/pre-commit << __EOF__
|
|
||||||
#!/bin/bash
|
|
||||||
citus_indent --check --diff || { citus_indent --diff; exit 1; }
|
|
||||||
__EOF__
|
|
||||||
chmod +x .git/hooks/pre-commit
|
|
||||||
```
|
|
||||||
|
|
||||||
## Other rules we follow that citus_indent does not enforce
|
|
||||||
|
|
||||||
* We almost always use **CamelCase** when naming functions, variables etc., **not snake_case**.
|
|
||||||
|
|
||||||
* We also have the habit of using **lowerCamelCase** for some variables named after their type or the function they come from, as shown in the examples:
|
|
||||||
|
|
||||||
```c
|
|
||||||
bool IsCitusExtensionLoaded = false;
|
|
||||||
|
|
||||||
|
|
||||||
bool
|
|
||||||
IsAlterTableRenameStmt(RenameStmt *renameStmt)
|
|
||||||
{
|
|
||||||
AlterTableCmd *alterTableCommand = NULL;
|
|
||||||
..
|
|
||||||
..
|
|
||||||
|
|
||||||
bool isAlterTableRenameStmt = false;
|
|
||||||
..
|
|
||||||
}
|
|
||||||
```
|
|
||||||
|
|
||||||
* We **start functions with a comment**:
|
|
||||||
|
|
||||||
```c
|
|
||||||
/*
|
|
||||||
* MyNiceFunction <something in present simple tense, e.g., processes / returns / checks / takes X as input / does Y> ..
|
|
||||||
* <some more nice words> ..
|
|
||||||
* <some more nice words> ..
|
|
||||||
*/
|
|
||||||
<static?> <return type>
|
|
||||||
MyNiceFunction(..)
|
|
||||||
{
|
|
||||||
..
|
|
||||||
..
|
|
||||||
}
|
|
||||||
```
|
|
||||||
|
|
||||||
* `#include`s need to be sorted based on the ordering below and then alphabetically, and we should not include what we don't need in a file:
|
|
||||||
|
|
||||||
* System includes (eg. #include<...>)
|
|
||||||
* Postgres.h (eg. #include "postgres.h")
|
|
||||||
* Toplevel imports from postgres, not contained in a directory (eg. #include "miscadmin.h")
|
|
||||||
* General postgres includes (eg. #include "nodes/...")
|
|
||||||
* Toplevel citus includes, not contained in a directory (eg. #include "citus_version.h")
|
|
||||||
* Columnar includes (eg. #include "columnar/...")
|
|
||||||
* Distributed includes (eg. #include "distributed/...")
|
|
||||||
|
|
||||||
* Comments:
|
|
||||||
```c
|
|
||||||
/* single line comments start with a lower-case */
|
|
||||||
|
|
||||||
/*
|
|
||||||
* We start multi-line comments with a capital letter
|
|
||||||
* and keep adding a star to the beginning of each line
|
|
||||||
* until we close the comment with a star and a slash.
|
|
||||||
*/
|
|
||||||
```
|
|
||||||
|
|
||||||
* Order of function implementations and their declarations in a file:
|
|
||||||
|
|
||||||
We define static functions after the functions that call them. For example:
|
|
||||||
|
|
||||||
```c
|
|
||||||
#include<..>
|
|
||||||
#include<..>
|
|
||||||
..
|
|
||||||
..
|
|
||||||
typedef struct
|
|
||||||
{
|
|
||||||
..
|
|
||||||
..
|
|
||||||
} MyNiceStruct;
|
|
||||||
..
|
|
||||||
..
|
|
||||||
PG_FUNCTION_INFO_V1(my_nice_udf1);
|
|
||||||
PG_FUNCTION_INFO_V1(my_nice_udf2);
|
|
||||||
..
|
|
||||||
..
|
|
||||||
// .. somewhere on top of the file …
|
|
||||||
static void MyNiceStaticlyDeclaredFunction1(…);
|
|
||||||
static void MyNiceStaticlyDeclaredFunction2(…);
|
|
||||||
..
|
|
||||||
..
|
|
||||||
|
|
||||||
|
|
||||||
void
|
|
||||||
MyNiceFunctionExternedViaHeaderFile(..)
|
|
||||||
{
|
|
||||||
..
|
|
||||||
..
|
|
||||||
MyNiceStaticlyDeclaredFunction1(..);
|
|
||||||
..
|
|
||||||
..
|
|
||||||
MyNiceStaticlyDeclaredFunction2(..);
|
|
||||||
..
|
|
||||||
}
|
|
||||||
|
|
||||||
..
|
|
||||||
..
|
|
||||||
|
|
||||||
// we define this first because it's called by MyNiceFunctionExternedViaHeaderFile()
|
|
||||||
// before MyNiceStaticlyDeclaredFunction2()
|
|
||||||
static void
|
|
||||||
MyNiceStaticlyDeclaredFunction1(…)
|
|
||||||
{
|
|
||||||
}
|
|
||||||
..
|
|
||||||
..
|
|
||||||
|
|
||||||
// then we define this
|
|
||||||
static void
|
|
||||||
MyNiceStaticlyDeclaredFunction2(…)
|
|
||||||
{
|
|
||||||
}
|
|
||||||
```
|
|
|
@ -1,2 +0,0 @@
|
||||||
dnl aclocal.m4
|
|
||||||
m4_include([config/general.m4])
|
|
|
@ -1,6 +1,6 @@
|
||||||
#!/bin/bash
|
#!/bin/bash
|
||||||
#
|
#
|
||||||
# autogen.sh converts configure.ac to configure and creates
|
# autogen.sh converts configure.in to configure and creates
|
||||||
# citus_config.h.in. The resulting files are checked into
|
# citus_config.h.in. The resulting files are checked into
|
||||||
# the SCM, to avoid everyone needing autoconf installed.
|
# the SCM, to avoid everyone needing autoconf installed.
|
||||||
|
|
||||||
|
|
|
@ -1,47 +0,0 @@
|
||||||
{
|
|
||||||
"Registrations": [
|
|
||||||
{
|
|
||||||
"Component": {
|
|
||||||
"Type": "git",
|
|
||||||
"git": {
|
|
||||||
"RepositoryUrl": "https://github.com/intel/safestringlib",
|
|
||||||
"CommitHash": "245c4b8cff1d2e7338b7f3a82828fc8e72b29549"
|
|
||||||
}
|
|
||||||
},
|
|
||||||
"DevelopmentDependency": false
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"Component": {
|
|
||||||
"Type": "git",
|
|
||||||
"git": {
|
|
||||||
"RepositoryUrl": "https://github.com/postgres/postgres",
|
|
||||||
"CommitHash": "29be9983a64c011eac0b9ee29895cce71e15ea77"
|
|
||||||
}
|
|
||||||
},
|
|
||||||
"license": "PostgreSQL",
|
|
||||||
"licenseDetail": [
|
|
||||||
"Portions Copyright (c) 1996-2010, The PostgreSQL Global Development Group",
|
|
||||||
"",
|
|
||||||
"Portions Copyright (c) 1994, The Regents of the University of California",
|
|
||||||
"",
|
|
||||||
"Permission to use, copy, modify, and distribute this software and its documentation for ",
|
|
||||||
"any purpose, without fee, and without a written agreement is hereby granted, provided ",
|
|
||||||
"that the above copyright notice and this paragraph and the following two paragraphs appear ",
|
|
||||||
"in all copies.",
|
|
||||||
"",
|
|
||||||
"IN NO EVENT SHALL THE UNIVERSITY OF CALIFORNIA BE LIABLE TO ANY PARTY FOR DIRECT, INDIRECT, SPECIAL, ",
|
|
||||||
"INCIDENTAL, OR CONSEQUENTIAL DAMAGES, INCLUDING LOST PROFITS, ARISING OUT OF THE USE OF THIS ",
|
|
||||||
"SOFTWARE AND ITS DOCUMENTATION, EVEN IF THE UNIVERSITY OF CALIFORNIA HAS BEEN ADVISED OF THE ",
|
|
||||||
"POSSIBILITY OF SUCH DAMAGE.",
|
|
||||||
"",
|
|
||||||
"THE UNIVERSITY OF CALIFORNIA SPECIFICALLY DISCLAIMS ANY WARRANTIES, INCLUDING, BUT NOT LIMITED TO, ",
|
|
||||||
"THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. THE SOFTWARE PROVIDED ",
|
|
||||||
"HEREUNDER IS ON AN \"AS IS\" BASIS, AND THE UNIVERSITY OF CALIFORNIA HAS NO OBLIGATIONS TO PROVIDE ",
|
|
||||||
"MAINTENANCE, SUPPORT, UPDATES, ENHANCEMENTS, OR MODIFICATIONS."
|
|
||||||
],
|
|
||||||
"version": "0.0.1",
|
|
||||||
"DevelopmentDependency": false
|
|
||||||
}
|
|
||||||
|
|
||||||
]
|
|
||||||
}
|
|
402
ci/README.md
|
@ -1,402 +0,0 @@
|
||||||
# CI scripts
|
|
||||||
|
|
||||||
We have a few scripts that we run in CI to confirm that code conforms to our
|
|
||||||
standards. Be sure you have followed the setup in the [Following our coding
|
|
||||||
conventions](https://github.com/citusdata/citus/blob/master/CONTRIBUTING.md#following-our-coding-conventions)
|
|
||||||
section of `CONTRIBUTING.md`. Once you've done that, most of them should be
|
|
||||||
fixed automatically, when running:
|
|
||||||
```
|
|
||||||
make reindent
|
|
||||||
```
|
|
||||||
|
|
||||||
See the sections below for details on what a specific failing script means.
|
|
||||||
|
|
||||||
## `citus_indent`
|
|
||||||
|
|
||||||
We format all our code using the coding conventions in the
|
|
||||||
[citus_indent](https://github.com/citusdata/tools/tree/develop/uncrustify)
|
|
||||||
tool. This tool uses `uncrustify` under the hood. See [Following our coding
|
|
||||||
conventions](https://github.com/citusdata/citus/blob/master/CONTRIBUTING.md#following-our-coding-conventions) on how to install this.
|
|
||||||
|
|
||||||
## `editorconfig.sh`
|
|
||||||
|
|
||||||
You should install the Editorconfig plugin for your editor/IDE
|
|
||||||
https://editorconfig.org/
|
|
||||||
|
|
||||||
## `banned.h.sh`
|
|
||||||
|
|
||||||
You're using a C library function that is banned by Microsoft, mostly because of
|
|
||||||
the risk of buffer overflows. This page lists the Microsoft-suggested replacements:
|
|
||||||
https://liquid.microsoft.com/Web/Object/Read/ms.security/Requirements/Microsoft.Security.SystemsADM.10082#guide
|
|
||||||
These replacements are normally only available on Windows. Since we build for
|
|
||||||
Linux we make most of them available with this header file:
|
|
||||||
```c
|
|
||||||
#include "distributed/citus_safe_lib.h"
|
|
||||||
```
|
|
||||||
This uses https://github.com/intel/safestringlib to provide them.
|
|
||||||
|
|
||||||
However, still not all of them are available. For those cases we provide
|
|
||||||
some extra functions in `citus_safe_lib.h`, with similar functionality.
|
|
||||||
|
|
||||||
If none of those replacements match your requirements you have to do one of the
|
|
||||||
following:
|
|
||||||
1. Add a replacement to `citus_safe_lib.{c,h}` that handles the same error cases
|
|
||||||
as the `{func_name}_s` function that Microsoft suggests.
|
|
||||||
2. Add a `/* IGNORE-BANNED */` comment to the line that complains. Doing this
|
|
||||||
requires also adding a comment before it explaining why this specific use of the
|
|
||||||
function is safe.
|
|
||||||
|
|
||||||
## `build-citus.sh`
|
|
||||||
|
|
||||||
This is the script used during the build phase of the extension. Historically this script
|
|
||||||
was embedded in the docker images. This made maintenance a hassle. Now it lives in tree
|
|
||||||
with the rest of the source code.
|
|
||||||
|
|
||||||
When this script fails you most likely have a build error on the postgres version it was
|
|
||||||
building at the time of the failure. Fix the compile error and push a new version of your
|
|
||||||
code to fix.
|
|
||||||
|
|
||||||
## `check_enterprise_merge.sh`
|
|
||||||
|
|
||||||
This check exists to make sure that we can always merge the `master` branch of
|
|
||||||
`community` into the `enterprise-master` branch of the `enterprise` repo.
|
|
||||||
There are two conditions in which this check passes:
|
|
||||||
|
|
||||||
1. There are no merge conflicts between your PR branch and `enterprise-master` and after this merge the code compiles.
|
|
||||||
2. There are merge conflicts, but there is a branch with the same name in the
|
|
||||||
enterprise repo that:
|
|
||||||
1. Contains the last commit of the community branch with the same name.
|
|
||||||
2. Merges cleanly into `enterprise-master`
|
|
||||||
3. After merging, the code can be compiled.
|
|
||||||
|
|
||||||
If the job already passes, you are done, nothing further required! Otherwise
|
|
||||||
follow the below steps.
|
|
||||||
|
|
||||||
### Prerequisites
|
|
||||||
|
|
||||||
Before continuing with the real steps make sure you have done the following
|
|
||||||
(this only needs to be done once):
|
|
||||||
1. You have enabled `git rerere` globally or in your enterprise repo
|
|
||||||
([docs](https://git-scm.com/docs/git-rerere), [very useful blog](https://medium.com/@porteneuve/fix-conflicts-only-once-with-git-rerere-7d116b2cec67#.3vui844dt)):
|
|
||||||
```bash
|
|
||||||
# Enables it globally for all repos
|
|
||||||
git config --global rerere.enabled true
|
|
||||||
# Enables it only for the enterprise repo
|
|
||||||
cd <enterprise-repo>
|
|
||||||
git config rerere.enabled true
|
|
||||||
```
|
|
||||||
2. You have set up the `community` remote on your enterprise as
|
|
||||||
[described in CONTRIBUTING.md](https://github.com/citusdata/citus-enterprise/blob/enterprise-master/CONTRIBUTING.md#merging-community-changes-onto-enterprise).
|
|
||||||
|
|
||||||
|
|
||||||
#### Important notes on `git rerere`
|
|
||||||
|
|
||||||
This is very useful as it will make sure git will automatically redo merges that
|
|
||||||
you have done before. However, this has a downside too. It will also redo merges
|
|
||||||
that you did, but that were incorrect. To work around this you can use these
|
|
||||||
commands.
|
|
||||||
1. Make `git rerere` forget a merge:
|
|
||||||
```bash
|
|
||||||
git rerere forget <badly_merged_file>
|
|
||||||
```
|
|
||||||
2. During conflict resolution where `git rerere` already applied the bad merge,
|
|
||||||
simply forgetting it is not enough, since it is already applied. In that case
|
|
||||||
you also have to undo the apply using:
|
|
||||||
```bash
|
|
||||||
git checkout --conflict=merge <badly_merged_file>
|
|
||||||
```
|
|
||||||
|
|
||||||
### Actual steps
|
|
||||||
|
|
||||||
After the prerequisites are met we continue on to the real steps. Say your
|
|
||||||
branch name is `$PR_BRANCH`, we will refer to `$PR_BRANCH` on community as
|
|
||||||
`community/$PR_BRANCH` and on enterprise as `enterprise/$PR_BRANCH`. First make
|
|
||||||
sure these two things are the case:
|
|
||||||
|
|
||||||
1. Get approval from your reviewer for `community/$PR_BRANCH`. Only follow the
|
|
||||||
next steps after you are about to merge the branch to community master.
|
|
||||||
2. Make sure your commits are in a nice state, since you should not do
|
|
||||||
"squash and merge" on Github later. Otherwise you will certainly get
|
|
||||||
duplicate commits and possibly get merge conflicts with enterprise again.
|
|
||||||
|
|
||||||
Once that's done, you need to create a merged version of your PR branch on the
|
|
||||||
enterprise repo. For example if `community` is added as a remote in
|
|
||||||
your enterprise repo, you can do the following:
|
|
||||||
|
|
||||||
```bash
|
|
||||||
export PR_BRANCH=<YOUR BRANCHNAME OF THE PR HERE>
|
|
||||||
git checkout enterprise-master
|
|
||||||
git pull # Make sure your local enterprise-master is up to date
|
|
||||||
git fetch community # Fetch your up to date branch name
|
|
||||||
git checkout -b "$PR_BRANCH" enterprise-master
|
|
||||||
```
|
|
||||||
Now you have this branch in your enterprise repo, which we refer to as
|
|
||||||
`enterprise/$PR_BRANCH` (even though in git commands you would reference it as
|
|
||||||
`origin/$PR_BRANCH`). This branch is currently the same as `enterprise-master`.
|
|
||||||
First to make review easier, you should merge community master into it. This
|
|
||||||
should apply without any merge conflicts:
|
|
||||||
|
|
||||||
```bash
|
|
||||||
git merge community/master
|
|
||||||
```
|
|
||||||
Now you need to merge `community/$PR_BRANCH` to `enterprise/$PR_BRANCH`. Solve
|
|
||||||
any conflicts and make sure to remove any parts that should not be in enterprise
|
|
||||||
even though they don't have a conflict, in the enterprise repository:
|
|
||||||
|
|
||||||
```bash
|
|
||||||
git merge "community/$PR_BRANCH"
|
|
||||||
```
|
|
||||||
|
|
||||||
1. You should push this branch to the enterprise repo. This is so that the job
|
|
||||||
on community will see this branch.
|
|
||||||
2. Wait until tests on `enterprise/$PR_BRANCH` pass.
|
|
||||||
3. Create a PR on the enterprise repo for your `enterprise/$PR_BRANCH` branch.
|
|
||||||
4. You should get approval for the merge conflict changes on
|
|
||||||
`enterprise/$PR_BRANCH`, preferably from the same reviewer as they are
|
|
||||||
familiar with the change.
|
|
||||||
5. You should rerun the `check-merge-to-enterprise` check on
|
|
||||||
`community/$PR_BRANCH`. You can use the re-run from failed option in Circle CI.
|
|
||||||
6. You can now merge the PR on community. Be sure to NOT use "squash and merge",
|
|
||||||
but instead use the regular "merge commit" mode.
|
|
||||||
7. You can now merge the PR on enterprise. Be sure to NOT use "squash and merge",
|
|
||||||
but instead use the regular "merge commit" mode.
|
|
||||||
|
|
||||||
The subsequent PRs on community will be able to pass the
|
|
||||||
`check-merge-to-enterprise` check as long as they don't have a conflict with
|
|
||||||
`enterprise-master`.
|
|
||||||
|
|
||||||
### What to do when your branch got outdated?
|
|
||||||
|
|
||||||
So there's one issue that can occur: your branch can become outdated with
|
|
||||||
master and you have to make it up to date. There are two ways to do this using
|
|
||||||
`git merge` or `git rebase`. As usual, `git merge` is a bit easier than `git
|
|
||||||
rebase`, but clutters git history. This section will explain both. If you don't
|
|
||||||
know which one makes the most sense, start with `git rebase`. It's possible that
|
|
||||||
for whatever reason this doesn't work or becomes very complex, for instance when
|
|
||||||
new merge conflicts appear. Feel free to fall back to `git merge` in that case,
|
|
||||||
by using `git rebase --abort`.
|
|
||||||
|
|
||||||
#### Updating both branches with `git rebase`

In the community repo, first update the outdated branch using `rebase`:

```bash
git checkout $PR_BRANCH
# Keep a backup in case you want to fall back to the merge approach
git checkout -b ${PR_BRANCH}-backup
git checkout $PR_BRANCH
# Actually update the branch
git fetch origin
git rebase origin/master
git push origin $PR_BRANCH --force-with-lease
```
In the enterprise repo, rebase onto the new community branch with
`--preserve-merges`:

```bash
git checkout $PR_BRANCH
git fetch community
git rebase community/$PR_BRANCH --preserve-merges
```
Automatic merge might have failed with the above command. However, because of
`git rerere` it should have re-applied your original merge resolution. If this
is indeed the case, it should show something like this in the output of the
previous command (note the `Resolved ...` line):

```
CONFLICT (content): Merge conflict in <file_path>
Resolved '<file_path>' using previous resolution.
Automatic merge failed; fix conflicts and then commit the result.
Error redoing merge <merge_sha>
```
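If `git rerere` did not replay anything, that is usually because it was not
enabled when you resolved the conflict the first time; it only records
resolutions while it is enabled. A minimal sketch for turning it on (using
`--global` here is just one option, enabling it per repository also works):

```bash
# Record and automatically reuse previous conflict resolutions
git config --global rerere.enabled true
```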
Confirm that the merge conflict is indeed resolved correctly. In that case you
can do the following:

```bash
# Add files that were conflicting
git add "$(git diff --name-only --diff-filter=U)"
git rebase --continue
```
Before pushing you should do a final check that the commit hash of your final
non-merge commit matches the commit hash that's on the community repo. If
that's not the case, you should fall back to the `git merge` approach.
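One way to compare the two hashes is sketched below; it assumes `community` is
the remote pointing at the community repo (and that you have fetched it):

```bash
# Latest non-merge commit on your local enterprise version of the branch
git log --no-merges -1 --format=%H
# Tip of the community branch; the two hashes should be identical
git rev-parse "community/$PR_BRANCH"
```

If you do need to fall back, the command below resets your local branch to the
version on the enterprise remote, undoing the rebase: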
```bash
git reset origin/$PR_BRANCH --hard
```
If the commit hashes were as expected, push the branch:

```bash
git push origin $PR_BRANCH --force-with-lease
```
#### Updating both branches with `git merge`

If you are falling back to the `git merge` approach after trying the
`git rebase` approach, you should first restore the original branch on the
community repo:

```bash
git checkout $PR_BRANCH
git reset ${PR_BRANCH}-backup --hard
git push origin $PR_BRANCH --force-with-lease
```
In the community repo, first update the outdated branch using `merge`:

```bash
git checkout $PR_BRANCH
git fetch origin
git merge origin/master
git push origin $PR_BRANCH
```
In the enterprise repo, merge with the updated `community/$PR_BRANCH`:

```bash
git checkout $PR_BRANCH
git fetch community
git merge community/$PR_BRANCH
git push origin $PR_BRANCH
```
## `check_sql_snapshots.sh`

To allow for better diffs during review we have snapshots of SQL UDFs. If this
check fails, it means that `latest.sql` is not up to date with the SQL file of
the highest version number in the directory. The output of the script shows
you what is different.
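For example, to see how `latest.sql` differs from the newest snapshot of a
single UDF, you can diff the two files directly (the UDF directory and version
number below are only hypothetical examples):

```bash
diff -u src/backend/distributed/sql/udfs/citus_add_node/latest.sql \
        src/backend/distributed/sql/udfs/citus_add_node/12.0-1.sql
```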
## `check_all_tests_are_run.sh`

A test should always be included in a schedule file, otherwise it will not be
run in CI. This is most commonly forgotten for newly added tests. In that case
the developer ran it locally without running a full schedule, using something
like:

```bash
make -C src/test/regress/ check-minimal EXTRA_TESTS='multi_create_table_new_features'
```
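To confirm whether a test is referenced by any schedule, a quick check like
this works (the test name is just the example from above):

```bash
# No output means the test is not in any schedule file yet
cd src/test/regress
grep 'multi_create_table_new_features' ./*_schedule
```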
## `check_all_ci_scripts_are_run.sh`

This is the meta CI script. It checks that all existing CI scripts are
actually run in CI. This is most commonly forgotten for newly added CI tests
that the developer only ran locally. It also checks that all CI scripts have a
section in this `README.md` file and that they include `ci/ci_helpers.sh`.
## `check_migration_files.sh`

A branch that touches a set of upgrade scripts is also expected to touch the
corresponding downgrade scripts. For example, if your branch adds
`src/backend/distributed/sql/citus--X--Y.sql`, it should also add
`src/backend/distributed/sql/downgrades/citus--Y--X.sql`. If this script
fails, read the output and make sure you update the downgrade scripts in the
printed list. If you really don't need a downgrade to run any SQL, you can
write a comment in the file explaining why a downgrade step is not necessary.
## `disallow_c_comments_in_migrations.sh`

We do not use C-style comments in migration files, as the stripped
zero-length migration files cause warnings during packaging.
Instead, use SQL-style comments, i.e.:
```
-- this is a comment
```
See [#3115](https://github.com/citusdata/citus/pull/3115) for more info.
## `disallow_hash_comments_in_spec_files.sh`

We do not use comments starting with `#` in spec files, because they create
errors from the C preprocessor, which expects directives after this character.
Instead, use C-style comments, i.e.:
```
// this is a single line comment

/*
 * this is a multi line comment
 */
```
## `disallow_long_changelog_entries.sh`

Changelog entries that are longer than 80 characters are forbidden. It's
allowed to split up an entry over multiple lines, as long as each line of the
entry is 80 characters or less.
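To find offending lines locally you can run the same `awk` check the script
uses:

```bash
# Print line number and length for every CHANGELOG.md line over 80 characters
awk 'length() > 80 {print NR, "(", length(), "characters ) :", $0}' CHANGELOG.md
```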
## `normalize_expected.sh`

All files in `src/test/expected` should be committed in normalized form.
This error mostly happens if someone added a new normalization rule and you
have not rerun the tests that you added.

We normalize the test output files using a `sed` script called
[`normalize.sed`](https://github.com/citusdata/citus/blob/master/src/test/regress/bin/normalize.sed).
The reason for this is that some output changes randomly in ways we don't care
about. An example of this is when an error happens on a different port number,
or a different worker shard, or a different placement, etc., either randomly
or because we are running the tests in a slightly different configuration.
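To normalize the expected files locally, you can run the script directly from
the repository root; it rewrites the files in place:

```bash
# Re-apply normalize.sed to all committed expected output files
ci/normalize_expected.sh
```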
## `remove_useless_declarations.sh`

This script tries to make sure that we don't add useless declarations to our
code. What it effectively does is replace this:

```c
int a = 0;
int b = 2;
Assert(b == 2);
a = b + b;
```

With this equivalent, but shorter version:

```c
int b = 2;
Assert(b == 2);
int a = b + b;
```
It relies on the fact that `citus_indent` formats our code in certain ways, so
before running this script, make sure that you've done that.
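In practice that means running the formatter first, for example:

```bash
# Format the code the way the script expects, then run the cleanup
citus_indent . --quiet
ci/remove_useless_declarations.sh
```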
This replacement is all done using a [regex replace](https://xkcd.com/1171),
so it's definitely possible there's a bug in there. So far no bad ones have
been found.

A known issue is that it does not replace code in a block after an `#ifdef`
like this:
```c
int foo = 0;
#ifdef SOMETHING
foo = 1;
#else
foo = 2;
#endif
```

This was deemed to be error prone and not worth the effort.
## `fix_gitignore.sh`

This script checks and fixes issues with `.gitignore` rules:

1. Makes sure we do not commit any generated files that should be ignored. If
   there is an ignored file in the git tree, the user is expected to review
   the files that are removed from the git tree and commit the change.
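To see which tracked files currently violate a `.gitignore` rule, the
underlying git command is:

```bash
# Tracked files that match an ignore rule; ideally this prints nothing
git ls-files --ignored --cached --exclude-standard
```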
## `check_gucs_are_alphabetically_sorted.sh`

This script checks the order of the GUCs defined in `shared_library_init.c`.
To solve this failure, please check `shared_library_init.c` and make sure that
the GUC definitions are in alphabetical order.
## `print_stack_trace.sh`

This script prints stack traces for failed tests, if they left core files.
## `sort_and_group_includes.sh`

This script checks and fixes issues with include grouping and sorting in C
files.

Includes are grouped in the following groups:

- System includes (e.g. `#include <math>`)
- Postgres.h include (e.g. `#include "postgres.h"`)
- Toplevel postgres includes (includes not in a directory, e.g. `#include "miscadmin.h"`)
- Postgres includes in a directory (e.g. `#include "catalog/pg_type.h"`)
- Toplevel citus includes (includes not in a directory, e.g. `#include "pg_version_constants.h"`)
- Columnar includes (e.g. `#include "columnar/columnar.h"`)
- Distributed includes (e.g. `#include "distributed/maintenanced.h"`)

Within every group the include lines are sorted alphabetically.
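To apply the grouping and sorting locally, you can run the script from the
repository root; it rewrites the files in place:

```bash
ci/sort_and_group_includes.sh
```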
@ -1,56 +0,0 @@
|
||||||
#!/bin/bash
|
|
||||||
|
|
||||||
# Checks for the APIs that are banned by microsoft. Since we compile for Linux
|
|
||||||
# we use the replacements from https://github.com/intel/safestringlib
|
|
||||||
# Not all replacement functions are available in safestringlib. If it doesn't
|
|
||||||
# exist and you cannot rewrite the code to not use the banned API, then you can
|
|
||||||
# add a comment containing "IGNORE-BANNED" to the line where the error is and
|
|
||||||
# this check will ignore that match.
|
|
||||||
#
|
|
||||||
# The replacement functions that you should use are listed here:
|
|
||||||
# https://liquid.microsoft.com/Web/Object/Read/ms.security/Requirements/Microsoft.Security.SystemsADM.10082#guide
|
|
||||||
|
|
||||||
set -eu
|
|
||||||
# shellcheck disable=SC1091
|
|
||||||
source ci/ci_helpers.sh
|
|
||||||
|
|
||||||
files=$(find src -iname '*.[ch]' | git check-attr --stdin citus-style | grep -v ': unset$' | sed 's/: citus-style: set$//')
|
|
||||||
|
|
||||||
# grep is allowed to fail, that means no banned matches are found
|
|
||||||
set +e
|
|
||||||
# Required banned from banned.h. These functions are not allowed to be used at
|
|
||||||
# all.
|
|
||||||
# shellcheck disable=SC2086
|
|
||||||
grep -E '\b(strcpy|strcpyA|strcpyW|wcscpy|_tcscpy|_mbscpy|StrCpy|StrCpyA|StrCpyW|lstrcpy|lstrcpyA|lstrcpyW|_tccpy|_mbccpy|_ftcscpy|strcat|strcatA|strcatW|wcscat|_tcscat|_mbscat|StrCat|StrCatA|StrCatW|lstrcat|lstrcatA|lstrcatW|StrCatBuff|StrCatBuffA|StrCatBuffW|StrCatChainW|_tccat|_mbccat|_ftcscat|sprintfW|sprintfA|wsprintf|wsprintfW|wsprintfA|sprintf|swprintf|_stprintf|wvsprintf|wvsprintfA|wvsprintfW|vsprintf|_vstprintf|vswprintf|strncpy|wcsncpy|_tcsncpy|_mbsncpy|_mbsnbcpy|StrCpyN|StrCpyNA|StrCpyNW|StrNCpy|strcpynA|StrNCpyA|StrNCpyW|lstrcpyn|lstrcpynA|lstrcpynW|strncat|wcsncat|_tcsncat|_mbsncat|_mbsnbcat|StrCatN|StrCatNA|StrCatNW|StrNCat|StrNCatA|StrNCatW|lstrncat|lstrcatnA|lstrcatnW|lstrcatn|gets|_getts|_gettws|IsBadWritePtr|IsBadHugeWritePtr|IsBadReadPtr|IsBadHugeReadPtr|IsBadCodePtr|IsBadStringPtr|memcpy|RtlCopyMemory|CopyMemory|wmemcpy|lstrlen)\(' $files \
|
|
||||||
| grep -v "IGNORE-BANNED" \
|
|
||||||
&& echo "ERROR: Required banned API usage detected" && exit 1
|
|
||||||
|
|
||||||
# Required banned from table on liquid. These functions are not allowed to be
|
|
||||||
# used at all.
|
|
||||||
# shellcheck disable=SC2086
|
|
||||||
grep -E '\b(strcat|strcpy|strerror|strncat|strncpy|strtok|wcscat|wcscpy|wcsncat|wcsncpy|wcstok|fprintf|fwprintf|printf|snprintf|sprintf|swprintf|vfprintf|vprintf|vsnprintf|vsprintf|vswprintf|vwprintf|wprintf|fscanf|fwscanf|gets|scanf|sscanf|swscanf|vfscanf|vfwscanf|vscanf|vsscanf|vswscanf|vwscanf|wscanf|asctime|atof|atoi|atol|atoll|bsearch|ctime|fopen|freopen|getenv|gmtime|localtime|mbsrtowcs|mbstowcs|memcpy|memmove|qsort|rewind|setbuf|wmemcpy|wmemmove)\(' $files \
|
|
||||||
| grep -v "IGNORE-BANNED" \
|
|
||||||
&& echo "ERROR: Required banned API usage from table detected" && exit 1
|
|
||||||
|
|
||||||
# Recommended banned from banned.h. If you can change the code not to use these
|
|
||||||
# that would be great. You can use IGNORE-BANNED if you need to use it anyway.
|
|
||||||
# You can also remove it from the regex, if you want to mark the API as allowed
|
|
||||||
# throughout the codebase (to not have to add IGNORED-BANNED everywhere). In
|
|
||||||
# that case note it in this comment that you did so.
|
|
||||||
# shellcheck disable=SC2086
|
|
||||||
grep -E '\b(wnsprintf|wnsprintfA|wnsprintfW|_snwprintf|_snprintf|_sntprintf|_vsnprintf|vsnprintf|_vsnwprintf|_vsntprintf|wvnsprintf|wvnsprintfA|wvnsprintfW|strtok|_tcstok|wcstok|_mbstok|makepath|_tmakepath| _makepath|_wmakepath|_splitpath|_tsplitpath|_wsplitpath|scanf|wscanf|_tscanf|sscanf|swscanf|_stscanf|snscanf|snwscanf|_sntscanf|_itoa|_itow|_i64toa|_i64tow|_ui64toa|_ui64tot|_ui64tow|_ultoa|_ultot|_ultow|CharToOem|CharToOemA|CharToOemW|OemToChar|OemToCharA|OemToCharW|CharToOemBuffA|CharToOemBuffW|alloca|_alloca|ChangeWindowMessageFilter)\(' $files \
|
|
||||||
| grep -v "IGNORE-BANNED" \
|
|
||||||
&& echo "ERROR: Recomended banned API usage detected" && exit 1
|
|
||||||
|
|
||||||
# Recommended banned from table on liquid. If you can change the code not to use these
|
|
||||||
# that would be great. You can use IGNORE-BANNED if you need to use it anyway.
|
|
||||||
# You can also remove it from the regex, if you want to mark the API as allowed
|
|
||||||
# throughout the codebase (to not have to add IGNORED-BANNED everywhere). In
|
|
||||||
# that case note it in this comment that you did so.
|
|
||||||
# Banned APIs ignored throughout the codebase:
|
|
||||||
# - strlen
|
|
||||||
# shellcheck disable=SC2086
|
|
||||||
grep -E '\b(alloca|getwd|mktemp|tmpnam|wcrtomb|wcrtombs|wcslen|wcsrtombs|wcstombs|wctomb|class_addMethod|class_replaceMethod)\(' $files \
|
|
||||||
| grep -v "IGNORE-BANNED" \
|
|
||||||
&& echo "ERROR: Recomended banned API usage detected" && exit 1
|
|
||||||
exit 0
|
|
|
@ -1,44 +0,0 @@
|
||||||
#!/bin/bash
|
|
||||||
|
|
||||||
# make bash behave
|
|
||||||
set -euo pipefail
|
|
||||||
IFS=$'\n\t'
|
|
||||||
|
|
||||||
# shellcheck disable=SC1091
|
|
||||||
source ci/ci_helpers.sh
|
|
||||||
|
|
||||||
# read pg major version, error if not provided
|
|
||||||
PG_MAJOR=${PG_MAJOR:?please provide the postgres major version}
|
|
||||||
|
|
||||||
# get codename from release file
|
|
||||||
. /etc/os-release
|
|
||||||
codename=${VERSION#*(}
|
|
||||||
codename=${codename%)*}
|
|
||||||
|
|
||||||
# we'll do everything with absolute paths
|
|
||||||
basedir="$(pwd)"
|
|
||||||
|
|
||||||
# get the project and clear out the git repo (reduce workspace size)
|
|
||||||
rm -rf "${basedir}/.git"
|
|
||||||
|
|
||||||
build_ext() {
|
|
||||||
pg_major="$1"
|
|
||||||
|
|
||||||
builddir="${basedir}/build-${pg_major}"
|
|
||||||
echo "Beginning build for PostgreSQL ${pg_major}..." >&2
|
|
||||||
|
|
||||||
# do everything in a subdirectory to avoid clutter in current directory
|
|
||||||
mkdir -p "${builddir}" && cd "${builddir}"
|
|
||||||
|
|
||||||
CFLAGS=-Werror "${basedir}/configure" PG_CONFIG="/usr/lib/postgresql/${pg_major}/bin/pg_config" --enable-coverage --with-security-flags
|
|
||||||
|
|
||||||
installdir="${builddir}/install"
|
|
||||||
make -j$(nproc) && mkdir -p "${installdir}" && { make DESTDIR="${installdir}" install-all || make DESTDIR="${installdir}" install ; }
|
|
||||||
|
|
||||||
cd "${installdir}" && find . -type f -print > "${builddir}/files.lst"
|
|
||||||
tar cvf "${basedir}/install-${pg_major}.tar" `cat ${builddir}/files.lst`
|
|
||||||
|
|
||||||
cd "${builddir}" && rm -rf install files.lst && make clean
|
|
||||||
}
|
|
||||||
|
|
||||||
build_ext "${PG_MAJOR}"
|
|
|
@ -1,29 +0,0 @@
|
||||||
#!/bin/bash
|
|
||||||
set -euo pipefail
|
|
||||||
|
|
||||||
# shellcheck disable=SC1091
|
|
||||||
source ci/ci_helpers.sh
|
|
||||||
|
|
||||||
|
|
||||||
# 1. Find all *.sh files in the ci directory
|
|
||||||
# 2. Strip the directory
|
|
||||||
# 3. Exclude some scripts that we should not run in CI directly
|
|
||||||
ci_scripts=$(
|
|
||||||
find ci/ -iname "*.sh" |
|
|
||||||
sed -E 's#^ci/##g' |
|
|
||||||
grep -v -E '^(ci_helpers.sh|fix_style.sh)$'
|
|
||||||
)
|
|
||||||
for script in $ci_scripts; do
|
|
||||||
if ! grep "\\bci/$script\\b" -r .github > /dev/null; then
|
|
||||||
echo "ERROR: CI script with name \"$script\" is not actually used in .github folder"
|
|
||||||
exit 1
|
|
||||||
fi
|
|
||||||
if ! grep "^## \`$script\`\$" ci/README.md > /dev/null; then
|
|
||||||
echo "ERROR: CI script with name \"$script\" does not have a section in ci/README.md"
|
|
||||||
exit 1
|
|
||||||
fi
|
|
||||||
if ! grep "source ci/ci_helpers.sh" "ci/$script" > /dev/null; then
|
|
||||||
echo "ERROR: CI script with name \"$script\" does not include ci/ci_helpers.sh"
|
|
||||||
exit 1
|
|
||||||
fi
|
|
||||||
done
|
|
|
@ -1,24 +0,0 @@
|
||||||
#!/bin/bash
|
|
||||||
set -euo pipefail
|
|
||||||
|
|
||||||
# shellcheck disable=SC1091
|
|
||||||
source ci/ci_helpers.sh
|
|
||||||
|
|
||||||
|
|
||||||
cd src/test/regress
|
|
||||||
|
|
||||||
# 1. Find all *.sql and *.spec files in the sql, and spec directories
|
|
||||||
# 2. Strip the extension and the directory
|
|
||||||
# 3. Ignore names that end with .include, those files are meant to be in a C
|
|
||||||
# preprocessor #include statement. They should not be in schedules.
|
|
||||||
test_names=$(
|
|
||||||
find sql spec -iname "*.sql" -o -iname "*.spec" |
|
|
||||||
sed -E 's#^\w+/([^/]+)\.[^.]+$#\1#g' |
|
|
||||||
grep -v '.include$'
|
|
||||||
)
|
|
||||||
for name in $test_names; do
|
|
||||||
if ! grep "\\b$name\\b" ./*_schedule > /dev/null; then
|
|
||||||
echo "ERROR: Test with name \"$name\" is not used in any of the schedule files"
|
|
||||||
exit 1
|
|
||||||
fi
|
|
||||||
done
|
|
|
@ -1,25 +0,0 @@
|
||||||
#!/bin/bash
|
|
||||||
|
|
||||||
set -euo pipefail
|
|
||||||
# shellcheck disable=SC1091
|
|
||||||
source ci/ci_helpers.sh
|
|
||||||
|
|
||||||
# Find the line that exactly matches "RegisterCitusConfigVariables(void)" in
|
|
||||||
# shared_library_init.c. grep command returns something like
|
|
||||||
# "934:RegisterCitusConfigVariables(void)" and we extract the line number
|
|
||||||
# with cut.
|
|
||||||
RegisterCitusConfigVariables_begin_linenumber=$(grep -n "^RegisterCitusConfigVariables(void)$" src/backend/distributed/shared_library_init.c | cut -d: -f1)
|
|
||||||
|
|
||||||
# Consider the lines starting from $RegisterCitusConfigVariables_begin_linenumber,
|
|
||||||
# grep the first line that starts with "}" and extract the line number with cut
|
|
||||||
# as in the previous step.
|
|
||||||
RegisterCitusConfigVariables_length=$(tail -n +$RegisterCitusConfigVariables_begin_linenumber src/backend/distributed/shared_library_init.c | grep -n -m 1 "^}$" | cut -d: -f1)
|
|
||||||
|
|
||||||
# extract the function definition of RegisterCitusConfigVariables into a temp file
|
|
||||||
tail -n +$RegisterCitusConfigVariables_begin_linenumber src/backend/distributed/shared_library_init.c | head -n $(($RegisterCitusConfigVariables_length)) > RegisterCitusConfigVariables_func_def.out
|
|
||||||
|
|
||||||
# extract citus gucs in the form of <tab><tab>"citus.X"
|
|
||||||
grep -P "^[\t][\t]\"citus\.[a-zA-Z_0-9]+\"" RegisterCitusConfigVariables_func_def.out > gucs.out
|
|
||||||
LC_COLLATE=C sort -c gucs.out
|
|
||||||
rm gucs.out
|
|
||||||
rm RegisterCitusConfigVariables_func_def.out
|
|
|
@ -1,33 +0,0 @@
|
||||||
#! /bin/bash
|
|
||||||
|
|
||||||
set -euo pipefail
|
|
||||||
# shellcheck disable=SC1091
|
|
||||||
source ci/ci_helpers.sh
|
|
||||||
|
|
||||||
# This file checks for the existence of downgrade scripts for every upgrade script that is changed in the branch.
|
|
||||||
|
|
||||||
# create list of migration files for upgrades
|
|
||||||
upgrade_files=$(git diff --name-only origin/main | { grep "src/backend/distributed/sql/citus--.*sql" || exit 0 ; })
|
|
||||||
downgrade_files=$(git diff --name-only origin/main | { grep "src/backend/distributed/sql/downgrades/citus--.*sql" || exit 0 ; })
|
|
||||||
ret_value=0
|
|
||||||
|
|
||||||
for file in $upgrade_files
|
|
||||||
do
|
|
||||||
# There should always be 2 matches, and no need to avoid splitting here
|
|
||||||
# shellcheck disable=SC2207
|
|
||||||
versions=($(grep --only-matching --extended-regexp "[0-9]+\.[0-9]+[-.][0-9]+" <<< "$file"))
|
|
||||||
|
|
||||||
from_version=${versions[0]};
|
|
||||||
to_version=${versions[1]};
|
|
||||||
|
|
||||||
downgrade_migration_file="src/backend/distributed/sql/downgrades/citus--$to_version--$from_version.sql"
|
|
||||||
|
|
||||||
# check for the existence of migration scripts
|
|
||||||
if [[ $(grep --line-regexp --count "$downgrade_migration_file" <<< "$downgrade_files") == 0 ]]
|
|
||||||
then
|
|
||||||
echo "$file is updated, but $downgrade_migration_file is not updated in branch"
|
|
||||||
ret_value=1
|
|
||||||
fi
|
|
||||||
done
|
|
||||||
|
|
||||||
exit $ret_value;
|
|
|
@ -1,20 +0,0 @@
|
||||||
#!/bin/bash
|
|
||||||
set -euo pipefail
|
|
||||||
|
|
||||||
# shellcheck disable=SC1091
|
|
||||||
source ci/ci_helpers.sh
|
|
||||||
|
|
||||||
for udf_dir in src/backend/distributed/sql/udfs/* src/backend/columnar/sql/udfs/*; do
|
|
||||||
# We want to find the last snapshotted sql file, to make sure it's the same
|
|
||||||
# as "latest.sql". This is done by:
|
|
||||||
# 1. Getting the filenames in the UDF directory (using find instead of ls, to keep shellcheck happy)
|
|
||||||
# 2. Filter out latest.sql
|
|
||||||
# 3. Sort using "version sort"
|
|
||||||
# 4. Get the last one using tail
|
|
||||||
latest_snapshot=$(\
|
|
||||||
find "$udf_dir" -iname "*.sql" -exec basename {} \; \
|
|
||||||
| { grep --invert-match latest.sql || true; } \
|
|
||||||
| sort --version-sort \
|
|
||||||
| tail --lines 1);
|
|
||||||
diff --unified --color=auto "$udf_dir/latest.sql" "$udf_dir/$latest_snapshot"; \
|
|
||||||
done
|
|
|
@ -1,32 +0,0 @@
|
||||||
#!/bin/bash
|
|
||||||
|
|
||||||
# For echo commands "set -x" would show the message effectively twice. Once as
|
|
||||||
# part of the echo command shown by "set -x" and once because of the output of
|
|
||||||
# the echo command. We do not want "set -x" to show the echo command. We only
|
|
||||||
# want to see the actual message in the output of echo itself. This function is
|
|
||||||
# a trick to do so. Read the StackOverflow post below to understand why this
|
|
||||||
# works and what this works around.
|
|
||||||
# Source: https://superuser.com/a/1141026/242593
|
|
||||||
shopt -s expand_aliases
|
|
||||||
alias echo='{ save_flags="$-"; set +x;} 2> /dev/null && echo_and_restore'
|
|
||||||
echo_and_restore() {
|
|
||||||
builtin echo "$*"
|
|
||||||
#shellcheck disable=SC2154
|
|
||||||
case "$save_flags" in
|
|
||||||
(*x*) set -x
|
|
||||||
esac
|
|
||||||
}
|
|
||||||
|
|
||||||
# Make sure that on a failing exit we show a useful message
|
|
||||||
hint_on_fail() {
|
|
||||||
exit_code=$?
|
|
||||||
# Get filename of the currently running script
|
|
||||||
# Source: https://stackoverflow.com/a/192337/2570866
|
|
||||||
filename=$(basename "$0")
|
|
||||||
if [ $exit_code != 0 ]; then
|
|
||||||
echo "HINT: To solve this failure look here: https://github.com/citusdata/citus/blob/master/ci/README.md#$filename"
|
|
||||||
fi
|
|
||||||
exit $exit_code
|
|
||||||
}
|
|
||||||
trap hint_on_fail EXIT
|
|
||||||
|
|
|
@ -1,32 +0,0 @@
|
||||||
#! /bin/bash
|
|
||||||
|
|
||||||
set -euo pipefail
|
|
||||||
|
|
||||||
# make ** match all directories and subdirectories
|
|
||||||
shopt -s globstar
|
|
||||||
|
|
||||||
# shellcheck disable=SC1091
|
|
||||||
source ci/ci_helpers.sh
|
|
||||||
|
|
||||||
# We do not use c-style comments in migration files as the stripped
|
|
||||||
# zero-length migration files cause warnings during packaging
|
|
||||||
# See #3115 for more info
|
|
||||||
|
|
||||||
# In this file, we aim to keep the indentation intact by capturing whitespaces,
|
|
||||||
# and reusing them if needed. GNU sed unfortunately does not support lookaround assertions.
|
|
||||||
|
|
||||||
# /* -> --
|
|
||||||
find src/backend/{distributed,columnar}/sql/**/*.sql -print0 | xargs -0 sed -i 's#/\*#--#g'
|
|
||||||
|
|
||||||
# */ -> `` (empty string)
|
|
||||||
# remove all whitespaces immediately before the match
|
|
||||||
find src/backend/{distributed,columnar}/sql/**/*.sql -print0 | xargs -0 sed -i 's#\s*\*/\s*##g'
|
|
||||||
|
|
||||||
# * -> --
|
|
||||||
# keep the indentation
|
|
||||||
# allow only whitespaces before the match
|
|
||||||
find src/backend/{distributed,columnar}/sql/**/*.sql -print0 | xargs -0 sed -i 's#^\(\s*\) \*#\1--#g'
|
|
||||||
|
|
||||||
# // -> --
|
|
||||||
# do not touch http:// or similar by allowing only whitespaces before //
|
|
||||||
find src/backend/{distributed,columnar}/sql/**/*.sql -print0 | xargs -0 sed -i 's#^\(\s*\)//#\1--#g'
|
|
|
@ -1,12 +0,0 @@
|
||||||
#! /bin/bash

set -euo pipefail

# shellcheck disable=SC1091
source ci/ci_helpers.sh

# We do not use comments starting with # in spec files because they create
# warnings from the C preprocessor, which expects directives after this character.

# `# ` -> `// `
find src/test/regress/spec/*.spec -print0 | xargs -0 sed -i 's!# !// !g'
@ -1,19 +0,0 @@
|
||||||
#! /bin/bash
|
|
||||||
|
|
||||||
set -eu
|
|
||||||
# shellcheck disable=SC1091
|
|
||||||
source ci/ci_helpers.sh
|
|
||||||
|
|
||||||
# Having changelog items with entries that are longer than 80 characters are forbidden.
|
|
||||||
# Find all lines with disallowed length, and for all such lines store
|
|
||||||
# - line number
|
|
||||||
# - length of the line
|
|
||||||
# - the line content
|
|
||||||
too_long_lines=$(awk 'length() > 80 {print NR,"(",length(),"characters ) :",$0}' CHANGELOG.md)
|
|
||||||
|
|
||||||
if [[ -n $too_long_lines ]]
|
|
||||||
then
|
|
||||||
echo "We allow at most 80 characters in CHANGELOG.md."
|
|
||||||
echo "${too_long_lines}"
|
|
||||||
exit 1
|
|
||||||
fi
|
|
|
@ -1,22 +0,0 @@
|
||||||
#!/bin/bash
|
|
||||||
|
|
||||||
set -euo pipefail
|
|
||||||
# shellcheck disable=SC1091
|
|
||||||
source ci/ci_helpers.sh
|
|
||||||
|
|
||||||
for f in $(git ls-tree -r HEAD --name-only); do
|
|
||||||
if [ "$f" = "${f%.out}" ] &&
|
|
||||||
[ "$f" = "${f%.data}" ] &&
|
|
||||||
[ "$f" = "${f%.png}" ] &&
|
|
||||||
[ -f "$f" ] &&
|
|
||||||
[ "$(echo "$f" | cut -d / -f1)" != "vendor" ] &&
|
|
||||||
[ "$(dirname "$f")" != "src/test/regress/output" ]
|
|
||||||
then
|
|
||||||
# Trim trailing whitespace
|
|
||||||
sed -e 's/[[:space:]]*$//' -i "./$f"
|
|
||||||
# Add final newline if not there
|
|
||||||
if [ -n "$(tail -c1 "$f")" ]; then
|
|
||||||
echo >> "$f"
|
|
||||||
fi
|
|
||||||
fi
|
|
||||||
done
|
|
|
@ -1,19 +0,0 @@
|
||||||
#! /bin/bash
|
|
||||||
set -euo pipefail
|
|
||||||
# shellcheck disable=SC1091
|
|
||||||
source ci/ci_helpers.sh
|
|
||||||
|
|
||||||
# Remove all the ignored files from git tree, and error out
|
|
||||||
# find all ignored files in git tree, and use quotation marks to prevent word splitting on filenames with spaces in them
|
|
||||||
# NOTE: Option --cached is needed to avoid a bug in git ls-files command.
|
|
||||||
ignored_lines_in_git_tree=$(git ls-files --ignored --cached --exclude-standard | sed 's/.*/"&"/')
|
|
||||||
|
|
||||||
if [[ -n $ignored_lines_in_git_tree ]]
|
|
||||||
then
|
|
||||||
echo "Ignored files should not be in git tree!"
|
|
||||||
echo "${ignored_lines_in_git_tree}"
|
|
||||||
|
|
||||||
echo "Removing these files from git tree, please review and commit"
|
|
||||||
echo "$ignored_lines_in_git_tree" | xargs git rm -r --cached
|
|
||||||
exit 1
|
|
||||||
fi
|
|
|
@ -1,22 +0,0 @@
|
||||||
#!/bin/sh
|
|
||||||
|
|
||||||
# fail if trying to reference a variable that is not set.
|
|
||||||
set -u # i.e., set -o nounset
|
|
||||||
# exit immediately if a command fails
|
|
||||||
set -e
|
|
||||||
|
|
||||||
cidir="${0%/*}"
|
|
||||||
cd ${cidir}/..
|
|
||||||
|
|
||||||
citus_indent . --quiet
|
|
||||||
black . --quiet
|
|
||||||
isort . --quiet
|
|
||||||
ci/editorconfig.sh
|
|
||||||
ci/remove_useless_declarations.sh
|
|
||||||
ci/disallow_c_comments_in_migrations.sh
|
|
||||||
ci/disallow_hash_comments_in_spec_files.sh
|
|
||||||
ci/disallow_long_changelog_entries.sh
|
|
||||||
ci/normalize_expected.sh
|
|
||||||
ci/fix_gitignore.sh
|
|
||||||
ci/print_stack_trace.sh
|
|
||||||
ci/sort_and_group_includes.sh
|
|
|
@ -1,157 +0,0 @@
|
||||||
#!/usr/bin/env python3
|
|
||||||
"""
|
|
||||||
easy command line to run against all citus-style checked files:
|
|
||||||
|
|
||||||
$ git ls-files \
|
|
||||||
| git check-attr --stdin citus-style \
|
|
||||||
| grep 'citus-style: set' \
|
|
||||||
| awk '{print $1}' \
|
|
||||||
| cut -d':' -f1 \
|
|
||||||
| xargs -n1 ./ci/include_grouping.py
|
|
||||||
"""
|
|
||||||
|
|
||||||
import collections
|
|
||||||
import os
|
|
||||||
import sys
|
|
||||||
|
|
||||||
|
|
||||||
def main(args):
|
|
||||||
if len(args) < 2:
|
|
||||||
print("Usage: include_grouping.py <file>")
|
|
||||||
return
|
|
||||||
|
|
||||||
file = args[1]
|
|
||||||
if not os.path.isfile(file):
|
|
||||||
sys.exit(f"File '{file}' does not exist")
|
|
||||||
|
|
||||||
with open(file, "r") as in_file:
|
|
||||||
with open(file + ".tmp", "w") as out_file:
|
|
||||||
includes = []
|
|
||||||
skipped_lines = []
|
|
||||||
|
|
||||||
# This calls print_sorted_includes on a set of consecutive #include lines.
|
|
||||||
# This implicitly keeps separation of any #include lines that are contained in
|
|
||||||
# an #ifdef, because it will order the #include lines inside and after the
|
|
||||||
# #ifdef completely separately.
|
|
||||||
for line in in_file:
|
|
||||||
# if a line starts with #include we don't want to print it yet, instead we
|
|
||||||
# want to collect all consecutive #include lines
|
|
||||||
if line.startswith("#include"):
|
|
||||||
includes.append(line)
|
|
||||||
skipped_lines = []
|
|
||||||
continue
|
|
||||||
|
|
||||||
# if we have collected any #include lines, we want to print them sorted
|
|
||||||
# before printing the current line. However, if the current line is empty
|
|
||||||
# we want to perform a lookahead to see if the next line is an #include.
|
|
||||||
# To maintain any separation between #include lines and their subsequent
|
|
||||||
# lines we keep track of all lines we have skipped inbetween.
|
|
||||||
if len(includes) > 0:
|
|
||||||
if len(line.strip()) == 0:
|
|
||||||
skipped_lines.append(line)
|
|
||||||
continue
|
|
||||||
|
|
||||||
# we have includes that need to be grouped before printing the current
|
|
||||||
# line.
|
|
||||||
print_sorted_includes(includes, file=out_file)
|
|
||||||
includes = []
|
|
||||||
|
|
||||||
# print any skipped lines
|
|
||||||
print("".join(skipped_lines), end="", file=out_file)
|
|
||||||
skipped_lines = []
|
|
||||||
|
|
||||||
print(line, end="", file=out_file)
|
|
||||||
|
|
||||||
# move out_file to file
|
|
||||||
os.rename(file + ".tmp", file)
|
|
||||||
|
|
||||||
|
|
||||||
def print_sorted_includes(includes, file=sys.stdout):
|
|
||||||
default_group_key = 1
|
|
||||||
groups = collections.defaultdict(set)
|
|
||||||
|
|
||||||
# define the groups that we separate correctly. The matchers are tested in the order
|
|
||||||
# of their priority field. The first matcher that matches the include is used to
|
|
||||||
# assign the include to a group.
|
|
||||||
# The groups are printed in the order of their group_key.
|
|
||||||
matchers = [
|
|
||||||
{
|
|
||||||
"name": "system includes",
|
|
||||||
"matcher": lambda x: x.startswith("<"),
|
|
||||||
"group_key": -2,
|
|
||||||
"priority": 0,
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"name": "toplevel postgres includes",
|
|
||||||
"matcher": lambda x: "/" not in x,
|
|
||||||
"group_key": 0,
|
|
||||||
"priority": 9,
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"name": "postgres.h",
|
|
||||||
"matcher": lambda x: x.strip() in ['"postgres.h"'],
|
|
||||||
"group_key": -1,
|
|
||||||
"priority": -1,
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"name": "toplevel citus inlcudes",
|
|
||||||
"matcher": lambda x: x.strip()
|
|
||||||
in [
|
|
||||||
'"citus_version.h"',
|
|
||||||
'"pg_version_compat.h"',
|
|
||||||
'"pg_version_constants.h"',
|
|
||||||
],
|
|
||||||
"group_key": 3,
|
|
||||||
"priority": 0,
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"name": "columnar includes",
|
|
||||||
"matcher": lambda x: x.startswith('"columnar/'),
|
|
||||||
"group_key": 4,
|
|
||||||
"priority": 1,
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"name": "distributed includes",
|
|
||||||
"matcher": lambda x: x.startswith('"distributed/'),
|
|
||||||
"group_key": 5,
|
|
||||||
"priority": 1,
|
|
||||||
},
|
|
||||||
]
|
|
||||||
matchers.sort(key=lambda x: x["priority"])
|
|
||||||
|
|
||||||
# throughout our codebase we have some includes where either postgres or citus
|
|
||||||
# includes are wrongfully included with the syntax for system includes. Before we
|
|
||||||
# try to match those we will change the <> to "" to make them match our system. This
|
|
||||||
# will also rewrite the include to the correct syntax.
|
|
||||||
common_system_include_error_prefixes = ["<nodes/", "<distributed/"]
|
|
||||||
|
|
||||||
# assign every include to a group
|
|
||||||
for include in includes:
|
|
||||||
# extract the group key from the include
|
|
||||||
include_content = include.split(" ")[1]
|
|
||||||
|
|
||||||
# fix common system includes which are secretly postgres or citus includes
|
|
||||||
for common_prefix in common_system_include_error_prefixes:
|
|
||||||
if include_content.startswith(common_prefix):
|
|
||||||
include_content = '"' + include_content.strip()[1:-1] + '"'
|
|
||||||
include = include.split(" ")[0] + " " + include_content + "\n"
|
|
||||||
break
|
|
||||||
|
|
||||||
group_key = default_group_key
|
|
||||||
for matcher in matchers:
|
|
||||||
if matcher["matcher"](include_content):
|
|
||||||
group_key = matcher["group_key"]
|
|
||||||
break
|
|
||||||
|
|
||||||
groups[group_key].add(include)
|
|
||||||
|
|
||||||
# iterate over all groups in the natural order of its keys
|
|
||||||
for i, group in enumerate(sorted(groups.items())):
|
|
||||||
if i > 0:
|
|
||||||
print(file=file)
|
|
||||||
includes = group[1]
|
|
||||||
print("".join(sorted(includes)), end="", file=file)
|
|
||||||
|
|
||||||
|
|
||||||
if __name__ == "__main__":
|
|
||||||
main(sys.argv)
|
|
|
@ -1,10 +0,0 @@
|
||||||
#!/bin/bash
|
|
||||||
|
|
||||||
set -euo pipefail
|
|
||||||
# shellcheck disable=SC1091
|
|
||||||
source ci/ci_helpers.sh
|
|
||||||
|
|
||||||
for f in $(git ls-tree -r HEAD --name-only src/test/regress/expected/*.out); do
|
|
||||||
sed -Ef src/test/regress/bin/normalize.sed < "$f" > "$f.modified"
|
|
||||||
mv "$f.modified" "$f"
|
|
||||||
done
|
|
|
@ -1,25 +0,0 @@
|
||||||
#!/bin/bash
|
|
||||||
|
|
||||||
set -euo pipefail
|
|
||||||
|
|
||||||
# shellcheck disable=SC1091
|
|
||||||
source ci/ci_helpers.sh
|
|
||||||
|
|
||||||
# find all core files
|
|
||||||
core_files=( $(find . -type f -regex .*core.*\d*.*postgres) )
|
|
||||||
if [ ${#core_files[@]} -gt 0 ]; then
|
|
||||||
# print stack traces for the core files
|
|
||||||
for core_file in "${core_files[@]}"
|
|
||||||
do
|
|
||||||
# set print frame-arguments all: show all scalars + structures in the frame
|
|
||||||
# set print pretty on: show structures in indented mode
|
|
||||||
# set print addr off: do not show pointer address
|
|
||||||
# thread apply all bt full: show stack traces for all threads
|
|
||||||
gdb --batch \
|
|
||||||
-ex "set print frame-arguments all" \
|
|
||||||
-ex "set print pretty on" \
|
|
||||||
-ex "set print addr off" \
|
|
||||||
-ex "thread apply all bt full" \
|
|
||||||
postgres "${core_file}"
|
|
||||||
done
|
|
||||||
fi
|
|
|
@ -1,34 +0,0 @@
|
||||||
#!/bin/bash
|
|
||||||
|
|
||||||
set -euo pipefail
|
|
||||||
# shellcheck disable=SC1091
|
|
||||||
source ci/ci_helpers.sh
|
|
||||||
|
|
||||||
files=$(find src -iname '*.c' -type f | git check-attr --stdin citus-style | grep -v ': unset$' | sed 's/: citus-style: set$//')
|
|
||||||
while true; do
|
|
||||||
# A visual version of this regex can be seen here (it is MUCH clearer):
|
|
||||||
# https://www.debuggex.com/r/XodMNE9auT9e-bTx
|
|
||||||
# This visual version only contains the search bit, the replacement bit is
|
|
||||||
# quite simple. It looks like when extracted from the command below:
|
|
||||||
# \n$+{code_between}\t$+{type}$+{variable} =
|
|
||||||
# shellcheck disable=SC2086
|
|
||||||
perl -i -p0e 's/\n\t(?!return )(?P<type>(\w+ )+\**)(?>(?P<variable>\w+)( = *[\w>\s\n-]*?)?;\n(?P<code_between>(?>(?P<comment_or_string_or_not_preprocessor>\/\*.*?\*\/|"(?>\\"|.)*?"|[^#]))*?)(\t)?(?=\b(?P=variable)\b))(?<=\n\t)(?P=variable) =(?![^;]*?[^>_]\b(?P=variable)\b[^_])/\n$+{code_between}\t$+{type}$+{variable} =/sg' $files
|
|
||||||
# The following are simply the same regex, but repeated for different
|
|
||||||
# indentation levels, i.e. finding declarations indented using 2, 3, 4, 5
|
|
||||||
# and 6 tabs. More than 6 don't really occur in the wild.
|
|
||||||
# (this is needed because variable sized backtracking is not supported in perl)
|
|
||||||
# shellcheck disable=SC2086
|
|
||||||
perl -i -p0e 's/\n\t\t(?!return )(?P<type>(\w+ )+\**)(?>(?P<variable>\w+)( = *[\w>\s\n-]*?)?;\n(?P<code_between>(?>(?P<comment_or_string_or_not_preprocessor>\/\*.*?\*\/|"(?>\\"|.)*?"|[^#]))*?)(\t\t)?(?=\b(?P=variable)\b))(?<=\n\t\t)(?P=variable) =(?![^;]*?[^>_]\b(?P=variable)\b[^_])/\n$+{code_between}\t\t$+{type}$+{variable} =/sg' $files
|
|
||||||
# shellcheck disable=SC2086
|
|
||||||
perl -i -p0e 's/\n\t\t\t(?!return )(?P<type>(\w+ )+\**)(?>(?P<variable>\w+)( = *[\w>\s\n-]*?)?;\n(?P<code_between>(?>(?P<comment_or_string_or_not_preprocessor>\/\*.*?\*\/|"(?>\\"|.)*?"|[^#]))*?)(\t\t\t)?(?=\b(?P=variable)\b))(?<=\n\t\t\t)(?P=variable) =(?![^;]*?[^>_]\b(?P=variable)\b[^_])/\n$+{code_between}\t\t\t$+{type}$+{variable} =/sg' $files
|
|
||||||
# shellcheck disable=SC2086
|
|
||||||
perl -i -p0e 's/\n\t\t\t\t(?!return )(?P<type>(\w+ )+\**)(?>(?P<variable>\w+)( = *[\w>\s\n-]*?)?;\n(?P<code_between>(?>(?P<comment_or_string_or_not_preprocessor>\/\*.*?\*\/|"(?>\\"|.)*?"|[^#]))*?)(\t\t\t\t)?(?=\b(?P=variable)\b))(?<=\n\t\t\t\t)(?P=variable) =(?![^;]*?[^>_]\b(?P=variable)\b[^_])/\n$+{code_between}\t\t\t\t$+{type}$+{variable} =/sg' $files
|
|
||||||
# shellcheck disable=SC2086
|
|
||||||
perl -i -p0e 's/\n\t\t\t\t\t(?!return )(?P<type>(\w+ )+\**)(?>(?P<variable>\w+)( = *[\w>\s\n-]*?)?;\n(?P<code_between>(?>(?P<comment_or_string_or_not_preprocessor>\/\*.*?\*\/|"(?>\\"|.)*?"|[^#]))*?)(\t\t\t\t\t)?(?=\b(?P=variable)\b))(?<=\n\t\t\t\t\t)(?P=variable) =(?![^;]*?[^>_]\b(?P=variable)\b[^_])/\n$+{code_between}\t\t\t\t\t$+{type}$+{variable} =/sg' $files
|
|
||||||
# shellcheck disable=SC2086
|
|
||||||
perl -i -p0e 's/\n\t\t\t\t\t\t(?!return )(?P<type>(\w+ )+\**)(?>(?P<variable>\w+)( = *[\w>\s\n-]*?)?;\n(?P<code_between>(?>(?P<comment_or_string_or_not_preprocessor>\/\*.*?\*\/|"(?>\\"|.)*?"|[^#]))*?)(\t\t\t\t\t\t)?(?=\b(?P=variable)\b))(?<=\n\t\t\t\t\t\t)(?P=variable) =(?![^;]*?[^>_]\b(?P=variable)\b[^_])/\n$+{code_between}\t\t\t\t\t\t$+{type}$+{variable} =/sg' $files
|
|
||||||
# shellcheck disable=SC2086
|
|
||||||
git diff --quiet $files && break;
|
|
||||||
# shellcheck disable=SC2086
|
|
||||||
git add $files;
|
|
||||||
done
|
|
|
@ -1,12 +0,0 @@
|
||||||
#!/bin/bash
|
|
||||||
|
|
||||||
set -euo pipefail
|
|
||||||
# shellcheck disable=SC1091
|
|
||||||
source ci/ci_helpers.sh
|
|
||||||
|
|
||||||
git ls-files \
|
|
||||||
| git check-attr --stdin citus-style \
|
|
||||||
| grep 'citus-style: set' \
|
|
||||||
| awk '{print $1}' \
|
|
||||||
| cut -d':' -f1 \
|
|
||||||
| xargs -n1 ./ci/include_grouping.py
|
|
|
@ -1,151 +0,0 @@
|
||||||
# config/general.m4
|
|
||||||
# Portions Copyright (c) 1996-2017, PostgreSQL Global Development Group
|
|
||||||
# Portions Copyright (c) 1994, The Regents of the University of California
|
|
||||||
|
|
||||||
# This file defines new macros to process configure command line
|
|
||||||
# arguments, to replace the brain-dead AC_ARG_WITH and AC_ARG_ENABLE.
|
|
||||||
# The flaw in these is particularly that they only differentiate
|
|
||||||
# between "given" and "not given" and do not provide enough help to
|
|
||||||
# process arguments that only accept "yes/no", that require an
|
|
||||||
# argument (other than "yes/no"), etc.
|
|
||||||
#
|
|
||||||
# The point of this implementation is to reduce code size and
|
|
||||||
# redundancy in configure.ac and to improve robustness and consistency
|
|
||||||
# in the option evaluation code.
|
|
||||||
|
|
||||||
|
|
||||||
# Convert type and name to shell variable name (e.g., "enable_long_strings")
|
|
||||||
m4_define([pgac_arg_to_variable],
|
|
||||||
[$1[]_[]patsubst($2, -, _)])
|
|
||||||
|
|
||||||
|
|
||||||
# PGAC_ARG(TYPE, NAME, HELP-STRING-LHS-EXTRA, HELP-STRING-RHS,
|
|
||||||
# [ACTION-IF-YES], [ACTION-IF-NO], [ACTION-IF-ARG],
|
|
||||||
# [ACTION-IF-OMITTED])
|
|
||||||
# ------------------------------------------------------------
|
|
||||||
# This is the base layer. TYPE is either "with" or "enable", depending
|
|
||||||
# on what you like. NAME is the rest of the option name.
|
|
||||||
# HELP-STRING-LHS-EXTRA is a string to append to the option name on
|
|
||||||
# the left-hand side of the help output, e.g., an argument name. If
|
|
||||||
# set to "-", append nothing, but let the option appear in the
|
|
||||||
# negative form (disable/without). HELP-STRING-RHS is the option
|
|
||||||
# description, for the right-hand side of the help output.
|
|
||||||
# ACTION-IF-YES is executed if the option is given without an argument
|
|
||||||
# (or "yes", which is the same); similar for ACTION-IF-NO.
|
|
||||||
|
|
||||||
AC_DEFUN([PGAC_ARG],
|
|
||||||
[
|
|
||||||
m4_case([$1],
|
|
||||||
|
|
||||||
enable, [
|
|
||||||
AC_ARG_ENABLE([$2], [AS_HELP_STRING([--]m4_if($3, -, disable, enable)[-$2]m4_if($3, -, , $3), [$4])], [
|
|
||||||
case [$]enableval in
|
|
||||||
yes)
|
|
||||||
m4_default([$5], :)
|
|
||||||
;;
|
|
||||||
no)
|
|
||||||
m4_default([$6], :)
|
|
||||||
;;
|
|
||||||
*)
|
|
||||||
$7
|
|
||||||
;;
|
|
||||||
esac
|
|
||||||
],
|
|
||||||
[$8])[]dnl AC_ARG_ENABLE
|
|
||||||
],
|
|
||||||
|
|
||||||
with, [
|
|
||||||
AC_ARG_WITH([$2], [AS_HELP_STRING([--]m4_if($3, -, without, with)[-$2]m4_if($3, -, , $3), [$4])], [
|
|
||||||
case [$]withval in
|
|
||||||
yes)
|
|
||||||
m4_default([$5], :)
|
|
||||||
;;
|
|
||||||
no)
|
|
||||||
m4_default([$6], :)
|
|
||||||
;;
|
|
||||||
*)
|
|
||||||
$7
|
|
||||||
;;
|
|
||||||
esac
|
|
||||||
],
|
|
||||||
[$8])[]dnl AC_ARG_WITH
|
|
||||||
],
|
|
||||||
|
|
||||||
[m4_fatal([first argument of $0 must be 'enable' or 'with', not '$1'])]
|
|
||||||
)
|
|
||||||
])# PGAC_ARG
|
|
||||||
|
|
||||||
|
|
||||||
# PGAC_ARG_BOOL(TYPE, NAME, DEFAULT, HELP-STRING-RHS,
|
|
||||||
# [ACTION-IF-YES], [ACTION-IF-NO])
|
|
||||||
# ---------------------------------------------------
|
|
||||||
# Accept a boolean option, that is, one that only takes yes or no.
|
|
||||||
# ("no" is equivalent to "disable" or "without"). DEFAULT is what
|
|
||||||
# should be done if the option is omitted; it should be "yes" or "no".
|
|
||||||
# (Consequently, one of ACTION-IF-YES and ACTION-IF-NO will always
|
|
||||||
# execute.)
|
|
||||||
|
|
||||||
AC_DEFUN([PGAC_ARG_BOOL],
|
|
||||||
[dnl The following hack is necessary because in a few instances this
|
|
||||||
dnl macro is called twice for the same option with different default
|
|
||||||
dnl values. But we only want it to appear once in the help. We achieve
|
|
||||||
dnl that by making the help string look the same, which is why we need to
|
|
||||||
dnl save the default that was passed in previously.
|
|
||||||
m4_define([_pgac_helpdefault], m4_ifdef([pgac_defined_$1_$2_bool], [m4_defn([pgac_defined_$1_$2_bool])], [$3]))dnl
|
|
||||||
PGAC_ARG([$1], [$2], [m4_if(_pgac_helpdefault, yes, -)], [$4], [$5], [$6],
|
|
||||||
[AC_MSG_ERROR([no argument expected for --$1-$2 option])],
|
|
||||||
[m4_case([$3],
|
|
||||||
yes, [pgac_arg_to_variable([$1], [$2])=yes
|
|
||||||
$5],
|
|
||||||
no, [pgac_arg_to_variable([$1], [$2])=no
|
|
||||||
$6],
|
|
||||||
[m4_fatal([third argument of $0 must be 'yes' or 'no', not '$3'])])])[]dnl
|
|
||||||
m4_define([pgac_defined_$1_$2_bool], [$3])dnl
|
|
||||||
])# PGAC_ARG_BOOL
|
|
||||||
|
|
||||||
|
|
||||||
# PGAC_ARG_REQ(TYPE, NAME, HELP-ARGNAME, HELP-STRING-RHS,
|
|
||||||
# [ACTION-IF-GIVEN], [ACTION-IF-NOT-GIVEN])
|
|
||||||
# -------------------------------------------------------
|
|
||||||
# This option will require an argument; "yes" or "no" will not be
|
|
||||||
# accepted. HELP-ARGNAME is a name for the argument for the help output.
|
|
||||||
|
|
||||||
AC_DEFUN([PGAC_ARG_REQ],
|
|
||||||
[PGAC_ARG([$1], [$2], [=$3], [$4],
|
|
||||||
[AC_MSG_ERROR([argument required for --$1-$2 option])],
|
|
||||||
[AC_MSG_ERROR([argument required for --$1-$2 option])],
|
|
||||||
[$5],
|
|
||||||
[$6])])# PGAC_ARG_REQ
|
|
||||||
|
|
||||||
|
|
||||||
# PGAC_ARG_OPTARG(TYPE, NAME, HELP-ARGNAME, HELP-STRING-RHS,
|
|
||||||
# [DEFAULT-ACTION], [ARG-ACTION],
|
|
||||||
# [ACTION-ENABLED], [ACTION-DISABLED])
|
|
||||||
# ----------------------------------------------------------
|
|
||||||
# This will create an option that behaves as follows: If omitted, or
|
|
||||||
# called with "no", then set the enable_variable to "no" and do
|
|
||||||
# nothing else. If called with "yes", then execute DEFAULT-ACTION. If
|
|
||||||
# called with argument, set enable_variable to "yes" and execute
|
|
||||||
# ARG-ACTION. Additionally, execute ACTION-ENABLED if we ended up with
|
|
||||||
# "yes" either way, else ACTION-DISABLED.
|
|
||||||
#
|
|
||||||
# The intent is to allow enabling a feature, and optionally pass an
|
|
||||||
# additional piece of information.
|
|
||||||
|
|
||||||
AC_DEFUN([PGAC_ARG_OPTARG],
|
|
||||||
[PGAC_ARG([$1], [$2], [@<:@=$3@:>@], [$4], [$5], [],
|
|
||||||
[pgac_arg_to_variable([$1], [$2])=yes
|
|
||||||
$6],
|
|
||||||
[pgac_arg_to_variable([$1], [$2])=no])
|
|
||||||
dnl Add this code only if there's a ACTION-ENABLED or ACTION-DISABLED.
|
|
||||||
m4_ifval([$7[]$8],
|
|
||||||
[
|
|
||||||
if test "[$]pgac_arg_to_variable([$1], [$2])" = yes; then
|
|
||||||
m4_default([$7], :)
|
|
||||||
m4_ifval([$8],
|
|
||||||
[else
|
|
||||||
$8
|
|
||||||
])[]dnl
|
|
||||||
fi
|
|
||||||
])[]dnl
|
|
||||||
])# PGAC_ARG_OPTARG
|
|
311
configure.ac
|
@ -1,311 +0,0 @@
|
||||||
# Citus autoconf input script.
|
|
||||||
#
|
|
||||||
# Converted into an actual configure script by autogen.sh. This
|
|
||||||
# conversion only has to be done when configure.in changes. To avoid
|
|
||||||
# everyone needing autoconf installed, the resulting files are checked
|
|
||||||
# into the SCM.
|
|
||||||
|
|
||||||
AC_INIT([Citus], [13.2devel])
|
|
||||||
AC_COPYRIGHT([Copyright (c) Citus Data, Inc.])
|
|
||||||
|
|
||||||
# we'll need sed and awk for some of the version commands
|
|
||||||
AC_PROG_SED
|
|
||||||
AC_PROG_AWK
|
|
||||||
|
|
||||||
# CITUS_NAME definition
|
|
||||||
AC_DEFINE_UNQUOTED(CITUS_NAME, "$PACKAGE_NAME", [Citus full name as a string])
|
|
||||||
|
|
||||||
case $PACKAGE_NAME in
|
|
||||||
'Citus Enterprise') citus_edition=enterprise ;;
|
|
||||||
Citus) citus_edition=community ;;
|
|
||||||
*) AC_MSG_ERROR([Unrecognized package name.]) ;;
|
|
||||||
esac
|
|
||||||
|
|
||||||
# CITUS_EDITION definition
|
|
||||||
AC_DEFINE_UNQUOTED(CITUS_EDITION, "$citus_edition", [Citus edition as a string])
|
|
||||||
|
|
||||||
# CITUS_MAJORVERSION definition
|
|
||||||
[CITUS_MAJORVERSION=`expr "$PACKAGE_VERSION" : '\([0-9][0-9]*\.[0-9][0-9]*\)'`]
|
|
||||||
AC_DEFINE_UNQUOTED(CITUS_MAJORVERSION, "$CITUS_MAJORVERSION", [Citus major version as a string])
|
|
||||||
|
|
||||||
# CITUS_VERSION definition
|
|
||||||
PGAC_ARG_REQ(with, extra-version, [STRING], [append STRING to version],
|
|
||||||
[CITUS_VERSION="$PACKAGE_VERSION$withval"],
|
|
||||||
[CITUS_VERSION="$PACKAGE_VERSION"])
|
|
||||||
AC_DEFINE_UNQUOTED(CITUS_VERSION, "$CITUS_VERSION", [Citus version as a string])
|
|
||||||
|
|
||||||
# CITUS_VERSION_NUM definition
|
|
||||||
# awk -F is a regex on some platforms, and not on others, so make "." a tab
|
|
||||||
[CITUS_VERSION_NUM="`echo "$PACKAGE_VERSION" | sed 's/[A-Za-z].*$//' |
|
|
||||||
tr '.' ' ' |
|
|
||||||
$AWK '{printf "%d%02d%02d", $1, $2, (NF >= 3) ? $3 : 0}'`"]
|
|
||||||
AC_DEFINE_UNQUOTED(CITUS_VERSION_NUM, $CITUS_VERSION_NUM, [Citus version as a number])
|
|
||||||
|
|
||||||
# CITUS_EXTENSIONVERSION definition
|
|
||||||
[CITUS_EXTENSIONVERSION="`grep '^default_version' $srcdir/src/backend/distributed/citus.control | cut -d\' -f2`"]
|
|
||||||
AC_DEFINE_UNQUOTED([CITUS_EXTENSIONVERSION], "$CITUS_EXTENSIONVERSION", [Extension version expected by this Citus build])
|
|
||||||
|
|
||||||
# Re-check for flex. That allows to compile citus against a postgres
|
|
||||||
# which was built without flex available (possible because generated
|
|
||||||
# files are included)
|
|
||||||
AC_PATH_PROG([FLEX], [flex])
|
|
||||||
|
|
||||||
# Locate pg_config binary
|
|
||||||
AC_ARG_VAR([PG_CONFIG], [Location to find pg_config for target PostgreSQL installation (default PATH)])
|
|
||||||
AC_ARG_VAR([PATH], [PATH for target PostgreSQL install pg_config])
|
|
||||||
|
|
||||||
if test -z "$PG_CONFIG"; then
|
|
||||||
AC_PATH_PROG(PG_CONFIG, pg_config)
|
|
||||||
fi
|
|
||||||
|
|
||||||
if test -z "$PG_CONFIG"; then
|
|
||||||
AC_MSG_ERROR([Could not find pg_config. Set PG_CONFIG or PATH.])
|
|
||||||
fi
|
|
||||||
|
|
||||||
# check we're building against a supported version of PostgreSQL
|
|
||||||
citusac_pg_config_version=$($PG_CONFIG --version 2>/dev/null)
|
|
||||||
version_num=$(echo "$citusac_pg_config_version"|
|
|
||||||
$SED -e 's/^PostgreSQL \([[0-9]]*\)\(\.[[0-9]]*\)\{0,1\}\(.*\)$/\1\2/')
|
|
||||||
|
|
||||||
# if PostgreSQL version starts with two digits, the major version is those digits
|
|
||||||
version_num=$(echo "$version_num"| $SED -e 's/^\([[0-9]]\{2\}\)\(.*\)$/\1/')
|
|
||||||
|
|
||||||
if test -z "$version_num"; then
|
|
||||||
AC_MSG_ERROR([Could not detect PostgreSQL version from pg_config.])
|
|
||||||
fi
|
|
||||||
|
|
||||||
PGAC_ARG_BOOL(with, pg-version-check, yes,
|
|
||||||
[do not check postgres version during configure])
|
|
||||||
AC_SUBST(with_pg_version_check)
|
|
||||||
|
|
||||||
if test "$with_pg_version_check" = no; then
|
|
||||||
AC_MSG_NOTICE([building against PostgreSQL $version_num (skipped compatibility check)])
|
|
||||||
elif test "$version_num" != '15' -a "$version_num" != '16' -a "$version_num" != '17'; then
|
|
||||||
AC_MSG_ERROR([Citus is not compatible with the detected PostgreSQL version ${version_num}.])
|
|
||||||
else
|
|
||||||
AC_MSG_NOTICE([building against PostgreSQL $version_num])
|
|
||||||
fi;
|
|
||||||
|
|
||||||
# Check whether we're building inside the source tree, if not, prepare
|
|
||||||
# the build directory.
|
|
||||||
if test "$srcdir" -ef '.' ; then
|
|
||||||
vpath_build=no
|
|
||||||
else
|
|
||||||
vpath_build=yes
|
|
||||||
_AS_ECHO_N([preparing build tree... ])
|
|
||||||
citusac_abs_top_srcdir=`cd "$srcdir" && pwd`
|
|
||||||
$SHELL "$citusac_abs_top_srcdir/prep_buildtree" "$citusac_abs_top_srcdir" "." \
|
|
||||||
|| AC_MSG_ERROR(failed)
|
|
||||||
AC_MSG_RESULT(done)
|
|
||||||
fi
|
|
||||||
AC_SUBST(vpath_build)
|
|
||||||
|
|
||||||
# Allow to overwrite the C compiler, default to the one postgres was
|
|
||||||
# compiled with. We don't want autoconf's default CFLAGS though, so save
|
|
||||||
# those.
|
|
||||||
SAVE_CFLAGS="$CFLAGS"
|
|
||||||
AC_PROG_CC([$($PG_CONFIG --cc)])
|
|
||||||
CFLAGS="$SAVE_CFLAGS"
|
|
||||||
|
|
||||||
host_guess=`${SHELL} $srcdir/config/config.guess`
|
|
||||||
|
|
||||||
# Create compiler version string
|
|
||||||
if test x"$GCC" = x"yes" ; then
|
|
||||||
cc_string=`${CC} --version | sed q`
|
|
||||||
case $cc_string in [[A-Za-z]]*) ;; *) cc_string="GCC $cc_string";; esac
|
|
||||||
elif test x"$SUN_STUDIO_CC" = x"yes" ; then
|
|
||||||
cc_string=`${CC} -V 2>&1 | sed q`
|
|
||||||
else
|
|
cc_string=$CC
fi

AC_CHECK_SIZEOF([void *])

AC_DEFINE_UNQUOTED(CITUS_VERSION_STR,
                   ["$PACKAGE_NAME $CITUS_VERSION on $host_guess, compiled by $cc_string, `expr $ac_cv_sizeof_void_p \* 8`-bit"],
                   [A string containing the version number, platform, and C compiler])

# Locate source and build directory of the postgres we're building
# against. Can't rely on either still being present, but e.g. optional
# test infrastructure can rely on it.
POSTGRES_SRCDIR=$(grep ^abs_top_srcdir $(dirname $($PG_CONFIG --pgxs))/../Makefile.global|cut -d ' ' -f3-)
POSTGRES_BUILDDIR=$(grep ^abs_top_builddir $(dirname $($PG_CONFIG --pgxs))/../Makefile.global|cut -d ' ' -f3-)


# check for a number of CFLAGS that make development easier

# CITUSAC_PROG_CC_CFLAGS_OPT
# -----------------------
# Given a string, check if the compiler supports the string as a
# command-line option. If it does, add the string to CFLAGS.
AC_DEFUN([CITUSAC_PROG_CC_CFLAGS_OPT],
[define([Ac_cachevar], [AS_TR_SH([citusac_cv_prog_cc_cflags_$1])])dnl
AC_CACHE_CHECK([whether $CC supports $1], [Ac_cachevar],
[citusac_save_CFLAGS=$CFLAGS
flag=$1
case $flag in -Wno*)
    flag=-W$(echo $flag | cut -c 6-)
esac
CFLAGS="$citusac_save_CFLAGS $flag"
ac_save_c_werror_flag=$ac_c_werror_flag
ac_c_werror_flag=yes
_AC_COMPILE_IFELSE([AC_LANG_PROGRAM()],
                   [Ac_cachevar=yes],
                   [Ac_cachevar=no])
ac_c_werror_flag=$ac_save_c_werror_flag
CFLAGS="$citusac_save_CFLAGS"])
if test x"$Ac_cachevar" = x"yes"; then
  CITUS_CFLAGS="$CITUS_CFLAGS $1"
fi
undefine([Ac_cachevar])dnl
])# CITUSAC_PROG_CC_CFLAGS_OPT

CITUSAC_PROG_CC_CFLAGS_OPT([-std=gnu99])
CITUSAC_PROG_CC_CFLAGS_OPT([-Wall])
CITUSAC_PROG_CC_CFLAGS_OPT([-Wextra])
# disarm options included in the above, which are too noisy for now
CITUSAC_PROG_CC_CFLAGS_OPT([-Wno-unused-parameter])
CITUSAC_PROG_CC_CFLAGS_OPT([-Wno-sign-compare])
CITUSAC_PROG_CC_CFLAGS_OPT([-Wno-missing-field-initializers])
CITUSAC_PROG_CC_CFLAGS_OPT([-Wno-clobbered])
CITUSAC_PROG_CC_CFLAGS_OPT([-Wno-gnu-variable-sized-type-not-at-end])
CITUSAC_PROG_CC_CFLAGS_OPT([-Wno-declaration-after-statement])
# And add a few extra warnings
CITUSAC_PROG_CC_CFLAGS_OPT([-Wendif-labels])
CITUSAC_PROG_CC_CFLAGS_OPT([-Wmissing-format-attribute])
CITUSAC_PROG_CC_CFLAGS_OPT([-Wmissing-declarations])
CITUSAC_PROG_CC_CFLAGS_OPT([-Wmissing-prototypes])
CITUSAC_PROG_CC_CFLAGS_OPT([-Wshadow])
CITUSAC_PROG_CC_CFLAGS_OPT([-Werror=vla]) # visual studio does not support these
CITUSAC_PROG_CC_CFLAGS_OPT([-Werror=implicit-int])
CITUSAC_PROG_CC_CFLAGS_OPT([-Werror=implicit-function-declaration])
CITUSAC_PROG_CC_CFLAGS_OPT([-Werror=return-type])
# Security flags
# Flags taken from: https://liquid.microsoft.com/Web/Object/Read/ms.security/Requirements/Microsoft.Security.SystemsADM.10203#guide
# We do not enforce the following flag because it is only available on GCC>=8
CITUSAC_PROG_CC_CFLAGS_OPT([-fstack-clash-protection])

#
# --enable-coverage enables generation of code coverage metrics with gcov
#
AC_ARG_ENABLE([coverage], AS_HELP_STRING([--enable-coverage], [build with coverage testing instrumentation]))
if test "$enable_coverage" = yes; then
  CITUS_CFLAGS="$CITUS_CFLAGS -O0 -g --coverage"
  CITUS_CPPFLAGS="$CITUS_CPPFLAGS -DNDEBUG"
  CITUS_LDFLAGS="$CITUS_LDFLAGS --coverage"
fi

#
# libcurl
#
PGAC_ARG_BOOL(with, libcurl, yes,
              [do not use libcurl for anonymous statistics collection],
              [AC_DEFINE([HAVE_LIBCURL], 1, [Define to 1 to build with libcurl support. (--with-libcurl)])])

if test "$with_libcurl" = yes; then
  AC_CHECK_LIB(curl, curl_global_init, [],
               [AC_MSG_ERROR([libcurl not found
If you have libcurl already installed, see config.log for details on the
failure. It is possible the compiler isn't looking in the proper directory.
Use --without-libcurl to disable anonymous statistics collection.])])
  AC_CHECK_HEADER(curl/curl.h, [], [AC_MSG_ERROR([libcurl header not found
If you have libcurl already installed, see config.log for details on the
failure. It is possible the compiler isn't looking in the proper directory.
Use --without-libcurl to disable libcurl support.])])
fi

# REPORTS_BASE_URL definition
PGAC_ARG_REQ(with, reports-hostname, [HOSTNAME],
             [Use HOSTNAME as hostname for statistics collection and update checks],
             [REPORTS_BASE_URL="https://${withval}"],
             [REPORTS_BASE_URL="https://reports.citusdata.com"])
AC_DEFINE_UNQUOTED(REPORTS_BASE_URL, "$REPORTS_BASE_URL",
                   [Base URL for statistics collection and update checks])

#
# LZ4
#
PGAC_ARG_BOOL(with, lz4, yes,
              [do not use lz4],
              [AC_DEFINE([HAVE_CITUS_LIBLZ4], 1, [Define to 1 to build with lz4 support. (--with-lz4)])])
AC_SUBST(with_lz4)

if test "$with_lz4" = yes; then
  AC_CHECK_LIB(lz4, LZ4_compress_default, [],
               [AC_MSG_ERROR([lz4 library not found
If you have lz4 installed, see config.log for details on the
failure. It is possible the compiler isn't looking in the proper directory.
Use --without-lz4 to disable lz4 support.])])
  AC_CHECK_HEADER(lz4.h, [], [AC_MSG_ERROR([lz4 header not found
If you have lz4 already installed, see config.log for details on the
failure. It is possible the compiler isn't looking in the proper directory.
Use --without-lz4 to disable lz4 support.])])
fi

#
# ZSTD
#
PGAC_ARG_BOOL(with, zstd, yes,
              [do not use zstd])
AC_SUBST(with_zstd)

if test "$with_zstd" = yes; then
  AC_CHECK_LIB(zstd, ZSTD_decompress, [],
               [AC_MSG_ERROR([zstd library not found
If you have zstd installed, see config.log for details on the
failure. It is possible the compiler isn't looking in the proper directory.
Use --without-zstd to disable zstd support.])])
  AC_CHECK_HEADER(zstd.h, [], [AC_MSG_ERROR([zstd header not found
If you have zstd already installed, see config.log for details on the
failure. It is possible the compiler isn't looking in the proper directory.
Use --without-zstd to disable zstd support.])])
fi

PGAC_ARG_BOOL(with, security-flags, no,
              [use security flags])
AC_SUBST(with_security_flags)

if test "$with_security_flags" = yes; then
  # Flags taken from: https://liquid.microsoft.com/Web/Object/Read/ms.security/Requirements/Microsoft.Security.SystemsADM.10203#guide

  # We always want to have some compiler flags for security concerns.
  SECURITY_CFLAGS="-fstack-protector-strong -D_FORTIFY_SOURCE=2 -O2 -z noexecstack -fpic -shared -Wl,-z,relro -Wl,-z,now -Wformat -Wformat-security -Werror=format-security"
  CITUS_CFLAGS="$CITUS_CFLAGS $SECURITY_CFLAGS"
  AC_MSG_NOTICE([Blindly added security flags for linker: $SECURITY_CFLAGS])

  # We always want to have some clang flags for security concerns.
  # This doesn't include "-Wl,-z,relro -Wl,-z,now" on purpose, because bitcode is not linked.
  # This doesn't include -fsanitize=cfi because it breaks builds on many distros including
  # Debian/Buster, Debian/Stretch, Ubuntu/Bionic, Ubuntu/Xenial and EL7.
  SECURITY_BITCODE_CFLAGS="-fsanitize=safe-stack -fstack-protector-strong -flto -fPIC -Wformat -Wformat-security -Werror=format-security"
  CITUS_BITCODE_CFLAGS="$CITUS_BITCODE_CFLAGS $SECURITY_BITCODE_CFLAGS"
  AC_MSG_NOTICE([Blindly added security flags for llvm: $SECURITY_BITCODE_CFLAGS])

  AC_MSG_WARN([If you run into issues during linking or bitcode compilation, you can use --without-security-flags.])
fi

# Check if git is installed; when installed the gitref of the checkout will be baked into the application
AC_PATH_PROG(GIT_BIN, git)
AC_CHECK_FILE(.git,[HAS_DOTGIT=yes], [HAS_DOTGIT=])

AC_SUBST(CITUS_CFLAGS, "$CITUS_CFLAGS")
AC_SUBST(CITUS_BITCODE_CFLAGS, "$CITUS_BITCODE_CFLAGS")
AC_SUBST(CITUS_CPPFLAGS, "$CITUS_CPPFLAGS")
AC_SUBST(CITUS_LDFLAGS, "$LIBS $CITUS_LDFLAGS")
AC_SUBST(POSTGRES_SRCDIR, "$POSTGRES_SRCDIR")
AC_SUBST(POSTGRES_BUILDDIR, "$POSTGRES_BUILDDIR")
AC_SUBST(HAS_DOTGIT, "$HAS_DOTGIT")

AC_CONFIG_FILES([Makefile.global])
AC_CONFIG_HEADERS([src/include/citus_config.h] [src/include/citus_version.h])
AH_TOP([
/*
 * citus_config.h.in is generated by autoconf/autoheader and
 * converted into citus_config.h by configure. Include when code needs to
 * depend on determinations made by configure.
 *
 * Do not manually edit!
 */
])
AC_OUTPUT
@ -0,0 +1,117 @@
# Citus autoconf input script.
#
# Converted into an actual configure script by autogen.sh. This
# conversion only has to be done when configure.in changes. To avoid
# everyone needing autoconf installed, the resulting files are checked
# into the SCM.

AC_INIT([Citus], [5.0], [], [citus], [])
AC_COPYRIGHT([Copyright (c) 2012-2016, Citus Data, Inc.])

AC_PROG_SED

# Re-check for flex. That allows to compile citus against a postgres
# which was built without flex available (possible because generated
# files are included)
AC_PATH_PROG([FLEX], [flex])

# Locate pg_config binary
AC_ARG_VAR([PG_CONFIG], [Location to find pg_config for target PostgreSQL installation (default PATH)])
AC_ARG_VAR([PATH], [PATH for target PostgreSQL install pg_config])

if test -z "$PG_CONFIG"; then
  AC_PATH_PROG(PG_CONFIG, pg_config)
fi

if test -z "$PG_CONFIG"; then
  AC_MSG_ERROR([Could not find pg_config. Set PG_CONFIG or PATH.])
fi

# check we're building against a supported version of PostgreSQL
citusac_pg_config_version=$($PG_CONFIG --version 2>/dev/null)
version_num=$(echo "$citusac_pg_config_version"|
  $SED -e 's/^PostgreSQL \([[0-9]]*\)\.\([[0-9]]*\)\([[a-zA-Z0-9.]]*\)$/\1.\2/')

if test -z "$version_num"; then
  AC_MSG_ERROR([Could not detect PostgreSQL version from pg_config.])
fi

if test "$version_num" != '9.5'; then
  AC_MSG_ERROR([Citus is not compatible with the detected PostgreSQL version ${version_num}.])
else
  AC_MSG_NOTICE([building against PostgreSQL $version_num])
fi;

# Check whether we're building inside the source tree; if not, prepare
# the build directory.
if test "$srcdir" -ef '.' ; then
  vpath_build=no
else
  vpath_build=yes
  _AS_ECHO_N([preparing build tree... ])
  citusac_abs_top_srcdir=`cd "$srcdir" && pwd`
  $SHELL "$citusac_abs_top_srcdir/prep_buildtree" "$citusac_abs_top_srcdir" "." \
    || AC_MSG_ERROR(failed)
  AC_MSG_RESULT(done)
fi
AC_SUBST(vpath_build)

# Allow to overwrite the C compiler, default to the one postgres was
# compiled with. We don't want autoconf's default CFLAGS though, so save
# those.
SAVE_CFLAGS="$CFLAGS"
AC_PROG_CC([$($PG_CONFIG --cc)])
CFLAGS="$SAVE_CFLAGS"

# check for a number of CFLAGS that make development easier

# CITUSAC_PROG_CC_CFLAGS_OPT
# -----------------------
# Given a string, check if the compiler supports the string as a
# command-line option. If it does, add the string to CFLAGS.
AC_DEFUN([CITUSAC_PROG_CC_CFLAGS_OPT],
[define([Ac_cachevar], [AS_TR_SH([citusac_cv_prog_cc_cflags_$1])])dnl
AC_CACHE_CHECK([whether $CC supports $1], [Ac_cachevar],
[citusac_save_CFLAGS=$CFLAGS
CFLAGS="$citusac_save_CFLAGS $1"
ac_save_c_werror_flag=$ac_c_werror_flag
ac_c_werror_flag=yes
_AC_COMPILE_IFELSE([AC_LANG_PROGRAM()],
                   [Ac_cachevar=yes],
                   [Ac_cachevar=no])
ac_c_werror_flag=$ac_save_c_werror_flag
CFLAGS="$citusac_save_CFLAGS"])
if test x"$Ac_cachevar" = x"yes"; then
  CITUS_CFLAGS="$CITUS_CFLAGS $1"
fi
undefine([Ac_cachevar])dnl
])# CITUSAC_PROG_CC_CFLAGS_OPT

CITUSAC_PROG_CC_CFLAGS_OPT([-Wall])
CITUSAC_PROG_CC_CFLAGS_OPT([-Wextra])
# disarm options included in the above, which are too noisy for now
CITUSAC_PROG_CC_CFLAGS_OPT([-Wno-unused-parameter])
CITUSAC_PROG_CC_CFLAGS_OPT([-Wno-sign-compare])
CITUSAC_PROG_CC_CFLAGS_OPT([-Wno-missing-field-initializers])
CITUSAC_PROG_CC_CFLAGS_OPT([-Wno-clobbered])
# And add a few extra warnings
CITUSAC_PROG_CC_CFLAGS_OPT([-Wdeclaration-after-statement])
CITUSAC_PROG_CC_CFLAGS_OPT([-Wendif-labels])
CITUSAC_PROG_CC_CFLAGS_OPT([-Wmissing-format-attribute])
CITUSAC_PROG_CC_CFLAGS_OPT([-Wmissing-declarations])
CITUSAC_PROG_CC_CFLAGS_OPT([-Wmissing-prototypes])

AC_SUBST(CITUS_CFLAGS, "$CITUS_CFLAGS")

AC_CONFIG_FILES([Makefile.global])
AC_CONFIG_HEADERS([src/include/citus_config.h])
AH_TOP([
/*
 * citus_config.h.in is generated by autoconf/autoheader and
 * converted into citus_config.h by configure. Include when code needs to
 * depend on determinations made by configure.
 *
 * Do not manually edit!
 */
])
AC_OUTPUT
@ -1,40 +0,0 @@
[tool.isort]
profile = 'black'

[tool.black]
include = '(src/test/regress/bin/diff-filter|\.pyi?|\.ipynb)$'

[tool.pytest.ini_options]
addopts = [
    "--import-mode=importlib",
    "--showlocals",
    "--tb=short",
]
pythonpath = 'src/test/regress/citus_tests'
asyncio_mode = 'auto'

# Make test discovery quicker from the root dir of the repo
testpaths = ['src/test/regress/citus_tests/test']

# Make test discovery quicker from other directories than root directory
norecursedirs = [
    '*.egg',
    '.*',
    'build',
    'venv',
    'ci',
    'vendor',
    'backend',
    'bin',
    'include',
    'tmp_*',
    'results',
    'expected',
    'sql',
    'spec',
    'data',
    '__pycache__',
]

# Don't find files with test at the end such as run_test.py
python_files = ['test_*.py']
@ -1,25 +0,0 @@
* whitespace=space-before-tab,trailing-space
*.[chly] whitespace=space-before-tab,trailing-space,indent-with-non-tab,tabwidth=4
*.dsl whitespace=space-before-tab,trailing-space,tab-in-indent
*.patch -whitespace
*.pl whitespace=space-before-tab,trailing-space,tabwidth=4
*.po whitespace=space-before-tab,trailing-space,tab-in-indent,-blank-at-eof
*.sgml whitespace=space-before-tab,trailing-space,tab-in-indent,-blank-at-eol
*.x[ms]l whitespace=space-before-tab,trailing-space,tab-in-indent

# Avoid confusing ASCII underlines with leftover merge conflict markers
README conflict-marker-size=32
README.* conflict-marker-size=32

# Certain data files that contain special whitespace, and other special cases
*.data -whitespace

# Test output files that contain extra whitespace
*.out -whitespace

# These files are maintained or generated elsewhere. We take them as is.
configure -whitespace

# all C files (implementation and header) use our style...
*.[ch] citus-style
@ -1,3 +0,0 @@
# The directory used to store columnar sql files after pre-processing them
# with 'cpp' at build time, see src/backend/columnar/Makefile.
/build/
@ -1,60 +0,0 @@
citus_subdir = src/backend/columnar
citus_top_builddir = ../../..
safestringlib_srcdir = $(citus_abs_top_srcdir)/vendor/safestringlib
SUBDIRS = . safeclib
SUBDIRS +=
ENSURE_SUBDIRS_EXIST := $(shell mkdir -p $(SUBDIRS))
OBJS += \
	$(patsubst $(citus_abs_srcdir)/%.c,%.o,$(foreach dir,$(SUBDIRS), $(sort $(wildcard $(citus_abs_srcdir)/$(dir)/*.c))))

MODULE_big = citus_columnar
EXTENSION = citus_columnar

template_sql_files = $(patsubst $(citus_abs_srcdir)/%,%,$(wildcard $(citus_abs_srcdir)/sql/*.sql))
template_downgrade_sql_files = $(patsubst $(citus_abs_srcdir)/sql/downgrades/%,%,$(wildcard $(citus_abs_srcdir)/sql/downgrades/*.sql))
generated_sql_files = $(patsubst %,$(citus_abs_srcdir)/build/%,$(template_sql_files))
generated_downgrade_sql_files += $(patsubst %,$(citus_abs_srcdir)/build/sql/%,$(template_downgrade_sql_files))

DATA_built = $(generated_sql_files)

PG_CPPFLAGS += -I$(libpq_srcdir) -I$(safestringlib_srcdir)/include

include $(citus_top_builddir)/Makefile.global

SQL_DEPDIR=.deps/sql
SQL_BUILDDIR=build/sql

$(generated_sql_files): $(citus_abs_srcdir)/build/%: %
	@mkdir -p $(citus_abs_srcdir)/$(SQL_DEPDIR) $(citus_abs_srcdir)/$(SQL_BUILDDIR)
	@# -MF is used to store dependency files(.Po) in another directory for separation
	@# -MT is used to change the target of the rule emitted by dependency generation.
	@# -P is used to inhibit generation of linemarkers in the output from the preprocessor.
	@# -undef is used to not predefine any system-specific or GCC-specific macros.
	@# `man cpp` for further information
	cd $(citus_abs_srcdir) && cpp -undef -w -P -MMD -MP -MF$(SQL_DEPDIR)/$(*F).Po -MT$@ $< > $@

$(generated_downgrade_sql_files): $(citus_abs_srcdir)/build/sql/%: sql/downgrades/%
	@mkdir -p $(citus_abs_srcdir)/$(SQL_DEPDIR) $(citus_abs_srcdir)/$(SQL_BUILDDIR)
	@# -MF is used to store dependency files(.Po) in another directory for separation
	@# -MT is used to change the target of the rule emitted by dependency generation.
	@# -P is used to inhibit generation of linemarkers in the output from the preprocessor.
	@# -undef is used to not predefine any system-specific or GCC-specific macros.
	@# `man cpp` for further information
	cd $(citus_abs_srcdir) && cpp -undef -w -P -MMD -MP -MF$(SQL_DEPDIR)/$(*F).Po -MT$@ $< > $@

.PHONY: install install-downgrades install-all

cleanup-before-install:
	rm -f $(DESTDIR)$(datadir)/$(datamoduledir)/citus_columnar.control
	rm -f $(DESTDIR)$(datadir)/$(datamoduledir)/columnar--*
	rm -f $(DESTDIR)$(datadir)/$(datamoduledir)/citus_columnar--*

install: cleanup-before-install

# install and install-downgrades should be run sequentially
install-all: install
	$(MAKE) install-downgrades

install-downgrades: $(generated_downgrade_sql_files)
	$(INSTALL_DATA) $(generated_downgrade_sql_files) '$(DESTDIR)$(datadir)/$(datamoduledir)/'
@ -1,321 +0,0 @@
# Introduction

Citus Columnar offers a per-table option for columnar storage to
reduce IO requirements through compression and projection pushdown.

# Design Trade-Offs

Existing PostgreSQL row tables work well for OLTP:

* Support `UPDATE`/`DELETE` efficiently
* Efficient single-tuple lookups

The Citus Columnar tables work best for analytic or DW workloads:

* Compression
* Doesn't read unnecessary columns
* Efficient `VACUUM`

# Next generation of cstore_fdw

Citus Columnar is the next generation of
[cstore_fdw](https://github.com/citusdata/cstore_fdw/).

Benefits of Citus Columnar over cstore_fdw:

* Citus Columnar is based on the [Table Access Method
  API](https://www.postgresql.org/docs/current/tableam.html), which
  allows it to behave exactly like an ordinary heap (row) table for
  most operations.
* Supports Write-Ahead Log (WAL).
* Supports ``ROLLBACK``.
* Supports physical replication.
* Supports recovery, including Point-In-Time Restore (PITR).
* Supports ``pg_dump`` and ``pg_upgrade`` without the need for special
  options or extra steps.
* Better user experience; simple ``USING`` clause.
* Supports more features that work on ordinary heap (row) tables.

# Limitations

* Append-only (no ``UPDATE``/``DELETE`` support)
* No space reclamation (e.g. rolled-back transactions may still
  consume disk space)
* No bitmap index scans
* No tidscans
* No sample scans
* No TOAST support (large values supported inline)
* No support for [``ON
  CONFLICT``](https://www.postgresql.org/docs/12/sql-insert.html#SQL-ON-CONFLICT)
  statements (except ``DO NOTHING`` actions with no target specified).
* No support for tuple locks (``SELECT ... FOR SHARE``, ``SELECT
  ... FOR UPDATE``)
* No support for serializable isolation level
* Support for PostgreSQL server versions 12+ only
* No support for foreign keys
* No support for logical decoding
* No support for intra-node parallel scans
* No support for ``AFTER ... FOR EACH ROW`` triggers
* No `UNLOGGED` columnar tables

Future iterations will incrementally lift the limitations listed above.

# User Experience

Create a Columnar table by specifying ``USING columnar`` when creating
the table.

```sql
CREATE TABLE my_columnar_table
(
    id INT,
    i1 INT,
    i2 INT8,
    n NUMERIC,
    t TEXT
) USING columnar;
```

Insert data into the table and read from it like normal (subject to
the limitations listed above).

To see internal statistics about the table, use ``VACUUM
VERBOSE``. Note that ``VACUUM`` (without ``FULL``) is much faster on a
columnar table, because it scans only the metadata, and not the actual
data.
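
For example, here is a minimal sketch against the `my_columnar_table` definition above; the exact figures that ``VACUUM VERBOSE`` prints will depend on your data and on which compression libraries were compiled in:

```sql
-- Load a little data and query it as usual.
INSERT INTO my_columnar_table
  SELECT g, g % 10, g * 2, g / 3.0, 'row ' || g
  FROM generate_series(1, 1000) AS g;

SELECT count(*), avg(i1) FROM my_columnar_table;

-- Prints stripe/chunk counts and the per-chunk compression rate.
VACUUM VERBOSE my_columnar_table;
```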

## Options

Set options using:

```sql
ALTER TABLE my_columnar_table SET
  (columnar.compression = none, columnar.stripe_row_limit = 10000);
```

The following options are available (a combined example follows the list):

* **columnar.compression**: `[none|pglz|zstd|lz4|lz4hc]` - set the compression type
  for _newly-inserted_ data. Existing data will not be
  recompressed/decompressed. The default value is `zstd` (if support
  has been compiled in).
* **columnar.compression_level**: ``<integer>`` - Sets compression level. Valid
  settings are from 1 through 19. If the compression method does not
  support the level chosen, the closest level will be selected
  instead.
* **columnar.stripe_row_limit**: ``<integer>`` - the maximum number of rows per
  stripe for _newly-inserted_ data. Existing stripes of data will not
  be changed and may have more rows than this maximum value. The
  default value is `150000`.
* **columnar.chunk_group_row_limit**: ``<integer>`` - the maximum number of rows per
  chunk for _newly-inserted_ data. Existing chunks of data will not be
  changed and may have more rows than this maximum value. The default
  value is `10000`.
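
A single ``ALTER TABLE ... SET`` can combine several of these options. This is only a sketch; it assumes zstd support was compiled in, and it only affects data inserted afterwards:

```sql
-- Favor stronger compression and smaller stripes for data inserted from now on;
-- existing stripes and chunks are left untouched.
ALTER TABLE my_columnar_table SET
  (columnar.compression = zstd,
   columnar.compression_level = 10,
   columnar.stripe_row_limit = 100000,
   columnar.chunk_group_row_limit = 5000);
```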

View options for all tables with:

```sql
SELECT * FROM columnar.options;
```

You can also adjust options with a `SET` command of one of the
following GUCs:

* `columnar.compression`
* `columnar.compression_level`
* `columnar.stripe_row_limit`
* `columnar.chunk_group_row_limit`

GUCs only affect newly-created *tables*, not any newly-created
*stripes* on an existing table.
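
A sketch of the GUC route (it assumes lz4 support was compiled in, and the `events` table name is purely illustrative):

```sql
-- Applies to tables created later in this session, not to existing tables.
SET columnar.compression TO 'lz4';
SET columnar.chunk_group_row_limit TO 5000;

CREATE TABLE events (event_id INT8, payload TEXT) USING columnar;
```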

## Partitioning

Columnar tables can be used as partitions; and a partitioned table may
be made up of any combination of row and columnar partitions.

```sql
CREATE TABLE parent(ts timestamptz, i int, n numeric, s text)
  PARTITION BY RANGE (ts);

-- columnar partition
CREATE TABLE p0 PARTITION OF parent
  FOR VALUES FROM ('2020-01-01') TO ('2020-02-01')
  USING COLUMNAR;
-- columnar partition
CREATE TABLE p1 PARTITION OF parent
  FOR VALUES FROM ('2020-02-01') TO ('2020-03-01')
  USING COLUMNAR;
-- row partition
CREATE TABLE p2 PARTITION OF parent
  FOR VALUES FROM ('2020-03-01') TO ('2020-04-01');

INSERT INTO parent VALUES ('2020-01-15', 10, 100, 'one thousand'); -- columnar
INSERT INTO parent VALUES ('2020-02-15', 20, 200, 'two thousand'); -- columnar
INSERT INTO parent VALUES ('2020-03-15', 30, 300, 'three thousand'); -- row
```

When performing operations on a partitioned table with a mix of row
and columnar partitions, take note of the following behaviors for
operations that are supported on row tables but not columnar
(e.g. ``UPDATE``, ``DELETE``, tuple locks, etc.); a short example
follows the list:

* If the operation is targeted at a specific row partition
  (e.g. ``UPDATE p2 SET i = i + 1``), it will succeed; if targeted at
  a specific columnar partition (e.g. ``UPDATE p1 SET i = i + 1``),
  it will fail.
* If the operation is targeted at the partitioned table and has a
  ``WHERE`` clause that excludes all columnar partitions
  (e.g. ``UPDATE parent SET i = i + 1 WHERE ts = '2020-03-15'``), it
  will succeed.
* If the operation is targeted at the partitioned table, but does not
  exclude all columnar partitions, it will fail; even if the actual
  data to be updated only affects row tables (e.g. ``UPDATE parent SET
  i = i + 1 WHERE n = 300``).
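
The following sketch illustrates those three cases on the partitions created above:

```sql
-- Targets only the row partition: succeeds.
UPDATE p2 SET i = i + 1;

-- The WHERE clause prunes away both columnar partitions: succeeds.
UPDATE parent SET i = i + 1 WHERE ts = '2020-03-15';

-- Cannot exclude the columnar partitions, so this errors out,
-- even though the matching row lives in the row partition.
UPDATE parent SET i = i + 1 WHERE n = 300;
```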

Note that Citus Columnar supports `btree` and `hash` indexes (and
the constraints requiring them) but does not support `gist`, `gin`,
`spgist` and `brin` indexes.
For this reason, if some partitions are columnar and the index is
not supported by Citus Columnar, then it's impossible to create indexes
on the partitioned (parent) table directly. In that case, you need to
create the index on the individual row partitions. Similarly for the
constraints that require indexes, e.g.:

```sql
CREATE INDEX p2_ts_idx ON p2 (ts);
CREATE UNIQUE INDEX p2_i_unique ON p2 (i);
ALTER TABLE p2 ADD UNIQUE (n);
```

## Converting Between Row and Columnar

Note: ensure that you understand any advanced features that may be
used with the table before converting it (e.g. row-level security,
storage options, constraints, inheritance, etc.), and ensure that they
are reproduced in the new table or partition appropriately. ``LIKE``,
used below, is a shorthand that works only in simple cases.

```sql
CREATE TABLE my_table(i INT8 DEFAULT '7');
INSERT INTO my_table VALUES(1);
-- convert to columnar
SELECT alter_table_set_access_method('my_table', 'columnar');
-- back to row
SELECT alter_table_set_access_method('my_table', 'heap');
```

# Performance Microbenchmark

*Important*: This microbenchmark is not intended to represent any real
workload. Compression ratios, and therefore performance, will depend
heavily on the specific workload. This is only for the purpose of
illustrating a "columnar friendly" contrived workload that showcases
the benefits of columnar.

## Schema

```sql
CREATE TABLE perf_row(
    id INT8,
    ts TIMESTAMPTZ,
    customer_id INT8,
    vendor_id INT8,
    name TEXT,
    description TEXT,
    value NUMERIC,
    quantity INT4
);

CREATE TABLE perf_columnar(LIKE perf_row) USING COLUMNAR;
```

## Data

```sql
CREATE OR REPLACE FUNCTION random_words(n INT4) RETURNS TEXT LANGUAGE sql AS $$
  WITH words(w) AS (
    SELECT ARRAY['zero','one','two','three','four','five','six','seven','eight','nine','ten']
  ),
  random (word) AS (
    SELECT w[(random()*array_length(w, 1))::int] FROM generate_series(1, $1) AS i, words
  )
  SELECT string_agg(word, ' ') FROM random;
$$;
```

```sql
INSERT INTO perf_row
  SELECT
    g, -- id
    '2020-01-01'::timestamptz + ('1 minute'::interval * g), -- ts
    (random() * 1000000)::INT4, -- customer_id
    (random() * 100)::INT4, -- vendor_id
    random_words(7), -- name
    random_words(100), -- description
    (random() * 100000)::INT4/100.0, -- value
    (random() * 100)::INT4 -- quantity
  FROM generate_series(1,75000000) g;

INSERT INTO perf_columnar SELECT * FROM perf_row;
```

## Compression Ratio

```
=> SELECT pg_total_relation_size('perf_row')::numeric/pg_total_relation_size('perf_columnar') AS compression_ratio;
 compression_ratio
--------------------
 5.3958044063457513
(1 row)
```

The overall compression ratio of the columnar table, versus the same data
stored with row storage, is **5.4X**.

```
=> VACUUM VERBOSE perf_columnar;
INFO:  statistics for "perf_columnar":
storage id: 10000000000
total file size: 8761368576, total data size: 8734266196
compression rate: 5.01x
total row count: 75000000, stripe count: 500, average rows per stripe: 150000
chunk count: 60000, containing data for dropped columns: 0, zstd compressed: 60000
```

``VACUUM VERBOSE`` reports a smaller compression ratio, because it
only averages the compression ratio of the individual chunks, and does
not account for the metadata savings of the columnar format.

## System

* Azure VM: Standard D2s v3 (2 vcpus, 8 GiB memory)
* Linux (ubuntu 18.04)
* Data Drive: Standard HDD (512GB, 500 IOPS Max, 60 MB/s Max)
* PostgreSQL 13 (``--with-llvm``, ``--with-python``)
* ``shared_buffers = 128MB``
* ``max_parallel_workers_per_gather = 0``
* ``jit = on``

Note: because this was run on a system with enough physical memory to
hold a substantial fraction of the table, the IO benefits of columnar
won't be entirely realized by the query runtime unless the data size
is substantially increased.

## Query

```sql
-- OFFSET 1000 so that no rows are returned, and we collect only timings

SELECT vendor_id, SUM(quantity) FROM perf_row GROUP BY vendor_id OFFSET 1000;
SELECT vendor_id, SUM(quantity) FROM perf_row GROUP BY vendor_id OFFSET 1000;
SELECT vendor_id, SUM(quantity) FROM perf_row GROUP BY vendor_id OFFSET 1000;
SELECT vendor_id, SUM(quantity) FROM perf_columnar GROUP BY vendor_id OFFSET 1000;
SELECT vendor_id, SUM(quantity) FROM perf_columnar GROUP BY vendor_id OFFSET 1000;
SELECT vendor_id, SUM(quantity) FROM perf_columnar GROUP BY vendor_id OFFSET 1000;
```

Timing (median of three runs):

* row: 436s
* columnar: 16s
* speedup: **27X**
@ -1,6 +0,0 @@
# Columnar extension
comment = 'Citus Columnar extension'
default_version = '12.2-1'
module_pathname = '$libdir/citus_columnar'
relocatable = false
schema = pg_catalog
@ -1,169 +0,0 @@
/*-------------------------------------------------------------------------
 *
 * columnar.c
 *
 * This file contains...
 *
 * Copyright (c) 2016, Citus Data, Inc.
 *
 * $Id$
 *
 *-------------------------------------------------------------------------
 */

#include <sys/stat.h>
#include <unistd.h>

#include "postgres.h"

#include "miscadmin.h"

#include "utils/guc.h"
#include "utils/rel.h"

#include "citus_version.h"

#include "columnar/columnar.h"
#include "columnar/columnar_tableam.h"

/* Default values for option parameters */
#define DEFAULT_STRIPE_ROW_COUNT 150000
#define DEFAULT_CHUNK_ROW_COUNT 10000

#if HAVE_LIBZSTD
#define DEFAULT_COMPRESSION_TYPE COMPRESSION_ZSTD
#elif HAVE_CITUS_LIBLZ4
#define DEFAULT_COMPRESSION_TYPE COMPRESSION_LZ4
#else
#define DEFAULT_COMPRESSION_TYPE COMPRESSION_PG_LZ
#endif

int columnar_compression = DEFAULT_COMPRESSION_TYPE;
int columnar_stripe_row_limit = DEFAULT_STRIPE_ROW_COUNT;
int columnar_chunk_group_row_limit = DEFAULT_CHUNK_ROW_COUNT;
int columnar_compression_level = 3;

static const struct config_enum_entry columnar_compression_options[] =
{
	{ "none", COMPRESSION_NONE, false },
	{ "pglz", COMPRESSION_PG_LZ, false },
#if HAVE_CITUS_LIBLZ4
	{ "lz4", COMPRESSION_LZ4, false },
#endif
#if HAVE_LIBZSTD
	{ "zstd", COMPRESSION_ZSTD, false },
#endif
	{ NULL, 0, false }
};

void
columnar_init(void)
{
	columnar_init_gucs();
	columnar_tableam_init();
}


void
columnar_init_gucs()
{
	DefineCustomEnumVariable("columnar.compression",
							 "Compression type for columnar.",
							 NULL,
							 &columnar_compression,
							 DEFAULT_COMPRESSION_TYPE,
							 columnar_compression_options,
							 PGC_USERSET,
							 0,
							 NULL,
							 NULL,
							 NULL);

	DefineCustomIntVariable("columnar.compression_level",
							"Compression level to be used with zstd.",
							NULL,
							&columnar_compression_level,
							3,
							COMPRESSION_LEVEL_MIN,
							COMPRESSION_LEVEL_MAX,
							PGC_USERSET,
							0,
							NULL,
							NULL,
							NULL);

	DefineCustomIntVariable("columnar.stripe_row_limit",
							"Maximum number of tuples per stripe.",
							NULL,
							&columnar_stripe_row_limit,
							DEFAULT_STRIPE_ROW_COUNT,
							STRIPE_ROW_COUNT_MINIMUM,
							STRIPE_ROW_COUNT_MAXIMUM,
							PGC_USERSET,
							0,
							NULL,
							NULL,
							NULL);

	DefineCustomIntVariable("columnar.chunk_group_row_limit",
							"Maximum number of rows per chunk.",
							NULL,
							&columnar_chunk_group_row_limit,
							DEFAULT_CHUNK_ROW_COUNT,
							CHUNK_ROW_COUNT_MINIMUM,
							CHUNK_ROW_COUNT_MAXIMUM,
							PGC_USERSET,
							0,
							NULL,
							NULL,
							NULL);
}


/*
 * ParseCompressionType converts a string to a compression type.
 * For compression algorithms that are invalid or not compiled, it
 * returns COMPRESSION_TYPE_INVALID.
 */
CompressionType
ParseCompressionType(const char *compressionTypeString)
{
	Assert(compressionTypeString != NULL);

	for (int compressionIndex = 0;
		 columnar_compression_options[compressionIndex].name != NULL;
		 compressionIndex++)
	{
		const char *compressionName = columnar_compression_options[compressionIndex].name;
		if (strncmp(compressionTypeString, compressionName, NAMEDATALEN) == 0)
		{
			return columnar_compression_options[compressionIndex].val;
		}
	}

	return COMPRESSION_TYPE_INVALID;
}


/*
 * CompressionTypeStr returns string representation of a compression type.
 * For compression algorithms that are invalid or not compiled, it
 * returns NULL.
 */
const char *
CompressionTypeStr(CompressionType requestedType)
{
	for (int compressionIndex = 0;
		 columnar_compression_options[compressionIndex].name != NULL;
		 compressionIndex++)
	{
		CompressionType compressionType =
			columnar_compression_options[compressionIndex].val;
		if (compressionType == requestedType)
		{
			return columnar_compression_options[compressionIndex].name;
		}
	}

	return NULL;
}
@ -1,272 +0,0 @@
/*-------------------------------------------------------------------------
 *
 * columnar_compression.c
 *
 * This file contains compression/decompression functions definitions
 * used for columnar.
 *
 * Copyright (c) 2016, Citus Data, Inc.
 *
 * $Id$
 *
 *-------------------------------------------------------------------------
 */
#include "postgres.h"

#include "common/pg_lzcompress.h"
#include "lib/stringinfo.h"

#include "citus_version.h"
#include "pg_version_constants.h"

#include "columnar/columnar_compression.h"

#if HAVE_CITUS_LIBLZ4
#include <lz4.h>
#endif

#if PG_VERSION_NUM >= PG_VERSION_16
#include "varatt.h"
#endif

#if HAVE_LIBZSTD
#include <zstd.h>
#endif

/*
 * The information at the start of the compressed data. This description is
 * taken from pg_lzcompress in pre-9.5 versions of PostgreSQL.
 */
typedef struct ColumnarCompressHeader
{
	int32 vl_len_; /* varlena header (do not touch directly!) */
	int32 rawsize;
} ColumnarCompressHeader;

/*
 * Utilities for manipulation of header information for compressed data
 */

#define COLUMNAR_COMPRESS_HDRSZ ((int32) sizeof(ColumnarCompressHeader))
#define COLUMNAR_COMPRESS_RAWSIZE(ptr) (((ColumnarCompressHeader *) (ptr))->rawsize)
#define COLUMNAR_COMPRESS_RAWDATA(ptr) (((char *) (ptr)) + COLUMNAR_COMPRESS_HDRSZ)
#define COLUMNAR_COMPRESS_SET_RAWSIZE(ptr, \
									  len) (((ColumnarCompressHeader *) (ptr))->rawsize = \
											(len))


/*
 * CompressBuffer compresses the given buffer with the given compression type;
 * outputBuffer is enlarged to contain the compressed data. The function returns
 * true if compression is done, returns false if compression is not done.
 * outputBuffer is valid only if the function returns true.
 */
bool
CompressBuffer(StringInfo inputBuffer,
			   StringInfo outputBuffer,
			   CompressionType compressionType,
			   int compressionLevel)
{
	switch (compressionType)
	{
#if HAVE_CITUS_LIBLZ4
		case COMPRESSION_LZ4:
		{
			int maximumLength = LZ4_compressBound(inputBuffer->len);

			resetStringInfo(outputBuffer);
			enlargeStringInfo(outputBuffer, maximumLength);

			int compressedSize = LZ4_compress_default(inputBuffer->data,
													   outputBuffer->data,
													   inputBuffer->len, maximumLength);
			if (compressedSize <= 0)
			{
				elog(DEBUG1,
					 "failure in LZ4_compress_default, input size=%d, output size=%d",
					 inputBuffer->len, maximumLength);
				return false;
			}

			elog(DEBUG1, "compressed %d bytes to %d bytes", inputBuffer->len,
				 compressedSize);

			outputBuffer->len = compressedSize;
			return true;
		}
#endif

#if HAVE_LIBZSTD
		case COMPRESSION_ZSTD:
		{
			int maximumLength = ZSTD_compressBound(inputBuffer->len);

			resetStringInfo(outputBuffer);
			enlargeStringInfo(outputBuffer, maximumLength);

			size_t compressedSize = ZSTD_compress(outputBuffer->data,
												  outputBuffer->maxlen,
												  inputBuffer->data,
												  inputBuffer->len,
												  compressionLevel);

			if (ZSTD_isError(compressedSize))
			{
				ereport(WARNING, (errmsg("zstd compression failed"),
								  (errdetail("%s", ZSTD_getErrorName(compressedSize)))));
				return false;
			}

			outputBuffer->len = compressedSize;
			return true;
		}
#endif

		case COMPRESSION_PG_LZ:
		{
			uint64 maximumLength = PGLZ_MAX_OUTPUT(inputBuffer->len) +
								   COLUMNAR_COMPRESS_HDRSZ;
			bool compressionResult = false;

			resetStringInfo(outputBuffer);
			enlargeStringInfo(outputBuffer, maximumLength);

			int32 compressedByteCount = pglz_compress((const char *) inputBuffer->data,
													   inputBuffer->len,
													   COLUMNAR_COMPRESS_RAWDATA(
														   outputBuffer->data),
													   PGLZ_strategy_always);
			if (compressedByteCount >= 0)
			{
				COLUMNAR_COMPRESS_SET_RAWSIZE(outputBuffer->data, inputBuffer->len);
				SET_VARSIZE_COMPRESSED(outputBuffer->data,
									   compressedByteCount + COLUMNAR_COMPRESS_HDRSZ);
				compressionResult = true;
			}

			if (compressionResult)
			{
				outputBuffer->len = VARSIZE(outputBuffer->data);
			}

			return compressionResult;
		}

		default:
		{
			return false;
		}
	}
}


/*
 * DecompressBuffer decompresses the given buffer with the given compression
 * type. This function returns the buffer as-is when no compression is applied.
 */
StringInfo
DecompressBuffer(StringInfo buffer,
				 CompressionType compressionType,
				 uint64 decompressedSize)
{
	switch (compressionType)
	{
		case COMPRESSION_NONE:
		{
			return buffer;
		}

#if HAVE_CITUS_LIBLZ4
		case COMPRESSION_LZ4:
		{
			StringInfo decompressedBuffer = makeStringInfo();
			enlargeStringInfo(decompressedBuffer, decompressedSize);

			int lz4DecompressSize = LZ4_decompress_safe(buffer->data,
														 decompressedBuffer->data,
														 buffer->len,
														 decompressedSize);

			if (lz4DecompressSize != decompressedSize)
			{
				ereport(ERROR, (errmsg("cannot decompress the buffer"),
								errdetail("Expected %lu bytes, but received %d bytes",
										  decompressedSize, lz4DecompressSize)));
			}

			decompressedBuffer->len = decompressedSize;

			return decompressedBuffer;
		}
#endif

#if HAVE_LIBZSTD
		case COMPRESSION_ZSTD:
		{
			StringInfo decompressedBuffer = makeStringInfo();
			enlargeStringInfo(decompressedBuffer, decompressedSize);

			size_t zstdDecompressSize = ZSTD_decompress(decompressedBuffer->data,
														decompressedSize,
														buffer->data,
														buffer->len);
			if (ZSTD_isError(zstdDecompressSize))
			{
				ereport(ERROR, (errmsg("zstd decompression failed"),
								(errdetail("%s", ZSTD_getErrorName(
											   zstdDecompressSize)))));
			}

			if (zstdDecompressSize != decompressedSize)
			{
				ereport(ERROR, (errmsg("unexpected decompressed size"),
								errdetail("Expected %ld, received %ld", decompressedSize,
										  zstdDecompressSize)));
			}

			decompressedBuffer->len = decompressedSize;

			return decompressedBuffer;
		}
#endif

		case COMPRESSION_PG_LZ:
		{
			uint32 compressedDataSize = VARSIZE(buffer->data) - COLUMNAR_COMPRESS_HDRSZ;
			uint32 decompressedDataSize = COLUMNAR_COMPRESS_RAWSIZE(buffer->data);

			if (compressedDataSize + COLUMNAR_COMPRESS_HDRSZ != buffer->len)
			{
				ereport(ERROR, (errmsg("cannot decompress the buffer"),
								errdetail("Expected %u bytes, but received %u bytes",
										  compressedDataSize, buffer->len)));
			}

			char *decompressedData = palloc0(decompressedDataSize);

			int32 decompressedByteCount = pglz_decompress(COLUMNAR_COMPRESS_RAWDATA(
															  buffer->data),
														  compressedDataSize,
														  decompressedData,
														  decompressedDataSize, true);

			if (decompressedByteCount < 0)
			{
				ereport(ERROR, (errmsg("cannot decompress the buffer"),
								errdetail("compressed data is corrupted")));
			}

			StringInfo decompressedBuffer = palloc0(sizeof(StringInfoData));
			decompressedBuffer->data = decompressedData;
			decompressedBuffer->len = decompressedDataSize;
			decompressedBuffer->maxlen = decompressedDataSize;

			return decompressedBuffer;
		}

		default:
		{
			ereport(ERROR, (errmsg("unexpected compression type: %d", compressionType)));
		}
	}
}
@ -1,165 +0,0 @@
/*-------------------------------------------------------------------------
 *
 * columnar_debug.c
 *
 * Helper functions to debug column store.
 *
 *-------------------------------------------------------------------------
 */


#include "postgres.h"

#include "funcapi.h"
#include "miscadmin.h"

#include "access/nbtree.h"
#include "access/table.h"
#include "catalog/pg_am.h"
#include "catalog/pg_type.h"
#include "storage/fd.h"
#include "storage/smgr.h"
#include "utils/guc.h"
#include "utils/memutils.h"
#include "utils/rel.h"
#include "utils/tuplestore.h"

#include "pg_version_compat.h"
#include "pg_version_constants.h"

#include "columnar/columnar.h"
#include "columnar/columnar_storage.h"
#include "columnar/columnar_version_compat.h"

static void MemoryContextTotals(MemoryContext context, MemoryContextCounters *counters);

PG_FUNCTION_INFO_V1(columnar_store_memory_stats);
PG_FUNCTION_INFO_V1(columnar_storage_info);


/*
 * columnar_store_memory_stats returns a record of 3 values: size of
 * TopMemoryContext, TopTransactionContext, and Write State context.
 */
Datum
columnar_store_memory_stats(PG_FUNCTION_ARGS)
{
	const int resultColumnCount = 3;

	TupleDesc tupleDescriptor = CreateTemplateTupleDesc(resultColumnCount);

	TupleDescInitEntry(tupleDescriptor, (AttrNumber) 1, "TopMemoryContext",
					   INT8OID, -1, 0);
	TupleDescInitEntry(tupleDescriptor, (AttrNumber) 2, "TopTransactionContext",
					   INT8OID, -1, 0);
	TupleDescInitEntry(tupleDescriptor, (AttrNumber) 3, "WriteStateContext",
					   INT8OID, -1, 0);

	tupleDescriptor = BlessTupleDesc(tupleDescriptor);

	MemoryContextCounters transactionCounters = { 0 };
	MemoryContextCounters topCounters = { 0 };
	MemoryContextCounters writeStateCounters = { 0 };
	MemoryContextTotals(TopTransactionContext, &transactionCounters);
	MemoryContextTotals(TopMemoryContext, &topCounters);
	MemoryContextTotals(GetWriteContextForDebug(), &writeStateCounters);

	bool nulls[3] = { false };
	Datum values[3] = {
		Int64GetDatum(topCounters.totalspace),
		Int64GetDatum(transactionCounters.totalspace),
		Int64GetDatum(writeStateCounters.totalspace)
	};

	HeapTuple tuple = heap_form_tuple(tupleDescriptor, values, nulls);

	PG_RETURN_DATUM(HeapTupleGetDatum(tuple));
}


/*
 * columnar_storage_info - UDF to return internal storage info for a columnar relation.
 *
 * DDL:
 *  CREATE OR REPLACE FUNCTION columnar_storage_info(
 *      rel regclass,
 *      version_major OUT int4,
 *      version_minor OUT int4,
 *      storage_id OUT int8,
 *      reserved_stripe_id OUT int8,
 *      reserved_row_number OUT int8,
 *      reserved_offset OUT int8)
 *    STRICT
 *    LANGUAGE c AS 'MODULE_PATHNAME', 'columnar_storage_info';
 */
Datum
columnar_storage_info(PG_FUNCTION_ARGS)
{
#define STORAGE_INFO_NATTS 6
	Oid relid = PG_GETARG_OID(0);
	TupleDesc tupdesc;

	/* Build a tuple descriptor for our result type */
	if (get_call_result_type(fcinfo, NULL, &tupdesc) != TYPEFUNC_COMPOSITE)
	{
		elog(ERROR, "return type must be a row type");
	}

	if (tupdesc->natts != STORAGE_INFO_NATTS)
	{
		elog(ERROR, "return type must have %d columns", STORAGE_INFO_NATTS);
	}

	Relation rel = table_open(relid, AccessShareLock);
	if (!IsColumnarTableAmTable(relid))
	{
		ereport(ERROR, (errmsg("table \"%s\" is not a columnar table",
							   RelationGetRelationName(rel))));
	}

	Datum values[STORAGE_INFO_NATTS] = { 0 };
	bool nulls[STORAGE_INFO_NATTS] = { 0 };

	/*
	 * Pass force = true so that we can inspect metapages that are not the
	 * current version.
	 *
	 * NB: ensure the order and number of attributes correspond to DDL
	 * declaration.
	 */
	values[0] = Int32GetDatum(ColumnarStorageGetVersionMajor(rel, true));
	values[1] = Int32GetDatum(ColumnarStorageGetVersionMinor(rel, true));
	values[2] = Int64GetDatum(ColumnarStorageGetStorageId(rel, true));
	values[3] = Int64GetDatum(ColumnarStorageGetReservedStripeId(rel, true));
	values[4] = Int64GetDatum(ColumnarStorageGetReservedRowNumber(rel, true));
	values[5] = Int64GetDatum(ColumnarStorageGetReservedOffset(rel, true));

	/* release lock */
	table_close(rel, AccessShareLock);

	HeapTuple tuple = heap_form_tuple(tupdesc, values, nulls);

	PG_RETURN_DATUM(HeapTupleGetDatum(tuple));
}


/*
 * MemoryContextTotals adds stats of the given memory context and its
 * subtree to the given counters.
 */
static void
MemoryContextTotals(MemoryContext context, MemoryContextCounters *counters)
{
	if (context == NULL)
	{
		return;
	}

	MemoryContext child;
	for (child = context->firstchild; child != NULL; child = child->nextchild)
	{
		MemoryContextTotals(child, counters);
	}

	context->methods->stats(context, NULL, NULL, counters, true);
}
@ -1,866 +0,0 @@
|
||||||
/*-------------------------------------------------------------------------
|
|
||||||
*
|
|
||||||
* columnar_storage.c
|
|
||||||
*
|
|
||||||
* Copyright (c) Citus Data, Inc.
|
|
||||||
*
|
|
||||||
* Low-level storage layer for columnar.
|
|
||||||
* - Translates columnar read/write operations on logical offsets into operations on pages/blocks.
|
|
||||||
* - Emits WAL.
|
|
||||||
* - Reads/writes the columnar metapage.
|
|
||||||
* - Reserves data offsets, stripe numbers, and row offsets.
|
|
||||||
* - Truncation.
|
|
||||||
*
|
|
||||||
* Higher-level columnar operations deal with logical offsets and large
|
|
||||||
* contiguous buffers of data that need to be stored. But the buffer manager
|
|
||||||
* and WAL depend on formatted pages with headers, so these large buffers need
|
|
||||||
* to be written across many pages. This module translates the contiguous
|
|
||||||
* buffers into individual block reads/writes, and performs WAL when
|
|
||||||
* necessary.
|
|
||||||
*
|
|
||||||
* Storage layout: a metapage in block 0, followed by an empty page in block
|
|
||||||
* 1, followed by logical data starting at the first byte after the page
|
|
||||||
* header in block 2 (having logical offset ColumnarFirstLogicalOffset). (XXX:
|
|
||||||
* Block 1 is left empty for no particular reason. Reconsider?). A columnar
|
|
||||||
* table should always have at least 2 blocks.
|
|
||||||
*
|
|
||||||
* Reservation is done with a relation extension lock, and designed for
|
|
||||||
* concurrency, so the callers only need an ordinary lock on the
|
|
||||||
* relation. Initializing the metapage or truncating the relation require that
|
|
||||||
* the caller holds an AccessExclusiveLock. (XXX: New reservations of data are
|
|
||||||
* aligned onto a new page for no particular reason. Reconsider?).
|
|
||||||
*
|
|
||||||
*-------------------------------------------------------------------------
|
|
||||||
*/
|
|
||||||
|
|
||||||
|
|
||||||
#include "postgres.h"
|
|
||||||
|
|
||||||
#include "miscadmin.h"
|
|
||||||
#include "safe_lib.h"
|
|
||||||
|
|
||||||
#include "access/generic_xlog.h"
|
|
||||||
#include "catalog/storage.h"
|
|
||||||
#include "storage/bufmgr.h"
|
|
||||||
#include "storage/lmgr.h"
|
|
||||||
|
|
||||||
#include "pg_version_compat.h"
|
|
||||||
|
|
||||||
#include "columnar/columnar.h"
|
|
||||||
#include "columnar/columnar_storage.h"
|
|
||||||
|
|
||||||
|
|
||||||
/*
|
|
||||||
* Content of the first page in main fork, which stores metadata at file
|
|
||||||
* level.
|
|
||||||
*/
|
|
||||||
typedef struct ColumnarMetapage
|
|
||||||
{
|
|
||||||
/*
|
|
||||||
* Store version of file format used, so we can detect files from
|
|
||||||
* previous versions if we change file format.
|
|
||||||
*/
|
|
||||||
uint32 versionMajor;
|
|
||||||
uint32 versionMinor;
|
|
||||||
|
|
||||||
/*
|
|
||||||
* Each of the metadata table rows are identified by a storageId.
|
|
||||||
* We store it also in the main fork so we can link metadata rows
|
|
||||||
* with data files.
|
|
||||||
*/
|
|
||||||
uint64 storageId;
|
|
||||||
|
|
||||||
uint64 reservedStripeId; /* first unused stripe id */
|
|
||||||
uint64 reservedRowNumber; /* first unused row number */
|
|
||||||
uint64 reservedOffset; /* first unused byte offset */
|
|
||||||
|
|
||||||
/*
|
|
||||||
* Flag set to true in the init fork. After an unlogged table reset (due
|
|
||||||
* to a crash), the init fork will be copied over the main fork. When
|
|
||||||
* trying to read an unlogged table, if this flag is set to true, we must
|
|
||||||
* clear the metadata for the table (because the actual data is gone,
|
|
||||||
* too), and clear the flag. We can cross-check that the table is
|
|
||||||
* UNLOGGED, and that the main fork is at the minimum size (no actual
|
|
||||||
* data).
|
|
||||||
*
|
|
||||||
* XXX: Not used yet; reserved field for later support for UNLOGGED.
|
|
||||||
*/
|
|
||||||
bool unloggedReset;
|
|
||||||
} ColumnarMetapage;
|
|
||||||
|
|
||||||
|
|
||||||
/* represents a "physical" block+offset address */
|
|
||||||
typedef struct PhysicalAddr
|
|
||||||
{
|
|
||||||
BlockNumber blockno;
|
|
||||||
uint32 offset;
|
|
||||||
} PhysicalAddr;
|
|
||||||
|
|
||||||
|
|
||||||
#define COLUMNAR_METAPAGE_BLOCKNO 0
|
|
||||||
#define COLUMNAR_EMPTY_BLOCKNO 1
|
|
||||||
#define COLUMNAR_INVALID_STRIPE_ID 0
|
|
||||||
#define COLUMNAR_FIRST_STRIPE_ID 1
|
|
||||||
|
|
||||||
|
|
||||||
#define OLD_METAPAGE_VERSION_HINT "Use \"VACUUM\" to upgrade the columnar table format " \
|
|
||||||
"version or run \"ALTER EXTENSION citus UPDATE\"."
|
|
||||||
|
|
||||||
|
|
||||||
/* only for testing purposes */
|
|
||||||
PG_FUNCTION_INFO_V1(test_columnar_storage_write_new_page);
|
|
||||||
|
|
||||||
|
|
||||||
/*
|
|
||||||
* Map logical offsets to a physical page and offset where the data is kept.
|
|
||||||
*/
|
|
||||||
static inline PhysicalAddr
|
|
||||||
LogicalToPhysical(uint64 logicalOffset)
|
|
||||||
{
|
|
||||||
PhysicalAddr addr;
|
|
||||||
|
|
||||||
addr.blockno = logicalOffset / COLUMNAR_BYTES_PER_PAGE;
|
|
||||||
addr.offset = SizeOfPageHeaderData + (logicalOffset % COLUMNAR_BYTES_PER_PAGE);
|
|
||||||
|
|
||||||
return addr;
|
|
||||||
}
|
|
||||||
|
|
||||||
|
|
||||||
/*
|
|
||||||
* Map a physical page and offset address to a logical address.
|
|
||||||
*/
|
|
||||||
static inline uint64
|
|
||||||
PhysicalToLogical(PhysicalAddr addr)
|
|
||||||
{
|
|
||||||
return COLUMNAR_BYTES_PER_PAGE * addr.blockno + addr.offset - SizeOfPageHeaderData;
|
|
||||||
}
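/*
 * Illustrative sketch (not part of the original file): a standalone program
 * showing the arithmetic of the two mapping functions above, under the
 * assumption of an 8192-byte BLCKSZ and a 24-byte page header, which leaves
 * 8168 logical bytes per page. Logical offset 10000 then maps to block 1,
 * page offset 24 + (10000 - 8168) = 1856, and mapping back returns 10000.
 */
#include <stdint.h>
#include <stdio.h>

#define EXAMPLE_BLCKSZ 8192u
#define EXAMPLE_PAGE_HEADER 24u
#define EXAMPLE_BYTES_PER_PAGE (EXAMPLE_BLCKSZ - EXAMPLE_PAGE_HEADER)

typedef struct ExampleAddr
{
	uint32_t blockno;
	uint32_t offset;
} ExampleAddr;

static ExampleAddr
example_logical_to_physical(uint64_t logicalOffset)
{
	ExampleAddr addr;

	addr.blockno = (uint32_t) (logicalOffset / EXAMPLE_BYTES_PER_PAGE);
	addr.offset = EXAMPLE_PAGE_HEADER +
				  (uint32_t) (logicalOffset % EXAMPLE_BYTES_PER_PAGE);

	return addr;
}

static uint64_t
example_physical_to_logical(ExampleAddr addr)
{
	return (uint64_t) EXAMPLE_BYTES_PER_PAGE * addr.blockno +
		   addr.offset - EXAMPLE_PAGE_HEADER;
}

int
main(void)
{
	ExampleAddr addr = example_logical_to_physical(10000);

	/* prints "block 1 offset 1856 round-trip 10000" */
	printf("block %u offset %u round-trip %llu\n",
		   addr.blockno, addr.offset,
		   (unsigned long long) example_physical_to_logical(addr));
	return 0;
}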
|
|
||||||
|
|
||||||
|
|
||||||
static void ColumnarOverwriteMetapage(Relation relation,
|
|
||||||
ColumnarMetapage columnarMetapage);
|
|
||||||
static ColumnarMetapage ColumnarMetapageRead(Relation rel, bool force);
|
|
||||||
static void ReadFromBlock(Relation rel, BlockNumber blockno, uint32 offset,
|
|
||||||
char *buf, uint32 len, bool force);
|
|
||||||
static void WriteToBlock(Relation rel, BlockNumber blockno, uint32 offset,
|
|
||||||
char *buf, uint32 len, bool clear);
|
|
||||||
static uint64 AlignReservation(uint64 prevReservation);
|
|
||||||
static bool ColumnarMetapageIsCurrent(ColumnarMetapage *metapage);
|
|
||||||
static bool ColumnarMetapageIsOlder(ColumnarMetapage *metapage);
|
|
||||||
static bool ColumnarMetapageIsNewer(ColumnarMetapage *metapage);
|
|
||||||
static void ColumnarMetapageCheckVersion(Relation rel, ColumnarMetapage *metapage);
|
|
||||||
|
|
||||||
|
|
||||||
/*
|
|
||||||
* ColumnarStorageInit - initialize a new metapage in an empty relation
|
|
||||||
* with the given storageId.
|
|
||||||
*
|
|
||||||
* Caller must hold AccessExclusiveLock on the relation.
|
|
||||||
*/
|
|
||||||
void
|
|
||||||
ColumnarStorageInit(SMgrRelation srel, uint64 storageId)
|
|
||||||
{
|
|
||||||
BlockNumber nblocks = smgrnblocks(srel, MAIN_FORKNUM);
|
|
||||||
|
|
||||||
if (nblocks > 0)
|
|
||||||
{
|
|
||||||
elog(ERROR,
|
|
||||||
"attempted to initialize metapage, but %d pages already exist",
|
|
||||||
nblocks);
|
|
||||||
}
|
|
||||||
|
|
||||||
/* create two pages */
|
|
||||||
#if PG_VERSION_NUM >= PG_VERSION_16
|
|
||||||
PGIOAlignedBlock block;
|
|
||||||
#else
|
|
||||||
PGAlignedBlock block;
|
|
||||||
#endif
|
|
||||||
Page page = block.data;
|
|
||||||
|
|
||||||
/* write metapage */
|
|
||||||
PageInit(page, BLCKSZ, 0);
|
|
||||||
PageHeader phdr = (PageHeader) page;
|
|
||||||
|
|
||||||
ColumnarMetapage metapage = { 0 };
|
|
||||||
metapage.storageId = storageId;
|
|
||||||
metapage.versionMajor = COLUMNAR_VERSION_MAJOR;
|
|
||||||
metapage.versionMinor = COLUMNAR_VERSION_MINOR;
|
|
||||||
metapage.reservedStripeId = COLUMNAR_FIRST_STRIPE_ID;
|
|
||||||
metapage.reservedRowNumber = COLUMNAR_FIRST_ROW_NUMBER;
|
|
||||||
metapage.reservedOffset = ColumnarFirstLogicalOffset;
|
|
||||||
metapage.unloggedReset = false;
|
|
||||||
memcpy_s(page + phdr->pd_lower, phdr->pd_upper - phdr->pd_lower,
|
|
||||||
(char *) &metapage, sizeof(ColumnarMetapage));
|
|
||||||
phdr->pd_lower += sizeof(ColumnarMetapage);
|
|
||||||
|
|
||||||
log_newpage(RelationPhysicalIdentifierBackend_compat(&srel), MAIN_FORKNUM,
|
|
||||||
COLUMNAR_METAPAGE_BLOCKNO, page, true);
|
|
||||||
PageSetChecksumInplace(page, COLUMNAR_METAPAGE_BLOCKNO);
|
|
||||||
smgrextend(srel, MAIN_FORKNUM, COLUMNAR_METAPAGE_BLOCKNO, page, true);
|
|
||||||
|
|
||||||
/* write empty page */
|
|
||||||
PageInit(page, BLCKSZ, 0);
|
|
||||||
|
|
||||||
log_newpage(RelationPhysicalIdentifierBackend_compat(&srel), MAIN_FORKNUM,
|
|
||||||
COLUMNAR_EMPTY_BLOCKNO, page, true);
|
|
||||||
PageSetChecksumInplace(page, COLUMNAR_EMPTY_BLOCKNO);
|
|
||||||
smgrextend(srel, MAIN_FORKNUM, COLUMNAR_EMPTY_BLOCKNO, page, true);
|
|
||||||
|
|
||||||
/*
|
|
||||||
* An immediate sync is required even if we xlog'd the page, because the
|
|
||||||
* write did not go through shared_buffers and therefore a concurrent
|
|
||||||
* checkpoint may have moved the redo pointer past our xlog record.
|
|
||||||
*/
|
|
||||||
smgrimmedsync(srel, MAIN_FORKNUM);
|
|
||||||
}
|
|
||||||
|
|
||||||
|
|
||||||
/*
|
|
||||||
* ColumnarStorageUpdateCurrent - update the metapage to the current
|
|
||||||
* version. No effect if the version already matches. If 'upgrade' is true,
|
|
||||||
* throw an error if metapage version is newer; if 'upgrade' is false, it's a
|
|
||||||
* downgrade, so throw an error if the metapage version is older.
|
|
||||||
*
|
|
||||||
* NB: caller must ensure that metapage already exists, which might not be the
|
|
||||||
* case on 10.0.
|
|
||||||
*/
|
|
||||||
void
|
|
||||||
ColumnarStorageUpdateCurrent(Relation rel, bool upgrade, uint64 reservedStripeId,
|
|
||||||
uint64 reservedRowNumber, uint64 reservedOffset)
|
|
||||||
{
|
|
||||||
LockRelationForExtension(rel, ExclusiveLock);
|
|
||||||
|
|
||||||
ColumnarMetapage metapage = ColumnarMetapageRead(rel, true);
|
|
||||||
|
|
||||||
if (ColumnarMetapageIsCurrent(&metapage))
|
|
||||||
{
|
|
||||||
/* nothing to do */
|
|
||||||
return;
|
|
||||||
}
|
|
||||||
|
|
||||||
if (upgrade && ColumnarMetapageIsNewer(&metapage))
|
|
||||||
{
|
|
||||||
elog(ERROR, "found newer columnar metapage while upgrading");
|
|
||||||
}
|
|
||||||
|
|
||||||
if (!upgrade && ColumnarMetapageIsOlder(&metapage))
|
|
||||||
{
|
|
||||||
elog(ERROR, "found older columnar metapage while downgrading");
|
|
||||||
}
|
|
||||||
|
|
||||||
metapage.versionMajor = COLUMNAR_VERSION_MAJOR;
|
|
||||||
metapage.versionMinor = COLUMNAR_VERSION_MINOR;
|
|
||||||
|
|
||||||
/* storageId remains the same */
|
|
||||||
metapage.reservedStripeId = reservedStripeId;
|
|
||||||
metapage.reservedRowNumber = reservedRowNumber;
|
|
||||||
metapage.reservedOffset = reservedOffset;
|
|
||||||
|
|
||||||
ColumnarOverwriteMetapage(rel, metapage);
|
|
||||||
|
|
||||||
UnlockRelationForExtension(rel, ExclusiveLock);
|
|
||||||
}
|
|
||||||
|
|
||||||
|
|
||||||
/*
|
|
||||||
* ColumnarStorageGetVersionMajor - return major version from the metapage.
|
|
||||||
*
|
|
||||||
* Throw an error if the metapage is not the current version, unless
|
|
||||||
* 'force' is true.
|
|
||||||
*/
|
|
||||||
uint64
|
|
||||||
ColumnarStorageGetVersionMajor(Relation rel, bool force)
|
|
||||||
{
|
|
||||||
ColumnarMetapage metapage = ColumnarMetapageRead(rel, force);
|
|
||||||
|
|
||||||
return metapage.versionMajor;
|
|
||||||
}
|
|
||||||
|
|
||||||
|
|
||||||
/*
|
|
||||||
* ColumnarStorageGetVersionMinor - return minor version from the metapage.
|
|
||||||
*
|
|
||||||
* Throw an error if the metapage is not the current version, unless
|
|
||||||
* 'force' is true.
|
|
||||||
*/
|
|
||||||
uint64
|
|
||||||
ColumnarStorageGetVersionMinor(Relation rel, bool force)
|
|
||||||
{
|
|
||||||
ColumnarMetapage metapage = ColumnarMetapageRead(rel, force);
|
|
||||||
|
|
||||||
return metapage.versionMinor;
|
|
||||||
}
|
|
||||||
|
|
||||||
|
|
||||||
/*
|
|
||||||
* ColumnarStorageGetStorageId - return storage ID from the metapage.
|
|
||||||
*
|
|
||||||
* Throw an error if the metapage is not the current version, unless
|
|
||||||
* 'force' is true.
|
|
||||||
*/
|
|
||||||
uint64
|
|
||||||
ColumnarStorageGetStorageId(Relation rel, bool force)
|
|
||||||
{
|
|
||||||
ColumnarMetapage metapage = ColumnarMetapageRead(rel, force);
|
|
||||||
|
|
||||||
return metapage.storageId;
|
|
||||||
}
|
|
||||||
|
|
||||||
|
|
||||||
/*
|
|
||||||
* ColumnarStorageGetReservedStripeId - return reserved stripe ID from the
|
|
||||||
* metapage.
|
|
||||||
*
|
|
||||||
* Throw an error if the metapage is not the current version, unless
|
|
||||||
* 'force' is true.
|
|
||||||
*/
|
|
||||||
uint64
|
|
||||||
ColumnarStorageGetReservedStripeId(Relation rel, bool force)
|
|
||||||
{
|
|
||||||
ColumnarMetapage metapage = ColumnarMetapageRead(rel, force);
|
|
||||||
|
|
||||||
return metapage.reservedStripeId;
|
|
||||||
}
|
|
||||||
|
|
||||||
|
|
||||||
/*
|
|
||||||
* ColumnarStorageGetReservedRowNumber - return reserved row number from the
|
|
||||||
* metapage.
|
|
||||||
*
|
|
||||||
* Throw an error if the metapage is not the current version, unless
|
|
||||||
* 'force' is true.
|
|
||||||
*/
|
|
||||||
uint64
|
|
||||||
ColumnarStorageGetReservedRowNumber(Relation rel, bool force)
|
|
||||||
{
|
|
||||||
ColumnarMetapage metapage = ColumnarMetapageRead(rel, force);
|
|
||||||
|
|
||||||
return metapage.reservedRowNumber;
|
|
||||||
}
|
|
||||||
|
|
||||||
|
|
||||||
/*
|
|
||||||
* ColumnarStorageGetReservedOffset - return reserved offset from the metapage.
|
|
||||||
*
|
|
||||||
* Throw an error if the metapage is not the current version, unless
|
|
||||||
* 'force' is true.
|
|
||||||
*/
|
|
||||||
uint64
|
|
||||||
ColumnarStorageGetReservedOffset(Relation rel, bool force)
|
|
||||||
{
|
|
||||||
ColumnarMetapage metapage = ColumnarMetapageRead(rel, force);
|
|
||||||
|
|
||||||
return metapage.reservedOffset;
|
|
||||||
}
|
|
||||||
|
|
||||||
|
|
||||||
/*
 * ColumnarStorageIsCurrent - return true if the metapage exists and is the
 * current version.
 */
|
|
||||||
bool
|
|
||||||
ColumnarStorageIsCurrent(Relation rel)
|
|
||||||
{
|
|
||||||
BlockNumber nblocks = smgrnblocks(RelationGetSmgr(rel), MAIN_FORKNUM);
|
|
||||||
|
|
||||||
if (nblocks < 2)
|
|
||||||
{
|
|
||||||
return false;
|
|
||||||
}
|
|
||||||
|
|
||||||
ColumnarMetapage metapage = ColumnarMetapageRead(rel, true);
|
|
||||||
return ColumnarMetapageIsCurrent(&metapage);
|
|
||||||
}
|
|
||||||
|
|
||||||
|
|
||||||
/*
 * ColumnarStorageReserveRowNumber returns reservedRowNumber and advances
 * it for the next row number reservation.
 */
|
|
||||||
uint64
|
|
||||||
ColumnarStorageReserveRowNumber(Relation rel, uint64 nrows)
|
|
||||||
{
|
|
||||||
LockRelationForExtension(rel, ExclusiveLock);
|
|
||||||
|
|
||||||
ColumnarMetapage metapage = ColumnarMetapageRead(rel, false);
|
|
||||||
|
|
||||||
uint64 firstRowNumber = metapage.reservedRowNumber;
|
|
||||||
metapage.reservedRowNumber += nrows;
|
|
||||||
|
|
||||||
ColumnarOverwriteMetapage(rel, metapage);
|
|
||||||
|
|
||||||
UnlockRelationForExtension(rel, ExclusiveLock);
|
|
||||||
|
|
||||||
return firstRowNumber;
|
|
||||||
}
|
|
||||||
|
|
||||||
|
|
||||||
/*
 * ColumnarStorageReserveStripeId returns stripeId and advances it for the
 * next stripeId reservation.
 * Note that this function doesn't handle row number reservation; see the
 * ColumnarStorageReserveRowNumber function.
 */
|
|
||||||
uint64
|
|
||||||
ColumnarStorageReserveStripeId(Relation rel)
|
|
||||||
{
|
|
||||||
LockRelationForExtension(rel, ExclusiveLock);
|
|
||||||
|
|
||||||
ColumnarMetapage metapage = ColumnarMetapageRead(rel, false);
|
|
||||||
|
|
||||||
uint64 stripeId = metapage.reservedStripeId;
|
|
||||||
metapage.reservedStripeId++;
|
|
||||||
|
|
||||||
ColumnarOverwriteMetapage(rel, metapage);
|
|
||||||
|
|
||||||
UnlockRelationForExtension(rel, ExclusiveLock);
|
|
||||||
|
|
||||||
return stripeId;
|
|
||||||
}
|
|
||||||
|
|
||||||
|
|
||||||
/*
|
|
||||||
* ColumnarStorageReserveData - reserve logical data offsets for writing.
|
|
||||||
*/
|
|
||||||
uint64
|
|
||||||
ColumnarStorageReserveData(Relation rel, uint64 amount)
|
|
||||||
{
|
|
||||||
if (amount == 0)
|
|
||||||
{
|
|
||||||
return ColumnarInvalidLogicalOffset;
|
|
||||||
}
|
|
||||||
|
|
||||||
LockRelationForExtension(rel, ExclusiveLock);
|
|
||||||
|
|
||||||
ColumnarMetapage metapage = ColumnarMetapageRead(rel, false);
|
|
||||||
|
|
||||||
uint64 alignedReservation = AlignReservation(metapage.reservedOffset);
|
|
||||||
uint64 nextReservation = alignedReservation + amount;
|
|
||||||
metapage.reservedOffset = nextReservation;
|
|
||||||
|
|
||||||
/* write new reservation */
|
|
||||||
ColumnarOverwriteMetapage(rel, metapage);
|
|
||||||
|
|
||||||
/* last used PhysicalAddr of new reservation */
|
|
||||||
PhysicalAddr final = LogicalToPhysical(nextReservation - 1);
|
|
||||||
|
|
||||||
/* extend with new pages */
|
|
||||||
BlockNumber nblocks = smgrnblocks(RelationGetSmgr(rel), MAIN_FORKNUM);
|
|
||||||
|
|
||||||
while (nblocks <= final.blockno)
|
|
||||||
{
|
|
||||||
Buffer newBuffer = ReadBuffer(rel, P_NEW);
|
|
||||||
Assert(BufferGetBlockNumber(newBuffer) == nblocks);
|
|
||||||
ReleaseBuffer(newBuffer);
|
|
||||||
nblocks++;
|
|
||||||
}
|
|
||||||
|
|
||||||
UnlockRelationForExtension(rel, ExclusiveLock);
|
|
||||||
|
|
||||||
return alignedReservation;
|
|
||||||
}
|
|
||||||
|
|
||||||
|
|
||||||
/*
|
|
||||||
* ColumnarStorageRead - map the logical offset to a block and offset, then
|
|
||||||
* read the buffer from multiple blocks if necessary.
|
|
||||||
*/
|
|
||||||
void
|
|
||||||
ColumnarStorageRead(Relation rel, uint64 logicalOffset, char *data, uint32 amount)
|
|
||||||
{
|
|
||||||
/* if there's no work to do, succeed even with invalid offset */
|
|
||||||
if (amount == 0)
|
|
||||||
{
|
|
||||||
return;
|
|
||||||
}
|
|
||||||
|
|
||||||
if (!ColumnarLogicalOffsetIsValid(logicalOffset))
|
|
||||||
{
|
|
||||||
elog(ERROR,
|
|
||||||
"attempted columnar read on relation %d from invalid logical offset: "
|
|
||||||
UINT64_FORMAT,
|
|
||||||
rel->rd_id, logicalOffset);
|
|
||||||
}
|
|
||||||
|
|
||||||
uint64 read = 0;
|
|
||||||
|
|
||||||
while (read < amount)
|
|
||||||
{
|
|
||||||
PhysicalAddr addr = LogicalToPhysical(logicalOffset + read);
|
|
||||||
|
|
||||||
uint32 to_read = Min(amount - read, BLCKSZ - addr.offset);
|
|
||||||
ReadFromBlock(rel, addr.blockno, addr.offset, data + read, to_read,
|
|
||||||
false);
|
|
||||||
|
|
||||||
read += to_read;
|
|
||||||
}
|
|
||||||
}
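/*
 * Illustrative sketch (not part of the original file): how a read spanning a
 * page boundary is split into per-block reads, reusing the assumed 8192-byte
 * block with a 24-byte header (8168 logical bytes per page) from the sketch
 * above. A 300-byte read that starts 100 logical bytes before the end of
 * block 0 becomes two block reads: 100 bytes from block 0 and 200 bytes from
 * block 1.
 */
#include <stdint.h>
#include <stdio.h>

#define EXAMPLE_BLCKSZ 8192u
#define EXAMPLE_PAGE_HEADER 24u
#define EXAMPLE_BYTES_PER_PAGE (EXAMPLE_BLCKSZ - EXAMPLE_PAGE_HEADER)

int
main(void)
{
	uint64_t logicalOffset = EXAMPLE_BYTES_PER_PAGE - 100;	/* 100 bytes left on block 0 */
	uint32_t amount = 300;
	uint32_t done = 0;

	while (done < amount)
	{
		uint64_t current = logicalOffset + done;
		uint32_t blockno = (uint32_t) (current / EXAMPLE_BYTES_PER_PAGE);
		uint32_t offset = EXAMPLE_PAGE_HEADER +
						  (uint32_t) (current % EXAMPLE_BYTES_PER_PAGE);
		uint32_t remainingOnPage = EXAMPLE_BLCKSZ - offset;
		uint32_t toRead = amount - done;

		if (toRead > remainingOnPage)
		{
			toRead = remainingOnPage;
		}

		printf("read %u bytes from block %u at offset %u\n",
			   toRead, blockno, offset);
		done += toRead;
	}

	return 0;
}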
|
|
||||||
|
|
||||||
|
|
||||||
/*
|
|
||||||
* ColumnarStorageWrite - map the logical offset to a block and offset, then
|
|
||||||
* write the buffer across multiple blocks if necessary.
|
|
||||||
*/
|
|
||||||
void
|
|
||||||
ColumnarStorageWrite(Relation rel, uint64 logicalOffset, char *data, uint32 amount)
|
|
||||||
{
|
|
||||||
/* if there's no work to do, succeed even with invalid offset */
|
|
||||||
if (amount == 0)
|
|
||||||
{
|
|
||||||
return;
|
|
||||||
}
|
|
||||||
|
|
||||||
if (!ColumnarLogicalOffsetIsValid(logicalOffset))
|
|
||||||
{
|
|
||||||
elog(ERROR,
|
|
||||||
"attempted columnar write on relation %d to invalid logical offset: "
|
|
||||||
UINT64_FORMAT,
|
|
||||||
rel->rd_id, logicalOffset);
|
|
||||||
}
|
|
||||||
|
|
||||||
uint64 written = 0;
|
|
||||||
|
|
||||||
while (written < amount)
|
|
||||||
{
|
|
||||||
PhysicalAddr addr = LogicalToPhysical(logicalOffset + written);
|
|
||||||
|
|
||||||
uint64 to_write = Min(amount - written, BLCKSZ - addr.offset);
|
|
||||||
WriteToBlock(rel, addr.blockno, addr.offset, data + written, to_write,
|
|
||||||
false);
|
|
||||||
|
|
||||||
written += to_write;
|
|
||||||
}
|
|
||||||
}
|
|
||||||
|
|
||||||
|
|
||||||
/*
|
|
||||||
* ColumnarStorageTruncate - truncate the columnar storage such that
|
|
||||||
* newDataReservation will be the first unused logical offset available. Free
|
|
||||||
* pages at the end of the relation.
|
|
||||||
*
|
|
||||||
* Caller must hold AccessExclusiveLock on the relation.
|
|
||||||
*
|
|
||||||
* Returns true if pages were truncated; false otherwise.
|
|
||||||
*/
|
|
||||||
bool
|
|
||||||
ColumnarStorageTruncate(Relation rel, uint64 newDataReservation)
|
|
||||||
{
|
|
||||||
if (!ColumnarLogicalOffsetIsValid(newDataReservation))
|
|
||||||
{
|
|
||||||
elog(ERROR,
|
|
||||||
"attempted to truncate relation %d to invalid logical offset: " UINT64_FORMAT,
|
|
||||||
rel->rd_id, newDataReservation);
|
|
||||||
}
|
|
||||||
|
|
||||||
BlockNumber old_rel_pages = smgrnblocks(RelationGetSmgr(rel), MAIN_FORKNUM);
|
|
||||||
if (old_rel_pages == 0)
|
|
||||||
{
|
|
||||||
/* nothing to do */
|
|
||||||
return false;
|
|
||||||
}
|
|
||||||
|
|
||||||
LockRelationForExtension(rel, ExclusiveLock);
|
|
||||||
|
|
||||||
ColumnarMetapage metapage = ColumnarMetapageRead(rel, false);
|
|
||||||
|
|
||||||
if (metapage.reservedOffset < newDataReservation)
|
|
||||||
{
|
|
||||||
elog(ERROR,
|
|
||||||
"attempted to truncate relation %d to offset " UINT64_FORMAT \
|
|
||||||
" which is higher than existing offset " UINT64_FORMAT,
|
|
||||||
rel->rd_id, newDataReservation, metapage.reservedOffset);
|
|
||||||
}
|
|
||||||
|
|
||||||
if (metapage.reservedOffset == newDataReservation)
|
|
||||||
{
|
|
||||||
/* nothing to do */
|
|
||||||
UnlockRelationForExtension(rel, ExclusiveLock);
|
|
||||||
return false;
|
|
||||||
}
|
|
||||||
|
|
||||||
metapage.reservedOffset = newDataReservation;
|
|
||||||
|
|
||||||
/* write new reservation */
|
|
||||||
ColumnarOverwriteMetapage(rel, metapage);
|
|
||||||
|
|
||||||
UnlockRelationForExtension(rel, ExclusiveLock);
|
|
||||||
|
|
||||||
PhysicalAddr final = LogicalToPhysical(newDataReservation - 1);
|
|
||||||
BlockNumber new_rel_pages = final.blockno + 1;
|
|
||||||
Assert(new_rel_pages <= old_rel_pages);
|
|
||||||
|
|
||||||
/*
|
|
||||||
* Truncate the storage. Note that RelationTruncate() takes care of
|
|
||||||
* Write Ahead Logging.
|
|
||||||
*/
|
|
||||||
if (new_rel_pages < old_rel_pages)
|
|
||||||
{
|
|
||||||
RelationTruncate(rel, new_rel_pages);
|
|
||||||
return true;
|
|
||||||
}
|
|
||||||
|
|
||||||
return false;
|
|
||||||
}
|
|
||||||
|
|
||||||
|
|
||||||
/*
|
|
||||||
* ColumnarOverwriteMetapage writes given columnarMetapage back to metapage
|
|
||||||
* for given relation.
|
|
||||||
*/
|
|
||||||
static void
|
|
||||||
ColumnarOverwriteMetapage(Relation relation, ColumnarMetapage columnarMetapage)
|
|
||||||
{
|
|
||||||
/* clear metapage because we are overwriting */
|
|
||||||
bool clear = true;
|
|
||||||
WriteToBlock(relation, COLUMNAR_METAPAGE_BLOCKNO, SizeOfPageHeaderData,
|
|
||||||
(char *) &columnarMetapage, sizeof(ColumnarMetapage), clear);
|
|
||||||
}
|
|
||||||
|
|
||||||
|
|
||||||
/*
|
|
||||||
* ColumnarMetapageRead - read the current contents of the metapage. Error if
|
|
||||||
* it does not exist. Throw an error if the metapage is not the current
|
|
||||||
* version, unless 'force' is true.
|
|
||||||
*
|
|
||||||
* NB: it's safe to read a different version of a metapage because we
|
|
||||||
* guarantee that fields will only be added and existing fields will never be
|
|
||||||
* changed. However, it's important that we don't depend on new fields being
|
|
||||||
* set properly when we read an old metapage; an old metapage should only be
|
|
||||||
* read for the purposes of upgrading or error checking.
|
|
||||||
*/
|
|
||||||
static ColumnarMetapage
|
|
||||||
ColumnarMetapageRead(Relation rel, bool force)
|
|
||||||
{
|
|
||||||
BlockNumber nblocks = smgrnblocks(RelationGetSmgr(rel), MAIN_FORKNUM);
|
|
||||||
if (nblocks == 0)
|
|
||||||
{
|
|
||||||
/*
 * We only expect this to happen when upgrading citus.so. This is because,
 * in the current version of columnar, we create the metapage for columnar
 * tables immediately, i.e. right after creating the table. In older
 * versions, however, metapages were created lazily, i.e. when ingesting
 * data into the columnar table.
 */
|
|
||||||
ereport(ERROR, (errmsg("columnar metapage for relation \"%s\" does not exist",
|
|
||||||
RelationGetRelationName(rel)),
|
|
||||||
errhint(OLD_METAPAGE_VERSION_HINT)));
|
|
||||||
}
|
|
||||||
|
|
||||||
/*
 * Regardless of the "force" parameter, always force-read the metapage
 * block. The metapage version is then checked in
 * ColumnarMetapageCheckVersion depending on "force".
 */
|
|
||||||
bool forceReadBlock = true;
|
|
||||||
ColumnarMetapage metapage;
|
|
||||||
ReadFromBlock(rel, COLUMNAR_METAPAGE_BLOCKNO, SizeOfPageHeaderData,
|
|
||||||
(char *) &metapage, sizeof(ColumnarMetapage), forceReadBlock);
|
|
||||||
|
|
||||||
if (!force)
|
|
||||||
{
|
|
||||||
ColumnarMetapageCheckVersion(rel, &metapage);
|
|
||||||
}
|
|
||||||
|
|
||||||
return metapage;
|
|
||||||
}
|
|
||||||
|
|
||||||
|
|
||||||
/*
|
|
||||||
* ReadFromBlock - read bytes from a page at the given offset. If 'force' is
|
|
||||||
* true, don't check pd_lower; useful when reading a metapage of unknown
|
|
||||||
* version.
|
|
||||||
*/
|
|
||||||
static void
|
|
||||||
ReadFromBlock(Relation rel, BlockNumber blockno, uint32 offset, char *buf,
|
|
||||||
uint32 len, bool force)
|
|
||||||
{
|
|
||||||
Buffer buffer = ReadBuffer(rel, blockno);
|
|
||||||
LockBuffer(buffer, BUFFER_LOCK_SHARE);
|
|
||||||
Page page = BufferGetPage(buffer);
|
|
||||||
PageHeader phdr = (PageHeader) page;
|
|
||||||
|
|
||||||
if (BLCKSZ < offset + len || (!force && (phdr->pd_lower < offset + len)))
|
|
||||||
{
|
|
||||||
elog(ERROR,
|
|
||||||
"attempt to read columnar data of length %d from offset %d of block %d of relation %d",
|
|
||||||
len, offset, blockno, rel->rd_id);
|
|
||||||
}
|
|
||||||
|
|
||||||
memcpy_s(buf, len, page + offset, len);
|
|
||||||
UnlockReleaseBuffer(buffer);
|
|
||||||
}
|
|
||||||
|
|
||||||
|
|
||||||
/*
|
|
||||||
* WriteToBlock - append data to a block, initializing if necessary, and emit
|
|
||||||
* WAL. If 'clear' is true, always clear the data on the page and reinitialize
|
|
||||||
* it first, and offset must be SizeOfPageHeaderData. Otherwise, offset must
|
|
||||||
* be equal to pd_lower and pd_lower will be set to the end of the written
|
|
||||||
* data.
|
|
||||||
*/
|
|
||||||
static void
|
|
||||||
WriteToBlock(Relation rel, BlockNumber blockno, uint32 offset, char *buf,
|
|
||||||
uint32 len, bool clear)
|
|
||||||
{
|
|
||||||
Buffer buffer = ReadBuffer(rel, blockno);
|
|
||||||
GenericXLogState *state = GenericXLogStart(rel);
|
|
||||||
|
|
||||||
LockBuffer(buffer, BUFFER_LOCK_EXCLUSIVE);
|
|
||||||
|
|
||||||
Page page = GenericXLogRegisterBuffer(state, buffer, GENERIC_XLOG_FULL_IMAGE);
|
|
||||||
|
|
||||||
PageHeader phdr = (PageHeader) page;
|
|
||||||
if (PageIsNew(page) || clear)
|
|
||||||
{
|
|
||||||
PageInit(page, BLCKSZ, 0);
|
|
||||||
}
|
|
||||||
|
|
||||||
if (phdr->pd_lower < offset || phdr->pd_upper - offset < len)
|
|
||||||
{
|
|
||||||
elog(ERROR,
|
|
||||||
"attempt to write columnar data of length %d to offset %d of block %d of relation %d",
|
|
||||||
len, offset, blockno, rel->rd_id);
|
|
||||||
}
|
|
||||||
|
|
||||||
/*
 * After a transaction has been rolled back, we might be overwriting the
 * rolled-back write, so phdr->pd_lower can be different from addr.offset.
 *
 * We reset pd_lower to discard the rolled-back write.
 *
 * Given that we always align page reservations to the next page as of
 * 10.2, such a disk page is only possible if a write operation failed in
 * an older version of columnar and the user now attempts to write to that
 * table in version >= 10.2.
 */
|
|
||||||
if (phdr->pd_lower > offset)
|
|
||||||
{
|
|
||||||
ereport(DEBUG4, (errmsg("overwriting page %u", blockno),
|
|
||||||
errdetail("This can happen after a roll-back.")));
|
|
||||||
phdr->pd_lower = offset;
|
|
||||||
}
|
|
||||||
|
|
||||||
memcpy_s(page + phdr->pd_lower, phdr->pd_upper - phdr->pd_lower, buf, len);
|
|
||||||
phdr->pd_lower += len;
|
|
||||||
|
|
||||||
GenericXLogFinish(state);
|
|
||||||
|
|
||||||
UnlockReleaseBuffer(buffer);
|
|
||||||
}
|
|
||||||
|
|
||||||
|
|
||||||
/*
|
|
||||||
* AlignReservation - given an unused logical byte offset, align it so that it
|
|
||||||
* falls at the start of a page.
|
|
||||||
*
|
|
||||||
* XXX: Reconsider whether we want/need to do this at all.
|
|
||||||
*/
|
|
||||||
static uint64
|
|
||||||
AlignReservation(uint64 prevReservation)
|
|
||||||
{
|
|
||||||
PhysicalAddr prevAddr = LogicalToPhysical(prevReservation);
|
|
||||||
uint64 alignedReservation = prevReservation;
|
|
||||||
|
|
||||||
if (prevAddr.offset != SizeOfPageHeaderData)
|
|
||||||
{
|
|
||||||
/* not aligned; align on beginning of next page */
|
|
||||||
PhysicalAddr initial = { 0 };
|
|
||||||
initial.blockno = prevAddr.blockno + 1;
|
|
||||||
initial.offset = SizeOfPageHeaderData;
|
|
||||||
alignedReservation = PhysicalToLogical(initial);
|
|
||||||
}
|
|
||||||
|
|
||||||
Assert(alignedReservation >= prevReservation);
|
|
||||||
return alignedReservation;
|
|
||||||
}
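/*
 * Illustrative sketch (not part of the original file): the effect of the
 * alignment above on concrete values, assuming 8168 logical bytes per page
 * (an 8192-byte block minus a 24-byte header). A reservation ending mid-page
 * is pushed to the first logical byte of the next page; one already at a
 * page start is returned unchanged.
 */
#include <stdint.h>
#include <stdio.h>

#define EXAMPLE_BYTES_PER_PAGE 8168u

static uint64_t
example_align_reservation(uint64_t prevReservation)
{
	uint64_t remainder = prevReservation % EXAMPLE_BYTES_PER_PAGE;

	if (remainder == 0)
	{
		/* already at the first logical byte of a page */
		return prevReservation;
	}

	/* round up to the start of the next page */
	return prevReservation + (EXAMPLE_BYTES_PER_PAGE - remainder);
}

int
main(void)
{
	/* prints 16336 (the first logical byte of block 2) and then 16336 again */
	printf("%llu\n", (unsigned long long) example_align_reservation(10000));
	printf("%llu\n", (unsigned long long) example_align_reservation(16336));
	return 0;
}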
|
|
||||||
|
|
||||||
|
|
||||||
/*
|
|
||||||
* ColumnarMetapageIsCurrent - is the metapage at the latest version?
|
|
||||||
*/
|
|
||||||
static bool
|
|
||||||
ColumnarMetapageIsCurrent(ColumnarMetapage *metapage)
|
|
||||||
{
|
|
||||||
return (metapage->versionMajor == COLUMNAR_VERSION_MAJOR &&
|
|
||||||
metapage->versionMinor == COLUMNAR_VERSION_MINOR);
|
|
||||||
}
|
|
||||||
|
|
||||||
|
|
||||||
/*
|
|
||||||
* ColumnarMetapageIsOlder - is the metapage older than the current version?
|
|
||||||
*/
|
|
||||||
static bool
|
|
||||||
ColumnarMetapageIsOlder(ColumnarMetapage *metapage)
|
|
||||||
{
|
|
||||||
return (metapage->versionMajor < COLUMNAR_VERSION_MAJOR ||
|
|
||||||
(metapage->versionMajor == COLUMNAR_VERSION_MAJOR &&
|
|
||||||
(int) metapage->versionMinor < (int) COLUMNAR_VERSION_MINOR));
|
|
||||||
}
|
|
||||||
|
|
||||||
|
|
||||||
/*
|
|
||||||
* ColumnarMetapageIsNewer - is the metapage newer than the current version?
|
|
||||||
*/
|
|
||||||
static bool
|
|
||||||
ColumnarMetapageIsNewer(ColumnarMetapage *metapage)
|
|
||||||
{
|
|
||||||
return (metapage->versionMajor > COLUMNAR_VERSION_MAJOR ||
|
|
||||||
(metapage->versionMajor == COLUMNAR_VERSION_MAJOR &&
|
|
||||||
metapage->versionMinor > COLUMNAR_VERSION_MINOR));
|
|
||||||
}
|
|
||||||
|
|
||||||
|
|
||||||
/*
|
|
||||||
* ColumnarMetapageCheckVersion - throw an error if accessing old
|
|
||||||
* version of metapage.
|
|
||||||
*/
|
|
||||||
static void
|
|
||||||
ColumnarMetapageCheckVersion(Relation rel, ColumnarMetapage *metapage)
|
|
||||||
{
|
|
||||||
if (!ColumnarMetapageIsCurrent(metapage))
|
|
||||||
{
|
|
||||||
ereport(ERROR, (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
|
|
||||||
errmsg(
|
|
||||||
"attempted to access relation \"%s\", which uses an older columnar format",
|
|
||||||
RelationGetRelationName(rel)),
|
|
||||||
errdetail(
|
|
||||||
"Columnar format version %d.%d is required, \"%s\" has version %d.%d.",
|
|
||||||
COLUMNAR_VERSION_MAJOR, COLUMNAR_VERSION_MINOR,
|
|
||||||
RelationGetRelationName(rel),
|
|
||||||
metapage->versionMajor, metapage->versionMinor),
|
|
||||||
errhint(OLD_METAPAGE_VERSION_HINT)));
|
|
||||||
}
|
|
||||||
}
|
|
||||||
|
|
||||||
|
|
||||||
/*
|
|
||||||
* test_columnar_storage_write_new_page is a UDF only used for testing
|
|
||||||
* purposes. It could make more sense to define this in columnar_debug.c,
|
|
||||||
* but the storage layer doesn't expose ColumnarMetapage to any other files,
|
|
||||||
* so we define it here.
|
|
||||||
*/
|
|
||||||
Datum
|
|
||||||
test_columnar_storage_write_new_page(PG_FUNCTION_ARGS)
|
|
||||||
{
|
|
||||||
Oid relationId = PG_GETARG_OID(0);
|
|
||||||
|
|
||||||
Relation relation = relation_open(relationId, AccessShareLock);
|
|
||||||
|
|
||||||
/*
 * Allocate a new page, write some data to it, and set the reserved offset
 * to the start of that page. That way, for a subsequent write operation,
 * the storage layer would try to overwrite the page that we allocated here.
 */
|
|
||||||
uint64 newPageOffset = ColumnarStorageGetReservedOffset(relation, false);
|
|
||||||
|
|
||||||
ColumnarStorageReserveData(relation, 100);
|
|
||||||
ColumnarStorageWrite(relation, newPageOffset, "foo_bar", 8);
|
|
||||||
|
|
||||||
ColumnarMetapage metapage = ColumnarMetapageRead(relation, false);
|
|
||||||
metapage.reservedOffset = newPageOffset;
|
|
||||||
ColumnarOverwriteMetapage(relation, metapage);
|
|
||||||
|
|
||||||
relation_close(relation, AccessShareLock);
|
|
||||||
|
|
||||||
PG_RETURN_VOID();
|
|
||||||
}
|
|
|
@ -1,775 +0,0 @@
|
||||||
/*-------------------------------------------------------------------------
 *
 * columnar_writer.c
 *
 * This file contains function definitions for writing columnar tables. This
 * includes the logic for writing file level metadata, writing row stripes,
 * and calculating chunk skip nodes.
 *
 * Copyright (c) 2016, Citus Data, Inc.
 *
 * $Id$
 *
 *-------------------------------------------------------------------------
 */
|
|
||||||
|
|
||||||
|
|
||||||
#include "postgres.h"
|
|
||||||
|
|
||||||
#include "miscadmin.h"
|
|
||||||
#include "safe_lib.h"
|
|
||||||
|
|
||||||
#include "access/heapam.h"
|
|
||||||
#include "access/nbtree.h"
|
|
||||||
#include "catalog/pg_am.h"
|
|
||||||
#include "storage/fd.h"
|
|
||||||
#include "storage/smgr.h"
|
|
||||||
#include "utils/guc.h"
|
|
||||||
#include "utils/memutils.h"
|
|
||||||
#include "utils/rel.h"
|
|
||||||
|
|
||||||
#include "pg_version_compat.h"
|
|
||||||
#include "pg_version_constants.h"
|
|
||||||
|
|
||||||
#include "columnar/columnar.h"
|
|
||||||
#include "columnar/columnar_storage.h"
|
|
||||||
#include "columnar/columnar_version_compat.h"
|
|
||||||
|
|
||||||
#if PG_VERSION_NUM >= PG_VERSION_16
|
|
||||||
#include "storage/relfilelocator.h"
|
|
||||||
#include "utils/relfilenumbermap.h"
|
|
||||||
#else
|
|
||||||
#include "utils/relfilenodemap.h"
|
|
||||||
#endif
|
|
||||||
|
|
||||||
struct ColumnarWriteState
|
|
||||||
{
|
|
||||||
TupleDesc tupleDescriptor;
|
|
||||||
FmgrInfo **comparisonFunctionArray;
|
|
||||||
RelFileLocator relfilelocator;
|
|
||||||
|
|
||||||
MemoryContext stripeWriteContext;
|
|
||||||
MemoryContext perTupleContext;
|
|
||||||
StripeBuffers *stripeBuffers;
|
|
||||||
StripeSkipList *stripeSkipList;
|
|
||||||
EmptyStripeReservation *emptyStripeReservation;
|
|
||||||
ColumnarOptions options;
|
|
||||||
ChunkData *chunkData;
|
|
||||||
|
|
||||||
List *chunkGroupRowCounts;
|
|
||||||
|
|
||||||
/*
 * The compressionBuffer is used as temporary storage during the
 * data value compression operation. It is kept here to minimize
 * memory allocations. It lives in stripeWriteContext and gets
 * deallocated when the memory context is reset.
 */
|
|
||||||
StringInfo compressionBuffer;
|
|
||||||
};
|
|
||||||
|
|
||||||
static StripeBuffers * CreateEmptyStripeBuffers(uint32 stripeMaxRowCount,
|
|
||||||
uint32 chunkRowCount,
|
|
||||||
uint32 columnCount);
|
|
||||||
static StripeSkipList * CreateEmptyStripeSkipList(uint32 stripeMaxRowCount,
|
|
||||||
uint32 chunkRowCount,
|
|
||||||
uint32 columnCount);
|
|
||||||
static void FlushStripe(ColumnarWriteState *writeState);
|
|
||||||
static StringInfo SerializeBoolArray(bool *boolArray, uint32 boolArrayLength);
|
|
||||||
static void SerializeSingleDatum(StringInfo datumBuffer, Datum datum,
|
|
||||||
bool datumTypeByValue, int datumTypeLength,
|
|
||||||
char datumTypeAlign);
|
|
||||||
static void SerializeChunkData(ColumnarWriteState *writeState, uint32 chunkIndex,
|
|
||||||
uint32 rowCount);
|
|
||||||
static void UpdateChunkSkipNodeMinMax(ColumnChunkSkipNode *chunkSkipNode,
|
|
||||||
Datum columnValue, bool columnTypeByValue,
|
|
||||||
int columnTypeLength, Oid columnCollation,
|
|
||||||
FmgrInfo *comparisonFunction);
|
|
||||||
static Datum DatumCopy(Datum datum, bool datumTypeByValue, int datumTypeLength);
|
|
||||||
static StringInfo CopyStringInfo(StringInfo sourceString);
|
|
||||||
|
|
||||||
/*
|
|
||||||
* ColumnarBeginWrite initializes a columnar data load operation and returns a table
|
|
||||||
* handle. This handle should be used for adding the row values and finishing the
|
|
||||||
* data load operation.
|
|
||||||
*/
|
|
||||||
ColumnarWriteState *
|
|
||||||
ColumnarBeginWrite(RelFileLocator relfilelocator,
|
|
||||||
ColumnarOptions options,
|
|
||||||
TupleDesc tupleDescriptor)
|
|
||||||
{
|
|
||||||
/* get comparison function pointers for each of the columns */
|
|
||||||
uint32 columnCount = tupleDescriptor->natts;
|
|
||||||
FmgrInfo **comparisonFunctionArray = palloc0(columnCount * sizeof(FmgrInfo *));
|
|
||||||
for (uint32 columnIndex = 0; columnIndex < columnCount; columnIndex++)
|
|
||||||
{
|
|
||||||
FmgrInfo *comparisonFunction = NULL;
|
|
||||||
FormData_pg_attribute *attributeForm = TupleDescAttr(tupleDescriptor,
|
|
||||||
columnIndex);
|
|
||||||
|
|
||||||
if (!attributeForm->attisdropped)
|
|
||||||
{
|
|
||||||
Oid typeId = attributeForm->atttypid;
|
|
||||||
|
|
||||||
comparisonFunction = GetFunctionInfoOrNull(typeId, BTREE_AM_OID,
|
|
||||||
BTORDER_PROC);
|
|
||||||
}
|
|
||||||
|
|
||||||
comparisonFunctionArray[columnIndex] = comparisonFunction;
|
|
||||||
}
|
|
||||||
|
|
||||||
/*
|
|
||||||
* We allocate all stripe specific data in the stripeWriteContext, and
|
|
||||||
* reset this memory context once we have flushed the stripe to the file.
|
|
||||||
* This is to avoid memory leaks.
|
|
||||||
*/
|
|
||||||
MemoryContext stripeWriteContext = AllocSetContextCreate(CurrentMemoryContext,
|
|
||||||
"Stripe Write Memory Context",
|
|
||||||
ALLOCSET_DEFAULT_SIZES);
|
|
||||||
|
|
||||||
bool *columnMaskArray = palloc(columnCount * sizeof(bool));
|
|
||||||
memset(columnMaskArray, true, columnCount * sizeof(bool));
|
|
||||||
|
|
||||||
ChunkData *chunkData = CreateEmptyChunkData(columnCount, columnMaskArray,
|
|
||||||
options.chunkRowCount);
|
|
||||||
|
|
||||||
ColumnarWriteState *writeState = palloc0(sizeof(ColumnarWriteState));
|
|
||||||
writeState->relfilelocator = relfilelocator;
|
|
||||||
writeState->options = options;
|
|
||||||
writeState->tupleDescriptor = CreateTupleDescCopy(tupleDescriptor);
|
|
||||||
writeState->comparisonFunctionArray = comparisonFunctionArray;
|
|
||||||
writeState->stripeBuffers = NULL;
|
|
||||||
writeState->stripeSkipList = NULL;
|
|
||||||
writeState->emptyStripeReservation = NULL;
|
|
||||||
writeState->stripeWriteContext = stripeWriteContext;
|
|
||||||
writeState->chunkData = chunkData;
|
|
||||||
writeState->compressionBuffer = NULL;
|
|
||||||
writeState->perTupleContext = AllocSetContextCreate(CurrentMemoryContext,
|
|
||||||
"Columnar per tuple context",
|
|
||||||
ALLOCSET_DEFAULT_SIZES);
|
|
||||||
|
|
||||||
return writeState;
|
|
||||||
}
|
|
||||||
|
|
||||||
|
|
||||||
/*
 * ColumnarWriteRow adds a row to the columnar table. If the stripe is not
 * initialized, we create structures to hold stripe data and the skip list.
 * Then, we serialize and append data to the serialized value buffer for each
 * of the columns and update the corresponding skip nodes. The whole chunk's
 * data is compressed after every chunkRowCount insertions. Then, if the row
 * count exceeds stripeMaxRowCount, we flush the stripe and add its metadata
 * to the table footer.
 *
 * Returns the "row number" assigned to the written row.
 */
|
|
||||||
uint64
|
|
||||||
ColumnarWriteRow(ColumnarWriteState *writeState, Datum *columnValues, bool *columnNulls)
|
|
||||||
{
|
|
||||||
uint32 columnIndex = 0;
|
|
||||||
StripeBuffers *stripeBuffers = writeState->stripeBuffers;
|
|
||||||
StripeSkipList *stripeSkipList = writeState->stripeSkipList;
|
|
||||||
uint32 columnCount = writeState->tupleDescriptor->natts;
|
|
||||||
ColumnarOptions *options = &writeState->options;
|
|
||||||
const uint32 chunkRowCount = options->chunkRowCount;
|
|
||||||
ChunkData *chunkData = writeState->chunkData;
|
|
||||||
MemoryContext oldContext = MemoryContextSwitchTo(writeState->stripeWriteContext);
|
|
||||||
|
|
||||||
if (stripeBuffers == NULL)
|
|
||||||
{
|
|
||||||
stripeBuffers = CreateEmptyStripeBuffers(options->stripeRowCount,
|
|
||||||
chunkRowCount, columnCount);
|
|
||||||
stripeSkipList = CreateEmptyStripeSkipList(options->stripeRowCount,
|
|
||||||
chunkRowCount, columnCount);
|
|
||||||
writeState->stripeBuffers = stripeBuffers;
|
|
||||||
writeState->stripeSkipList = stripeSkipList;
|
|
||||||
writeState->compressionBuffer = makeStringInfo();
|
|
||||||
|
|
||||||
Oid relationId = RelidByRelfilenumber(RelationTablespace_compat(
|
|
||||||
writeState->relfilelocator),
|
|
||||||
RelationPhysicalIdentifierNumber_compat(
|
|
||||||
writeState->relfilelocator));
|
|
||||||
Relation relation = relation_open(relationId, NoLock);
|
|
||||||
writeState->emptyStripeReservation =
|
|
||||||
ReserveEmptyStripe(relation, columnCount, chunkRowCount,
|
|
||||||
options->stripeRowCount);
|
|
||||||
relation_close(relation, NoLock);
|
|
||||||
|
|
||||||
/*
|
|
||||||
* serializedValueBuffer lives in stripe write memory context so it needs to be
|
|
||||||
* initialized when the stripe is created.
|
|
||||||
*/
|
|
||||||
for (columnIndex = 0; columnIndex < columnCount; columnIndex++)
|
|
||||||
{
|
|
||||||
chunkData->valueBufferArray[columnIndex] = makeStringInfo();
|
|
||||||
}
|
|
||||||
}
|
|
||||||
|
|
||||||
uint32 chunkIndex = stripeBuffers->rowCount / chunkRowCount;
|
|
||||||
uint32 chunkRowIndex = stripeBuffers->rowCount % chunkRowCount;
|
|
||||||
|
|
||||||
for (columnIndex = 0; columnIndex < columnCount; columnIndex++)
|
|
||||||
{
|
|
||||||
ColumnChunkSkipNode **chunkSkipNodeArray = stripeSkipList->chunkSkipNodeArray;
|
|
||||||
ColumnChunkSkipNode *chunkSkipNode =
|
|
||||||
&chunkSkipNodeArray[columnIndex][chunkIndex];
|
|
||||||
|
|
||||||
if (columnNulls[columnIndex])
|
|
||||||
{
|
|
||||||
chunkData->existsArray[columnIndex][chunkRowIndex] = false;
|
|
||||||
}
|
|
||||||
else
|
|
||||||
{
|
|
||||||
FmgrInfo *comparisonFunction =
|
|
||||||
writeState->comparisonFunctionArray[columnIndex];
|
|
||||||
Form_pg_attribute attributeForm =
|
|
||||||
TupleDescAttr(writeState->tupleDescriptor, columnIndex);
|
|
||||||
bool columnTypeByValue = attributeForm->attbyval;
|
|
||||||
int columnTypeLength = attributeForm->attlen;
|
|
||||||
Oid columnCollation = attributeForm->attcollation;
|
|
||||||
char columnTypeAlign = attributeForm->attalign;
|
|
||||||
|
|
||||||
chunkData->existsArray[columnIndex][chunkRowIndex] = true;
|
|
||||||
|
|
||||||
SerializeSingleDatum(chunkData->valueBufferArray[columnIndex],
|
|
||||||
columnValues[columnIndex], columnTypeByValue,
|
|
||||||
columnTypeLength, columnTypeAlign);
|
|
||||||
|
|
||||||
UpdateChunkSkipNodeMinMax(chunkSkipNode, columnValues[columnIndex],
|
|
||||||
columnTypeByValue, columnTypeLength,
|
|
||||||
columnCollation, comparisonFunction);
|
|
||||||
}
|
|
||||||
|
|
||||||
chunkSkipNode->rowCount++;
|
|
||||||
}
|
|
||||||
|
|
||||||
stripeSkipList->chunkCount = chunkIndex + 1;
|
|
||||||
|
|
||||||
/* the last row of the chunk is inserted, serialize the chunk */
|
|
||||||
if (chunkRowIndex == chunkRowCount - 1)
|
|
||||||
{
|
|
||||||
SerializeChunkData(writeState, chunkIndex, chunkRowCount);
|
|
||||||
}
|
|
||||||
|
|
||||||
uint64 writtenRowNumber = writeState->emptyStripeReservation->stripeFirstRowNumber +
|
|
||||||
stripeBuffers->rowCount;
|
|
||||||
stripeBuffers->rowCount++;
|
|
||||||
if (stripeBuffers->rowCount >= options->stripeRowCount)
|
|
||||||
{
|
|
||||||
ColumnarFlushPendingWrites(writeState);
|
|
||||||
}
|
|
||||||
|
|
||||||
MemoryContextSwitchTo(oldContext);
|
|
||||||
|
|
||||||
return writtenRowNumber;
|
|
||||||
}
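/*
 * Illustrative sketch (not part of the original file): how the chunkIndex /
 * chunkRowIndex arithmetic above places a row, using a hypothetical
 * chunkRowCount of 10000. With 25000 rows already buffered in the stripe,
 * the next row lands in chunk 2 at in-chunk row 5000.
 */
#include <stdint.h>
#include <stdio.h>

int
main(void)
{
	uint32_t chunkRowCount = 10000;		/* assumed option value for the example */
	uint32_t stripeRowCount = 25000;	/* rows already buffered in the stripe */

	uint32_t chunkIndex = stripeRowCount / chunkRowCount;
	uint32_t chunkRowIndex = stripeRowCount % chunkRowCount;

	/* prints "chunk 2, row 5000 within the chunk" */
	printf("chunk %u, row %u within the chunk\n", chunkIndex, chunkRowIndex);
	return 0;
}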
|
|
||||||
|
|
||||||
|
|
||||||
/*
|
|
||||||
* ColumnarEndWrite finishes a columnar data load operation. If we have an unflushed
|
|
||||||
* stripe, we flush it.
|
|
||||||
*/
|
|
||||||
void
|
|
||||||
ColumnarEndWrite(ColumnarWriteState *writeState)
|
|
||||||
{
|
|
||||||
ColumnarFlushPendingWrites(writeState);
|
|
||||||
|
|
||||||
MemoryContextDelete(writeState->stripeWriteContext);
|
|
||||||
pfree(writeState->comparisonFunctionArray);
|
|
||||||
FreeChunkData(writeState->chunkData);
|
|
||||||
pfree(writeState);
|
|
||||||
}
|
|
||||||
|
|
||||||
|
|
||||||
void
|
|
||||||
ColumnarFlushPendingWrites(ColumnarWriteState *writeState)
|
|
||||||
{
|
|
||||||
StripeBuffers *stripeBuffers = writeState->stripeBuffers;
|
|
||||||
if (stripeBuffers != NULL)
|
|
||||||
{
|
|
||||||
MemoryContext oldContext = MemoryContextSwitchTo(writeState->stripeWriteContext);
|
|
||||||
|
|
||||||
FlushStripe(writeState);
|
|
||||||
MemoryContextReset(writeState->stripeWriteContext);
|
|
||||||
|
|
||||||
/* set stripe data and skip list to NULL so they are recreated next time */
|
|
||||||
writeState->stripeBuffers = NULL;
|
|
||||||
writeState->stripeSkipList = NULL;
|
|
||||||
|
|
||||||
MemoryContextSwitchTo(oldContext);
|
|
||||||
}
|
|
||||||
}
|
|
||||||
|
|
||||||
|
|
||||||
/*
|
|
||||||
* ColumnarWritePerTupleContext
|
|
||||||
*
|
|
||||||
* Return per-tuple context for columnar write operation.
|
|
||||||
*/
|
|
||||||
MemoryContext
|
|
||||||
ColumnarWritePerTupleContext(ColumnarWriteState *state)
|
|
||||||
{
|
|
||||||
return state->perTupleContext;
|
|
||||||
}
|
|
||||||
|
|
||||||
|
|
||||||
/*
|
|
||||||
* CreateEmptyStripeBuffers allocates an empty StripeBuffers structure with the given
|
|
||||||
* column count.
|
|
||||||
*/
|
|
||||||
static StripeBuffers *
|
|
||||||
CreateEmptyStripeBuffers(uint32 stripeMaxRowCount, uint32 chunkRowCount,
|
|
||||||
uint32 columnCount)
|
|
||||||
{
|
|
||||||
uint32 columnIndex = 0;
|
|
||||||
uint32 maxChunkCount = (stripeMaxRowCount / chunkRowCount) + 1;
|
|
||||||
ColumnBuffers **columnBuffersArray = palloc0(columnCount * sizeof(ColumnBuffers *));
|
|
||||||
|
|
||||||
for (columnIndex = 0; columnIndex < columnCount; columnIndex++)
|
|
||||||
{
|
|
||||||
uint32 chunkIndex = 0;
|
|
||||||
ColumnChunkBuffers **chunkBuffersArray =
|
|
||||||
palloc0(maxChunkCount * sizeof(ColumnChunkBuffers *));
|
|
||||||
|
|
||||||
for (chunkIndex = 0; chunkIndex < maxChunkCount; chunkIndex++)
|
|
||||||
{
|
|
||||||
chunkBuffersArray[chunkIndex] = palloc0(sizeof(ColumnChunkBuffers));
|
|
||||||
chunkBuffersArray[chunkIndex]->existsBuffer = NULL;
|
|
||||||
chunkBuffersArray[chunkIndex]->valueBuffer = NULL;
|
|
||||||
chunkBuffersArray[chunkIndex]->valueCompressionType = COMPRESSION_NONE;
|
|
||||||
}
|
|
||||||
|
|
||||||
columnBuffersArray[columnIndex] = palloc0(sizeof(ColumnBuffers));
|
|
||||||
columnBuffersArray[columnIndex]->chunkBuffersArray = chunkBuffersArray;
|
|
||||||
}
|
|
||||||
|
|
||||||
StripeBuffers *stripeBuffers = palloc0(sizeof(StripeBuffers));
|
|
||||||
stripeBuffers->columnBuffersArray = columnBuffersArray;
|
|
||||||
stripeBuffers->columnCount = columnCount;
|
|
||||||
stripeBuffers->rowCount = 0;
|
|
||||||
|
|
||||||
return stripeBuffers;
|
|
||||||
}
|
|
||||||
|
|
||||||
|
|
||||||
/*
|
|
||||||
* CreateEmptyStripeSkipList allocates an empty StripeSkipList structure with
|
|
||||||
* the given column count. This structure has enough chunks to hold statistics
|
|
||||||
* for stripeMaxRowCount rows.
|
|
||||||
*/
|
|
||||||
static StripeSkipList *
|
|
||||||
CreateEmptyStripeSkipList(uint32 stripeMaxRowCount, uint32 chunkRowCount,
|
|
||||||
uint32 columnCount)
|
|
||||||
{
|
|
||||||
uint32 columnIndex = 0;
|
|
||||||
uint32 maxChunkCount = (stripeMaxRowCount / chunkRowCount) + 1;
|
|
||||||
|
|
||||||
ColumnChunkSkipNode **chunkSkipNodeArray =
|
|
||||||
palloc0(columnCount * sizeof(ColumnChunkSkipNode *));
|
|
||||||
for (columnIndex = 0; columnIndex < columnCount; columnIndex++)
|
|
||||||
{
|
|
||||||
chunkSkipNodeArray[columnIndex] =
|
|
||||||
palloc0(maxChunkCount * sizeof(ColumnChunkSkipNode));
|
|
||||||
}
|
|
||||||
|
|
||||||
StripeSkipList *stripeSkipList = palloc0(sizeof(StripeSkipList));
|
|
||||||
stripeSkipList->columnCount = columnCount;
|
|
||||||
stripeSkipList->chunkCount = 0;
|
|
||||||
stripeSkipList->chunkSkipNodeArray = chunkSkipNodeArray;
|
|
||||||
|
|
||||||
return stripeSkipList;
|
|
||||||
}
|
|
||||||
|
|
||||||
|
|
||||||
/*
|
|
||||||
* FlushStripe flushes current stripe data into the file. The function first ensures
|
|
||||||
* the last data chunk for each column is properly serialized and compressed. Then,
|
|
||||||
* the function creates the skip list and footer buffers. Finally, the function
|
|
||||||
* flushes the skip list, data, and footer buffers to the file.
|
|
||||||
*/
|
|
||||||
static void
|
|
||||||
FlushStripe(ColumnarWriteState *writeState)
|
|
||||||
{
|
|
||||||
uint32 columnIndex = 0;
|
|
||||||
uint32 chunkIndex = 0;
|
|
||||||
StripeBuffers *stripeBuffers = writeState->stripeBuffers;
|
|
||||||
StripeSkipList *stripeSkipList = writeState->stripeSkipList;
|
|
||||||
ColumnChunkSkipNode **columnSkipNodeArray = stripeSkipList->chunkSkipNodeArray;
|
|
||||||
TupleDesc tupleDescriptor = writeState->tupleDescriptor;
|
|
||||||
uint32 columnCount = tupleDescriptor->natts;
|
|
||||||
uint32 chunkCount = stripeSkipList->chunkCount;
|
|
||||||
uint32 chunkRowCount = writeState->options.chunkRowCount;
|
|
||||||
uint32 lastChunkIndex = stripeBuffers->rowCount / chunkRowCount;
|
|
||||||
uint32 lastChunkRowCount = stripeBuffers->rowCount % chunkRowCount;
|
|
||||||
uint64 stripeSize = 0;
|
|
||||||
uint64 stripeRowCount = stripeBuffers->rowCount;
|
|
||||||
|
|
||||||
elog(DEBUG1, "Flushing Stripe of size %d", stripeBuffers->rowCount);
|
|
||||||
|
|
||||||
Oid relationId = RelidByRelfilenumber(RelationTablespace_compat(
|
|
||||||
writeState->relfilelocator),
|
|
||||||
RelationPhysicalIdentifierNumber_compat(
|
|
||||||
writeState->relfilelocator));
|
|
||||||
Relation relation = relation_open(relationId, NoLock);
|
|
||||||
|
|
||||||
/*
 * Check if the last chunk needs serialization; it was not serialized yet
 * if it was not full (i.e. lastChunkRowCount > 0).
 */
|
|
||||||
if (lastChunkRowCount > 0)
|
|
||||||
{
|
|
||||||
SerializeChunkData(writeState, lastChunkIndex, lastChunkRowCount);
|
|
||||||
}
|
|
||||||
|
|
||||||
/* update buffer sizes in stripe skip list */
|
|
||||||
for (columnIndex = 0; columnIndex < columnCount; columnIndex++)
|
|
||||||
{
|
|
||||||
ColumnChunkSkipNode *chunkSkipNodeArray = columnSkipNodeArray[columnIndex];
|
|
||||||
ColumnBuffers *columnBuffers = stripeBuffers->columnBuffersArray[columnIndex];
|
|
||||||
|
|
||||||
for (chunkIndex = 0; chunkIndex < chunkCount; chunkIndex++)
|
|
||||||
{
|
|
||||||
ColumnChunkBuffers *chunkBuffers =
|
|
||||||
columnBuffers->chunkBuffersArray[chunkIndex];
|
|
||||||
uint64 existsBufferSize = chunkBuffers->existsBuffer->len;
|
|
||||||
ColumnChunkSkipNode *chunkSkipNode = &chunkSkipNodeArray[chunkIndex];
|
|
||||||
|
|
||||||
chunkSkipNode->existsChunkOffset = stripeSize;
|
|
||||||
chunkSkipNode->existsLength = existsBufferSize;
|
|
||||||
stripeSize += existsBufferSize;
|
|
||||||
}
|
|
||||||
|
|
||||||
for (chunkIndex = 0; chunkIndex < chunkCount; chunkIndex++)
|
|
||||||
{
|
|
||||||
ColumnChunkBuffers *chunkBuffers =
|
|
||||||
columnBuffers->chunkBuffersArray[chunkIndex];
|
|
||||||
uint64 valueBufferSize = chunkBuffers->valueBuffer->len;
|
|
||||||
CompressionType valueCompressionType = chunkBuffers->valueCompressionType;
|
|
||||||
ColumnChunkSkipNode *chunkSkipNode = &chunkSkipNodeArray[chunkIndex];
|
|
||||||
|
|
||||||
chunkSkipNode->valueChunkOffset = stripeSize;
|
|
||||||
chunkSkipNode->valueLength = valueBufferSize;
|
|
||||||
chunkSkipNode->valueCompressionType = valueCompressionType;
|
|
||||||
chunkSkipNode->valueCompressionLevel = writeState->options.compressionLevel;
|
|
||||||
chunkSkipNode->decompressedValueSize = chunkBuffers->decompressedValueSize;
|
|
||||||
|
|
||||||
stripeSize += valueBufferSize;
|
|
||||||
}
|
|
||||||
}
|
|
||||||
|
|
||||||
StripeMetadata *stripeMetadata =
|
|
||||||
CompleteStripeReservation(relation, writeState->emptyStripeReservation->stripeId,
|
|
||||||
stripeSize, stripeRowCount, chunkCount);
|
|
||||||
|
|
||||||
uint64 currentFileOffset = stripeMetadata->fileOffset;
|
|
||||||
|
|
||||||
/*
 * Each stripe has only one section:
 * Data section, in which we store data for each column continuously.
 * We store the data for each column in chunks. For each chunk, we store
 * two buffers: an "exists" buffer and a "value" buffer. The "exists"
 * buffer tells which values are not NULL; the "value" buffer contains the
 * values that are present. For each column, we first store all "exists"
 * buffers, and then all "value" buffers.
 */
|
|
||||||
|
|
||||||
/* flush the data buffers */
|
|
||||||
for (columnIndex = 0; columnIndex < columnCount; columnIndex++)
|
|
||||||
{
|
|
||||||
ColumnBuffers *columnBuffers = stripeBuffers->columnBuffersArray[columnIndex];
|
|
||||||
|
|
||||||
for (chunkIndex = 0; chunkIndex < stripeSkipList->chunkCount; chunkIndex++)
|
|
||||||
{
|
|
||||||
ColumnChunkBuffers *chunkBuffers =
|
|
||||||
columnBuffers->chunkBuffersArray[chunkIndex];
|
|
||||||
StringInfo existsBuffer = chunkBuffers->existsBuffer;
|
|
||||||
|
|
||||||
ColumnarStorageWrite(relation, currentFileOffset,
|
|
||||||
existsBuffer->data, existsBuffer->len);
|
|
||||||
currentFileOffset += existsBuffer->len;
|
|
||||||
}
|
|
||||||
|
|
||||||
for (chunkIndex = 0; chunkIndex < stripeSkipList->chunkCount; chunkIndex++)
|
|
||||||
{
|
|
||||||
ColumnChunkBuffers *chunkBuffers =
|
|
||||||
columnBuffers->chunkBuffersArray[chunkIndex];
|
|
||||||
StringInfo valueBuffer = chunkBuffers->valueBuffer;
|
|
||||||
|
|
||||||
ColumnarStorageWrite(relation, currentFileOffset,
|
|
||||||
valueBuffer->data, valueBuffer->len);
|
|
||||||
currentFileOffset += valueBuffer->len;
|
|
||||||
}
|
|
||||||
}
|
|
||||||
|
|
||||||
SaveChunkGroups(writeState->relfilelocator,
|
|
||||||
stripeMetadata->id,
|
|
||||||
writeState->chunkGroupRowCounts);
|
|
||||||
SaveStripeSkipList(writeState->relfilelocator,
|
|
||||||
stripeMetadata->id,
|
|
||||||
stripeSkipList, tupleDescriptor);
|
|
||||||
|
|
||||||
writeState->chunkGroupRowCounts = NIL;
|
|
||||||
|
|
||||||
relation_close(relation, NoLock);
|
|
||||||
}
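/*
 * Illustrative sketch (not part of the original file): the offset
 * bookkeeping FlushStripe performs, for a hypothetical stripe with 2 columns
 * and 2 chunks. For each column, all "exists" buffers are laid out first,
 * then all "value" buffers, and the running total becomes the stripe size.
 */
#include <stdint.h>
#include <stdio.h>

int
main(void)
{
	uint64_t existsLen[2][2] = { { 10, 10 }, { 10, 10 } };
	uint64_t valueLen[2][2] = { { 100, 80 }, { 200, 150 } };
	uint64_t stripeSize = 0;

	for (int column = 0; column < 2; column++)
	{
		for (int chunk = 0; chunk < 2; chunk++)
		{
			printf("column %d chunk %d exists at offset %llu\n",
				   column, chunk, (unsigned long long) stripeSize);
			stripeSize += existsLen[column][chunk];
		}
		for (int chunk = 0; chunk < 2; chunk++)
		{
			printf("column %d chunk %d values at offset %llu\n",
				   column, chunk, (unsigned long long) stripeSize);
			stripeSize += valueLen[column][chunk];
		}
	}

	/* prints "stripe size 570" */
	printf("stripe size %llu\n", (unsigned long long) stripeSize);
	return 0;
}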
|
|
||||||
|
|
||||||
|
|
||||||
/*
|
|
||||||
* SerializeBoolArray serializes the given boolean array and returns the result
|
|
||||||
* as a StringInfo. This function packs every 8 boolean values into one byte.
|
|
||||||
*/
|
|
||||||
static StringInfo
|
|
||||||
SerializeBoolArray(bool *boolArray, uint32 boolArrayLength)
|
|
||||||
{
|
|
||||||
uint32 boolArrayIndex = 0;
|
|
||||||
uint32 byteCount = ((boolArrayLength * sizeof(bool)) + (8 - sizeof(bool))) / 8;
|
|
||||||
|
|
||||||
StringInfo boolArrayBuffer = makeStringInfo();
|
|
||||||
enlargeStringInfo(boolArrayBuffer, byteCount);
|
|
||||||
boolArrayBuffer->len = byteCount;
|
|
||||||
memset(boolArrayBuffer->data, 0, byteCount);
|
|
||||||
|
|
||||||
for (boolArrayIndex = 0; boolArrayIndex < boolArrayLength; boolArrayIndex++)
|
|
||||||
{
|
|
||||||
if (boolArray[boolArrayIndex])
|
|
||||||
{
|
|
||||||
uint32 byteIndex = boolArrayIndex / 8;
|
|
||||||
uint32 bitIndex = boolArrayIndex % 8;
|
|
||||||
boolArrayBuffer->data[byteIndex] |= (1 << bitIndex);
|
|
||||||
}
|
|
||||||
}
|
|
||||||
|
|
||||||
return boolArrayBuffer;
|
|
||||||
}
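/*
 * Illustrative sketch (not part of the original file): packing a boolean
 * "exists" array into a bitmap the way SerializeBoolArray does, eight values
 * per byte with the bit position taken from the array index modulo 8.
 */
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

int
main(void)
{
	bool exists[10] = { true, false, true, true, false, false, false, true,
						true, false };
	uint32_t length = 10;
	uint32_t byteCount = (length + 7) / 8;
	uint8_t packed[2] = { 0 };

	for (uint32_t i = 0; i < length; i++)
	{
		if (exists[i])
		{
			packed[i / 8] |= (uint8_t) (1 << (i % 8));
		}
	}

	/* prints "8d 01": bits 0, 2, 3 and 7 set in byte 0, bit 0 set in byte 1 */
	for (uint32_t i = 0; i < byteCount; i++)
	{
		printf("%02x ", packed[i]);
	}
	printf("\n");
	return 0;
}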
|
|
||||||
|
|
||||||
|
|
||||||
/*
 * SerializeSingleDatum serializes the given datum value and appends it to the
 * provided string info buffer.
 *
 * Since we don't want to limit the datum buffer size to RSIZE_MAX
 * unnecessarily, we use memcpy instead of memcpy_s in several places in this
 * function.
 */
|
|
||||||
static void
|
|
||||||
SerializeSingleDatum(StringInfo datumBuffer, Datum datum, bool datumTypeByValue,
|
|
||||||
int datumTypeLength, char datumTypeAlign)
|
|
||||||
{
|
|
||||||
uint32 datumLength = att_addlength_datum(0, datumTypeLength, datum);
|
|
||||||
uint32 datumLengthAligned = att_align_nominal(datumLength, datumTypeAlign);
|
|
||||||
|
|
||||||
enlargeStringInfo(datumBuffer, datumLengthAligned);
|
|
||||||
|
|
||||||
char *currentDatumDataPointer = datumBuffer->data + datumBuffer->len;
|
|
||||||
memset(currentDatumDataPointer, 0, datumLengthAligned);
|
|
||||||
|
|
||||||
if (datumTypeLength > 0)
|
|
||||||
{
|
|
||||||
if (datumTypeByValue)
|
|
||||||
{
|
|
||||||
store_att_byval(currentDatumDataPointer, datum, datumTypeLength);
|
|
||||||
}
|
|
||||||
else
|
|
||||||
{
|
|
||||||
memcpy(currentDatumDataPointer, DatumGetPointer(datum), datumTypeLength); /* IGNORE-BANNED */
|
|
||||||
}
|
|
||||||
}
|
|
||||||
else
|
|
||||||
{
|
|
||||||
Assert(!datumTypeByValue);
|
|
||||||
memcpy(currentDatumDataPointer, DatumGetPointer(datum), datumLength); /* IGNORE-BANNED */
|
|
||||||
}
|
|
||||||
|
|
||||||
datumBuffer->len += datumLengthAligned;
|
|
||||||
}
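/*
 * Illustrative sketch (not part of the original file): the rounding that the
 * att_align_nominal() call above performs for power-of-two alignments. A
 * 4-byte value under 'i' (4-byte) alignment needs no padding, while a 5-byte
 * value under 'd' (8-byte) alignment is padded up to 8 bytes in the buffer.
 */
#include <stdint.h>
#include <stdio.h>

static uint32_t
example_align_up(uint32_t length, uint32_t alignment)
{
	/* round length up to the next multiple of a power-of-two alignment */
	return (length + alignment - 1) & ~(alignment - 1);
}

int
main(void)
{
	printf("%u\n", example_align_up(4, 4));	/* prints 4 */
	printf("%u\n", example_align_up(5, 8));	/* prints 8 */
	return 0;
}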
|
|
||||||
|
|
||||||
|
|
||||||
/*
|
|
||||||
* SerializeChunkData serializes and compresses chunk data at given chunk index with given
|
|
||||||
* compression type for every column.
|
|
||||||
*/
|
|
||||||
static void
|
|
||||||
SerializeChunkData(ColumnarWriteState *writeState, uint32 chunkIndex, uint32 rowCount)
|
|
||||||
{
|
|
||||||
uint32 columnIndex = 0;
|
|
||||||
StripeBuffers *stripeBuffers = writeState->stripeBuffers;
|
|
||||||
ChunkData *chunkData = writeState->chunkData;
|
|
||||||
CompressionType requestedCompressionType = writeState->options.compressionType;
|
|
||||||
int compressionLevel = writeState->options.compressionLevel;
|
|
||||||
const uint32 columnCount = stripeBuffers->columnCount;
|
|
||||||
StringInfo compressionBuffer = writeState->compressionBuffer;
|
|
||||||
|
|
||||||
writeState->chunkGroupRowCounts =
|
|
||||||
lappend_int(writeState->chunkGroupRowCounts, rowCount);
|
|
||||||
|
|
||||||
/* serialize exist values, data values are already serialized */
|
|
||||||
for (columnIndex = 0; columnIndex < columnCount; columnIndex++)
|
|
||||||
{
|
|
||||||
ColumnBuffers *columnBuffers = stripeBuffers->columnBuffersArray[columnIndex];
|
|
||||||
ColumnChunkBuffers *chunkBuffers = columnBuffers->chunkBuffersArray[chunkIndex];
|
|
||||||
|
|
||||||
chunkBuffers->existsBuffer =
|
|
||||||
SerializeBoolArray(chunkData->existsArray[columnIndex], rowCount);
|
|
||||||
}
|
|
||||||
|
|
||||||
/*
 * Check and compress the value buffers; if a value buffer is not
 * compressible, keep it uncompressed and store the compression information.
 */
|
|
||||||
for (columnIndex = 0; columnIndex < columnCount; columnIndex++)
|
|
||||||
{
|
|
||||||
ColumnBuffers *columnBuffers = stripeBuffers->columnBuffersArray[columnIndex];
|
|
||||||
ColumnChunkBuffers *chunkBuffers = columnBuffers->chunkBuffersArray[chunkIndex];
|
|
||||||
CompressionType actualCompressionType = COMPRESSION_NONE;
|
|
||||||
|
|
||||||
StringInfo serializedValueBuffer = chunkData->valueBufferArray[columnIndex];
|
|
||||||
|
|
||||||
Assert(requestedCompressionType >= 0 &&
|
|
||||||
requestedCompressionType < COMPRESSION_COUNT);
|
|
||||||
|
|
||||||
chunkBuffers->decompressedValueSize =
|
|
||||||
chunkData->valueBufferArray[columnIndex]->len;
|
|
||||||
|
|
||||||
/*
 * If serializedValueBuffer can be compressed, update it with the
 * compressed data and store the compression type.
 */
|
|
||||||
bool compressed = CompressBuffer(serializedValueBuffer, compressionBuffer,
|
|
||||||
requestedCompressionType,
|
|
||||||
compressionLevel);
|
|
||||||
if (compressed)
|
|
||||||
{
|
|
||||||
serializedValueBuffer = compressionBuffer;
|
|
||||||
actualCompressionType = requestedCompressionType;
|
|
||||||
}
|
|
||||||
|
|
||||||
/* store (compressed) value buffer */
|
|
||||||
chunkBuffers->valueCompressionType = actualCompressionType;
|
|
||||||
chunkBuffers->valueBuffer = CopyStringInfo(serializedValueBuffer);
|
|
||||||
|
|
||||||
/* valueBuffer needs to be reset for next chunk's data */
|
|
||||||
resetStringInfo(chunkData->valueBufferArray[columnIndex]);
|
|
||||||
}
|
|
||||||
}


/*
 * UpdateChunkSkipNodeMinMax takes the given column value, and checks if this
 * value falls outside the range of minimum/maximum values of the given column
 * chunk skip node. If it does, the function updates the column chunk skip node
 * accordingly.
 */
static void
UpdateChunkSkipNodeMinMax(ColumnChunkSkipNode *chunkSkipNode, Datum columnValue,
                          bool columnTypeByValue, int columnTypeLength,
                          Oid columnCollation, FmgrInfo *comparisonFunction)
{
    bool hasMinMax = chunkSkipNode->hasMinMax;
    Datum previousMinimum = chunkSkipNode->minimumValue;
    Datum previousMaximum = chunkSkipNode->maximumValue;
    Datum currentMinimum = 0;
    Datum currentMaximum = 0;

    /* if type doesn't have a comparison function, skip min/max values */
    if (comparisonFunction == NULL)
    {
        return;
    }

    if (!hasMinMax)
    {
        /* first value seen becomes both the minimum and the maximum */
        currentMinimum = DatumCopy(columnValue, columnTypeByValue, columnTypeLength);
        currentMaximum = DatumCopy(columnValue, columnTypeByValue, columnTypeLength);
    }
    else
    {
        /* comparison function returns a negative, zero, or positive int32 */
        Datum minimumComparisonDatum = FunctionCall2Coll(comparisonFunction,
                                                         columnCollation, columnValue,
                                                         previousMinimum);
        Datum maximumComparisonDatum = FunctionCall2Coll(comparisonFunction,
                                                         columnCollation, columnValue,
                                                         previousMaximum);
        int minimumComparison = DatumGetInt32(minimumComparisonDatum);
        int maximumComparison = DatumGetInt32(maximumComparisonDatum);

        if (minimumComparison < 0)
        {
            currentMinimum = DatumCopy(columnValue, columnTypeByValue, columnTypeLength);
        }
        else
        {
            currentMinimum = previousMinimum;
        }

        if (maximumComparison > 0)
        {
            currentMaximum = DatumCopy(columnValue, columnTypeByValue, columnTypeLength);
        }
        else
        {
            currentMaximum = previousMaximum;
        }
    }

    chunkSkipNode->hasMinMax = true;
    chunkSkipNode->minimumValue = currentMinimum;
    chunkSkipNode->maximumValue = currentMaximum;
}


/* Creates a copy of the given datum. */
static Datum
DatumCopy(Datum datum, bool datumTypeByValue, int datumTypeLength)
{
    Datum datumCopy = 0;

    if (datumTypeByValue)
    {
        datumCopy = datum;
    }
    else
    {
        uint32 datumLength = att_addlength_datum(0, datumTypeLength, datum);
        char *datumData = palloc0(datumLength);

        /*
         * We use IGNORE-BANNED here since we don't want to limit datum size to
         * RSIZE_MAX unnecessarily.
         */
        memcpy(datumData, DatumGetPointer(datum), datumLength); /* IGNORE-BANNED */

        datumCopy = PointerGetDatum(datumData);
    }

    return datumCopy;
}


/*
 * CopyStringInfo creates a deep copy of the given source string, allocating
 * only the needed amount of memory.
 */
static StringInfo
CopyStringInfo(StringInfo sourceString)
{
    StringInfo targetString = palloc0(sizeof(StringInfoData));

    if (sourceString->len > 0)
    {
        targetString->data = palloc0(sourceString->len);
        targetString->len = sourceString->len;
        targetString->maxlen = sourceString->len;

        /*
         * We use IGNORE-BANNED here since we don't want to limit string
         * buffer size to RSIZE_MAX unnecessarily.
         */
        memcpy(targetString->data, sourceString->data, sourceString->len); /* IGNORE-BANNED */
    }

    return targetString;
}

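/*
 * Descriptive comment added for clarity (not in the original source):
 * ContainsPendingWrites returns true when the given write state holds
 * buffered rows that have not yet been flushed to a stripe.
 */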
bool
ContainsPendingWrites(ColumnarWriteState *state)
{
    return state->stripeBuffers != NULL && state->stripeBuffers->rowCount != 0;
}