Setting Up a High-Availability PostgreSQL Database in Your Kubernetes Infrastructure

Introduction

PostgreSQL, commonly called Postgres, is a widely adopted open-source database known for a comprehensive feature set suited to enterprise production environments across a range of operating systems.

This article provides a detailed guide on setting up PostgreSQL in a High-Availability (HA) configuration within a Kubernetes environment.

What we will create

A basic installation runs Postgres as a single instance, but that is not what we will deploy. Instead, we will build the High Availability (HA) installation described below.

In this High Availability (HA) deployment, two instances of PostgreSQL are provisioned, each with dedicated storage. One instance operates as the Master, handling read/write transactions, while the other serves as a Replica, configured for read-only access.

Updates made to the Master are synchronized with the Replica to maintain data consistency.

In the event of a Master failure, the Replica automatically assumes the role of Master through a process known as failover. In Kubernetes, a new instance is typically scheduled to replace the failed Master, ensuring continuous service availability.
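
One simple way to see failover in action, once the cluster from this guide is running (cluster name my-pgsql-cluster and namespace pg, both defined later in this article), is to delete the current primary pod and watch the operator react. The pod name below is illustrative:

# Find the instances, then delete the current primary (pod name is an example)
kubectl get pods -n pg
kubectl delete pod my-pgsql-cluster-1 -n pg

# Watch the replica get promoted and a replacement instance appear
kubectl get pods -n pg -w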

Three Kubernetes Services are established for seamless connectivity:

  • rw: Handles all read and write transactions through the Master.

  • ro: Allows read-only transactions exclusively from the Replicas.

  • r: Provides read-only access from any available instance.

Assuming a microservices architecture, applications can connect to the appropriate Service based on their transactional needs. The rw Service dynamically updates to redirect transactions to the new Master post-failover, ensuring uninterrupted service without requiring manual intervention.
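
CloudNativePG names these Services after the cluster, so for the cluster created later in this article (my-pgsql-cluster in the pg namespace) an application inside the cluster could pick its endpoint as sketched below, assuming the app_user credentials and my_app_db database we define later:

# Read/write: always routed to the current Master
psql 'postgresql://app_user:<password>@my-pgsql-cluster-rw.pg:5432/my_app_db'

# Read-only: served by Replicas only
psql 'postgresql://app_user:<password>@my-pgsql-cluster-ro.pg:5432/my_app_db'

# Read-only, from any available instance
psql 'postgresql://app_user:<password>@my-pgsql-cluster-r.pg:5432/my_app_db'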

Next, we will proceed with the setup process to implement this robust PostgreSQL HA configuration on Kubernetes.

CloudNativePG Kubernetes operator

When deploying applications in a Kubernetes cluster, managing a multitude of manifest files can be complex. Helm charts simplify this by packaging the manifests into a single deployable unit. However, even with Helm, operational tasks like deployment, upgrades, and backups require ongoing management.

To streamline these tasks, Kubernetes operators automate routine operations within the cluster. An operator is itself an application deployed in Kubernetes that watches the resources it manages and handles tasks that would otherwise be performed manually.

Several operators cater to PostgreSQL deployments in Kubernetes, each offering distinct features and capabilities. For this guide, we will utilize CloudNativePG, renowned for its comprehensive array of options and configurations tailored specifically for PostgreSQL in Kubernetes environments.

By leveraging CloudNativePG's capabilities, we aim to simplify and enhance the management of PostgreSQL instances, ensuring scalability, reliability, and operational efficiency within Kubernetes clusters.

Krew

A key component of the CloudNativePG deployment is the cnpg plugin for kubectl. kubectl plugins are managed through the Krew plugin manager.

To proceed, Krew must be installed on the machine from which kubectl commands are executed; in our case, that is the k8s-master node.

To install Krew on an Ubuntu system, execute the following command:

(
  set -x; cd "$(mktemp -d)" &&
  OS="$(uname | tr '[:upper:]' '[:lower:]')" &&
  ARCH="$(uname -m | sed -e 's/x86_64/amd64/' -e 's/\(arm\)\(64\)\?.*/\1\2/' -e 's/aarch64$/arm64/')" &&
  KREW="krew-${OS}_${ARCH}" &&
  curl -fsSLO "https://github.com/kubernetes-sigs/krew/releases/latest/download/${KREW}.tar.gz" &&
  tar zxvf "${KREW}.tar.gz" &&
  ./"${KREW}" install krew
)

This script is copied from the Krew site itself. It looks complicated, but it simply detects your OS and CPU architecture and downloads and installs the matching release.

Add the install to your path by inserting the following into your .bashrc file:

export PATH="${KREW_ROOT:-$HOME/.krew}/bin:$PATH"

Reload with:

source ~/.bashrc

Check the installation with:

kubectl krew
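
If the installation succeeded, this prints Krew's help text. You can also check the version and list installed plugins with:

kubectl krew version
kubectl krew list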

cnpg Plugin Installation

You can now install the CloudNativePG plugin with:

kubectl krew install cnpg

This plugin can generate a manifest file for the CloudNativePG operator containing all the resources the operator needs. Generate it with:

kubectl cnpg install generate -n pg-operator --replicas 1 > pg-operator-manifests.yaml

The command writes a manifest file named pg-operator-manifests.yaml. It reflects the decision to configure:

  • A single replica of the operator. For high availability, typically three replicas are recommended when sufficient control plane or master nodes are available. Given the current infrastructure with a single master node, one replica has been chosen.

  • The operator is isolated within its designated namespace (pg-operator), ensuring it remains protected from inadvertent deletion.
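
Before moving on, it can be useful to see what the generated file actually contains. A quick sketch with standard shell tools lists the resource kinds and how often each appears:

grep '^kind:' pg-operator-manifests.yaml | sort | uniq -c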

Install the cnpg operator

Consider the operator an extension of the Kubernetes control plane, which we therefore want to run on the k8s-master node. This can be achieved by adding a toleration to the Deployment resource. A toleration tells the Kubernetes scheduler that a pod may be placed on nodes carrying a matching taint, which would otherwise repel it.

Locate the Deployment resource within the pg-operator-manifests.yaml file (search for kind: Deployment). Scroll down to the line terminationGracePeriodSeconds: 10 and, at the same indentation level, add the following:

tolerations:
- effect: NoSchedule
  key: node-role.kubernetes.io/master
  operator: Exists
- effect: NoSchedule
  key: node-role.kubernetes.io/control-plane
  operator: Exists

These entries allow the pod to be scheduled on nodes tainted with NoSchedule for the master or control-plane role, taints that would otherwise block scheduling on those nodes.
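
If you want to confirm which taints your nodes actually carry, and therefore which tolerations are needed, you can inspect the node; the node name here is our k8s-master:

kubectl describe node k8s-master | grep -i taints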

Now deploy the manifests and then check the install with:

kubectl apply -f pg-operator-manifests.yaml
kubectl get pods -n pg-operator

You should get a result something like this:

NAME                                       READY   STATUS    RESTARTS   AGE
cnpg-controller-manager-6fbbb565f4-jcdvd   1/1     Running   0          95s
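
The operator also registers several Custom Resource Definitions, including the Cluster resource we will use next. Verify they are present with:

kubectl get crds | grep cnpg.io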

Install Postgres

Now that the operator is installed, we can proceed to deploy our Postgres high-availability (HA) cluster using it.

To initiate the cluster deployment, we write a manifest specifying the configuration of the database we want the operator to create. Many options are available, and most default to sensible values; for clarity, the ones used here are spelled out explicitly.

Within this definition, we will also establish user credentials, stored securely within Kubernetes as Secrets. Kubernetes manages secrets as key-value pairs; credentials typically need two pairs, with keys named username and password. It's important to note that Kubernetes expects the values (not the keys) to be base64 encoded. This can be accomplished using:

echo -n mypassword | base64

The -n flag stops echo from appending a newline; without it, the newline would silently become part of the stored password. Also note that base64 is an encoding, not encryption: any base64 string can be decoded, so treat manifests containing these values as sensitive.

You can try this by decoding the output of the previous command:

echo bXlwYXNzd29yZA== | base64 -d

The reason for using base64 encoding in Kubernetes secrets is to allow arbitrary binary data, such as passwords, to be stored as plain ASCII strings. This ensures that complex passwords can be safely included in YAML files without affecting syntax.
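
Alternatively, you can let kubectl do the encoding for you: a client-side dry run of kubectl create secret prints a ready-made manifest with the values already base64 encoded. The names and literals here match the secrets we define below:

kubectl create secret generic pg-app-user -n pg \
  --type=kubernetes.io/basic-auth \
  --from-literal=username=app_user \
  --from-literal=password=secret_password \
  --dry-run=client -o yaml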

To implement this, include the resulting base64 strings in the following manifest file. The manifest consists of three YAML documents, separated by "---", which should be combined into a single file named pg_config.yml.

Superuser Credentials

The first section defines credentials for the database superuser (postgres). Replace the encoded secret_password with the base64 encoding of your own secure password:

apiVersion: v1
kind: Secret
metadata:
  name: pg-superuser
  namespace: pg
type: kubernetes.io/basic-auth
data:
  password: c2VjcmV0X3Bhc3N3b3Jk  # base64 encoded secret_password
  username: cG9zdGdyZXM=          # base64 encoded postgres (must be postgres)

Normal User Credentials

The next section sets up credentials for the application's database user (app_user):

---
apiVersion: v1
kind: Secret
metadata:
  name: pg-app-user
  namespace: pg
type: kubernetes.io/basic-auth
data:
  password: c2VjcmV0X3Bhc3N3b3Jk  # base64 encoded secret_password (use your own, different from the superuser's)
  username: YXBwX3VzZXI=          # base64 encoded app_user (must match the owner in the Cluster manifest below)

Setting up the PostgreSQL HA Cluster

The largest section configures the PostgreSQL cluster using the CloudNativePG Cluster Custom Resource Definition (CRD). Replace placeholders with your desired configuration values:

---
apiVersion: postgresql.cnpg.io/v1
kind: Cluster
metadata:
  name: my-pgsql-cluster
  namespace: pg
spec:
  description: "My example pg cluster"
  imageName: ghcr.io/cloudnative-pg/postgresql:15.1
  instances: 2  # Adjust as per your infrastructure

  superuserSecret:
    name: pg-superuser
  enableSuperuserAccess: true

  startDelay: 30
  stopDelay: 100
  primaryUpdateStrategy: unsupervised

  postgresql:
    parameters:
      max_connections: '200'
      shared_buffers: '256MB'
      effective_cache_size: '768MB'
      maintenance_work_mem: '64MB'
      checkpoint_completion_target: '0.9'
      wal_buffers: '7864kB'
      default_statistics_target: '100'
      random_page_cost: '1.1'
      effective_io_concurrency: '200'
      work_mem: '655kB'
      huge_pages: 'off'
      min_wal_size: '1GB'
      max_wal_size: '4GB'

    pg_hba:
    - host all all 10.240.0.0/16 scram-sha-256

  bootstrap:
    initdb:
      database: my_app_db
      owner: app_user
      secret:
        name: pg-app-user
      postInitApplicationSQL:
        - create schema my_app

  storage:
    size: 10Gi
    storageClass: nfs-client  # Adjust based on your storage provisioner

Notes:

  • Adjust instances under spec to reflect your cluster's needs (e.g., number of nodes).

  • Customize PostgreSQL parameters (postgresql.parameters) according to your infrastructure and performance requirements.

  • Ensure the pg namespace exists and that the secrets (pg-superuser and pg-app-user) are applied before, or together with (as in this file), the Cluster resource; see the command after these notes.

  • The storageClass should match your Persistent Volume provisioner (nfs-client in this example).
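
One prerequisite the manifest itself does not cover: the pg namespace must exist before anything can be created in it. Create it first with:

kubectl create namespace pg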

Applying the Configuration:

Apply the configuration to your Kubernetes cluster:

kubectl apply -f pg_config.yml

Verify the deployment:

kubectl get all -n pg
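
Beyond the generic resource listing, the Cluster custom resource reports overall health, and the cnpg plugin provides a more detailed status view:

kubectl get clusters.postgresql.cnpg.io -n pg
kubectl cnpg status my-pgsql-cluster -n pg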

Accessing the Database

To access the database from outside the cluster, forward the rw Service port to a node address:

kubectl port-forward svc/my-pgsql-cluster-rw -n pg --address <node IP address> 5432:5432

Connect your database client (e.g., DBeaver) using:

  • Host: <node IP address>

  • Port: 5432

  • Database: my_app_db

  • User: app_user

  • Password: <app_user password>

A successful connection indicates the PostgreSQL cluster is operational.
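
To confirm that replication is actually working, you can query pg_stat_replication on the primary; a minimal sketch assuming the psql client is installed locally and using the superuser password from the pg-superuser secret (enableSuperuserAccess is true in our manifest):

# Each healthy replica should appear as one row in the output
psql -h <node IP address> -p 5432 -U postgres -d postgres \
  -c 'select application_name, state, sync_state from pg_stat_replication;'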

Future articles will cover managing and optimizing this PostgreSQL setup in Kubernetes in more detail.

Conclusion

In this article, we successfully deployed a High-Availability PostgreSQL database within a Kubernetes cluster.

We began by installing the Krew plugin manager, which enabled us to add the CloudNativePG (cnpg) plugin to kubectl seamlessly. With this plugin, we crafted manifest files to deploy the CloudNativePG PostgreSQL operator.

Following the installation of the operator, we leveraged its capabilities to orchestrate the database cluster setup. This encompassed the creation of a database, schema, and user, culminating in a fully operational database ready for utilization.