On top of CockroachDB's built-in automation, you can use a third-party orchestration system to simplify and automate even more of your operations, from deployment to scaling to overall cluster management.
This page demonstrates a basic integration with the open-source Kubernetes orchestration system. Using either the CockroachDB Helm chart or a few configuration files, you'll quickly create a 3-node local cluster. You'll run some SQL commands against the cluster and then simulate node failure, watching how Kubernetes automatically restarts the failed node without any manual intervention. You'll then scale the cluster with a single command before shutting the cluster down, again with a single command.
To orchestrate a physically distributed cluster in production, see Orchestrated Deployments. To deploy a 30-day free CockroachDB Dedicated cluster instead of running CockroachDB yourself, see the Quickstart.
Before you begin
Before getting started, it's helpful to review some Kubernetes-specific terminology:
Feature | Description |
---|---|
minikube | This is the tool you'll use to run a Kubernetes cluster inside a VM on your local workstation. |
pod | A pod is a group of one or more Docker containers. In this tutorial, all pods will run on your local workstation, each containing one Docker container running a single CockroachDB node. You'll start with 3 pods and grow to 4. |
StatefulSet | A StatefulSet is a group of pods treated as stateful units, where each pod has distinguishable network identity and always binds back to the same persistent storage on restart. StatefulSets are considered stable as of Kubernetes version 1.9 after reaching beta in version 1.5. |
persistent volume | A persistent volume is a piece of storage mounted into a pod. The lifetime of a persistent volume is decoupled from the lifetime of the pod that's using it, ensuring that each CockroachDB node binds back to the same storage on restart. When using minikube, persistent volumes are external temporary directories that endure until they are manually deleted or until the entire Kubernetes cluster is deleted. |
persistent volume claim | When pods are created (one per CockroachDB node), each pod will request a persistent volume claim to “claim” durable storage for its node. |
Step 1. Start Kubernetes
Follow Kubernetes' documentation to install minikube, the tool used to run Kubernetes locally, for your OS. This includes installing a hypervisor and kubectl, the command-line tool used to manage Kubernetes from your local workstation.
Note: Make sure you install minikube version 0.21.0 or later. Earlier versions do not include a Kubernetes server that supports the maxUnavailable field and PodDisruptionBudget resource type used in the CockroachDB StatefulSet configuration.
Start a local Kubernetes cluster:
$ minikube start
Step 2. Start CockroachDB
To start your CockroachDB cluster, you can either use our StatefulSet configuration and related files directly, or you can use the Helm package manager for Kubernetes to simplify the process.
To use the configuration files directly:
From your local workstation, use our cockroachdb-statefulset.yaml file to create the StatefulSet that automatically creates 3 pods, each with a CockroachDB node running inside it:
$ kubectl create -f https://raw.githubusercontent.com/cockroachdb/cockroach/master/cloud/kubernetes/cockroachdb-statefulset.yaml
service/cockroachdb-public created
service/cockroachdb created
poddisruptionbudget.policy/cockroachdb-budget created
statefulset.apps/cockroachdb created
Confirm that three pods are Running successfully. Note that they will not be considered Ready until after the cluster has been initialized:
$ kubectl get pods
NAME            READY   STATUS    RESTARTS   AGE
cockroachdb-0   0/1     Running   0          2m
cockroachdb-1   0/1     Running   0          2m
cockroachdb-2   0/1     Running   0          2m
Confirm that the persistent volumes and corresponding claims were created successfully for all three pods:
$ kubectl get pv
NAME                                       CAPACITY   ACCESSMODES   RECLAIMPOLICY   STATUS   CLAIM                           REASON   AGE
pvc-52f51ecf-8bd5-11e6-a4f4-42010a800002   1Gi        RWO           Delete          Bound    default/datadir-cockroachdb-0            26s
pvc-52fd3a39-8bd5-11e6-a4f4-42010a800002   1Gi        RWO           Delete          Bound    default/datadir-cockroachdb-1            27s
pvc-5315efda-8bd5-11e6-a4f4-42010a800002   1Gi        RWO           Delete          Bound    default/datadir-cockroachdb-2            27s
Use our cluster-init.yaml file to perform a one-time initialization that joins the CockroachDB nodes into a single cluster:
$ kubectl create \
-f https://raw.githubusercontent.com/cockroachdb/cockroach/master/cloud/kubernetes/cluster-init.yaml
job.batch/cluster-init created
Confirm that cluster initialization has completed successfully. The job should be considered successful and the Kubernetes pods should soon be considered Ready:
$ kubectl get job cluster-init
NAME           COMPLETIONS   DURATION   AGE
cluster-init   1/1           7s         27s
$ kubectl get pods
NAME                 READY   STATUS      RESTARTS   AGE
cluster-init-cqf8l   0/1     Completed   0          56s
cockroachdb-0        1/1     Running     0          7m51s
cockroachdb-1        1/1     Running     0          7m51s
cockroachdb-2        1/1     Running     0          7m51s
The StatefulSet configuration sets all CockroachDB nodes to log to stderr, so if you ever need access to a pod/node's logs to troubleshoot, use kubectl logs <podname> rather than checking the log on the persistent volume.
To use Helm instead:
Install the Helm client (version 3.0 or higher) and add the cockroachdb chart repository:
$ helm repo add cockroachdb https://charts.cockroachdb.com/
"cockroachdb" has been added to your repositories
Update your Helm chart repositories to ensure that you're using the latest CockroachDB chart:
$ helm repo update
Install the CockroachDB Helm chart. Provide a "release" name to identify and track this particular deployment of the chart.
Note: This tutorial uses my-release as the release name. If you use a different value, be sure to adjust the release name in subsequent commands.
$ helm install my-release cockroachdb/cockroachdb
Behind the scenes, this command uses our cockroachdb-statefulset.yaml file to create the StatefulSet that automatically creates 3 pods, each with a CockroachDB node running inside it, where each pod has distinguishable network identity and always binds back to the same persistent storage on restart.
Confirm that CockroachDB cluster initialization has completed successfully, with the pods for CockroachDB showing 1/1 under READY and the pod for initialization showing Completed under STATUS:
$ kubectl get pods
NAME                                READY   STATUS      RESTARTS   AGE
my-release-cockroachdb-0            1/1     Running     0          8m
my-release-cockroachdb-1            1/1     Running     0          8m
my-release-cockroachdb-2            1/1     Running     0          8m
my-release-cockroachdb-init-hxzsc   0/1     Completed   0          1h
Confirm that the persistent volumes and corresponding claims were created successfully for all three pods:
$ kubectl get pv
NAME                                       CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS   CLAIM                                      STORAGECLASS   REASON   AGE
pvc-71019b3a-fc67-11e8-a606-080027ba45e5   100Gi      RWO            Delete           Bound    default/datadir-my-release-cockroachdb-0   standard                11m
pvc-7108e172-fc67-11e8-a606-080027ba45e5   100Gi      RWO            Delete           Bound    default/datadir-my-release-cockroachdb-1   standard                11m
pvc-710dcb66-fc67-11e8-a606-080027ba45e5   100Gi      RWO            Delete           Bound    default/datadir-my-release-cockroachdb-2   standard                11m
The StatefulSet configuration sets all CockroachDB nodes to log to stderr, so if you ever need access to a pod/node's logs to troubleshoot, use kubectl logs <podname> rather than checking the log on the persistent volume.
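The readiness and volume checks in this step can also be scripted. The following is a minimal sketch, not part of the CockroachDB tooling: the function names ready_pods and bound_volumes are hypothetical, and the column positions assume the default table layout that kubectl get prints in the examples above.

```shell
# Hypothetical helpers; they assume the default column layout of `kubectl get`.

# Print how many pods whose name starts with PREFIX report READY as 1/1.
ready_pods() {
  prefix="$1"
  kubectl get pods --no-headers |
    awk -v p="$prefix" 'index($1, p) == 1 && $2 == "1/1" { n++ } END { print n + 0 }'
}

# Print how many persistent volumes are Bound to a claim whose name contains PREFIX.
# With --no-headers, STATUS is the fifth column and CLAIM is the sixth.
bound_volumes() {
  prefix="$1"
  kubectl get pv --no-headers |
    awk -v p="$prefix" '$5 == "Bound" && index($6, p) > 0 { n++ } END { print n + 0 }'
}
```

For the configuration-file setup you might call ready_pods cockroachdb- and bound_volumes datadir-cockroachdb-; for the Helm setup, use the my-release-cockroachdb- and datadir-my-release-cockroachdb- prefixes. Once the cluster is initialized, both counts should match the number of CockroachDB nodes.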
Step 3. Use the built-in SQL client
Launch a temporary interactive pod and start the built-in SQL client inside it.
If you used the configuration files:
$ kubectl run cockroachdb -it \
--image=cockroachdb/cockroach:v22.2.8 \
--rm \
--restart=Never \
-- sql \
--insecure \
--host=cockroachdb-public
If you used Helm:
$ kubectl run cockroachdb -it \
--image=cockroachdb/cockroach:v22.2.8 \
--rm \
--restart=Never \
-- sql \
--insecure \
--host=my-release-cockroachdb-public
Run some basic CockroachDB SQL statements:
> CREATE DATABASE bank;
> CREATE TABLE bank.accounts ( id UUID PRIMARY KEY DEFAULT gen_random_uuid(), balance DECIMAL );
> INSERT INTO bank.accounts (balance) VALUES (1000.50), (20000), (380), (500), (55000);
> SELECT * FROM bank.accounts;
                   id                  | balance
+--------------------------------------+---------+
  6f123370-c48c-41ff-b384-2c185590af2b |     380
  990c9148-1ea0-4861-9da7-fd0e65b0a7da | 1000.50
  ac31c671-40bf-4a7b-8bee-452cff8a4026 |     500
  d58afd93-5be9-42ba-b2e2-dc00dcedf409 |   20000
  e6d8f696-87f5-4d3c-a377-8e152fdc27f7 |   55000
(5 rows)
Exit the SQL shell and delete the temporary pod:
> \q
Step 4. Access the DB Console
To access the cluster's DB Console:
In a new terminal window, port-forward from your local machine to the cockroachdb-public service.
If you used the configuration files:
$ kubectl port-forward service/cockroachdb-public 8080
If you used Helm:
$ kubectl port-forward service/my-release-cockroachdb-public 8080
Forwarding from 127.0.0.1:8080 -> 8080
Note: The port-forward command must be run on the same machine as the web browser in which you want to view the DB Console. If you have been running these commands from a cloud instance or other non-local shell, you will not be able to view the UI without configuring kubectl locally and running the above port-forward command on your local machine.
Go to http://localhost:8080.
In the UI, verify that the cluster is running as expected:
- View the Node List to ensure that all nodes successfully joined the cluster.
- Click the Databases tab on the left to verify that bank is listed.
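Before opening the browser, you can check from the terminal that the forwarded port is answering. This is a minimal sketch under a few assumptions: the port-forward command from this step is still running, the cluster is insecure (so plain HTTP is used), and the request goes to the /health path served on the node's HTTP port. The function name console_reachable is hypothetical.

```shell
# Hypothetical helper: succeed only if the forwarded HTTP port answers a health check.
# The default port matches the port-forward command in this step.
console_reachable() {
  port="${1:-8080}"
  curl --fail --silent --show-error --output /dev/null \
    "http://localhost:${port}/health"
}
```

A call such as console_reachable && echo "DB Console port is up" prints the message only when the request succeeds. A failure usually means the port-forward is not running in another terminal.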
Step 5. Simulate node failure
Based on the replicas: 3 line in the StatefulSet configuration, Kubernetes ensures that three pods/nodes are running at all times. When a pod/node fails, Kubernetes automatically creates another pod/node with the same network identity and persistent storage.
To see this in action:
Terminate one of the CockroachDB nodes.
If you used the configuration files:
$ kubectl delete pod cockroachdb-2
pod "cockroachdb-2" deleted
If you used Helm:
$ kubectl delete pod my-release-cockroachdb-2
pod "my-release-cockroachdb-2" deleted
In the DB Console, the Cluster Overview will soon show one node as Suspect. As Kubernetes auto-restarts the node, watch how the node once again becomes healthy.
Back in the terminal, verify that the pod was automatically restarted.
If you used the configuration files:
$ kubectl get pod cockroachdb-2
NAME            READY   STATUS    RESTARTS   AGE
cockroachdb-2   1/1     Running   0          12s
If you used Helm:
$ kubectl get pod my-release-cockroachdb-2
NAME                       READY   STATUS    RESTARTS   AGE
my-release-cockroachdb-2   1/1     Running   0          44s
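Rather than rerunning kubectl get pod by hand until the replacement pod is ready, you can poll for it. The following is a minimal sketch, not an official tool: wait_until_ready is a hypothetical name, the retry count and delay are arbitrary defaults, and the check assumes READY is the second column of the default kubectl get pod output.

```shell
# Hypothetical helper: poll until POD reports READY 1/1, or give up.
# Usage: wait_until_ready POD [TRIES] [DELAY_SECONDS]
wait_until_ready() {
  pod="$1"
  tries="${2:-30}"
  delay="${3:-2}"
  i=0
  while [ "$i" -lt "$tries" ]; do
    # The pod may briefly not exist while Kubernetes recreates it, so ignore errors.
    ready=$(kubectl get pod "$pod" --no-headers 2>/dev/null | awk '{ print $2 }')
    if [ "$ready" = "1/1" ]; then
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  return 1
}
```

For example, wait_until_ready cockroachdb-2 && echo "pod is back" prints the message once the restarted pod passes its readiness check. With Helm, pass my-release-cockroachdb-2 instead.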
Step 6. Add nodes
Use the kubectl scale command to add a pod for another CockroachDB node.
If you used the configuration files:
$ kubectl scale statefulset cockroachdb --replicas=4
statefulset "cockroachdb" scaled
If you used Helm:
$ kubectl scale statefulset my-release-cockroachdb --replicas=4
statefulset "my-release-cockroachdb" scaled
Verify that the pod for a fourth node, cockroachdb-3, was added successfully:
$ kubectl get pods
NAME                      READY   STATUS    RESTARTS   AGE
cockroachdb-0             1/1     Running   0          28m
cockroachdb-1             1/1     Running   0          27m
cockroachdb-2             1/1     Running   0          10m
cockroachdb-3             1/1     Running   0          5s
example-545f866f5-2gsrs   1/1     Running   0          25m
With Helm, the pod names carry the release name as a prefix:
NAME                       READY   STATUS    RESTARTS   AGE
my-release-cockroachdb-0   1/1     Running   0          28m
my-release-cockroachdb-1   1/1     Running   0          27m
my-release-cockroachdb-2   1/1     Running   0          10m
my-release-cockroachdb-3   1/1     Running   0          5s
example-545f866f5-2gsrs    1/1     Running   0          25m
Step 7. Remove nodes
To safely remove a node from your cluster, you must first decommission the node and only then adjust the spec.replicas value of your StatefulSet configuration to permanently remove it. This sequence is important because the decommissioning process lets a node finish in-flight requests, rejects any new requests, and transfers all range replicas and range leases off the node.
If you remove nodes without first telling CockroachDB to decommission them, you may cause data or even cluster unavailability. For more details about how this works and what to consider before removing nodes, see Prepare for graceful shutdown.
Launch a temporary interactive pod and use the cockroach node status command to get the internal IDs of nodes.
If you used the configuration files:
$ kubectl run cockroachdb -it \
--image=cockroachdb/cockroach:v22.2.8 \
--rm \
--restart=Never \
-- node status \
--insecure \
--host=cockroachdb-public
  id |                          address                          |  build  |            started_at            |            updated_at            | is_available | is_live
+----+-----------------------------------------------------------+---------+----------------------------------+----------------------------------+--------------+---------+
   1 | cockroachdb-0.cockroachdb.default.svc.cluster.local:26257 | v22.2.8 | 2018-11-29 16:04:36.486082+00:00 | 2018-11-29 18:24:24.587454+00:00 | true         | true
   2 | cockroachdb-2.cockroachdb.default.svc.cluster.local:26257 | v22.2.8 | 2018-11-29 16:55:03.880406+00:00 | 2018-11-29 18:24:23.469302+00:00 | true         | true
   3 | cockroachdb-1.cockroachdb.default.svc.cluster.local:26257 | v22.2.8 | 2018-11-29 16:04:41.383588+00:00 | 2018-11-29 18:24:25.030175+00:00 | true         | true
   4 | cockroachdb-3.cockroachdb.default.svc.cluster.local:26257 | v22.2.8 | 2018-11-29 17:31:19.990784+00:00 | 2018-11-29 18:24:26.041686+00:00 | true         | true
(4 rows)
If you used Helm:
$ kubectl run cockroachdb -it \
--image=cockroachdb/cockroach:v22.2.8 \
--rm \
--restart=Never \
-- node status \
--insecure \
--host=my-release-cockroachdb-public
  id |                                     address                                     |  build  |            started_at            |            updated_at            | is_available | is_live
+----+---------------------------------------------------------------------------------+---------+----------------------------------+----------------------------------+--------------+---------+
   1 | my-release-cockroachdb-0.my-release-cockroachdb.default.svc.cluster.local:26257 | v22.2.8 | 2018-11-29 16:04:36.486082+00:00 | 2018-11-29 18:24:24.587454+00:00 | true         | true
   2 | my-release-cockroachdb-2.my-release-cockroachdb.default.svc.cluster.local:26257 | v22.2.8 | 2018-11-29 16:55:03.880406+00:00 | 2018-11-29 18:24:23.469302+00:00 | true         | true
   3 | my-release-cockroachdb-1.my-release-cockroachdb.default.svc.cluster.local:26257 | v22.2.8 | 2018-11-29 16:04:41.383588+00:00 | 2018-11-29 18:24:25.030175+00:00 | true         | true
   4 | my-release-cockroachdb-3.my-release-cockroachdb.default.svc.cluster.local:26257 | v22.2.8 | 2018-11-29 17:31:19.990784+00:00 | 2018-11-29 18:24:26.041686+00:00 | true         | true
(4 rows)
Note the ID of the node with the highest number in its address (in this case, the address including cockroachdb-3) and use the cockroach node decommission command to decommission it.
Note: It's important to decommission the node with the highest number in its address because, when you reduce the replica count, Kubernetes will remove the pod for that node.
If you used the configuration files:
$ kubectl run cockroachdb -it \
--image=cockroachdb/cockroach:v22.2.8 \
--rm \
--restart=Never \
-- node decommission <node ID> \
--insecure \
--host=cockroachdb-public
If you used Helm:
$ kubectl run cockroachdb -it \
--image=cockroachdb/cockroach:v22.2.8 \
--rm \
--restart=Never \
-- node decommission <node ID> \
--insecure \
--host=my-release-cockroachdb-public
You'll then see the decommissioning status print to stderr as it changes:
  id | is_live | replicas | is_decommissioning |   membership    | is_draining
-----+---------+----------+--------------------+-----------------+--------------
   4 |  true   |       73 |        true        | decommissioning |    false
Once the node has been fully decommissioned, you'll see a confirmation:
  id | is_live | replicas | is_decommissioning |   membership    | is_draining
-----+---------+----------+--------------------+-----------------+--------------
   4 |  true   |        0 |        true        | decommissioning |    false
(1 row)
No more data reported on target nodes. Please verify cluster health before removing the nodes.
Once the node has been decommissioned, remove a pod from your StatefulSet.
If you used the configuration files:
$ kubectl scale statefulset cockroachdb --replicas=3
statefulset "cockroachdb" scaled
If you used Helm:
$ helm upgrade \
my-release \
cockroachdb/cockroachdb \
--set statefulset.replicas=3 \
--reuse-values
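Picking the node to decommission means matching the highest pod ordinal to its CockroachDB node ID, and as the node status output above shows, the two numbers do not always line up (node 2 runs in pod cockroachdb-2, but node 3 runs in pod cockroachdb-1). The following minimal sketch automates that lookup. It assumes you have saved the cockroach node status table to a file or are piping it in, and that it has the pipe-separated layout shown above. The function name highest_ordinal_node_id is hypothetical.

```shell
# Hypothetical helper: read `cockroach node status` table output on stdin and
# print the ID of the node whose pod name has the highest ordinal suffix.
highest_ordinal_node_id() {
  awk -F'|' '
    $1 ~ /^ *[0-9]+ *$/ {          # data rows only: first column is a numeric ID
      host = $2
      sub(/\..*/, "", host)        # keep the pod name before the first dot
      gsub(/ /, "", host)
      ord = host
      sub(/.*-/, "", ord)          # keep the digits after the last hyphen
      if (ord + 0 >= max + 0) { max = ord; id = $1 }
    }
    END { gsub(/ /, "", id); print id }
  '
}
```

For example, highest_ordinal_node_id < node-status.txt (where node-status.txt is a file you saved the table to) prints the ID to pass to cockroach node decommission. Check the result against the table yourself before decommissioning, since output captured from an interactive pod may contain extra characters that this sketch does not handle.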
Step 8. Stop the cluster
If you plan to restart the cluster, use the minikube stop command. This shuts down the minikube virtual machine but preserves all the resources you created:
$ minikube stop
Stopping local Kubernetes cluster...
Machine stopped.
You can restore the cluster to its previous state with minikube start.
If you do not plan to restart the cluster, use the minikube delete command. This shuts down and deletes the minikube virtual machine and all the resources you created, including persistent volumes:
$ minikube delete
Deleting local Kubernetes cluster...
Machine deleted.
Tip: To retain logs, copy them from each pod's stderr before deleting the cluster and all its resources. To access a pod's standard error stream, run kubectl logs <podname>.
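Copying the logs can be done in one pass before you run minikube delete. This is a minimal sketch: save_pod_logs is a hypothetical name, and the output directory and pod names are whatever you pass in.

```shell
# Hypothetical helper: write each pod's log stream to OUTDIR/<pod>.log.
# Usage: save_pod_logs OUTDIR POD [POD...]
save_pod_logs() {
  outdir="$1"
  shift
  mkdir -p "$outdir" || return 1
  for pod in "$@"; do
    # Keep going if one pod fails, but report it on stderr.
    kubectl logs "$pod" > "$outdir/$pod.log" ||
      echo "warning: could not fetch logs for $pod" >&2
  done
}
```

For the configuration-file setup, a call such as save_pod_logs ./crdb-logs cockroachdb-0 cockroachdb-1 cockroachdb-2 leaves one file per pod in ./crdb-logs. With Helm, pass the my-release-cockroachdb-* pod names instead.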
See also
Explore other core CockroachDB benefits and features:
- Replication & Rebalancing
- Fault Tolerance & Recovery
- Low Latency Multi-Region Deployment
- Serializable Transactions
- Cross-Cloud Migration
- Orchestration
- JSON Support
You might also want to learn how to orchestrate a production deployment of CockroachDB with Kubernetes.