At Banzai Cloud we provision different applications and frameworks to Pipeline, the PaaS we built on Kubernetes. We practice what we preach: our PaaS’ control plane also runs on Kubernetes and needs a data storage layer. We therefore had to explore two different use cases, and figure out how to deploy and run a distributed, scalable, fully SQL-compliant database that covers both our clients’ needs and our own internal ones. Additionally, most of the legacy and Java Enterprise Edition applications we provision to the Pipeline platform require a database backend. While we (currently) support only two wire protocols, this post focuses on how we deploy, run, operate, autoscale and monitor TiDB, a MySQL wire protocol-based database.
TiDB
TiDB (pronounced /ˈtaɪdiːbiː/, tai-D-B; etymology: titanium) is a Hybrid Transactional/Analytical Processing (HTAP) database. Inspired by the design of Google F1 and Google Spanner, TiDB features infinite horizontal scalability, strong consistency, and high availability. The goal of TiDB is to serve as a one-stop solution for online transactions and analysis.
tl;dr:
We deploy, run, scale and monitor TiDB on Kubernetes. We love TiDB’s architecture and the separation of concerns inherent in its building blocks, which perfectly suits k8s.
$ helm repo add banzaicloud-incubator http://kubernetes-charts-incubator.banzaicloud.com
$ helm repo update
$ helm install banzaicloud-incubator/tidb
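Once the install completes you can watch the PD, TiKV and TiDB pods come up (pod names depend on the release name Helm generates):
$ kubectl get pods -w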
Introduction
This chart bootstraps a TiDB deployment on a Kubernetes cluster using the Helm package manager.
Prerequisites
- Kubernetes 1.7+ with Beta APIs enabled
- PV provisioner support in the underlying infrastructure
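A quick way to check both prerequisites against your cluster (assuming kubectl is already configured for it):
$ kubectl version --short
$ kubectl get storageclass    # a default StorageClass indicates dynamic PV provisioning is available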
Installing the Chart
To install the chart with the release name my-release:
$ helm install --name my-release banzaicloud-incubator/tidb
This deploys TiDB on the Kubernetes cluster with its default configuration. The configuration section lists the parameters that can be configured during installation.
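To connect to the freshly deployed database, port-forward one of the TiDB pods and use any MySQL client. The pod name below is a placeholder (look it up with kubectl get pods); note that TiDB’s root user has no password by default:
$ kubectl port-forward <my-release-tidb-pod> 4000:4000
$ mysql -h 127.0.0.1 -P 4000 -u root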
Uninstalling the Chart
To uninstall/delete the my-release deployment:
$ helm delete my-release
The above command removes all Kubernetes components associated with the chart and deletes the release.
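With Helm 2 the release name stays in the release history after deletion; to remove it completely and free up the name, purge the release instead:
$ helm delete --purge my-release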
Configuration
The following table lists the configurable parameters of the TiDB chart and their default values.
| Parameter | Description | Default |
| --- | --- | --- |
| `pd.name` | Placement Driver container name | `pd` |
| `pd.image` | Placement Driver container image | `pingcap/pd:{VERSION}` |
| `pd.replicaCount` | Replica count | `3` |
| `pd.service.type` | Kubernetes service type to expose | `ClusterIP` |
| `pd.service.nodePort` | Port to bind to for NodePort service type | `nil` |
| `pd.service.annotations` | Additional annotations to add to service | `nil` |
| `pd.service.PeerPort` | PD peer port to bind to | `2380` |
| `pd.service.ClientPort` | PD client port to bind to | `2379` |
| `pd.imagePullPolicy` | Image pull policy | `IfNotPresent` |
| `pd.resources` | CPU/Memory resource requests/limits | Memory: `256Mi`, CPU: `250m` |
| `tidb.name` | TiDB container name | `db` |
| `tidb.image` | TiDB container image | `pingcap/tidb:{VERSION}` |
| `tidb.replicaCount` | Replica count | `2` |
| `tidb.service.type` | Kubernetes service type to expose | `ClusterIP` |
| `tidb.service.nodePort` | Port to bind to for NodePort service type | `nil` |
| `tidb.service.annotations` | Additional annotations to add to service | `nil` |
| `tidb.service.mysql` | MySQL protocol port to bind to | `4000` |
| `tidb.service.status` | Status port to bind to | `10080` |
| `tidb.imagePullPolicy` | Image pull policy | `IfNotPresent` |
| `tidb.persistence.enabled` | Use a PVC to persist data | `false` |
| `tidb.persistence.existingClaim` | Use an existing PVC | `nil` |
| `tidb.persistence.storageClass` | Storage class of backing PVC | `nil` (uses alpha storage class annotation) |
| `tidb.persistence.accessMode` | Use volume as ReadOnly or ReadWrite | `ReadWriteOnce` |
| `tidb.persistence.size` | Size of data volume | `8Gi` |
| `tidb.resources` | CPU/Memory resource requests/limits | Memory: `128Mi`, CPU: `250m` |
| `tikv.name` | TiKV container name | `kv` |
| `tikv.image` | TiKV container image | `pingcap/tikv:{VERSION}` |
| `tikv.replicaCount` | Replica count | `3` |
| `tikv.service.type` | Kubernetes service type to expose | `ClusterIP` |
| `tikv.service.nodePort` | Port to bind to for NodePort service type | `nil` |
| `tikv.service.annotations` | Additional annotations to add to service | `nil` |
| `tikv.service.ClientPort` | TiKV client port to bind to | `20160` |
| `tikv.imagePullPolicy` | Image pull policy | `IfNotPresent` |
| `tikv.persistence.enabled` | Use a PVC to persist data | `false` |
| `tikv.persistence.existingClaim` | Use an existing PVC | `nil` |
| `tikv.persistence.storageClass` | Storage class of backing PVC | `nil` (uses alpha storage class annotation) |
| `tikv.persistence.accessMode` | Use volume as ReadOnly or ReadWrite | `ReadWriteOnce` |
| `tikv.persistence.size` | Size of data volume | `8Gi` |
| `tikv.resources` | CPU/Memory resource requests/limits | Memory: `128Mi`, CPU: `250m` |
Specify each parameter using the --set key=value[,key=value] argument to helm install.
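For example, to scale out the SQL layer and enable persistent storage for TiKV (parameter names as listed in the table above):
$ helm install --name my-release \
    --set tidb.replicaCount=3,tikv.persistence.enabled=true,tikv.persistence.size=20Gi \
    banzaicloud-incubator/tidb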
Alternatively, a .yaml file that specifies the values for these parameters may be provided during the chart’s installation. For example:
$ helm install --name my-release -f values.yaml banzaicloud-incubator/tidb
Tip: You can use the default values.yaml as a starting point.
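A minimal override file might look like the sketch below; the keys mirror the parameters in the table above, but double-check the exact nesting against the chart’s default values.yaml:
$ cat > my-values.yaml <<EOF
tidb:
  replicaCount: 3
  service:
    type: NodePort
tikv:
  persistence:
    enabled: true
    size: 20Gi
EOF
$ helm install --name my-release -f my-values.yaml banzaicloud-incubator/tidb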
Persistence
The chart mounts a Persistent Volume at a given location. By default, the volume is created using dynamic volume provisioning. An existing PersistentVolumeClaim can be used as follows:
Existing PersistentVolumeClaims
- Create the PersistentVolume
- Create the PersistentVolumeClaim
- Install the chart
$ helm install --set persistence.existingClaim=PVC_NAME banzaicloud-incubator/tidb
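As a sketch of the steps above: with dynamic provisioning in place you can usually skip the manual PersistentVolume and only create the claim. The claim name tidb-data is illustrative, and per the parameter table the claim is configured per component (e.g. tikv.persistence.existingClaim):
$ cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: tidb-data
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 8Gi
EOF
$ helm install --set tikv.persistence.enabled=true,tikv.persistence.existingClaim=tidb-data banzaicloud-incubator/tidb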
What’s next
This post highlights how easy it is to use TiDB on Kubernetes through Helm. Obviously, on Pipeline we do things differently: both the cluster and the deployment are provisioned through the REST API.
Remark 1: Currently we use our own Helm charts, but we’ve noticed that PingCAP is already working on a TiDB operator - once it’s released or the source code is made available, we’ll reconsider this approach. We love Kubernetes operators and have written and use quite a few, so we look forward to getting our hands on a new one.
Remark 2: In the event of a PD node/StatefulSet failure, Pipeline auto-recovers; however, due to an issue with PD’s internal etcd re-joining the cluster, that recovery may not always be successful.