Databases on Kubernetes -- Support your cloud native strategy with open source Kubernetes operators
Today, developers are used to running applications in the cloud. They are accustomed to using software containers and building applications using microservices components connected by APIs. Gartner estimates that more than 90 percent of global organizations will be running containerized applications in production by 2027, up from less than 40 percent in 2021. Similarly, the company has predicted that 70 percent of organizations will complement continuous delivery for their applications with continuous infrastructure automation to improve business agility by 2025.
From an infrastructure perspective, this means Kubernetes. However, Kubernetes was initially built to manage stateless application components, not the rest of the infrastructure that makes up IT systems. For the other elements involved, such as databases, containerization had to be made to fit.
From this, you might think running databases in containers would be sub-optimal. However, this is not the case. Kubernetes can help manage these systems more efficiently with automation. Rather than being 'second-class citizens' within your overall infrastructure, databases running on Kubernetes open up new opportunities to improve your performance overall.
Automation and database management
To start with, let’s look at the challenges that people face around database management. Beyond the initial deployment, each database instance will need to be looked after over time. This starts with ensuring availability for the database -- if it supports a critical application, then implementing a cluster for high availability might be needed.
Alongside this, you will have to consider your data backup process, so that you can have a copy if you ever need to bring the database instance back online. You may also need data replication for sending data offsite to another location for redundancy and resiliency.
For more complex applications that process large volumes of data, one instance or server might not be enough to handle the workload. Instead, you may have to spread the work across multiple instances using sharding. Lastly, you will need information from the database server itself on how well it performs, so you will need logging, metrics and monitoring data for observability.
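To make the sharding idea concrete, here is a minimal sketch of hash-based routing, in which each record key is mapped to one of several database instances. The instance names and the `shard_for` and `instance_for` helpers are hypothetical and used for illustration only; real databases and operators implement sharding in their own ways.

```python
import hashlib

# Hypothetical instance names, for illustration only.
SHARDS = ["db-0", "db-1", "db-2"]


def shard_for(key: str, shard_count: int) -> int:
    """Map a record key to a shard index using a stable hash."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    # Use the first 8 bytes of the digest as an unsigned integer.
    return int.from_bytes(digest[:8], "big") % shard_count


def instance_for(key: str) -> str:
    """Return the name of the instance that should hold this key."""
    return SHARDS[shard_for(key, len(SHARDS))]


if __name__ == "__main__":
    for key in ("customer-1", "customer-2", "customer-3"):
        print(key, "->", instance_for(key))
```

The same key always lands on the same instance, which is what lets the application find its data again. With simple modulo routing like this, changing the number of shards remaps most keys, which is one reason sharding needs careful planning.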
Each of these tasks is important to keep your application healthy, from providing protection against downtime through to showing you how well you are running. They all require some specialist knowledge to implement if you are going to do them yourself.
Kubernetes can help by making it easier to automate all these tasks, so that maintaining database installations over time takes less manual work. Instead of needing a dedicated DBA to keep a close eye on instances all day, you can automate these steps and remove some of that overhead. This approach relies on a Kubernetes operator that links your database with Kubernetes and passes information and requests back and forth between the two systems.
Kubernetes operators provide the management interface between Kubernetes and the databases that you have. They are therefore critical to get right. However, you don’t have to start from scratch -- there are multiple operators available for each open source database that you might have in place, such as MySQL or PostgreSQL, as well as for other databases like MongoDB. These operators should fit into your existing Infrastructure-as-Code tools and CI/CD pipelines, and automate both your day-one deployment tasks and your day-two management operations.
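Operators generally work by comparing the state you have declared with the state that is actually running, and then acting to close the gap. The sketch below illustrates that reconciliation idea in plain Python. It is a simplified model under stated assumptions, not the API of any real operator: the `ClusterState` fields and the action strings are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ClusterState:
    """Hypothetical, simplified view of a database cluster."""
    replicas: int
    backups_enabled: bool


def reconcile(desired: ClusterState, observed: ClusterState) -> List[str]:
    """Return the actions needed to move the observed state to the desired one."""
    actions: List[str] = []

    # Scale the cluster up or down to match the declared replica count.
    if observed.replicas < desired.replicas:
        actions.append(f"add {desired.replicas - observed.replicas} replica(s)")
    elif observed.replicas > desired.replicas:
        actions.append(f"remove {observed.replicas - desired.replicas} replica(s)")

    # Bring the backup configuration in line with what was declared.
    if desired.backups_enabled and not observed.backups_enabled:
        actions.append("enable scheduled backups")
    elif observed.backups_enabled and not desired.backups_enabled:
        actions.append("disable scheduled backups")

    return actions


if __name__ == "__main__":
    desired = ClusterState(replicas=3, backups_enabled=True)
    observed = ClusterState(replicas=1, backups_enabled=False)
    for action in reconcile(desired, observed):
        print(action)
```

A real operator runs this kind of comparison continuously and carries out the resulting actions itself, which is how tasks such as scaling, failover and backups can happen without someone doing them by hand.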
The essential elements to bear in mind when you pick your operator are as follows:
- Can it support any platform? Rather than being tied to any particular cloud service or Kubernetes provider service, can you use this operator to run your database containers where you want them? This helps you avoid usage restrictions for any cloud or on-prem infrastructure.
- Is the operator supported, and how many people are involved? Some operators were developed to meet the needs of a single company and then released, while others are developed and supported for a wider community to take advantage of.
- Is the operator itself open source? Using an open source Kubernetes operator means that the community can get involved and provide support if you need help.
By running databases on Kubernetes, you can support your cloud native strategy and manage your database workloads on any supported Kubernetes cluster running in private, public, hybrid, or multi-cloud environments.
Databases on Kubernetes are commonly considered complex, but with the right Kubernetes operator, you can simplify database management, enable easier migrations to Kubernetes, and respond to demand spikes in a flexible manner. More importantly, you can reduce your costs compared with public cloud services that carry additional overheads and profit margins for the provider, while also keeping full control over your database deployment and infrastructure strategy for the future. Your choice of Kubernetes operator will affect how successful your approach to continuous application deployment and continuous infrastructure automation will be.
Sergey Pronin is Group Product Manager at Percona, an open source database software, support and services company. He has spent the last fifteen years working around databases, site reliability engineering and software development best practices at vendor and end-user organizations. Prior to Percona, he led software engineering at ESX Capital and Alpari Group.