At Banzai Cloud we’re always looking for new and innovative technologies that support our users in their transition to microservices deployed on Kubernetes, using Pipeline. In recent months we’ve partnered with CoreOS and Red Hat to work on Kubernetes operators. That project was open sourced today and is now available on GitHub. If you read through the rest of this blog, you’ll learn what an operator is and how to use the Operator SDK to develop one, through a concrete example we developed here at Banzai Cloud. Additionally, there are a few operators we’ve made available on our GitHub, each of which is built on the new Operator SDK.
tl;dr: 🔗︎
- a new Kubernetes operator framework has been released
- we were actively involved in the creation of the new SDK and, as a result, we have released a few other operators
- the operator discussed herein provides seamless, out of the box monitoring of all JVM-based applications, without the application itself having to expose a scrape interface
Deploying and operating complex applications that consist of multiple interdependent components/services on Kubernetes is no trivial task using only the constructs provided by Kubernetes. As an example, let’s consider an application that requires a minimum number of instances; that much can be solved with a Kubernetes Deployment. However, if these instances have to be reconfigured or reinitialized at runtime whenever the number of instances changes (upscale/downscale), then we need to react quickly and perform the necessary reconfiguration steps. Trying to solve this problem with scripts built around the Kubernetes command line tools grows cumbersome, especially as we get closer to real-life use cases in which we also have to deal with resiliency, log collection, monitoring, etc.
CoreOS introduced operators to automate the handling of these kinds of complex operational scenarios. In a nutshell, operators extend the Kubernetes API through the custom resource mechanism (formerly third party resources) and give us fine-grained access to, and control over, what’s going on inside the cluster.
Before we go further, a few words about Kubernetes custom resources are necessary to better understand what an operator is. A resource in Kubernetes is an endpoint in the Kubernetes API that stores Kubernetes objects (e.g. Pod objects) of a certain kind (e.g. Pod). A custom resource is essentially a resource that can be added to Kubernetes to extend the basic Kubernetes API. Once a custom resource is installed, users can manage objects of that kind with kubectl, the same way they do built-in Kubernetes resources like pods. Accordingly, there must be a controller that carries out the operations induced via kubectl. Custom controllers are controllers for custom resources. To summarize, an operator is a custom controller that works with custom resources of a certain kind.
CoreOS also developed an SDK for writing operators. The SDK eases the implementation of an operator, since it provides high level APIs for writing the operational logic, and it generates a skeleton that spares developers from writing boilerplate code.
Let’s have a look at how we can use the Operator SDK.
First, we need to install the Operator SDK onto our development machine. If you’re ready for an adventure in the latest and greatest, install the CLI from the master branch.
Once the CLI is installed the development flow should look as follows:
- Create a new operator project
- Define the Kubernetes resources to watch
- Define the operator logic in a designated handler
- Update and generate code for custom resources
- Build and generate the operator deployment manifests
- Deploy the operator
- Create custom resources
Create a new operator project 🔗︎
Run the CLI to create a new operator project.
$ cd $GOPATH/src/github.com/<your-github-repo>/
$ operator-sdk new <operator-project-name> --api-version=<your-api-group>/<version> --kind=<custom-resource-kind>
$ cd <operator-project-name>
- operator-project-name - the CLI generates the project skeleton under this directory
- your-api-group - the Kubernetes API group for the custom resource handled by our operator (e.g. mycompany.com)
- version - the Kubernetes API version for the custom resource handled by our operator (e.g. v1alpha1, v1beta1, etc.; see Kubernetes API versioning)
- custom-resource-kind - the name of our custom resource type
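As a concrete illustration, the operator described later in this post could be bootstrapped along these lines (the repository and project names here are only illustrative; the API group and kind match the ones used below):
$ cd $GOPATH/src/github.com/banzaicloud/
$ operator-sdk new prometheus-jmx-exporter-operator --api-version=banzaicloud.com/v1alpha1 --kind=PrometheusJmxExporter
$ cd prometheus-jmx-exporter-operator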
Define the Kubernetes resources to watch 🔗︎
The main.go under cmd/<operator-project-name> is our main point of entry, from which we initialize and start the operator. This is the place to configure the list of resource types about which the operator wants to be notified by Kubernetes.
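A minimal sketch of such a main.go, assuming the pre-1.0 SDK layout used throughout this post (import paths and signatures may differ in later SDK versions), looks roughly like this:
package main

import (
	"context"

	stub "github.com/<your-github-repo>/<operator-project-name>/pkg/stub"
	sdk "github.com/operator-framework/operator-sdk/pkg/sdk"
)

func main() {
	// watch our custom resource kind in the target namespace;
	// the last argument is the resync period
	sdk.Watch("<your-api-group>/<version>", "<custom-resource-kind>", "default", 0)
	// every event for the watched resources is dispatched to this handler
	sdk.Handle(stub.NewHandler())
	// start the informers and block
	sdk.Run(context.TODO())
}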
Define the operator logic in a designated handler 🔗︎
The events related to the watched resources received from Kubernetes are channeled into func (h *Handler) Handle(ctx types.Context, event types.Event) error, defined in pkg/stub/handler.go. This is the place to implement the operator logic that reacts to the various events received from Kubernetes.
Each custom resource has a structure. The structure of the custom resource handled by our operator must be specified in types.go, which resides under pkg/apis/<api-group>/<version>. The Spec field is where we define the structure of the custom resource’s specification. There is also a Status field, which is meant to be populated with information describing the state of the custom resource object.
The Operator SDK exposes functions for performing CRUD operations on Kubernetes resources:
- query package - defines functions for retrieving Kubernetes resources available in the cluster
- action package - defines functions for creating, updating and deleting Kubernetes resources
For more details on how to use these functions, see the concrete operator example below.
Update and generate code for custom resources 🔗︎
Whenever there are changes to types.go, the generated code that depends on the types defined there must be refreshed:
$ operator-sdk generate k8s
Build and generate the operator deployment manifests 🔗︎
Build the operator and generate deployment files.
$ operator-sdk build <your-docker-image>
This builds a Docker image that contains the binary of your operator; the image needs to be pushed to a registry.
The deployment files for creating custom resources and for deploying the operator that handles them are generated in the deploy directory.
- operator.yaml - installs the custom resource definition and deploys the operator (custom controller). Any changes to this file will be overwritten whenever operator-sdk build <your-docker-image> is executed.
- cr.yaml - defines the spec of the custom resource. This will be unmarshalled into an object and passed to the operator.
- rbac.yaml - defines the RBAC rules to be created for the operator, in case the Kubernetes cluster has RBAC enabled.
Deploy the operator 🔗︎
$ kubectl create -f deploy/rbac.yaml
$ kubectl create -f deploy/operator.yaml
Create custom resources 🔗︎
Once the operator is running, you can start creating custom resources. Populate the spec section of deploy/cr.yaml with the data that you want to pass to the operator. The structure of spec must comply with the structure of the Spec field in types.go.
$ kubectl create -f deploy/cr.yaml
To see the custom resource objects in the cluster:
$ kubectl get <custom-resource-kind>
To see a specific custom resource instance:
$ kubectl get <custom-resource-kind> <custom-resource-object-name>
The Prometheus JMX Exporter case 🔗︎
Our PaaS Pipeline deploys applications to Kubernetes clusters and provides enterprise features like monitoring and centralized logging, just to name a few.
For monitoring, we use Prometheus to collect metrics from the applications we deploy. If you’re interested in why we chose Prometheus, you should read our monitoring blog series.
Applications may or may not publish metrics to Prometheus by themselves, so we are faced with the question of how to enable publishing metrics to Prometheus, out of the box, for all apps. Prometheus JMX Exporter is a handy component written for Java applications that can query data from mBeans via JMX and expose it in the format required by Prometheus.
To use the exporter you must:
- identify pods that run those Java applications that don’t publish metrics to Prometheus
- inject the Prometheus JMX Exporter java agent into the application to expose its metrics
- provide a configuration for the Prometheus JMX Exporter java agent that controls what metrics are published
- make the Prometheus server automatically aware of an endpoint from where it can scrape metrics
- perform all of the above non-intrusively (i.e. without restarting the pod)
In order to accomplish the tasks we’ve just listed, we’ll need to perform quite a few operations, so we decided to implement an operator for them. Let’s see how this is done.
Prometheus JMX Exporter, as it is implemented, can only be loaded into a Java process at JVM startup. Fortunately, only a small change is necessary to make it loadable into an already running Java process. You can take a look at the change in our jmx_exporter fork.
We need a loader that loads the JMX Exporter Java agent into a running Java process identified by its PID. The loader is a fairly small application and its source code is available here:
Prometheus JMX Exporter requires a configuration to be passed in. We’ll store the configuration for the exporter in a Kubernetes config map.
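A minimal config map could look something like the example below; the name and key match the custom resource example further down, while the exporter settings themselves are only an illustrative, catch-all configuration:
apiVersion: v1
kind: ConfigMap
metadata:
  name: prometheus-jmx-exporter-config
data:
  config.yaml: |
    lowercaseOutputName: true
    rules:
    - pattern: ".*"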
The custom resource for our operator (types.go):
type PrometheusJmxExporter struct {
	metav1.TypeMeta   `json:",inline"`
	metav1.ObjectMeta `json:"metadata"`
	Spec              PrometheusJmxExporterSpec   `json:"spec"`
	Status            PrometheusJmxExporterStatus `json:"status,omitempty"`
}

type PrometheusJmxExporterSpec struct {
	LabelSelector map[string]string `json:"labelSelector,required"`
	Config        struct {
		ConfigMapName string `json:"configMapName,required"`
		ConfigMapKey  string `json:"configMapKey,required"`
	} `json:"config"`
	Port int `json:"port,required"`
}
- LabelSelector - specifies the labels by which pods are selected
- ConfigMapName, ConfigMapKey - the config map that contains the configuration of the Prometheus JMX Exporter
- Port - the port number of the endpoint where metrics will be exposed to the Prometheus server
Here’s an example yaml file for creating custom resource objects:
apiVersion: "banzaicloud.com/v1alpha1"
kind: "PrometheusJmxExporter"
metadata:
name: "example-prom-jmx-exp"
spec:
labelSelector:
app: dummyapp
config:
configMapName: prometheus-jmx-exporter-config
configMapKey: config.yaml
port: 9400
The custom resource spec holds the data that instructs the operator logic as to which pods to process, the port to expose the metrics at, and the config map that stores the metrics configuration for the exporter.
The status of a PrometheusJmxExporter custom resource object should list the metrics endpoints that were created based on its spec, thus the structure of the Status field is:
type PrometheusJmxExporterStatus struct {
	MetricsEndpoints []*MetricsEndpoint `json:"metricsEndpoints,omitempty"`
}

type MetricsEndpoint struct {
	Pod  string `json:"pod,required"`
	Port int    `json:"port,required"`
}
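Rendered as yaml, the status of a processed custom resource object would look something like the snippet below (the pod name is, of course, hypothetical):
status:
  metricsEndpoints:
  - pod: dummyapp-5d7f9cbdd7-x2k9v
    port: 9400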
The operator has to react to events related to PrometheusJmxExporter custom resources and Pods, thus it has to watch both of these resource kinds (main.go):
func main() {
	...
	namespace := os.Getenv("OPERATOR_NAMESPACE")
	sdk.Watch("banzaicloud.com/v1alpha1", "PrometheusJmxExporter", namespace, 0)
	sdk.Watch("v1", "Pod", namespace, 0)
	...
}
The handler for events related to PrometheusJmxExporter custom resources and Pods is defined in handler.go:
func (h *Handler) Handle(ctx types.Context, event types.Event) error {
	switch o := event.Object.(type) {
	case *v1alpha1.PrometheusJmxExporter:
		prometheusJmxExporter := o
		...
	case *v1.Pod:
		pod := o
		...
	}
	...
}
When a PrometheusJmxExporter custom resource object is created/updated, the operator:
- queries all pods in the current namespace whose label matches the labelSelector of the PrometheusJmxExporter custom resource object spec
- verifies which of the returned pods are already processed in order to skip them
- processes the remaining pods
- updates the status of the current PrometheusJmxExporter custom resource with the newly created metrics endpoints
When a Pod is created/updated/deleted the operator:
- searches for the PrometheusJmxExporter custom resource object whose labelSelector matches the pod’s labels
- if a PrometheusJmxExporter custom resource object is found, then it continues processing the pod
- updates the status of the PrometheusJmxExporter custom resource with the newly created metrics endpoints
In order to query Kubernetes resources, we use the query package of the Operator SDK.
e.g.:
podList := v1.PodList{
	TypeMeta: metav1.TypeMeta{
		Kind:       "Pod",
		APIVersion: "v1",
	},
}

listOptions := query.WithListOptions(&metav1.ListOptions{
	LabelSelector:        labelSelector,
	IncludeUninitialized: false,
})

err := query.List(namespace, &podList, listOptions)
if err != nil {
	logrus.Errorf("Failed to query pods : %v", err)
	return nil, err
}

jmxExporterList := v1alpha1.PrometheusJmxExporterList{
	TypeMeta: metav1.TypeMeta{
		Kind:       "PrometheusJmxExporter",
		APIVersion: "banzaicloud.com/v1alpha1",
	},
}

listOptions := query.WithListOptions(&metav1.ListOptions{
	IncludeUninitialized: false,
})

if err := query.List(namespace, &jmxExporterList, listOptions); err != nil {
	logrus.Errorf("Failed to query prometheusjmxexporters : %v", err)
	return nil, err
}
To update Kubernetes resources, we use the action package of the Operator SDK, like so:
// update status
newStatus := createPrometheusJmxExporterStatus(podList.Items)

if !prometheusJmxExporter.Status.Equals(newStatus) {
	prometheusJmxExporter.Status = createPrometheusJmxExporterStatus(podList.Items)

	logrus.Infof(
		"PrometheusJmxExporter: '%s/%s' : Update status",
		prometheusJmxExporter.Namespace,
		prometheusJmxExporter.Name)

	action.Update(prometheusJmxExporter)
}
The processing of a pod consists of the following steps:
- execute jps inside the containers of the pod to get the PIDs of Java processes
- copy the Prometheus JMX Exporter and the Java agent loader artifacts into the containers where a Java process has been found
- read the exporter configuration from the config map and copy it into the container as a config file
- run the loader inside the container to load the exporter into the Java process
- add the port of the exporter to the container’s exposed port list, so the Prometheus server will be able to scrape that port
- annotate the pod with prometheus.io/scrape and prometheus.io/port, since the Prometheus server scrapes pods with these annotations (see the sketch below)
- flag the pod with an annotation to mark that it has been successfully processed
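To illustrate the last two steps, annotating the pod boils down to mutating its metadata and persisting the change with the action package; here is a simplified sketch (the port variable is assumed to hold the exporter port from the spec, and the processed marker annotation key is a hypothetical example):
// mark the pod so the Prometheus server discovers and scrapes it
if pod.Annotations == nil {
	pod.Annotations = make(map[string]string)
}
pod.Annotations["prometheus.io/scrape"] = "true"
pod.Annotations["prometheus.io/port"] = strconv.Itoa(port)

// hypothetical marker annotation so the same pod isn't processed twice
pod.Annotations["prometheusJmxExporter.banzaicloud.com/processed"] = "true"

// persist the modified pod through the Operator SDK's action package
action.Update(pod)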
As the Kubernetes API doesn’t directly support the execution of commands inside containers, we borrowed this implementation from kubectl exec. The same is true for kubectl cp.
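For reference, executing a command such as jps inside a container goes through the pod’s exec subresource using client-go’s remotecommand package; a rough sketch of that approach follows (clientset, restConfig, pod and containerName are assumed to be in scope, and error handling is trimmed):
// build a request against the pod's "exec" subresource
req := clientset.CoreV1().RESTClient().Post().
	Resource("pods").
	Namespace(pod.Namespace).
	Name(pod.Name).
	SubResource("exec").
	VersionedParams(&v1.PodExecOptions{
		Container: containerName,
		Command:   []string{"jps", "-q"},
		Stdout:    true,
		Stderr:    true,
	}, scheme.ParameterCodec)

// stream the command over SPDY, the same way kubectl exec does
exec, err := remotecommand.NewSPDYExecutor(restConfig, "POST", req.URL())
if err != nil {
	return err
}

var stdout, stderr bytes.Buffer
err = exec.Stream(remotecommand.StreamOptions{
	Stdout: &stdout,
	Stderr: &stderr,
})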
The source code of the Prometheus JMX Exporter operator is available on GitHub.