At Banzai Cloud we place a lot of emphasis on the observability
of applications deployed to the Pipeline Platform, which we built on top of Kubernetes. To this end, one of the key components we use is Prometheus. Just to recap, Prometheus provides:
- open source systems monitoring and alerting
- a powerful query language (PromQL)
- pull-based metrics gathering
- a simple text format for metrics exposition
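To illustrate the last point, metrics are exposed as plain text over HTTP; a JVM heap gauge as produced by the JMX exporter looks roughly like this (the exact value is, of course, illustrative):

```
# HELP jvm_memory_bytes_used Used bytes of a given JVM memory area.
# TYPE jvm_memory_bytes_used gauge
jvm_memory_bytes_used{area="heap"} 1.23456789E8
```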
Problem statement
- Usually, legacy applications are not exactly prepared for the last two of these, so we need a solution that bridges the gap for systems that do not speak the Prometheus metrics format: enter exporters.
- Containerized (legacy) applications are already pre-packaged as Kubernetes deployments, and operators are not always prepared to modify them, or capable of doing so.
- We need a way to place or inject the Prometheus JMX exporter into containers/Java processes within a Pod.
- Ideally, this happens without any modification of existing deployment descriptors or code; the exporter automagically works for any deployment, gets configured, and exposes metrics.
Resolution
In our A complete guide to Kubernetes Operator SDK post, we demonstrated how to write a Kubernetes operator using the new operator-sdk, through the example of injecting a Prometheus JMX Exporter into a Java process to add JVM metrics monitoring on the fly. This post takes a deep dive into the building blocks the operator needs in order to inject the Prometheus JMX Exporter into a Java process running inside a Pod:
- Prometheus Java Agent Loader
- Running commands inside a running container, remotely
- Deploying application artifacts into running containers
Prometheus Java Agent Loader
Assuming that, somehow, we are able to get inside a container, inside a Pod, we need a way of attaching to a running Java process identified by a pid, and injecting the Prometheus JMX Exporter Java agent into it. This is what the Prometheus Java Agent Loader does for us.
Find the Java process with a given pid
private static VirtualMachineDescriptor findVirtualMachine(long pid) {
    List<VirtualMachineDescriptor> vmds = VirtualMachine.list();
    List<VirtualMachineDescriptor> selectedVmds = vmds.stream()
            .filter(javaProc -> javaProc.id().equalsIgnoreCase(Long.toString(pid)))
            .collect(Collectors.toList());
    if (selectedVmds.isEmpty()) {
        System.out.println(String.format("No Java process with PID %d found!", pid));
        System.exit(1);
    }
    return selectedVmds.get(0);
}
This code snippet lists all the virtual machines running on the host and returns the one whose VirtualMachineDescriptor.id matches the given pid.
Inject Java Agent into the running process
Once we’ve got the VirtualMachineDescriptor of the running Java process, we attach to it:
private VirtualMachine attach(VirtualMachineDescriptor vmd) {
    VirtualMachine vm = null;
    try {
        vm = VirtualMachine.attach(vmd);
    } catch (AttachNotSupportedException | IOException ex) {
        System.out.println("Attaching to (" + vmd + ") failed due to: " + ex);
    }
    return vm;
}
In the next step, we inject the agent into the attached Java process using VirtualMachine.loadAgent:
public boolean loadInto(VirtualMachineDescriptor vmd) {
    VirtualMachine vm = attach(vmd);
    if (vm == null) {
        return false;
    }
    try {
        vm.loadAgent(
                prometheusAgentPath,
                String.format("%d:%s", prometheusPort, prometheusAgentConfigPath)
        );
        return true;
    } catch (AgentLoadException | AgentInitializationException | IOException e) {
        System.out.println(e);
    } finally {
        detach(vm);
    }
    return false;
}
Running commands inside a running container, remotely
It’s not enough to have an application that can load Java agents into a running Java process; we also have to make sure it’s capable of running inside a container, inside a Pod. We’re aware of the existence of kubectl exec, but instead of invoking kubectl exec from our operator, we opted to borrow the kubectl exec implementation for our operator. The entire source code of the implementation is available here.
In a nutshell, we POST an exec request to the container of a Pod. The parameters of the request contain the actual command to be executed inside the container, while stdin/stdout carry the interaction with the running command:
execReq := kubeClient.CoreV1().RESTClient().Post()
execReq = execReq.Resource("pods").Name(podName).Namespace(namespace).SubResource("exec")
execReq.VersionedParams(&v1.PodExecOptions{
	Container: container.Name,
	Command:   command,
	Stdout:    true,
	Stderr:    true,
	Stdin:     stdinReader != nil,
}, scheme.ParameterCodec)

exec, err := remotecommand.NewSPDYExecutor(inClusterConfig, "POST", execReq.URL())
if err != nil {
	return err
}

stdOut := bytes.Buffer{}
stdErr := bytes.Buffer{}
err = exec.Stream(remotecommand.StreamOptions{
	Stdout: bufio.NewWriter(&stdOut),
	Stderr: bufio.NewWriter(&stdErr),
	Stdin:  stdinReader,
	Tty:    false,
})
Deploy application artifacts into running containers
Now that we know how to load an agent into a running Java process and execute a command remotely, the last step is to ship the artifacts of the agent loader and the agent itself into the running container. Again, we turned to kubectl for advice on how to perform kubectl cp using Kubernetes API calls.
How it’s implemented:
Package all the application artifacts into a tar file
// makeTar tars the files and subdirectories of srcDir into tarDestDir (the root directory within the tar file),
// then writes the created tar file to writer
func makeTar(srcDir, tarDestDir string, writer io.Writer) error {
	srcDirPath := path.Clean(srcDir)
	tarDestDirPath := path.Clean(tarDestDir)
	tarWriter := tar.NewWriter(writer)
	defer tarWriter.Close()
	return makeTarRec(srcDirPath, tarDestDirPath, tarWriter)
}

// makeTarRec recursively tars the content of srcPath
func makeTarRec(srcPath, tarDestPath string, writer *tar.Writer) error {
	stat, err := os.Lstat(srcPath)
	if err != nil {
		return err
	}
	if stat.IsDir() {
		files, err := ioutil.ReadDir(srcPath)
		if err != nil {
			return err
		}
		if len(files) == 0 {
			// empty directory: write its header only
			hdr, err := tar.FileInfoHeader(stat, srcPath)
			if err != nil {
				return err
			}
			hdr.Name = tarDestPath
			if err := writer.WriteHeader(hdr); err != nil {
				return err
			}
		}
		for _, f := range files {
			if err := makeTarRec(path.Join(srcPath, f.Name()), path.Join(tarDestPath, f.Name()), writer); err != nil {
				return err
			}
		}
	} else if stat.Mode()&os.ModeSymlink != 0 {
		// symbolic link: record its target in the header
		hdr, err := tar.FileInfoHeader(stat, srcPath)
		if err != nil {
			return err
		}
		target, err := os.Readlink(srcPath)
		if err != nil {
			return err
		}
		hdr.Linkname = target
		hdr.Name = tarDestPath
		if err := writer.WriteHeader(hdr); err != nil {
			return err
		}
	} else {
		// regular file or other file type, like a pipe
		hdr, err := tar.FileInfoHeader(stat, srcPath)
		if err != nil {
			return err
		}
		hdr.Name = tarDestPath
		if err := writer.WriteHeader(hdr); err != nil {
			return err
		}
		f, err := os.Open(srcPath)
		if err != nil {
			return err
		}
		defer f.Close()
		_, err = io.Copy(writer, f)
		return err
	}
	return nil
}
Execute tar xfm - -C remotely inside the container
The tar command reads the tar archive from stdin and extracts it into a destination directory inside the container. The tar archive containing our artifacts is written to the stdin of the tar command running inside the container.
// copyToPod uploads the content of srcDir to destDir on the container of the pod identified by podName
// in namespace
func copyToPod(namespace, podName string, container *v1.Container, srcDir, destDir string) error {
	logrus.Infof("Copying the content of '%s' directory to '%s/%s/%s:%s'", srcDir, namespace, podName, container.Name, destDir)
	ok, err := checkSourceDir(srcDir)
	if err != nil {
		return err
	}
	if !ok {
		logrus.Warnf("Source directory '%s' is empty. There is nothing to copy.", srcDir)
		return nil
	}
	if destDir != "/" && strings.HasSuffix(destDir, "/") {
		destDir = strings.TrimSuffix(destDir, "/")
	}
	err = createDestDirIfNotExists(namespace, podName, container, destDir)
	if err != nil {
		logrus.Errorf("Creating destination directory failed: %v", err)
		return err
	}
	reader, writer := io.Pipe()
	go func() {
		defer writer.Close()
		err := makeTar(srcDir, ".", writer)
		if err != nil {
			logrus.Errorf("Making tar file of '%s' failed: %v", srcDir, err)
		}
	}()
	_, err = execCommand(namespace, podName, reader, container, "tar", "xfm", "-", "-C", destDir)
	logrus.Infof("Copying the content of '%s' directory to '%s/%s/%s:%s' finished", srcDir, namespace, podName, container.Name, destDir)
	return err
}
Gluing it all together
With these building blocks we can remotely run a command inside containers to get the pids of all running Java processes, upload the artifacts of the agent loader, the Prometheus JMX Exporter agent, and the configuration files into the running container, then run the agent loader remotely, all without touching the Pod spec, thus avoiding Pod restarts.
This is used extensively in Pipeline in cases pertaining to legacy Java deployments in order to expose application JMX metrics.