
At Banzai Cloud we strive to enable a secure software supply chain which ensures that applications deployed with the Pipeline platform and Pipeline Kubernetes Engine are secure, without reducing developer productivity across all environments (on-premise, multi-, hybrid-, and edge-cloud). While we have our own internal processes and a dedicated security team working full time on hardening the entire application platform stack, it also makes sense to provide confidence to our customers following industry standard benchmarks.

Read more...

One of the core features of Pipeline, Banzai Cloud’s application and devops container management platform, is multi-dimensional autoscaling based on default and custom metrics. Upon our introduction of custom metrics, we opted for an approach that relied on the Prometheus Adapter to gather metrics from Prometheus. Since then, a lot of our customers have begun using Horizontal Pod Autoscaling, and most of them have been satisfied with only basic CPU & memory metrics.

Read more...

Banzai Cloud is proud to announce that our open source Pipeline Kubernetes Engine is now a CNCF Certified Kubernetes Distribution! PKE is an extremely simple Kubernetes installer and distribution, designed to work anywhere, and is the preferred run-time of Banzai Cloud’s cloud native application and devops container management platform, Pipeline. Banzai Cloud Pipeline supercharges the development, deployment and scaling of container-based applications with native support for multi-, hybrid-, and edge-cloud environments.

Read more...

Two weeks ago we announced the first release of our Istio operator. Since then we’ve added support for Istio’s preliminary 1.1 release. This post will detail how and why you should try it. In creating the operator, our main goal was to simplify the deployment and management of Istio’s components. This release is still in alpha, and its main goal is still to replace Helm charts as a preferred means of installing Istio, but it provides a few additional features we think you’ll find convenient.

Read more...

One of the key features of the Pipeline platform is its ability to automatically provision, manage, and operate different application frameworks through what we call spotguides. Among the many spotguides we support on Kubernetes (Spark, Zeppelin, NodeJS, Golang, even custom frameworks - to name a few) Apache Kafka is among the most popular. We are heavily invested in making it as easy and straightforward as possible to operate Apache Kafka automatically on Kubernetes, and we believe that our current Apache Kafka Spotguide does just that.

Read more...

Last week we released our open source Istio operator designed to help ease the sometimes difficult task of managing Istio. One of the main features of the operator is its ability to manage a single mesh multi-cluster Istio. Multi cluster scenarios: typical multi-cluster-based patterns are single mesh - combining multiple clusters into one unit managed by one Istio control plane - and mesh federation, wherein multiple clusters act as individual management domains and the service exposure between those domains is done selectively.

Read more...

At Banzai Cloud we try to provide our users with a unified, cloud and on-premise-agnostic authentication and authorization mechanism. Note that our Pipeline platform supports cloud provider-managed Kubernetes and, as of recently, our own Kubernetes distribution - the Pipeline Kubernetes Engine, PKE. We also recently introduced an open source project, JWT-to-RBAC (you can read more about that project, here), designed to solve authentication and authorization challenges within the Pipeline platform in a cloud provider-agnostic way.

Read more...

Service mesh has, without question, been one of the most vigorously debated and obsessed over topics of discussion in recent memory. It seems like, whichever way you turn, you run into heated arguments between those developers that are convinced that service mesh will outgrow even Kubernetes, and the naysayers convinced that, outside of use in a few large companies, service mesh is impractical to the point of uselessness. As always, the truth probably lies somewhere in between, but that doesn’t mean you can avoid developing an opinion, especially if you’re a Kubernetes distribution and platform provider like us.

Read more...

If you’re a Node.js developer, it’s very likely that you are building microservices and have already come across Kubernetes. Kubernetes is a solid foundation for scalable container deployments but is also infamous for its steep learning curve. This post will explain how to containerize your Node.js applications and what it will take to make them production-ready on Kubernetes. While this post can be used as a tutorial for DIY jobs, we have also automated this entire process, allowing developers to kickstart their code to production experience with a few clicks.

Read more...

At Banzai Cloud we are building a managed Cloud Native application and devops platform called Pipeline. Pipeline supercharges the development, deployment and scaling of container-based applications with native support for multi and hybrid-cloud environments. The Pipeline platform provides support for advanced scheduling that enables enterprises to run their workflows in an efficient way by scheduling workflows to nodes that meet the needs of the workflow (e.g.: CPU, memory, network, IO, spot price, etc).

Read more...

Banzai Cloud’s Pipeline platform allows enterprises to develop, deploy and scale container-based applications on six cloud providers, using multiple Kubernetes distributions. One significant difference between the cloud providers that support Kubernetes (we support ACSK, EKS, AKS, GKE, DO and OKE) and our own Banzai Cloud Pipeline Kubernetes Engine is our ability to access the Kubernetes API server, and to configure it. Whether our enterprise customers are using Banzai Cloud’s PKE distribution in a hybrid environment, or cloud provider-managed Kubernetes, they demand we meet the same high standards - the ability to authenticate and authorize (e.

Read more...

One of the Banzai Cloud Pipeline platform’s key open-source projects is Bank-Vaults - the Vault swiss-army knife (and more) for Kubernetes. Feature requirements are part of the Pipeline platform, and the relatively large community around Bank-Vaults also has its own use cases and requirements. We’ve received lots of external contributions (thank you!), and we continue to find time to work on our community-driven features. While there have been many besides, these are the most sought-after features of the last few weeks.

Read more...

At Banzai Cloud we’re building a managed Cloud Native application and devops platform, called Pipeline. Pipeline supercharges the development, deployment and scaling of container-based applications with native support for multi- and hybrid-cloud environments. Pipeline’s built-in CI/CD solution is capable of creating Kubernetes clusters, running and testing builds, packaging and deploying applications as Helm charts, and lots more—all while its secrets are stored and managed by Vault. If you’d like to read more about the CI/CD system’s other features, such as native Kubernetes support, unprivileged builds and more, please read this post.

Read more...

As those of you who are following us here at Banzai Cloud may or may not be aware, we are in the middle of releasing and certifying our own Kubernetes distribution - Pipeline Kubernetes Engine (PKE). PKE will be orchestrated the same way as other providers already supported by Pipeline, and will inherit those features of the Banzai Cloud Pipeline platform that you already know and love. If you’re interested in learning more about PKE and our vision for building multi and hybrid cloud managed (application) environments, please read this post

Read more...

Spotguides are one of the most useful features of the Banzai Cloud Pipeline platform. Spotguides are managed application environments, which provide an easy and scalable way of deploying applications, and programmatically take care of all the necessary “plumbing” that is critical for production (if you missed our introductory series on Spotguides, you can catch up here). A Spotguide defines a template for provisioning cloud resources (Kubernetes clusters, object stores, logging, monitoring, security, etc.

Read more...

Suppose you’re working on a project which is running on Kubernetes - like we usually do - and you would like to test out this project on each and every pull request or commit. You can write many unit and integration tests, but at the end of the day, the proof of the pudding is in the eating. A real test would be to start up the application on the same platform where it will end up being deployed in production (in this case Kubernetes) and exercise it with some real workloads (aka end-to-end tests).

Read more...

In December 2018 we released the public beta of Pipeline and introduced a Banzai Cloud terminology - spotguides. We have already gone deep into what Spotguides were and how they supercharged Kubernetes deployments of application frameworks (automated deployments, preconfigured GitHub repositories, CI/CD, job specific automated cluster sizing, Vault based secret management, etc.). This post is focused on one specific spotguide: Spark with HistoryServer. Since the very early days, one of the most popular deployments to Kubernetes has been Apache Spark.

Read more...

The following is a guest blog post from Robbie Blaine, Site Reliability Engineer at EOH Big Data Lab. Contributions from the community are a key factor in driving our products forward. BIG thanks to all of you who have engaged with us by raising issues, giving feedback, or creating pull requests. Keep them coming, we love them! Automating Vault Deployment and Configuration on OKD with Bank-Vaults Hashicorp Vault is an Encryption-as-a-Service tool that is used to securely store and access secrets.

Read more...

Our last two blog posts about the Kubernetes scheduler explained how taints and tolerations and different types of affinities work. In today’s post we are going one layer deeper and we’ll discuss how to implement and deploy a custom Kubernetes scheduler. Writing a scheduler may sound intimidating at first, but if you follow this article you’ll realise that creating something that works and schedules pods based on some simple rules is quite easy.

Read more...

If you’re a frequent reader of this blog, you may have already seen a “short” description of what our platform does. It usually goes something like this: Banzai Cloud Pipeline is a solution-oriented application platform which allows enterprises to develop, deploy and securely scale container-based applications in multi- and hybrid-cloud environments. We frequently elaborate on this by providing a list of key features: Banzai Cloud Pipeline leverages best-of-breed cloud components, such as Kubernetes and adds a unified system architecture that enables a highly productive, yet flexible environment for developers and operations teams alike.

Read more...

One of the main features of the Banzai Cloud Pipeline platform is that it allows enterprises to run workloads cost effectively by mixing spot or preemptible instances with regular ones, without sacrificing overall reliability. The platform allows enterprises to develop, deploy and scale container-based applications and it leverages best-of-breed cloud components, such as Kubernetes, to create a highly productive yet flexible environment for developers and operation teams alike. tl;dr The Banzai Cloud Pipeline platform switched to a unified, cloud-aware spot instance termination handler to properly drain the cluster node and provide information to the monitoring system if an instance is going to be preempted from a cluster nodepool.

Read more...

About a year ago we published a blog post on Spotguides, a core feature of the Banzai Cloud Pipeline platform. We spent a lot of time using and refining the original ideas, and as a result, many things have changed since we first introduced the concept. In this blog post we’ll learn about how Spotguides are used to easily deploy and manage complex cloud-native application stacks. What are Spotguides? At Banzai Cloud most of the project names and terminologies are borrowed from surfing (yes, a few of us are eager surfers).

Read more...

Banzai Cloud’s Pipeline platform and Kubernetes distribution tames the complexity inherent in the development, deployment, and scaling of modern containerized applications. The platform seeks to bring the power of cutting-edge cloud and containerization technologies to a wide range of enterprises. “Runners focus on the race they’re running, not the materials their shoes are made of; they trust that their shoes will get them to the finish line,” says Kris Flautner, CEO of Banzai Cloud.

Read more...

A strong focus on security has always been a key part of the Banzai Cloud Pipeline platform. We incorporated Vault into our architecture early in the design process, and developed a number of supporting components so it can be used easily on Kubernetes. We love what Vault enables us to do but, as with many things security-related, strengthening one part of a system exposed a weakness elsewhere. For us, that weakness was K8s secrets, which is the way in which applications usually consume secrets and credentials on Kubernetes.

Read more...

Admission webhook series: In-depth introduction to Kubernetes admission webhooks Detecting and blocking vulnerable containers in Kubernetes (deployments) Controlling the scheduling of pods on spot instance clusters Banzai Cloud’s Pipeline platform uses a number of Kubernetes webhooks to provide several advanced features, such as spot instance scheduling, vulnerability scans and some advanced security features (to bypass K8s secrets - more to come next week). The Pipeline webhooks are all open source and available on our GitHub:

Read more...

As of version 1.6, Kubernetes provides role-based access control (RBAC) so that administrators can set up fine-grained access to a variety of Kubernetes resources. It would take too long to fully explain why it makes sense to use RBAC in this post, but, in a nutshell, RBAC provides a level of control that most enterprises need to meet their security requirements within Kubernetes clusters. Processes and human operators that assume the identity of a Kubernetes Service Account will authenticate with said account and gain its associated access rights.

Read more...

The Kubernetes scheduler can be constrained to place a pod on particular nodes using a few different options. One of these options is node and pod affinities. In a smaller homogeneous cluster they probably don’t make too much sense, because the scheduler does a good job of spreading pods across different nodes (well, that’s its job), but when you have a larger cluster with different types of nodes, maybe even spread across availability zones or multiple racks, then affinities may come in handy.

Read more...

These days it seems that everyone is using some sort of CI/CD solution for their software development projects, either a third-party service, or something written in house. Those of us working on the Banzai Cloud Pipeline platform are no different; our CI/CD solution is capable of creating Kubernetes clusters, running and testing builds, pulling secrets from Vault, packaging and deploying applications as Helm charts, and lots more. For quite a while now (since the end of 2017), we’ve been looking for a Kubernetes native solution but could not find many.

Read more...

Banzai Cloud has closed its latest round of seed funding with a total investment of $2.5 million. The round was led by PortfoLion, a Central European venture capital and private equity fund, and included financing from FastVentures and Euroventures of Budapest, the latter being an angel investor that has been with the company since its foundation in 2017. “We’re very excited by this opportunity, by the exceptional team Banzai Cloud has brought together that will make them a force to be reckoned with in the world of enterprise computing.

Read more...

Banzai Cloud is on a mission to simplify the development, deployment, and scaling of complex applications and to bring the full power of Kubernetes to all developers and enterprises. Banzai Cloud’s Pipeline provides a platform which allows enterprises to develop, deploy and scale container-based applications. It leverages best-of-breed technology from the Cloud Native Computing Foundation ecosystem to create a highly productive, yet flexible environment for developers and operation teams alike. One of the key tools we use from the Kubernetes ecosystem is Helm.

Read more...

Banzai Cloud’s Pipeline provides a platform which allows enterprises to develop, deploy and scale container-based applications. It leverages best-of-breed cloud components, such as Kubernetes, to create a highly productive, yet flexible environment for developers and operations teams alike. Strong security measures—multiple authentication backends, fine-grained authorization, dynamic secret management, automated secure communications between components using TLS, vulnerability scans, static code analysis, etc.—are a tier zero feature of the Pipeline platform, which we strive to automate and enable for all enterprises.

Read more...

Business continuity is a key requirement for our enterprise users, and it becomes exponentially more important in a cloud native environment. The Banzai Cloud Platform operates in such an environment, deploying and managing large scale container-based applications. It leverages best-of-breed cloud components, such as Kubernetes, to create a highly productive, yet flexible environment, and provides a safety net to backup and restore cluster and application states during the lifecycle of the cluster.

Read more...

Banzai Cloud’s Pipeline platform is an operating system which allows enterprises to develop, deploy and scale container-based applications. It leverages best-of-breed cloud components, such as Kubernetes, to create a highly productive, yet flexible environment for developers and operations teams alike. Strong security - multiple authentication backends, fine-grained authorization, dynamic secret management, automated secure communications between components using TLS, vulnerability scans, static code analysis, etc. - is a tier zero feature of the Pipeline platform, which we strive to automate and enable for all enterprises.

Read more...

Satellite is a Golang library and RESTful API that determines a host’s cloud provider with a simple HTTP call. Behind the scenes, it uses file systems and provider metadata to properly identify cloud providers. When we started to work on Pipeline and the Banzai Cloud Pipeline Platform Operators, we soon realized how frequently we would need to find out which cloud provider the service was actually running on. Note that Pipeline supports 6 different cloud providers

Read more...

Monitoring series: Monitoring Apache Spark with Prometheus Monitoring multiple federated clusters with Prometheus - the secure way Application monitoring with Prometheus and Pipeline Building a cloud cost management system on top of Prometheus Monitoring Spark with Prometheus, reloaded Hands on Thanos Monitoring Vault on Kubernetes using Cloud Native technologies At Banzai Cloud we are building a feature rich enterprise-grade application platform, built for containers on top of Kubernetes, called Pipeline.

Read more...

The Banzai Cloud Cloudinfo service retrieves product and pricing information from cloud providers and exposes it through a RESTful API, and UI. Our Kubernetes based Pipeline platform and Telescopes recommendation engine make use of this information when they advise users on cluster layout and resourcing. Here’s a quick primer of how and why we utilize the Cloudinfo service: Pipeline platform users have the option of launching clusters or deploying applications based only on resource- and SLA-requirements (price, IO, memory, CPU, GPU, etc.

Read more...

One of the main advantages of the Pipeline platform is that it allows users to use their infrastructure cost effectively; Telescopes helps with cluster and machine instance recommendations, Hollowtrees enables SLA-aware cost reduction using spot instances, and autoscalers allow for multi-dimensional autoscaling based on custom metrics. This post will highlight some new features of the Banzai Cloud Horizontal Pod Autoscaler Kubernetes Operator and the advanced automation supported by Pipeline - a new, forward-thinking way to operate Kubernetes clusters and autoscale deployments.

Read more...

Enterprises often use multi-tenant and heterogeneous clusters to deploy their applications to Kubernetes. These applications usually have needs which require special scheduling constraints. Pods may require nodes with special hardware, isolation, or colocation with other pods running in the system. The Pipeline platform allows users to express their constraints in terms of resources (CPU, memory, network, IO, etc.). These requirements are turned into infrastructure specifications using Telescopes. Once the cluster nodes are created and properly labeled by Pipeline, deployments are run with the specified constraints automatically on top of Kubernetes.

Read more...

Banzai Cloud announced today that it is collaborating with Oracle to bring its feature-rich application platform to Oracle Kubernetes Engine users. Banzai Cloud’s Pipeline deployment automation and execution engine enables developers to go from commit to scale in minutes by automating all the underlying tasks that provide convenient CI/CD flows, robust security, analytics and the ability to scale. The technology not only provides significant productivity gains to developers, but also increases operational efficiencies by aiding instance selection and automated introspection of large-scale workloads.

Read more...

Last year Alibaba joined CNCF and announced plans to create their own Kubernetes service - Alibaba ACK. The service was launched more than a year ago, with its stated objective to make it easy to run Kubernetes on Alibaba Cloud without needing to install, operate, and maintain a Kubernetes control plane. At Banzai Cloud we are committed to providing support for Kubernetes on all major cloud providers, thus one of our priorities was to enable Alibaba Cloud’s Container Service for Kubernetes in Pipeline and take the DevOps experience to the next level by turning ACK into a feature-rich enterprise-grade application platform.

Read more...

At Banzai Cloud we are building an application-centric platform for containers - Pipeline - running on Kubernetes to allow developers to go from commit to scale in minutes. We support multiple development languages and frameworks to build applications, with one common goal: all Pipeline deployments receive integrated CI/CD, centralized logging, monitoring, enterprise-grade security, autoscaling, and spot price support automatically, out of the box. In most cases we accomplish this in a non-intrusive way (i.

Read more...

Apache Spark on Kubernetes series: Introduction to Spark on Kubernetes Scaling Spark made simple on Kubernetes The anatomy of Spark applications on Kubernetes Monitoring Apache Spark with Prometheus Apache Spark CI/CD workflow howto Spark History Server on Kubernetes Spark scheduling on Kubernetes demystified Spark Streaming Checkpointing on Kubernetes Deep dive into monitoring Spark and Zeppelin with Prometheus Apache Spark application resilience on Kubernetes Apache Zeppelin on Kubernetes series: Running Zeppelin Spark notebooks on Kubernetes Running Zeppelin Spark notebooks on Kubernetes - deep dive CI/CD flow for Zeppelin notebooks

Read more...

Banzai Cloud is happy to announce that it is an Amazon EKS Platform Partner. Last year Amazon joined CNCF and announced plans to create their own Kubernetes service - Amazon EKS. The service was launched this June, with the objective of making it easy for you to run Kubernetes on AWS without needing to install, operate, and maintain your own Kubernetes control plane. At Banzai Cloud we are committed to providing support for Kubernetes on all major cloud providers for our users, thus one of our priorities was to enable Amazon EKS in Pipeline and take the DevOps and user experience to the next level by turning EKS into a feature rich enterprise-grade application platform.

Read more...

At Banzai Cloud we are building a feature rich enterprise-grade application platform, built for containers on top of Kubernetes, called Pipeline. We have always been committed to supporting Kubernetes and our container based application platform on all major providers, however, we are also committed to making portability between cloud vendors easy, seamless and automated. Accordingly, this post will highlight a few important aspects of a multi-cloud approach we learned from our users, and the open source code we developed and made part of the Pipeline platform.

Read more...

At Banzai Cloud we put a lot of emphasis on observability, so we automatically provide centralized monitoring and log collection for all clusters and deployments done through Pipeline. Over the last few months we’ve been experimenting with different approaches - tailored and driven by our customers’ individual needs - the best of which are now coded into our open source Logging-Operator. Just to recap, here are our earlier posts about logging using the fluent ecosystem Centralized log collection on Kubernetes.

Read more...

Continuing our commitment to support all major cloud providers, today we are adding support for Oracle’s Kubernetes-managed cloud service, OKE – Oracle Kubernetes Engine in Pipeline. We are building a feature rich enterprise-grade application platform on top of Kubernetes - called Pipeline - to deliver a better DevOps experience by automating the lifecycle management of applications. This experience is now available to OKE users as well; here, we will guide you through the first steps of using OKE and summarize the benefits of using Pipeline.

Read more...

At Banzai Cloud we are building a feature rich enterprise-grade application platform, built for containers on top of Kubernetes, called Pipeline. With Pipeline we provision large, multi-tenant Kubernetes clusters on all major cloud providers, specifically AWS, GCP, Azure, AliCloud, Oracle and BYOC - on-premise and hybrid - and deploy all kinds of predefined or ad-hoc workloads to these clusters. For us and our enterprise users authentication and authorization is absolutely vital, thus, in order to access the Kubernetes API and the Services in an authenticated manner as defined within Kubernetes, we arrived at a simple but flexible solution.

Read more...

A few months ago the Kubernetes Operator SDK was released with one of its goals being the conversion of human operational knowledge into code. At Banzai Cloud we’ve been contributors and early adopters of this technology, since it provides a better standardized method of automating our processes and allows us to dramatically ease the lives of our customers. We are building a feature rich enterprise-grade application platform, built for containers on top of Kubernetes, called Pipeline, wherein we endeavour to automate the DevOps experience and the lifecycle of deployments.

Read more...

At Banzai Cloud we are building a feature-rich enterprise-grade application platform, built for containers on top of Kubernetes, called Pipeline. Applications deployed to Pipeline automatically inherit the platform’s features: enterprise-grade security, observability (centralized log collection, monitoring and tracing), discovery, high availability and resiliency, just to name a few - encapsulated in spotguides. One of the most popular spotguides we deploy is Spark. In the past few months we’ve been working and pushing many pull requests to make Spark a first class player on Kubernetes and to make it resilient.

Read more...

A few weeks back we released Telescopes, our Kubernetes cluster layout recommender application. That application has evolved quite a bit, and in this post we’ll provide insight into some of its new features and recent changes. Cloud cost management series: Overspending in the cloud; Managing spot instance clusters on Kubernetes with Hollowtrees; Monitor AWS spot instance terminations; Diversifying AWS auto-scaling groups; Draining Kubernetes nodes. tl;dr: We added new features to Telescopes to provide support for blacklisting or whitelisting instance types; recommendation accuracies can now be checked; there is now support that allows asking cloud instance types for CPU, memory and network performance.

Read more...

One of our goals at Banzai Cloud is to make our customers’ lives easier by providing low barrier to entry, easy to use solutions for running applications on Kubernetes. To achieve this, we often rely on Kubernetes Operators to provide comprehensive solutions over the course of an application’s lifecycle. Here is a list of our operators, which we have already open sourced: Vault Operator Prometheus JMX Exporter Operator

Read more...

At Banzai Cloud we are building a feature rich enterprise-grade application platform, built for containers on top of Kubernetes, called Pipeline. Security is one of our main areas of focus, and we strive to automate and enable those security patterns we consider essential, including tier zero features for all enterprises using the Pipeline Platform. We’ve blogged about how to handle security scenarios in several of our previous posts. This time we’d like to focus on a different aspect of securing Kubernetes deployments:

Read more...

We are excited to announce that Banzai Cloud has joined the Cloud Native Computing Foundation! The CNCF and The Linux Foundation are expending extraordinary effort in helping to standardize open source technologies that enable the development, deployment, management and operation of next generation Cloud Native software stacks. Our mission is to bring Cloud Native to enterprises and with this announcement, we strive to help push container and cloud native technology standardization and interoperability forward.

Read more...

At Banzai Cloud we are building a feature-rich enterprise-grade application platform, built for containers on top of Kubernetes, called Pipeline. For an enterprise-grade application platform security is a must and it has many building blocks. Please read through the Security series on our blog to learn how we deal with a variety of security-related issues. Security series: Authentication and authorization of Pipeline users with OAuth2 and Vault Dynamic credentials with Vault using Kubernetes Service Accounts Dynamic SSH with Vault and Pipeline Secure Kubernetes Deployments with Vault and Pipeline Policy enforcement on K8s with Pipeline The Vault swiss-army knife The Banzai Cloud Vault Operator Vault unseal flow with KMS

Read more...

At Banzai Cloud we are building a feature rich enterprise-grade application platform, built for containers on top of Kubernetes, called Pipeline. With Pipeline we provision large, multi-tenant Kubernetes clusters on all major cloud providers such as AWS, GCP, Azure, Oracle, Alibaba and BYOC, on-premise and hybrid, and deploy all kinds of predefined or ad-hoc workloads to these clusters. For us and our enterprise users, Kubernetes secret management (base 64) was not sufficient, so we chose Vault and added Kubernetes support to manage our secrets.

Read more...
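
To make the "base 64" remark in the teaser above concrete: the values in a Kubernetes Secret's `data` field are Base64-encoded, and Base64 is an encoding, not encryption. The following is a minimal sketch in plain Python; the sample value and the helper names `encode_secret_value` and `decode_secret_value` are hypothetical, chosen only for illustration.

```python
import base64


def encode_secret_value(plaintext: bytes) -> str:
    """Encode a value the way it appears in a Secret manifest's data field."""
    return base64.b64encode(plaintext).decode("ascii")


def decode_secret_value(encoded: str) -> bytes:
    """Reverse the encoding; note that no key or password is required."""
    return base64.b64decode(encoded)


# Hypothetical sample value, for illustration only.
encoded = encode_secret_value(b"s3cr3t-password")
decoded = decode_secret_value(encoded)
print(encoded)
print(decoded)
```

Anyone who can read the encoded value can recover the original, which is one reason a dedicated secret store such as Vault is attractive.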

If you followed our blog series on Autoscaling on Kubernetes, you should already be familiar with Kubernetes’ Cluster autoscaler and the Vertical Pod Autoscaler used with Java 10 applications. This post will show you how to use the Horizontal Pod Autoscaler to autoscale your deployments based on custom metrics obtained from Prometheus. As a deployment example we’ve chosen our JEE Petstore example application on Wildfly to show that, beside metrics like cpu and memory, which are provided by default on Kubernetes, using our Wildfly Operator, all Java and Java Enterprise Edition / Wildfly specific metrics are automatically placed at your fingertips, available in Prometheus, allowing you to easily autoscale deployments.

Read more...

At Banzai Cloud we are building a feature-rich enterprise-grade application platform, built for containers on top of Kubernetes, called Pipeline. With Pipeline we provision large, multi-tenant Kubernetes clusters on all major cloud providers such as AWS, GCP, Azure and BYOC, on-premise and hybrid, and deploy all kinds of predefined or ad-hoc workloads to these clusters. For us and our enterprise users, Kubernetes secret management (Base64) was woefully inadequate, so we chose Vault with native Kubernetes support to manage our secrets.

Read more...

At Banzai Cloud we provision all kinds of applications to Kubernetes and we try to autoscale these clusters with Pipeline and/or properly size application resources as needed. As promised in an earlier blog post, How to correctly size containers for Java 10 applications, we’ll share our findings on the Vertical Pod Autoscaler (VPA) used with Java 10. VPA sets resource requests on pod containers automatically, based on historical usage, thus ensuring that pods are scheduled onto nodes where appropriate resource amounts are available for each pod.

Read more...

One of our goals with Pipeline is to support Java and Java Enterprise Edition deployments, allowing developers to iterate fast while building and deploying safe, and also pushing code to production. In order to do that, we place a lot of importance on different aspects of a Java/JEE application’s lifecycle - we allow engineers: to continuously integrate and deploy their Java apps to Kubernetes; to deploy Java Enterprise Edition applications to Kubernetes; once the Java containers are deployed to K8s, to avoid OOMKills; to correctly size Java containers; and, once deployments are done and sized, to monitor them without any code modification. Enter Infinispan - a distributed cache and data grid.

Read more...

One of our goals at Banzai Cloud is to eliminate the concept of nodes, insofar as that is possible, so that users will only be aware of their applications and respective resource needs (cpu, gpu, memory, network, etc). Launching Telescopes was a first step in that direction - helping end users to select the right instance types for the job, through Telescopes infrastructure recommendations, then turning those recommendations into actual infrastructure with Pipeline.

Read more...

At Banzai Cloud we use Kafka internally a lot. We have some internal systems and customer reporting deployments where we rely heavily on Kafka deployed to Kubernetes. We practice what we preach and all these deployments (not just the external ones) are done using our application platform, Pipeline. There is one difference between regular Kafka deployments and ours (though it is not relevant to this post): we have removed Zookeeper and use etcd instead.

Read more...

At Banzai Cloud we’re always open to experimenting with and integrating new software (tools, products). We also love to validate our new ideas by quickly implementing “proof of concept” projects. Even though we used five or so programming languages while building the Pipeline Platform, we love and use Golang the most. While these PoC projects are not intended for production use, they often serve as the basis for it. When this is the case, the PoC code needs to be refactored - or prepared for production.

Read more...

At Banzai Cloud we place a lot of emphasis on the observability of applications deployed to the Pipeline Platform, which we built on top of Kubernetes. To this end, one of the key components we use is Prometheus. Just to recap, Prometheus is: an open source systems monitoring and alerting tool; a powerful query language (PromQL); a pull-based metrics gathering system; a simple text format for metrics exposition. Problem statement: usually, legacy applications are not exactly prepared for these last two, so we need a solution that bridges the gap between Prometheus and systems that do not speak the Prometheus metrics format: enter exporters.
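To give a feel for the "simple text format" mentioned above, here is a minimal sketch of what an exporter ultimately has to produce: HELP and TYPE comment lines followed by a sample line with optional labels. The function names and the `legacy_queue_depth` metric are hypothetical; a real exporter would normally rely on an official Prometheus client library rather than hand-rolled formatting.

```python
def escape_label_value(value: str) -> str:
    """Escape backslash, double quote and newline inside a label value."""
    return value.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")


def render_metric(name, value, labels=None, help_text=None, metric_type="gauge"):
    """Render one metric sample in the Prometheus text exposition format."""
    lines = []
    if help_text:
        lines.append(f"# HELP {name} {help_text}")
    lines.append(f"# TYPE {name} {metric_type}")
    if labels:
        # Sort labels so the output is deterministic.
        rendered = ",".join(
            f'{key}="{escape_label_value(str(val))}"'
            for key, val in sorted(labels.items())
        )
        lines.append(f"{name}{{{rendered}}} {value}")
    else:
        lines.append(f"{name} {value}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical value read from a legacy system that has no metrics endpoint.
    print(render_metric("legacy_queue_depth", 42,
                        labels={"queue": "orders"},
                        help_text="Messages waiting in the legacy queue"),
          end="")
```

Serving text like this over HTTP is what lets the pull-based Prometheus server scrape a system that was never built with it in mind.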

Read more...
May 24 2018

kurun

During the development of the Pipeline Platform all of its key building blocks such as Pipeline, Hollowtrees and Bank-Vaults have relied on making extensive Kubernetes API calls. Often, we tried a quick K8s API call or ran a small PoC inside a cluster, while also wanting to avoid the usual deployment process. We quickly realized that we needed a shortcut. There are tools like telepresence that support slightly more complex scenarios.

Read more...

Creating Kubernetes clusters in the cloud and deploying (or CI/CDing) applications to those clusters is not always simple. There are a few conventional options, but they are either cloud or distribution specific. While we were working on our open source Pipeline Platform, we needed a solution which covered (here follows an inclusive but not exhaustive list of requirements): provisioning of Kubernetes clusters on all major cloud providers (via REST, UI and CLI) using a unified interface; application lifecycle management (on-demand deploy, CI/CD, dependency management, etc.), preferably over a REST interface; support for multi-tenancy and advanced security scenarios (app to app security with dynamic secrets, standards, multi-auth backends, and more); the ability to build cross-cloud or hybrid Kubernetes environments. This post highlights the ease of creating Kubernetes clusters using the Pipeline API on the following providers:

Read more...

At Banzai Cloud we continue to work hard on the Pipeline platform we’re building on Kubernetes. We’ve open sourced quite a few operators already, and even recently teamed up with Red Hat and CoreOS to begin work on Kubernetes Operators using the new Operator SDK, and to help move human operational knowledge into code. The purpose of this blog will be to take a deep dive into the PVC Operator.

Read more...

At Banzai Cloud we’re building a feature rich platform, Pipeline, on top of Kubernetes. With Pipeline we provision large, multi-tenant Kubernetes clusters on all major cloud providers - AWS, GCP, Azure and BYOC - and deploy all kinds of predefined or ad-hoc workloads to these clusters. We wanted to set the industry standard for the way in which our users log in and interact with secure endpoints, and, at the same time, we wanted to provide dynamic secret management for each application we support.

Read more...

At Banzai Cloud we run and deploy containerized applications to our PaaS, Pipeline. Java or JVM-based workloads are among the notable workloads deployed to Pipeline, so getting them right is pretty important for us and our users. Java/JVM based workloads on Kubernetes with Pipeline Why my Java application is OOMKilled Deploying Java Enterprise Edition applications to Kubernetes A complete guide to Kubernetes Operator SDK Spark, Zeppelin, Kafka on Kubernetes

Read more...

A good number of years ago, back at the beginning of this century, most of us here at Banzai Cloud were in the Java Enterprise business, building application servers (BEA Weblogic and JBoss) and JEE applications. Those days are gone; the technology stack and landscape have dramatically changed; monolithic applications are out of fashion, but we still have lots of them running in production. Because of our background, we have a personal investment in helping to shift Java Enterprise Edition business applications towards microservices, managed deployments, Kubernetes, and the cloud using Pipeline.

Read more...

The Pipeline platform contains a complete CI/CD component to support developers building, deploying and operating applications in an automated way on Kubernetes. Most of our documentation, blog posts and how-tos have focused on Spark, Zeppelin and Tensorflow examples. However, it is possible to build and deploy any application with Pipeline’s CI/CD component. Our last post about the Banzai Cloud CI/CD flow described how to build/deploy a Spring Boot application on Kubernetes.

Read more...

Cloud cost management series: Overspending in the cloud Managing spot instance clusters on Kubernetes with Hollowtrees Monitor AWS spot instance terminations Diversifying AWS auto-scaling groups Draining Kubernetes nodes A few months ago we posted on this blog about overspending in the cloud. We discussed how difficult it is to keep track of the vast array of instance types and pricing options offered by cloud providers, especially on AWS with spot pricing.

Read more...

At Banzai Cloud we’re always looking for new and innovative technologies that support our users in their transition to microservices deployed on Kubernetes, using Pipeline. In recent months we’ve partnered with CoreOS and Red Hat to work on Kubernetes operators. That project was open sourced today, and is now available on GitHub. If you read through the rest of this blog, you’ll learn what an operator is, and how to use the Operator SDK to develop an operator through a concrete example we developed here at Banzai Cloud.

Read more...

We are excited to announce that Banzai Cloud is now a Kubernetes Certified Service Provider (KCSP). The KCSP program was started by the Cloud Native Computing Foundation in collaboration with the Linux Foundation and represents a milestone in the wide-spread adoption of a cloud native platform. It provides a strict set of rules and a battery of certified experts that guarantee only experienced partners be admitted to the program. This fosters trust, so enterprises can rely on Banzai Cloud and our flagship PaaS, Pipeline, to bring to bear the experience necessary to guide them on their Kubernetes and microservices journey to cloud native application platforms and production usage.

Read more...

Bank Vaults is a thick, tricky, shifty right with a fast and intense tube for experienced surfers only, located on Mentawai. Think heavy steel doors, secret unlocking combinations and burly guards with smack-down attitudes. Watch out for clean-up sets. Bank Vaults is a wrapper for the official Vault client with automatic token renewal, built in Kubernetes support, dynamic database credential management, multiple unseal options, automatic re/configuration and more.

Read more...

At Banzai Cloud we push different types of workloads to Kubernetes with our open source PaaS, Pipeline. There are lots of deployments we support for which we have defined Helm charts; however, Pipeline is able to deploy applications from any repository. These deployments are pushed on-prem or in the cloud, but many of them share one common feature: the need for persistent volumes. Kubernetes provides abundant options in this regard, and each cloud provider also offers custom/additional alternatives.

Read more...

In this blog we’ll continue our series about Kubernetes logging, and cover some advanced techniques and visualizations pertaining to collected logs. Just to recap, with our open source PaaS, Pipeline, we monitor and collect/move a large number of the logs for the distributed applications we push to Kubernetes. We are expending a lot of effort to monitor large and federated clusters, and to automate these with Pipeline, so that our users receive out of the box monitoring and log collection for free.

Read more...

In the past few weeks we’ve been blogging about the advanced, enterprise-grade security features we are building into our open source PaaS, Pipeline. If you’d like to review these features, please read this series: Security series: Authentication and authorization of Pipeline users with OAuth2 and Vault Dynamic credentials with Vault using Kubernetes Service Accounts Dynamic SSH with Vault and Pipeline Secure Kubernetes Deployments with Vault and Pipeline Policy enforcement on K8s with Pipeline The Vault swiss-army knife The Banzai Cloud Vault Operator Vault unseal flow with KMS Kubernetes secret management with Pipeline Container vulnerability scans with Pipeline Kubernetes API proxy with Pipeline

Read more...

Apache Spark on Kubernetes series: Introduction to Spark on Kubernetes Scaling Spark made simple on Kubernetes The anatomy of Spark applications on Kubernetes Monitoring Apache Spark with Prometheus Apache Spark CI/CD workflow howto Spark History Server on Kubernetes Spark scheduling on Kubernetes demystified Spark Streaming Checkpointing on Kubernetes Deep dive into monitoring Spark and Zeppelin with Prometheus Apache Spark application resilience on Kubernetes

Read more...

Apache Spark on Kubernetes series: Introduction to Spark on Kubernetes Scaling Spark made simple on Kubernetes The anatomy of Spark applications on Kubernetes Monitoring Apache Spark with Prometheus Apache Spark CI/CD workflow howto Spark History Server on Kubernetes Spark scheduling on Kubernetes demystified Spark Streaming Checkpointing on Kubernetes Deep dive into monitoring Spark and Zeppelin with Prometheus Apache Spark application resilience on Kubernetes Collecting Spark History Server event logs in the cloud

Read more...

As we alluded to in the last post in this series, we’ll be continuing our discussion of centralized and secure Kubernetes logging/log collection. Log messages can contain sensitive information, so it’s important to secure transport between distributed parts of the log flow. This post will describe how we’ve secured moving log messages on our Kubernetes clusters provisioned by Pipeline. Logging series: Centralized logging under Kubernetes Secure logging on Kubernetes with Fluentd and Fluent Bit

Read more...

During the development of our open source Pipeline PaaS, we introduced some handy features to help deal with deployments. We deploy most of our applications as Helm releases, so we needed a way to interact programmatically (using gRPC) and to use a UI (RESTful API) with Helm. In order to do that with Pipeline, we introduced a very useful feature that manages Helm repositories and deploys applications with Helm to Kubernetes, using RESTful API calls.

Read more...

Cloud cost management series: Overspending in the cloud Managing spot instance clusters on Kubernetes with Hollowtrees Monitor AWS spot instance terminations Diversifying AWS auto-scaling groups Draining Kubernetes nodes Cluster recommender Cloud instance type and price information as a service Kubernetes was designed in such a way as to be fault tolerant of worker node failures. If a node goes missing because of a hardware problem, a cloud infrastructure problem, or if Kubernetes simply ceases to receive heartbeat messages from a node for any reason, the Kubernetes control plane is clever enough to handle it.

Read more...

The Pipeline platform contains a complete CI/CD component to support developers building, deploying and operating applications in an automated way, deployed to Kubernetes. Most of our documentation, blog posts and howtos have so far focused on Spark, Zeppelin and Tensorflow examples. However, we can actually build and deploy any application with Pipeline’s CI/CD component. This post showcases how to enable a simple Spring Boot application for the Banzai Cloud CI/CD flow, build and save the necessary artifacts, and deploy it to a Kubernetes cluster.

Read more...

For our Pipeline PaaS, monitoring is an essential part of operating distributed applications in production. We put a great deal of effort into monitoring large and federated clusters and automating these with Pipeline, so all our users receive out of the box monitoring for free. You can read about our monitoring series, below: Monitoring series: Monitoring Apache Spark with Prometheus Monitoring multiple federated clusters with Prometheus - the secure way Application monitoring with Prometheus and Pipeline Building a cloud cost management system on top of Prometheus Monitoring Spark with Prometheus, reloaded

Read more...

For some time we’ve been evangelizing the idea that the runtime fabric of big data workloads should be Kubernetes. In this post I’d like to walk through the thought process behind that change and discuss its benefits. Obviously, this is a pretty large topic, and this post has no intention of covering it completely - also, it reflects the views and opinions that we at Banzai Cloud believe in and push others to adopt.

Read more...

The adoption of serverless technologies is quickly progressing. According to this survey, it’s on par with the adoption of containers. And, even though ‘serverless’ is a very vague term (it can be argued that it’s rarely used in production, especially in complex applications), it seems set to be one of the most dominant trends in the near future in the cloud computing space. While, once, serverless referred specifically to early stage AWS Lambda, the category has matured rapidly.

Read more...

Banzai Pipeline, or simply “Pipeline” is a tabletop reef break located in Hawaii, on Oahu’s North Shore. It is the most famous and infamous reef on the planet, and serves as the benchmark by which all other surf breaks are measured. Pipeline is a PaaS with a built in CI/CD engine to deploy cloud native microservices to a public cloud or on-premise. It simplifies and abstracts all the details of provisioning cloud infrastructure, installing or reusing a Kubernetes cluster, and deploying an application.

Read more...

As part of the Debug 101 series, we’re back hunting a small but annoying bug. This kind of bug is not really a bug, but a side effect of several tools working together. Here comes trouble I deploy a development version of Pipeline on a Kubernetes cluster running on top of AWS infrastructure. For this deployment I use the following Helm chart command. $: helm install --name pipeline banzaicloud-stable/pipeline-cp \ --set=drone.

Read more...

Kubeless was designed to be a Kubernetes-native serverless framework, and, for PubSub functions, uses Apache Kafka behind the scenes. At Banzai Cloud we like cloud-native technologies; however, we weren’t happy about having to operate a Zookeeper cluster on Kubernetes, so we modified and open-sourced a version of Kafka in which we replaced Zookeeper with etcd, which was (and still is) a better fit. This post is part of our serverless series, which discusses deploying Kubeless, using Kafka on etcd with Pipeline, and deploying a so-called PubSub function.

Read more...

Monitoring series: Monitoring Apache Spark with Prometheus Monitoring multiple federated clusters with Prometheus - the secure way Application monitoring with Prometheus and Pipeline Building a cloud cost management system on top of Prometheus Monitoring Spark with Prometheus, reloaded At Banzai Cloud we deploy large distributed applications to Kubernetes clusters that we also operate. We don’t enjoy waking up to PagerDuty notifications in the middle of the night, so we try to get ahead of problems by operating these clusters as efficiently as possible.

Read more...

Monitoring series: Monitoring Apache Spark with Prometheus Monitoring multiple federated clusters with Prometheus - the secure way Application monitoring with Prometheus and Pipeline Building a cloud cost management system on top of Prometheus Monitoring Spark with Prometheus, reloaded Kafka on Kubernetes the easy way At Banzai Cloud we provision and monitor large Kubernetes clusters deployed to multiple cloud/hybrid environments, using Prometheus. The clusters, applications or frameworks are all managed by our next generation PaaS, Pipeline.

Read more...

At Banzai Cloud we’re always looking for products or frameworks that add value to our business, which we can enable in our open source PaaS, Pipeline. Any list of such products would include serverless frameworks. Thus, today we’re adding Fn as a supported spotguide, making it easy for users to deploy Fn with Pipeline on their chosen cloud provider. Before we dive into how to deploy and use Fn with Pipeline, here are a few reasons why we thought Fn should be supported by Pipeline:

Read more...

In our last entry in the distributed TensorFlow series, we used a research example for distributed training of an Inception model. In this post we’ll showcase how to do the same thing on GPU instances, this time on Azure managed Kubernetes - AKS deployed with Pipeline. As you may remember from our previous post, the first thing to consider when running distributed TensorFlow models is whether you have shared storage space available.

Read more...

At Banzai Cloud we secure our Kubernetes services using Vault and OAuth2 tokens. This has not always been the case, though we’ve had authentication in our project (even though it was basic) from a very early PoC stage - and we suggest that you do the same. Usually, inbound connections to Kubernetes cluster services are accessed via Ingress. Just to recap, public services are typically accessed through a loadbalancer service.

Read more...

Monitoring series: Monitoring Apache Spark with Prometheus Monitoring multiple federated clusters with Prometheus - the secure way Application monitoring with Prometheus and Pipeline Building a cloud cost management system on top of Prometheus Monitoring Spark with Prometheus, reloaded At Banzai Cloud we provision and monitor large Kubernetes clusters deployed to multiple cloud/hybrid environments. These clusters and applications or frameworks are all managed by our next generation PaaS, Pipeline.

Read more...

Apache Spark on Kubernetes series: Introduction to Spark on Kubernetes Scaling Spark made simple on Kubernetes The anatomy of Spark applications on Kubernetes Monitoring Apache Spark with Prometheus Apache Spark CI/CD workflow howto Spark History Server on Kubernetes Spark scheduling on Kubernetes demystified Spark Streaming Checkpointing on Kubernetes Deep dive into monitoring Spark and Zeppelin with Prometheus Apache Spark application resilience on Kubernetes Apache Zeppelin on Kubernetes series: Running Zeppelin Spark notebooks on Kubernetes Running Zeppelin Spark notebooks on Kubernetes - deep dive CI/CD flow for Zeppelin notebooks

Read more...

At Banzai Cloud we provision different frameworks and tools like Spark, Zeppelin, Kafka, Tensorflow, etc. to our Pipeline PaaS (built on Kubernetes). Last week we added serverless capabilities to Pipeline, using OpenFaaS. This blog post explains how to deploy OpenFaaS to Kubernetes using Pipeline and invoke an example function. We’ll distinguish between the provisioning of the serverless frameworks we support (this post is about OpenFaaS but Pipeline also supports Kubeless), and the invocation of functions through the Pipeline API or CI/CD workflow once it’s dispatched to any of the serverless frameworks we deploy to Kubernetes.

Read more...

Apache Spark on Kubernetes series: Introduction to Spark on Kubernetes Scaling Spark made simple on Kubernetes The anatomy of Spark applications on Kubernetes Monitoring Apache Spark with Prometheus Apache Spark CI/CD workflow howto Spark History Server on Kubernetes Spark scheduling on Kubernetes demystified Spark Streaming Checkpointing on Kubernetes Deep dive into monitoring Spark and Zeppelin with Prometheus Apache Spark application resilience on Kubernetes Apache Zeppelin on Kubernetes series: Running Zeppelin Spark notebooks on Kubernetes Running Zeppelin Spark notebooks on Kubernetes - deep dive CI/CD flow for Zeppelin notebooks

Read more...

Cloud cost management series: Overspending in the cloud Managing spot instance clusters on Kubernetes with Hollowtrees Monitor AWS spot instance terminations Diversifying AWS auto-scaling groups Draining Kubernetes nodes Cluster recommender Cloud instance type and price information as a service You may remember the Hollowtrees project we open sourced a few weeks ago: a framework for the management of AWS spot instance clusters, batteries included: Hollowtrees, an alert-react based framework that’s part of the Pipeline PaaS, which coordinates monitoring, applies rules and dispatches action chains to plugins using standard CNCF interfaces AWS spot instance termination Prometheus exporter AWS autoscaling group Prometheus exporter AWS Spot Instance recommender Kubernetes action plugin to execute k8s operations (e.

Read more...

At Banzai Cloud we are building a cloud agnostic, open source next generation CloudFoundry/Heroku-like PaaS, Pipeline, while running several big data workloads natively on Kubernetes. Apache Kafka is one of the cloud native workloads we support out-of-the-box, alongside Apache Spark and Apache Zeppelin. If you’re interested in running big data workloads on Kubernetes, please read the following blog series as well. Apache Kafka on Kubernetes series: Kafka on Kubernetes - using etcd Monitoring Apache Kafka with Prometheus Kafka on Kubernetes with Local Persistent Volumes Kafka on Kubernetes the easy way

Read more...

At Banzai Cloud we run multiple Kubernetes clusters deployed with our next generation PaaS, Pipeline, and we deploy these clusters across different cloud providers like AWS, Azure and Google, or on-premise. These clusters are typically launched via the same control plane deployed either to AWS, as a CloudFormation template, or Azure, as an ARM template. And, since we practice what we preach, they run inside Kubernetes as well. One of the added values to deployments via Pipeline is out-of-the-box monitoring and dashboards through default spotguides for the applications we also support out-of-the-box.

Read more...

Cloud cost management series: Overspending in the cloud Managing spot instance clusters on Kubernetes with Hollowtrees Monitor AWS spot instance terminations Diversifying AWS auto-scaling groups Draining Kubernetes nodes Cluster recommender Cloud instance type and price information as a service Last week we open sourced the Hollowtrees project, a framework that manages AWS spot instance clusters - batteries included: Hollowtrees - an alert-react based framework that’s part of the Pipeline PaaS, which coordinates monitoring, applies rules and dispatches action chains to plugins using standard CNCF interfaces AWS spot instance termination Prometheus exporter AWS autoscaling group Prometheus exporter AWS Spot Instance recommender Kubernetes action plugin to execute k8s operations (e.

Read more...

Apache Spark on Kubernetes series: Introduction to Spark on Kubernetes Scaling Spark made simple on Kubernetes The anatomy of Spark applications on Kubernetes Monitoring Apache Spark with Prometheus Apache Spark CI/CD workflow howto Spark History Server on Kubernetes Spark scheduling on Kubernetes demystified Spark Streaming Checkpointing on Kubernetes Deep dive into monitoring Spark and Zeppelin with Prometheus Apache Spark application resilience on Kubernetes Apache Zeppelin on Kubernetes series: Running Zeppelin Spark notebooks on Kubernetes Running Zeppelin Spark notebooks on Kubernetes - deep dive CI/CD flow for Zeppelin notebooks

Read more...

Banzai Pipeline, or simply “Pipeline” is a tabletop reef break located in Hawaii, on Oahu’s North Shore. It is the most famous and infamous reef on the planet, and serves as the benchmark by which all other surf breaks are measured. Pipeline is a PaaS with a built in CI/CD engine to deploy cloud native microservices to a public cloud or on-premise. It simplifies and abstracts all the details of provisioning cloud infrastructure, installing or reusing a Kubernetes cluster, and deploying an application.

Read more...

Hollowtrees is a wave of highest pedigree, the pin-up centerfold of the Mentawai islands’ surf break which brings new machine-like connotations to the word perfection. Watch out for the aptly named ‘Surgeon’s Table’, a brutal reef famous for taking bits and pieces of Hollowtrees’ surfers as trophies. Hollowtrees, a ruleset-based watch-guard, keeps spot instance-based clusters safe and allows for them to be used in production. It handles spot price surges within a given region or availability zone and reschedules applications before instances are taken down.

Read more...

Apache Spark on Kubernetes series: Introduction to Spark on Kubernetes Scaling Spark made simple on Kubernetes The anatomy of Spark applications on Kubernetes Monitoring Apache Spark with Prometheus Apache Spark CI/CD workflow howto Spark History Server on Kubernetes Spark scheduling on Kubernetes demystified Spark Streaming Checkpointing on Kubernetes Deep dive into monitoring Spark and Zeppelin with Prometheus Apache Spark application resilience on Kubernetes Apache Zeppelin on Kubernetes series: Running Zeppelin Spark notebooks on Kubernetes Running Zeppelin Spark notebooks on Kubernetes - deep dive CI/CD flow for Zeppelin notebooks

Read more...

We are moving relatively quickly, implementing new Pipeline features and releases, with our second major release scheduled for this week. Among other new features we’ve already added a new managed Kubernetes provider, Microsoft’s Azure AKS. Azure Container Service (AKS) is a preview feature of the Azure Cloud - and we’re proud to be among its earliest adopters. We can provision and deploy apps to Kubernetes on Azure VMs the same way we do on EC2; however, at Banzai Cloud we strongly believe that the future is in managed Kubernetes services; most of our investment regarding cloud neutrality and provisioning is built on managed Kubernetes services both in the cloud (GKE, OCI and ACS in beta, or under development) and on-prem.

Read more...

Last time we discussed how our Pipeline PaaS deploys and provisions an AWS EFS filesystem on Kubernetes and what the performance benefits are for Spark or TensorFlow. This post gives: an introduction to TensorFlow on Kubernetes; the benefits of EFS for TensorFlow (image data storage for TensorFlow jobs). Pipeline uses the Kubeflow framework to deploy: a JupyterHub to create & manage interactive Jupyter notebooks; a TensorFlow Training Controller that can be configured to use CPUs or GPUs; a TensorFlow Serving container. Note that Pipeline also has default Spotguides for Spark and Zeppelin to help support your data science experience.

Read more...

At Banzai Cloud we provision different frameworks and tools like Spark, Zeppelin and, most recently, Tensorflow, all of which run on our Pipeline PaaS (built on Kubernetes). One of Pipeline’s early adopters runs a Tensorflow Training Controller using GPUs on AWS EC2, wired into our CI/CD pipeline, which needs significant parallelization for reading training data. We’ve introduced support for Amazon Elastic File System and made it publicly available in the forthcoming release of Pipeline.

Read more...

At Banzai Cloud we provision different applications or frameworks to Pipeline, the PaaS we built on Kubernetes. We practice what we preach, and our PaaS’ control plane also runs on Kubernetes and requires a layer of data storage. It was therefore necessary that we explore two different use cases: how to deploy and to run a distributed, scalable and fully SQL compliant DB to cover our clients’ and our own internal needs.

Read more...

At Banzai Cloud we run and deploy containerized applications to Pipeline, our PaaS. Those of you who (like us) run Java applications inside Docker have probably already come across the problem of JVMs inaccurately detecting available memory when running inside a container. Instead of accurately detecting the memory available in a Docker container, JVMs see the available memory of the machine. This can lead to cases wherein applications that run inside containers are killed whenever they try to use an amount of memory that exceeds the limits of the Docker container.
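The gap between "memory of the machine" and "memory of the container" can be sketched as follows. On Linux, a container's limit is exposed through cgroup files (`memory.max` under cgroup v2, `memory/memory.limit_in_bytes` under cgroup v1), and a container-aware runtime should size itself against the smaller of the two numbers. This is only an illustration of the idea, not how any particular JVM implements it; the function names and the threshold constant are our own assumptions.

```python
from typing import Optional

# Assumption: cgroup v1 reports a very large number when no limit is set,
# so anything at or above this threshold is treated as "unlimited".
UNLIMITED_THRESHOLD = 1 << 60


def parse_cgroup_limit(raw: str) -> Optional[int]:
    """Parse the content of a cgroup memory limit file.

    Returns None when no limit is configured ("max" in cgroup v2,
    a huge sentinel value in cgroup v1).
    """
    text = raw.strip()
    if text == "max":
        return None
    value = int(text)
    return None if value >= UNLIMITED_THRESHOLD else value


def effective_memory(host_bytes: int, cgroup_limit: Optional[int]) -> int:
    """Memory a process should plan for: the container limit if one exists."""
    if cgroup_limit is None:
        return host_bytes
    return min(host_bytes, cgroup_limit)


if __name__ == "__main__":
    host = 16 * 1024 ** 3                       # hypothetical 16 GiB machine
    limit = parse_cgroup_limit("536870912\n")   # hypothetical 512 MiB container limit
    print(effective_memory(host, limit))
```

A process that sizes its heap from the host figure instead of the effective one can easily exceed the container limit, which is exactly the situation that ends in the container being killed.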

Read more...

Apache Spark on Kubernetes series: Introduction to Spark on Kubernetes Scaling Spark made simple on Kubernetes The anatomy of Spark applications on Kubernetes Monitoring Apache Spark with Prometheus Apache Spark CI/CD workflow howto Spark History Server on Kubernetes Spark scheduling on Kubernetes demystified Spark Streaming Checkpointing on Kubernetes Deep dive into monitoring Spark and Zeppelin with Prometheus Apache Spark application resilience on Kubernetes Apache Zeppelin on Kubernetes series: Running Zeppelin Spark notebooks on Kubernetes Running Zeppelin Spark notebooks on Kubernetes - deep dive CI/CD flow for Zeppelin notebooks

Read more...

Modern applications and services usually expose their functions via REST; moreover, modules and components also make use of external services that are exposed as REST. Thus, developers often need to design RESTful services and write REST service clients. It’s a given in this kind of work that these services will be called thousands of times during the development process (developers need to understand the API, as well as the messages and the resources involved), and even after, to make sure everything works as desired.

Read more...

As 2017 comes to an end, we’re looking back at the three blog posts that were most popular with our readers. We can’t go too far back (though we’ve had 13 posts and one release already), since we founded our startup just a little over one month ago (on November 20, 2017, to be precise), but during this short period we’ve achieved a whole lot, and laid the foundation for some exciting new projects we plan to ship out early next year.

Read more...

Last week we released the first version of Pipeline - a PaaS with end to end support for cloud native apps, from GitHub commit hooks that deploy to the cloud in minutes to a fully customizable CI/CD workflow. At the core of the Pipeline PaaS are its spotguides - a collection of workflow/pipeline steps defined in a .pipeline.yml file and a few Drone plugins. In this post we’d like to demystify spotguides and describe, step by step, how they work; the next post will be a tutorial on how to write a custom spotguide and its associated plugin.

Read more...

Apache Spark on Kubernetes series: Introduction to Spark on Kubernetes; Scaling Spark made simple on Kubernetes; The anatomy of Spark applications on Kubernetes; Monitoring Apache Spark with Prometheus; Apache Spark CI/CD workflow howto; Spark History Server on Kubernetes; Spark scheduling on Kubernetes demystified. Apache Zeppelin on Kubernetes series: Running Zeppelin Spark notebooks on Kubernetes; Running Zeppelin Spark notebooks on Kubernetes - deep dive; CI/CD flow for Zeppelin notebooks.

Read more...

Banzai Pipeline, or simply Pipeline, is a tabletop reef break located in Hawaii, on Oahu’s North Shore. It is the most famous and infamous reef on the planet, and serves as the benchmark by which all other waves are measured. Pipeline is a PaaS with a built in CI/CD engine to deploy cloud native microservices to a public cloud or on-premise. It simplifies and abstracts all the details of provisioning cloud infrastructure, installing or reusing a Kubernetes cluster and deploying an application.

Read more...

Cloud cost management series: Overspending in the cloud; Managing spot instance clusters on Kubernetes with Hollowtrees; Monitor AWS spot instance terminations; Diversifying AWS auto-scaling groups; Draining Kubernetes nodes; Cluster recommender; Cloud instance type and price information as a service.

One of the primary advantages always discussed in the context of deciding whether to move a deployment to the cloud is cost. There are no upfront costs in moving to the cloud because you don’t have to buy any hardware, and you only pay for what you really use because you can scale your infrastructure according to your workloads.

Read more...

Debug 101: Today we’re starting a new series called Debug 101, which deals with those issues that gave us particularly bad headaches and took a large amount of time to debug, understand and fix. We believe strongly in open source software and open issue resolution, and we try to describe our problems and suggest fixes as we go, so you don’t have to shave that yak. We already have, and the yak looks awesome.

Read more...

At Banzai Cloud we use different cloud providers or managed Kubernetes offerings, one of which is Microsoft Azure Managed Kubernetes. It’s a pretty solid service that allows you to deploy a managed k8s cluster without requiring you to deal with low level Kubernetes building blocks, tooling, or cloud infrastructure provisioning. However, there is one temporary issue that affects a cornerstone of our PaaS, Pipeline: the Azure Go SDK does not yet contain the bindings for this new service.

Read more...