Banzai Cloud has closed its latest round of seed funding with a total investment of $2.5 million. The round was led by PortfoLion, a Central European venture capital and private equity fund, and included financing from FastVentures and Euroventures of Budapest, the latter being an angel investor that has been with the company since its foundation in 2017. “We’re very excited by this opportunity, by the exceptional team Banzai Cloud has brought together that will make them a force to be reckoned with in the world of enterprise computing.

Read more...

Banzai Cloud is on a mission to simplify the development, deployment, and scaling of complex applications and to bring the full power of Kubernetes to all developers and enterprises. Banzai Cloud’s Pipeline provides a platform which allows enterprises to develop, deploy and scale container-based applications. It leverages best-of-breed technology from the Cloud Native Computing Foundation (CNCF) ecosystem to create a highly productive, yet flexible environment for developers and operations teams alike. One of the key tools we use from the Kubernetes ecosystem is Helm.

Read more...

Banzai Cloud’s Pipeline provides a platform which allows enterprises to develop, deploy and scale container-based applications. It leverages best-of-breed cloud components, such as Kubernetes, to create a highly productive, yet flexible environment for developers and operations teams alike. Strong security measures—multiple authentication backends, fine-grained authorization, dynamic secret management, automated secure communications between components using TLS, vulnerability scans, static code analysis, etc.—are a tier zero feature of the Pipeline platform, which we strive to automate and enable for all enterprises.

Read more...

Business continuity is a key requirement for our enterprise users, and it becomes exponentially more important in a cloud native environment. The Banzai Cloud Platform operates in such an environment, deploying and managing large scale container-based applications. It leverages best-of-breed cloud components, such as Kubernetes, to create a highly productive, yet flexible environment, and provides a safety net to back up and restore cluster and application states during the lifecycle of the cluster.

Read more...

Banzai Cloud’s Pipeline platform is an operating system which allows enterprises to develop, deploy and scale container-based applications. It leverages best-of-breed cloud components, such as Kubernetes, to create a highly productive, yet flexible environment for developers and operations teams alike. Strong security - multiple authentication backends, fine-grained authorization, dynamic secret management, automated secure communications between components using TLS, vulnerability scans, static code analysis, etc. - is a tier zero feature of the Pipeline platform, which we strive to automate and enable for all enterprises.

Read more...

Satellite is a Golang library and RESTful API that determines a host’s cloud provider with a simple HTTP call. Behind the scenes, it uses file systems and provider metadata to properly identify cloud providers. When we started to work on Pipeline and the Banzai Cloud Pipeline Platform Operators, we soon realized how frequently we would need to find out which cloud provider the service was actually running on. Note that Pipeline supports 6 different cloud providers
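For context, the snippet below is a minimal sketch of the general idea behind such detection, not Satellite's actual API: it probes the well-known, publicly documented cloud metadata endpoints (EC2's link-local address, GCE's metadata server with its required header) and falls back to "unknown". Satellite also checks file system hints and supports more providers.

```go
package main

import (
	"fmt"
	"net/http"
	"time"
)

// detectProvider is a simplified, illustrative sketch of metadata-based detection;
// the endpoints are the publicly documented metadata services, not Satellite code.
func detectProvider() string {
	client := &http.Client{Timeout: 2 * time.Second}

	// AWS EC2: the instance metadata service answers on a link-local address.
	if resp, err := client.Get("http://169.254.169.254/latest/meta-data/"); err == nil {
		resp.Body.Close()
		if resp.StatusCode == http.StatusOK {
			return "amazon"
		}
	}

	// Google Compute Engine: the metadata server requires a special header.
	req, _ := http.NewRequest("GET", "http://metadata.google.internal/computeMetadata/v1/", nil)
	req.Header.Set("Metadata-Flavor", "Google")
	if resp, err := client.Do(req); err == nil {
		resp.Body.Close()
		if resp.StatusCode == http.StatusOK {
			return "google"
		}
	}

	return "unknown"
}

func main() {
	fmt.Println("detected provider:", detectProvider())
}
```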

Read more...

Monitoring series: Monitoring Apache Spark with Prometheus Monitoring multiple federated clusters with Prometheus - the secure way Application monitoring with Prometheus and Pipeline Building a cloud cost management system on top of Prometheus Monitoring Spark with Prometheus, reloaded Hands on Thanos Monitoring Vault on Kubernetes using Cloud Native technologies At Banzai Cloud we are building a feature rich enterprise-grade application platform, built for containers on top of Kubernetes, called Pipeline.

Read more...

The Banzai Cloud Productinfo service retrieves product and pricing information from cloud providers and exposes it through a RESTful API and UI. Our Kubernetes-based Pipeline platform and Telescopes recommendation engine make use of this information when they advise users on cluster layout and resourcing. Here’s a quick primer on how and why we utilize the Productinfo service: Pipeline platform users have the option of launching clusters or deploying applications based only on resource and SLA requirements (price, IO, memory, CPU, GPU, etc.

Read more...

One of the main advantages of the Pipeline platform is that it allows users to use their infrastructure cost effectively; Telescopes helps with cluster and machine instance recommendations, Hollowtrees enables SLA-aware cost reduction using spot instances, and autoscalers allow for multi-dimensional autoscaling based on custom metrics. This post will highlight some new features of the Banzai Cloud Horizontal Pod Autoscaler Kubernetes Operator and the advanced automation supported by Pipeline - a new, forward-thinking way to operate Kubernetes clusters and autoscale deployments.

Read more...

Enterprises often use multi-tenant and heterogenous clusters to deploy their applications to Kubernetes. These applications usually have needs which require special scheduling constraints. Pods may require nodes with special hardware, isolation, or colocation with other pods running in the system. The Pipeline platform allows users to express their constraints in terms of resources (CPU, memory, network, IO, etc.). These requirements are turned into infrastructure specifications using Telescopes. Once the cluster nodes are created and properly labeled by Pipeline, deployments are run with the specified constraints automatically on top of Kubernetes.
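As an illustration of the underlying Kubernetes mechanism (not Pipeline's own API), the sketch below builds a pod spec that pins a workload to nodes carrying a hypothetical nodetype=gpu label and declares explicit resource requests; the label name, image and quantities are made up for the example.

```go
package main

import (
	"encoding/json"
	"fmt"

	corev1 "k8s.io/api/core/v1"
	"k8s.io/apimachinery/pkg/api/resource"
)

func main() {
	// A pod spec that expresses both placement (node label selector) and
	// resource constraints; "nodetype" is a hypothetical label that a
	// platform like Pipeline could attach to provisioned nodes.
	spec := corev1.PodSpec{
		NodeSelector: map[string]string{"nodetype": "gpu"},
		Containers: []corev1.Container{{
			Name:  "trainer",
			Image: "example/trainer:latest",
			Resources: corev1.ResourceRequirements{
				Requests: corev1.ResourceList{
					corev1.ResourceCPU:    resource.MustParse("2"),
					corev1.ResourceMemory: resource.MustParse("4Gi"),
				},
			},
		}},
	}

	out, _ := json.MarshalIndent(spec, "", "  ")
	fmt.Println(string(out))
}
```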

Read more...

Banzai Cloud announced today that it is collaborating with Oracle to bring its feature-rich application platform to Oracle Kubernetes Engine users. Banzai Cloud’s Pipeline deployment automation and execution engine enables developers to go from commit to scale in minutes by automating all the underlying tasks that provide convenient CI/CD flows, robust security, analytics and the ability to scale. The technology not only provides significant productivity gains to developers, but also increases operational efficiencies by aiding instance selection and automated introspection of large-scale workloads.

Read more...

Last year Alibaba joined CNCF and announced plans to create their own Kubernetes service - Alibaba ACK. The service was launched more than a year ago, with its stated objective to make it easy to run Kubernetes on Alibaba Cloud without needing to install, operate, and maintain a Kubernetes control plane. At Banzai Cloud we are committed to providing support for Kubernetes on all major cloud providers, thus one of our priorities was to enable Alibaba Cloud’s Container Service for Kubernetes in Pipeline and take the DevOps experience to the next level by turning ACK into a feature-rich enterprise-grade application platform.

Read more...

At Banzai Cloud we are building an application-centric platform for containers - Pipeline - running on Kubernetes to allow developers to go from commit to scale in minutes. We support multiple development languages and frameworks to build applications, with one common goal: all Pipeline deployments receive integrated CI/CD, centralized logging, monitoring, enterprise-grade security, autoscaling, and spot price support automatically, out of the box. In most cases we accomplish this in a non-intrusive way (i.

Read more...

Apache Spark on Kubernetes series: Introduction to Spark on Kubernetes Scaling Spark made simple on Kubernetes The anatomy of Spark applications on Kubernetes Monitoring Apache Spark with Prometheus Apache Spark CI/CD workflow howto Spark History Server on Kubernetes Spark scheduling on Kubernetes demystified Spark Streaming Checkpointing on Kubernetes Deep dive into monitoring Spark and Zeppelin with Prometheus Apache Spark application resilience on Kubernetes Apache Zeppelin on Kubernetes series: Running Zeppelin Spark notebooks on Kubernetes Running Zeppelin Spark notebooks on Kubernetes - deep dive CI/CD flow for Zeppelin notebooks

Read more...

Banzai Cloud is happy to announce that it is an Amazon EKS Platform Partner. Last year Amazon joined CNCF and announced plans to create their own Kubernetes service - Amazon EKS. The service was launched this June, with the objective of making it easy for you to run Kubernetes on AWS without needing to install, operate, and maintain your own Kubernetes control plane. At Banzai Cloud we are committed to providing support for Kubernetes on all major cloud providers for our users, thus one of our priorities was to enable Amazon EKS in Pipeline and take the DevOps and user experience to the next level by turning EKS into a feature-rich enterprise-grade application platform.

Read more...

At Banzai Cloud we are building a feature rich enterprise-grade application platform, built for containers on top of Kubernetes, called Pipeline. We have always been committed to supporting Kubernetes and our container based application platform on all major providers, however, we are also committed to making portability between cloud vendors easy, seamless and automated. Accordingly, this post will highlight a few important aspects of a multi-cloud approach we learned from our users, and the open source code we developed and made part of the Pipeline platform.

Read more...

At Banzai Cloud we put a lot of emphasis on observability, so we automatically provide centralized monitoring and log collection for all clusters and deployments done through Pipeline. Over the last few months we’ve been experimenting with different approaches - tailored and driven by our customers’ individual needs - the best of which are now coded into our open source Logging-Operator. Just to recap, here are our earlier posts about logging using the fluent ecosystem: Centralized log collection on Kubernetes.

Read more...

Continuing our commitment to support all major cloud providers, today we are adding support in Pipeline for Oracle’s managed Kubernetes service, OKE (Oracle Kubernetes Engine). We are building a feature-rich enterprise-grade application platform on top of Kubernetes - called Pipeline - to deliver a better DevOps experience by automating the lifecycle management of applications. This experience is now available to OKE users as well; here, we will guide you through the first steps of using OKE and summarize the benefits of using Pipeline.

Read more...

At Banzai Cloud we are building a feature-rich enterprise-grade application platform, built for containers on top of Kubernetes, called Pipeline. With Pipeline we provision large, multi-tenant Kubernetes clusters on all major cloud providers, specifically AWS, GCP, Azure, AliCloud, Oracle and BYOC - on-premise and hybrid - and deploy all kinds of predefined or ad-hoc workloads to these clusters. For us and our enterprise users, authentication and authorization are absolutely vital; thus, in order to access the Kubernetes API and the Services in an authenticated manner as defined within Kubernetes, we arrived at a simple but flexible solution.

Read more...

A few months ago the Kubernetes Operator SDK was released with one of its goals being the conversion of human operational knowledge into code. At Banzai Cloud we’ve been contributors and early adopters of this technology, since it provides a better standardized method of automating our processes and allows us to dramatically ease the lives of our customers. We are building a feature rich enterprise-grade application platform, built for containers on top of Kubernetes, called Pipeline, wherein we endeavour to automate the DevOps experience and the lifecycle of deployments.

Read more...

At Banzai Cloud we are building a feature-rich enterprise-grade application platform, built for containers on top of Kubernetes, called Pipeline. Applications deployed to Pipeline automatically inherit the platform’s features: enterprise-grade security, observability (centralized log collection, monitoring and tracing), discovery, high availability and resiliency, just to name a few - encapsulated in spotguides. One of the most popular spotguides we deploy is Spark. In the past few months we’ve been working and pushing many pull requests to make Spark a first class player on Kubernetes and to make it resilient.

Read more...

A few weeks back we released Telescopes, our Kubernetes cluster layout recommender application. That application has evolved quite a bit, and in this post we’ll provide insight into some of its new features and recent changes. Cloud cost management series: Overspending in the cloud Managing spot instance clusters on Kubernetes with Hollowtrees Monitor AWS spot instance terminations Diversifying AWS auto-scaling groups Draining Kubernetes nodes tl;dr: we added new features to Telescopes: support for blacklisting or whitelisting instance types; recommendation accuracies can now be checked; and cloud instance types can now be queried by CPU, memory and network performance.

Read more...

One of our goals at Banzai Cloud is to make our customers’ lives easier by providing low-barrier-to-entry, easy-to-use solutions for running applications on Kubernetes. To achieve this, we often rely on Kubernetes Operators to provide comprehensive solutions over the course of an application’s lifecycle. Here is a list of our operators, which we have already open sourced: Vault Operator Prometheus JMX Exporter Operator

Read more...

At Banzai Cloud we are building a feature-rich enterprise-grade application platform, built for containers on top of Kubernetes, called Pipeline. Security is one of our main areas of focus, and we strive to automate and enable those security patterns we consider essential, including tier zero features for all enterprises using the Pipeline Platform. We’ve blogged about how to handle security scenarios in several of our previous posts. This time we’d like to focus on a different aspect of securing Kubernetes deployments:

Read more...

We are excited to announce that Banzai Cloud has joined the Cloud Native Computing Foundation! The CNCF and The Linux Foundation are expending extraordinary effort in helping to standardize open source technologies that enable the development, deployment, management and operation of next generation Cloud Native software stacks. Our mission is to bring Cloud Native to enterprises and with this announcement, we strive to help push container and cloud native technology standardization and interoperability forward.

Read more...

At Banzai Cloud we are building a feature-rich enterprise-grade application platform, built for containers on top of Kubernetes, called Pipeline. For an enterprise-grade application platform security is a must and it has many building blocks. Please read through the Security series on our blog to learn how we deal with a variety of security-related issues. Security series: Authentication and authorization of Pipeline users with OAuth2 and Vault Dynamic credentials with Vault using Kubernetes Service Accounts Dynamic SSH with Vault and Pipeline Secure Kubernetes Deployments with Vault and Pipeline Policy enforcement on K8s with Pipeline The Vault swiss-army knife The Banzai Cloud Vault Operator Vault unseal flow with KMS

Read more...

At Banzai Cloud we are building a feature-rich enterprise-grade application platform, built for containers on top of Kubernetes, called Pipeline. With Pipeline we provision large, multi-tenant Kubernetes clusters on all major cloud providers such as AWS, GCP, Azure, Oracle, Alibaba and BYOC, on-premise and hybrid, and deploy all kinds of predefined or ad-hoc workloads to these clusters. For us and our enterprise users, Kubernetes secret management (Base64) was not sufficient, so we chose Vault and added Kubernetes support to manage our secrets.
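To illustrate why plain Kubernetes secrets were not enough: a Secret value is only Base64-encoded, not encrypted, so anyone who can read the object can decode it. A minimal sketch (the secret value here is made up):

```go
package main

import (
	"encoding/base64"
	"fmt"
)

func main() {
	// A value as it would appear in a Kubernetes Secret manifest: Base64 is
	// an encoding, not encryption, so it offers no confidentiality on its own.
	encoded := "c3VwZXJzZWNyZXQ="

	decoded, err := base64.StdEncoding.DecodeString(encoded)
	if err != nil {
		panic(err)
	}
	fmt.Println(string(decoded)) // prints: supersecret
}
```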

Read more...

If you followed our blog series on Autoscaling on Kubernetes, you should already be familiar with the Kubernetes Cluster Autoscaler and the Vertical Pod Autoscaler used with Java 10 applications. This post will show you how to use the Horizontal Pod Autoscaler to autoscale your deployments based on custom metrics obtained from Prometheus. As a deployment example we’ve chosen our JEE Petstore application on Wildfly, to show that, besides metrics like CPU and memory, which are provided by default on Kubernetes, our Wildfly Operator automatically puts all Java and Java Enterprise Edition / Wildfly specific metrics at your fingertips, available in Prometheus, allowing you to easily autoscale deployments.

Read more...

At Banzai Cloud we are building a feature rich enterprise-grade application platform, built for containers on top of Kubernetes, called Pipeline. With Pipeline we provision large, multi-tenant Kubernetes clusters on all major cloud providers such as AWS, GCP, Azure and BYOC, on-premise and hybrid, and deploy all kinds of predefined or ad-hoc workloads to these clusters. For us and our enterprise users, Kubernetes secret management (Base64) was woefully inadequate, so we chose Vault with native Kubernetes support to manage our secrets.

Read more...

At Banzai Cloud we provision all kinds of applications to Kubernetes and we try to autoscale these clusters with Pipeline and/or properly size application resources as needed. As promised in an earlier blog post, How to correctly size containers for Java 10 applications, we’ll share our findings on the Vertical Pod Autoscaler (VPA) used with Java 10. VPA sets resource requests on pod containers automatically, based on historical usage, thus ensuring that pods are scheduled onto nodes where appropriate resource amounts are available for each pod.

Read more...

One of our goals with Pipeline is to support Java and Java Enterprise Edition deployments, allowing developers to iterate fast while building, deploying and pushing code to production safely. In order to do that, we place a lot of importance on different aspects of a Java/JEE application’s lifecycle - we allow engineers: to continuously integrate and deploy their Java apps to Kubernetes; to deploy Java Enterprise Edition applications to Kubernetes; to avoid OOMKills once the Java containers are deployed to K8s; to correctly size Java containers; and, once deployments are done and sized, to monitor them without any code modification. Enter Infinispan - a distributed cache and data grid.

Read more...

One of our goals at Banzai Cloud is to eliminate the concept of nodes, insofar as that is possible, so that users will only be aware of their applications and respective resource needs (CPU, GPU, memory, network, etc.). Launching Telescopes was a first step in that direction - helping end users to select the right instance types for the job, through Telescopes infrastructure recommendations, then turning those recommendations into actual infrastructure with Pipeline.

Read more...

At Banzai Cloud we use Kafka internally a lot. We have some internal systems and customer reporting deployments where we rely heavily on Kafka deployed to Kubernetes. We practice what we preach and all these deployments (not just the external ones) are done using our application platform, Pipeline. There is one difference between regular Kafka deployments and ours (though it is not relevant to this post): we have removed Zookeeper and use etcd instead.

Read more...

At Banzai Cloud we’re always open to experimenting with and integrating new software (tools, products). We also love to validate our new ideas by quickly implementing “proof of concept” projects. Even though we used five or so programming languages while building the Pipeline Platform, we love and use Golang the most. While these PoC projects are not intended for production use, they often serve as the basis for it. When this is the case, the PoC code needs to be refactored - or prepared for production.

Read more...

At Banzai Cloud we place a lot of emphasis on the observability of applications deployed to the Pipeline Platform, which we built on top of Kubernetes. To this end, one of the key components we use is Prometheus. Just to recap, Prometheus provides: an open source systems monitoring and alerting toolkit; a powerful query language (PromQL); a pull-based metrics gathering model; and a simple text format for metrics exposition. Problem statement: legacy applications are usually not exactly prepared for these last two, so we need a solution that bridges the gap for systems that do not speak the Prometheus metrics format: enter exporters.
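For reference, exposing a custom metric in that text format takes only a few lines with the official Prometheus Go client; the sketch below is a generic example (the metric name is made up), not one of our exporters:

```go
package main

import (
	"net/http"

	"github.com/prometheus/client_golang/prometheus"
	"github.com/prometheus/client_golang/prometheus/promhttp"
)

// A made-up counter; an exporter would translate a legacy system's own
// statistics into metrics like this one.
var processedJobs = prometheus.NewCounter(prometheus.CounterOpts{
	Name: "example_processed_jobs_total",
	Help: "Total number of processed jobs.",
})

func main() {
	prometheus.MustRegister(processedJobs)
	processedJobs.Inc()

	// Prometheus scrapes this endpoint and parses the plain text exposition format.
	http.Handle("/metrics", promhttp.Handler())
	http.ListenAndServe(":8080", nil)
}
```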

Read more...

During the development of the Pipeline Platform, all of its key building blocks, such as Pipeline, Hollowtrees and Bank-Vaults, have relied on making extensive Kubernetes API calls. We often wanted to try a quick K8s API call or run a small PoC inside a cluster while avoiding the usual deployment process, and we quickly realized that we needed a shortcut - which is how kurun was born. There are tools like telepresence that support slightly more complex scenarios.
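For context, even the simplest Kubernetes API call from Go requires wiring up a client from a kubeconfig; below is a minimal sketch with client-go (the kubeconfig path and namespace handling are simplified for the example):

```go
package main

import (
	"context"
	"fmt"
	"os"
	"path/filepath"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/tools/clientcmd"
)

func main() {
	// Load the local kubeconfig, as kubectl would.
	kubeconfig := filepath.Join(os.Getenv("HOME"), ".kube", "config")
	config, err := clientcmd.BuildConfigFromFlags("", kubeconfig)
	if err != nil {
		panic(err)
	}

	clientset, err := kubernetes.NewForConfig(config)
	if err != nil {
		panic(err)
	}

	// A "quick K8s API call": list pods in the default namespace.
	// (Recent client-go versions take a context as the first argument.)
	pods, err := clientset.CoreV1().Pods("default").List(context.TODO(), metav1.ListOptions{})
	if err != nil {
		panic(err)
	}
	fmt.Printf("found %d pods\n", len(pods.Items))
}
```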

Read more...

Creating Kubernetes clusters in the cloud and deploying (or CI/CDing) applications to those clusters is not always simple. There are a few conventional options, but they are either cloud or distribution specific. While we were working on our open source Pipeline Platform, we needed a solution which covered (here follows an inclusive but not exhaustive list of requirements): provisioning of Kubernetes clusters on all major cloud providers (via REST, UI and CLI) using a unified interface; application lifecycle management (on-demand deploy, CI/CD, dependency management, etc.), preferably over a REST interface; support for multi-tenancy and advanced security scenarios (app to app security with dynamic secrets, standards, multi-auth backends, and more); and the ability to build cross-cloud or hybrid Kubernetes environments. This post highlights the ease of creating Kubernetes clusters using the Pipeline API on the following providers:
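As a rough illustration of what an API-driven cluster create looks like, the sketch below POSTs a JSON cluster request to a Pipeline endpoint; the URL path, payload fields and token handling are simplified placeholders rather than the exact Pipeline API contract:

```go
package main

import (
	"bytes"
	"fmt"
	"net/http"
	"os"
)

func main() {
	// Hypothetical endpoint and payload shape, for illustration only;
	// consult the Pipeline API documentation for the real contract.
	url := "https://pipeline.example.com/api/v1/orgs/1/clusters"
	payload := []byte(`{
	  "name": "demo-cluster",
	  "cloud": "google",
	  "location": "europe-west1-b",
	  "properties": {"nodeCount": 3}
	}`)

	req, err := http.NewRequest("POST", url, bytes.NewBuffer(payload))
	if err != nil {
		panic(err)
	}
	req.Header.Set("Content-Type", "application/json")
	req.Header.Set("Authorization", "Bearer "+os.Getenv("PIPELINE_TOKEN"))

	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()
	fmt.Println("status:", resp.Status)
}
```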

Read more...

At Banzai Cloud we continue to work hard on the Pipeline platform we’re building on Kubernetes. We’ve open sourced quite a few operators already, and even recently teamed up with Red Hat and CoreOS to begin work on Kubernetes Operators using the new Operator SDK, and to help move human operational knowledge into code. The purpose of this blog post is to take a deep dive into the PVC Operator.

Read more...

At Banzai Cloud we’re building a feature rich platform, Pipeline, on top of Kubernetes. With Pipeline we provision large, multi-tenant Kubernetes clusters on all major cloud providers - AWS, GCP, Azure and BYOC - and deploy all kinds of predefined or ad-hoc workloads to these clusters. We wanted to set the industry standard for the way in which our users log in and interact with secure endpoints, and, at the same time, we wanted to provide dynamic secret management for each application we support.

Read more...

At Banzai Cloud we run and deploy containerized applications to our PaaS, Pipeline. Java or JVM-based workloads are among the notable workloads deployed to Pipeline, so getting them right is pretty important for us and our users. Java/JVM based workloads on Kubernetes with Pipeline Why my Java application is OOMKilled Deploying Java Enterprise Edition applications to Kubernetes A complete guide to Kubernetes Operator SDK Spark, Zeppelin, Kafka on Kubernetes

Read more...

A good number of years ago, back at the beginning of this century, most of us here at Banzai Cloud were in the Java Enterprise business, building application servers (BEA WebLogic and JBoss) and JEE applications. Those days are gone; the technology stack and landscape have dramatically changed; monolithic applications are out of fashion, but we still have lots of them running in production. Because of our background, we have a personal investment in helping to shift Java Enterprise Edition business applications towards microservices, managed deployments, Kubernetes, and the cloud using Pipeline.

Read more...

The Pipeline PaaS contains a complete CI/CD component to support developers building, deploying and operating applications in an automated way on Kubernetes. Most of our documentation, blog posts and how-tos have focused on Spark, Zeppelin and Tensorflow examples. However, it is possible to build and deploy any application with Pipeline’s CI/CD component. Our last post about the Banzai Cloud CI/CD flow described how to build/deploy a Spring Boot application on Kubernetes.

Read more...

Cloud cost management series: Overspending in the cloud Managing spot instance clusters on Kubernetes with Hollowtrees Monitor AWS spot instance terminations Diversifying AWS auto-scaling groups Draining Kubernetes nodes A few months ago we posted on this blog about overspending in the cloud. We discussed how difficult it is to keep track of the vast array of instance types and pricing options offered by cloud providers, especially on AWS with spot pricing.

Read more...

At Banzai Cloud we’re always looking for new and innovative technologies that support our users in their transition to microservices deployed on Kubernetes, using Pipeline. In recent months we’ve partnered with CoreOS and Red Hat to work on Kubernetes operators. That project was open sourced today, and is now available on GitHub. If you read through the rest of this blog, you’ll learn what an operator is, and how to use the Operator SDK to develop an operator, through a concrete example we developed here at Banzai Cloud.

Read more...

We are excited to announce that Banzai Cloud is now a Kubernetes Certified Service Provider (KCSP). The KCSP program was started by the Cloud Native Computing Foundation in collaboration with the Linux Foundation and represents a milestone in the wide-spread adoption of a cloud native platform. It provides a strict set of rules and a battery of certified experts that guarantee only experienced partners be admitted to the program. This fosters trust, so enterprises can rely on Banzai Cloud and our flagship PaaS, Pipeline, to bring to bear the experience necessary to guide them on their Kubernetes and microservices journey to cloud native application platforms and production usage.

Read more...

Bank Vaults is a thick, tricky, shifty right with a fast and intense tube for experienced surfers only, located on Mentawai. Think heavy steel doors, secret unlocking combinations and burly guards with smack-down attitudes. Watch out for clean-up sets. Bank Vaults is a wrapper for the official Vault client with automatic token renewal, built-in Kubernetes support, dynamic database credential management, multiple unseal options, automatic re/configuration and more.
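Since Bank-Vaults wraps the official Vault Go client, a bare-bones secret read with that underlying client looks roughly like the sketch below; the secret path and the environment-variable based token are illustrative, and the manual token handling is exactly the boilerplate Bank-Vaults automates on top:

```go
package main

import (
	"fmt"
	"os"

	vault "github.com/hashicorp/vault/api"
)

func main() {
	// The plain official client: the address comes from VAULT_ADDR by default.
	client, err := vault.NewClient(vault.DefaultConfig())
	if err != nil {
		panic(err)
	}
	// With the raw client the token must be supplied (and renewed) manually.
	client.SetToken(os.Getenv("VAULT_TOKEN"))

	// Read a secret from an illustrative KV path.
	secret, err := client.Logical().Read("secret/data/myapp")
	if err != nil {
		panic(err)
	}
	if secret != nil {
		fmt.Println(secret.Data)
	}
}
```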

Read more...

At Banzai Cloud we push different types of workloads to Kubernetes with our open source PaaS, Pipeline. There are lots of deployments we support for which we have defined Helm charts, however, Pipeline is able to deploy applications from any repository. These deployments are pushed on-prem or in the cloud, but many of these deployments share one common feature, the need for persistent volumes. Kubernetes provides abundant options in this regard, and each cloud provider also offers custom/additional alternatives.

Read more...

In this blog we’ll continue our series about Kubernetes logging, and cover some advanced techniques and visualizations pertaining to collected logs. Just to recap, with our open source PaaS, Pipeline, we monitor and collect/move a large number of the logs for the distributed applications we push to Kubernetes. We are expending a lot of effort to monitor large and federated clusters, and to automate these with Pipeline, so that our users receive out of the box monitoring and log collection for free.

Read more...

In the past few weeks we’ve been blogging about the advanced, enterprise-grade security features we are building into our open source PaaS, Pipeline. If you’d like to review these features, please read this series: Security series: Authentication and authorization of Pipeline users with OAuth2 and Vault Dynamic credentials with Vault using Kubernetes Service Accounts Dynamic SSH with Vault and Pipeline Secure Kubernetes Deployments with Vault and Pipeline Policy enforcement on K8s with Pipeline The Vault swiss-army knife The Banzai Cloud Vault Operator Vault unseal flow with KMS Kubernetes secret management with Pipeline Container vulnerability scans with Pipeline Kubernetes API proxy with Pipeline

Read more...

Apache Spark on Kubernetes series: Introduction to Spark on Kubernetes Scaling Spark made simple on Kubernetes The anatomy of Spark applications on Kubernetes Monitoring Apache Spark with Prometheus Apache Spark CI/CD workflow howto Spark History Server on Kubernetes Spark scheduling on Kubernetes demystified Spark Streaming Checkpointing on Kubernetes Deep dive into monitoring Spark and Zeppelin with Prometheus Apache Spark application resilience on Kubernetes

Read more...

Apache Spark on Kubernetes series: Introduction to Spark on Kubernetes Scaling Spark made simple on Kubernetes The anatomy of Spark applications on Kubernetes Monitoring Apache Spark with Prometheus Apache Spark CI/CD workflow howto Spark History Server on Kubernetes Spark scheduling on Kubernetes demystified Spark Streaming Checkpointing on Kubernetes Deep dive into monitoring Spark and Zeppelin with Prometheus Apache Spark application resilience on Kubernetes Collecting Spark History Server event logs in the cloud

Read more...

As we alluded to in the last post in this series, we’ll be continuing our discussion of centralized and secure Kubernetes logging/log collection. Log messages can contain sensitive information, so it’s important to secure transport between distributed parts of the log flow. This post will describe how we’ve secured moving log messages on our Kubernetes clusters provisioned by Pipeline. Logging series: Centralized logging under Kubernetes Secure logging on Kubernetes with Fluentd and Fluent Bit

Read more...

During the development of our open source Pipeline PaaS, we introduced some handy features to help deal with deployments. We deploy most of our applications as Helm releases, so we needed a way to interact with Helm programmatically (using gRPC) and through a UI (RESTful API). In order to do that with Pipeline, we introduced a very useful feature that manages Helm repositories and deploys applications with Helm to Kubernetes, using RESTful API calls.

Read more...

Cloud cost management series: Overspending in the cloud Managing spot instance clusters on Kubernetes with Hollowtrees Monitor AWS spot instance terminations Diversifying AWS auto-scaling groups Draining Kubernetes nodes Cluster recommender Cloud instance type and price information as a service Kubernetes was designed in such a way as to be fault tolerant of worker node failures. If a node goes missing because of a hardware problem, a cloud infrastructure problem, or if Kubernetes simply ceases to receive heartbeat messages from a node for any reason, the Kubernetes control plane is clever enough to handle it.

Read more...

The Pipeline PaaS contains a complete CI/CD component to support developers building, deploying and operating applications in an automated way, deployed to Kubernetes. Most of our documentation, blog posts and howtos have so far focused on Spark, Zeppelin and Tensorflow examples. However, we can actually build and deploy any application with Pipeline’s CI/CD component. This post showcases how to enable a simple Spring Boot application for the Banzai Cloud CI/CD flow, build and save the necessary artifacts, and deploy it to a Kubernetes cluster.

Read more...

For our Pipeline PaaS, monitoring is an essential part of operating distributed applications in production. We put a great deal of effort into monitoring large and federated clusters and automating these with Pipeline, so all our users receive out of the box monitoring for free. You can read about our monitoring series, below: Monitoring series: Monitoring Apache Spark with Prometheus Monitoring multiple federated clusters with Prometheus - the secure way Application monitoring with Prometheus and Pipeline Building a cloud cost management system on top of Prometheus Monitoring Spark with Prometheus, reloaded

Read more...

For some time we’ve been evangelizing the idea that the runtime fabric of big data workloads should be Kubernetes. In this post I’d like to walk through the thought process behind that change and discuss its benefits. Obviously, this is a pretty large topic, and this post has no intention of covering it completely - also, it reflects the views and opinions that we at Banzai Cloud believe in and push others to adopt.

Read more...

The adoption of serverless technologies is quickly progressing. According to this survey, it’s on par with the adoption of containers. And, even though ‘serverless’ is a very vague term (it can be argued that it’s rarely used in production, especially in complex applications), it seems set to be one of the most dominant trends in the near future in the cloud computing space. While, once, serverless referred specifically to early stage AWS Lambda, the category has matured rapidly.

Read more...

Banzai Pipeline, or simply “Pipeline” is a tabletop reef break located in Hawaii, on Oahu’s North Shore. It is the most famous and infamous reef on the planet, and serves as the benchmark by which all other surf breaks are measured. Pipeline is a PaaS with a built in CI/CD engine to deploy cloud native microservices to a public cloud or on-premise. It simplifies and abstracts all the details of provisioning cloud infrastructure, installing or reusing a Kubernetes cluster, and deploying an application.

Read more...

As part of the Debug 101 series, we’re back hunting a small but annoying bug. This kind of bug is not really a bug, but a side effect of several tools working together. Here comes trouble: I deploy a development version of Pipeline on a Kubernetes cluster running on top of AWS infrastructure. For this deployment I use the following Helm command: $: helm install --name pipeline banzaicloud-stable/pipeline-cp \ --set=drone.

Read more...

Kubeless was designed to be a Kubernetes-native serverless framework, and, for PubSub functions, uses Apache Kafka behind the scenes. At Banzai Cloud we like cloud-native technologies, however, we weren’t happy about having to operate a Zookeeper cluster on Kubernetes, so we modified and open-sourced a version for Kafka in which we replaced Zookeeper with etcd, which was (and still is) a better fit. This post is part of our serverless series, which discusses deploying Kubeless, using Kafka on etcd with Pipeline, and deploying a so called PubSub function.

Read more...

Monitoring series: Monitoring Apache Spark with Prometheus Monitoring multiple federated clusters with Prometheus - the secure way Application monitoring with Prometheus and Pipeline Building a cloud cost management system on top of Prometheus Monitoring Spark with Prometheus, reloaded At Banzai Cloud we deploy large distributed applications to Kubernetes clusters that we also operate. We don’t enjoy waking up to PagerDuty notifications in the middle of the night, so we try to get ahead of problems by operating these clusters as efficiently as possible.

Read more...

Monitoring series: Monitoring Apache Spark with Prometheus Monitoring multiple federated clusters with Prometheus - the secure way Application monitoring with Prometheus and Pipeline Building a cloud cost management system on top of Prometheus Monitoring Spark with Prometheus, reloaded At Banzai Cloud we provision and monitor large Kubernetes clusters deployed to multiple cloud/hybrid environments, using Prometheus. The clusters, applications or frameworks are all managed by our next generation PaaS, Pipeline.

Read more...

At Banzai Cloud we’re always looking for products or frameworks that add value to our business, which we can enable in our open source PaaS, Pipeline. Any list of such products would include serverless frameworks. Thus, today we’re adding Fn as a supported spotguide, making it easy for users to deploy Fn with Pipeline on their chosen cloud provider. Before we dive into how to deploy and use Fn with Pipeline, here are a few reasons why we thought Fn should be supported by Pipeline:

Read more...

In our last entry in the distributed TensorFlow series, we used a research example for distributed training of an Inception model. In this post we’ll showcase how to do the same thing on GPU instances, this time on Azure managed Kubernetes - AKS - deployed with Pipeline. As you may remember from our previous post, the first thing to consider when running distributed TensorFlow models is whether you have shared storage space available.

Read more...

At Banzai Cloud we secure our Kubernetes services using Vault and OAuth2 tokens. This has not always been the case, though we’ve had authentication in our project (even though it was basic) from a very early PoC stage - and we suggest that you do the same. Usually, inbound connections to Kubernetes cluster services are accessed via Ingress. Just to recap, public services are typically accessed through a loadbalancer service.

Read more...

Monitoring series: Monitoring Apache Spark with Prometheus Monitoring multiple federated clusters with Prometheus - the secure way Application monitoring with Prometheus and Pipeline Building a cloud cost management system on top of Prometheus Monitoring Spark with Prometheus, reloaded At Banzai Cloud we provision and monitor large Kubernetes clusters deployed to multiple cloud/hybrid environments. These clusters and applications or frameworks are all managed by our next generation PaaS, Pipeline.

Read more...

Apache Spark on Kubernetes series: Introduction to Spark on Kubernetes Scaling Spark made simple on Kubernetes The anatomy of Spark applications on Kubernetes Monitoring Apache Spark with Prometheus Apache Spark CI/CD workflow howto Spark History Server on Kubernetes Spark scheduling on Kubernetes demystified Spark Streaming Checkpointing on Kubernetes Deep dive into monitoring Spark and Zeppelin with Prometheus Apache Spark application resilience on Kubernetes Apache Zeppelin on Kubernetes series: Running Zeppelin Spark notebooks on Kubernetes Running Zeppelin Spark notebooks on Kubernetes - deep dive CI/CD flow for Zeppelin notebooks

Read more...

At Banzai Cloud we provision different frameworks and tools like Spark, Zeppelin, Kafka, Tensorflow, etc to our Pipeline PaaS (built on Kubernetes). Last week we added serverless capabilities to Pipeline, using OpenFaas. This blog post explains how to deploy OpenFaaS to Kubernetes using Pipeline and invoke an example function. We’ll distinguish between the provisioning of the serverless frameworks we support (this post is about OpenFaaS but Pipeline also supports Kubeless), from the invocation of functions through the Pipeline API or CI/CD workflow once it’s dispatched to any of the serverless frameworks we deploy to Kubernetes.
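Once a function is deployed, invoking it is a plain HTTP call against the OpenFaaS gateway, which exposes deployed functions under /function/<name>; a minimal sketch (the gateway address and function name below are placeholders):

```go
package main

import (
	"fmt"
	"io/ioutil"
	"net/http"
	"strings"
)

func main() {
	// OpenFaaS routes requests to deployed functions under /function/<name>
	// on its gateway; the host and function name here are placeholders.
	gateway := "http://gateway.openfaas.example.com:8080"
	resp, err := http.Post(gateway+"/function/hello", "text/plain", strings.NewReader("Banzai Cloud"))
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()

	body, err := ioutil.ReadAll(resp.Body)
	if err != nil {
		panic(err)
	}
	fmt.Println(string(body))
}
```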

Read more...

Apache Spark on Kubernetes series: Introduction to Spark on Kubernetes Scaling Spark made simple on Kubernetes The anatomy of Spark applications on Kubernetes Monitoring Apache Spark with Prometheus Apache Spark CI/CD workflow howto Spark History Server on Kubernetes Spark scheduling on Kubernetes demystified Spark Streaming Checkpointing on Kubernetes Deep dive into monitoring Spark and Zeppelin with Prometheus Apache Spark application resilience on Kubernetes Apache Zeppelin on Kubernetes series: Running Zeppelin Spark notebooks on Kubernetes Running Zeppelin Spark notebooks on Kubernetes - deep dive CI/CD flow for Zeppelin notebooks

Read more...

Cloud cost management series: Overspending in the cloud Managing spot instance clusters on Kubernetes with Hollowtrees Monitor AWS spot instance terminations Diversifying AWS auto-scaling groups Draining Kubernetes nodes Cluster recommender Cloud instance type and price information as a service You may remember the Hollowtrees project we open sourced a few weeks ago: a framework for the management of AWS spot instance clusters, batteries included: Hollowtrees, an alert-react based framework that’s part of the Pipeline PaaS, which coordinates monitoring, applies rules and dispatches action chains to plugins using standard CNCF interfaces AWS spot instance termination Prometheus exporter AWS autoscaling group Prometheus exporter AWS Spot Instance recommender Kubernetes action plugin to execute k8s operations (e.

Read more...

At Banzai Cloud we are building a cloud agnostic, open source next generation CloudFoundry/Heroku-like PaaS, Pipeline, while running several big data workloads natively on Kubernetes. Apache Kafka is one of the cloud native workloads we support out-of-the-box, alongside Apache Spark and Apache Zeppelin. If you’re interested in running big data workloads on Kubernetes, please read the following blog series as well. Apache Kafka on Kubernetes series: Kafka on Kubernetes with Local Persistent Volumes Monitoring Apache Kafka with Prometheus

Read more...

At Banzai Cloud we run multiple Kubernetes clusters deployed with our next generation PaaS, Pipeline, and we deploy these clusters across different cloud providers like AWS, Azure and Google, or on-premise. These clusters are typically launched via the same control plane deployed either to AWS, as a CloudFormation template, or Azure, as an ARM template. And, since we practice what we preach, they run inside Kubernetes as well. One of the added values to deployments via Pipeline is out-of-the-box monitoring and dashboards through default spotguides for the applications we also support out-of-the-box.

Read more...

Cloud cost management series: Overspending in the cloud Managing spot instance clusters on Kubernetes with Hollowtrees Monitor AWS spot instance terminations Diversifying AWS auto-scaling groups Draining Kubernetes nodes Cluster recommender Cloud instance type and price information as a service Last week we open sourced the Hollowtrees project, a framework that manages AWS spot instance clusters - batteries included: Hollowtrees - an alert-react based framework that’s part of the Pipeline PaaS, which coordinates monitoring, applies rules and dispatches action chains to plugins using standard CNCF interfaces AWS spot instance termination Prometheus exporter AWS autoscaling group Prometheus exporter AWS Spot Instance recommender Kubernetes action plugin to execute k8s operations (e.

Read more...

Apache Spark on Kubernetes series: Introduction to Spark on Kubernetes Scaling Spark made simple on Kubernetes The anatomy of Spark applications on Kubernetes Monitoring Apache Spark with Prometheus Apache Spark CI/CD workflow howto Spark History Server on Kubernetes Spark scheduling on Kubernetes demystified Spark Streaming Checkpointing on Kubernetes Deep dive into monitoring Spark and Zeppelin with Prometheus Apache Spark application resilience on Kubernetes Apache Zeppelin on Kubernetes series: Running Zeppelin Spark notebooks on Kubernetes Running Zeppelin Spark notebooks on Kubernetes - deep dive CI/CD flow for Zeppelin notebooks

Read more...

Banzai Pipeline, or simply “Pipeline” is a tabletop reef break located in Hawaii, on Oahu’s North Shore. It is the most famous and infamous reef on the planet, and serves as the benchmark by which all other surf breaks are measured. Pipeline is a PaaS with a built in CI/CD engine to deploy cloud native microservices to a public cloud or on-premise. It simplifies and abstracts all the details of provisioning cloud infrastructure, installing or reusing a Kubernetes cluster, and deploying an application.

Read more...

Hollowtrees is a wave of highest pedigree, the pin-up centerfold of the Mentawai islands’ surf break which brings new machine-like connotations to the word perfection. Watch out for the aptly named ‘Surgeon’s Table’, a brutal reef famous for taking bits and pieces of Hollowtrees’ surfers as trophies. Hollowtrees, a ruleset based watch-guard keeps spot instance-based clusters safe and allows for them to be used in production. It handles spot price surges within a given region or availability zone and reschedules applications before instances are taken down.

Read more...

Apache Spark on Kubernetes series: Introduction to Spark on Kubernetes Scaling Spark made simple on Kubernetes The anatomy of Spark applications on Kubernetes Monitoring Apache Spark with Prometheus Apache Spark CI/CD workflow howto Spark History Server on Kubernetes Spark scheduling on Kubernetes demystified Spark Streaming Checkpointing on Kubernetes Deep dive into monitoring Spark and Zeppelin with Prometheus Apache Spark application resilience on Kubernetes Apache Zeppelin on Kubernetes series: Running Zeppelin Spark notebooks on Kubernetes Running Zeppelin Spark notebooks on Kubernetes - deep dive CI/CD flow for Zeppelin notebooks

Read more...

We are moving relatively quickly, implementing new Pipeline features and releases, with our second major release scheduled for this week. Among other new features we’ve already added a new managed Kubernetes provider, Microsoft’s Azure AKS. Azure Container Service (AKS) is a preview feature of the Azure Cloud - and we’re proud to be among its earliest adopters. We can provision and deploy apps to Kubernetes on Azure VMs the same way we do on EC2, however, at Banzai Cloud we strongly believe that the future is in managed Kubernetes services; most of our investment regarding cloud neutrality and provisioning is built on managed Kubernetes services both in the cloud (GKE, OCI and ACS in beta, or under development) and on-prem.

Read more...

Last time we discussed how our Pipeline PaaS deploys and provisions an AWS EFS filesystem on Kubernetes, and what the performance benefits are for Spark or TensorFlow. This post gives an introduction to TensorFlow on Kubernetes and covers the benefits of EFS for TensorFlow (image data storage for TensorFlow jobs). Pipeline uses the kubeflow framework to deploy: a JupyterHub to create and manage interactive Jupyter notebooks; a TensorFlow Training Controller that can be configured to use CPUs or GPUs; and a TensorFlow Serving container. Note that Pipeline also has default Spotguides for Spark and Zeppelin to help support your datascience experience

Read more...

At Banzai Cloud we provision different frameworks and tools like Spark, Zeppelin and, most recently, Tensorflow, all of which run on our Pipeline PaaS (built on Kubernetes). One of Pipeline’s early adopters runs a Tensorflow Training Controller using GPUs on AWS EC2, wired into our CI/CD pipeline, which needs significant parallelization for reading training data. We’ve introduced support for Amazon Elastic File System and made it publicly available in the forthcoming release of Pipeline.

Read more...

At Banzai Cloud we provision different applications or frameworks to Pipeline, the PaaS we built on Kubernetes. We practice what we preach, and our PaaS’ control plane also runs on Kubernetes and requires a layer of data storage. It was therefore necessary that we explore two different use cases: how to deploy and run a distributed, scalable and fully SQL compliant DB to cover our clients’ and our own internal needs.

Read more...

At Banzai Cloud we run and deploy containerized applications to Pipeline, our PaaS. Those of you who (like us) run Java applications inside Docker have probably already come across the problem of JVMs inaccurately detecting available memory when running inside a container. Instead of accurately detecting the memory available in a Docker container, JVMs see the available memory of the machine. This can lead to cases wherein applications that run inside containers are killed whenever they try to use an amount of memory that exceeds the limits of the Docker container.

Read more...

Apache Spark on Kubernetes series: Introduction to Spark on Kubernetes Scaling Spark made simple on Kubernetes The anatomy of Spark applications on Kubernetes Monitoring Apache Spark with Prometheus Apache Spark CI/CD workflow howto Spark History Server on Kubernetes Spark scheduling on Kubernetes demystified Spark Streaming Checkpointing on Kubernetes Deep dive into monitoring Spark and Zeppelin with Prometheus Apache Spark application resilience on Kubernetes Apache Zeppelin on Kubernetes series: Running Zeppelin Spark notebooks on Kubernetes Running Zeppelin Spark notebooks on Kubernetes - deep dive CI/CD flow for Zeppelin notebooks

Read more...

Modern applications and services usually expose their functions via REST; moreover, modules and components also make use of external services that are exposed as REST. Thus, developers often need to design RESTful services and write REST service clients. It’s a given in this kind of work that these services will be called thousands of times during the development process (developers need to understand the API, as well as the messages and the resources involved), and even after, to make sure everything works as desired.

Read more...

Apache Spark on Kubernetes series: Introduction to Spark on Kubernetes Scaling Spark made simple on Kubernetes The anatomy of Spark applications on Kubernetes Monitoring Apache Spark with Prometheus Apache Spark CI/CD workflow howto Spark History Server on Kubernetes Spark scheduling on Kubernetes demystified Spark Streaming Checkpointing on Kubernetes Deep dive into monitoring Spark and Zeppelin with Prometheus Apache Spark application resilience on Kubernetes Apache Zeppelin on Kubernetes series: Running Zeppelin Spark notebooks on Kubernetes Running Zeppelin Spark notebooks on Kubernetes - deep dive CI/CD flow for Zeppelin notebooks

Read more...

As 2017 comes to an end, we’re looking back at the three blog posts that were most popular with our readers. We can’t go too far back (though we’ve had 13 posts and one release already), since we founded our startup just a little over one month ago (on November 20, 2017, to be precise), but during this short period we’ve achieved a whole lot, and laid the foundation for some exciting new projects we plan to ship out early next year.

Read more...

Last week we released the first version of Pipeline - with end-to-end support for cloud native apps, starting from a GitHub commit hook and deployed into the cloud in minutes using a fully customizable CI/CD workflow. The core part of the Pipeline PaaS is spotguides - a collection of workflow/pipeline steps defined in a .pipeline.yml file and a few Drone plugins. In this post we would like to demystify spotguides and describe, step by step, how they work; the next post will be a tutorial on how to write a custom spotguide and an associated plugin.

Read more...

Apache Spark on Kubernetes series: Introduction to Spark on Kubernetes Scaling Spark made simple on Kubernetes The anatomy of Spark applications on Kubernetes Monitoring Apache Spark with Prometheus Apache Spark CI/CD workflow howto Spark History Server on Kubernetes Spark scheduling on Kubernetes demystified Spark Streaming Checkpointing on Kubernetes Deep dive into monitoring Spark and Zeppelin with Prometheus Apache Spark application resilience on Kubernetes Apache Zeppelin on Kubernetes series: Running Zeppelin Spark notebooks on Kubernetes Running Zeppelin Spark notebooks on Kubernetes - deep dive CI/CD flow for Zeppelin notebooks

Read more...

Apache Spark on Kubernetes series: Introduction to Spark on Kubernetes Scaling Spark made simple on Kubernetes The anatomy of Spark applications on Kubernetes Monitoring Apache Spark with Prometheus Apache Spark CI/CD workflow howto Spark History Server on Kubernetes Spark scheduling on Kubernetes demystified Apache Zeppelin on Kubernetes series: Running Zeppelin Spark notebooks on Kubernetes Running Zeppelin Spark notebooks on Kubernetes - deep dive CI/CD flow for Zeppelin notebooks

Read more...

Banzai Pipeline, or simply Pipeline, is a tabletop reef break located in Hawaii, on Oahu’s North Shore. It is the most famous and infamous reef on the planet, and serves as the benchmark by which all other waves are measured. Pipeline is a PaaS with a built in CI/CD engine to deploy cloud native microservices to a public cloud or on-premise. It simplifies and abstracts all the details of provisioning cloud infrastructure, installing or reusing a Kubernetes cluster and deploying an application.

Read more...

Cloud cost management series: Overspending in the cloud Managing spot instance clusters on Kubernetes with Hollowtrees Monitor AWS spot instance terminations Diversifying AWS auto-scaling groups Draining Kubernetes nodes Cluster recommender Cloud instance type and price information as a service One of the primary advantages always discussed in the context of deciding whether to move a deployment to the cloud is cost. There are no upfront costs in moving to the cloud because you don’t have to buy any hardware, and you only pay for what you really use because you can scale your infrastructure according to your workloads.

Read more...

Apache Spark on Kubernetes series: Introduction to Spark on Kubernetes Scaling Spark made simple on Kubernetes The anatomy of Spark applications on Kubernetes Monitoring Apache Spark with Prometheus Apache Spark CI/CD workflow howto Spark History Server on Kubernetes Spark scheduling on Kubernetes demystified Spark Streaming Checkpointing on Kubernetes Deep dive into monitoring Spark and Zeppelin with Prometheus Apache Spark application resilience on Kubernetes Apache Zeppelin on Kubernetes series: Running Zeppelin Spark notebooks on Kubernetes Running Zeppelin Spark notebooks on Kubernetes - deep dive CI/CD flow for Zeppelin notebooks

Read more...

Apache Spark on Kubernetes series: Introduction to Spark on Kubernetes Scaling Spark made simple on Kubernetes The anatomy of Spark applications on Kubernetes Monitoring Apache Spark with Prometheus Apache Spark CI/CD workflow howto Spark History Server on Kubernetes Spark scheduling on Kubernetes demystified Spark Streaming Checkpointing on Kubernetes Deep dive into monitoring Spark and Zeppelin with Prometheus Apache Spark application resilience on Kubernetes Apache Zeppelin on Kubernetes series: Running Zeppelin Spark notebooks on Kubernetes Running Zeppelin Spark notebooks on Kubernetes - deep dive CI/CD flow for Zeppelin notebooks

Read more...

Today we’re starting a new series called Debug 101, which deals with those issues that gave us particularly bad headaches and took a large amount of time to debug, understand and fix. We believe strongly in open source software and open issue resolution, and we try to describe our problems and suggest fixes as we go, so you don’t have to shave that yak. We already have, and the yak looks awesome.

Read more...

Apache Spark on Kubernetes series: Introduction to Spark on Kubernetes Scaling Spark made simple on Kubernetes The anatomy of Spark applications on Kubernetes Monitoring Apache Spark with Prometheus Apache Spark CI/CD workflow howto Spark History Server on Kubernetes Spark scheduling on Kubernetes demystified Spark Streaming Checkpointing on Kubernetes Deep dive into monitoring Spark and Zeppelin with Prometheus Apache Spark application resilience on Kubernetes Apache Zeppelin on Kubernetes series: Running Zeppelin Spark notebooks on Kubernetes Running Zeppelin Spark notebooks on Kubernetes - deep dive CI/CD flow for Zeppelin notebooks

Read more...

At Banzai Cloud we use different cloud providers or managed Kubernetes offerings, one of which is Microsoft Azure Managed Kubernetes. It’s a pretty solid service that allows you to deploy a managed k8s cluster without requiring you to deal with low level Kubernetes building blocks, tooling, or cloud infrastructure provisioning. However, there is one temporary issue that affects a cornerstone of our PaaS, Pipeline: the Azure Go SDK does not yet contain the bindings for this new service.

Read more...