Kubernetes has taken center stage in the world of container orchestration in a relatively short time. Originally developed by Google, open sourced in 2014, and later donated to the CNCF (where it graduated in 2018), it is one of the most popular open source projects in the world. This fame is due to a strong developer community and a vast ecosystem, with 900+ products, services, and organizations adding value to Kubernetes in their own way. Navigating this huge ecosystem can be a daunting task for a beginner or even an experienced IT professional. This guide eases that challenge by explaining what some of these products and services do for Kubernetes, or how they augment its functionality to turn it into a complete enterprise solution. The diagram below shows the set of essential components required for a working Kubernetes platform. We elaborate on these components in this document.
There are 149 Kubernetes platforms on the market at the time of this writing (March 2021). Some of them are hosted solutions (48), some are installers (20), and others are CNCF certified distributions (67). Here we list some of the major players in each of these categories.
Also known as “managed solutions”, hosted Kubernetes solutions are run by some of the largest cloud providers. Here we list some of them to give you an insight into the world of hosted Kubernetes.
The primary goal of installers is to make it easy for an administrator to deploy Kubernetes on a desired platform. Mostly you just need a container runtime (Docker or another) and machines on which to install the Kubernetes cluster.
|Rancher||Rancher Kubernetes Engine is an extremely simple, lightning-fast Kubernetes installer that works everywhere. It works on bare-metal and virtualized servers. Limited to running with the Docker runtime only.|
|kops||Kubernetes Operations can be thought of as kubectl for clusters. Helps create, destroy, upgrade, and maintain production-grade, highly available clusters. Can also provision the necessary cloud infrastructure (e.g. on AWS).|
|kubeadm||Creates a minimum viable Kubernetes cluster that conforms to best practices. A simple way to try out Kubernetes, possibly for the first time. A building block in other ecosystem and/or installer tools with a larger scope.|
|minikube||Sets up a local Kubernetes cluster on Linux, macOS, and Windows. The focus of minikube is to make it easy to learn and develop for Kubernetes. You only need a virtual machine to install and run minikube.|
|CableLabs||An installation tool for installing Kubernetes on Linux machines that have been initialized with SNAPS-Boot.|
The CNCF ensures that Certified Kubernetes products deliver consistency and portability; 67 Certified Kubernetes Distributions and Platforms are now available. We provide details on some of them below.
|Agile Stack||DevOps platform that provides automation for cloud infrastructure, applications, and security. Auto-generates infrastructure-as-code scripts, significantly reducing the effort to integrate DevOps tools such as Docker, Jenkins, Spinnaker, PostgreSQL, Git, and Okta into the stack.|
|EKS Distro AWS||An open source distribution of Kubernetes. It includes binaries and containers of open source Kubernetes, etcd, networking, and storage plugins, all tested for compatibility.|
|Cisco Container Platform (CCP)||It is flexible and runs anywhere, with consistent deployment across hyperconverged infrastructure, VMs, bare metal, public cloud, and private cloud. Automates installing Kubernetes and Docker, installing analytics tools, creating clusters, load balancing, curating the OS, and even updating the distribution.|
|VMware Tanzu||Simplifies Kubernetes for developers by running the same version across data center and public cloud, with a secure experience for all development teams. Can run Tanzu Kubernetes Grid, supported by VMware, with Tanzu Kubernetes Clusters on the IaaS of your choice.|
|Weaveworks Kubernetes||Rapidly create, update, and manage production-ready application clusters, with all of the add-ons needed for an agile cloud native platform, in a single click. Minimizes operations overhead with automated cluster lifecycle management: upgrades, security patches, and cluster extension updates.|
From a long list of products and services in the Kubernetes ecosystem, here we list some of the critical add-ons that are a must for almost every Kubernetes implementation.
Networking is a central piece of a Kubernetes cluster. The fundamental requirements for a Kubernetes networking implementation are: all pods can communicate with each other across the cluster, Kubernetes agents on a node can communicate with all pods on that node, and pods can communicate without the use of NAT. Some of the most widely used network solutions are listed here.
|Container Network Interface (CNI)||When considering a container network solution, make sure you choose one that adheres to the CNI specification.|
|Antrea by VMware||Kubernetes networking based on Open vSwitch, a high-performance programmable virtual switch, which allows implementation of an extensive set of networking and security features. Allows implementation of Kubernetes Network Policies.|
|AWS VPC CNI, Azure CNI, Google GCE, etc.||Most major public cloud providers implement their own pod networking solutions on the hosted clusters they offer.|
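Because CNI plugins such as Antrea enforce Kubernetes Network Policies, it may help to see roughly what such a policy contains. The following is a minimal sketch, not a recommended configuration: it builds a `NetworkPolicy` manifest as a Python dictionary and prints it as JSON, which `kubectl apply -f` also accepts. The helper `network_policy`, the policy name, the `demo` namespace, the `app` labels, and port 8080 are all hypothetical values chosen for illustration.

```python
import json


def network_policy(name, namespace, target_labels, allowed_labels, port):
    """Build a NetworkPolicy that lets only pods matching allowed_labels
    reach pods matching target_labels on a single TCP port."""
    return {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "NetworkPolicy",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            # Pods this policy applies to.
            "podSelector": {"matchLabels": target_labels},
            "policyTypes": ["Ingress"],
            "ingress": [
                {
                    # Only traffic from these pods is admitted.
                    "from": [{"podSelector": {"matchLabels": allowed_labels}}],
                    "ports": [{"protocol": "TCP", "port": port}],
                }
            ],
        },
    }


# Hypothetical example: only "frontend" pods may call "backend" pods.
policy = network_policy(
    name="allow-frontend-to-backend",
    namespace="demo",
    target_labels={"app": "backend"},
    allowed_labels={"app": "frontend"},
    port=8080,
)
print(json.dumps(policy, indent=2))
```

The policy only takes effect if the cluster's network plugin implements Network Policies; on a plugin without that support the object is accepted but not enforced.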
Storage in Kubernetes is implemented by employing one of the provisioners in the ecosystem. You can offer multiple types of storage side by side, because storage is provisioned through StorageClass objects. Each StorageClass lets you define a provisioner, which can be any one of the following.
|AWS EBS||High performance for demanding workloads. Options to attach volumes to EC2 R5b instances to get up to 60 Gbps bandwidth and 260K IOPS. Up to 99.9% durability. Virtually unlimited scale to petabytes of data.|
|AzureFile/AzureDisk||Azure Disks are mounted as ReadWriteOnce, so they are only available to a single node. Azure Files can be shared across nodes and pods. Regional data redundancy: stores multiple copies of data.|
|Longhorn||Cloud-native storage: can run on public clouds and private clusters. Very easy to deploy. Automatic volume backup and restore.|
|PortworxVolume||High-performance block storage for Kubernetes. On-prem or across clouds. Fully integrated solution for persistent storage.|
|OpenEBS||Open source block storage. Can work like local storage and use host drives. Simple to set up.|
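To make the StorageClass and provisioner relationship concrete, here is a minimal sketch that builds a StorageClass and a PersistentVolumeClaim that requests storage from it, again as Python dictionaries printed as JSON. The helper names, the class name `fast-ebs`, the claim name `data-claim`, and the 10Gi size are hypothetical. The provisioner string `kubernetes.io/aws-ebs` is the in-tree AWS EBS provisioner; newer clusters typically use the CSI driver `ebs.csi.aws.com` instead, so treat the value as an assumption about your cluster.

```python
import json


def storage_class(name, provisioner, parameters=None):
    """Build a StorageClass that delegates volume creation to a provisioner."""
    return {
        "apiVersion": "storage.k8s.io/v1",
        "kind": "StorageClass",
        "metadata": {"name": name},
        "provisioner": provisioner,
        "parameters": parameters or {},
    }


def volume_claim(name, storage_class_name, size):
    """Build a PersistentVolumeClaim that asks the named class for storage."""
    return {
        "apiVersion": "v1",
        "kind": "PersistentVolumeClaim",
        "metadata": {"name": name},
        "spec": {
            "storageClassName": storage_class_name,
            "accessModes": ["ReadWriteOnce"],
            "resources": {"requests": {"storage": size}},
        },
    }


# Hypothetical names and sizes, for illustration only.
fast = storage_class("fast-ebs", "kubernetes.io/aws-ebs", {"type": "gp2"})
claim = volume_claim("data-claim", fast["metadata"]["name"], "10Gi")

for manifest in (fast, claim):
    print(json.dumps(manifest, indent=2))
```

Defining a second StorageClass with a different provisioner is how one cluster can offer several storage types at once; each claim picks its class by name.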
You can set up a Kubernetes platform for your development team as well as for your production environment. When setting up for production, there are additional components that allow you to manage, monitor, secure, and fortify your production cluster.
Unlike monoliths, where monitoring applications and hosts was enough, cloud-native applications running on Kubernetes have many components that require monitoring, such as nodes, containers, microservices, and more. The Kubernetes ecosystem provides many management and monitoring tools for a wide range of needs and situations.
|AWS CloudWatch||Installed on an EKS cluster as a DaemonSet. Collects cluster metrics from each node.|
|DataDog||Datadog agents collect metrics, events, and logs. Collects metrics from the container runtime as well. Automatically monitors cluster nodes.|
|Grafana||Grafana can pull data from multiple sources and present it in a user-friendly manner (graphs, charts). Connects with Prometheus, Datadog, and Splunk, to name a few.|
|WeaveCloud/WeaveScope||Visualization and monitoring tool. Can be used in standalone mode (local hosts) or on cluster hosts. Uses APIs to gather information from the nodes.|
|fluentd||A popular choice for log collection. Supports log collection at multiple levels, such as application logs, network protocols, and cloud APIs (e.g. AWS CloudWatch, AWS SQS).|
|Google Stackdriver||Log aggregation and monitoring solution. Stackdriver logs: collects logs. Stackdriver Monitoring: creates charts and dashboards.|
System logs can be a major resource hog and provide little value if no automation is in place to alert systems administrators to failures. The Kubernetes ecosystem offers many solutions to meet this challenge. The table below contains some of the solutions available for logging and alerting.
|Thanos||Enables teams to deploy Prometheus at scale, giving a view of all cluster metrics. No size limitation; can store unlimited amounts of data. Stores data in any object storage. Queries the same data at high speed.|
|Splunk||An event logging system. You can set it to watch for certain items in the logs and raise alarms, or to monitor events, e.g. memory usage exceeding 90%.|
|LogDNA||LogDNA is a centralized log management solution that helps DevOps teams be more productive. It allows you to collect, monitor, parse, live tail, graph, and analyze logs with clear visualizations and smart alerting.|
|Pandora||This Kubernetes plugin allows teams to obtain data from a Kubernetes cluster, generate agents for each of its elements, and monitor statistics. You need to install metrics-server if you wish to collect metrics for your cluster.|
Just like any other system, if Kubernetes is not configured correctly, it can cause many headaches for administrators and open many security holes. The Kubernetes ecosystem covers us well in this area too. The four primary areas of Kubernetes security are code, container, cluster, and cloud (known as the 4C’s of cloud native security). https://kubernetes.io/docs/concepts/security/overview/
Cloud security means ensuring that the infrastructure where your Kubernetes cluster runs is secure. Cluster security concerns the components that manage your cluster and the applications running on it: they must be secured, protected from attacks, and prevented from harming each other. Container security primarily means ensuring that the image build process checks for container vulnerabilities, that images are signed by a trusted authority, that containers do not run as users with more privileges than they actually need, and that the container runtime is hardened.
There are over 45 different products and services that address these issues. The table below lists some of them.
|Falco||CNCF cloud native runtime security. Detects unexpected application behavior. Generates alerts on threats at runtime.|
|InSpec by Chef||An open-source testing framework that lets you specify compliance, security, and policy requirements. Helps detect vulnerabilities in the cluster.|
|Prisma by Palo Alto Networks||Provides a large set of customizable checks covering config, communication, and more to ensure that the cluster is compliant. Full-stack protection with machine learning-powered runtime protection. Integrates vulnerability and compliance scanning directly into the CI/CD workflow and implements policies via DevOps plugins.|
|Kube Bench||Open-source application written in Go. Runs benchmark tests on the cluster to ensure it meets specific requirements. Easy to install: run the binary or install it as a pod.|
|Sysdig Secure||Secures applications through the life of containers. Machine learning-based protection and image scanning. Enforces compliance and identifies vulnerabilities.|
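The container security point about excess privileges can be illustrated with a small check. The sketch below is a simplified illustration and not how any of the tools above work internally: it scans a pod manifest (as a Python dictionary) for containers that request privileged mode or that are not forced to run as a non-root user. The function name `audit_pod` and the sample pod are hypothetical; the `securityContext` fields `privileged` and `runAsNonRoot` are standard Kubernetes pod spec fields.

```python
def audit_pod(pod):
    """Return (container name, finding) pairs for risky security settings."""
    findings = []
    spec = pod.get("spec", {})
    pod_ctx = spec.get("securityContext", {})
    for container in spec.get("containers", []):
        ctx = container.get("securityContext", {})
        # Privileged containers get broad access to the host.
        if ctx.get("privileged", False):
            findings.append((container["name"], "runs privileged"))
        # A container-level setting overrides the pod-level one.
        non_root = ctx.get("runAsNonRoot", pod_ctx.get("runAsNonRoot", False))
        if not non_root:
            findings.append((container["name"], "may run as root"))
    return findings


# Hypothetical pod with one risky container and one hardened container.
sample_pod = {
    "apiVersion": "v1",
    "kind": "Pod",
    "metadata": {"name": "demo"},
    "spec": {
        "containers": [
            {"name": "web", "image": "example/web:1.0",
             "securityContext": {"privileged": True}},
            {"name": "sidecar", "image": "example/sidecar:1.0",
             "securityContext": {"runAsNonRoot": True}},
        ]
    },
}

findings = audit_pod(sample_pod)
for name, problem in findings:
    print(f"{name}: {problem}")
```

A check like this could run in a CI pipeline before manifests are applied, while runtime tools cover behavior that only shows up after deployment.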
Once you have the components of the cluster installed and ready, you need mechanisms in place to ensure proper communication between the components, to manage them, and to communicate with components outside of the cluster. In this section we talk about some of these items.
Ingress is an API object that manages external access to the applications in a cluster; that is, it lets someone outside the cluster reach applications running within it. Ingress and service proxies facilitate this. A Kubernetes Ingress resource exposes HTTP and HTTPS routes from outside the cluster to services within the cluster.
|kube-proxy||kube-proxy is a network proxy that runs on each node in your cluster, implementing part of the Kubernetes Service concept. It maintains network rules on the nodes.|
|Traefik proxy||Traefik is an open-source edge router that makes publishing your services a fun and easy experience. It is a modern reverse proxy and load balancer that makes deploying microservices easy. Traefik is a dynamic load balancer designed for ease of configuration, especially in dynamic environments.|
|Citrix||Accelerates application performance, enhances application availability with advanced L4-7 load balancing, secures mission-critical apps from attacks, and lowers server expenses by offloading computationally intensive tasks. Available as a high-performance network appliance or a virtual appliance.|
|Istio||Service networking layer that provides a transparent and language-independent way to flexibly and easily automate application network functions. Fine-grained control of traffic behavior with rich routing rules, retries, failovers, and fault injection. Automatic metrics, logs, and traces for all traffic within a cluster, including cluster ingress and egress.|
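As an illustration of how an Ingress resource maps external HTTP routes to in-cluster services, the following minimal sketch builds an Ingress manifest as a Python dictionary and prints it as JSON. The helper `ingress`, the host `shop.example.com`, and the service names and ports are hypothetical. The manifest uses the `networking.k8s.io/v1` API, which assumes a cluster running Kubernetes 1.19 or later, and an ingress controller must be installed for it to have any effect.

```python
import json


def ingress(name, host, routes):
    """Build an Ingress; routes maps a URL path prefix to (service, port)."""
    paths = [
        {
            "path": prefix,
            "pathType": "Prefix",
            "backend": {"service": {"name": service, "port": {"number": port}}},
        }
        for prefix, (service, port) in routes.items()
    ]
    return {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "Ingress",
        "metadata": {"name": name},
        "spec": {"rules": [{"host": host, "http": {"paths": paths}}]},
    }


# Hypothetical host and services, for illustration only.
shop_ingress = ingress(
    name="shop",
    host="shop.example.com",
    routes={"/cart": ("cart-service", 8080), "/": ("web-service", 80)},
)
print(json.dumps(shop_ingress, indent=2))
```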
An API gateway acts as a single entry point for services/applications in your cluster and ensures security and reliability. The primary function of the API gateway is to provide a single, consistent entry point for multiple APIs, regardless of how they are implemented or deployed at the backend.
|NGINX||A high-performance, lightweight reverse proxy and load balancer. NGINX has the advanced HTTP processing capabilities needed for handling API traffic, which makes it ideal for building an API gateway. NGINX can also work as a reverse proxy, load balancer, and web server, so if you already have it installed you don’t need to deploy a separate API gateway.|
|Traefik||Provides key capabilities such as API security, traffic management, and observability. It enables security policies, adding user authentication and authorization, while also accelerating client requests through caching and traffic shaping, and it runs on any infrastructure.|
|Envoy||Originally built at Lyft, Envoy is a high-performance C++ distributed proxy designed for single services and applications, as well as a communication bus and “universal data plane” designed for large microservice “service mesh” architectures. Envoy is a self-contained, high-performance server with a small memory footprint. It runs alongside any application language or framework.|
|Citrix||Enforces authentication policies. Rate limits access to services. Advanced content routing. Enforces web application firewall policies.|
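The "single entry point" idea behind an API gateway comes down to mapping each incoming request path to the right backend. The sketch below shows one common way to do that, longest-prefix matching, in plain Python. It is a conceptual illustration only and does not reflect how NGINX, Traefik, Envoy, or Citrix implement routing; the route table and backend URLs are hypothetical.

```python
# Hypothetical route table: path prefix -> backend service URL.
ROUTES = {
    "/orders": "http://orders.demo.svc:8080",
    "/users": "http://users.demo.svc:8080",
    "/": "http://web.demo.svc:80",
}


def resolve(path, routes):
    """Return the backend for the longest prefix matching path, or None."""
    matches = [
        prefix
        for prefix in routes
        # Match the prefix exactly or on a path-segment boundary,
        # so "/orders" does not capture "/ordersX".
        if path == prefix or path.startswith(prefix.rstrip("/") + "/")
    ]
    if not matches:
        return None
    return routes[max(matches, key=len)]


for request_path in ("/orders/42", "/users", "/ordersX", "/about"):
    print(request_path, "->", resolve(request_path, ROUTES))
```

A real gateway layers authentication, rate limiting, and similar policies on top of this routing step, which is why a dedicated product is usually preferred over hand-written routing.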
The job of a service mesh is to make it easier to connect, manage, and secure traffic between microservices running within the cluster.
|Istio||A modernized service networking layer that provides a transparent and language-independent way to flexibly and easily automate application network functions. It is a popular solution for managing the different microservices that make up a cloud-native application.|
|Consul by HashiCorp||Helps with service discovery, as clients can register as a service using DNS or HTTP. Consul can generate and distribute TLS certificates for services to establish mutual TLS connections, enabling secure communication. Consul supports multiple datacenters, so users don’t need to worry about building additional layers of abstraction to grow to multiple regions.|
|Meshery||A mesh manager that supports many service meshes. A separate adapter is used for each mesh, so the functionality available is specific to the connected mesh.|
|Open Service Mesh||Open source service mesh developed by Microsoft. Written in the Go programming language and designed to be a reference implementation of the Service Mesh Interface (SMI) specification, a standard interface for service meshes on Kubernetes.|
Registries and repositories store images of the applications that need to run on your Kubernetes cluster. These registries can be public or private. You must have access to one of these to ensure that your cluster can download an image of the application you wish to launch. There are multiple options for registries.
|Docker Registry||The Registry is a stateless, highly scalable server-side application that stores and lets you distribute Docker images. It is open source.|
|Azure Registry||Container image registry by Microsoft Azure. It handles private Docker container images as well as related content. Can easily scale globally by using geo-replication.|
|Amazon ECR||A fully managed container registry that makes it easy to store, manage, share, and deploy your container images and artifacts. It works with EKS, ECS, and AWS Lambda, simplifying your development-to-production workflow.|
|Google Container Registry||Container Registry is a single place for your team to manage Docker images, perform vulnerability analysis, and decide who can access what with fine-grained access control. Existing CI/CD integrations let you set up fully automated Docker pipelines to get fast feedback.|
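To show where the registry fits in, here is a minimal sketch of a pod manifest that pulls its image from a private registry. The helper names, the registry host `registry.example.com`, the repository, the tag, and the secret name `regcred` are hypothetical; `imagePullSecrets` is the standard pod spec field that names a secret holding registry credentials, and that secret is assumed to already exist in the namespace.

```python
import json


def image_reference(registry, repository, tag):
    """Compose a full image reference of the form registry/repository:tag."""
    return f"{registry}/{repository}:{tag}"


def pod_with_private_image(name, image, pull_secret):
    """Build a Pod that pulls its image using the named registry secret."""
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name},
        "spec": {
            "containers": [{"name": name, "image": image}],
            # Secret containing the registry credentials (assumed to exist).
            "imagePullSecrets": [{"name": pull_secret}],
        },
    }


# Hypothetical registry, repository, and secret names.
image = image_reference("registry.example.com", "team/billing", "1.4.2")
pod = pod_with_private_image("billing", image, "regcred")
print(json.dumps(pod, indent=2))
```

For a public registry the `imagePullSecrets` entry can simply be left out of the manifest.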
Your developers and DevOps teams need tools to manage your application code, deploy the applications when the code changes, and ensure the stability of this integration and deployment process. This is made possible by many CI/CD tools, some of which are listed here for your reference.
|Helm (K8s package manager)||Helps manage Kubernetes applications. Use Helm charts to define, install, and upgrade Kubernetes applications.|
|Jenkins||Free and open source tool that helps automate the parts of software development related to building, testing, and deploying. Easy to install and configure, and extensible.|
|GitLab||GitLab is a complete DevOps platform, delivered as a single application. A Git repository manager providing wiki, issue tracking, and continuous integration and deployment pipeline features. Both open source and enterprise versions are available.|
|CircleCI||Automatically cancels any queued or running builds on a branch (except your default branch) when a newer build is triggered on that same branch, so you do not need to wait for a build to finish if you have already improved upon it. Automatically splits and balances your tests across multiple containers to reduce your overall build time. Parallelism speeds up test and build run times, letting you get faster feedback.|
|Weave Cloud||Automatically enforces policies in Kubernetes. Deploy, Explore, and Monitor features assist you in your job as a developer responsible for delivering a cloud native app. Follows an ABCDE approach to app development: Apps are developed and tested locally; Built and tested in your CI system; Container images are pushed to your registry and then automatically Deployed to the cloud Environment of your choice.|
|Codefresh||Codefresh is designed to be the fastest CI/CD platform available. It is built on Kubernetes for speed and scalability. Advanced distributed caching of images, layers, source code, dependencies, and more results in faster builds.|
Service Catalog is an extension API that enables applications running in Kubernetes clusters to easily use external managed software offerings, such as a datastore service offered by a cloud provider or streaming and messaging services. Infrastructure applications are all of your containerized business applications that would be running within your cluster to support your business.
|Apache Hadoop||Apache Hadoop is a collection of open-source software utilities that facilitates using a network of many computers to solve problems involving massive amounts of data and computation. It provides a software framework for distributed storage and processing of big data using the MapReduce programming model.|
|Cassandra||Apache Cassandra is a free and open-source, distributed, wide-column store, NoSQL database management system designed to handle large amounts of data across many commodity servers, providing high availability with no single point of failure. Cassandra outperforms many other database systems, is decentralized, and is scalable.|
|IBM DB2||Database solution by IBM. Enterprise version only. The IBM BLU Acceleration feature provides better performance. Db2 protects against data loss from partial and complete site failures by replicating data changes from a source database.|
|PostgreSQL||PostgreSQL is a powerful, open source object-relational database system. It comes with many features that help developers build applications and help administrators protect data integrity and build fault-tolerant environments. PostgreSQL is highly extensible: for example, you can define your own data types, build custom functions, and even write code in different programming languages without recompiling your database.|
|Oracle||One of the best-known database solutions. Only available as an enterprise solution.|
|MySQL||Many of the world’s largest and fastest-growing organizations, including Facebook, Google, Adobe, Alcatel Lucent, and Zappos, rely on MySQL to save time and money powering their high-volume web sites, business-critical systems, and packaged software. Provides high availability through redundancy, consistency, and automatic fault detection and resolution. No single point of failure.|
|MongoDB||Available as a free community edition and a licensed version. MongoDB is a scalable, flexible NoSQL document database platform. Provides developers with a number of useful out-of-the-box capabilities.|
|Apache Spark||Free to use. Provides fast processing of large data sets. Can process real-time streaming data. Supports complex aggregates and analytics.|
|Kafka||Open source and free to use. Apache Kafka is an event streaming platform. It is a distributed system consisting of servers and clients that communicate via a high-performance TCP network protocol. It runs as a cluster of one or more servers that can span multiple datacenters or cloud regions.|
|Google Cloud Dataflow||A fully managed streaming analytics service that minimizes latency, processing time, and cost through autoscaling and batch processing. A serverless data processing service that runs jobs written using the Apache Beam libraries.|
|Amazon Kinesis||A fully managed, scalable, cloud-based AWS service that allows real-time processing of streaming data at scale. Kinesis Data Streams can be used to collect log and event data from sources such as servers, desktops, and mobile devices.|
We have only touched on the Kubernetes landscape at a very high level. This vast landscape can be a bit overwhelming for many, but fear not: there is a lot of help available within the CNCF community to help you navigate it in further detail.
Depending on your business needs and expertise of your IT DevOps teams, you can choose to take either one of the below routes to obtain a complete production ready Kubernetes environment: