Kubernetes Landscape Guide

Written by admin

Kubernetes has taken center stage in the world of container orchestration in a relatively short time. Originally developed by Google, open sourced in 2014, and later donated to the CNCF (where it became the first project to graduate, in 2018), it is one of the world’s hottest open source projects. This popularity is due to a strong developer community and a vast ecosystem, with 900+ products, services, and organizations adding value to Kubernetes in their own way. Navigating this huge ecosystem can be daunting for a beginner or even an experienced IT professional. This guide eases the challenge of making sense of what some of these products and services do for Kubernetes, and how they augment its functionality to turn it into a complete enterprise solution. The diagram below shows the set of essential components required for a working Kubernetes platform. We will elaborate on these components in this document.

In the sections below, we will touch upon each of the areas shown in the diagram above.

Kubernetes Platforms (core product)

There are 149 Kubernetes platforms available in the market at the time of this writing (March 2021). Some of them are hosted solutions (48), some are installers (20), and others are CNCF certified distributions (67). Here we will list some of the major players in the industry in each of the above categories.

Hosted Solutions:

Also known as “managed solutions”, hosted Kubernetes solutions are run for you by cloud providers, including some of the largest. The examples below give a sense of what the hosted Kubernetes world offers.

Name Salient features
OpenShift Dedicated (by Red Hat)
  • Comprehensive and easy to get started
  • Fully managed – never worry about patching or upgrading
  • High availability – multiple masters
AKS Engine for Azure Stack (by Microsoft)
  • Fully managed control plane; you only manage the nodes
  • Integrates with Azure Active Directory for security
  • Azure Monitor helps you monitor the cluster
  • GPU-enabled nodes for compute-intensive and graphical workloads
Oracle Container Engine
  • Easy to create with complete REST API and CLI
  • Provides DevOps automation by streamlining development and operations of Kubernetes clusters
  • Easy administration and automatic upgrades
Alibaba Cloud Container for Kubernetes
  • Leading Kubernetes services provider for the Asia-Pacific region
  • Clusters are easily created from the console
  • Supports canary releases and blue-green releases
Google Kubernetes Engine
  • Quick and easy start
  • Secure with cloud monitoring enabled
  • High-Availability control plane – multi-zone/regional clusters
  • Autopilot mode – a nodeless experience: pay for workloads only, without worrying about provisioning nodes
EKS (by AWS)
  • Managed control plane – high availability
  • Provides both serverless (Fargate) and EC2-based clusters
  • Network (VPC) and access security (AWS IAM – RBAC)
  • AWS Outposts is the on-premises cluster offering

 

Installers:

The primary goal of an installer is to make it easy for an administrator to deploy Kubernetes on the desired platform. In most cases you only need a container runtime (Docker or another) and the machines on which to install the Kubernetes cluster.

Name Salient features
Rancher
  • Rancher Kubernetes Engine (RKE) is an extremely simple, lightning-fast Kubernetes installer that works everywhere.
  • It works on bare-metal and virtualized servers.
  • It runs only with the Docker runtime.
kops
  • Kubernetes Operations (kops) can be thought of as kubectl for clusters
  • Helps create, destroy, upgrade, and maintain production-grade, highly available clusters
  • Can also provision the necessary cloud infrastructure (e.g. on AWS)
kubeadm
  • Creates a minimum viable Kubernetes cluster that conforms to best practices
  • A simple way for you to try out Kubernetes, possibly for the first time
  • A building block in other ecosystem and/or installer tools with a larger scope
minikube
  • Sets up a local Kubernetes cluster on Linux, macOS, and Windows
  • The focus of minikube is to make it easy to learn and develop for Kubernetes
  • You only need a container or virtual machine manager (such as Docker or VirtualBox) to install and run minikube
CableLabs
  • An installation tool for setting up Kubernetes on Linux machines that have been initialized with SNAPS-Boot

 

CNCF Certified Distributions:

The CNCF ensures that Certified Kubernetes products deliver consistency and portability; 67 Certified Kubernetes distributions and platforms are now available. We describe some of them below.

Name Salient features
Agile Stack
  • A DevOps platform that provides automation for cloud infrastructure, applications, and security.
  • Auto-generates infrastructure-as-code scripts, significantly reducing the effort to integrate DevOps tools such as Docker, Jenkins, Spinnaker, PostgreSQL, Git, and Okta into the stack
EKS Distro (by AWS)
  • An open source distribution of Kubernetes.
  • It includes binaries and containers of open source Kubernetes, etcd, networking, storage plugins, all tested for compatibility
Cisco Container Platform (CCP)
  • It is flexible and runs anywhere, with consistent deployment across hyperconverged infrastructure, VMs, bare metal, public, and private cloud.
  • Automates installation of Kubernetes and Docker, installing analytics tools, creating clusters, load balancing, curating the OS, and even updating the distribution
VMware Tanzu
  • Simplifies Kubernetes for developers by running the same version across data centers and public clouds, giving all development teams a consistent, secure experience.
  • Can run Tanzu Kubernetes Grid, supported by VMware, with Tanzu Kubernetes Clusters on the IaaS of your choice
Weaveworks Kubernetes
  • Rapidly create, update and manage production ready application clusters with all of the add-ons needed for an agile cloud native platform with a single click
  • Minimize operations overhead with automated cluster lifecycle management: upgrades, security patches, and cluster extension updates

 

Critical Add-ons

From a long list of products and services in the Kubernetes ecosystem, here we list some of the critical add-ons that are a must for almost every Kubernetes implementation.

Networking 

Networking is a central piece of any Kubernetes cluster. The fundamental requirements for a Kubernetes networking implementation are: all pods can communicate with each other across the cluster, Kubernetes agents on a node can communicate with all pods on that node, and pods can communicate without the use of NAT. Some of the most widely used networking solutions are listed below.

Name Salient features
Container Network Interface (CNI)
  • When considering a container network solution, make sure you choose one that adheres to the CNI specification
Antrea (by VMware)
  • Kubernetes networking based on Open vSwitch, a high-performance programmable virtual switch, which allows implementation of an extensive set of networking and security features.
  • Allows implementation of Kubernetes Network Policies
AWS VPC CNI, Azure CNI, Google GCE, etc.

  • Most major public cloud providers implement their own pod networking solutions on the hosted clusters they offer.
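To make the Network Policies bullet above more concrete, here is a minimal sketch in Python that builds a NetworkPolicy manifest as a plain dictionary and prints it as JSON, which kubectl accepts in place of YAML. The names, namespace, labels, and port (`db-allow-api`, `demo`, `app=db`, `app=api`, 5432) are hypothetical values chosen for illustration, and the policy would only take effect on a cluster whose network plugin enforces Network Policies.

```python
import json


def allow_ingress_policy(name, namespace, target_labels, source_labels, port):
    """Build a NetworkPolicy that lets only pods matching source_labels reach
    pods matching target_labels on one TCP port (all names are illustrative)."""
    return {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "NetworkPolicy",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            # Pods this policy applies to.
            "podSelector": {"matchLabels": target_labels},
            "policyTypes": ["Ingress"],
            "ingress": [
                {
                    # Only traffic from these pods is allowed in.
                    "from": [{"podSelector": {"matchLabels": source_labels}}],
                    "ports": [{"protocol": "TCP", "port": port}],
                }
            ],
        },
    }


# Hypothetical example: only the API pods may talk to the database pods.
policy = allow_ingress_policy(
    "db-allow-api", "demo", {"app": "db"}, {"app": "api"}, 5432
)
print(json.dumps(policy, indent=2))
```

Saving the printed JSON to a file and passing it to `kubectl apply -f` would be the usual next step.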

 

Storage 

Storage in Kubernetes is implemented through one of the provisioners in the ecosystem. Because storage is provisioned through StorageClass objects, you can offer several types of storage in the same cluster. Each StorageClass names a provisioner, which can be any one of the following.

 

Name Salient features
AWS EBS
  • High Performance for demanding workloads
  • Options to attach volumes to EC2 R5b instances to get up to 60 Gbps bandwidth and 260K IOPS
  • Up to 99.9% durability
  • Virtually unlimited scale to petabytes of data
AzureFile/AzureDisk
  • Azure Disks are mounted as ReadWriteOnce, so a disk is available to a single node at a time
  • Azure Files can be shared across nodes and pods
  • Regional Data redundancy – store multiple copies of data
Longhorn
  • Cloud-native storage – Can run on public clouds and private clusters
  • Very easy to deploy
  • Automatic volume backup and restore
PortworxVolume
  • Best performance block storage for Kubernetes
  • On-Prem or across clouds
  • Fully integrated solution for persistent storage
OpenEBS
  • Open source block storage
  • Can work like local storage and use host drives
  • Simple to set up
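The relationship between a StorageClass, its provisioner, and the claims that use it can be sketched as follows. This is an illustrative Python snippet that builds the two manifests as dictionaries; the class name `fast-ebs`, the claim name `data`, the `gp3` volume type, and the provisioner string `ebs.csi.aws.com` (which assumes the AWS EBS CSI driver is installed) are assumptions for the example, and other provisioners use different names and parameters.

```python
import json


def storage_class(name, provisioner, parameters=None):
    """Build a StorageClass manifest that names the provisioner to use."""
    return {
        "apiVersion": "storage.k8s.io/v1",
        "kind": "StorageClass",
        "metadata": {"name": name},
        "provisioner": provisioner,
        # Provisioner-specific settings; keys differ between provisioners.
        "parameters": parameters or {},
        "reclaimPolicy": "Delete",
        "volumeBindingMode": "WaitForFirstConsumer",
    }


def volume_claim(name, storage_class_name, size):
    """Build a PersistentVolumeClaim that requests storage from a StorageClass."""
    return {
        "apiVersion": "v1",
        "kind": "PersistentVolumeClaim",
        "metadata": {"name": name},
        "spec": {
            "storageClassName": storage_class_name,
            "accessModes": ["ReadWriteOnce"],
            "resources": {"requests": {"storage": size}},
        },
    }


# Hypothetical example: one class backed by EBS and one 10 GiB claim against it.
fast = storage_class("fast-ebs", "ebs.csi.aws.com", {"type": "gp3"})
claim = volume_claim("data", fast["metadata"]["name"], "10Gi")
print(json.dumps([fast, claim], indent=2))
```

Defining a second StorageClass with a different provisioner is all it takes to offer another storage type side by side; each claim picks one by name.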

 

Key Ingredients for Production Kubernetes

You can set up a Kubernetes platform for both your development team and your production environment. For production, additional components let you manage, monitor, secure, and fortify the cluster.

Management and Monitoring

Unlike monoliths, where monitoring applications and hosts was enough, cloud-native applications running on Kubernetes have many components that require monitoring, such as nodes, containers, and microservices. The Kubernetes ecosystem provides many management and monitoring tools for a wide range of needs and situations.

Name Salient features
AWS CloudWatch
  • Installed on an EKS cluster as a DaemonSet
  • Collects cluster metrics from each node
Datadog
  • Datadog agents collect metrics, events, and logs
  • Collects metrics from container runtime as well
  • Automatically monitors cluster nodes
Grafana
  • Grafana can pull data from multiple sources and present it in a user-friendly manner (graphs, charts)
  • Connects with Prometheus, Datadog, and Splunk, to name a few
WeaveCloud/WeaveScope
  • Visualization and monitoring tool
  • Can be used in standalone mode (local hosts) or cluster hosts
  • Uses APIs to gather information from the nodes
fluentd
  • A popular choice for log collection
  • Supports log collection from multiple sources such as application logs, network protocols, and cloud APIs (e.g. AWS CloudWatch, AWS SQS)
Google Stackdriver
  • Log aggregation and monitoring solution
  • Stackdriver logs – Collects logs
  • Stackdriver Monitoring – Create charts and dashboards
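Several of the tools above run one agent per node, which in Kubernetes is expressed as a DaemonSet. The following Python sketch builds such a manifest as a dictionary; the agent name, the `monitoring` namespace, and the image `example.com/metrics-agent:1.0` are hypothetical placeholders, not the manifest of any specific product listed above.

```python
import json


def node_agent_daemonset(name, image, namespace="monitoring"):
    """Build a DaemonSet manifest so that one agent pod runs on every node."""
    labels = {"app": name}
    return {
        "apiVersion": "apps/v1",
        "kind": "DaemonSet",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            # The selector must match the labels on the pod template.
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {"containers": [{"name": name, "image": image}]},
            },
        },
    }


# Hypothetical agent image; a real vendor supplies its own manifest or chart.
agent = node_agent_daemonset("metrics-agent", "example.com/metrics-agent:1.0")
print(json.dumps(agent, indent=2))
```

Because the scheduler places one pod of a DaemonSet on each eligible node, newly added nodes get an agent without further action.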

 

Logging and Alerting

System logs can consume significant resources without providing much value if no automation is in place to alert system administrators to failures. The Kubernetes ecosystem offers many solutions to meet this challenge. The table below lists some of the solutions available for logging and alerting.

Name Salient features
Thanos
  • Enables teams to deploy Prometheus at scale, giving a single view of all cluster metrics with no practical limit on the amount of data stored
  • It stores data in any object storage
  • Query the same data at high speeds
Splunk
  • An event logging system. You can set it to watch for certain items in the logs and raise alarms or monitor events, e.g. when memory usage exceeds 90%.
LogDNA
  • LogDNA is a centralized log management solution that helps DevOps teams be more productive
  • It allows you to collect, monitor, parse, live tail, graph, and analyze logs with clear visualizations and smart alerting
Pandora
  • This Kubernetes plugin allows teams to obtain data from a Kubernetes cluster, generate agents for each of its elements, and monitor statistics
  • Requires metrics-server to be installed if you wish to collect cluster metrics
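The kind of rule mentioned above (raise an alarm when memory usage exceeds 90%) can be sketched in a few lines of Python. This is a toy illustration of threshold alerting, not the configuration syntax of any tool in the table; the `MemorySample` type, the node names, and the sample values are made up for the example.

```python
from dataclasses import dataclass


@dataclass
class MemorySample:
    node: str
    used_bytes: int
    total_bytes: int


def memory_alerts(samples, threshold=0.90):
    """Return one alert message per node whose memory usage exceeds threshold."""
    alerts = []
    for sample in samples:
        usage = sample.used_bytes / sample.total_bytes
        if usage > threshold:
            alerts.append(
                f"{sample.node}: memory usage {usage:.0%} exceeds {threshold:.0%}"
            )
    return alerts


# Hypothetical readings: node-a is above the threshold, node-b is not.
samples = [
    MemorySample("node-a", 15 * 2**30, 16 * 2**30),
    MemorySample("node-b", 8 * 2**30, 16 * 2**30),
]
alerts = memory_alerts(samples)
for line in alerts:
    print(line)
```

In a real deployment the samples would come from the metrics pipeline and the alert would be routed to a notification channel rather than printed.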

 

Security and Auditing/Compliance

Just like any other system, Kubernetes can cause administrators many headaches and open up many security issues if it is not configured correctly. The Kubernetes ecosystem covers us well in this area too. The four primary areas of Kubernetes security are code, container, cluster, and cloud (known as the 4C’s of cloud native security). See https://kubernetes.io/docs/concepts/security/overview/

Cloud security means ensuring that the infrastructure on which your Kubernetes cluster runs is secure. Cluster security concerns the components that manage your cluster and the applications running on it: both must be protected from attacks and prevented from harming each other. Container security is primarily about ensuring that the image build process checks for container vulnerabilities, that images are signed by a trusted authority, that users inside containers do not have more privileges than they actually need, and that the container runtime is hardened.

There are over 45 different products and services that address these issues. The table below lists some of them.

Name Salient features
Falco
  • CNCF cloud native runtime security
  • Detection of unexpected application behavior
  • Generates alerts on threats at runtime
InSpec (by Chef)

  • An open-source testing framework that lets you specify compliance, security, and policy requirements
  • Helps detect vulnerabilities in the cluster
Prisma (by Palo Alto Networks)
  • Provides a large set of customizable checks covering configuration, communication, and more to ensure that the cluster is compliant
  • Full-stack protection with machine learning-powered runtime protection
  • Integrates vulnerability and compliance scanning directly into the CI/CD workflow and enforces policies via DevOps plugins
Kube Bench
  • Open-Source application written in Go
  • Runs benchmark tests on the cluster to ensure it meets the specified requirements
  • Easy to install – run the binary or deploy it as a pod
Sysdig Secure
  • Secures applications throughout the container lifecycle
  • Machine learning based protection and image scanning
  • Enforces compliance, identifies vulnerabilities
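The container-level concern described earlier (containers should not run with more privileges than they need) lends itself to a small automated check. The Python sketch below inspects a pod spec, given as a dictionary, and reports risky settings. It is a simplified illustration and not a substitute for the tools in the table: it looks only at each container's own `securityContext` and ignores pod-level settings, and the container names are hypothetical.

```python
def container_security_findings(pod_spec):
    """Return human-readable findings for containers that run with more
    privileges than a hardened baseline allows."""
    findings = []
    for container in pod_spec.get("containers", []):
        name = container.get("name", "<unnamed>")
        ctx = container.get("securityContext", {})
        if ctx.get("privileged", False):
            findings.append(f"{name}: runs privileged")
        # Kubernetes allows privilege escalation unless it is explicitly disabled.
        if ctx.get("allowPrivilegeEscalation", True):
            findings.append(f"{name}: privilege escalation not disabled")
        if not ctx.get("runAsNonRoot", False):
            findings.append(f"{name}: may run as root")
    return findings


# Hypothetical pod specs for illustration.
hardened = {
    "containers": [
        {
            "name": "web",
            "securityContext": {
                "runAsNonRoot": True,
                "allowPrivilegeEscalation": False,
            },
        }
    ]
}
risky = {"containers": [{"name": "job", "securityContext": {"privileged": True}}]}

for finding in container_security_findings(risky):
    print(finding)
```

A check like this could run in a CI pipeline before manifests are applied, so that over-privileged containers are caught before they reach the cluster.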

 

Putting the Cluster to Work

Once the components of the cluster are installed and ready, you need mechanisms that ensure proper communication between those components, let you manage them, and let them communicate with components outside the cluster. This section covers some of these items.

Ingress and service proxies

Ingress is an API object that manages external access to the applications in a cluster; that is, it lets someone outside the cluster reach applications running inside it. Ingress controllers and service proxies make this possible. An Ingress resource exposes HTTP and HTTPS routes from outside the cluster to services within the cluster.

Name Salient features
NGINX Controller
kube-proxy
  • kube-proxy is a network proxy that runs on each node in your cluster, implementing part of the Kubernetes Service concept
  • It maintains network rules on the nodes
Traefik proxy
  • Traefik is an open-source Edge Router that makes publishing your services a fun and easy experience.
  • It is a modern reverse proxy and load balancer that makes deploying microservices easy.
  • Traefik is a dynamic load balancer designed for ease of configuration, especially in dynamic environments
Citrix
  • It accelerates application performance, enhances application availability with advanced L4-7 load balancing, secures mission-critical apps from attacks and lowers server expenses by offloading computationally intensive tasks
  • Available as high performance network appliance or virtual appliance
Istio
  • Service networking layer that provides a transparent and language-independent way to flexibly and easily automate application network functions
  • Fine-grained control of traffic behavior with rich routing rules, retries, failovers, and fault injection
  • Automatic metrics, logs, and traces for all traffic within a cluster, including cluster ingress and egress.
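As a rough illustration of how an Ingress resource maps external HTTP routes to services, the Python sketch below builds an Ingress manifest as a dictionary. The host `shop.example.com`, the service names and ports, and the ingress class `nginx` (which assumes an ingress controller registered under that class name is installed) are hypothetical.

```python
import json


def http_ingress(name, host, routes, ingress_class="nginx"):
    """Build an Ingress manifest.

    routes maps a URL path prefix to a (service name, service port) pair.
    """
    paths = [
        {
            "path": path,
            "pathType": "Prefix",
            "backend": {"service": {"name": service, "port": {"number": port}}},
        }
        for path, (service, port) in routes.items()
    ]
    return {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "Ingress",
        "metadata": {"name": name},
        "spec": {
            # Selects which installed ingress controller handles this resource.
            "ingressClassName": ingress_class,
            "rules": [{"host": host, "http": {"paths": paths}}],
        },
    }


# Hypothetical example: two services published under one host name.
ingress = http_ingress(
    "shop",
    "shop.example.com",
    {"/": ("storefront", 80), "/api": ("orders-api", 8080)},
)
print(json.dumps(ingress, indent=2))
```

The resource by itself does nothing: an ingress controller, such as those listed above, has to be running in the cluster to act on it.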

 

API Gateways

An API gateway acts as a single entry point for services/applications in your cluster and ensures security and reliability. The primary function of the API gateway is to provide a single, consistent entry point for multiple APIs, regardless of how they are implemented or deployed at the backend.

Name Salient features
NGINX
  • A high-performance, lightweight reverse proxy and load balancer, NGINX has the advanced HTTP processing capabilities needed for handling API traffic, which makes it ideal for building an API gateway
  • NGINX can also work as a reverse proxy, load balancer, and web server, so if you already have it installed you do not need to deploy a separate API gateway.
Traefik
  • It provides key capabilities such as API security, traffic management, and observability
  • It enables security policies, adding user authentication and authorization, while also accelerating client requests through caching and traffic shaping, and it runs on any infrastructure.
Envoy
  • Originally built at Lyft, Envoy is a high performance C++ distributed proxy designed for single services and applications, as well as a communication bus and “universal data plane” designed for large microservice “service mesh” architectures
  • Envoy is a self contained, high performance server with a small memory footprint. It runs alongside any application language or framework
Citrix
  • Enforces authentication policies
  • Rate limits access to services
  • Advanced content routing
  • Enforces web application firewall policies
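The "single entry point" idea behind an API gateway can be illustrated with a toy router: one function receives every request path and decides which backend should handle it. The Python sketch below uses longest-prefix matching; the route table and backend URLs are hypothetical, and real gateways layer authentication, rate limiting, and other policies on top of this routing step.

```python
def pick_backend(path, routes):
    """Return the backend for the longest route prefix that matches path,
    or None when no route matches."""
    best = None
    for prefix, backend in routes.items():
        # Match the prefix exactly or on a path-segment boundary.
        matches = path == prefix or path.startswith(prefix.rstrip("/") + "/")
        if matches and (best is None or len(prefix) > len(best[0])):
            best = (prefix, backend)
    return best[1] if best else None


# Hypothetical route table: several APIs behind one entry point.
routes = {
    "/orders": "http://orders.internal:8080",
    "/orders/export": "http://reports.internal:8080",
    "/users": "http://users.internal:8080",
}

for path in ["/orders/42", "/orders/export/csv", "/users", "/unknown"]:
    print(path, "->", pick_backend(path, routes))
```

Clients only ever see the gateway address, so backends can be moved or reimplemented without changing the public API.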

 

Service Mesh

The job of a service mesh is to make it easier to connect, manage, and secure traffic between microservices running within the cluster.

Name Salient features
Istio
  • A modernized service networking layer that provides a transparent and language-independent way to flexibly and easily automate application network functions
  • It is a popular solution for managing the different microservices that make up a cloud-native application
Consul (by HashiCorp)
  • It helps with service discovery: clients can register services and discover them using DNS or HTTP
  • Consul can generate and distribute TLS certificates for services to establish mutual TLS connections thus enabling secure communication
  • Consul supports multiple datacenters, i.e. users don’t need to worry about building additional layers of abstraction to grow to multiple regions
Meshery
  • It is a mesh manager that supports many service meshes.
  • A separate adapter is used for each mesh, so the functionality available depends on the connected mesh
Open Service Mesh
  • Open source service mesh developed by Microsoft
  • Written in the Go programming language and designed to be a reference implementation of the Service Mesh Interface (SMI) specification, a standard interface for service meshes on Kubernetes

 

Registries and Repositories

Registries and repositories store images of the applications that need to run on your Kubernetes cluster. These registries can be public or private. You must have access to one of these to ensure that your cluster can download an image of the application you wish to launch. There are multiple options for registries.

Name Salient features
Docker Registry
  • The Registry is a stateless, highly scalable server side application that stores and lets you distribute Docker images
  • It is open source
Azure Container Registry
  • Container image registry by Microsoft Azure
  • It handles private Docker container images as well as related content
  • Scales globally with ease by using geo-replication
Amazon ECR
  • A fully managed container registry that makes it easy to store, manage, share, and deploy your container images and artifacts
  • It works with EKS, ECS, and AWS Lambda, simplifying your development to production workflow
Google Container Registry
  • Container Registry is a single place for your team to manage Docker images, perform vulnerability analysis, and decide who can access what with fine-grained access control
  • Existing CI/CD integrations let you set up fully automated Docker pipelines to get fast feedback

 

CI/CD and application development tools

Your developers and DevOps teams need tools to manage your application code, deploy the applications when the code changes, and ensure the stability of this integration and deployment process. This is made possible by many CI/CD tools, some of which are listed here for your reference.

Name Salient features
Helm (K8s package manager)
  • Helps manage Kubernetes applications.
  • Use Helm charts to define, install, and upgrade Kubernetes applications
Jenkins
  • Free and open source tool that helps automate the parts of software development related to building, testing and deploying.
  • Easy to install and configure, and extensible
GitLab
  • GitLab is a complete DevOps platform, delivered as a single application
  • A Git-repository manager providing wiki, issue-tracking and continuous integration and deployment pipeline features
  • Both open source and enterprise versions available
CircleCI
  • Automatically cancels any queued or running builds on a branch (except on your default branch) when a newer build is triggered on that same branch. You do not need to wait for a build to finish if you have already improved upon it
  • Automatically split and balance your tests across multiple containers to reduce your overall build time. Parallelism speeds up test and build run times, letting you get faster feedback
Weave Cloud
  • Automatically enforces policies in Kubernetes
  • Its Deploy, Explore, and Monitor features assist developers responsible for delivering a cloud native app
  • Follows an ABCDE approach to app development: Apps are developed and tested locally, then Built and tested in your CI system; Container images are pushed to your registry and then automatically Deployed to the cloud Environment of your choice
Codefresh
  • Codefresh is designed to be the fastest CI/CD platform available. It’s built on Kubernetes for fast speed and unlimited scalability
  • Advanced distributed caching allows it to be even faster because it caches images, layers, source code, dependencies and more, which results in faster builds
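The idea of splitting and balancing tests across containers, mentioned for CircleCI above, can be sketched with a simple greedy scheme: assign the slowest tests first, always to the container with the least work so far. This Python snippet is an illustration of the general technique under the assumption that past test durations are known; it is not a description of how any particular CI product implements it, and the test names and timings are made up.

```python
import heapq


def split_tests(durations, containers):
    """Greedily assign tests to containers, slowest first, always giving the
    next test to the container with the least total time so far."""
    heap = [(0.0, index) for index in range(containers)]
    buckets = [[] for _ in range(containers)]
    by_slowest = sorted(durations.items(), key=lambda item: item[1], reverse=True)
    for test, seconds in by_slowest:
        total, index = heapq.heappop(heap)
        buckets[index].append(test)
        heapq.heappush(heap, (total + seconds, index))
    return buckets


# Hypothetical timings in seconds from a previous run.
durations = {"test_api": 40, "test_db": 30, "test_ui": 20, "test_auth": 10}
buckets = split_tests(durations, 2)
for index, bucket in enumerate(buckets):
    print(index, bucket, sum(durations[test] for test in bucket))
```

With balanced buckets the wall-clock time of the suite approaches the total duration divided by the number of containers, which is where the faster feedback comes from.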

 

Service Catalogs and infrastructure apps

Service Catalog is an extension API that enables applications running in Kubernetes clusters to easily use external managed software offerings, such as a datastore service offered by a cloud provider or streaming and messaging services. Infrastructure applications are all of your containerized business applications that would be running within your cluster to support your business.

Name Salient features
Apache Hadoop
  • Apache Hadoop is a collection of open-source software utilities that facilitates using a network of many computers to solve problems involving massive amounts of data and computation
  • It provides a software framework for distributed storage and processing of big data using the MapReduce programming model
Cassandra
  • Apache Cassandra is a free and open-source, distributed, wide-column store, NoSQL database management system designed to handle large amounts of data across many commodity servers, providing high availability with no single point of failure
  • Cassandra outperforms many other database systems, is decentralized with no single point of failure, and is scalable
IBM Db2
  • Database solution by IBM. Enterprise version only.
  • The IBM BLU Acceleration feature provides better performance
  • Db2 protects against data loss from partial and complete site failures by replicating data changes from a source database
PostgreSQL
  • PostgreSQL is a powerful, open source object-relational database system
  • PostgreSQL comes with many features that help developers build applications and help administrators protect data integrity and build fault-tolerant environments
  • PostgreSQL is highly extensible. For example, you can define your own data types, build custom functions, and even write code in different programming languages without recompiling your database
Oracle
  • One of the best-known database solutions.
  • Only available as an enterprise solution
MySQL
  • Many of the world’s largest and fastest-growing organizations including Facebook, Google, Adobe, Alcatel Lucent and Zappos rely on MySQL to save time and money powering their high-volume Web sites, business-critical systems and packaged software
  • Provides high availability through redundancy, consistency, and automatic fault detection and resolution. No single point of failure
MongoDB
  • Available as free community edition and licensed version
  • MongoDB is a scalable, flexible NoSQL document database platform
  • Provides developers with a number of useful out-of-the-box capabilities
Apache Spark
  • Free to use
  • Provides fast processing speeds for large data processing
  • Can process real time streaming data
  • Provides support for complex aggregates and analytics
Kafka
  • Open source and free to use. It is a distributed system consisting of servers and clients that communicate via a high-performance TCP network protocol
  • It runs as a cluster of one or more servers that can span multiple datacenters or cloud regions
  • Apache Kafka is an event streaming platform
Google Cloud Dataflow
  • A fully managed streaming analytics service that minimizes latency, processing time, and cost through autoscaling and batch processing
  • A serverless data processing service that runs jobs written using the Apache Beam libraries
Amazon Kinesis
  • A fully managed, scalable, cloud-based AWS service that allows real-time processing of streaming data at scale
  • Kinesis Data Streams can be used to collect log and event data from sources such as servers, desktops, and mobile devices

 

Conclusion

We have only touched on the Kubernetes landscape at a very high level. This vast landscape can be a bit overwhelming, but plenty of help is available within the CNCF community to navigate it in further detail.

Depending on your business needs and the expertise of your IT and DevOps teams, you can take one of the routes below to obtain a complete, production-ready Kubernetes environment:

  • Hosted Kubernetes solutions – easy setup and secure, but very expensive, with little control over how they work (AWS, Google, IBM, Red Hat, etc.)
  • Install your own Kubernetes – extremely complicated setup that requires a team of experts to create and manage; very expensive, but gives full control over how it works (self-managed)
  • On-prem/private-cloud Kubernetes devices – ready to go, secure, and very inexpensive (High Plains Computing, etc.)