Kubernetes Architecture

What is Kubernetes?

Kubernetes is an open-source container orchestration platform that automates the deployment, scaling, and management of containerized applications. It was originally designed by Google and is now maintained by the Cloud Native Computing Foundation (CNCF).

Kubernetes is often referred to as "k8s" for short. The name "Kubernetes" comes from the Greek word for "helmsman" or "pilot," and the "8" in "k8s" refers to the eight letters between the "K" and the "s" in the name. The name was chosen because it represents the idea of a platform that helps guide and manage the deployment of applications, much like a helmsman guides a ship.

The "k8s" abbreviation was first used by the Kubernetes community as a way to shorten the name of the project and make it easier to reference in conversation and in code. It's now widely used by developers, engineers, and other professionals who work with Kubernetes on a regular basis.

What are the benefits of using k8s?

There are many benefits to using Kubernetes (k8s) for container orchestration, particularly for organizations and developers looking to build and maintain scalable, resilient, and portable applications. Here are some of the key benefits:

  • Containerization: Kubernetes leverages containerization technology, such as Docker, to encapsulate applications and their dependencies into isolated, lightweight units called containers. Containers offer several advantages, including improved resource utilization, easy application packaging and consistent behavior across different environments.

  • Scalability: Kubernetes enables effortless scaling of applications. It allows you to scale your microservices horizontally by adding or removing instances, known as pods, based on workload demands. This helps ensure that your application can handle increased traffic or higher resource requirements, improving performance and responsiveness. (A minimal autoscaling sketch appears after this list.)

  • High Availability: Kubernetes supports high availability by providing automated failover and load balancing mechanisms. It can automatically restart failed containers, replace unhealthy instances and distribute traffic across healthy instances. This ensures that your application remains available even in the event of infrastructure or container failures. This helps reduce downtime and improve reliability.

  • Resource Efficiency: Kubernetes optimizes resource allocation and utilization through its advanced scheduling capabilities. It intelligently distributes containers across nodes based on resource availability and workload requirements. This helps maximize the utilization of computing resources, minimizing waste and reducing costs.

  • Self-Healing: Kubernetes has self-healing capabilities, which means it automatically detects and addresses issues within the application environment. If a container or node fails, Kubernetes can reschedule containers onto healthy nodes. It can also replace failed instances and even perform automated rolling updates without interrupting overall application availability.

  • Portability: Kubernetes offers portability, allowing applications to be easily moved between different environments, such as on-premises data centers, public clouds or hybrid setups. Its container-centric approach ensures that applications and their dependencies are bundled together. This reduces the chances of compatibility issues and enables seamless deployment across diverse infrastructure platforms.

  • DevOps Enablement: Kubernetes fosters collaboration between development and operations teams by providing a unified platform for application deployment and management. It enables developers to define application configurations as code using Kubernetes manifests, allowing for version-controlled, repeatable deployments. Operations teams can leverage Kubernetes to automate deployment workflows, monitor application health and implement continuous integration and delivery (CI/CD) pipelines.
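
To make the scalability point concrete, here is a minimal sketch of a HorizontalPodAutoscaler. It assumes a Deployment named web already exists (a hypothetical name) and scales it between 2 and 10 pods based on average CPU utilization:

```yaml
# Minimal HorizontalPodAutoscaler sketch: scales the (hypothetical)
# Deployment "web" between 2 and 10 replicas based on average CPU use.
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: web-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: web            # assumes a Deployment named "web" exists
  minReplicas: 2
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70   # target 70% average CPU utilization
```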

Explain the architecture of Kubernetes.

Kubernetes is a complex system, but at a high level, its architecture can be broken down into several main components.

  1. Nodes: The nodes are the machines that run your application containers. Each node can run a subset of the containers in your application, and Kubernetes automatically distributes the containers across the nodes to optimize resource utilization.

  2. Pods: A pod is the basic execution unit in Kubernetes. It's a logical host for one or more containers and provides a way to package and deploy containers together. Each pod represents a single instance of an application and can contain one or more containers.

  3. ReplicaSets: A ReplicaSet ensures that a specified number of replicas (i.e., copies) of a pod are running at any given time. If a pod fails or is terminated, the ReplicaSet will create a new replica to replace it. This ensures that your application is always available, even if one or more of your nodes fail.

  4. Deployments: A Deployment is a way to manage the rollout of new versions of an application. It allows you to specify the desired state of your application (e.g., the number of replicas, the container images to use, etc.), and Kubernetes will automatically update the application to match that state. This makes it easy to roll out new features or bug fixes without downtime. (A minimal Deployment manifest is sketched after this list.)

  5. Services: A Service is a logical abstraction over a set of pods that provides a network identity and load balancing. It allows you to access your application through a stable IP address and DNS name, even as the underlying pods change. Services can also be used to expose your application to the outside world, such as through a load balancer or ingress controller. (A minimal Service manifest follows the Deployment sketch below.)

  6. Persistent Volumes (PVs) and Persistent Volume Claims (PVCs): PVs are storage resources that are provisioned and managed by Kubernetes, while PVCs are used to request storage resources from a PV. This allows you to store data outside of containers, which can be useful for data that needs to persist across container restarts or node failures.

  7. ConfigMaps and Secrets: ConfigMaps store configuration data as key-value pairs, while Secrets store sensitive data such as passwords or API keys. Both can be used to decouple configuration and secrets from the application code, making it easier to manage and rotate them. (Minimal PVC and ConfigMap manifests are sketched after this list.)

  8. Namespaces: Namespaces provide a way to partition resources and isolate workloads in a cluster. This allows you to run multiple applications in the same cluster without conflicts, or to create separate environments for development, staging, and production.

  9. Clusters: A Kubernetes cluster is a set of nodes that run Kubernetes and are managed by a central Kubernetes control plane. The control plane consists of components such as the API server, controller manager, and scheduler, which work together to manage the state of the cluster.
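
To illustrate items 2 through 4, here is a minimal Deployment manifest; the name web and the nginx image are illustrative choices, not anything mandated by Kubernetes. Applying it creates a ReplicaSet, which in turn keeps three pods running:

```yaml
# Minimal Deployment sketch: the ReplicaSet created by this Deployment
# keeps 3 replicas of a single-container pod running at all times.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web
spec:
  replicas: 3
  selector:
    matchLabels:
      app: web
  template:
    metadata:
      labels:
        app: web
    spec:
      containers:
        - name: web
          image: nginx:1.25   # illustrative image; any container image works
          ports:
            - containerPort: 80
```

Applying this with kubectl apply -f deployment.yaml declares the desired state; Kubernetes then continuously works to keep three healthy replicas running.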
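
A Service (item 5) that load-balances across those pods could then look like the following sketch, which selects pods by their app: web label and exposes them on a stable virtual IP and DNS name:

```yaml
# Minimal Service sketch: gives the "app: web" pods a stable
# virtual IP and DNS name, load-balancing traffic across them.
apiVersion: v1
kind: Service
metadata:
  name: web
spec:
  selector:
    app: web
  ports:
    - port: 80         # port exposed by the Service
      targetPort: 80   # port the pods listen on
```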
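
And to illustrate items 6 and 7, here are minimal PersistentVolumeClaim and ConfigMap sketches; the names, size, and keys are illustrative placeholders:

```yaml
# Minimal PVC sketch: requests 1Gi of storage that survives
# container restarts and rescheduling.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: data-claim
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
---
# Minimal ConfigMap sketch: key-value configuration kept
# outside the application image.
apiVersion: v1
kind: ConfigMap
metadata:
  name: app-config
data:
  LOG_LEVEL: info
  FEATURE_FLAG: "true"   # illustrative keys
```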

What is a Control Plane?

The control plane is the nerve center of a Kubernetes cluster: it houses the components that control the cluster and maintains a data record of the configuration and state of all of the cluster's Kubernetes objects.

The Kubernetes control plane is in constant contact with the compute machines to ensure that the cluster runs as configured. Controllers respond to cluster changes, driving the actual, observed state of each object toward the desired state given in its specification.

Several major components comprise the control plane: the API server, the scheduler, the controller-manager, and etcd. These core Kubernetes components ensure containers are running with the necessary resources in sufficient numbers. These components can all run on one primary node, but many enterprises concerned about fault tolerance replicate them across multiple nodes to achieve high availability.

Kubernetes API Server

The API server is the front end of the Kubernetes control plane. It supports updates, scaling, and other kinds of lifecycle orchestration by providing APIs for various types of applications. Because the API server is the cluster's gateway, clients must be able to reach it from outside the cluster. Clients use it as a tunnel to pods, services, and nodes, and authenticate through it.

Kubernetes Scheduler

The Kubernetes scheduler tracks resource usage and capacity on each compute node and determines where new pods should be placed. It weighs the overall health of the cluster alongside each pod's resource demands, such as CPU or memory, then selects an appropriate compute node and schedules the pod, taking into account resource limits and guarantees, data locality, quality-of-service requirements, affinity and anti-affinity specifications, and other factors.
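
As a sketch of the signals the scheduler weighs, the pod below declares CPU and memory requests and limits, plus a soft anti-affinity rule asking the scheduler to spread replicas across nodes; the image reference is a hypothetical placeholder:

```yaml
# Pod sketch: resource requests/limits and a soft anti-affinity rule
# are among the signals the scheduler uses when choosing a node.
apiVersion: v1
kind: Pod
metadata:
  name: worker
  labels:
    app: worker
spec:
  containers:
    - name: worker
      image: registry.example.com/worker:1.0   # hypothetical image
      resources:
        requests:
          cpu: 500m        # the scheduler only places the pod on a node
          memory: 256Mi    # with at least this much free capacity
        limits:
          cpu: "1"
          memory: 512Mi
  affinity:
    podAntiAffinity:
      preferredDuringSchedulingIgnoredDuringExecution:
        - weight: 100
          podAffinityTerm:
            labelSelector:
              matchLabels:
                app: worker
            topologyKey: kubernetes.io/hostname   # prefer spreading across nodes
```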

Kubernetes Controller Manager

There are various controllers in a Kubernetes ecosystem that drive the states of endpoints (pods and services), tokens and service accounts (namespaces), nodes, and replication (autoscaling). The controller manager (kube-controller-manager) is a daemon that runs these controller functions; a separate cloud controller manager runs the controllers that are specific to a particular cloud provider.

The controller watches the objects it manages in the cluster as it runs the Kubernetes core control loops. It observes them for their desired state and current state via the API server. If the current and desired states of the managed objects don’t match, the controller takes corrective steps to drive object status toward the desired state. The Kubernetes controller also performs core lifecycle functions.

ETCD

Distributed and fault-tolerant, etcd is an open-source key-value store that holds configuration data and information about the state of the cluster. etcd may be run externally, although it is often part of the Kubernetes control plane.

etcd stores the cluster state based on the Raft consensus algorithm. This helps cope with a common problem that arises in the context of replicated state machines and involves multiple servers agreeing on values. Raft defines three different roles: leader, candidate, and follower, and achieves consensus by electing a leader.

In this way, etcd acts as the single source of truth (SSOT) for all Kubernetes cluster components, responding to queries from the control plane and retrieving various parameters of the state of the containers, nodes, and pods. etcd is also used to store configuration details such as ConfigMaps, subnets, and Secrets, along with cluster state data.

What is the difference between kubectl and the kubelet?

kubectl and the kubelet are two core components of the Kubernetes toolchain. They serve different purposes and have different responsibilities within a Kubernetes cluster.

Kubectl is the command-line tool for interacting with Kubernetes resources. It provides a unified way to manage and access your cluster, including running commands, creating and managing resources, and accessing cluster information. Kubectl is the primary way to interact with your Kubernetes cluster.

The kubelet, on the other hand, is an agent that runs on each machine in your cluster. It is responsible for managing the pods and containers on its node and provides a layer of abstraction between the container runtime and the Kubernetes API.

The kubelet runs and manages the containers on its node and maintains the node's connection to the control plane (which is made up of the API server, controller manager, scheduler, and etcd).

In other words, kubectl is the command-line interface for interacting with the Kubernetes cluster, while the kubelet is the per-node agent that manages containers and keeps each node connected to the control plane.

Explain the role of the API server.

The API server is the front-end of the Kubernetes control plane and is the central hub for all communication between users, components, and external systems. It exposes a RESTful API that allows users to create, read, update, and delete (CRUD) Kubernetes objects. The API server also validates and authenticates requests, and performs admission control checks to ensure that requests are authorized and do not violate cluster policies.

The API server performs the following key roles:

  • Provides a unified interface for managing Kubernetes clusters. The API server is the single point of entry for all requests to manage a Kubernetes cluster. This allows users to interact with the cluster using a consistent set of commands and tools, regardless of the underlying implementation.

  • Validates and authenticates requests. The API server ensures that requests are well-formed and that they are authorized to be performed. This helps to protect the cluster from unauthorized access and malicious activity.

  • Performs admission control checks. The API server can be configured to perform admission control checks, which are used to validate requests before they are applied to the cluster. This can be used to enforce cluster policies, such as resource quotas or security constraints. (A ResourceQuota sketch appears after this list.)

  • Stores the desired state of the cluster. The API server stores the desired state of the cluster, persisting it in etcd. This is the set of Kubernetes objects that the cluster should maintain, and it is used by the other components of the control plane to ensure that the cluster converges on the desired state.

  • Provides a watch mechanism for monitoring changes to the cluster. The API server provides a watch mechanism that allows users to be notified of changes to the cluster state. This can be used to implement event-driven workflows.
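
As a concrete example of admission control, a ResourceQuota such as the sketch below causes the API server's quota admission plugin to reject any request that would push a namespace past its limits; the namespace and numbers are illustrative:

```yaml
# ResourceQuota sketch: the quota admission controller rejects any
# request that would exceed these per-namespace limits.
apiVersion: v1
kind: ResourceQuota
metadata:
  name: team-quota
  namespace: dev            # illustrative namespace
spec:
  hard:
    pods: "20"              # at most 20 pods in this namespace
    requests.cpu: "4"       # total CPU requests capped at 4 cores
    requests.memory: 8Gi    # total memory requests capped at 8Gi
```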

Overall, Kubernetes provides a powerful and flexible platform for managing containerized applications, and its architecture is designed to deliver the scalability, reliability, and automation that modern software development and deployment practices demand.