Kubernetes Architecture in VMware Terms (Part 2)
- Steve Younger
- May 8
- 23 min read

Continuing our “Kubernetes for VMware Administrators” series, Part 2 builds on Part 1 – From Virtual Machines to Containers: Kubernetes Basics. In Part 1, we introduced containers and Kubernetes using VMware concepts you know and love. Now, we’ll dig deeper into Kubernetes architecture itself – breaking down its control plane and worker node components through a VMware vSphere lens. By mapping the Kubernetes internals to familiar vCenter, ESXi, DRS, and HA concepts, you’ll solidify your understanding of how Kubernetes is built and gain confidence to navigate this powerful platform with your existing virtualization expertise.

Figure 1: High-level comparison of a VMware vSphere cluster (left) and a Kubernetes cluster (right). In vSphere, a vCenter Server (with its database) centrally manages a cluster of ESXi hosts (hypervisors) organized with features like HA and DRS, and VMs can be grouped into Resource Pools. In Kubernetes, the Control Plane (“Kubernetes Master”) manages a cluster of Nodes (worker machines), and workloads (containers) are grouped into Namespaces. Both platforms integrate networking (VMware’s NSX-V/NSX-T vs. Kubernetes CNI) and storage provisioning (datastores vs. volumes via CSI), but with different approaches suited to VMs vs. containers.
Kubernetes Control Plane – The Brains of the Cluster (vCenter Analogy)
Think of Kubernetes’s control plane as the vCenter Server of the container world. Just as vCenter is the central authority managing all ESXi hosts and VMs in a vSphere environment, the Kubernetes control plane is the central brain managing all nodes and pods in the cluster. It’s often said that Kubernetes is essentially the vCenter for containers – an orchestrator that automates deployment, scheduling, and the desired state of containerized applications across your infrastructure. Let’s break down the key control plane components one by one and map them to VMware terms you’re familiar with:
API Server – Kubernetes Front Door (Like vCenter Service & UI)
The API Server is the core communication hub and entry point for all operations in a Kubernetes cluster. It’s a web service (typically accessible on port 6443) that clients interact with using RESTful API calls (e.g. via kubectl or dashboards). In VMware terms, the API server is akin to the vCenter Server service API. It plays a similar role to vCenter’s UI and API endpoint – it’s where you (or automation tools) send requests to deploy a workload, change configurations, or query the state of the system.
Authentication & Authorization: The API server handles who can do what, much like vCenter’s Single Sign-On and role-based permissions. Kubernetes can integrate with certificates, tokens, or external identity providers to authenticate users, then uses RBAC (Role-Based Access Control) to decide if a user is allowed to perform a given action. This is conceptually similar to how vCenter uses SSO/AD integration and roles—ensuring that only authorized users can, for example, create a pod (the VM analog) or modify cluster settings.
Validation & Admission: When a request comes in (say, to create a new pod), the API server validates it and may run it through admission controllers (think of these as automatic “approval gates” or policies, somewhat like vCenter checks compliance or DRS rules before allowing an action). Only well-formed, authorized requests make it through.
Central Command Center: Just as all vSphere management tasks funnel through vCenter, all cluster changes in K8s funnel through the API server. Behind the scenes, when you request a new pod, the API server writes the intended state to the cluster store (etcd) and then signals to other components (like the scheduler or controllers) that there’s work to do. In short, it’s the traffic cop that coordinates the rest of the control plane. If the API server is down, it’s like vCenter being down: your running workloads (VMs or pods) aren’t affected immediately, but you can’t make new changes or see cluster state until it’s back.
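The RBAC check described above boils down to matching a user’s (verb, resource) request against the rules in the roles bound to that user. Here is a toy Python sketch of that decision, purely illustrative — the real kube-apiserver authorizer is written in Go and supports namespaces, API groups, resource names, and more:

```python
# Toy sketch of an RBAC-style authorization check (illustrative only --
# not the real kube-apiserver logic).

# A "role" is a list of rules; each rule allows some verbs on some resources.
roles = {
    "pod-reader": [{"verbs": {"get", "list"}, "resources": {"pods"}}],
    "admin":      [{"verbs": {"*"}, "resources": {"*"}}],
}

# Bindings map a user to the roles they hold (like vCenter permissions
# assigned to a user on an inventory object).
bindings = {"alice": ["pod-reader"], "bob": ["admin"]}

def is_allowed(user: str, verb: str, resource: str) -> bool:
    """Return True if any rule in any of the user's roles permits the action."""
    for role in bindings.get(user, []):
        for rule in roles[role]:
            verb_ok = "*" in rule["verbs"] or verb in rule["verbs"]
            res_ok = "*" in rule["resources"] or resource in rule["resources"]
            if verb_ok and res_ok:
                return True
    return False  # Kubernetes RBAC is deny-by-default

print(is_allowed("alice", "list", "pods"))    # True
print(is_allowed("alice", "delete", "pods"))  # False
```

Note the deny-by-default posture: with no matching rule, the request is rejected — the same spirit as vCenter permissions, where a user with no role on an object simply cannot act on it.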
etcd – Cluster State Database (vCenter DB Equivalent)
Kubernetes uses etcd as its source-of-truth datastore. etcd is a distributed key-value database that stores the entire state of the cluster: configurations, node statuses, what pods exist and where, service details, secrets, and more. If you’re a vSphere admin, you can think of etcd as the Kubernetes equivalent of the vCenter Server database. Just as vCenter’s database (vPostgres or an external Oracle/SQL DB in older setups) keeps the inventory of datacenters, hosts, VMs, resource pools, configurations, and so on, etcd keeps the canonical data for Kubernetes objects.
Key points about etcd and how it compares to a vCenter DB:
Distributed and Consistent: etcd is usually run as a clustered service (commonly on 3 or 5 control plane nodes for HA). It uses a consensus algorithm (Raft) to ensure that all copies of the data agree. This is like having a highly-available clustered database for vCenter – whereas a traditional vCenter might rely on a single DB instance or an active/passive replication setup, Kubernetes’s etcd always runs as a quorum of multiple members, so there isn’t a single point of failure for the data. If one etcd instance goes down, the others continue to serve data (as long as a majority of members are up).
Stores Cluster State: etcd holds a plethora of small key-value entries representing the desired and current state of the cluster. For example, there will be keys for each Node and Pod and Namespace, etc. In vCenter, you might query the DB (through vCenter) to list all VMs in a resource pool; in Kubernetes, the API server queries etcd to list all pods in a namespace. etcd is optimized for quick reads/writes of this state data and to survive network or node failures without losing data.
Backing up etcd: Much like you’d take backups of your vCenter database (because if you lose it, you lose your vCenter’s knowledge of the environment), etcd data should be backed up regularly. etcd is the only stateful part of the Kubernetes control plane – everything else (API server, controllers, scheduler) can be restarted or replicated relatively easily if etcd’s data is intact. The same is true in VMware: if you have to rebuild vCenter, the critical piece is restoring its database to recover your cluster configuration.
In summary, etcd in Kubernetes serves a similar purpose to the vCenter database – a persistent store of cluster configuration and state – but etcd is built to be distributed and fault-tolerant from the get-go, reflecting Kubernetes’s cloud-native, highly available design.
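To make the key-value model concrete: a query like “list all pods in a namespace” is just a prefix scan over etcd’s flat key space. A toy Python model (real etcd keys live under a `/registry/...` prefix and hold serialized objects; the values here are simplified stand-ins):

```python
# Toy model of etcd as a flat key-value store. Real etcd keys use the
# /registry/... prefix and store serialized Kubernetes objects; the
# values below are simplified stand-ins for illustration.
store = {
    "/registry/pods/default/web-abc123": {"node": "node-1", "phase": "Running"},
    "/registry/pods/default/web-def456": {"node": "node-2", "phase": "Running"},
    "/registry/pods/kube-system/dns-1":  {"node": "node-1", "phase": "Running"},
    "/registry/minions/node-1":          {"ready": True},
}

def list_prefix(prefix: str) -> dict:
    """Range read by prefix -- how 'list all pods in a namespace' works."""
    return {k: v for k, v in store.items() if k.startswith(prefix)}

# "kubectl get pods -n default" ultimately becomes a prefix scan like this
# (performed by the API server, never by you directly):
pods = list_prefix("/registry/pods/default/")
print(sorted(pods))  # the two 'web' pod keys
```

Where a vCenter DB query joins relational tables, etcd answers the same question with a single range read over a key prefix — one reason it stays fast for the small, frequent reads and writes the control plane generates.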
Controller Manager – Automated Control Loops (Parallel to vCenter DRS/HA Functions)
The kube-controller-manager is a background process in the control plane that houses multiple controllers – each is a control loop that watches the state of the cluster (via the API server/etcd) and takes action to maintain or adjust the cluster to match the desired state. If that sounds abstract, think of VMware vCenter’s automation features: DRS, vSphere HA, and others constantly monitor the environment and initiate actions like migrating a VM, powering on a VM on another host after a failure, etc., to keep the environment healthy. In Kubernetes, the controller manager is effectively the “automation brain” that embeds all those little operational brains (controllers) for the cluster.
Some important controllers and their VMware analogies:
Node Controller: Watches the health of worker nodes. If a node “goes dark” (stops reporting in), after a timeout it marks that node as unavailable. This is analogous to vSphere HA’s Fault Domain Manager agent noticing a host failure (or isolation). In vSphere, HA will respond by restarting VMs that were on the failed host elsewhere. In Kubernetes, the Node Controller works with other controllers to ensure any pods that were on the lost node get rescheduled on healthy nodes (more on that in a moment).
ReplicaSet/Deployment Controller: Ensures the desired number of pod replicas are running for a given application. For example, if you said you want 3 replicas of a web server, and one goes down, this controller will notice (via etcd state) only 2 are running and will spawn a new one to get back to 3. This is very much like vSphere HA restarting a failed VM, except Kubernetes achieves it by starting a new pod (container) rather than rebooting the same VM. It’s also akin to running multiple instances (clones) of a VM for reliability – Kubernetes will maintain the count automatically.
Scheduler Hints & Others: The controller manager includes other controllers (like one that creates Endpoints for Services, handles persistent volume bindings, etc.), but another noteworthy one is the Service Account & Token Controller (which creates credentials for pods to talk to the API server) – not a VMware analog per se, but think of it like vCenter automating certificate or identity assignments to ESXi hosts. In essence, each controller has a focused duty (much like different vCenter services or VMware appliances handle specific tasks: e.g., vCenter’s DRS vs. vSphere Update Manager). The kube-controller-manager bundles all these into a single process for efficiency.
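The control-loop pattern shared by all of these controllers is simple: observe actual state, compare to desired state, act on the difference. A minimal sketch in the spirit of the ReplicaSet controller (illustrative only — the real controller watches the API server and creates or deletes Pod objects):

```python
# Minimal sketch of a reconcile loop in the spirit of the ReplicaSet
# controller: compare desired vs. observed replicas and compute actions.
# (Illustrative -- the real controller operates on API objects via watches.)

def reconcile(desired: int, running_pods: list) -> list:
    """Return the actions needed to converge observed state to desired."""
    diff = desired - len(running_pods)
    if diff > 0:
        # Scale up: create replacements, like vSphere HA restarting VMs
        # after a host failure -- except these are brand-new pods.
        return [("create-pod", i) for i in range(diff)]
    if diff < 0:
        # Scale down: remove the surplus (the real controller picks
        # victims more carefully, e.g. preferring unready or newest pods).
        return [("delete-pod", name) for name in running_pods[desired:]]
    return []  # already converged -- the loop does nothing

# One replica of three has died; the loop creates exactly one replacement.
actions = reconcile(desired=3, running_pods=["web-a", "web-b"])
print(actions)  # [('create-pod', 0)]
```

The key property is idempotence: run the loop again after converging and it does nothing, which is why Kubernetes can simply re-run these loops continuously rather than tracking one-off events.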
It’s worth noting that in a highly available setup, Kubernetes will run a controller manager instance on each control plane node, but only one is active at a time (leader elected). This is similar to how vCenter HA might have an active and a passive node – only the active vCenter is doing work at any moment. If the active controller manager fails, a standby takes over leadership to continue managing the cluster.
Scheduler – Pod Placement Decider (Distributed Resource Scheduler for Pods)
The kube-scheduler is Kubernetes’s equivalent of DRS (Distributed Resource Scheduler) in VMware. When a new pod needs to be placed (for example, a user just created a Deployment that requires a new pod, or a controller needs to reschedule a pod), the scheduler’s job is to decide which node that pod should run on. In VMware vSphere, when you power on a VM in a DRS-enabled cluster, DRS will suggest or automatically choose a host that has the resources and meets any rules/constraints for that VM. The Kubernetes scheduler does the same for pods:
Resource & Constraints Awareness: The scheduler looks at each candidate node’s available resources (CPU, memory, etc.) and also any specific requirements of the pod (for instance, “this pod needs a node with SSD storage” or “this pod can only run on Linux nodes with GPU support” or even anti-affinity rules like “don’t place on the same node as another pod X”). This is similar to how DRS considers host capacity and things like affinity rules or host tags when placing VMs.
No Live Migrations (Reactive vs. Proactive): One key difference: once a pod is scheduled and running, Kubernetes won’t live-migrate it elsewhere the way DRS might vMotion a running VM to balance load. If a node becomes overloaded, K8s doesn’t automatically shuffle running pods around (there is no built-in vMotion equivalent). Instead, the scheduler’s role is mostly initial placement. That said, if a node fails or a pod is evicted, the scheduler will step in to find a new home for the replacement pod. So it’s more reactive. In VMware terms, it’s as if DRS only made placement decisions at VM power-on or in response to a host failure, but didn’t constantly rebalance VMs just for load. (Newer K8s features and add-ons can simulate proactive rebalancing, but that’s beyond our scope).
Extensible: The default scheduler works for most cases, but Kubernetes allows custom schedulers or scheduling policies. Think of it like how DRS can be configured with certain policies (fully automated vs. manual, migration thresholds, etc.), or even replaced by a different scheduler in specialized environments (though that’s rare in VMware). In Kubernetes, you could have multiple schedulers for different types of workloads if needed.
Just as vCenter’s DRS automates the placement of VMs to optimize cluster performance and respect constraints, the Kubernetes scheduler automates the placement of pods to appropriate nodes, ensuring efficient use of cluster resources and compliance with any scheduling rules.
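Conceptually, the scheduler works in two phases — filter out nodes that can’t run the pod, then score the survivors and pick the best. A toy Python sketch (the node data and scoring policy are made-up assumptions, not the real scheduler plugin framework):

```python
# Sketch of the scheduler's two-phase decision: filter, then score.
# Node capacities, labels, and the scoring policy below are illustrative
# assumptions, not the real kube-scheduler implementation.

nodes = [
    {"name": "node-1", "free_cpu": 2.0, "free_mem_gb": 4,  "labels": {"disk": "ssd"}},
    {"name": "node-2", "free_cpu": 6.0, "free_mem_gb": 16, "labels": {"disk": "hdd"}},
    {"name": "node-3", "free_cpu": 4.0, "free_mem_gb": 8,  "labels": {"disk": "ssd"}},
]

pod = {"cpu": 1.0, "mem_gb": 2, "node_selector": {"disk": "ssd"}}

def schedule(pod, nodes):
    # Filter phase: resource fit plus label constraints (akin to DRS
    # checking host capacity and affinity rules before placement).
    feasible = [
        n for n in nodes
        if n["free_cpu"] >= pod["cpu"]
        and n["free_mem_gb"] >= pod["mem_gb"]
        and all(n["labels"].get(k) == v for k, v in pod["node_selector"].items())
    ]
    if not feasible:
        return None  # pod stays Pending -- like a VM that can't power on
    # Score phase: one toy policy -- prefer the node with the most free CPU.
    return max(feasible, key=lambda n: n["free_cpu"])["name"]

print(schedule(pod, nodes))  # node-3 (SSD nodes only; more free CPU than node-1)
```

Note that `node-2` has the most free capacity overall but is filtered out by the SSD requirement — constraints always trump raw capacity, exactly as a DRS affinity rule overrides load considerations.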
Bringing the Control Plane Together
All these control plane components – API server, etcd, controller manager, scheduler (and a cloud-controller-manager if you’re in a cloud environment) – typically run on one or more control-plane nodes. In a simple cluster (like a lab or dev setup), you might have all of them on a single VM or machine (one control-plane node). In production, you’ll often have multiple control-plane nodes for redundancy (more on that in the High Availability section).
To recap in VMware terms: vCenter Server = API Server + Controller Manager + Scheduler (plus etcd). Kubernetes splits the responsibilities into separate pieces, but together they form the “management plane” just like vCenter is the management plane for vSphere. They handle the cluster-wide decisions and state, but they do not run the actual application workloads – that’s the job of the worker nodes (ESXi hosts analog), which we’ll discuss next.
Kubernetes Worker Nodes – The Workhorses (ESXi Host Analogy)
If the control plane is the brains, the worker nodes are the brawn of the Kubernetes cluster – much like ESXi hosts in a vSphere cluster. Each node (which can be a physical server or a VM) provides the CPU, memory, storage, and network capacity to actually run your container workloads (pods). And just as ESXi has specific components (like the VMkernel, hostd, vpxa, virtual switches, etc.) that enable it to run VMs and communicate with vCenter, Kubernetes nodes have their own key components that enable them to run containers and stay in sync with the control plane.
Let’s break down the main worker node components and compare them to what you know in VMware:
Kubelet – Node Agent (Like vCenter’s vpxa/hostd on ESXi)
The kubelet is a small but critical agent that runs on every Kubernetes node. Its job is to listen for instructions from the control plane (via the API server) and ensure that the containers (pods) assigned to that node are running and healthy. In VMware terms, the kubelet combines roles similar to vpxa (vCenter agent) and hostd (host daemon) on an ESXi host:
In vSphere, hostd runs on each ESXi host, managing local VM operations, and vpxa is the agent that communicates between vCenter and the host. When you, through vCenter, say “power on this VM,” vCenter (vpxd) tells vpxa on the host, which in turn instructs hostd to perform the action on the hypervisor.
In Kubernetes, the kubelet fulfills both roles: it registers the node with the control plane, reports the node’s status, and receives pod assignments from the API server. When the control plane decides “Pod X should run on Node Y,” the API server contacts the kubelet on Node Y (essentially telling it “ensure Pod X is running there”). The kubelet then works with the container runtime to launch the pod’s containers on its node (equivalent to hostd creating or starting a VM) and later monitors their health.
The kubelet continuously watches for any pods that should run on its node (this info is stored in etcd and communicated via the API server) and makes sure they are running. If a container crashes, kubelet notices and reports back, and may try to restart it (or signals the controller manager to take action). It also reports resource usage and other stats to the control plane. In short, kubelet is the node’s caretaker on behalf of the Kubernetes control plane, much like the vCenter agents on ESXi hosts that do the bidding of vCenter.
Analogy summary: Just as an ESXi host won’t do much in a cluster without vCenter telling it what to run (and reporting back via vpxa/hostd), a Kubernetes node relies on the kubelet to know what pods to run and to keep the control plane updated. No kubelet = no cluster control over that node.
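The kubelet’s core job is itself a reconcile loop, scoped to one node: compare the pods the control plane has assigned here against what the runtime reports, then start anything missing and report anything unexpected. A toy sketch (the real kubelet drives a container runtime over CRI; this just shows the shape of one sync pass):

```python
# Sketch of a single pass of the kubelet's sync loop for one node.
# Illustrative only -- the real kubelet talks to the runtime over CRI
# and reports status back to the API server.

def sync_node(assigned: set, running: set) -> dict:
    """Compare pods assigned to this node vs. pods actually running."""
    return {
        "start": sorted(assigned - running),            # assigned but not running
        "report_stopped": sorted(running - assigned),   # running but no longer wanted
    }

# The API server says this node should run web-1 and db-1, but db-1's
# container has crashed: the kubelet restarts it via the runtime.
result = sync_node(assigned={"web-1", "db-1"}, running={"web-1"})
print(result)  # {'start': ['db-1'], 'report_stopped': []}
```

In VMware terms, this is hostd noticing a VM it was told to keep powered on has stopped, and powering it back on — except the kubelet does it continuously as part of its normal loop.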
Container Runtime – Running Containers (Hypervisor Analog)
Every Kubernetes node has a container runtime installed – this is the software that actually knows how to run containers on that machine. You’ve likely heard of Docker; in Kubernetes, common runtimes today include containerd (which is the core of Docker’s runtime, now often used directly) or others like CRI-O. The container runtime is to containers what the ESXi hypervisor is to VMs:
Provisioning and Isolation: The container runtime takes an image (the container equivalent of a VM template) and instantiates it as a running container, providing it with its filesystem, network interfaces, and isolating it at the process level. Similarly, ESXi takes a VM (with its virtual disks and configuration) and runs it, providing virtual CPU, memory, virtual NICs, etc., isolating the VM at the hardware level.
Resource Enforcement: The runtime works with the host OS kernel (using things like cgroups and namespaces in Linux) to enforce resource limits on containers (CPU quotas, memory limits, etc.), akin to how a hypervisor enforces the vCPU and RAM allocations for a VM and uses scheduling to share physical CPUs among VMs.
Interaction with Kubelet: The kubelet tells the container runtime what to do (e.g., “start a container using image X with these settings”), just like vCenter via vpxa/hostd would tell the hypervisor “start this VM using this disk and these vCPU/RAM settings.” The runtime then creates and runs the container and reports status (success, failure, etc.) back up.
From a VMware admin perspective, you can imagine the container runtime as a lightweight hypervisor that runs processes instead of full VMs. On vSphere, this could even be running inside a VM (if your Kubernetes nodes are VMs on ESXi, then effectively you have ESXi as the physical hypervisor, and each node VM running Docker/containerd is like a nested “hypervisor” for containers – but that’s an implementation detail; logically Kubernetes abstracts that away).
Importantly, the container runtime is usually not one monolithic thing you interact with in Kubernetes – it’s underlying plumbing. You interact with Kubernetes (via API server), and Kubernetes through kubelet interacts with the runtime. This is unlike vSphere where you might sometimes directly interact with an ESXi host (for example, connecting to a host to do something if vCenter is down). In Kubernetes, you almost always go through the control plane for operations, not directly to the runtime.
Decoded Deep-Dive
Containerd & runc – The Two-Layer Runtime Stack
Modern Kubernetes nodes almost always use containerd or CRI-O with runc under the hood. If you picture ESXi running each VM as a separate VMX process, containerd (the daemon) is like the hypervisor manager, while runc (one binary per container) is the lightweight VMX that actually spawns and isolates the workload.
| Layer | What It Does | VMware Analogy |
| --- | --- | --- |
| kubelet ↔ CRI interface | kubelet talks to the runtime over the Container Runtime Interface (CRI) gRPC API. | vCenter → vpxa communication |
| containerd (system daemon) | Pulls OCI images, manages snapshots (copy-on-write layers), handles image unpacking, keeps metadata, and exposes a CRI plugin. | VMkernel + hostd (managing VM lifecycle) |
| shim-v2 | For each pod sandbox (pause container) it forks a shim that survives containerd restarts. | VMX parent process |
| runc (per-container binary) | Implements the OCI runtime-spec: uses clone, cgroups, and namespaces to create the container, then exits. | VMX child that actually executes the guest OS process |
Flow in practice
1. kubelet receives a Pod spec and calls RunPodSandbox() and CreateContainer() over CRI.
2. containerd pulls the image layers (if missing), unpacks them with the configured snapshotter (e.g., overlayfs), and launches a shim.
3. The shim execs runc, passing it the OCI-generated config.json.
4. runc sets up namespaces, cgroups, mounts, and executes PID 1 of the container.
5. When the container exits, runc is gone; the shim reports back to containerd, which reports status to kubelet.

Why two layers?
containerd handles heavy-weight lifecycle (pull, cache, statistics, CRI) while runc stays tiny, security-auditable, and purely focused on the low-level clone() / pivot_root() work—much like ESXi separates VM management services from the thin VMX process that actually runs each VM.
💡 Tip for troubleshooting:
`ctr --namespace k8s.io containers ls` shows containers at the containerd layer, while `runc list` (run on the node) reveals the runc-level state—handy when a pod appears stuck between “Created” and “Running.”
This extra detail gives VMware admins a mental model that mirrors what they know:
containerd ≈ hostd/VMkernel layer – manages downloads, snapshots, and lifecycle.
runc ≈ VMX – quick, per-container executor that becomes the running payload.
Kube-Proxy – Network Traffic Manager (Node’s Virtual Switch/Router)
The third key component on nodes is kube-proxy, which is often overlooked but essential. kube-proxy runs on each node and is responsible for implementing Kubernetes networking rules on that node, particularly for Services (the abstraction that gives a stable IP or DNS name to a set of pods). In VMware terms, kube-proxy is somewhat analogous to the virtual switch (vSwitch) or distributed virtual switch on each ESXi host, possibly combined with a bit of load balancer logic:
Service Routing: Kubernetes Services define how to reach a group of pods (say, all pods labeled “app=web”). When a Service has a virtual IP (cluster IP) or port, kube-proxy ensures that any traffic hitting that node for that Service gets forwarded to one of the pods backing that Service (wherever they may be). It does this by programming iptables rules (or IPVS or similar) on the node. This is similar to how a VMware distributed switch or NSX might ensure packets destined for a particular network or load balancer VIP get forwarded to the correct VM. In effect, kube-proxy provides a distributed load balancing across the cluster: every node knows how to forward service traffic to the right pod endpoints, whether those pods are on the same node or another node.
East-West Networking: You can think of the collection of kube-proxies and the container network (managed by CNI plugins) as the networking fabric of the cluster, analogous to how VMware’s virtual networking (standard vSwitches, distributed switches, or NSX) provides connectivity between VMs and segments. Each Kubernetes node, via kube-proxy, participates in routing traffic for cluster services. For example, if a pod on NodeA needs to talk to a Service IP that ultimately targets pods on NodeB, NodeA’s kube-proxy helps route that traffic to NodeB transparently.
Security and Policies: While the basic kube-proxy doesn’t enforce security (that’s done by network policy enforcement via the CNI or other tools, analogous to a distributed firewall in NSX), it is a piece of the data plane on each node ensuring network connectivity works. In vSphere, aside from the vSwitch, you might deploy an NSX distributed router or firewall on each host for advanced networking – in Kubernetes, those would correspond to the CNI plugins and network policy controllers, which go beyond kube-proxy’s role. For our purposes, keep in mind kube-proxy is the part of Kubernetes that ensures services and networking work seamlessly across the cluster, much as VMware’s virtual networking ensures each VM can reach others or be part of a load-balanced service.
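What kube-proxy’s rules accomplish can be modeled in a few lines: traffic sent to a Service’s cluster IP is rewritten to one of the backing pod endpoints, and in iptables mode the pick is effectively random. A toy sketch (the service address and endpoint IPs are made-up examples):

```python
import random

# Toy model of kube-proxy's job: rewrite a Service's cluster-IP address
# to one of the pod endpoints behind it. In iptables mode the backend is
# chosen effectively at random; this models that behavior. The addresses
# below are made-up examples.

endpoints = {
    # Service cluster IP:port -> healthy pod endpoints behind it
    "10.96.0.10:80": ["10.244.1.5:8080", "10.244.2.7:8080", "10.244.3.2:8080"],
}

def route(service_addr: str) -> str:
    """Pick a backend, roughly like an iptables DNAT rule with a random match."""
    backends = endpoints[service_addr]
    return random.choice(backends)

# Every node's kube-proxy programs the same mapping, so any node can
# forward service traffic to the right pod, whether local or remote.
print(route("10.96.0.10:80"))  # one of the three pod endpoints
```

Because every node carries the same table, the “load balancer” is fully distributed — there is no central appliance in the data path, unlike a traditional load balancer VM fronting a pool of servers.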
To sum up the node components in VMware terms: an ESXi host = Node (kubelet + runtime + kube-proxy). The kubelet is like the management agent (taking orders from the “vCenter” of K8s), the container runtime is like the hypervisor running workloads, and kube-proxy + CNI networking is like the virtual switch and network stack on the host enabling connectivity.
Each node in Kubernetes, just like each ESXi host, is somewhat self-sufficient in running workloads; even if the control plane is temporarily unreachable, the node will continue running its pods. But the intelligence (what to run, when to restart something, where to send traffic) comes from the combination of these node components working in concert with the control plane.
Multi-Tenancy and Resource Segmentation: Namespaces vs. Resource Pools
In vSphere, we use Resource Pools (and folders or vApps) to organize and partition resources among different projects, departments, or applications. For example, you might have a Production resource pool and a Development resource pool, each with certain resource limits or shares, to prevent one group of VMs from starving another. Kubernetes approaches multi-tenancy and resource management with a concept called Namespaces.
Namespaces in Kubernetes are a logical partitioning of the cluster into virtual sub-clusters, intended to group and isolate resources. Here’s how Namespaces compare to Resource Pools and similar constructs in VMware:
Logical Grouping: A Namespace lets you collect related Kubernetes objects—pods, deployments, services, and so on—under a single name. It’s similar to using a Resource Pool or folder in vCenter to gather VMs that belong to a specific business unit or application. For instance, you might create separate namespaces for individual teams (“team-alpha”, “team-beta”) or for functional domains (“payments”, “analytics”). Object names need only be unique within their own namespace, so two different namespaces can each contain a pod or service called “web-server” without any conflict—just as two resource pools can each house a VM with the same name.
Resource Quotas vs. Reservations/Limits: While simply creating a Namespace doesn’t by itself carve out hard resource reservations, Kubernetes allows you to define Resource Quotas at the namespace level. A Resource Quota can limit how much CPU, memory, or number of certain objects the namespace can use. This is conceptually similar to setting limits or reservations on a Resource Pool (e.g., max 100 GHz CPU, 200 GB RAM for all VMs in that pool). For example, you could say the “dev” namespace is limited to 10 CPU cores and 20 GB of memory usage cluster-wide. In VMware, you might reserve a chunk of resources for a resource pool; in Kubernetes, you enforce usage caps with quotas, ensuring one team can’t consume everything.
Isolation and Security: Namespaces also provide a security boundary. You can apply RBAC rules per namespace (e.g., the dev team’s account can only view/manipulate resources in the “dev” namespace, not in “prod”). This is similar to how in vCenter you might set permissions such that a user or group can only see or manage VMs in a certain folder or resource pool. Additionally, Kubernetes network policies can be applied per-namespace to restrict traffic, somewhat analogous to NSX distributed firewall rules applied to groups of VMs. So Namespaces go beyond just resource splitting – they are a unit of tenancy and security segmentation as well.
Not a Hardware Partition: It’s important to note that a Namespace doesn’t tie to specific nodes or hardware partitions; it’s a logical segmentation. In contrast, a Resource Pool in VMware actually carves out (or shares) physical host resources. Kubernetes Namespaces share the cluster’s nodes; they’re not like creating separate clusters. If you truly need complete isolation (e.g., different clusters for different teams), that’s like having separate vSphere clusters. But within one Kubernetes cluster, Namespaces give a reasonably strong isolation for most use cases, just as resource pools/folders provide isolation and organization within one vSphere cluster.

In short, a Kubernetes Namespace is like a lightweight Resource Pool + folder combination: it groups workloads for organizational and access control purposes and can enforce resource policies. It enables multi-tenancy in a single cluster by isolating teams or applications from each other (to a point), much as you might use resource pools or separate vApps to group VMs in vCenter. This lets a single Kubernetes cluster be safely shared, just as a single VMware cluster can host multiple projects without interference.
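A namespace quota of the kind described above is declared as a ResourceQuota object. Here is a sketch matching the “dev” example (the name and numbers are illustrative, not prescriptive):

```yaml
# Caps total resource usage across all pods in the "dev" namespace --
# conceptually like setting limits on a vSphere Resource Pool.
apiVersion: v1
kind: ResourceQuota
metadata:
  name: dev-quota
  namespace: dev            # applies to everything in the "dev" namespace
spec:
  hard:
    requests.cpu: "10"      # total CPU requests capped at 10 cores
    requests.memory: 20Gi   # total memory requests capped at 20 GiB
    pods: "50"              # at most 50 pods in the namespace
```

Once applied, any pod creation that would push the namespace over these totals is rejected by the API server — a hard stop, rather than the gradual resource-contention behavior you’d see from shares on a resource pool.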
💡 Fun fact for VMware folks
vSphere with Tanzu, VMware’s Kubernetes integration, actually uses the term “Namespaces” as well – vSphere Namespaces – which under the hood maps to Kubernetes namespaces and resource pools to provide a similar concept for provisioning Kubernetes workloads in vSphere. It’s a nice convergence of terminology!
etcd vs. vCenter Server Database – Revisited
We touched on this in the control plane section, but it’s worth highlighting directly: etcd vs. the vCenter DB. Both are critical databases that store the state of the system, but they differ in design:
Type of Data: Both etcd and the vCenter database store configuration and state. In vCenter’s DB you have tables for hosts, VMs, clusters, performance stats, etc. etcd stores key-value pairs for nodes, pods, configs, and the entire Kubernetes object model. If you were to browse etcd (not that you normally do directly), you’d see keys like /registry/minions/node-1 or /registry/pods/default/web-abc123 with corresponding values (structured data blobs). A vCenter DB table might have a row per VM with columns for its name, UUID, host, etc. Different formats, same purpose: record what is and what should be running in the environment.
Consistency and HA: etcd is strongly consistent and requires a quorum of nodes; it’s designed to be distributed. The vCenter DB historically has been a single-instance (with failover techniques or an external HA DB cluster if you set that up). If you lose etcd quorum, the Kubernetes control plane is effectively brain-dead until it comes back (much like losing the vCenter DB means vCenter is down). VMware introduced vCenter HA in later releases to mitigate the DB/vCenter as a single point of failure by having a passive standby, but Kubernetes from day one assumes the data store is clustered and resilient. This reflects the philosophies: VMware added HA on top of a mostly centralized system, whereas Kubernetes built the control plane to be distributed from the start.
Performance and Scale: etcd is optimized for the kind of data Kubernetes needs – lots of small updates (like status of pods) and high read concurrency, with features like watch (so components can watch for changes instead of polling). The vCenter DB, being a general relational database, is very capable but might not be as lean for rapid tiny updates. This is one reason Kubernetes didn’t just use a SQL DB – etcd’s key-value and watch mechanism is very suited for config data that controllers continuously read and write. As an admin, you don’t necessarily feel this difference directly, but you might notice how Kubernetes can scale to thousands of pods with lots of state changes and still keep the control plane responsive – etcd is a big part of that.
Bottom line: etcd = vCenter database in purpose, but etcd is distributed, lightweight, and specialized for Kubernetes’s state, whereas the vCenter DB is typically a single centralized SQL database. For our mental model, treat etcd just like you treat the vCenter DB – precious data that you must protect and back up, because everything else in the control plane depends on it.
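The watch mechanism mentioned above is worth a closer look, because it is what lets controllers react to changes without rescanning etcd. A toy Python model of the pattern (not the etcd client API — just the push-instead-of-poll idea):

```python
# Toy model of etcd's watch pattern: components register interest in a
# key prefix and get pushed change events, instead of polling the store.
# This is the idea only, not the real etcd client API.

class ToyWatchStore:
    def __init__(self):
        self.data = {}
        self.watchers = []  # list of (prefix, callback) pairs

    def watch(self, prefix, callback):
        """Register a callback for every future write under a prefix."""
        self.watchers.append((prefix, callback))

    def put(self, key, value):
        self.data[key] = value
        # Push the event to every watcher whose prefix matches -- this is
        # how controllers learn about changes the moment they happen.
        for prefix, cb in self.watchers:
            if key.startswith(prefix):
                cb(key, value)

store = ToyWatchStore()
events = []
store.watch("/registry/pods/", lambda k, v: events.append((k, v)))

store.put("/registry/pods/default/web-1", {"phase": "Pending"})
store.put("/registry/minions/node-1", {"ready": True})  # no pod watcher fires
print(events)  # only the pod event was delivered
```

A polling design would have every controller repeatedly re-reading state it already knows; watches invert that, which is a big part of how the control plane stays responsive with thousands of objects.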
High Availability in the Control Plane – No Single Point of Failure
In enterprise VMware setups, you never want vCenter to be a single point of failure. VMware addressed this with vCenter High Availability (HA), which can run a backup vCenter Server that takes over if the primary fails. Kubernetes goes one step further – its control plane was designed to be run in an HA fashion from the ground up. Here’s how Kubernetes achieves high availability for the control plane, and how that compares to VMware’s approach:
Multiple Control Plane Nodes: In a production Kubernetes cluster, you’ll typically have multiple control plane nodes (commonly 3, but it could be more for larger setups, always an odd count to maintain quorum for etcd). Each control plane node runs an instance of the API Server, controller manager, scheduler, and etcd (in a “stacked” configuration) or etcd might be on its own separate nodes – but let’s consider the common case where each control plane node has its own etcd member. All API server instances are active-active, serving requests (usually fronted by a load balancer so you have a single API endpoint). The controller managers and schedulers coordinate via leader election (so one active at a time as mentioned). This highly available setup means if one control plane node VM or server dies, the others continue handling requests and controlling the cluster.
Compare to vCenter HA: vCenter HA (when enabled) typically has one active vCenter and one passive that’s constantly replicating data, plus a witness. If the active fails, the passive takes over (after a brief pause). That’s an active-passive model. Kubernetes control plane nodes are more like an active-active cluster: all API servers can serve at once, and the data store (etcd) is truly distributed across them. There’s no manual failover needed; node outages are absorbed naturally. In effect, Kubernetes aims for zero downtime in the control plane – you can lose one or even a few control plane nodes and still operate (as long as a majority of etcd members are up). With vCenter HA, you still have a brief failover time, and without vCenter HA, a vCenter outage means no management (though vSphere HA, which runs via agents on the hosts, can still restart VMs in limited fashion; DRS rebalancing stops without vCenter).
Etcd Redundancy: etcd’s quorum design is crucial. For example, with 3 etcd nodes, any 1 can fail and the cluster still functions (2 out of 3 quorum). With 5, you can lose 2. etcd will also prevent “split-brain” scenarios by requiring quorum for any writes. VMware’s equivalent would be an external clustered database – something like Oracle RAC or a SQL Server Always On cluster for vCenter, which some deployments did use in the past. But etcd’s clustering is seamlessly integrated and usually managed by Kubernetes installation tools (like kubeadm or cloud providers) – you as an admin just need to know it’s there and to plan for an odd number of control plane nodes.
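The quorum arithmetic behind the odd-member recommendation is simple enough to sketch:

```python
# etcd (Raft) quorum math: a write commits only when a majority of members agree.

def quorum(members: int) -> int:
    """Minimum number of members that must agree for a write to commit."""
    return members // 2 + 1

def tolerable_failures(members: int) -> int:
    """How many members can fail while the cluster still keeps quorum."""
    return members - quorum(members)

for n in (1, 2, 3, 4, 5):
    print(f"{n} members: quorum={quorum(n)}, can lose {tolerable_failures(n)}")
```

Note how 4 members tolerate no more failures than 3 (both can lose exactly 1), and an even count adds a member that can vote in a tie-producing partition – which is why odd cluster sizes are the standard recommendation.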
Staying Online During Upgrades: Another angle of HA – Kubernetes can often be upgraded one control plane node at a time, with the others keeping the cluster running (much like how you might upgrade an ESXi host in a vSphere cluster one by one to avoid total downtime, or how vCenter HA can be upgraded with minimal downtime). This rolling upgrade ability of control plane nodes is a nice bonus of having multiple instances. In VMware, upgrading vCenter is usually a maintenance window where vCenter is offline for a bit (though workloads are fine). In Kubernetes, you can usually keep the API available to users even during an upgrade by staggering it.
Worker Node Independence: It’s worth noting that even if you lost the entire Kubernetes control plane (imagine all control nodes down), the containers on the worker nodes continue to run – just like VMs keep running if vCenter goes down. Pods won’t suddenly crash because the control plane died. You just can’t schedule new ones or enforce changes until the control plane is back. Similarly, when vCenter is down, existing VMs keep running on their hosts (and vSphere HA, because it operates via agents on the hosts, can even still restart VMs after a host failure – you just can’t make management changes). Kubernetes takes it further: the control plane components themselves are set up to self-heal (for example, if the API server process crashes on a control plane node, the kubelet on that node or external automation can restart it; often, Kubernetes control plane components run as static pods or systemd services that auto-restart). Essentially, Kubernetes is built to expect failures and recover from them in an automated way.
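For instance, on a kubeadm-built control plane node the kubelet watches a local manifests directory and recreates anything defined there on its own, with no scheduler or controller involved. A heavily stripped-down static pod manifest looks roughly like this (the path, image tag, and flags are illustrative, not a working configuration):

```yaml
# Sketch of a static pod manifest, e.g. /etc/kubernetes/manifests/kube-apiserver.yaml.
# The kubelet on this node creates and restarts this pod itself, independent of
# the API server – which is how the control plane can bootstrap and heal itself.
apiVersion: v1
kind: Pod
metadata:
  name: kube-apiserver
  namespace: kube-system
spec:
  containers:
  - name: kube-apiserver
    image: registry.k8s.io/kube-apiserver:v1.29.0   # illustrative version tag
    command:
    - kube-apiserver
    - --etcd-servers=https://127.0.0.1:2379          # trimmed; real flag list is much longer
```

If this process dies, the kubelet restarts it; if you edit the file, the kubelet picks up the change – no vCenter-style external watchdog required.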
In practice, as a Kubernetes admin (coming from VMware), you would set up three control plane nodes, either as VMs on vSphere or on bare metal, put them behind a virtual IP or load balancer for the API endpoint, and you’ve got a highly available control plane. This would be analogous to deploying the vCenter HA cluster or even multiple vCenters (linked mode) for resiliency – but again, Kubernetes makes this a standard, not an add-on.
Conclusion and What’s Next
In this article, we took a deep dive into Kubernetes’s architecture through VMware-tinted glasses. We saw that the Kubernetes control plane – composed of the API server, etcd, controller manager, and scheduler – plays a role very much like vCenter Server and its associated services, acting as the brain of the cluster. The worker nodes (with kubelet, container runtime, and kube-proxy) serve a role analogous to ESXi hosts, actually running workloads and handling networking under the control plane’s guidance. Along the way, we mapped Kubernetes concepts to VMware ones: for instance, Namespaces are like Resource Pools for organizing and slicing resources, and etcd is the cluster database like vCenter’s DB, albeit more distributed and fault-tolerant. We also discussed how Kubernetes builds high availability into its DNA, running multiple control plane nodes to avoid any single point of failure – a familiar goal for vSphere admins who rely on features like HA and DRS to keep the environment resilient.
The key takeaway is that Kubernetes isn’t an entirely foreign new world. Its design reflects many of the same principles VMware admins have dealt with for years in vSphere: a centralized management plane, nodes that do the work, mechanisms for scheduling, load-balancing, and self-healing, and constructs to organize and protect workloads. By leveraging what you already know (vCenter, ESXi, clusters, HA/DRS, etc.), you can quickly get comfortable with Kubernetes architecture. The specifics are different (and yes, you’ll need to learn Kubernetes YAML, kubectl commands, and new lingo), but the patterns – central control, desired state enforcement, monitoring and recovery, resource sharing – are much the same.
Up Next: In Part 3 of Kubernetes for VMware Admins, we’ll tackle networking. We’ll explore how containers communicate and how Kubernetes Services and Ingress provide load balancing and routing, drawing parallels to VMware networking concepts (virtual switches, distributed switches, load balancers, and even NSX). So stay tuned! By continuing this journey, you’ll be well-equipped to bridge your virtualization knowledge into the realm of containers and Kubernetes, enabling you to build, deploy, and innovate with confidence in a hybrid VMware-Kubernetes world.