The Morning After the Launch

Alex, a seasoned backend engineer with a decade of experience wrestling with monolithic architectures and virtual machines, took a long sip of coffee. The adrenaline from NovaCraft’s product launch the previous night was starting to wear off, replaced by a familiar, nagging anxiety. The launch had been a success—a wild success, in fact. The user numbers were climbing faster than anyone had anticipated, and the congratulatory messages were still buzzing on the company’s Slack channels. But for Alex, the celebration was muted by a growing concern.

The single Pod running their core API, the one Alex had so carefully crafted and containerized, was starting to show signs of strain. CPU and memory usage were steadily creeping up, and the latency graphs were beginning to paint a worrying picture. It was a classic scaling problem, one Alex had faced many times before in the world of VMs. But this was Kubernetes, a whole new beast. Spinning up another VM and manually configuring it wasn’t the answer here. There had to be a better way, a Kubernetes way.

Just then, Sarah, the engineering lead, walked over, a wide grin on her face. “Alex, you’re a lifesaver! The launch was incredible. But we’ve got a good problem on our hands. We need to scale. Fast.”

Alex nodded, a sense of determination replacing the anxiety. “I was just looking at the monitoring dashboards. The API is getting hammered. I’m ready to learn how to scale this thing properly.”

Sarah’s grin widened. “Excellent. Let’s talk about ReplicaSets and Deployments. It’s time to scale your one-person team into an army.”


From One to Many: The Power of Replication

In the world of software, having a single point of failure is a recipe for disaster. A single server can crash, a single process can hang, and a single Pod can be overwhelmed by traffic. The solution, in principle, is simple: run multiple copies of your application. If one copy fails, the others can pick up the slack. This is the core idea behind high availability and scalability.

In Kubernetes, the concept of running multiple identical Pods is managed by an object called a ReplicaSet. A ReplicaSet’s primary job is to ensure that a specified number of Pod replicas are running at any given time. If a Pod goes down, the ReplicaSet Controller will automatically create a new one to replace it. It’s like having a vigilant team manager who ensures that you always have the right number of team members on duty.

The Analogy: The Pizza Shop

Imagine you own a pizza shop. On a normal day, you might only need one chef to handle the orders. But on a Friday night, a single chef would be quickly overwhelmed. To handle the rush, you’d bring in more chefs. A ReplicaSet is like the shift manager who ensures you always have the right number of chefs in the kitchen. If one chef gets sick and goes home, the manager immediately calls in a replacement.


The Technical Deep-Dive: Understanding ReplicaSets

A ReplicaSet is defined by a YAML file, just like a Pod. The key components of a ReplicaSet definition are:

  • replicas: This field specifies the desired number of Pods.
  • selector: This field defines how the ReplicaSet identifies which Pods to manage. It uses labels to match the Pods.
  • template: This field contains the Pod definition that will be used to create new Pods.

Here’s a simple example of a ReplicaSet that ensures three replicas of an nginx Pod are always running:

apiVersion: apps/v1
kind: ReplicaSet
metadata:
  name: nginx-replicaset
spec:
  replicas: 3
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx:1.14.2

In this example, the replicas field is set to 3, so the ReplicaSet will ensure that there are always three Pods running with the label app: nginx. The selector field tells the ReplicaSet to look for Pods with this label. The template field defines the Pod that will be created, which includes the same label.


Deployments: The Declarative Way to Manage ReplicaSets

While ReplicaSets are great for ensuring a certain number of Pods are running, they are rarely used directly. Instead, Kubernetes provides a higher-level object called a Deployment to manage ReplicaSets. A Deployment provides declarative updates to Pods and ReplicaSets.

What does “declarative” mean in this context? It means that you describe the desired state of your application in a YAML file, and the Deployment Controller will take care of making the actual state of the system match your desired state. You don’t have to worry about the individual steps of creating, updating, and deleting Pods and ReplicaSets. You simply declare what you want, and Kubernetes makes it happen.

The Analogy: The Restaurant Chain

If a ReplicaSet is like a shift manager at a single pizza shop, a Deployment is like the regional manager of a restaurant chain. The regional manager doesn’t just care about one shop; they care about all the shops in their region. They decide when to open new shops, when to renovate existing ones, and when to close down underperforming ones. They don’t get involved in the day-to-day management of each shop; they simply set the overall strategy and let the shift managers execute it.

Similarly, a Deployment manages the entire lifecycle of your application. It allows you to:

  • Create and scale a set of Pods.
  • Perform rolling updates to your application without any downtime.
  • Roll back to a previous version of your application if something goes wrong.

The Technical Deep-Dive: Understanding Deployments

A Deployment manifest looks very similar to a ReplicaSet manifest. The key difference is that a Deployment manages a ReplicaSet, which in turn manages the Pods. Here’s an example of a Deployment that manages a ReplicaSet of nginx Pods:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deployment
spec:
  replicas: 3
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx:1.14.2

When you create this Deployment, the Deployment Controller will create a ReplicaSet, and the ReplicaSet Controller will create three Pods. If you then update the image in the Deployment to nginx:1.16.1, the Deployment Controller will create a new ReplicaSet with the updated image and then gradually scale down the old ReplicaSet while scaling up the new one. This process is called a rolling update.
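In manifest terms, the only thing that triggers this rollout is a change to the Pod template; a fragment showing just the changed part (the rest of the manifest stays as above):

```
# Only the Pod template changed; the Deployment Controller detects this
# and creates a new ReplicaSet for the updated template.
spec:
  template:
    spec:
      containers:
      - name: nginx
        image: nginx:1.16.1   # was nginx:1.14.2
```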


Rolling Updates and Rollbacks: Zero-Downtime Deployments

One of the most powerful features of Deployments is the ability to perform rolling updates. A rolling update allows you to update your application to a new version without any downtime. The Deployment Controller achieves this by gradually replacing the old Pods with new ones. This ensures that there are always some Pods available to handle traffic, even during the update.

Here’s how a rolling update works:

  1. You update the Pod template in the Deployment manifest (e.g., by changing the image).
  2. The Deployment Controller creates a new ReplicaSet with the updated Pod template.
  3. The Deployment Controller scales up the new ReplicaSet and scales down the old ReplicaSet one by one.
  4. Once all the old Pods have been replaced by new ones, the old ReplicaSet is scaled down to zero.

If something goes wrong during the update, you can easily roll back to the previous version of your application. The Deployment Controller keeps a history of all the revisions of your Deployment, so you can simply tell it to roll back to a specific revision.
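How much of that history is kept is configurable: the revisionHistoryLimit field caps the number of old ReplicaSets retained for rollback (the default is 10). A minimal sketch of the relevant fragment:

```
spec:
  replicas: 3
  revisionHistoryLimit: 5   # keep only the five most recent old ReplicaSets
```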

Deployment Strategies: RollingUpdate vs Recreate

Deployments offer two strategies for updating Pods:

  • RollingUpdate: This is the default strategy. It gradually replaces the old Pods with new ones, ensuring that the application remains available during the update.
  • Recreate: This strategy kills all the old Pods before creating the new ones. This will cause a short period of downtime, but it can be useful in situations where you need to ensure that the old and new versions of your application are not running at the same time.

You can specify the deployment strategy in the Deployment manifest:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deployment
spec:
  replicas: 3
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 1
      maxSurge: 1
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx:1.14.2

In this example, we’ve specified the RollingUpdate strategy. We’ve also configured the maxUnavailable and maxSurge fields. maxUnavailable specifies the maximum number of Pods that can be unavailable during the update. maxSurge specifies the maximum number of new Pods that can be created above the desired number of Pods. Both fields accept either an absolute number or a percentage of the desired replica count.
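For comparison, the Recreate strategy needs no tuning fields; a sketch of the relevant fragment:

```
spec:
  strategy:
    type: Recreate   # all old Pods are terminated before any new ones start
```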


Labels and Selectors: How Kubernetes Connects the Dots

Labels and selectors are the glue that holds Kubernetes together. Labels are key-value pairs attached to objects, such as Pods and ReplicaSets; selectors are queries that match objects based on those labels. Together, they are used to organize and select groups of objects by their attributes.

In the context of Deployments and ReplicaSets, labels and selectors are what connect the different objects. The Deployment uses its selector (together with an automatically added pod-template-hash label) to tie itself to the ReplicaSets it creates, and each ReplicaSet uses a selector to identify which Pods to manage. This loose coupling between objects makes the system more flexible and resilient.
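Selectors can also be more expressive than a single matchLabels entry: set-based matchExpressions are supported as well. An illustrative fragment (the environment label here is hypothetical):

```
selector:
  matchLabels:
    app: nginx
  matchExpressions:
  - key: environment        # hypothetical label, for illustration only
    operator: In
    values: [staging, production]
```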


Hands-On: Scaling the API, Performing a Rolling Update, and Rolling Back a Bad Deploy

Now that we’ve covered the theory, it’s time to get our hands dirty. In this section, we’ll walk through a practical example of how to use Deployments to scale an application, perform a rolling update, and roll back a bad deploy.

Prerequisites

Before we begin, you’ll need to have the following tools installed on your macOS machine:

  • Docker Desktop: This will provide you with a local Kubernetes cluster.
  • kubectl: This is the command-line tool for interacting with the Kubernetes API.

Step 1: Create a Deployment

First, let’s create a simple Deployment for our API. Create a file called api-deployment.yaml with the following content:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: api-deployment
spec:
  replicas: 1
  selector:
    matchLabels:
      app: api
  template:
    metadata:
      labels:
        app: api
    spec:
      containers:
      - name: api
        image: your-docker-hub-username/k8s-tutorial-api:1.0.0
        ports:
        - containerPort: 8080

Replace your-docker-hub-username with your actual Docker Hub username. If you don’t have a Docker Hub account, you can use a public image for this example, such as gcr.io/google-samples/hello-app:1.0.

Now, apply the Deployment to your cluster:

kubectl apply -f api-deployment.yaml

You can check the status of the Deployment by running the following command:

kubectl get deployments

You should see that the api-deployment has been created and that one replica is running.

Step 2: Scale the Deployment

Now, let’s scale the Deployment to three replicas:

kubectl scale deployment api-deployment --replicas=3

You can check the status of the Deployment again to see that it has been scaled:

kubectl get deployments

You should now see that there are three replicas of your API running.

Step 3: Perform a Rolling Update

Next, let’s perform a rolling update to a new version of our API. First, update the image in the api-deployment.yaml file to a new version, such as your-docker-hub-username/k8s-tutorial-api:1.1.0 or gcr.io/google-samples/hello-app:2.0.

Now, apply the updated Deployment to your cluster:

kubectl apply -f api-deployment.yaml

You can watch the rolling update in action by running the following command:

kubectl rollout status deployment/api-deployment

You should see that the Deployment is gradually replacing the old Pods with new ones.

Step 4: Roll Back a Bad Deploy

Now, let’s imagine that the new version of our API has a bug and we need to roll back to the previous version. You can do this by running the following command:

kubectl rollout undo deployment/api-deployment

You can check the status of the rollout again to see that the Deployment has been rolled back to the previous version:

kubectl rollout status deployment/api-deployment

Debugging and Troubleshooting

Here are some common issues you might encounter when working with Deployments and how to solve them:

  • ImagePullBackOff: This error means that Kubernetes is unable to pull the container image from the registry. This could be because the image name is incorrect, the image tag does not exist, or you have not configured the necessary credentials to pull the image from a private registry.
  • CrashLoopBackOff: This error means that the container is starting and then immediately crashing. This is usually due to a bug in the application code. You can check the logs of the container to see why it is crashing:
kubectl logs <pod-name>
  • Deployment is stuck: If your Deployment is stuck in the middle of a rolling update, it could be because the new Pods are not becoming ready. You can check the status of the Pods to see why they are not ready:
kubectl get pods
You can also describe the Pods to get more information about their status:
kubectl describe pod <pod-name>
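A frequent reason that new Pods never become ready during a rollout is a missing or failing readiness probe. A minimal sketch of adding one to the API container (the /healthz path and port are illustrative assumptions, not part of the example app above):

```
containers:
- name: api
  image: your-docker-hub-username/k8s-tutorial-api:1.1.0
  readinessProbe:
    httpGet:
      path: /healthz   # illustrative endpoint; use your app's real health route
      port: 8080
    initialDelaySeconds: 5
    periodSeconds: 10
```

Until the probe succeeds, the Pod is not counted as available, and the Deployment Controller will pause the rollout rather than take down more old Pods.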

Key Takeaways

  • ReplicaSets ensure that a specified number of Pod replicas are running at any given time.
  • Deployments provide a declarative way to manage ReplicaSets and Pods.
  • Rolling updates allow you to update your application to a new version without any downtime.
  • Rollbacks allow you to easily revert to a previous version of your application if something goes wrong.
  • Labels and selectors are the glue that holds Kubernetes together.

The Calm After the Storm

Alex leaned back in his chair, a sense of accomplishment washing over him. The API was now running on three replicas, and the latency graphs had stabilized. The fire was out. But more importantly, Alex had learned a valuable lesson. He had learned how to scale an application in Kubernetes, not by manually spinning up new servers, but by declaratively defining the desired state of the system and letting Kubernetes handle the rest.

As Alex was packing up for the day, Sarah stopped by his desk again. “Great work today, Alex. You tamed the scaling beast.”

Alex smiled. “Thanks, Sarah. I’m starting to get the hang of this Kubernetes thing. What’s next?”

Sarah’s eyes twinkled. “Next, we’re going to talk about how to expose your application to the outside world. We’re going to talk about Services.”

And with that, Alex knew that his Kubernetes journey was far from over. It was just getting started.