Building a Kubernetes Operator

Authors
  • Amit Bisht

Introduction

A Kubernetes Operator is a method of packaging, deploying, and managing a Kubernetes application. Operators extend Kubernetes' capabilities by automating the management of complex, stateful applications, such as databases or monitoring systems, using Kubernetes resources.

The operator watches for changes in Kubernetes resources (like deployments or services) and acts based on these changes. Operators use custom resources (CRs), whose types are declared through Custom Resource Definitions (CRDs), to manage the lifecycle of these applications, handling tasks such as installation, upgrades, backups, and scaling automatically.

BASIC KNOWLEDGE OF GO IS REQUIRED

Concepts

  • Custom Resources (CRs)

    Custom Resources define the desired state of the application being managed by the operator. They extend the Kubernetes API by introducing new types of objects that are specific to your application. For example, if you're building an operator for a database, you might define a Database custom resource.

  • Custom Resource Definitions (CRDs)

    CRDs are used to define the schema for the Custom Resources. They describe the structure and validation rules for your custom resource. When you create a CRD, Kubernetes understands the new object type and how to interact with it.

  • Operator Controller

    The Operator's controller is the core component responsible for managing the lifecycle of your application. It watches for changes in the state of resources (such as the Custom Resources) and reacts by performing specific actions to bring the system back to the desired state. The controller runs as a Kubernetes pod and interacts with the Kubernetes API server.

  • Reconciliation Loop

    The operator controller continuously watches for changes to the custom resources and executes the reconciliation process. This process involves comparing the current state of the application (actual state) with the desired state described in the CR, and performing the necessary actions (such as creating/deleting resources, scaling, or backing up data) to make the actual state match the desired state.
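The reconciliation idea described above can be sketched in plain Go, without any Kubernetes libraries. The State type and the reconcile function are hypothetical names used only for illustration; a real controller would read the actual state from the API server rather than from an in-memory map.

```go
package main

import (
	"fmt"
	"sort"
)

// State is a hypothetical stand-in for application state:
// the number of replicas per workload name.
type State map[string]int

// reconcile compares desired with actual, updates actual so that it matches
// desired, and returns a sorted list of the actions it had to take.
func reconcile(desired, actual State) []string {
	var actions []string
	for name, want := range desired {
		have, exists := actual[name]
		switch {
		case !exists:
			actions = append(actions, fmt.Sprintf("create %s (replicas=%d)", name, want))
		case have != want:
			actions = append(actions, fmt.Sprintf("scale %s %d->%d", name, have, want))
		default:
			continue // already in the desired state
		}
		actual[name] = want
	}
	for name := range actual {
		if _, wanted := desired[name]; !wanted {
			actions = append(actions, "delete "+name)
			delete(actual, name)
		}
	}
	sort.Strings(actions) // map iteration order is random; sort for stable output
	return actions
}

func main() {
	desired := State{"web": 3, "db": 1}
	actual := State{"web": 1, "cache": 2}

	for _, action := range reconcile(desired, actual) {
		fmt.Println(action)
	}
	// A second pass has nothing left to do, which is what makes
	// running the loop repeatedly safe.
	fmt.Println("pending actions:", len(reconcile(desired, actual)))
}
```

The second call returning no actions illustrates why reconciliation is usually written to be idempotent: the controller may run it many times for the same resource.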

What are we building?

I have kept the requirements as simple as possible.

Our custom resource, DeploySyncer, accepts two parameters, RawFileUrl and intervalSeconds, and the operator is built using Kubebuilder. At the configured interval (60 seconds in this walkthrough), the operator fetches a deployments.yaml file from a public GitHub repository and applies its contents to our Minikube Kubernetes cluster.
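As a small illustration of the interval parameter, the sketch below converts an intervalSeconds value into a Go duration. The syncInterval helper and the 60-second fallback are assumptions made for this example and are not part of the operator built in this post; the fallback is there because, in controller-runtime, a zero RequeueAfter returned without an error means the request is not requeued.

```go
package main

import (
	"fmt"
	"time"
)

// defaultInterval is a hypothetical fallback chosen for this sketch.
const defaultInterval = 60 * time.Second

// syncInterval converts the intervalSeconds value from the spec into a
// time.Duration, falling back to defaultInterval for zero or negative values.
func syncInterval(intervalSeconds int32) time.Duration {
	if intervalSeconds <= 0 {
		return defaultInterval
	}
	return time.Duration(intervalSeconds) * time.Second
}

func main() {
	fmt.Println(syncInterval(60)) // the value used in this walkthrough
	fmt.Println(syncInterval(0))  // unset or zero falls back to the default
	fmt.Println(syncInterval(15))
}
```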

Let's get our hands dirty

  • Setup and Installation Prerequisites

  • Scaffolding the Operator Project

    # create parent folder
    mkdir deploysyncer-operator
    cd deploysyncer-operator
    
    # Command to scaffold an Operator
    kubebuilder init --domain example.com --repo github.com/mythLabs/blog-content/tree/main/kubernetes-operator/deploysyncer-operator
    
    [Screenshot: the files scaffolded by kubebuilder init]

    # Command to scaffold the CRD and controller
    kubebuilder create api --group apps --version v1alpha1 --kind DeploySyncer
    
  • Define the Custom Resource's type: DeploySyncer

    api/v1alpha1/deploysyncer_types.go
    // Update the file generated by scaffolding with the required inputs and statuses needed by the custom resource.
    .
    .
    .
    // DeploySyncerSpec defines the desired state of the DeploySyncer custom resource
    type DeploySyncerSpec struct {
        // RawFileUrl specifies the raw URL of the deployment file in the GitHub repository
        RawFileUrl string `json:"RawFileUrl"`
        
        // IntervalSeconds defines the time interval (in seconds) at which the sync operation should occur
        IntervalSeconds int32 `json:"intervalSeconds"`
    }
    
    // DeploySyncerStatus represents the observed state of the DeploySyncer custom resource
    type DeploySyncerStatus struct {
        // LastStatus stores the result of the last sync operation (e.g., "Success" or "Failure")
        LastStatus string `json:"lastStatus"`
        
        // LastSyncTime records the timestamp of the last successful sync operation
        LastSyncTime string `json:"lastSyncTime"`
    }
    
    .
    .
    .
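To see how the struct tags above tie the Go fields to the keys written in a DeploySyncer manifest, here is a minimal standalone sketch using only encoding/json. The parseSpec helper and the example.invalid URL are placeholders for illustration; in the real operator the decoding is handled by the Kubernetes client machinery, not by code like this.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// DeploySyncerSpec mirrors the spec struct defined in deploysyncer_types.go.
type DeploySyncerSpec struct {
	RawFileUrl      string `json:"RawFileUrl"`
	IntervalSeconds int32  `json:"intervalSeconds"`
}

// parseSpec is a hypothetical helper that decodes the spec section of a
// custom resource, given as JSON, into the Go struct.
func parseSpec(raw []byte) (DeploySyncerSpec, error) {
	var spec DeploySyncerSpec
	err := json.Unmarshal(raw, &spec)
	return spec, err
}

func main() {
	// The keys match the json tags, including the capital R in RawFileUrl.
	raw := []byte(`{"RawFileUrl": "https://example.invalid/deployments.yaml", "intervalSeconds": 60}`)

	spec, err := parseSpec(raw)
	if err != nil {
		fmt.Println("decode failed:", err)
		return
	}
	fmt.Println(spec.RawFileUrl, spec.IntervalSeconds)
}
```

Because the tag on RawFileUrl starts with a capital letter while intervalSeconds does not, the manifest later in this post has to use exactly those spellings.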
    
  • Generate the CRD and Manifests

    # Generate the CRDs from the types defined in the previous step.
    make generate
    make manifests
    
    # Apply the CRDs to the cluster
    kubectl apply -f config/crd/bases
    
  • Create the Controller Logic

    See the comments in the code below for an explanation.

    internal/controller/deploysyncer_controller.go
    package controller
    
    import (
        "context"
        "fmt"
        "time"
    
        "github.com/go-resty/resty/v2" // HTTP client for fetching deployment YAML
        appsv1 "k8s.io/api/apps/v1" // Kubernetes Deployment API
        "k8s.io/apimachinery/pkg/runtime"
        ctrl "sigs.k8s.io/controller-runtime"
        "sigs.k8s.io/controller-runtime/pkg/client"
        "sigs.k8s.io/yaml" // Used for parsing YAML
    
        deploysyncerv1alpha1 "github.com/mythLabs/blog-content/tree/main/kubernetes-operator/deploysyncer-operator/api/v1alpha1"
    )
    
    // DeploySyncerReconciler is the controller that watches DeploySyncer CRs and manages Deployments
    type DeploySyncerReconciler struct {
        client.Client // Kubernetes API client
        Scheme *runtime.Scheme // Scheme for working with Kubernetes objects
    }
    
    // Reconcile is the main function that gets triggered when a DeploySyncer resource is created/updated
    func (r *DeploySyncerReconciler) Reconcile(ctx context.Context, req ctrl.Request) (ctrl.Result, error) {
        
        // Create an empty DeploySyncer object to store the CR details
        deploySyncer := &deploysyncerv1alpha1.DeploySyncer{}
    
        // Fetch the DeploySyncer resource from Kubernetes
        if err := r.Get(ctx, req.NamespacedName, deploySyncer); err != nil {
            return ctrl.Result{}, client.IgnoreNotFound(err) // Ignore if not found (deleted)
        }
    
        // Validate the required RawFileUrl field
        if deploySyncer.Spec.RawFileUrl == "" {
            deploySyncer.Status.LastStatus = "Missing file url"
            r.Status().Update(ctx, deploySyncer) // Update status to indicate failure
            return ctrl.Result{}, fmt.Errorf("Missing file url")
        }
    
        // Set up an HTTP client to fetch the deployment YAML
        restyClient := resty.New()
    
        // Fetch the YAML file from the provided URL
        resp, err := restyClient.R().SetContext(ctx).Get(deploySyncer.Spec.RawFileUrl)
        if err != nil {
            deploySyncer.Status.LastStatus = fmt.Sprintf("Failed to fetch: %v", err)
            r.Status().Update(ctx, deploySyncer)
            // Return a nil error so that RequeueAfter is honored (retry after 1 minute)
            return ctrl.Result{RequeueAfter: time.Minute}, nil
        }
        // A 4xx/5xx response is not a transport error, so check it explicitly
        if resp.IsError() {
            deploySyncer.Status.LastStatus = fmt.Sprintf("Failed to fetch: %s", resp.Status())
            r.Status().Update(ctx, deploySyncer)
            return ctrl.Result{RequeueAfter: time.Minute}, nil
        }
    
        // Parse the fetched YAML into a Kubernetes Deployment object (the desired state)
        deployment := &appsv1.Deployment{}
        if err := yaml.Unmarshal(resp.Body(), deployment); err != nil {
            deploySyncer.Status.LastStatus = "Invalid YAML"
            r.Status().Update(ctx, deploySyncer)
            return ctrl.Result{}, err
        }
    
        // Check if the deployment already exists in the cluster.
        // Read into a separate object so the parsed YAML is not overwritten.
        existing := &appsv1.Deployment{}
        err = r.Get(ctx, client.ObjectKey{Name: deployment.Name, Namespace: deployment.Namespace}, existing)
        if err != nil && client.IgnoreNotFound(err) == nil {
            // Deployment does not exist, create a new one
            if err := r.Create(ctx, deployment); err != nil {
                deploySyncer.Status.LastStatus = fmt.Sprintf("Failed to create deployment: %v", err)
                r.Status().Update(ctx, deploySyncer)
                return ctrl.Result{}, err
            }
        } else if err == nil {
            // Deployment exists, update it with the configuration from the YAML
            deployment.ResourceVersion = existing.ResourceVersion
            if err := r.Update(ctx, deployment); err != nil {
                deploySyncer.Status.LastStatus = fmt.Sprintf("Failed to update deployment: %v", err)
                r.Status().Update(ctx, deploySyncer)
                return ctrl.Result{}, err
            }
        } else {
            // Handle any other errors in fetching deployment
            deploySyncer.Status.LastStatus = fmt.Sprintf("Failed to fetch deployment: %v", err)
            r.Status().Update(ctx, deploySyncer)
            return ctrl.Result{}, err
        }
    
        // Successfully synced, update status with success message
        deploySyncer.Status.LastStatus = "Success"
        deploySyncer.Status.LastSyncTime = time.Now().Format(time.RFC3339)
        if err := r.Status().Update(ctx, deploySyncer); err != nil {
            return ctrl.Result{}, err
        }
    
        // Schedule next sync based on IntervalSeconds
        return ctrl.Result{RequeueAfter: time.Duration(deploySyncer.Spec.IntervalSeconds) * time.Second}, nil
    }
    
    // SetupWithManager registers the controller with the manager
    func (r *DeploySyncerReconciler) SetupWithManager(mgr ctrl.Manager) error {
        return ctrl.NewControllerManagedBy(mgr).
            For(&deploysyncerv1alpha1.DeploySyncer{}). // Watch DeploySyncer CRs
            Owns(&appsv1.Deployment{}). // Also watch Deployments that carry a DeploySyncer owner reference
            Complete(r) // Register the controller
    }
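The fetch step of the controller can be tried in isolation. The sketch below uses the standard library's net/http instead of resty, purely so that it runs without extra dependencies; fetchManifest and demo are hypothetical names, and the local test server with its /deployments.yaml path stands in for the GitHub raw URL. It treats a non-2xx response as an error so that an error page is never parsed as YAML.

```go
package main

import (
	"context"
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"time"
)

// fetchManifest downloads url and returns the response body.
// Any non-2xx status is reported as an error.
func fetchManifest(ctx context.Context, url string) ([]byte, error) {
	ctx, cancel := context.WithTimeout(ctx, 10*time.Second)
	defer cancel()

	req, err := http.NewRequestWithContext(ctx, http.MethodGet, url, nil)
	if err != nil {
		return nil, err
	}
	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		return nil, err
	}
	defer resp.Body.Close()

	if resp.StatusCode < 200 || resp.StatusCode > 299 {
		return nil, fmt.Errorf("fetch %s: unexpected status %s", url, resp.Status)
	}
	return io.ReadAll(resp.Body)
}

// demo runs fetchManifest against a throwaway local server and returns the
// body of an existing file plus the error produced for a missing one.
func demo() (string, error) {
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if r.URL.Path != "/deployments.yaml" {
			http.NotFound(w, r)
			return
		}
		fmt.Fprint(w, "kind: Deployment\n")
	}))
	defer srv.Close()

	body, err := fetchManifest(context.Background(), srv.URL+"/deployments.yaml")
	if err != nil {
		return "", err
	}
	_, missingErr := fetchManifest(context.Background(), srv.URL+"/missing.yaml")
	return string(body), missingErr
}

func main() {
	body, missingErr := demo()
	fmt.Printf("fetched: %q\n", body)
	fmt.Println("missing file:", missingErr)
}
```

Passing the context through to the request means a fetch is abandoned if the caller's context is cancelled, which is a reasonable choice for code that runs inside a reconcile call.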
    
    
  • Update the dependencies and permissions

    config/rbac/role.yaml
    # Allow the service account used by the operator to make changes to Deployments on the cluster
    .
    .
    .
    - apiGroups:
      - apps  # Specifies that this rule applies to resources in the "apps" API group (which includes Deployments).
      resources:
      - deployments  # Grants permissions for "Deployment" resources in Kubernetes.
      verbs:
      - create  # Allows the creation of new Deployment resources.
      - delete  # Allows the deletion of existing Deployment resources.
      - get     # Allows retrieving details of a specific Deployment.
      - list    # Allows listing all Deployments in the namespace or cluster.
      - patch   # Allows making partial updates to an existing Deployment.
      - update  # Allows fully updating an existing Deployment.
      - watch   # Allows monitoring changes to Deployments in real-time.
    
    
    # Install the go-resty HTTP client used to fetch the deployment file from GitHub
    go get github.com/go-resty/resty/v2
    

At this stage, push the code generated so far to GitHub, because it is used as a Go module when the controller is built.

  • Build and Deploy the Operator

    # build and publish the docker image
    make docker-build docker-push IMG=<your-image-name>
    
    [Screenshot: the existing cluster setup, a vanilla installation, before the operator is deployed]

  • Deploy the Operator to your Kubernetes cluster

    # Deploy the operator using the provided make target
    make deploy IMG=<your-image-name>
    
    [Screenshots: installing the operator and verifying that it is up and running]

  • Create the DeploySyncer Custom Resource

    deploysyncer.yaml
        apiVersion: apps.example.com/v1alpha1
        kind: DeploySyncer
        metadata:
            name: nginx-deploysyncer
            namespace: deploysyncer-operator-system # same namespace where the operator pod is running
        spec:
            intervalSeconds: 60
            RawFileUrl: https://raw.githubusercontent.com/mythLabs/blog-content/refs/heads/main/kubernetes-operator-app/deployments.yaml
    
    # create custom resource using remote file
    kubectl apply -f https://raw.githubusercontent.com/mythLabs/blog-content/refs/heads/main/kubernetes-operator-app/deploysyncer.yaml
    
    [Screenshot: verification]

    We can now see the custom type DeploySyncer in the cluster, and the operator will create and update the Deployment from the remote repository.

  • Publishing

    • For alpha testing

      docker tag <local-image> <username>/<repo>:<tag>
      docker push <username>/<repo>:<tag>
      kubectl apply -f https://raw.githubusercontent.com/mythLabs/blog-content/refs/heads/main/kubernetes-operator/deploysyncer-operator/config/crd/bases/
      
    • For beta testing and release

      We will need to create a Helm chart that includes these CRDs and publish it. The operator image will be hosted in a public or private registry, depending on the requirement.

Thanks for reading!