
GitOps Revamp

This project was a complete rework of the previous GitOps implementation. It included building new Kubernetes clusters for each environment and deploying all workloads to them, including production.

  • Kubernetes
  • ArgoCD
  • AWS
  • Helm
  • Azure DevOps
  • Git

Need & Benefits

The previous GitOps implementation had several critical flaws:

  • Overly Permissive Permissions:
    Access controls were too broad, increasing security risks.
  • Single Repository for All Environments:
    Storing all environments in the same repository resulted in poor change management and heightened the risk of accidental modifications, such as unintended changes to production.
  • Complex Code Organization:
    The structure of the codebase was difficult to navigate and maintain.
  • Embedded Helm Charts:
    Helm charts were stored directly within the configuration code, leading to a cluttered Git history and making changes harder to track.
  • Inefficient GitOps Bridge:
    The integration between Terraform and ArgoCD was impractical and lacked a streamlined approach.

The project addressed each of these flaws; see Key Benefits.

My Roles & Missions

  • Lead
    I presented the project and drove it to its full potential.
  • Engineer
    I implemented the project.

Specification

I had to rethink the entire implementation. Below is a simplified overview of the code organization for one environment.

flowchart LR
    subgraph aws["<b>AWS</b>"]
        subgraph eks["<b>Kubernetes cluster</b>"]
            argocd("ArgoCD")
        end
    end

    subgraph azdo["<b>Azure DevOps</b>"]
        subgraph charts["<b>Helm Charts</b>"]
            meta_workload("helm-argocd-meta-apps<br/><i>chart repository</i>")

            subgraph app_charts["<b>Application charts</b>"]
                app_chart_1("Application 1")
                app_chart_2("Application 2")
                app_chart_3("Application 3")
                app_chart_4("...")
                app_chart_1 ~~~ app_chart_2
                app_chart_3 ~~~ app_chart_4
            end
        end

        configuration("gitops-workload-{env}<br/><i>gitops configuration repository</i>")
        configuration -.->|refers to| app_charts
    end

    argocd -->|deploys| meta_workload
    argocd -->|syncs on| configuration
    argocd -->|deploys| app_charts

The diagram above shows a single environment. Below is a complete example with three environments (dev, staging, and production).

Complete organization example
flowchart LR
    subgraph aws["<b>AWS</b>"]
        subgraph eks_dev["<b>Kubernetes cluster<br/>DEV</b>"]
          argocd_dev("ArgoCD<br/>DEV")
        end
        subgraph eks_stg["<b>Kubernetes cluster<br/>STAGING</b>"]
          argocd_stg("ArgoCD<br/>STAGING")
        end
        subgraph eks_prd["<b>Kubernetes cluster<br/>PROD</b>"]
          argocd_prd("ArgoCD<br/>PROD")
        end
    end

    subgraph azdo["<b>Azure DevOps</b>"]
        subgraph charts["<b>Helm Charts</b>"]
            meta_workload("helm-argocd-meta-apps<br/><i>chart repository</i>")

            subgraph app_charts["<b>Application charts</b>"]
                app_chart_1("Application 1")
                app_chart_2("Application 2")
                app_chart_3("Application 3")
                app_chart_4("...")
                app_chart_1 ~~~ app_chart_2
                app_chart_3 ~~~ app_chart_4
            end
        end

        configuration_dev("gitops-workload-dev<br/><i>gitops configuration repository</i>")
        configuration_stg("gitops-workload-staging<br/><i>gitops configuration repository</i>")
        configuration_prd("gitops-workload-prod<br/><i>gitops configuration repository</i>")
        configuration_dev -.->|refers to| app_charts
        configuration_stg -.->|refers to| app_charts
        configuration_prd -.->|refers to| app_charts
    end

  argocd_dev -->|deploys| meta_workload
  argocd_dev -->|syncs on| configuration_dev
  argocd_dev -->|deploys| app_charts
  argocd_stg -->|deploys| meta_workload
  argocd_stg -->|syncs on| configuration_stg
  argocd_stg -->|deploys| app_charts
  argocd_prd -->|deploys| meta_workload
  argocd_prd -->|syncs on| configuration_prd
  argocd_prd -->|deploys| app_charts

In the previous implementation, everything lived in a single Git repository. I split it into one configuration repository per environment plus dedicated chart repositories, which makes changes easier to control and review.
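To make the one-repository-per-environment idea concrete, here is a minimal Python sketch that builds an ArgoCD `Application` manifest pointing at the configuration repository of a given environment. Only the `gitops-workload-{env}` naming comes from the diagrams above. The Azure DevOps base URL, the application name, the `main` branch, the `default` project, and the `argocd` namespace are assumptions made for illustration, not the values used in the actual project.

```python
# Hypothetical base URL: the real Azure DevOps organization and project are not given here.
AZDO_BASE = "https://dev.azure.com/example-org/example-project/_git"

# Environments shown in the complete organization diagram.
ENVIRONMENTS = ("dev", "staging", "prod")


def config_repo(env: str) -> str:
    """Return the configuration repository name for one environment."""
    if env not in ENVIRONMENTS:
        raise ValueError(f"unknown environment: {env}")
    return f"gitops-workload-{env}"


def argocd_application(env: str) -> dict:
    """Build an ArgoCD Application manifest (as a dict) for one environment.

    Each cluster runs its own ArgoCD, so the destination is the in-cluster
    API server and the source is that environment's own repository.
    """
    return {
        "apiVersion": "argoproj.io/v1alpha1",
        "kind": "Application",
        # Assumed name and namespace, for illustration only.
        "metadata": {"name": f"workload-{env}", "namespace": "argocd"},
        "spec": {
            "project": "default",  # assumed ArgoCD project
            "source": {
                "repoURL": f"{AZDO_BASE}/{config_repo(env)}",
                "targetRevision": "main",  # assumed branch
                "path": ".",
            },
            "destination": {
                "server": "https://kubernetes.default.svc",
                "namespace": "argocd",
            },
        },
    }
```

Because the repository URL is derived from the environment name, the ArgoCD instance in production can only ever sync on `gitops-workload-prod`, and a change merged to the dev repository cannot reach it by accident.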

Key Benefits

The revamp addressed each shortcoming of the previous implementation:

  • Tightened Permissions:
    Access controls were refined, adhering to the principle of least privilege to enhance security.
  • Environment Isolation:
    Each environment now resides in a dedicated repository, improving change management and reducing the risk of accidental modifications, especially to production.
  • Simplified Code Organization:
    The codebase was restructured to improve readability, maintainability, and ease of use.
  • Decoupled Helm Charts:
    Helm charts were moved to dedicated repositories, resulting in cleaner configuration repositories and a more comprehensible Git history.
  • Streamlined GitOps Workflow:
    A more efficient and automated bridge between Terraform and ArgoCD was implemented, reducing complexity and improving deployment reliability.

Progression

After writing the specification and validating the solution, it was time to roll this significant change out, all the way to production.

I built new clusters for each environment:

  • First, because it made rolling back easier in case of an issue (I was thinking in production terms from the very beginning).
  • Second, because there was a lot of ongoing development. Building alongside the existing clusters avoided disrupting developers during the project and kept them productive.

Once the new clusters were set up, we tested the switch and the rollback in the lower environments and made sure everything worked in both directions. Once this process was proven reliable, we switched production over with minimal downtime.
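The rehearsal described above can be sketched as a small Python function. This is only an illustration of the "test both directions" idea: the `switch_to` and `is_healthy` callables are hypothetical stand-ins for whatever actually moves traffic between clusters and checks the workloads, which is not detailed here.

```python
from typing import Callable


def rehearse_cutover(
    switch_to: Callable[[str], None],
    is_healthy: Callable[[str], bool],
) -> bool:
    """Switch old -> new, then new -> old, checking health after each step.

    Returns True only if both the switch and the rollback succeed.
    Traffic always ends on the old cluster, since this is a rehearsal.
    """
    for target in ("new", "old"):
        switch_to(target)
        if not is_healthy(target):
            if target == "new":
                # The switch failed: roll back before reporting the failure.
                switch_to("old")
            return False
    return True


# Example with an in-memory stand-in for the real traffic switch.
state = {"active": "old"}
rehearsal_ok = rehearse_cutover(
    switch_to=lambda target: state.update(active=target),
    is_healthy=lambda target: state["active"] == target,
)
```

Running the same rehearsal in every lower environment before touching production is what turns the rollback from a plan into something already exercised.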

Conclusion

This was a large project that demanded over two months of preparation and planning (1). However, the results were worth it: it reduced risks, improved tracking, and gave better control over changes. An unexpected benefit was that people started appreciating the new organization, which was a pleasant bonus!

  1. Due to confidentiality, I left out most of the complexity inherent to the company context.