Encrypt Helm sensitive data
A guide on how to stay safe when pushing Helm values files containing your passwords and other sensitive data to version control.
Problematic confidential values for subcharts in Helm
Helm is a package manager for Kubernetes applications that allows developers to manage, deploy and roll back applications hosted on Kubernetes. Helm charts naturally end up under version control, so that developers or the DevOps team can review the code and raise merge requests. Storing the Helm setup in a code repository raises an important question: how should we handle the sensitive data that the setup will inevitably require, be it database credentials, API keys, tokens or passwords? By definition, sensitive data should not be stored as plain text in version control.
Of course, one can use Kubernetes Secrets and then refer to those values in the Helm deployment.yaml file. This does not help, however, with values.yaml files, which store the default values for a Helm chart. It is especially inconvenient when we want to configure an external dependency (subchart) declared in our chart. Imagine deploying a backend application with PostgreSQL as a subchart, for which database credentials have to be provided in the values.yaml file. This article presents two solutions to this problem.
A short disclaimer: in this article I assume you:
- are using GitLab as your version control and CI/CD tool;
- have a Docker image with gcloud, kubectl and Helm built into it (if not, proposed content for such an image is enclosed at the end of the article);
- are quite familiar with the concepts of Kubernetes, GitLab pipelines, Helm and Google Cloud Platform.
For the purpose of this article, I will present a pipeline with a single deploy stage and job that is triggered as a part of GitLab CI/CD pipeline.
Substitute sensitive data in the CI/CD workflow using CI/CD variables
One of the solutions, which is simple and quick to implement, is to substitute the sensitive values during pipeline execution. This merely requires putting the secrets in GitLab CI/CD variables and then referring to them in the “.gitlab-ci.yml” file, where pipeline jobs are defined. Remember to always mask the variables stored in the CI/CD settings and to control access to them.
This is suitable for a very straightforward setup with a small number of variables and a homogeneous environment. Otherwise, we end up storing a long list of masked sensitive data for all our environments in one place. From my experience, this solution has 3 major drawbacks in larger projects, namely:
- it is hard to test and debug the helm release upgrade locally, from the terminal, since the values.yaml files do not contain crucial data necessary for the pods to start;
- it is easy to put some data in the GitLab CI/CD variables when it’s needed, forget about it and never remove it, even when it is not needed any longer;
- the .gitlab-ci.yml file gets cluttered with lots of variables, and it becomes unclear which variables are required to run the pipeline and which are needed by the pods (application).
In this case the pipeline setup looks as follows:
image: # a public docker image from Docker Hub with google cloud sdk, kubectl, helm tools installed
stages:
  - deploy
deploy:
  stage: deploy
  script:
    - gcloud auth activate-service-account $ACC --key-file=$JSON_F
    - helm upgrade -n namespace chart_name chart_location
      --set secret1=$SECRET1
      --set secret2=$SECRET2
      --values non-secret-values.yaml
      --atomic
      --install
The example above can be used to deploy our Helm chart to a Kubernetes cluster hosted on Google Kubernetes Engine, but you can replace the first script line to authenticate against any other Kubernetes cluster of your choice. Notice the main problem with the script: the more secrets we have, the longer the list of variables stored in GitLab CI/CD variables. The problem gets even more cumbersome when we have more than one deployment environment (dev, staging, prod), each with different sensitive values, and have to debug them locally with the “--dry-run” flag.
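For local debugging, the growing list of flags can be assembled by a small helper. The following is only a sketch under stated assumptions: the function name build_set_flags and the rule that each variable maps to a lower-cased chart key (SECRET1 to secret1) are hypothetical and merely mirror the example pipeline above.

```shell
# Hypothetical helper: build the --set flags from the secret variables and
# fail early when one of them is missing. Assumption: every variable maps to
# a chart key with the same name in lower case, as in the pipeline above.
build_set_flags() {
  flags=""
  for name in SECRET1 SECRET2; do
    # Read the variable whose name is stored in $name (empty if unset).
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "missing variable: $name" >&2
      return 1
    fi
    key=$(printf '%s' "$name" | tr '[:upper:]' '[:lower:]')
    flags="$flags --set $key=$value"
  done
  # Strip the leading space before printing.
  printf '%s\n' "${flags# }"
}

# Dummy values, for illustration only.
SECRET1="dummy-one"
SECRET2="dummy-two"
build_set_flags   # prints: --set secret1=dummy-one --set secret2=dummy-two
```

The printed flags could then be appended to a local helm upgrade run with --dry-run. Values containing whitespace would need more careful quoting than this sketch provides.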
That is why in such situations it is better to use the second option, the helm secrets plugin.
Helm Secrets plugin
Helm does not provide out-of-the-box secrets management, but there is a very useful plugin called helm secrets, which can be used to encrypt the sensitive data and decrypt it during the pipeline run.
To install the plugin use the following command:
helm plugin install https://github.com/jkroepke/helm-secrets
As always, it is good to install a specific version (using the --version flag) to avoid potential compatibility issues in the future. I strongly recommend installing it in the base image used in the GitLab CI/CD pipeline.
Helm Secrets uses mozilla/sops as its default backend, but HashiCorp Vault has also been supported since version 3.2.0. In order to use helm secrets, one ought to:
- first install Mozilla sops, a standalone tool that is available, for example, from its GitHub releases page (add it to the Docker image as well);
- add the .sops.yaml file to the project, in the same directory where Chart.yaml is located.
In this file we define the settings for our encryption key, or the path to the key itself. One can, for example, use Google Cloud Key Management Service (KMS) to create such a key.
After the key is generated, one has to do 2 things:
- create a service account and a JSON key file for it, which will be used to access the key for encryption and decryption;
- copy the resource name of the key and place it in the .sops.yaml file.
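The same steps can also be scripted with the gcloud CLI instead of the web console. The following is a hedged sketch, not the setup used in this article: every name (project, key ring, key, service account, output file) is a hypothetical placeholder, and granting the roles/cloudkms.cryptoKeyEncrypterDecrypter role is an assumption about the minimal permission the service account needs. The commands are wrapped in a function so that nothing is created until it is called explicitly.

```shell
# Hypothetical placeholders; adjust all of them before calling the function.
PROJECT_ID="my-project"
LOCATION="global"
KEYRING="helm-secrets"
KEY="sops-key"
SA_NAME="sops-deployer"
SA_EMAIL="${SA_NAME}@${PROJECT_ID}.iam.gserviceaccount.com"

# Resource name of the key, in the form used by Cloud KMS and sops.
KMS_RESOURCE="projects/${PROJECT_ID}/locations/${LOCATION}/keyRings/${KEYRING}/cryptoKeys/${KEY}"

create_sops_key() {
  # Create the key ring and a symmetric encryption key in it.
  gcloud kms keyrings create "$KEYRING" \
    --project "$PROJECT_ID" --location "$LOCATION" &&
  gcloud kms keys create "$KEY" \
    --project "$PROJECT_ID" --location "$LOCATION" \
    --keyring "$KEYRING" --purpose encryption &&
  # Create the service account and allow it to use the key (assumed role).
  gcloud iam service-accounts create "$SA_NAME" --project "$PROJECT_ID" &&
  gcloud kms keys add-iam-policy-binding "$KEY" \
    --project "$PROJECT_ID" --location "$LOCATION" --keyring "$KEYRING" \
    --member "serviceAccount:${SA_EMAIL}" \
    --role roles/cloudkms.cryptoKeyEncrypterDecrypter &&
  # Download the JSON key file for the service account.
  gcloud iam service-accounts keys create sa-key.json --iam-account "$SA_EMAIL"
}

# Print the resource name that goes into .sops.yaml.
echo "$KMS_RESOURCE"
```

Calling create_sops_key in an authenticated gcloud session would then create the resources. The downloaded sa-key.json must never be committed to the repository.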
The resource name of the generated key can be obtained by selecting the “Copy resource name” option for the key.
Once copied to the clipboard, it should be pasted into the .sops.yaml file.
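A minimal .sops.yaml for a Google Cloud KMS key could be created as sketched below. The resource name is a hypothetical placeholder that stands in for the value copied from Cloud KMS, and the file is written to the current directory on the assumption that this is where Chart.yaml lives.

```shell
# Hypothetical key resource name; replace it with the copied value.
KMS_RESOURCE="projects/my-project/locations/global/keyRings/helm-secrets/cryptoKeys/sops-key"

# Write .sops.yaml next to Chart.yaml (assumed to be the current directory).
# creation_rules tells sops which key to use when encrypting new files.
cat > .sops.yaml <<EOF
creation_rules:
  - gcp_kms: ${KMS_RESOURCE}
EOF

cat .sops.yaml
```

With a single rule and no path filter, sops would apply this key to every file encrypted from this directory.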
Now the only thing left is to use the downloaded JSON key file to encrypt our sensitive data. In the terminal, we need to export the path to the JSON key file as the GOOGLE_APPLICATION_CREDENTIALS variable:
export GOOGLE_APPLICATION_CREDENTIALS=/path/to/json/service/account/file
Now in the same terminal we are able to encrypt the file contents with the sensitive data using:
helm secrets enc <SECRET_FILE_PATH>
After the encryption we should see output like the following:
A very nifty feature of helm secrets is the edit command, which allows us to edit the decrypted sensitive data using vim and encrypt the data back again when saving changes.
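Before pushing, it may be worth double-checking that the file really was encrypted. A minimal sketch, assuming the file is YAML encrypted by sops, which adds a top-level sops: metadata block and wraps values in ENC[...]; the helper name is_sops_encrypted and the path in the usage comment are hypothetical.

```shell
# Hypothetical helper: succeed only if the given file looks sops-encrypted.
is_sops_encrypted() {
  # sops adds a top-level "sops:" metadata block and stores values as ENC[...].
  grep -q '^sops:' "$1" && grep -q 'ENC\[' "$1"
}

# Possible usage before a commit (the path is only an example):
# is_sops_encrypted chart_location/values-prod/secrets.yaml || echo "NOT encrypted" >&2
```

This is only a heuristic on the file contents. It does not verify that the file can actually be decrypted with the configured key.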
Now we can safely push our file to GitLab. One additional thing we need to do is to copy the contents of the previously generated JSON file and place it in the CI/CD variables as a variable of type ‘File’. It will be used in the pipeline job to authenticate against Google Cloud Platform in order to decrypt the file during the helm upgrade/install. The “.gitlab-ci.yml” file would now look as follows:
image: # a public docker image from Docker Hub with google cloud sdk, kubectl and helm tools (with secrets plugin and mozilla sops) installed
stages:
  - deploy
deploy:
  stage: deploy
  script:
    - gcloud auth activate-service-account $ACC --key-file=$JSON_F
    - helm secrets upgrade -n namespace chart_prod chart_location
      -f chart_location/values-prod/secrets.yaml
      -f chart_location/values-prod/values.yaml
      --atomic
      --install
With this we should be able to safely deploy our application without revealing our secret values in version control. Helm secrets will decrypt the files using the key from Google Cloud KMS, store the decrypted values in a temporary file, apply the changes to the Kubernetes resources, and remove the temporary file after the job is executed. The file in the repository stays encrypted throughout.
Required Docker image
As mentioned multiple times in this article, one should have a proper Docker image containing:
- gcloud-sdk;
- kubectl;
- helm 3 and helm secrets plugin;
- sops.
Of course, by searching Docker Hub thoroughly, one can probably find plenty of images with all those tools on board. In case there is no suitable image yet, the following Dockerfile can help in building a proper one:
FROM google/cloud-sdk:slim
# install kubectl
RUN cd /tmp \
    && curl -LO https://storage.googleapis.com/kubernetes-release/release/`curl -s https://storage.googleapis.com/kubernetes-release/release/stable.txt`/bin/linux/amd64/kubectl \
    && chmod +x ./kubectl \
    && mv ./kubectl /usr/local/bin/kubectl
# install helm-3
RUN cd /tmp \
    && curl -fsSL -o get_helm.sh https://raw.githubusercontent.com/helm/helm/master/scripts/get-helm-3 \
    && chmod 700 get_helm.sh \
    && ./get_helm.sh
# install helm secrets
RUN helm plugin install https://github.com/jkroepke/helm-secrets --version v3.11.0
# install mozilla sops
RUN curl -L https://github.com/mozilla/sops/releases/download/v3.7.1/sops_3.7.1_amd64.deb -o sops_3.7.1_amd64.deb \
    && dpkg -i sops_3.7.1_amd64.deb \
    && rm sops_3.7.1_amd64.deb
After building the image and pushing it to Docker Hub, one can use it in the CI/CD pipeline to deploy Helm charts to a Google Kubernetes Engine cluster.
Conclusions
Dealing with sensitive data is always a challenge and should be approached with proper care. This article presented two ways of dealing with confidential data in our CI/CD pipeline:
- by using GitLab CI/CD variables;
- by using mozilla sops and helm secrets;
While the first option is easy and straightforward, it can quickly become cumbersome when dealing with lots of confidential values for different environments. The second solution requires more effort at first, but will pay off in the long run, especially since it lets us encrypt entire files.