If you’ve used Kubernetes, if you’ve even looked in its general direction, then you know that YAML configurations are the heart of deploying workloads and applications to your cluster. If you’ve used Kubernetes for a while, you’ll also know how important it is to get these YAMLs right - and not just their syntax, but the actual content in regard to security, resources, tooling, conventions, etc.
Fortunately, there are a plethora of tools out there to help you navigate the overwhelm. YAML validators and policy engines like Monokle are ready to check every meticulously crafted Kubernetes configuration property to make sure it’s in line with your requirements - and your ops team’s.
Now, let’s take a step back to get the big picture: what exactly ARE YAML policies? What workflows, strategies, and challenges do they involve, and how do they relate to Kubernetes YAML configurations?
A YAML “Policy” is a fancy word for a set of rules that are applied to your YAML configurations. These rules can be as simple as “property X is required” or as complex as “if your Pod is in Namespace Y in the staging cluster, its memory allocation should be between Z and W MB”.
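To make that concrete, here is a minimal sketch of what those two rules might look like in a declarative policy file. The schema and field names below are purely illustrative - they don’t belong to any specific policy engine:

```yaml
# Illustrative only -- this schema is hypothetical, not a real engine's format
policies:
  - name: require-app-label
    target:
      kind: Pod
    rule: "metadata.labels.app is required"          # "property X is required"
  - name: staging-memory-bounds
    target:
      kind: Pod
      namespace: team-y                              # hypothetical namespace
      cluster: staging
    rule: "spec.containers[*].resources.limits.memory between 128Mi and 512Mi"
```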
Rules can be expressed in a variety of ways: some policy engines ship with a hard-coded set of rules that can be enabled or disabled, while other engines allow - or require - you to define your own rules, for example declaratively or in a dedicated policy language such as OPA’s Rego.
Let’s look briefly at the types of policies that are commonly applied in Kubernetes environments.
Security-related policies are the bread-and-butter of many policy validation engines: checks that processes run with the right privileges, that they have access to the right resources, that networks are constrained as appropriate, and so on, are both easy to understand and to enforce.
The Kubernetes project has defined its own “Pod Security Standards” comprising three policies - Privileged, Baseline, and Restricted - that broadly cover the security spectrum. These policies are cumulative, ranging from highly permissive to highly restrictive, and are described in detail at https://kubernetes.io/docs/concepts/security/pod-security-standards/
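Since Kubernetes v1.25, these standards can be enforced natively by labeling a namespace for the built-in Pod Security Admission controller. A minimal example (the namespace name is hypothetical):

```yaml
apiVersion: v1
kind: Namespace
metadata:
  name: payments                                  # hypothetical namespace
  labels:
    # Reject any Pod that violates the "restricted" (most secure) standard
    pod-security.kubernetes.io/enforce: restricted
    # Additionally surface warnings for the same standard when applying manifests
    pod-security.kubernetes.io/warn: restricted
```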
Managing resource usage is key to the efficient use of often-costly Kubernetes infrastructure, and there are many opportunities for defining resource-related constraints within Kubernetes YAML configurations. Corresponding policies ensure that these are set correctly, both to minimize unnecessary cost and to ensure that computation-intensive processes are allocated the resources they need.
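Most resource policies boil down to checking the requests and limits declared on each container. A sketch of the fields such a policy would typically inspect (names and values here are hypothetical):

```yaml
# Fragment of a Pod spec -- the fields a resource policy typically inspects
spec:
  containers:
    - name: api                            # hypothetical container
      image: example.com/api:1.4.2         # hypothetical image
      resources:
        requests:                          # what the scheduler reserves
          cpu: 250m
          memory: 256Mi
        limits:                            # hard caps enforced at runtime
          cpu: "1"
          memory: 512Mi
```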
As Kubernetes continues to evolve, so does the consensus on how to configure deployed applications to ensure compatibility and compliance with both internal and external standards. This can relate to naming, metadata required by other tools, or configuration management by Kubernetes applications themselves. Such policies are often context-specific, so a flexible rule engine that allows for custom rules and parameterization is especially important here.
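For example, a convention policy might require the Kubernetes-recommended app.kubernetes.io/* labels plus some internal metadata - something like the following, where the internal label names are hypothetical:

```yaml
metadata:
  name: payments-api                         # naming convention: <team>-<service>
  labels:
    app.kubernetes.io/name: payments-api     # Kubernetes-recommended labels
    app.kubernetes.io/part-of: payments
    app.kubernetes.io/managed-by: Helm
    team: payments                           # hypothetical internal requirement
    cost-center: cc-1234                     # hypothetical internal requirement
```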
Ultimately, many applications running under Kubernetes will have their own idioms and conventions for how to best configure their resources and related components. Enforcing these consistently often becomes key to scaling out application deployments efficiently across an organization.
Policies can be applied or enforced at several points in the application lifecycle, and the earlier the better! However, all this depends on the nature of the policies themselves, which we will explore below.
Pre-commit policy validation is done locally in your working environment, either with a validation tool run from the CLI or with an integration in your IDE. Pre-commit validation has the major advantage that problems are identified and fixed immediately (some IDE plugins can even validate in real time), saving you from having to wait for a validation result later in your workflow.
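One way to wire this up is with the pre-commit framework and a local hook. In the sketch below, the validation script is a placeholder for whichever CLI validator you use, and the manifest location is hypothetical:

```yaml
# .pre-commit-config.yaml -- sketch; scripts/validate-manifests.sh is a placeholder
repos:
  - repo: local
    hooks:
      - id: validate-k8s-yaml
        name: Validate Kubernetes YAML against policies
        entry: scripts/validate-manifests.sh   # your validator CLI of choice
        language: script
        files: ^manifests/.*\.ya?ml$           # hypothetical manifest location
```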
One potential drawback of pre-commit validation is the inability to execute rules that require access to the entire runtime environment/cluster to enforce integrity or infrastructure-related policies. Another drawback relates to consistent enforcement: how do you make sure everyone on your team is actually applying and adhering to the defined rules?
Pre-deployment validation is usually performed as part of your CI/CD workflows after you have committed your configurations to your SCM (usually Git). This could be a GitHub Action that validates your PR, a Jenkins build step that runs the corresponding validation as part of a build, or a GitOps reconciliation that validates your YAML configurations before syncing them to target clusters - see the sketch below.
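As a sketch, a GitHub Actions workflow for PR-time validation might look like this; the manifest path and the validation command are placeholders for your own setup:

```yaml
# .github/workflows/validate.yaml -- runs policy validation on every pull request
name: validate-k8s-config
on:
  pull_request:
    paths:
      - "manifests/**"                                  # hypothetical manifest location
jobs:
  validate:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Validate manifests against policies
        run: scripts/validate-manifests.sh manifests/   # placeholder validator CLI
```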
An advantage of pre-deployment validation is that it is harder to bypass than pre-commit validation: validation rules are configured centrally and applied consistently in all build processes. One potential drawback is the same as for pre-commit validation - the inability to execute rules that require access to the target runtime environment. Another drawback was alluded to above: engineers have to commit their code and wait for validation results, which adds to the turnaround time for fixing validation errors.
Post-deployment validation is performed inside Kubernetes using a dedicated “Admission Controller” - an in-cluster component that validates the YAML of a resource before it is actually accepted and provisioned by Kubernetes. This is perhaps the most common way to get started with configuration policies: being closely tied to Kubernetes itself ensures that no YAML configuration can bypass the policy check, and running inside your target environment allows rules to take full advantage of existing Kubernetes resources and their configurations.
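Policy engines like OPA Gatekeeper and Kyverno ship as such admission controllers, and recent Kubernetes versions include a built-in, CEL-based one (the ValidatingAdmissionPolicy API, GA in v1.30). A minimal sketch of the built-in variant - the policy name and rule are illustrative:

```yaml
# Requires the built-in ValidatingAdmissionPolicy API (GA in Kubernetes v1.30)
apiVersion: admissionregistration.k8s.io/v1
kind: ValidatingAdmissionPolicy
metadata:
  name: disallow-latest-tag
spec:
  failurePolicy: Fail
  matchConstraints:
    resourceRules:
      - apiGroups: [""]
        apiVersions: ["v1"]
        operations: ["CREATE", "UPDATE"]
        resources: ["pods"]
  validations:
    # Reject Pods whose container images are explicitly tagged :latest
    - expression: "object.spec.containers.all(c, !c.image.endsWith(':latest'))"
      message: "Container images must be pinned to a specific tag, not :latest."
---
apiVersion: admissionregistration.k8s.io/v1
kind: ValidatingAdmissionPolicyBinding
metadata:
  name: disallow-latest-tag-binding
spec:
  policyName: disallow-latest-tag
  validationActions: [Deny]
```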
The main drawback (and it’s a big one) is the turnaround time to fix potential problems; backtracking an identified post-deployment validation error to the actual source/commit where it needs to be fixed can be a tedious and time-consuming process, which often results in long turnaround times for validation errors - and/or in problems being ignored rather than fixed.
Lastly, in-cluster validation checks the YAML manifests of your actual deployed objects running in Kubernetes to ensure they are compliant with your defined policies. In-cluster validation can be helpful when inspecting an existing cluster that has none of the above policy enforcement points in place: it lets you assess the actual need for policies and helps you decide where to start when defining an initial policy. Some tools will also allow you to “hot-fix” problems found in your cluster - which can be a lifesaver for critical security, configuration, or resource-consumption issues.
Obviously, this is a last resort; identifying a configuration problem in a running cluster means that harm could already have been done to your environment.
Let’s also touch on some other policy-related aspects that might not be of immediate concern, but will become so as your applications and Kubernetes infrastructure evolve.
As uninspiring as this may sound, managing policies quickly becomes an issue if you’re working in an organization where policies need to be shared across team members, clusters, and environments. How do you make sure that everyone is using the same policies? Should policies be versioned? Should there be different policies for different environments? For example, you might have higher security requirements or stricter resource constraints in production than in a local development environment - should that be taken into consideration when rolling out a policy solution? If so, how do you keep track of which policies are applied where?
These issues are well worth discussing within your team even if you decide to “cross that bridge when you get there” - at least you’ve acknowledged that there is a bridge coming up.
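One way to make that discussion concrete is to sketch how policies could be versioned and layered per environment. The structure below is purely hypothetical - it is not a specific tool’s format:

```yaml
# Hypothetical layout for versioned, per-environment policy sets
policySets:
  base:
    version: 1.4.0                       # versioned and shared across all teams
    rules:
      - require-app-label
      - disallow-latest-tag
  production:
    extends: base                        # stricter rules layered on top of base
    rules:
      - restricted-pod-security
      - strict-memory-limits
  development:
    extends: base                        # relaxed: no extra enforcement locally
```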
Platform Engineers are often tasked with providing their development teams with guidelines and guardrails for building and deploying applications in compliance with organizational rules and requirements. Policies are a great tool in this regard: they can be defined centrally, adapted to the specific needs of the organization, and applied at multiple points in the application lifecycle as we have seen above, making it possible (if not always easy) to cater to the needs of different teams and workflows within the organization.
In this context policies also serve as a great point of collaboration; members of different teams can get together to express their needs and collaborate on the definition of shared policies - giving them an understanding of both complexities and constraints related to infrastructure and workflows. Policy-Driven-Development (PDD) anyone?
Perhaps the most pressing question after diving into the world of Kubernetes YAML policies is how to get started in a way that is not overwhelming but still productive. That will be the topic of our next blog post - and while waiting for that, why not head over to Monokle.io to learn more about how Monokle can help you achieve consistent policy enforcement across your projects, teams, and workflows.
Monokle helps you achieve high-quality Kubernetes deployments throughout the entire application lifecycle - from code to cluster. It enables your team to define Kubernetes configuration policies to ensure consistent, secure, and compliant application deployments every time. In addition to policy enforcement, Monokle’s ecosystem of tools makes your team’s daily YAML configuration workflows easier. Get started with Monokle for free.