JSON Schema for custom Kubernetes objects

A large fraction of DevOps work is writing manifests, most often in YAML. The format is easy for humans to read but hard to write correctly. It is ubiquitous, though there may be better options.

JSON Schema is a vocabulary that allows you to annotate and validate JSON documents. YAML files can be validated with it fairly easily by converting each YAML document in the file to a separate JSON document. Some tools, such as the IntelliJ IDEs, even support validating YAML files against JSON Schemas natively.
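The conversion step can be sketched in a few lines of Python. This is only a minimal sketch, assuming PyYAML is installed (it provides `yaml.safe_load_all`); the function names `to_json_documents` and `convert_file` are hypothetical and not part of any tool mentioned here.

```python
import json


def to_json_documents(documents):
    """Serialize each parsed YAML document as its own JSON string.

    Empty documents (parsed as None) are skipped.
    """
    return [
        # default=str keeps YAML-only types such as dates serializable.
        json.dumps(doc, indent=2, sort_keys=True, default=str)
        for doc in documents
        if doc is not None
    ]


def convert_file(path):
    """Read a multi-document YAML file and return one JSON string per document."""
    import yaml  # PyYAML (third-party); imported lazily, assumed to be installed

    with open(path, encoding="utf-8") as handle:
        # safe_load_all yields one Python object per `---`-separated document.
        return to_json_documents(yaml.safe_load_all(handle))
```

Each returned string can then be handed to whichever JSON Schema validator is in use.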

Validation is useful in CI, where it can reject invalid manifests. Where it really shines is IDE integration, which highlights violations in real time:


Example of invalid value in an enum field.

Example of alert definition without expected labels.

Kubernetes object types are determined by TypeMeta: the apiVersion and kind keys in the root object determine what other keys the object should contain. For example, a YAML manifest for a Pod can also have metadata, spec and status keys. JSON Schema has supported conditional subschemas (if/then/else) since draft 7 (example). This allows a single JSON Schema file to validate any number of Kubernetes object types.

First, we need to require apiVersion and kind. If the object being validated does not contain both, we reject it right away.

{
  "$schema": "http://json-schema.org/draft-07/schema#",
  "required": ["apiVersion", "kind"],
  "properties": {
    "apiVersion": {
      "enum": [
        "example.com/v1alpha1",
        "example.com/v1alpha2"
      ],
      "type": "string"
    },
    "kind": {
      "type": "string"
    }
  },
  "type": "object",
  "allOf": [ "... see snippet below ..." ]
}

The interesting part comes in the allOf block. We generate a conditional subschema for each of the defined apiVersion values:

{
  "$schema": "http://json-schema.org/draft-07/schema#",
  "...": "... see snippet above ...",
  "allOf": [
    {
      "if": {
        "properties": {
          "apiVersion": {
            "const": "example.com/v1alpha1"
          }
        }
      },
      "then": {
        "allOf": [
          {
            "properties": {
              "kind": { "enum": ["CustomKind", "AnotherCustomKind", "..."] }
            }
          },
          "... conditional sub-schemas for each Kind"
        ]
      }
    }
  ]
}
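The generation step could be sketched roughly as follows. This is an illustrative sketch rather than the actual generator: the function name `build_schema`, the input shape (a mapping of apiVersion to kind to subschema) and the sample kinds are assumptions made for the example.

```python
import json


def build_schema(kinds_by_api_version):
    """Build a draft-07 schema from {apiVersion: {kind: subschema}}."""
    version_branches = []
    for api_version, kinds in kinds_by_api_version.items():
        # One inner if/then per kind, applied only within this apiVersion.
        kind_branches = [
            {
                "if": {"properties": {"kind": {"const": kind}}},
                "then": subschema,
            }
            for kind, subschema in kinds.items()
        ]
        version_branches.append(
            {
                "if": {"properties": {"apiVersion": {"const": api_version}}},
                "then": {
                    "allOf": [
                        {"properties": {"kind": {"enum": list(kinds)}}},
                        *kind_branches,
                    ]
                },
            }
        )
    return {
        "$schema": "http://json-schema.org/draft-07/schema#",
        "type": "object",
        "required": ["apiVersion", "kind"],
        "properties": {
            "apiVersion": {"type": "string", "enum": list(kinds_by_api_version)},
            "kind": {"type": "string"},
        },
        # Assumes at least one apiVersion; draft-07 requires a non-empty allOf.
        "allOf": version_branches,
    }


if __name__ == "__main__":
    # Hypothetical kinds and rules, for illustration only.
    example = {
        "example.com/v1alpha1": {
            "CustomKind": {"required": ["spec"]},
            "AnotherCustomKind": {"required": ["metadata"]},
        }
    }
    print(json.dumps(build_schema(example), indent=2))
```

Keeping the per-kind subschemas as plain data means the large combined schema never has to be edited by hand.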

With the two nested conditional subschemas, the first for apiVersion and the second for kind, we are able to generate a large JSON Schema that can validate any number of distinct object types. Note that using oneOf instead of conditional subschemas is not sufficient, because an IDE could not correctly suggest fixes for violations. CI would still report whether the input file is valid, but the error messages would be hard to read.
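To see why conditional subschemas keep error messages targeted, here is a toy evaluator covering only the handful of keywords used in this article. It is a sketch for illustration, not a replacement for a real validator; the function `validate`, the `SCHEMA` constant and the rule that CustomKind requires spec are all hypothetical.

```python
def validate(instance, schema, path="$"):
    """Return a list of error strings for a small subset of draft-07 keywords."""
    errors = []
    if schema.get("type") == "object" and not isinstance(instance, dict):
        return [f"{path}: expected an object"]
    if schema.get("type") == "string" and not isinstance(instance, str):
        errors.append(f"{path}: expected a string")
    if "const" in schema and instance != schema["const"]:
        errors.append(f"{path}: must equal {schema['const']!r}")
    if "enum" in schema and instance not in schema["enum"]:
        errors.append(f"{path}: must be one of {schema['enum']}")
    if isinstance(instance, dict):
        for key in schema.get("required", []):
            if key not in instance:
                errors.append(f"{path}: missing required key {key!r}")
        # `properties` constrains only keys that are actually present.
        for key, subschema in schema.get("properties", {}).items():
            if key in instance:
                errors.extend(validate(instance[key], subschema, f"{path}.{key}"))
    for subschema in schema.get("allOf", []):
        errors.extend(validate(instance, subschema, path))
    # Conditional subschema: `then` is applied only when `if` validates.
    if "if" in schema and "then" in schema:
        if not validate(instance, schema["if"], path):
            errors.extend(validate(instance, schema["then"], path))
    return errors


SCHEMA = {
    "type": "object",
    "required": ["apiVersion", "kind"],
    "allOf": [
        {
            "if": {"properties": {"apiVersion": {"const": "example.com/v1alpha1"}}},
            "then": {
                "allOf": [
                    {"properties": {"kind": {"enum": ["CustomKind", "AnotherCustomKind"]}}},
                    {
                        "if": {"properties": {"kind": {"const": "CustomKind"}}},
                        "then": {"required": ["spec"]},  # hypothetical rule
                    },
                ]
            },
        }
    ],
}

# Only the branch whose condition matches contributes errors.
print(validate({"apiVersion": "example.com/v1alpha1", "kind": "CustomKind"}, SCHEMA))
print(validate({"apiVersion": "example.com/v1alpha1", "kind": "Typo"}, SCHEMA))
```

One detail the sketch makes visible: because properties constrains only keys that are present, an if condition passes vacuously when its key is missing. That is one reason the top-level required on apiVersion and kind matters.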

We use this for custom values files for Helm charts, Prometheus rule definitions, the GitLab CI spec and pretty much any other YAML file.