Tekton’s docker-in-docker (dind) setup is surprisingly fragile because the inner Docker daemon must run as root with privileged access to the node’s kernel, which is exactly what Kubernetes security defaults and many cluster policies are designed to restrict.

Here’s how to build container images securely and reliably using Tekton’s dind:

Common Causes and Fixes

  1. Insufficient Privileges for the dind Container: The dind container needs elevated privileges to start its own Docker daemon.

    • Diagnosis: Check the logs of your dind pod. You’ll likely see errors related to mounting /var/lib/docker or starting the Docker daemon.
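    • Fix: Run the daemon as a sidecar in your Task, so it stays up while the steps execute, and set privileged: true in its securityContext. A daemon started as an ordinary step never exits, so later steps would never run.
      sidecars:
        - name: docker-daemon
          image: docker:20.10.17-dind
          securityContext:
            privileged: true # Lets dockerd mount filesystems and manage cgroups
          script: |
            #!/bin/sh
            dockerd-entrypoint.sh
          volumeMounts:
            - name: docker-storage
              mountPath: /var/lib/docker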
      
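    • Why it works: privileged: true grants the container all Linux capabilities and access to the node’s devices, and lifts the default seccomp and AppArmor restrictions. dockerd needs this to mount filesystems, manage cgroups, and set up network namespaces for the containers it runs. The trade-off is that the container can effectively act as root on the node, so limit who is allowed to run this Task.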
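      One case where privileged: true alone may not be enough: if the cluster enforces Pod Security Standards at the baseline or restricted level, the pod is rejected at admission and the daemon never starts. A minimal sketch, assuming the TaskRuns execute in a namespace called ci (a hypothetical name) and that your cluster uses the built-in Pod Security Admission controller:

      ```yaml
      # Assumption: "ci" is a placeholder for the namespace where your TaskRuns run.
      apiVersion: v1
      kind: Namespace
      metadata:
        name: ci
        labels:
          # Allow privileged pods in this namespace only
          pod-security.kubernetes.io/enforce: privileged
      ```

      Keeping dind builds in a dedicated namespace means the relaxed policy does not apply to the rest of the cluster.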
  2. Incorrectly Mounted Docker Storage: The dind daemon needs writable storage at /var/lib/docker for its images, containers, and build cache.

    • Diagnosis: Look for volume definitions in your Task or Pipeline that mount /var/lib/docker within the dind container. If this is missing or incorrectly configured, the dind daemon won’t start or will lose state between runs.
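    • Fix: Add a volume and mount it at /var/lib/docker in the dind container, not in the step that runs the docker client. The step only needs DOCKER_HOST to find the daemon. An emptyDir works for a single run; a hostPath volume is common when the cache must survive, but be mindful of the security implications.
      apiVersion: tekton.dev/v1beta1
      kind: Task
      metadata:
        name: build-docker-image
      spec:
        params:
          - name: IMAGE_URL
            description: URL of the image to build
            type: string
        volumes:
          - name: docker-storage
            emptyDir: {} # Or use hostPath to keep the cache across runs on the same node
        steps:
          - name: build-and-push
            image: docker:20.10.17
            env:
              - name: DOCKER_HOST # The sidecar shares the pod's network namespace
                value: tcp://localhost:2375
            command: ["/bin/sh", "-c"]
            args:
              - docker build -t $(params.IMAGE_URL) . && docker push $(params.IMAGE_URL)
        sidecars:
          - name: docker-daemon
            image: docker:20.10.17-dind
            securityContext:
              privileged: true
            env:
              - name: DOCKER_TLS_CERTDIR # Empty value disables TLS, so dockerd listens on plain TCP 2375
                value: ""
            readinessProbe: # Tekton waits for sidecars to be ready before starting steps
              exec:
                command: ["docker", "info"]
            volumeMounts:
              - name: docker-storage
                mountPath: /var/lib/docker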
      
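      Note: An emptyDir lasts only as long as the TaskRun’s pod, which is enough for a single build. To reuse the Docker layer cache across runs, you’d typically use a hostPath volume pointing to a specific directory on the Kubernetes node, e.g., /mnt/docker-data. That only helps when runs are scheduled onto the same node, and it exposes part of the node’s filesystem to a privileged container.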
    • Why it works: This provides the dockerd process inside the container with a dedicated directory to store its state, allowing it to function as a full Docker daemon.
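      For the cross-run cache variant, the volumes entry of the Task could be swapped for something like the following. This is a sketch rather than a recommendation: it reuses the example path /mnt/docker-data and assumes that directory is writable on the node and that only one daemon uses it at a time, since two Docker daemons sharing one /var/lib/docker can corrupt each other’s state.

      ```yaml
      volumes:
        - name: docker-storage
          hostPath:
            path: /mnt/docker-data   # example path; must be writable on the node
            type: DirectoryOrCreate  # create the directory on first use
      ```

      The volumeMounts entry stays the same, because it refers to the volume by name.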
  3. Service Account Lacking Registry Credentials: If you’re pushing images, the Kubernetes Service Account used by your Tekton PipelineRun needs credentials for your Docker registry attached to it.

    • Diagnosis: Check your pipeline logs for denied: requested access to the resource is denied or similar authentication errors when docker push is executed.
    • Fix: Create a Kubernetes Secret containing your Docker registry credentials, attach it to a ServiceAccount by listing it under that ServiceAccount’s secrets field, and run the PipelineRun with that ServiceAccount.
      apiVersion: v1
      kind: Secret
      metadata:
        name: docker-creds
      type: kubernetes.io/dockerconfigjson
      data:
        .dockerconfigjson: <base64-encoded-docker-config-json>
      
      Then, in your PipelineRun:
      apiVersion: tekton.dev/v1beta1
      kind: PipelineRun
      metadata:
        name: my-pipeline-run
      spec:
        pipelineRef:
          name: my-pipeline
        serviceAccountName: tekton-pipelines-service-account # This SA must exist and list docker-creds under its secrets
      
      No docker login step is needed: docker push reads the credentials from the Docker config file that Tekton prepares. Note that a PipelineRun has no secrets field of its own; the link always goes through the ServiceAccount.
    • Why it works: Tekton’s credential initialization collects the kubernetes.io/dockerconfigjson secrets attached to the run’s ServiceAccount and writes them to ~/.docker/config.json in each step, where the docker client looks them up when it pushes.
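      The attachment itself is a small manifest. A minimal sketch, reusing the Secret and ServiceAccount names from the examples above and assuming both live in the same namespace as the PipelineRun:

      ```yaml
      apiVersion: v1
      kind: ServiceAccount
      metadata:
        name: tekton-pipelines-service-account # must match serviceAccountName in the PipelineRun
      secrets:
        - name: docker-creds # the kubernetes.io/dockerconfigjson Secret
      ```

      If the ServiceAccount already exists, add the secrets entry to it rather than replacing the whole object, so that existing role bindings and image pull secrets are left alone.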
  4. Resource Constraints on the dind Pod: The Docker daemon can be resource-intensive, especially during image builds. If the dind pod doesn’t have enough CPU or memory, it can crash or become unresponsive.

    • Diagnosis: Monitor the dind pod’s resource utilization in Kubernetes. Look for OOMKilled events or high CPU usage leading to timeouts.
    • Fix: Increase the resource requests and limits for the dind container in your Task definition.
      - name: docker-daemon
        image: docker:20.10.17-dind
        securityContext:
          privileged: true
        resources:
          requests:
            cpu: "1000m"
            memory: "1Gi"
          limits:
            cpu: "2000m"
            memory: "2Gi"
        script: |
          #!/bin/sh
          dockerd-entrypoint.sh
        volumeMounts:
          - name: docker-storage
            mountPath: /var/lib/docker
      
    • Why it works: A container that exceeds its memory limit is OOM-killed, and one that hits its CPU limit is throttled, which shows up as slow or timed-out builds. Adequate requests and limits give the Docker daemon room to run builds without either happening.
  5. Network Policies or Firewalls Blocking Registry Access: The dind container needs to reach the Docker registry, and potentially other services, to pull base images and push results.

    • Diagnosis: Check dind container logs for network-related errors (e.g., connection refused, timeout, name resolution failed).
    • Fix: Ensure your Kubernetes network policies allow egress traffic from the dind pod to your Docker registry (e.g., docker.io, gcr.io, quay.io) on port 443, and to the cluster DNS service on port 53 so that registry hostnames resolve. Also, verify that the Kubernetes node itself has proper network connectivity.
    • Why it works: Network policies can restrict pod-to-pod and pod-to-external communication. Allowing necessary egress traffic ensures the dind daemon can reach external services like registries.
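      As an illustration, an egress policy along these lines would cover DNS and HTTPS for the build pod. The policy name and the ci namespace are hypothetical, and the selector assumes the pod carries the tekton.dev/task label that Tekton normally sets on TaskRun pods; check the labels on your own pods before relying on it.

      ```yaml
      apiVersion: networking.k8s.io/v1
      kind: NetworkPolicy
      metadata:
        name: allow-dind-registry-egress # hypothetical name
        namespace: ci                    # assumed namespace of the TaskRuns
      spec:
        podSelector:
          matchLabels:
            tekton.dev/task: build-docker-image # assumed label; verify on your pods
        policyTypes:
          - Egress
        egress:
          # DNS lookups, so registry hostnames can be resolved
          - ports:
              - protocol: UDP
                port: 53
              - protocol: TCP
                port: 53
          # HTTPS to any destination; add a "to" block to narrow it to known registry CIDRs
          - ports:
              - protocol: TCP
                port: 443
      ```

      The sidecar and the steps share one pod, so a single policy covers both the docker client and the daemon.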
  6. Using an Outdated or Incompatible Docker-in-Docker Image: The dind image version might have bugs or incompatibilities with your Kubernetes environment or Tekton version.

    • Diagnosis: Examine the dind container logs for specific error messages that might indicate a version mismatch or known issues with that Docker version.
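    • Fix: Pin your dind image to a version you have verified in your own cluster, or try upgrading/downgrading to a different patch release. For example, docker:20.10.17-dind names one exact release, unlike a floating tag such as docker:dind, which changes whenever a new version is published. Keep the client image in the build step on the same version.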
      - name: docker-daemon
        image: docker:20.10.17-dind # Pin to a specific, known-good version
        # ... rest of the configuration
      
    • Why it works: Different versions of Docker have different behaviors and dependencies. Using a specific version ensures predictable behavior and avoids introducing unexpected bugs from newer or older releases.

The next error you’ll likely encounter after fixing these is related to the actual build process failing due to syntax errors in your Dockerfile or missing build context files.
