Kubernetes Network Policies for Zero Trust: Complete Security Guide

Kubernetes Network Policies for Zero Trust

Kubernetes network policies are the foundation of zero trust networking in containerized environments. By default, every pod can communicate with every other pod in a Kubernetes cluster — a flat network with no access control. This default-allow behavior is convenient for getting workloads talking to each other quickly, but it is a serious liability once you run anything sensitive. If an attacker compromises a single pod through a vulnerable dependency, they can immediately scan and reach every other service in the cluster. Network policies let you define granular rules about which pods can talk to which services, on which ports, and in which direction, turning that flat network into a set of explicitly authorized paths.

This guide covers everything from basic deny-all policies to advanced L7 filtering with Cilium. If you are running any production workload on Kubernetes, network policies are not optional — they are essential security hygiene. Importantly, they enforce isolation at the data path itself rather than relying on application-level checks, so a misconfigured service or a forgotten debug endpoint cannot accidentally expose the rest of the cluster.

Default Deny: The Starting Point

The first step in zero trust networking is denying all traffic by default and explicitly allowing only what is needed:

# default-deny-all.yaml
# Apply to every namespace that runs application workloads
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-all
  namespace: production
spec:
  podSelector: {}    # Applies to ALL pods in namespace
  policyTypes:
    - Ingress        # Block all incoming traffic
    - Egress         # Block all outgoing traffic

Kubernetes network policies zero trust diagram — Default deny: start by blocking all traffic, then allow specific paths

After applying default-deny, your pods cannot communicate with anything — not even DNS. Now you selectively open paths.

How Policies Combine: The Additive Model

One subtlety trips up almost everyone new to network policies: they are purely additive, and there is no explicit “deny” rule. A pod’s allowed traffic is the union of every policy that selects it. In other words, you cannot write a rule that says “deny pod X” — instead, the absence of an allow rule is the denial. As soon as any policy selects a pod for a given direction (Ingress or Egress), that direction switches from default-allow to default-deny for that pod, and only the listed rules are permitted. Consequently, ordering does not matter and there is no priority field in the standard API. This is why the default-deny policy above is the cornerstone: without it, adding a narrow allow rule to one pod leaves every other pod wide open.
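The additive model is easiest to see with two policies that select the same pods. The sketch below is illustrative: the policy names, the `monitoring` namespace, and port 9090 are assumptions, not part of the setup described in this guide. Neither policy overrides the other, and the api-server pods accept the union of both rules.

```yaml
# Two independent policies select the same pods (app: api-server).
# Allowed ingress is the union of both rule sets; apply order is irrelevant.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-frontend-to-api        # hypothetical name
  namespace: production
spec:
  podSelector:
    matchLabels:
      app: api-server
  policyTypes:
    - Ingress
  ingress:
    - from:
        - podSelector:
            matchLabels:
              app: frontend
      ports:
        - protocol: TCP
          port: 8080
---
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-metrics-scrape         # hypothetical name
  namespace: production
spec:
  podSelector:
    matchLabels:
      app: api-server
  policyTypes:
    - Ingress
  ingress:
    - from:
        - namespaceSelector:
            matchLabels:
              kubernetes.io/metadata.name: monitoring   # assumed namespace
      ports:
        - protocol: TCP
          port: 9090                                     # assumed metrics port
```

Deleting either policy removes only its own allowance. The other keeps working, and the pods stay in default-deny for ingress as long as at least one policy still selects them.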

Another common surprise is that ingress and egress are evaluated independently on each side of a connection. For pod A to reach pod B on port 5432, A needs an egress rule permitting traffic to B, and B needs an ingress rule permitting traffic from A. Forgetting either half produces a connection that hangs and times out rather than failing fast — which makes debugging frustrating until you internalize the two-sided model.

Essential Allow Rules

# allow-dns.yaml — Required for service discovery
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-dns
  namespace: production
spec:
  podSelector: {}
  policyTypes:
    - Egress
  egress:
    - to:
        - namespaceSelector:
            matchLabels:
              kubernetes.io/metadata.name: kube-system
      ports:
        - protocol: UDP
          port: 53
        - protocol: TCP
          port: 53

---
# allow-api-to-database.yaml — Specific service communication
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-api-to-db
  namespace: production
spec:
  podSelector:
    matchLabels:
      app: api-server
  policyTypes:
    - Egress
  egress:
    - to:
        - podSelector:
            matchLabels:
              app: postgresql
      ports:
        - protocol: TCP
          port: 5432

---
# allow-ingress-to-api.yaml — External traffic to API
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-ingress-to-api
  namespace: production
spec:
  podSelector:
    matchLabels:
      app: api-server
  policyTypes:
    - Ingress
  ingress:
    - from:
        - namespaceSelector:
            matchLabels:
              kubernetes.io/metadata.name: ingress-nginx
      ports:
        - protocol: TCP
          port: 8080
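Because default-deny-all blocks ingress as well as egress, the allow-api-to-db policy above covers only the sending half of the connection. A matching ingress rule on the database pods is likely needed too. The following is a minimal sketch that reuses the labels from the examples above; the policy name is an assumption.

```yaml
# allow-db-from-api.yaml — receiving half of the api-server -> postgresql path
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-db-from-api            # hypothetical name
  namespace: production
spec:
  podSelector:
    matchLabels:
      app: postgresql                # the pods being protected
  policyTypes:
    - Ingress
  ingress:
    - from:
        - podSelector:
            matchLabels:
              app: api-server        # only the API may connect
      ports:
        - protocol: TCP
          port: 5432
```

With both halves applied, api-server can open connections to PostgreSQL on 5432. Every other pod in the namespace still times out when it tries.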

The selector trap: namespaceSelector vs podSelector

The shape of your selectors controls exactly how broad each rule is, and the difference between combining them and listing them separately is easy to get wrong. Consider these two fragments. The first matches only pods labeled app: backend that live in a namespace labeled team: payments. The second matches any pod labeled app: backend in the policy’s own namespace, or any pod at all in a team: payments namespace, which is a much wider rule.

# AND: pod app=backend AND in a team=payments namespace
ingress:
  - from:
      - namespaceSelector:
          matchLabels:
            team: payments
        podSelector:           # same list item = logical AND
          matchLabels:
            app: backend

# OR: any pod app=backend in this namespace, OR any pod in team=payments
ingress:
  - from:
      - namespaceSelector:     # separate list items = logical OR
          matchLabels:
            team: payments
      - podSelector:
          matchLabels:
            app: backend

Whether two selectors live under the same - list item (AND) or under separate items (OR) changes the security posture entirely. When in doubt, prefer the explicit AND form, because over-broad OR rules silently defeat the isolation you are trying to build. Also remember that a bare podSelector with no namespaceSelector only matches pods in the policy’s own namespace — cross-namespace traffic always requires a namespace selector.
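A namespaceSelector matches on namespace labels, not namespace names, so the team: payments selector above only works if that label actually exists on the namespace. The sketch below shows one way to declare it; the namespace name is hypothetical. The kubernetes.io/metadata.name label used earlier in this guide is different: the control plane sets it automatically on current Kubernetes versions.

```yaml
# Namespace carrying the label that the namespaceSelector examples match on
apiVersion: v1
kind: Namespace
metadata:
  name: payments-prod        # hypothetical namespace name
  labels:
    team: payments           # must be set explicitly; selectors match this label
```

If the label is missing, the selector matches nothing and the traffic is silently dropped, which looks exactly like a missing allow rule.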

Advanced Policies with Cilium

Standard policies work at L3/L4 (IP and port). Cilium extends this to L7, letting you filter by HTTP method, path, and headers:

# cilium-l7-policy.yaml
apiVersion: cilium.io/v2
kind: CiliumNetworkPolicy
metadata:
  name: api-l7-policy
  namespace: production
spec:
  endpointSelector:
    matchLabels:
      app: api-server
  ingress:
    - fromEndpoints:
        - matchLabels:
            app: frontend
      toPorts:
        - ports:
            - port: "8080"
              protocol: TCP
          rules:
            http:
              # Only allow GET and POST to specific paths
              - method: "GET"
                path: "/api/v1/products.*"
              - method: "POST"
                path: "/api/v1/orders"
                headers:
                  - 'Content-Type: application/json'
              # Block access to admin endpoints
              # (deny by default, only specified paths allowed)

Cilium network policy visualization — Cilium provides L7 visibility and filtering for HTTP, gRPC, and Kafka traffic

DNS-Based Egress Control

# cilium-dns-policy.yaml — Control external API access
apiVersion: cilium.io/v2
kind: CiliumNetworkPolicy
metadata:
  name: allow-external-apis
  namespace: production
spec:
  endpointSelector:
    matchLabels:
      app: payment-service
  egress:
    # Allow only specific external domains
    - toFQDNs:
        - matchName: "api.stripe.com"
        - matchName: "api.sendgrid.com"
      toPorts:
        - ports:
            - port: "443"
              protocol: TCP
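    # Allow DNS and send it through Cilium's DNS proxy; without the dns rule
    # below, Cilium never sees the responses and toFQDNs cannot learn any IPs
    - toEndpoints:
        - matchLabels:
            k8s:io.kubernetes.pod.namespace: kube-system
            k8s-app: kube-dns
      toPorts:
        - ports:
            - port: "53"
              protocol: ANY
          rules:
            dns:
              - matchPattern: "*"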

FQDN-based egress matters because external IP addresses are a moving target. Cloud APIs like Stripe or SendGrid sit behind CDNs and rotate through large IP ranges, so a static CIDR allow-list breaks the moment they add a new edge node. Cilium solves this by transparently inspecting DNS responses: when your pod resolves api.stripe.com, Cilium captures the returned IPs and programs an ephemeral allow rule for exactly those addresses. As a result, you express intent in human terms (“this service may reach Stripe”) while the data path stays accurate automatically.
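For contrast, this is roughly what the static alternative looks like with the standard API. It is a sketch only: the CIDR is a placeholder from the documentation address range, not a real provider range, and the policy name is an assumption. It is reasonable when the destination publishes a stable range, such as an on-premises system, and fragile otherwise.

```yaml
# Static CIDR egress with a standard NetworkPolicy (no FQDN awareness)
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-egress-static-cidr     # hypothetical name
  namespace: production
spec:
  podSelector:
    matchLabels:
      app: payment-service
  policyTypes:
    - Egress
  egress:
    - to:
        - ipBlock:
            cidr: 203.0.113.0/24     # placeholder range, replace with a real one
      ports:
        - protocol: TCP
          port: 443
```

Every change to the provider's address space means editing and redeploying this manifest. The FQDN rule avoids that maintenance burden.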

L3/L4 Versus L7: Choosing the Right Layer

It is tempting to reach for L7 rules everywhere, but they come with real costs. To inspect HTTP, Cilium transparently redirects matched traffic through an Envoy proxy running on the node, which adds latency and CPU overhead and only works for protocols Cilium can parse (HTTP, gRPC, Kafka). For the vast majority of pod-to-pod links, plain L3/L4 rules such as “frontend may reach api-server on 8080” give you the isolation you need at near-zero cost. Reserve L7 filtering for genuinely high-value boundaries: locking an admin API to GET-only for one caller, or enforcing that a service can only publish to a specific Kafka topic. The table below summarizes the trade-off.

Layer    Matches on              Overhead   Use for
-------  ----------------------  ---------  --------------------------------
L3/L4    IP, port, protocol      Minimal    Default for all internal traffic
L7 HTTP  method, path, headers   Proxy hop  Admin APIs, sensitive endpoints
L7 DNS   destination FQDN        DNS proxy  Egress to external SaaS APIs
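As an illustration of a boundary where L7 earns its overhead, the sketch below restricts one caller to read-only access on admin paths. The ops-dashboard label, the /admin/ path prefix, and the policy name are assumptions made for the example.

```yaml
# L7 rule: a single caller may issue GET requests under /admin/, nothing else
apiVersion: cilium.io/v2
kind: CiliumNetworkPolicy
metadata:
  name: admin-read-only              # hypothetical name
  namespace: production
spec:
  endpointSelector:
    matchLabels:
      app: api-server
  ingress:
    - fromEndpoints:
        - matchLabels:
            app: ops-dashboard       # assumed caller label
      toPorts:
        - ports:
            - port: "8080"
              protocol: TCP
          rules:
            http:
              - method: "GET"
                path: "/admin/.*"    # path is a regular expression
```

Requests from ops-dashboard that use any other method or path do not match the rule and are rejected at the proxy. A plain L3/L4 rule could only allow or block the whole port.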

Production Implementation Pattern

Implementation Order:

1. Audit current traffic (Cilium Hubble / Calico flow logs)
2. Document required communication paths
3. Apply default-deny in staging FIRST
4. Add allow rules based on audit
5. Test thoroughly — broken policies = outage
6. Apply to production namespace by namespace
7. Monitor with network policy dashboards
8. Alert on denied traffic (may indicate misconfiguration OR attack)

The audit step deserves emphasis, because rolling out default-deny without first observing real traffic is the fastest way to cause an outage. Tools such as Cilium Hubble or Calico flow logs record actual flows so you can generate allow rules from observed behavior rather than guesswork. A safer rollout technique is to deploy your policies in an audit or “policy-audit” mode where violations are logged but not dropped; you watch the logs for a few days, confirm nothing legitimate is being denied, and only then flip to enforcement. This staged approach catches the surprises — a metrics sidecar scraping pods, a cron job hitting an internal API — that no architecture diagram ever captures.
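With Cilium installed through its Helm chart, the audit-then-enforce rollout can be sketched as a values fragment like the one below. Treat it as an assumption to check against the chart version you run: option names and defaults can differ between releases, and other CNIs expose audit or staged policies in their own way.

```yaml
# values.yaml fragment for the Cilium Helm chart (a sketch, verify per version)
policyAuditMode: true     # report would-be denials instead of dropping traffic
hubble:
  enabled: true           # flow visibility on each node
  relay:
    enabled: true         # aggregates flows for cluster-wide queries
  ui:
    enabled: true         # optional service map and flow browser
```

Once the flow logs show no legitimate traffic flagged as a policy violation, set policyAuditMode back to false so the same policies begin dropping traffic.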

When NOT to Use Network Policies

Network policies add operational complexity. Consequently, they may not be worth it for: development/sandbox clusters, single-tenant applications with no compliance requirements, or very small clusters where all pods are trusted. Additionally, ensure your CNI plugin supports network policies, because not all do (Flannel doesn’t by default). On a CNI that lacks enforcement, the API server still accepts NetworkPolicy objects, but nothing enforces them, so the cluster looks protected while remaining wide open. It is also worth being honest about their limits: network policies are not a substitute for mutual TLS or authentication. They restrict which pods can connect, but they do not encrypt traffic or verify identity cryptographically, so a compromised pod that is on an allowed path can still send malicious requests. For defense in depth, pair network policies with a service mesh or mTLS and with Kubernetes RBAC controls.

Network security monitoring dashboard — Monitoring denied traffic to identify misconfigurations and potential attacks

Key Takeaways

These policies are essential for zero trust security. Start with default-deny, allow DNS, then open specific paths between services. Remember the additive, two-sided evaluation model, watch your AND/OR selector semantics, and reach for L7 only where it earns its overhead. As a result, even if an attacker compromises one pod, lateral movement is restricted to only the explicitly allowed communication paths.

In conclusion, Kubernetes network policies are an essential building block for securing production clusters. Apply the patterns from this guide in order: default-deny first, audited allow rules, the right enforcement layer, and honest awareness of their limits. Then iterate on your implementation and keep watching denied-traffic signals, so that a misconfiguration or an attack in progress shows up early.
