Setting up the network for Kubernetes
I’m reading the Kubernetes “Getting Started from Scratch” Guide and have arrived at the dreaded Network Section, where they state:
Kubernetes imposes the following fundamental requirements on any networking implementation (barring any intentional network segmentation policies):
* all containers can communicate with all other containers without NAT
* all nodes can communicate with all containers (and vice-versa) without NAT
* the IP that a container sees itself as is the same IP that others see it as
My first source of confusion is: how is this different from the “standard” Docker model? How does Docker differ with respect to those 3 Kubernetes requirements?
The article then goes on to summarize how GCE achieves these requirements:
For the Google Compute Engine cluster configuration scripts, we use advanced routing to assign each VM a subnet (default is /24 – 254 IPs). Any traffic bound for that subnet will be routed directly to the VM by the GCE network fabric. This is in addition to the “main” IP address assigned to the VM, which is NAT’ed for outbound internet access. A Linux bridge (called cbr0) is configured to exist on that subnet, and is passed to docker’s --bridge flag.
My question here is: Which requirement(s) from the 3 above does this paragraph address? More importantly, how does it achieve the requirement(s)? I guess I just don’t understand how 1-subnet-per-VM accomplishes: container-container communication, node-container communication, and static IP.
And, as a bonus/stretch concern: Why doesn’t Marathon suffer from the same networking concerns as what Kubernetes is addressing here?
Answer:
Docker’s standard networking configuration picks a container subnet for you out of its chosen defaults. As long as it doesn’t conflict with any interfaces on your host, Docker is okay with it.
Then, Docker inserts an iptables MASQUERADE rule that allows containers to talk to the external world using the host’s default interface.
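This is where Kubernetes’ 3 requirements are violated. Each host picks its container subnet based only on the addresses in use on that host, so different hosts can end up with the same container subnet, and container addresses are not routable between hosts. All container traffic leaving a host therefore has to be NATed using the MASQUERADE rule.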
Consider the following 3-host Docker setup (a little contrived to highlight things):
Let’s say container-B wants to access an HTTP service on port 80 of container-A. You can get Docker to expose container-A’s port 80 on some port of Host 1, say 43210. Then container-B makes its request to 10.1.2.3:43210. The request arrives on container-A’s port 80, but it appears to come from some random port on 10.1.2.4 because of the NAT on the way out of Host 2. This violates both the “all containers can communicate without NAT” requirement and the “container sees itself as the same IP that others see” requirement. Try to access container-A’s service directly from Host 2 and you violate the “nodes can communicate with all containers without NAT” requirement as well.
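To make the address rewriting concrete, here is a minimal Python sketch that models the two NAT steps as plain functions. It is a toy model, not how iptables works internally. The addresses 10.1.2.3, 10.1.2.4 and port 43210 come from the example above; the container addresses (172.17.0.2 on both hosts) and the source ports are assumptions made up for illustration.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Packet:
    src_ip: str
    src_port: int
    dst_ip: str
    dst_port: int


HOST1_IP = "10.1.2.3"
HOST2_IP = "10.1.2.4"
CONTAINER_A_IP = "172.17.0.2"  # assumed address on Host 1's docker0
CONTAINER_B_IP = "172.17.0.2"  # assumed; same address reused on Host 2


def masquerade(pkt, host_ip, nat_port):
    """Source NAT on the way out of a host (the effect of a MASQUERADE rule)."""
    return replace(pkt, src_ip=host_ip, src_port=nat_port)


def publish_port(pkt, host_ip, host_port, container_ip, container_port):
    """Destination NAT for a port published on the host."""
    if (pkt.dst_ip, pkt.dst_port) == (host_ip, host_port):
        return replace(pkt, dst_ip=container_ip, dst_port=container_port)
    return pkt


# container-B sends a request to the port published on Host 1.
sent = Packet(CONTAINER_B_IP, 51000, HOST1_IP, 43210)
# Host 2 rewrites the source on the way out (61234 is an arbitrary port).
on_wire = masquerade(sent, HOST2_IP, nat_port=61234)
# Host 1 rewrites the destination to container-A's port 80.
received = publish_port(on_wire, HOST1_IP, 43210, CONTAINER_A_IP, 80)

print("sent by container-B:    ", sent)
print("received by container-A:", received)
```

In this model container-A sees the request coming from 10.1.2.4, not from container-B’s own address, and container-B had to address Host 1 instead of container-A. Both are exactly the kinds of translation the Kubernetes requirements rule out.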
Now if either of those containers wants to talk to Host 3, it’s SOL (just a general argument for being careful with the auto-assigned docker0 subnets).
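One way to spot this kind of collision ahead of time is to check host addresses against the docker0 subnet. The sketch below uses Python’s standard ipaddress module. The setup diagram is not reproduced here, so the docker0 subnet (172.17.0.0/16, Docker’s usual default) and the Host 3 address are assumed values chosen to illustrate a conflict.

```python
import ipaddress

# Assumed: the subnet Docker picked for docker0 on Host 1 and Host 2.
DOCKER0_SUBNET = ipaddress.ip_network("172.17.0.0/16")

HOSTS = {
    "Host 1": ipaddress.ip_address("10.1.2.3"),
    "Host 2": ipaddress.ip_address("10.1.2.4"),
    "Host 3": ipaddress.ip_address("172.17.42.2"),  # hypothetical address
}


def conflicting_hosts(subnet, hosts):
    """Return the names of hosts whose address falls inside the subnet."""
    return [name for name, addr in hosts.items() if addr in subnet]


conflicts = conflicting_hosts(DOCKER0_SUBNET, HOSTS)
print("hosts shadowed by the docker0 subnet:", conflicts)
```

With these assumed values only Host 3 is reported. A container would treat that address as local to its own bridge network, so the traffic would never leave the host it runs on.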
The Kubernetes approach on GCE/AWS/Flannel/… is to assign each host VM a subnet carved out of a flat private network. No subnet overlaps with a VM address or with another subnet, so every container address is unique and routable across the cluster. This lets containers and VMs communicate without NAT.
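As a rough sketch of what “carving” per-host subnets means, the Python snippet below splits a flat private range into /24 blocks and hands one to each host. The range 10.244.0.0/16 and the host names are assumptions for illustration; a real cluster gets these from its cloud or overlay network configuration.

```python
import ipaddress
from itertools import combinations

FLAT_NETWORK = ipaddress.ip_network("10.244.0.0/16")  # assumed private range
HOST_NAMES = ["host-1", "host-2", "host-3"]  # hypothetical host names


def assign_subnets(network, hosts, prefix=24):
    """Give each host its own block of the given prefix length, in order."""
    blocks = network.subnets(new_prefix=prefix)
    return dict(zip(hosts, blocks))


assignments = assign_subnets(FLAT_NETWORK, HOST_NAMES)

for host, subnet in assignments.items():
    # A /24 holds 256 addresses; network and broadcast are not usable.
    usable = subnet.num_addresses - 2
    print(f"{host}: {subnet} ({usable} usable addresses)")

# No two per-host subnets may overlap, or container IPs would not be unique.
no_overlap = not any(
    a.overlaps(b) for a, b in combinations(assignments.values(), 2)
)
print("subnets are disjoint:", no_overlap)
```

Because every container address belongs to exactly one host’s block, the network only needs one route per host subnet, and no address has to be rewritten along the way.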