Kafka Endless NotLeaderForPartitionException for ReplicaFetcherThread

I have a 3-node Kafka cluster running on top of Kubernetes, using the image wurstmeister/kafka:0.10.1.1.

The Zookeeper cluster consists of 3 nodes running version 3.4.8.

    I noticed that the Kafka broker with id 2 is endlessly printing the message:

    [2017-05-08 13:51:16,748] ERROR [ReplicaFetcherThread-0-0], Error for partition [partition_name,5] to broker 0:org.apache.kafka.common.errors.NotLeaderForPartitionException: This server is not the leader for that topic-partition. (kafka.server.ReplicaFetcherThread)

    This message is printed for many partitions every second. Broker 2's logs have grown to more than 10 GB.
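With logs this large, it can help to summarize which partitions and leader brokers the error refers to before digging further. The following is a minimal sketch, assuming the log lines have exactly the format shown above; the function name `count_not_leader_errors` and the regular expression are illustrative and not part of Kafka.

```python
import re
from collections import Counter

# Matches lines shaped like the ReplicaFetcherThread error quoted above
# (assumption: this exact layout; other Kafka versions may format it differently).
NOT_LEADER = re.compile(
    r"Error for partition \[(?P<topic>[^,\]]+),(?P<partition>\d+)\] "
    r"to broker (?P<broker>\d+):\S*NotLeaderForPartitionException"
)


def count_not_leader_errors(lines):
    """Count NotLeaderForPartitionException lines per (topic, partition, broker)."""
    counts = Counter()
    for line in lines:
        match = NOT_LEADER.search(line)
        if match:
            key = (match["topic"], int(match["partition"]), int(match["broker"]))
            counts[key] += 1
    return counts


if __name__ == "__main__":
    import sys

    # Usage (hypothetical file name): python summarize.py < server.log
    for key, n in count_not_leader_errors(sys.stdin).most_common(20):
        print(key, n)
```

If every affected partition points at the same broker, that narrows the search to how broker 2 resolves and reaches that broker.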

    Looking into Zookeeper, I can see that broker 2 is not listed under the znode /brokers/ids.

    Each Zookeeper and Kafka node has its own k8s Deployment and Service (like zk-service-1 -> zk-deployment-1, zk-service-2 -> zk-deployment-2…).
    Zookeeper nodes know each other through their k8s Service names. For example, in the properties file, server 1 has the line: server.1=zk-service-1:2888:3888.

    The same applies to Kafka: broker X has the advertised.host.name property set to kafka-X, where kafka-X is the name of the Service associated with that broker’s pod.
    The brokers’ zookeeper.connect property is zk-service-1,zk-service-2,zk-service-3.
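To make the naming scheme above concrete, here is a small sketch that generates the values described: the server.N lines for Zookeeper and the two Kafka properties. The helper names are hypothetical, and only the properties mentioned in the question are produced; a real broker needs more settings than these.

```python
def zookeeper_server_lines(count, service_prefix="zk-service-"):
    """Build the server.N lines, one per Zookeeper node, using Service names."""
    return [
        f"server.{i}={service_prefix}{i}:2888:3888"
        for i in range(1, count + 1)
    ]


def kafka_broker_settings(broker_id, zk_count=3, zk_prefix="zk-service-"):
    """Build the two broker properties described in the question for broker X."""
    zk_hosts = [f"{zk_prefix}{i}" for i in range(1, zk_count + 1)]
    return {
        "advertised.host.name": f"kafka-{broker_id}",
        "zookeeper.connect": ",".join(zk_hosts),
    }


if __name__ == "__main__":
    print("\n".join(zookeeper_server_lines(3)))
    for key, value in kafka_broker_settings(2).items():
        print(f"{key}={value}")
```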

    I set the hostname of each pod to the name of the Service attached to it.

    I don’t know how to debug this properly or which information would help me understand what the issue is about. Do you have any clue?

One solution collected from the web for “Kafka Endless NotLeaderForPartitionException for ReplicaFetcherThread”

    As I see it, you need your nodes to have consistent hostnames.

    I am pretty sure that if you deploy Zookeeper using the StatefulSet controller, your problem of getting a consistent hostname is solved. You don’t have to do all the hacks of naming pods.

    A Deployment creates pods whose names are not stable, whereas a StatefulSet creates pods with consistent names, which you expose via a headless Service. So basically you are talking directly to the pods.
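As a rough illustration of what "consistent names" means here: StatefulSet pods are named with the StatefulSet name plus an ordinal starting at 0, and a headless Service gives each pod a stable DNS entry. The sketch below derives those hostnames and matching Zookeeper server.N lines. The names `zk`, `zk-headless`, the `default` namespace, and the `cluster.local` domain are assumptions for the example, not values from the question.

```python
def statefulset_pod_hosts(name, replicas, headless_service,
                          namespace="default", cluster_domain="cluster.local"):
    """Return the stable DNS names of the pods of a StatefulSet."""
    return [
        f"{name}-{ordinal}.{headless_service}.{namespace}.svc.{cluster_domain}"
        for ordinal in range(replicas)
    ]


def zookeeper_servers(hosts):
    """Map pod hosts to server.N lines; Zookeeper ids start at 1, ordinals at 0."""
    return [
        f"server.{ordinal + 1}={host}:2888:3888"
        for ordinal, host in enumerate(hosts)
    ]


if __name__ == "__main__":
    hosts = statefulset_pod_hosts("zk", 3, "zk-headless")
    print("\n".join(zookeeper_servers(hosts)))
```

Because these names survive pod restarts, the per-pod Service and hostname workaround from the question is no longer needed.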

    Read more about statefulsets here and basics here.

    Alternatively, for the configuration itself, you can refer to the following zookeeper configurations and kafka configurations.

    The configurations above can be found here.
