Kafka Endless NotLeaderForPartitionException for ReplicaFetcherThread

I have a 3-node Kafka cluster running on Kubernetes, using the image wurstmeister/kafka:0.10.1.1.

The Zookeeper cluster consists of 3 nodes running version 3.4.8.

    I noticed that the Kafka broker with id 2 is endlessly printing the message:

    [2017-05-08 13:51:16,748] ERROR [ReplicaFetcherThread-0-0], Error for partition [partition_name,5] to broker 0:org.apache.kafka.common.errors.NotLeaderForPartitionException: This server is not the leader for that topic-partition. (kafka.server.ReplicaFetcherThread)

    This message is printed for many partitions every second. The logs of broker 2 have grown to more than 10 GB.
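
To see which partitions are affected, the flood of messages can be condensed into a tally. The following is a minimal sketch, assuming every error line follows the format of the single line quoted above; the regular expression and the `summarize` helper are illustrative, not part of Kafka.

```python
import re
from collections import Counter

# Pattern modelled on the one log line quoted in the question; other Kafka
# versions may format this message differently (assumption).
LINE_RE = re.compile(
    r"ERROR \[ReplicaFetcherThread-\d+-\d+\], "
    r"Error for partition \[(?P<topic>[^,\]]+),(?P<partition>\d+)\] "
    r"to broker (?P<broker>\d+):"
    r".*NotLeaderForPartitionException"
)


def summarize(lines):
    """Count NotLeaderForPartition errors per (topic, partition, target broker)."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.search(line)
        if match:
            key = (match["topic"], int(match["partition"]), int(match["broker"]))
            counts[key] += 1
    return counts


sample = [
    "[2017-05-08 13:51:16,748] ERROR [ReplicaFetcherThread-0-0], Error for "
    "partition [partition_name,5] to broker 0:"
    "org.apache.kafka.common.errors.NotLeaderForPartitionException: "
    "This server is not the leader for that topic-partition. "
    "(kafka.server.ReplicaFetcherThread)",
]

for (topic, partition, broker), count in summarize(sample).items():
    print(f"{topic}-{partition} -> broker {broker}: {count} error(s)")
```

Feeding the broker's log file to `summarize` line by line shows whether the errors concentrate on one broker or are spread across all of them.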

    Looking into Zookeeper, I can see that broker 2 is not listed under the znode /brokers/ids.
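
This registration check can be automated. The sketch below assumes the third-party `kazoo` client is installed and that the brokers use ids 0, 1 and 2 (the question only mentions ids 0 and 2); the helper names are hypothetical.

```python
def missing_brokers(expected_ids, registered_children):
    """Return the expected broker ids that are absent from /brokers/ids."""
    registered = {int(child) for child in registered_children}
    return sorted(set(expected_ids) - registered)


def fetch_registered_children(hosts):
    """List the children of /brokers/ids via kazoo (assumed to be installed)."""
    # Imported lazily so that missing_brokers() works without kazoo.
    from kazoo.client import KazooClient

    client = KazooClient(hosts=hosts)
    client.start()
    try:
        return client.get_children("/brokers/ids")
    finally:
        client.stop()


# State described in the question: broker 2 is not registered.
# The ids 0 and 1 being present is an assumption for illustration.
print(missing_brokers([0, 1, 2], ["0", "1"]))
```

Against a live ensemble, the second argument would come from something like `fetch_registered_children("zk-service-1:2181")`, where the client port 2181 is an assumption because the question does not state it.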

    Each Zookeeper and Kafka node has its own k8s Deployment and Service (like zk-service-1 -> zk-deployment-1, zk-service-2 -> zk-deployment-2…).
    Zookeeper nodes know each other through k8s Service names. For example, in the properties file, server 1 has the line: server.1=zk-service-1:2888:3888.

    The same applies to Kafka: broker X has the property advertised.host.name = kafka-X, where kafka-X is the name of the Service associated with that broker’s pod.
    The brokers’ zookeeper.connect property is zk-service-1,zk-service-2,zk-service-3.

    I set the hostname of each pod to the name of the Service attached to it.

    I don’t know how to debug this properly or which information would help me understand what the issue is about. Do you have any clue?

One solution collected from the web for “Kafka Endless NotLeaderForPartitionException for ReplicaFetcherThread”

    As I see it, you need your nodes to have consistent hostnames.

    I am pretty sure that if you deploy Zookeeper using the StatefulSet controller, the problem of keeping hostnames consistent is solved. You don’t have to resort to hacks for naming pods.

    A Deployment creates pods whose names are not stable, whereas a StatefulSet creates pods with consistent names, which you expose through a headless Service. So you are effectively talking to the pods directly.
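
To illustrate the naming scheme: StatefulSet pods are named `<statefulset>-<ordinal>` and, behind a headless Service, are reachable as `<pod>.<service>.<namespace>.svc.<cluster domain>`. The sketch below derives the Zookeeper `server.N` lines from that scheme. The names `zk` and `zk-headless`, the `default` namespace and the `cluster.local` domain are assumptions for illustration; the ports 2888 and 3888 are taken from the question.

```python
def statefulset_pod_hosts(name, headless_service, replicas,
                          namespace="default", cluster_domain="cluster.local"):
    """Stable DNS names of StatefulSet pods behind a headless Service."""
    return [
        f"{name}-{ordinal}.{headless_service}.{namespace}.svc.{cluster_domain}"
        for ordinal in range(replicas)
    ]


def zookeeper_server_lines(hosts, peer_port=2888, election_port=3888):
    """Build server.N lines; ids start at 1 while pod ordinals start at 0."""
    return [
        f"server.{index}={host}:{peer_port}:{election_port}"
        for index, host in enumerate(hosts, start=1)
    ]


# "zk" and "zk-headless" are hypothetical names, not taken from the question.
hosts = statefulset_pod_hosts("zk", "zk-headless", 3)
for line in zookeeper_server_lines(hosts):
    print(line)
```

Because these names survive pod restarts, the same lines can be written into every node's configuration without setting pod hostnames by hand.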

    Read more about statefulsets here and basics here.

    For the configuration itself, you can refer to the following Zookeeper configurations and Kafka configurations.

    Above configurations can be found here.
