Kafka Endless NotLeaderForPartitionException for ReplicaFetcherThread

I have a 3-node Kafka cluster running on Kubernetes, using the image wurstmeister/kafka:0.10.1.1.

The Zookeeper cluster consists of 3 nodes running version 3.4.8.

I noticed that the Kafka broker with id 2 is endlessly printing the message:

    [2017-05-08 13:51:16,748] ERROR [ReplicaFetcherThread-0-0], Error for partition [partition_name,5] to broker 0:org.apache.kafka.common.errors.NotLeaderForPartitionException: This server is not the leader for that topic-partition. (kafka.server.ReplicaFetcherThread)

This message is printed for many partitions every second. The broker 2 log has grown to more than 10 GB.

Looking into Zookeeper, I can see that broker 2 is not listed under the znode /brokers/ids.
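With a log this large, it can help to see which partitions and target brokers the errors concentrate on before digging further. The following is a minimal Python sketch, not part of the original setup: the function name `count_not_leader_errors` is hypothetical, and the regular expression assumes every error line has the same layout as the one quoted above.

```python
import re
from collections import Counter

# Assumption: all error lines follow the layout of the line quoted above.
# Captures the topic, the partition number, and the broker the fetch went to.
NOT_LEADER_RE = re.compile(
    r"\[ReplicaFetcherThread-\d+-\d+\], Error for partition "
    r"\[(?P<topic>[^,\]]+),(?P<partition>\d+)\] to broker (?P<broker>\d+):"
    r"\S*NotLeaderForPartitionException"
)


def count_not_leader_errors(lines):
    """Count NotLeaderForPartitionException lines per (topic, partition, broker).

    `lines` can be any iterable of strings, including an open file object,
    so a multi-gigabyte log is read line by line rather than loaded whole.
    """
    counts = Counter()
    for line in lines:
        match = NOT_LEADER_RE.search(line)
        if match:
            key = (match["topic"], int(match["partition"]), int(match["broker"]))
            counts[key] += 1
    return counts
```

Passing an open handle on the broker log to the function and calling `.most_common(20)` on the result lists the partition and broker pairs that fail most often. These can then be compared with the partition leaders recorded in Zookeeper.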

Each Zookeeper and Kafka node has its own k8s Deployment and Service (like zk-service-1 -> zk-deployment-1, zk-service-2 -> zk-deployment-2…).
Zookeeper nodes find each other through the k8s Service names. For example, in the properties file, server 1 has the line: server.1=zk-service-1:2888:3888.

The same applies to Kafka: broker X has the advertised.host.name property set to kafka-X, where kafka-X is the name of the Service associated with that broker’s pod.
The brokers’ zookeeper.connect property is zk-service-1,zk-service-2,zk-service-3.

I set the hostname of each pod to the name of the Service attached to it.

I don’t know how to debug this properly, or which information would help me understand what the issue is about. Do you have any clue?

One solution collected from the web for “Kafka Endless NotLeaderForPartitionException for ReplicaFetcherThread”

As I see it, what you need is for your nodes to have consistent hostnames.

I am fairly sure that deploying Zookeeper with the StatefulSet controller solves the problem of consistent hostnames. You don’t need any of the workarounds for naming pods.

A Deployment creates pods whose names are not stable, whereas a StatefulSet creates pods with consistent names, which you expose through a headless Service. So in effect you are talking to the pods directly.
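To make the naming scheme concrete, here is a minimal Python sketch of how stable StatefulSet pod names could be turned into the Zookeeper and Kafka settings mentioned in the question. Everything specific in it is an assumption for illustration: the helper names, the StatefulSet name `zk`, the headless Service name `zk-headless`, the `default` namespace, the default cluster domain `cluster.local`, and the client port 2181 (the question does not state a port).

```python
def pod_dns_name(statefulset, ordinal, service, namespace="default"):
    """Stable DNS name of one StatefulSet pod behind a headless Service.

    Assumes the default cluster domain, cluster.local.
    """
    return f"{statefulset}-{ordinal}.{service}.{namespace}.svc.cluster.local"


def zookeeper_server_lines(statefulset, service, replicas, namespace="default"):
    """Render the server.N lines of the Zookeeper properties file.

    StatefulSet ordinals start at 0, while the question numbers its
    servers from 1, so ordinal i becomes server.(i + 1).
    """
    return [
        f"server.{i + 1}={pod_dns_name(statefulset, i, service, namespace)}:2888:3888"
        for i in range(replicas)
    ]


def zookeeper_connect(statefulset, service, replicas, namespace="default", port=2181):
    """Render the value of the brokers' zookeeper.connect property.

    The port is an assumption; 2181 is Zookeeper's conventional client port.
    """
    return ",".join(
        f"{pod_dns_name(statefulset, i, service, namespace)}:{port}"
        for i in range(replicas)
    )
```

For example, `zookeeper_server_lines("zk", "zk-headless", 3)` starts with `server.1=zk-0.zk-headless.default.svc.cluster.local:2888:3888`. Because these names survive pod restarts, no per-pod Service or hostname override is needed.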

Read more about StatefulSets here and the basics here.

For the configuration itself, you can refer to the following Zookeeper configurations and Kafka configurations.

The above configurations can be found here.
