kafka.common.ConsumerRebalanceFailedException
2016-01-25 16:49
I recently started using Kafka and ran into a number of exceptions while sending and receiving data. One of them was particularly puzzling:
Exception in thread "main" kafka.common.ConsumerRebalanceFailedException: groupB_ip-10-38-19-230-1414174925481-97fa3f2a can't rebalance after 4 retries
    at kafka.consumer.ZookeeperConsumerConnector$ZKRebalancerListener.syncedRebalance(ZookeeperConsumerConnector.scala:432)
    at kafka.consumer.ZookeeperConsumerConnector.kafka$consumer$ZookeeperConsumerConnector$$reinitializeConsumer(ZookeeperConsumerConnector.scala:722)
    at kafka.consumer.ZookeeperConsumerConnector.consume(ZookeeperConsumerConnector.scala:212)
    at kafka.javaapi.consumer.Zookeeper……
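Debugging showed that, on the consumer side, execution hung at this line:
Map<String, List<KafkaStream<byte[], byte[]>>> consumerMap = this.consumer.createMessageStreams(topicCountMap);
and then the exception above was thrown. Searching online, I found suggestions to increase the consumer's zookeeper.sync.time.ms property. I tried that, but the problem persisted.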
Eventually I found a more reliable fix in the Kafka FAQ:
https://cwiki.apache.org/confluence/display/KAFKA/FAQ#FAQ-Myconsumerseemstohavestopped,why?
Quoting the FAQ:
consumer rebalancing fails (you will see ConsumerRebalanceFailedException): This is due to conflicts when two consumers are trying to own the same topic partition. The log will show you what caused the conflict (search for "conflict in ").
If your consumer subscribes to many topics and your ZK server is busy, this could be caused by consumers not having enough time to see a consistent view of all consumers in the same group. If this is the case, try Increasing rebalance.max.retries and rebalance.backoff.ms.
Another reason could be that one of the consumers is hard killed. Other consumers during rebalancing won't realize that consumer is gone after zookeeper.session.timeout.ms time. In the case, make sure that rebalance.max.retries * rebalance.backoff.ms > zookeeper.session.timeout.ms.
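I then tried the fix described in the quote (the part set in bold in the original post) and set two properties on the consumer side:
props.put("rebalance.max.retries", "5");
props.put("rebalance.backoff.ms", "1200");
I also made sure that 5 * 1200 = 6000 is greater than the value of zookeeper.session.timeout.ms (5000 in my case). After restarting the producer and the consumer, the problem was indeed resolved.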
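The arithmetic above can also be checked in code before the consumer starts. The following is only a sketch: RebalanceConfigCheck and its two methods are hypothetical helpers of my own, not part of Kafka, and they assume all three properties are set explicitly as strings on the Properties object.

```java
import java.util.Properties;

// Hypothetical helper, not part of Kafka. It checks the FAQ's rule of thumb:
// rebalance.max.retries * rebalance.backoff.ms > zookeeper.session.timeout.ms
// Assumes all three properties are set explicitly (a missing one throws NumberFormatException).
class RebalanceConfigCheck {

    // Total time the consumer keeps retrying a rebalance, in milliseconds.
    static long rebalanceWindowMs(Properties props) {
        long retries = Long.parseLong(props.getProperty("rebalance.max.retries"));
        long backoffMs = Long.parseLong(props.getProperty("rebalance.backoff.ms"));
        return retries * backoffMs;
    }

    // True when the retry window is longer than the ZooKeeper session timeout.
    static boolean outlastsSessionTimeout(Properties props) {
        long sessionTimeoutMs = Long.parseLong(props.getProperty("zookeeper.session.timeout.ms"));
        return rebalanceWindowMs(props) > sessionTimeoutMs;
    }

    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("rebalance.max.retries", "5");
        props.put("rebalance.backoff.ms", "1200");
        props.put("zookeeper.session.timeout.ms", "5000");
        // 5 * 1200 = 6000, which is greater than 5000
        System.out.println(rebalanceWindowMs(props) + " ms window, ok=" + outlastsSessionTimeout(props));
    }
}
```

With the values from this post the check passes. If the session timeout were raised to 6000, the same settings would no longer satisfy the inequality, and one of the two rebalance properties would need to go up as well.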
Note: the producer's metadata.broker.list property should preferably list more than one broker, which also means you need to handle load balancing.
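As a rough illustration of the note above, metadata.broker.list takes a comma-separated list of host:port pairs. The ProducerBrokerList class and the broker addresses below are made up for the example. Substitute your own brokers.

```java
import java.util.Properties;

// Hypothetical helper, not part of Kafka: builds producer properties
// with several brokers in metadata.broker.list.
class ProducerBrokerList {

    // Joins the given host:port addresses into one comma-separated value.
    static Properties producerProps(String... brokers) {
        Properties props = new Properties();
        props.put("metadata.broker.list", String.join(",", brokers));
        return props;
    }

    public static void main(String[] args) {
        // broker1 and broker2 are placeholder host names.
        Properties props = producerProps("broker1:9092", "broker2:9092");
        System.out.println(props.getProperty("metadata.broker.list"));
    }
}
```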