CentOS 7 Zookeeper and Kafka Cluster Setup
2020-05-18 22:55
Environment
- CentOS 7.4
- Zookeeper-3.6.1
- Kafka_2.13-2.4.1
- Kafka-manager-2.0.0.2
All of the software in this guide is installed under the /home/javateam directory.
Zookeeper Cluster Setup
- Add the three machines to the hosts file on every node. Run vim /etc/hosts and append:
```
192.168.30.78 node-78
192.168.30.79 node-79
192.168.30.80 node-80
```
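To confirm the entries took effect, a quick resolution check can be run from each machine (hostnames as defined above):
```bash
# Each name should resolve and respond from all three nodes
ping -c 1 node-78
ping -c 1 node-79
ping -c 1 node-80
```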
- Extract the Zookeeper archive:
```bash
tar -zxvf apache-zookeeper-3.6.1-bin.tar.gz
```
Rename the extracted directory:
```bash
mv apache-zookeeper-3.6.1-bin zookeeper
```
- Add the following to /etc/profile, then run source /etc/profile to make it take effect:
```bash
export ZOOKEEPER_HOME=/home/javateam/zookeeper
export PATH=$PATH:$ZOOKEEPER_HOME/bin
```
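A quick, optional check that the variables are visible in the current shell:
```bash
echo $ZOOKEEPER_HOME   # should print /home/javateam/zookeeper
which zkServer.sh      # should resolve inside $ZOOKEEPER_HOME/bin
```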
- Edit the Zookeeper configuration. Go to the $ZOOKEEPER_HOME/conf directory, copy zoo_sample.cfg, and rename the copy to zoo.cfg:
```properties
# Zookeeper heartbeat interval, in ms
tickTime=2000
# Time (in ticks) allowed for followers to connect and sync when a new leader is elected
initLimit=10
# Max heartbeat tolerance between leader and followers: if a follower does not respond
# within syncLimit * tickTime, the leader considers it dead and removes it from the server list
syncLimit=5
# Data directory
dataDir=/home/javateam/zookeeper/data/
# Transaction log directory
dataLogDir=/home/javateam/zookeeper/logs/
# Port for client connections
clientPort=2181
# Cluster members
server.78=node-78:2888:3888
server.79=node-79:2888:3888
server.80=node-80:2888:3888
```
- In the dataDir directory configured above, create a myid file containing this server's ID, for example 78 on node-78. The number in myid must match the digits after server. in zoo.cfg.
Note: the data and log directories referenced in the configuration must be created by hand on each node, as sketched below.
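A minimal sketch of those two manual steps, assuming the paths from zoo.cfg above (change the echoed number to 79 or 80 on the other nodes):
```bash
# Create the data and transaction log directories
mkdir -p /home/javateam/zookeeper/data /home/javateam/zookeeper/logs

# Write this node's ID; it must match its server.N entry in zoo.cfg
echo 78 > /home/javateam/zookeeper/data/myid
```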
- Zookeeper uses three ports:
```
2181: client connections
3888: leader election
2888: intra-cluster communication (the leader listens on this port)
```
Remember to open these ports in the firewall and reload it:
```bash
firewall-cmd --zone=public --add-port=2181/tcp --permanent
firewall-cmd --zone=public --add-port=3888/tcp --permanent
firewall-cmd --zone=public --add-port=2888/tcp --permanent
firewall-cmd --reload
```
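An optional check that the rules are in place:
```bash
firewall-cmd --zone=public --list-ports   # should include 2181/tcp 3888/tcp 2888/tcp
```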
- Start Zookeeper on each of the three machines with zkServer.sh start, then check each node with zkServer.sh status. If one node reports itself as leader and the other two as followers, the cluster is up; see the sketch below.
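A minimal sketch of this step; the exact output varies slightly between versions, but the Mode line is what matters:
```bash
# Run on every node
zkServer.sh start

# Then on every node
zkServer.sh status
# Expected: "Mode: leader" on exactly one node and "Mode: follower" on the other two
```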
Kafka Cluster Setup
- Extract the Kafka archive:
```bash
tar -zxvf kafka_2.13-2.4.1.tgz
```
- Rename the extracted directory:
```bash
mv kafka_2.13-2.4.1 kafka
```
- Add the following to /etc/profile, then run source /etc/profile to make it take effect:
```bash
export KAFKA_HOME=/home/javateam/kafka
export PATH=$PATH:$KAFKA_HOME/bin
```
- JVM-level tuning: edit kafka/bin/kafka-server-start.sh and add the following:
```bash
# Enlarge the heap; the default 1G is too small
export KAFKA_HEAP_OPTS="-Xmx6G -Xms6G"
# Use the G1 garbage collector
export KAFKA_JVM_PERFORMANCE_OPTS="-server -XX:+UseG1GC -XX:MaxGCPauseMillis=20 -XX:InitiatingHeapOccupancyPercent=35 -XX:+ExplicitGCInvokesConcurrent -Djava.awt.headless=true"
# Expose JMX on this port
export JMX_PORT="8999"
```
- OS-level tuning: raise the file descriptor and process limits. Run vim /etc/security/limits.conf and add:
```
* soft nofile 100000
* hard nofile 100000
* soft nproc 65535
* hard nproc 65535
```
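The new limits only apply to new login sessions; a quick way to verify after logging in again:
```bash
ulimit -n   # expect 100000 (open files)
ulimit -u   # expect 65535 (max user processes)
```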
- Edit the Kafka configuration file $KAFKA_HOME/config/server.properties as follows (the values shown are for node-78):
```properties
############################# Server Basics #############################
# Unique id of this broker within the cluster; must be a positive integer.
# Changing the IP without changing broker.id does not affect consumers.
broker.id=78

############################# Socket Server Settings #############################
# Address and port the broker listens on and advertises to clients
listeners=PLAINTEXT://node-78:9092
# Number of threads handling network requests
num.network.threads=3
# Number of threads doing disk I/O; should be larger than the number of disks
num.io.threads=8
# Socket send buffer size (SO_SNDBUF)
socket.send.buffer.bytes=102400
# Socket receive buffer size (SO_RCVBUF)
socket.receive.buffer.bytes=102400
# Maximum size of a request, to protect the broker against OOM;
# message.max.bytes must be smaller than socket.request.max.bytes
socket.request.max.bytes=104857600

############################# Log Basics #############################
# Directories where Kafka stores its data; separate multiple paths with commas
log.dirs=/home/javateam/kafka/logs
# Default number of partitions per topic; overridden by the value given at topic creation time
num.partitions=3
# Default replication factor (replicas per partition) for automatically created topics
default.replication.factor=2
# Segment files are kept for 7 days by default and then cleaned up; this sets the
# number of threads per data directory used to recover and clean up the data
num.recovery.threads.per.data.dir=1

############################# Internal Topic Settings #############################
# The replication factor for the group metadata internal topics "__consumer_offsets" and "__transaction_state".
# For anything other than development testing, a value greater than 1 is recommended to ensure availability, such as 3.
offsets.topic.replication.factor=1
transaction.state.log.replication.factor=1
transaction.state.log.min.isr=1

############################# Log Flush Policy #############################
# Messages are immediately written to the filesystem but by default we only fsync() to sync
# the OS cache lazily. The following configurations control the flush of data to disk.
# There are a few important trade-offs here:
#    1. Durability: Unflushed data may be lost if you are not using replication.
#    2. Latency: Very large flush intervals may lead to latency spikes when the flush does occur as there will be a lot of data to flush.
#    3. Throughput: The flush is generally the most expensive operation, and a small flush interval may lead to excessive seeks.
# The settings below allow one to configure the flush policy to flush data after a period of time or
# every N messages (or both). This can be done globally and overridden on a per-topic basis.

# The number of messages to accept before forcing a flush of data to disk
#log.flush.interval.messages=10000

# The maximum amount of time a message can sit in a log before we force a flush
#log.flush.interval.ms=1000

############################# Log Retention Policy #############################
# How long a message is retained; default is 7 days
log.retention.hours=168
# Total disk space the broker may use for messages; -1 means unlimited
log.retention.bytes=-1
# Largest message the broker will accept; default is about 976KB (1000012 bytes), raised here to 100MB
message.max.bytes=104857600
# Size of each log segment file; default is 1G
log.segment.bytes=1073741824
# How often (in ms) to check whether segments have reached log.segment.bytes or exceeded the retention limits
log.retention.check.interval.ms=300000

############################# Zookeeper #############################
# Zookeeper connection string; consumers find the brokers through Zookeeper
zookeeper.connect=node-78:2181,node-79:2181,node-80:2181
# Timeout in ms for connecting to zookeeper
zookeeper.connection.timeout.ms=6000

############################# Group Coordinator Settings #############################
# The following configuration specifies the time, in milliseconds, that the GroupCoordinator will delay the initial consumer rebalance.
# The rebalance will be further delayed by the value of group.initial.rebalance.delay.ms as new members join the group, up to a maximum of max.poll.interval.ms.
# The default value for this is 3 seconds.
# We override this to 0 here as it makes for a better out-of-the-box experience for development and testing.
# However, in production environments the default value of 3 seconds is more suitable as this will help to avoid unnecessary,
# and potentially expensive, rebalances during application startup.
group.initial.rebalance.delay.ms=0

############################# Broker Settings #############################
# Do not allow out-of-sync replicas to become leader
unclean.leader.election.enable=false
# Disable periodic automatic leader rebalancing
auto.leader.rebalance.enable=false
```
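The file above is written for node-78. A sketch of the only lines that differ on the other two brokers (everything else stays identical):
```properties
# On node-79
broker.id=79
listeners=PLAINTEXT://node-79:9092

# On node-80
broker.id=80
listeners=PLAINTEXT://node-80:9092
```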
- Write a Kafka startup script with vim startup.sh:
```bash
# Start Kafka in daemon mode
kafka-server-start.sh -daemon /home/javateam/kafka/config/server.properties
```
- Write a Kafka shutdown script with vim shutdown.sh:
```bash
# Stop the Kafka service
kafka-server-stop.sh
```
- Start Kafka on each node with:
```bash
sh /home/javateam/kafka/startup.sh
```
Note: replace the path with wherever your own script lives.
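An optional check that the broker process actually came up on each node (jps ships with the JDK):
```bash
jps -l | grep kafka.Kafka   # one line per running broker process
```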
- Once the brokers are up, connect to Zookeeper and list the registered broker ids:
```bash
zkCli.sh -server 127.0.0.1:2181
ls /brokers/ids
```
If all three brokers registered, the output lists [78, 79, 80] and the cluster is up.
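As an optional smoke test (the topic name test is just an example), a message produced on one node should be readable from another:
```bash
# Create a test topic with 3 partitions and 2 replicas
kafka-topics.sh --bootstrap-server node-78:9092 --create --topic test --partitions 3 --replication-factor 2

# Produce a few messages from one terminal...
kafka-console-producer.sh --broker-list node-78:9092 --topic test

# ...and read them back from another node
kafka-console-consumer.sh --bootstrap-server node-79:9092 --topic test --from-beginning
```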
Kafka-manager Setup
- Extract the archive:
```bash
unzip kafka-manager-2.0.0.2.zip
```
- Rename the directory:
```bash
mv kafka-manager-2.0.0.2 kafka-manager
```
- Edit the configuration file kafka-manager/conf/application.conf and point kafka-manager.zkhosts at your own Zookeeper cluster, for example:
```
kafka-manager.zkhosts="node-78:2181,node-79:2181,node-80:2181"
```
- Write a kafka-manager startup script with vim startup.sh:
```bash
nohup /home/javateam/kafka-manager/bin/kafka-manager -Dhttp.port=9000 > /home/javateam/kafka-manager/nohup.out 2>&1 &
```
- Start kafka-manager with sh /home/javateam/kafka-manager/startup.sh, then open port 9000 in a browser. If the kafka-manager UI loads, the setup succeeded. Its day-to-day usage is easy to find online and is not covered further here.