
Troubleshooting Network Packet Loss

2015-01-18 20:19

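During recent testing we found that our database middleware program was losing network packets. The test tool was mysqlslap.

Once the concurrency reached a certain level, there was some probability that mysqlslap would hang and never return.

The test command was: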

[root@db_slave1 cwinfocenter]# mysqlslap --concurrency=300,300,300,400,500 --number-of-queries=6000 --iterations=1 --create-schema=chinaweather_infocenter -h172.16.80.71 -P3307 -uroot -p111111 --query=test4.sql

Benchmark
    Average number of seconds to run all queries: 2.613 seconds
    Minimum number of seconds to run all queries: 2.613 seconds
    Maximum number of seconds to run all queries: 2.613 seconds
    Number of clients running queries: 300
    Average number of queries per client: 20

Benchmark
    Average number of seconds to run all queries: 2.677 seconds
    Minimum number of seconds to run all queries: 2.677 seconds
    Maximum number of seconds to run all queries: 2.677 seconds
    Number of clients running queries: 300
    Average number of queries per client: 20

Benchmark
    Average number of seconds to run all queries: 2.689 seconds
    Minimum number of seconds to run all queries: 2.689 seconds
    Maximum number of seconds to run all queries: 2.689 seconds
    Number of clients running queries: 300
    Average number of queries per client: 20

Benchmark
    Average number of seconds to run all queries: 2.906 seconds
    Minimum number of seconds to run all queries: 2.906 seconds
    Maximum number of seconds to run all queries: 2.906 seconds
    Number of clients running queries: 400
    Average number of queries per client: 15

When the concurrency reached 500, mysqlslap never returned.

[root@db_slave1 cwinfocenter]# ps -eLf | grep mysqlslap >/tmp/ps-slap

About 93 threads had not returned. Tracing one of the stuck threads with pstack:

[root@db_slave1 cwinfocenter]# pstack 23085

Thread 1 (process 23085):
#0 0x0000003259e0e54d in read () from /lib64/libpthread.so.0
#1 0x000000000042a002 in vio_read_buff ()
#2 0x000000000041a659 in my_real_read(st_net*, unsigned long*) ()
#3 0x000000000041aa34 in my_net_read ()
#4 0x000000000041498a in cli_safe_read ()
#5 0x0000000000416938 in mysql_real_connect ()
#6 0x0000000000408a0d in slap_connect ()
#7 0x000000000040c5b6 in run_task ()
#8 0x0000003259e07851 in start_thread () from /lib64/libpthread.so.0
#9 0x0000003259ae767d in clone () from /lib64/libc.so.6

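The stack shows mysqlslap blocked in read() inside mysql_real_connect, waiting for the server's reply during connection setup, which means the connection packets were being lost.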
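One way to check whether the server side is really dropping connection attempts is to look at the kernel's listen-queue counters on the middleware host. The sketch below assumes a Linux host where /proc/net/netstat exposes the TcpExt counters ListenOverflows and ListenDrops. The helper name tcpext_counter is made up for illustration and is not part of any tool used above.

```shell
#!/bin/sh
# Hypothetical helper: print one TcpExt counter from /proc/net/netstat-style
# input on stdin. That file holds a header line of counter names followed by
# a line of values, both prefixed with "TcpExt:".
# $1 = counter name, e.g. ListenOverflows
tcpext_counter() {
    awk -v name="$1" '
        $1 == "TcpExt:" && !seen { for (i = 2; i <= NF; i++) col[$i] = i; seen = 1; next }
        $1 == "TcpExt:" && seen  { if (name in col) print $(col[name]); exit }
    '
}

# Only meaningful on Linux; skipped where the file does not exist.
if [ -r /proc/net/netstat ]; then
    echo "ListenOverflows: $(tcpext_counter ListenOverflows < /proc/net/netstat)"
    echo "ListenDrops:     $(tcpext_counter ListenDrops < /proc/net/netstat)"
fi
```

If these counters keep growing while the benchmark runs, that suggests the queue of connections waiting to be accepted is overflowing on the server.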

We changed the operating system configuration on the middleware host, raising the file descriptor limit and the backlog:

ulimit -n 10240

echo 20480 > /proc/sys/net/ipv4/tcp_max_syn_backlog

After retesting, the problem was still there.

Some googling turned up one more parameter that needs to be adjusted:

echo 20480 > /proc/sys/net/core/somaxconn
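Values written into /proc with echo only last until the next reboot. Below is a minimal sketch of the matching persistent entries, assuming the host loads /etc/sysctl.conf at boot and reusing the value 20480 from the commands above. The function name sysctl_lines is hypothetical, and it only prints the lines rather than changing the system.

```shell
#!/bin/sh
# Hypothetical helper: print sysctl.conf entries for the two queue limits.
# $1 = value to use for both settings
sysctl_lines() {
    printf 'net.core.somaxconn = %s\n' "$1"
    printf 'net.ipv4.tcp_max_syn_backlog = %s\n' "$1"
}

# Print the entries. As root they could be appended to /etc/sysctl.conf
# and loaded with "sysctl -p".
sysctl_lines 20480
```

Note also that ulimit -n only affects the current shell and the processes started from it, so the middleware has to be launched from a shell where the higher limit is in effect.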

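The reason, excerpted from the web (the English passages come from the Linux listen(2) man page, with the excerpter's comments in between):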

The behavior of the backlog argument on TCP sockets changed
with Linux 2.2. Now it specifies the queue length for completely
established sockets waiting to be accepted, instead of the number
of incomplete connection requests.

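Note the sentence above: the backlog now refers to sockets that are fully established but have not yet been accepted, not to sockets still in the SYN stage. I usually set it to around 64. So the parameter to watch is probably /proc/sys/net/core/somaxconn rather than tcp_max_syn_backlog; the latter is probably still useful for some firewalls (against half-open SYN attacks).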

The maximum length of the queue for incomplete sockets can be
set using /proc/sys/net/ipv4/tcp_max_syn_backlog. When syncookies
are enabled there is no logical maximum length and this setting is
ignored. See tcp(7) for more information.

If the backlog argument is greater than the value in
/proc/sys/net/core/somaxconn, then it is silently truncated to that
value; the default value in this file is 128. In kernels before
2.4.25, this limit was a hard coded value, SOMAXCONN, with
the value 128.
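The truncation rule in the last paragraph can be sketched as a small calculation. The helper name effective_backlog and the requested backlog of 1024 are made-up example figures; the source does not say what backlog the middleware actually passes to listen().

```shell
#!/bin/sh
# Hypothetical helper: the backlog the kernel ends up using is the smaller of
# the value passed to listen() and the value in /proc/sys/net/core/somaxconn.
# $1 = backlog requested by the application, $2 = somaxconn
effective_backlog() {
    if [ "$1" -gt "$2" ]; then
        echo "$2"
    else
        echo "$1"
    fi
}

# An application asking for 1024 under the default cap of 128,
# then under the raised cap of 20480.
effective_backlog 1024 128
effective_backlog 1024 20480
```

Under the default cap the request is cut down to 128, which is well below the 500 concurrent connections in the test; with the cap raised to 20480 the requested value is used as is.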

After raising somaxconn, the test no longer showed any packet loss.

When reposting, please credit Gao Xiaoxin's blog as the source.
