
UNIX Network Programming 6: TCP/UDP Send Limits and Buffer Behavior as Seen Through tcpdump

2015-08-27 20:43
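This note addresses three questions: how the MTU relates to the amount of data TCP/UDP can send in one call; when read/write and send/recv return and what values they return; and how large the kernel send and receive buffers are. It does not resolve the relationship between the advertised window (win) and the packet length (length), and it does not look at how the kernel implements the layers below TCP.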

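Some conclusions:
First, some terminology. The user-space buffer is simply the char array in your program that holds the data to send or that receives incoming data. The kernel-space TCP send and receive buffers have default sizes, grow automatically as needed, and can also be set manually with setsockopt(). Data that the application sends is first copied from user space into the kernel send buffer.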
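MTU and the amount of data TCP/UDP can send in one call:
1) The maximum transmission unit (MTU) differs by link. For example, the lo loopback interface has an MTU of 65536 and Ethernet has an MTU of 1500. An MTU of 1500 means that a packet at the IP layer and above is at most 1500 bytes; after subtracting the 20-byte IP header and the 20-byte TCP header (without options), 1460 bytes remain for data.
2) UDP is the simpler case. In both the loopback and the remote test, the largest payload a single sendto() could send was 65507 bytes. Beyond that, sendto() returns -1 and errno is EMSGSIZE ("Message too long"). tcpdump also reported a UDP length of 65507. If the user-space buffer passed to recvfrom() is large enough, the receiver gets the whole datagram in one call; if it is too small, only as many bytes as fit are returned and the rest of the datagram is discarded. So when using UDP, keep each send within 65507 bytes and make sure the receiver's user-space buffer is large enough. The limit comes from the 16-bit IP total-length field (65535 - 20-byte IP header - 8-byte UDP header = 65507), not from the MTU; a datagram larger than the MTU is fragmented and reassembled by the IP layer.
3) TCP is more complicated. Over an Ethernet link with MTU = 1500, the largest segment tcpdump captured carried 1460 bytes (typically a run of 1460-byte segments, with only the last one smaller). When the sender sends, say, 1024*1024 bytes, a single send() or write() call returns the full count. On the receiving side, recv() or read() may return everything at once or over several calls, depending on the amount of data and the size of the user-space buffer; typically several consecutive segments are coalesced and returned together. In the loopback test (MTU = 65536), segments were usually 32741 bytes and some were as large as 53056 bytes. So for TCP with MTU = 1500: if the user data is at most 1460 bytes, TCP can pass it down as a single segment; once it exceeds MTU - 40, TCP has to split it into multiple segments. The send() or write() call itself has essentially no size limit (this still needs to be verified against the kernel source; at the very least it cannot exceed INT_MAX). You can send several MB in one call and send() or write() still returns correctly.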
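The arithmetic behind these numbers can be sketched in a few lines of Python. This is only an illustration: the constant and function names are made up for this example, and the header sizes assume IPv4 and TCP headers without options.

```python
IP_HEADER = 20            # IPv4 header without options (assumption)
TCP_HEADER = 20           # TCP header without options (assumption)
UDP_HEADER = 8
IP_MAX_TOTAL_LEN = 65535  # largest value of the 16-bit IP total-length field


def max_udp_payload():
    """Largest UDP payload that fits in one IPv4 datagram."""
    return IP_MAX_TOTAL_LEN - IP_HEADER - UDP_HEADER


def tcp_mss(mtu):
    """Largest TCP payload per segment for a given MTU (no options)."""
    return mtu - IP_HEADER - TCP_HEADER


def split_into_segments(total_bytes, mss):
    """Return (number of segments, size of the last segment)."""
    full, rest = divmod(total_bytes, mss)
    if rest:
        return full + 1, rest
    return full, (mss if full else 0)


if __name__ == "__main__":
    print(max_udp_payload())                     # 65507
    print(tcp_mss(1500))                         # 1460
    print(split_into_segments(64 * 1024, 1460))  # (45, 1296)
```

The last result agrees with the 64 KB remote test in this article, where the final segment captured by tcpdump has length 1296.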
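When read/write and send/recv return, and with what values:
Some of this was covered above. write() and send() return the number of bytes sent as soon as the user-space data has been copied into the kernel's TCP send buffer; no blocking on send was observed so far in these tests (in general, a blocking send waits when the send buffer is full). On a blocking socket, read() and recv() block when no data is available. If the sender sends N bytes, a single read()/recv() on the receiver will not necessarily return all N bytes, because of buffering and because the data is segmented and reassembled in transit, but the bytes always arrive in order. read()/recv() returns when the socket is non-blocking, when all available data has been consumed, or when the user-space buffer has been filled once. Even when the user-space buffer is large enough, a call may return after receiving only part of the data, so always check the return value and call read()/recv() in a while loop until everything has been read. In addition, the write() function maps to the write() system call, while send() and sendto() both map to the sendto() system call; this can be confirmed with strace or by reading the socket source code.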
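The "read in a while loop" advice can be sketched as follows. This is a minimal illustration, not the code used in the experiments: recv_exact and demo are hypothetical names, and socket.socketpair() is used as a stand-in for a TCP connection so that the example runs on its own.

```python
import socket
import threading


def recv_exact(sock, nbytes):
    """Call recv() repeatedly until nbytes have arrived or the peer closes."""
    chunks = []
    remaining = nbytes
    while remaining > 0:
        chunk = sock.recv(remaining)
        if not chunk:  # recv() returned 0 bytes: the peer closed the connection
            break
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)


def demo(size=1024 * 1024):
    """Send `size` bytes over a local socket pair and read them all back."""
    a, b = socket.socketpair()
    payload = b"x" * size
    # Send from a second thread so the sender can block while we read.
    sender = threading.Thread(target=a.sendall, args=(payload,))
    sender.start()
    data = recv_exact(b, size)
    sender.join()
    a.close()
    b.close()
    return len(data)


if __name__ == "__main__":
    print(demo())
```

Each individual recv() may return fewer bytes than requested; only the loop guarantees that the full amount is collected.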
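Sizes of the kernel TCP send and receive buffers:
See /proc/sys/net/ipv4/tcp_rmem and tcp_wmem. Both the receive and the send buffer have default sizes; typical defaults are 87380 bytes for the receive buffer and 16384 bytes for the send buffer. Both can be changed manually with setsockopt(), and both also grow automatically. When a large amount of data is sent, the sender's send buffer (SO_SNDBUF) grows, but it does not have to become larger than the number of bytes the application is sending.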
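As a rough sketch, the buffer sizes of a socket can be queried and changed like this. The helper names buffer_sizes and set_send_buffer are made up for this example, and the values printed depend on the system.

```python
import socket


def buffer_sizes(sock):
    """Return (receive buffer size, send buffer size) as reported by the kernel."""
    rcv = sock.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF)
    snd = sock.getsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF)
    return rcv, snd


def set_send_buffer(sock, nbytes):
    """Request a send buffer size and return what the kernel reports afterwards."""
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF, nbytes)
    return sock.getsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF)


if __name__ == "__main__":
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    print("default (rcv, snd):", buffer_sizes(s))
    print("after requesting 65536:", set_send_buffer(s, 65536))
    s.close()
```

The value read back need not equal the value requested. On Linux, socket(7) documents that the kernel doubles the requested size to leave room for bookkeeping overhead, and setting the size explicitly replaces the automatic growth described above with a fixed value.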

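Experiments:
Introduction to tcpdump: http://blog.csdn.net/kobejayandy/article/details/17208137
All you need to know: -i selects the network interface; -nn shows hosts and ports numerically instead of by name; -A prints packets in ASCII so that their contents can be inspected; the final argument, in single quotes, is a filter on ports and IPs, for example 'port 8888' captures only packets on port 8888.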

I. TCP
Loopback TCP test: the sender sends 1024*1024 + 1 = 1048577 bytes and the receiver's user-space buffer is 1024*1024 bytes:
[ec2-user@ip-172-31-11-211 cplus]$ sudo tcpdump -i lo
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode

listening on lo, link-type EN10MB (Ethernet), capture size 65535 bytes

Three-way handshake:

07:32:47.786242 IP ip-172-31-11-211.ap-northeast-1.compute.internal.35318 > ip-172-31-11-211.ap-northeast-1.compute.internal.ddi-tcp-1: Flags [S], seq 517704113, win 65495, options [mss 65495,sackOK,TS val 4246348730 ecr 0,nop,wscale 6], length 0

07:32:47.786253 IP ip-172-31-11-211.ap-northeast-1.compute.internal.ddi-tcp-1 > ip-172-31-11-211.ap-northeast-1.compute.internal.35318: Flags [S.], seq 3437879959, ack 517704114, win 65483, options [mss 65495,sackOK,TS val 4246348730 ecr 4246348730,nop,wscale 6], length 0

07:32:47.786262 IP ip-172-31-11-211.ap-northeast-1.compute.internal.35318 > ip-172-31-11-211.ap-northeast-1.compute.internal.ddi-tcp-1: Flags [.], ack 1, win 1024, options [nop,nop,TS val 4246348730 ecr 4246348730], length 0

07:32:47.787397 IP ip-172-31-11-211.ap-northeast-1.compute.internal.35318 > ip-172-31-11-211.ap-northeast-1.compute.internal.ddi-tcp-1: Flags [.], seq 1:32742, ack 1, win 1024, options [nop,nop,TS val 4246348731 ecr 4246348730], length 32741

07:32:47.787403 IP ip-172-31-11-211.ap-northeast-1.compute.internal.ddi-tcp-1 > ip-172-31-11-211.ap-northeast-1.compute.internal.35318: Flags [.], ack 32742, win 1658, options [nop,nop,TS val 4246348732 ecr 4246348731], length 0

07:32:47.787598 IP ip-172-31-11-211.ap-northeast-1.compute.internal.35318 > ip-172-31-11-211.ap-northeast-1.compute.internal.ddi-tcp-1: Flags [P.], seq 32742:65483, ack 1, win 1024, options [nop,nop,TS val 4246348732 ecr 4246348730], length 32741

07:32:47.787816 IP ip-172-31-11-211.ap-northeast-1.compute.internal.35318 > ip-172-31-11-211.ap-northeast-1.compute.internal.ddi-tcp-1: Flags [.], seq 65483:98224, ack 1, win 1024, options [nop,nop,TS val 4246348732 ecr 4246348732], length 32741

07:32:47.787823 IP ip-172-31-11-211.ap-northeast-1.compute.internal.35318 > ip-172-31-11-211.ap-northeast-1.compute.internal.ddi-tcp-1: Flags [P.], seq 98224:130965, ack 1, win 1024, options [nop,nop,TS val 4246348732 ecr 4246348732], length 32741

07:32:47.788262 IP ip-172-31-11-211.ap-northeast-1.compute.internal.ddi-tcp-1 > ip-172-31-11-211.ap-northeast-1.compute.internal.35318: Flags [.], ack 130965, win 2047, options [nop,nop,TS val 4246348732 ecr 4246348732], length 0
Bytes 1:130964 have been received and are passed up to user space: the first recv() or read() returns 130964 bytes.

07:32:47.788329 IP ip-172-31-11-211.ap-northeast-1.compute.internal.35318 > ip-172-31-11-211.ap-northeast-1.compute.internal.ddi-tcp-1: Flags [.], seq 130965:163706, ack 1, win 1024, options [nop,nop,TS val 4246348732 ecr 4246348732], length 32741

07:32:47.788337 IP ip-172-31-11-211.ap-northeast-1.compute.internal.35318 > ip-172-31-11-211.ap-northeast-1.compute.internal.ddi-tcp-1: Flags [P.], seq 163706:196447, ack 1, win 1024, options [nop,nop,TS val 4246348732 ecr 4246348732], length 32741

07:32:47.788345 IP ip-172-31-11-211.ap-northeast-1.compute.internal.35318 > ip-172-31-11-211.ap-northeast-1.compute.internal.ddi-tcp-1: Flags [.], seq 196447:229188, ack 1, win 1024, options [nop,nop,TS val 4246348732 ecr 4246348732], length 32741

07:32:47.788351 IP ip-172-31-11-211.ap-northeast-1.compute.internal.35318 > ip-172-31-11-211.ap-northeast-1.compute.internal.ddi-tcp-1: Flags [P.], seq 229188:261929, ack 1, win 1024, options [nop,nop,TS val 4246348732 ecr 4246348732], length 32741

07:32:47.788486 IP ip-172-31-11-211.ap-northeast-1.compute.internal.ddi-tcp-1 > ip-172-31-11-211.ap-northeast-1.compute.internal.35318: Flags [.], ack 261929, win 2047, options [nop,nop,TS val 4246348733 ecr 4246348732], length 0
Bytes 130965:261928 have been received: the second recv() or read() returns 130964 bytes.

07:32:47.788571 IP ip-172-31-11-211.ap-northeast-1.compute.internal.35318 > ip-172-31-11-211.ap-northeast-1.compute.internal.ddi-tcp-1: Flags [.], seq 261929:294670, ack 1, win 1024, options [nop,nop,TS val 4246348733 ecr 4246348733], length 32741

07:32:47.788581 IP ip-172-31-11-211.ap-northeast-1.compute.internal.35318 > ip-172-31-11-211.ap-northeast-1.compute.internal.ddi-tcp-1: Flags [P.], seq 294670:327411, ack 1, win 1024, options [nop,nop,TS val 4246348733 ecr 4246348733], length 32741

07:32:47.788591 IP ip-172-31-11-211.ap-northeast-1.compute.internal.35318 > ip-172-31-11-211.ap-northeast-1.compute.internal.ddi-tcp-1: Flags [.], seq 327411:360152, ack 1, win 1024, options [nop,nop,TS val 4246348733 ecr 4246348733], length 32741

07:32:47.788599 IP ip-172-31-11-211.ap-northeast-1.compute.internal.35318 > ip-172-31-11-211.ap-northeast-1.compute.internal.ddi-tcp-1: Flags [P.], seq 360152:392893, ack 1, win 1024, options [nop,nop,TS val 4246348733 ecr 4246348733], length 32741

07:32:47.788599 IP ip-172-31-11-211.ap-northeast-1.compute.internal.ddi-tcp-1 > ip-172-31-11-211.ap-northeast-1.compute.internal.35318: Flags [.], ack 294670, win 2682, options [nop,nop,TS val 4246348733 ecr 4246348733], length 0

07:32:47.788620 IP ip-172-31-11-211.ap-northeast-1.compute.internal.35318 > ip-172-31-11-211.ap-northeast-1.compute.internal.ddi-tcp-1: Flags [.], seq 392893:425634, ack 1, win 1024, options [nop,nop,TS val 4246348733 ecr 4246348733], length 32741

07:32:47.788623 IP ip-172-31-11-211.ap-northeast-1.compute.internal.35318 > ip-172-31-11-211.ap-northeast-1.compute.internal.ddi-tcp-1: Flags [P.], seq 425634:458375, ack 1, win 1024, options [nop,nop,TS val 4246348733 ecr 4246348733], length 32741

07:32:47.788768 IP ip-172-31-11-211.ap-northeast-1.compute.internal.ddi-tcp-1 > ip-172-31-11-211.ap-northeast-1.compute.internal.35318: Flags [.], ack 458375, win 3072, options [nop,nop,TS val 4246348733 ecr 4246348733], length 0
Bytes 261929:458374 have been received: the third recv() or read() returns 196446 bytes.

07:32:47.788822 IP ip-172-31-11-211.ap-northeast-1.compute.internal.35318 > ip-172-31-11-211.ap-northeast-1.compute.internal.ddi-tcp-1: Flags [.], seq 458375:491116, ack 1, win 1024, options [nop,nop,TS val 4246348733 ecr 4246348733], length 32741

07:32:47.788826 IP ip-172-31-11-211.ap-northeast-1.compute.internal.35318 > ip-172-31-11-211.ap-northeast-1.compute.internal.ddi-tcp-1: Flags [P.], seq 491116:523857, ack 1, win 1024, options [nop,nop,TS val 4246348733 ecr 4246348733], length 32741

07:32:47.788828 IP ip-172-31-11-211.ap-northeast-1.compute.internal.35318 > ip-172-31-11-211.ap-northeast-1.compute.internal.ddi-tcp-1: Flags [.], seq 523857:556598, ack 1, win 1024, options [nop,nop,TS val 4246348733 ecr 4246348733], length 32741

07:32:47.788831 IP ip-172-31-11-211.ap-northeast-1.compute.internal.35318 > ip-172-31-11-211.ap-northeast-1.compute.internal.ddi-tcp-1: Flags [P.], seq 556598:589339, ack 1, win 1024, options [nop,nop,TS val 4246348733 ecr 4246348733], length 32741

07:32:47.788834 IP ip-172-31-11-211.ap-northeast-1.compute.internal.35318 > ip-172-31-11-211.ap-northeast-1.compute.internal.ddi-tcp-1: Flags [.], seq 589339:622080, ack 1, win 1024, options [nop,nop,TS val 4246348733 ecr 4246348733], length 32741

07:32:47.788895 IP ip-172-31-11-211.ap-northeast-1.compute.internal.ddi-tcp-1 > ip-172-31-11-211.ap-northeast-1.compute.internal.35318: Flags [.], ack 622080, win 3072, options [nop,nop,TS val 4246348733 ecr 4246348733], length 0

07:32:47.788962 IP ip-172-31-11-211.ap-northeast-1.compute.internal.35318 > ip-172-31-11-211.ap-northeast-1.compute.internal.ddi-tcp-1: Flags [P.], seq 622080:675136, ack 1, win 1024, options [nop,nop,TS val 4246348733 ecr 4246348733], length 53056

07:32:47.788970 IP ip-172-31-11-211.ap-northeast-1.compute.internal.35318 > ip-172-31-11-211.ap-northeast-1.compute.internal.ddi-tcp-1: Flags [P.], seq 675136:728192, ack 1, win 1024, options [nop,nop,TS val 4246348733 ecr 4246348733], length 53056

07:32:47.788976 IP ip-172-31-11-211.ap-northeast-1.compute.internal.35318 > ip-172-31-11-211.ap-northeast-1.compute.internal.ddi-tcp-1: Flags [.], seq 728192:781248, ack 1, win 1024, options [nop,nop,TS val 4246348733 ecr 4246348733], length 53056

07:32:47.789026 IP ip-172-31-11-211.ap-northeast-1.compute.internal.ddi-tcp-1 > ip-172-31-11-211.ap-northeast-1.compute.internal.35318: Flags [.], ack 781248, win 3072, options [nop,nop,TS val 4246348733 ecr 4246348733], length 0

07:32:47.789090 IP ip-172-31-11-211.ap-northeast-1.compute.internal.35318 > ip-172-31-11-211.ap-northeast-1.compute.internal.ddi-tcp-1: Flags [P.], seq 781248:834304, ack 1, win 1024, options [nop,nop,TS val 4246348733 ecr 4246348733], length 53056

07:32:47.789098 IP ip-172-31-11-211.ap-northeast-1.compute.internal.35318 > ip-172-31-11-211.ap-northeast-1.compute.internal.ddi-tcp-1: Flags [.], seq 834304:887360, ack 1, win 1024, options [nop,nop,TS val 4246348733 ecr 4246348733], length 53056

07:32:47.789104 IP ip-172-31-11-211.ap-northeast-1.compute.internal.35318 > ip-172-31-11-211.ap-northeast-1.compute.internal.ddi-tcp-1: Flags [P.], seq 887360:940416, ack 1, win 1024, options [nop,nop,TS val 4246348733 ecr 4246348733], length 53056

07:32:47.789166 IP ip-172-31-11-211.ap-northeast-1.compute.internal.ddi-tcp-1 > ip-172-31-11-211.ap-northeast-1.compute.internal.35318: Flags [.], ack 940416, win 3072, options [nop,nop,TS val 4246348733 ecr 4246348733], length 0

07:32:47.789226 IP ip-172-31-11-211.ap-northeast-1.compute.internal.35318 > ip-172-31-11-211.ap-northeast-1.compute.internal.ddi-tcp-1: Flags [.], seq 940416:993472, ack 1, win 1024, options [nop,nop,TS val 4246348733 ecr 4246348733], length 53056

07:32:47.789233 IP ip-172-31-11-211.ap-northeast-1.compute.internal.35318 > ip-172-31-11-211.ap-northeast-1.compute.internal.ddi-tcp-1: Flags [P.], seq 993472:1046528, ack 1, win 1024, options [nop,nop,TS val 4246348733 ecr 4246348733], length 53056

07:32:47.789286 IP ip-172-31-11-211.ap-northeast-1.compute.internal.ddi-tcp-1 > ip-172-31-11-211.ap-northeast-1.compute.internal.35318: Flags [.], ack 1046528, win 3072, options [nop,nop,TS val 4246348733 ecr 4246348733], length 0

07:32:47.789312 IP ip-172-31-11-211.ap-northeast-1.compute.internal.35318 > ip-172-31-11-211.ap-northeast-1.compute.internal.ddi-tcp-1: Flags [P.], seq 1046528:1048578, ack 1, win 1024, options [nop,nop,TS val 4246348733 ecr 4246348733], length 2050

07:32:47.789337 IP ip-172-31-11-211.ap-northeast-1.compute.internal.ddi-tcp-1 > ip-172-31-11-211.ap-northeast-1.compute.internal.35318: Flags [.], ack 1048578, win 3072, options [nop,nop,TS val 4246348733 ecr 4246348733], length 0

07:33:14.617671 IP ip-172-31-11-211.ap-northeast-1.compute.internal.35318 > ip-172-31-11-211.ap-northeast-1.compute.internal.ddi-tcp-1: Flags [F.], seq 1048578, ack 1, win 1024, options [nop,nop,TS val 4246375562 ecr 4246348733], length 0

07:33:14.617699 IP ip-172-31-11-211.ap-northeast-1.compute.internal.ddi-tcp-1 > ip-172-31-11-211.ap-northeast-1.compute.internal.35318: Flags [F.], seq 1, ack 1048579, win 4093, options [nop,nop,TS val 4246375562 ecr 4246375562], length 0

07:33:14.617711 IP ip-172-31-11-211.ap-northeast-1.compute.internal.35318 > ip-172-31-11-211.ap-northeast-1.compute.internal.ddi-tcp-1: Flags [.], ack 2, win 1024, options [nop,nop,TS val 4246375562 ecr 4246375562], length 0

recv() return values on the server (first run):
[ec2-user@ip-172-31-11-211 cplus]$ ./server

accept client 172.31.11.211: 62857
Receive client message: 130964

Receive client message: 130964

Receive client message: 130964

Receive client message: 130964

Receive client message: 98223

Receive client message: 106112

Receive client message: 106112

Receive client message: 106112

Receive client message: 106112

Receive client message: 2050
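Second run. The port printed by the server, 63113 (0xF689), is 35318 (0x89F6) with its two bytes swapped, which suggests the server prints sin_port without ntohs(); this is the connection with client port 35318 shown in the tcpdump trace above. The running totals in parentheses are each one less than an ACK number in that trace: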
accept client 172.31.11.211: 63113
Receive client message: 130964 (130964)

Receive client message: 130964 (261928)

Receive client message: 196446 (458374)

Receive client message: 163705 (622079)

Receive client message: 159168 (781247)

Receive client message: 159168 (940415)

Receive client message: 106112 (1046527)

Receive client message: 2050 (1048577)
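The link between the recv() return values of the second run and the ACK numbers in the loopback trace can be checked with a short script. The two lists are copied from the logs above; the function names are made up for this example. tcpdump prints relative sequence numbers, so an ACK equals the number of bytes received so far plus one.

```python
from itertools import accumulate

# Sizes returned by recv() in the second run, copied from the server log.
recv_sizes = [130964, 130964, 196446, 163705, 159168, 159168, 106112, 2050]

# ACK numbers sent by the receiver in the tcpdump trace (relative values).
ack_numbers = [32742, 130965, 261929, 294670, 458375, 622080,
               781248, 940416, 1046528, 1048578]


def running_totals(sizes):
    """Cumulative number of bytes delivered to the application."""
    return list(accumulate(sizes))


def next_expected_bytes(sizes):
    """An ACK carries the next byte expected: bytes received so far + 1."""
    return [total + 1 for total in running_totals(sizes)]


if __name__ == "__main__":
    for total, ack in zip(running_totals(recv_sizes),
                          next_expected_bytes(recv_sizes)):
        print(total, ack, ack in ack_numbers)
```

Every running total plus one appears among the ACK numbers, and the final total is 1048577, the number of bytes the client reported sending.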

[ec2-user@ip-172-31-11-211 cplus]$ strace -t -o client.strace ./client

Connected! You can send message.

send size: 1048577 // 1024*1024+1 bytes were written to the kernel TCP send buffer in a single call

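Remote TCP test with MTU = 1500. Each segment carries 1460 bytes (see the tcpdump output; what read() returns to user space is not necessarily 1460, because several consecutive 1460-byte segments may be merged and returned together, for example 2920). That is 40 less than 1500, exactly the 20-byte TCP header plus the 20-byte IP header:
The client sends 128*1024 = 131072 bytes.
The server receives: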
Receive client message: 1460

Receive client message: 1460

Receive client message: 1460

Receive client message: 1460

Receive client message: 2920

Receive client message: 1460

Receive client message: 1460

Receive client message: 1460

Receive client message: 1460

Receive client message: 1460

Receive client message: 1460

Receive client message: 1460

Receive client message: 1460

Receive client message: 1460

Receive client message: 1460

Receive client message: 1460

Receive client message: 1460

Receive client message: 1460

Receive client message: 1460

Receive client message: 1460

Receive client message: 1460

Receive client message: 1460

Receive client message: 1460

Receive client message: 1460

Receive client message: 1460

Receive client message: 2920

Receive client message: 1460

Receive client message: 1460

Receive client message: 1460

Receive client message: 1460

Receive client message: 1460

Receive client message: 1460

Receive client message: 1460

Receive client message: 1460

Receive client message: 1460

Receive client message: 1460

Receive client message: 1460

Receive client message: 1460

Receive client message: 1460

Receive client message: 1460

Receive client message: 1460

Receive client message: 1460

Receive client message: 1460

Receive client message: 1460

Receive client message: 1460

Receive client message: 1460

Receive client message: 1460

Receive client message: 1460

Receive client message: 1460

Receive client message: 1460

Receive client message: 1460

Receive client message: 1460

Receive client message: 1460

Receive client message: 1460

Receive client message: 1460

Receive client message: 1460

Receive client message: 1460

Receive client message: 1460

Receive client message: 1460

Receive client message: 1460

Receive client message: 1460

Receive client message: 1460

Receive client message: 1460

Receive client message: 1460

Receive client message: 1460

Receive client message: 1460

Receive client message: 1460

Receive client message: 1460

Receive client message: 1460

Receive client message: 1460

Receive client message: 1460

Receive client message: 1460

Receive client message: 1460

Receive client message: 2920

Receive client message: 1460

Receive client message: 1460

Receive client message: 1460

Receive client message: 1460

Receive client message: 1460

Receive client message: 1460

Receive client message: 1460

Receive client message: 1460

Receive client message: 1460

Receive client message: 1460

Receive client message: 1460

Receive client message: 1096

Receive client message: 1460

Receive client message: 36

The client sends 64*1024 bytes; the server receives:
[ec2-user@ip-172-31-11-211 cplus]$ ./server

accept client 202.114.6.66: 56605

Receive client message: 2920

Receive client message: 1460

Receive client message: 1460

Receive client message: 2920

Receive client message: 1460

Receive client message: 1460

Receive client message: 1460

Receive client message: 1460

Receive client message: 1460

Receive client message: 1460

Receive client message: 1460

Receive client message: 1460

Receive client message: 1460

Receive client message: 1460

Receive client message: 1460

Receive client message: 1460

Receive client message: 1460

Receive client message: 1460

Receive client message: 1460

Receive client message: 1460

Receive client message: 2920

Receive client message: 1460

Receive client message: 1460

Receive client message: 2920

Receive client message: 4380

Receive client message: 1460

Receive client message: 1460

Receive client message: 1460

Receive client message: 1460

Receive client message: 1460

Receive client message: 1460

Receive client message: 1460

Receive client message: 1460

Receive client message: 1460

Receive client message: 1460

Receive client message: 1460

Receive client message: 1460

Receive client message: 1460

Receive client message: 1296

Packet capture with tcpdump on the server:
[ec2-user@ip-172-31-11-211 cplus]$ sudo tcpdump -i eth0 -nn 'port 8888'

tcpdump: verbose output suppressed, use -v or -vv for full protocol decode

listening on eth0, link-type EN10MB (Ethernet), capture size 65535 bytes

Three-way handshake:

10:03:59.109074 IP 202.114.6.66.7645 > 172.31.11.211.8888: Flags [S], seq 1467956558, win 64512, options [mss 1460,nop,wscale 0,nop,nop,sackOK], length 0

10:03:59.109107 IP 172.31.11.211.8888 > 202.114.6.66.7645: Flags [S.], seq 3475540437, ack 1467956559, win 17922, options [mss 8961,nop,nop,sackOK,nop,wscale 6], length 0

10:03:59.319545 IP 202.114.6.66.7645 > 172.31.11.211.8888: Flags [.], ack 1, win 64512, length 0

10:03:59.320406 IP 202.114.6.66.7645 > 172.31.11.211.8888: Flags [.], seq 1:1461, ack 1, win 64512, length 1460

10:03:59.320463 IP 202.114.6.66.7645 > 172.31.11.211.8888: Flags [.], seq 1461:2921, ack 1, win 64512, length 1460

10:03:59.320565 IP 172.31.11.211.8888 > 202.114.6.66.7645: Flags [.], ack 1461, win 561, length 0

10:03:59.320595 IP 172.31.11.211.8888 > 202.114.6.66.7645: Flags [.], ack 2921, win 841, length 0

10:03:59.320669 IP 202.114.6.66.7645 > 172.31.11.211.8888: Flags [.], seq 2921:4381, ack 1, win 64512, length 1460

10:03:59.320692 IP 172.31.11.211.8888 > 202.114.6.66.7645: Flags [.], ack 4381, win 884, length 0

10:03:59.320717 IP 202.114.6.66.7645 > 172.31.11.211.8888: Flags [.], seq 4381:5841, ack 1, win 64512, length 1460

10:03:59.320722 IP 172.31.11.211.8888 > 202.114.6.66.7645: Flags [.], ack 5841, win 884, length 0
10:03:59.531388 IP 202.114.6.66.7645 > 172.31.11.211.8888: Flags [.], seq 5841:7301, ack 1, win 64512, length 1460

10:03:59.531419 IP 202.114.6.66.7645 > 172.31.11.211.8888: Flags [.], seq 7301:8761, ack 1, win 64512, length 1460

10:03:59.531476 IP 172.31.11.211.8888 > 202.114.6.66.7645: Flags [.], ack 7301, win 884, length 0

10:03:59.531512 IP 172.31.11.211.8888 > 202.114.6.66.7645: Flags [.], ack 8761, win 884, length 0
10:03:59.531756 IP 202.114.6.66.7645 > 172.31.11.211.8888: Flags [.], seq 8761:10221, ack 1, win 64512, length 1460

10:03:59.531785 IP 172.31.11.211.8888 > 202.114.6.66.7645: Flags [.], ack 10221, win 884, length 0

10:03:59.531810 IP 202.114.6.66.7645 > 172.31.11.211.8888: Flags [.], seq 10221:11681, ack 1, win 64512, length 1460

10:03:59.531814 IP 172.31.11.211.8888 > 202.114.6.66.7645: Flags [.], ack 11681, win 884, length 0

10:03:59.531989 IP 202.114.6.66.7645 > 172.31.11.211.8888: Flags [.], seq 11681:13141, ack 1, win 64512, length 1460

10:03:59.532008 IP 172.31.11.211.8888 > 202.114.6.66.7645: Flags [.], ack 13141, win 884, length 0
10:03:59.532173 IP 202.114.6.66.7645 > 172.31.11.211.8888: Flags [.], seq 13141:14601, ack 1, win 64512, length 1460

10:03:59.532213 IP 202.114.6.66.7645 > 172.31.11.211.8888: Flags [.], seq 14601:16061, ack 1, win 64512, length 1460

10:03:59.532241 IP 172.31.11.211.8888 > 202.114.6.66.7645: Flags [.], ack 16061, win 884, length 0

10:03:59.532498 IP 202.114.6.66.7645 > 172.31.11.211.8888: Flags [.], seq 16061:17521, ack 1, win 64512, length 1460

10:03:59.572398 IP 172.31.11.211.8888 > 202.114.6.66.7645: Flags [.], ack 17521, win 884, length 0

10:03:59.743855 IP 202.114.6.66.7645 > 172.31.11.211.8888: Flags [.], seq 17521:18981, ack 1, win 64512, length 1460

10:03:59.744031 IP 202.114.6.66.7645 > 172.31.11.211.8888: Flags [.], seq 18981:20441, ack 1, win 64512, length 1460

10:03:59.744052 IP 172.31.11.211.8888 > 202.114.6.66.7645: Flags [.], ack 20441, win 884, length 0

10:03:59.744295 IP 202.114.6.66.7645 > 172.31.11.211.8888: Flags [.], seq 20441:21901, ack 1, win 64512, length 1460

10:03:59.744587 IP 202.114.6.66.7645 > 172.31.11.211.8888: Flags [.], seq 21901:23361, ack 1, win 64512, length 1460

10:03:59.744619 IP 172.31.11.211.8888 > 202.114.6.66.7645: Flags [.], ack 23361, win 884, length 0

10:03:59.744736 IP 202.114.6.66.7645 > 172.31.11.211.8888: Flags [.], seq 23361:24821, ack 1, win 64512, length 1460

10:03:59.745009 IP 202.114.6.66.7645 > 172.31.11.211.8888: Flags [.], seq 24821:26281, ack 1, win 64512, length 1460

10:03:59.745032 IP 172.31.11.211.8888 > 202.114.6.66.7645: Flags [.], ack 26281, win 884, length 0

10:03:59.745252 IP 202.114.6.66.7645 > 172.31.11.211.8888: Flags [.], seq 26281:27741, ack 1, win 64512, length 1460

10:03:59.745493 IP 202.114.6.66.7645 > 172.31.11.211.8888: Flags [.], seq 27741:29201, ack 1, win 64512, length 1460

10:03:59.745540 IP 172.31.11.211.8888 > 202.114.6.66.7645: Flags [.], ack 29201, win 884, length 0

10:03:59.745773 IP 202.114.6.66.7645 > 172.31.11.211.8888: Flags [.], seq 29201:30661, ack 1, win 64512, length 1460

10:03:59.745999 IP 202.114.6.66.7645 > 172.31.11.211.8888: Flags [.], seq 30661:32121, ack 1, win 64512, length 1460

10:03:59.746024 IP 172.31.11.211.8888 > 202.114.6.66.7645: Flags [.], ack 32121, win 884, length 0

10:03:59.746618 IP 202.114.6.66.7645 > 172.31.11.211.8888: Flags [.], seq 32121:33581, ack 1, win 64512, length 1460

10:03:59.746636 IP 202.114.6.66.7645 > 172.31.11.211.8888: Flags [.], seq 33581:35041, ack 1, win 64512, length 1460

10:03:59.746659 IP 172.31.11.211.8888 > 202.114.6.66.7645: Flags [.], ack 35041, win 884, length 0

10:03:59.746808 IP 202.114.6.66.7645 > 172.31.11.211.8888: Flags [.], seq 35041:36501, ack 1, win 64512, length 1460

10:03:59.747090 IP 202.114.6.66.7645 > 172.31.11.211.8888: Flags [.], seq 36501:37961, ack 1, win 64512, length 1460

10:03:59.747108 IP 172.31.11.211.8888 > 202.114.6.66.7645: Flags [.], ack 37961, win 884, length 0

10:03:59.783130 IP 202.114.6.66.7645 > 172.31.11.211.8888: Flags [.], seq 37961:39421, ack 1, win 64512, length 1460

10:03:59.783152 IP 202.114.6.66.7645 > 172.31.11.211.8888: Flags [.], seq 39421:40881, ack 1, win 64512, length 1460

10:03:59.783179 IP 172.31.11.211.8888 > 202.114.6.66.7645: Flags [.], ack 40881, win 884, length 0

10:03:59.954796 IP 202.114.6.66.7645 > 172.31.11.211.8888: Flags [.], seq 40881:42341, ack 1, win 64512, length 1460

10:03:59.954808 IP 202.114.6.66.7645 > 172.31.11.211.8888: Flags [.], seq 42341:43801, ack 1, win 64512, length 1460

10:03:59.954859 IP 202.114.6.66.7645 > 172.31.11.211.8888: Flags [.], seq 43801:45261, ack 1, win 64512, length 1460

10:03:59.954902 IP 172.31.11.211.8888 > 202.114.6.66.7645: Flags [.], ack 43801, win 884, length 0

10:03:59.955052 IP 202.114.6.66.7645 > 172.31.11.211.8888: Flags [.], seq 45261:46721, ack 1, win 64512, length 1460

10:03:59.955072 IP 172.31.11.211.8888 > 202.114.6.66.7645: Flags [.], ack 46721, win 884, length 0

10:03:59.955171 IP 202.114.6.66.7645 > 172.31.11.211.8888: Flags [.], seq 46721:48181, ack 1, win 64512, length 1460

10:03:59.955303 IP 202.114.6.66.7645 > 172.31.11.211.8888: Flags [.], seq 48181:49641, ack 1, win 64512, length 1460

10:03:59.955321 IP 172.31.11.211.8888 > 202.114.6.66.7645: Flags [.], ack 49641, win 884, length 0

10:03:59.955385 IP 202.114.6.66.7645 > 172.31.11.211.8888: Flags [.], seq 49641:51101, ack 1, win 64512, length 1460

10:03:59.955542 IP 202.114.6.66.7645 > 172.31.11.211.8888: Flags [.], seq 51101:52561, ack 1, win 64512, length 1460

10:03:59.955561 IP 172.31.11.211.8888 > 202.114.6.66.7645: Flags [.], ack 52561, win 884, length 0

10:03:59.955699 IP 202.114.6.66.7645 > 172.31.11.211.8888: Flags [.], seq 52561:54021, ack 1, win 64512, length 1460

10:03:59.955841 IP 202.114.6.66.7645 > 172.31.11.211.8888: Flags [.], seq 54021:55481, ack 1, win 64512, length 1460

10:03:59.955863 IP 172.31.11.211.8888 > 202.114.6.66.7645: Flags [.], ack 55481, win 884, length 0

10:03:59.956002 IP 202.114.6.66.7645 > 172.31.11.211.8888: Flags [.], seq 55481:56941, ack 1, win 64512, length 1460

10:03:59.956063 IP 202.114.6.66.7645 > 172.31.11.211.8888: Flags [.], seq 56941:58401, ack 1, win 64512, length 1460

10:03:59.956090 IP 172.31.11.211.8888 > 202.114.6.66.7645: Flags [.], ack 58401, win 884, length 0

10:03:59.956255 IP 202.114.6.66.7645 > 172.31.11.211.8888: Flags [.], seq 58401:59861, ack 1, win 64512, length 1460

10:03:59.956394 IP 202.114.6.66.7645 > 172.31.11.211.8888: Flags [.], seq 59861:61321, ack 1, win 64512, length 1460

10:03:59.956470 IP 172.31.11.211.8888 > 202.114.6.66.7645: Flags [.], ack 61321, win 884, length 0

10:03:59.956508 IP 202.114.6.66.7645 > 172.31.11.211.8888: Flags [.], seq 61321:62781, ack 1, win 64512, length 1460

10:03:59.956647 IP 202.114.6.66.7645 > 172.31.11.211.8888: Flags [.], seq 62781:64241, ack 1, win 64512, length 1460

10:03:59.956683 IP 172.31.11.211.8888 > 202.114.6.66.7645: Flags [.], ack 64241, win 884, length 0

10:03:59.956734 IP 202.114.6.66.7645 > 172.31.11.211.8888: Flags [P.], seq 64241:65537, ack 1, win 64512, length 1296

10:03:59.956754 IP 172.31.11.211.8888 > 202.114.6.66.7645: Flags [.], ack 65537, win 884, length 0

Connection teardown (FIN packets):
10:07:50.643136 IP 202.114.6.66.7645 > 172.31.11.211.8888: Flags [F.], seq 65537, ack 1, win 64512, length 0

10:07:50.643216 IP 172.31.11.211.8888 > 202.114.6.66.7645: Flags [F.], seq 1, ack 65538, win 884, length 0

10:07:50.853756 IP 202.114.6.66.7645 > 172.31.11.211.8888: Flags [.], ack 2, win 64512, length 0

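II. UDP
In both the local and the remote test, the largest UDP payload that could be sent was 65507 bytes (just under 64 KB), which is exactly 65535 - 20 (IP header) - 8 (UDP header). The client, the server, and tcpdump each show it as a single datagram.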
[ec2-user@ip-172-31-11-211 cplus]$ strace -o udpserver.strace ./udpserver

Received: 65507
System calls seen in the strace output:
socket(PF_INET, SOCK_DGRAM, IPPROTO_IP) = 3

bind(3, {sa_family=AF_INET, sin_port=htons(6001), sin_addr=inet_addr("0.0.0.0")}, 16) = 0

recvfrom(3, "????????????????????????????????"..., 131072, 0, {sa_family=AF_INET, sin_port=htons(6001), sin_addr=inet_addr("0.0.0.0")}, [16]) = 65507

fstat(1, {st_mode=S_IFCHR|0620, st_rdev=makedev(136, 0), ...}) = 0

mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f5fd61b7000

write(1, "Received: 65507\n", 16) = 16

[ec2-user@ip-172-31-11-211 cplus]$ sudo tcpdump -i eth0 -nn "port 6001"

tcpdump: verbose output suppressed, use -v or -vv for full protocol decode

listening on eth0, link-type EN10MB (Ethernet), capture size 65535 bytes

21:49:36.338490 IP 202.114.6.66.58645 > 172.31.11.211.6001: UDP, length 65507
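So UDP's sendto() has a size limit, typically 65507 bytes; exceeding it makes the call return -1 immediately. UDP itself does no segmentation, ordering, or reassembly of application data, so the server's recvfrom() reads the whole datagram in one call. (A datagram larger than the link MTU is still fragmented and reassembled by the IP layer; a capture filtered by port typically shows only the first fragment, because only that fragment carries the UDP header.)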
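The sendto() limit can be probed with a small loopback script. This is a sketch rather than the program used for the measurements above: try_udp_send is a hypothetical helper, and whether 65507 bytes succeeds depends on the operating system and its settings (it did on the Linux host used in this article).

```python
import errno
import socket


def try_udp_send(size):
    """Send one `size`-byte datagram over loopback.

    Returns (bytes_sent, None) on success or (None, errno) on failure.
    """
    receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        receiver.bind(("127.0.0.1", 0))  # let the OS pick a free port
        try:
            sent = sender.sendto(b"x" * size, receiver.getsockname())
            return sent, None
        except OSError as exc:
            return None, exc.errno
    finally:
        sender.close()
        receiver.close()


if __name__ == "__main__":
    for size in (1024, 65507, 65508):
        sent, err = try_udp_send(size)
        label = errno.errorcode.get(err, err) if err is not None else "ok"
        print(size, sent, label)
```

A payload of 65508 bytes can never fit in a single IPv4 datagram, so that call is expected to fail with EMSGSIZE, the errno behind "Message too long".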

III. Kernel-space TCP read and write buffers
Default socket send and receive buffer sizes on Linux:
The default TCP receive buffer size is 87380:
[ec2-user@ip-172-31-11-211 cplus]$ cat /proc/sys/net/ipv4/tcp_rmem

4096 87380 3981312
The default TCP send buffer size is 16384:
[ec2-user@ip-172-31-11-211 cplus]$ cat /proc/sys/net/ipv4/tcp_wmem

4096 16384 3981312
[ec2-user@ip-172-31-11-211 cplus]$ cat /proc/sys/net/ipv4/tcp_mem

93312 124416 186624
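The three numbers in tcp_rmem and tcp_wmem are the minimum, default, and maximum buffer size in bytes. A small parser makes that explicit; the names TcpMem, parse_tcp_mem_line, and read_tcp_mem are invented for this sketch, and read_tcp_mem only works on Linux. tcp_mem is different: its three values are thresholds counted in pages, so it is not handled here.

```python
from collections import namedtuple

TcpMem = namedtuple("TcpMem", "minimum default maximum")


def parse_tcp_mem_line(text):
    """Parse the three integers of tcp_rmem or tcp_wmem (bytes)."""
    minimum, default, maximum = (int(field) for field in text.split())
    return TcpMem(minimum, default, maximum)


def read_tcp_mem(name):
    """Read /proc/sys/net/ipv4/<name> (Linux only), e.g. name='tcp_rmem'."""
    with open("/proc/sys/net/ipv4/" + name) as f:
        return parse_tcp_mem_line(f.read())


if __name__ == "__main__":
    # Values copied from the output shown above.
    print(parse_tcp_mem_line("4096 87380 3981312"))  # tcp_rmem
    print(parse_tcp_mem_line("4096 16384 3981312"))  # tcp_wmem
```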

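Testing the sender's buffers. The "buffer size" log lines print the receive buffer size first and the send buffer size second, so by default the receive buffer is 174758 bytes and the send buffer is 660150 bytes. After 1048576 bytes of data are sent, the send buffer has grown automatically to 1320300 bytes. A single write() completes and returns 1048576 (the size of the character array being sent); strace shows the write() system call returning the full count directly, write(3, "????????????????????????????????"..., 1048576) = 1048576: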
[ec2-user@ip-172-31-11-211 cplus]$ ./client

Default buffer size: 174758, 660150

3:34:16 Connected! Sleep 5secs before sending message

3:34:21 Sending message.

3:34:21 Send end.

send len: 1048576

After buffer size: 174758, 1320300



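Testing the receiver's buffers. The reported sizes are the same before and after receiving: the read buffer is still 87380 and the write buffer is still 16384. (The lo interface has a link-layer MTU of 65536, and every packet received is smaller than 65535 bytes; the largest in the tcpdump output above is 53056.) The values returned by read() are far larger than this receive buffer size. Note that "receive buffer" here means the kernel's buffer, not the byte array the application passes to read(); that array was set to 1048576 bytes, which is large enough: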
[ec2-user@ip-172-31-11-211 cplus]$ ./server

Default buffer size: 87380, 16384 // matches the defaults under /proc/sys/net/ipv4/

3:34:14 After listen..

3:34:14 Sleep 10 secs before accept

3:34:24 Accept returned.

accept client 127.0.0.1: 50397

3:34:24 Sleep 10 secs before accept

3:34:24 Receive message

Receive message size: 138900

3:34:24 Receive message

Receive message size: 123028

3:34:24 Receive message

Receive message size: 130964

3:34:24 Receive message

Receive message size: 130964

3:34:24 Receive message

Receive message size: 522671

3:34:24 Receive message

Receive message size: 2049

After buffer size: 87380, 16384

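Increasing the data sent to 4*1024*1024 bytes (larger than the maximum in tcp_rmem):
Sender result: send() or write() returns 4194304 in a single call, even though the TCP send buffer did not grow beyond 4194304; it is only 2112480. So the send() or write() API returns the full count, and strace shows the underlying sendto() system call also returning 4194304. (The log timestamps show about two seconds between "Sending message" and "Send end", which is consistent with the call waiting for buffer space while the receiver drains data.)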
[ec2-user@ip-172-31-11-211 cplus]$ ./client

Default buffer size: 174758, 660150

3:52:59 Connected! Sleep 5secs before sending message

3:53:4 Sending message.

3:53:6 Send end.

send len: 4194304

After buffer size: 174758, 2112480
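One way to see why a send larger than the send buffer can still return the full count: a blocking send() keeps copying data into the kernel as space frees up and returns once everything has been handed over. A non-blocking socket exposes the limit instead. The sketch below is an illustration under that assumption, with a hypothetical function name fill_send_path; it writes to a TCP connection whose peer never reads and stops when the kernel refuses more data. The byte count it returns varies from system to system.

```python
import socket


def fill_send_path(chunk_size=64 * 1024):
    """Write to a non-blocking TCP socket whose peer never reads.

    Returns the number of bytes the kernel accepted before send()
    reported that it would block.
    """
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))
    listener.listen(1)
    client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    client.connect(listener.getsockname())
    server, _ = listener.accept()  # accepted, but recv() is never called
    client.setblocking(False)
    chunk = b"x" * chunk_size
    accepted = 0
    try:
        while True:
            try:
                # send() may accept fewer bytes than offered.
                accepted += client.send(chunk)
            except BlockingIOError:  # EAGAIN/EWOULDBLOCK: no buffer space left
                break
    finally:
        client.close()
        server.close()
        listener.close()
    return accepted


if __name__ == "__main__":
    print("bytes accepted before blocking:", fill_send_path())
```

With a blocking socket, the same situation makes send() wait at the point where this loop stops, until the receiver reads and frees buffer space.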
Receiver result: the total received is 4194304 bytes, matching the number of bytes sent:
4:1:30 Receive message
Receive message size: 654820
4:1:30 Receive message
Receive message size: 1048576
4:1:30 Receive message
Receive message size: 1046400
4:1:30 Receive message
Receive message size: 1048576
4:1:30 Receive message
Receive message size: 64380
4:1:30 Receive message
Receive message size: 331552
After buffer size: 87380, 16384