
Linux Kernel Source Code Analysis: Before the Three-Way Handshake, Build the Transport Control Block!

2015-12-01 23:53
This is an original article by freas_1990. Please credit the source when reposting: http://blog.csdn.net/freas_1990/article/details/18999825

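As everyone knows, the most frequently asked TCP/IP exam questions are the three-way handshake and the four-way teardown.

When the server receives a SYN packet, the three-way handshake begins and the tcp_sock state becomes SYN_RECV. That completes the first step of the handshake. The server then sends a SYN+ACK packet back to the client (the second step) and waits for the client's final ACK (the third step).

But have you ever wondered how the tcp_sock here is constructed? How do the familiar socket-programming system calls socket(), bind(), listen(), and accept() relate to the three-way handshake, and how do they map onto it?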
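
From user space, the relationship can be probed with a small experiment. The following is a minimal sketch, assuming a POSIX system with a usable loopback interface; the function name handshake_before_accept is made up for illustration. It calls socket(), bind(), and listen() on a server socket, then connects a client to it before accept() is ever called. If the blocking connect() succeeds, the handshake was completed by the kernel on behalf of the listening socket, and accept() only collected the finished connection.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Returns 0 if a client could connect before the server called accept(). */
int handshake_before_accept(void)
{
	struct sockaddr_in addr;
	socklen_t len = sizeof(addr);
	int lfd, cfd, afd, rc = -1;

	lfd = socket(AF_INET, SOCK_STREAM, 0);
	if (lfd < 0)
		return -1;

	memset(&addr, 0, sizeof(addr));
	addr.sin_family = AF_INET;
	addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
	addr.sin_port = 0;	/* let the kernel choose a free port */

	if (bind(lfd, (struct sockaddr *)&addr, sizeof(addr)) < 0 ||
	    listen(lfd, 8) < 0 ||
	    getsockname(lfd, (struct sockaddr *)&addr, &len) < 0)
		goto out_listen;
	assert(len == sizeof(addr));	/* sanity check on the returned address */

	cfd = socket(AF_INET, SOCK_STREAM, 0);
	if (cfd < 0)
		goto out_listen;

	/* Blocking connect(): accept() has not been called at this point. */
	if (connect(cfd, (struct sockaddr *)&addr, sizeof(addr)) < 0)
		goto out_client;

	/* Only now does the server pick up the already established connection. */
	afd = accept(lfd, NULL, NULL);
	if (afd >= 0) {
		rc = 0;
		close(afd);
	}
out_client:
	close(cfd);
out_listen:
	close(lfd);
	return rc;
}
```

Calling handshake_before_accept() from a main() and checking for a return value of 0 is enough to run the experiment. The backlog of 8 passed to listen() is an arbitrary choice for this sketch.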

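A simplified view of the process:

1. The server calls socket(). This system call mainly does two things:

       A. The Linux kernel's sys_socket() creates the socket descriptor.

       B. inet_create() calls sk_alloc() to create a transport control block.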

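2. The request_sock_queue in inet_connection_sock keeps track of connections that are still being established and of connections that are established but have not yet been accepted. The linked list delimited by rskq_accept_head and rskq_accept_tail in request_sock_queue holds the connection request blocks whose connections are complete. The syn_table in listen_sock (a hash table, not a single linked list) holds the connection request blocks whose handshake is still in progress; requests that fall into the same bucket are chained through dl_next.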

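3. Once the server has called listen(), it can receive new connections. When a client sends a SYN segment to the server, a connection request block is created for that connection request, and once it has been set up, a SYN+ACK segment is sent back. When the server then receives the ACK segment, it creates the TCP transport control block (if the protocol is TCP) and attaches this tcp_sock to the sk member of the tcp_request_sock (a field of the request_sock embedded in it). At the same time, the now completed connection request block is moved to the rskq_accept_head queue, where it waits for the server's accept() call.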

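4. The accept() system call takes the connection request block from the rskq_accept_head queue, associates its transport control block with a socket, and then frees the connection request block, returning its memory to the kernel allocator.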
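
The bookkeeping in steps 2 to 4 can be modeled in user space. The following is a toy sketch, not kernel code: every toy_ name, the integer child handle, the port number used as the lookup key, and the bucket count are assumptions made for illustration. Only dl_next and the accept head and tail mirror the fields mentioned above. A SYN puts a request into a hash bucket, the final ACK moves it to the tail of the accept list, and accept takes the oldest completed request and frees it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define TOY_SYN_TABLE_SIZE 8	/* hypothetical bucket count */

/* Toy stand-in for a connection request block. */
struct toy_request {
	struct toy_request *dl_next;	/* chains requests in a bucket or in the accept list */
	unsigned int peer_port;		/* stands in for the real lookup key */
	int child;			/* stands in for the sk member; 0 means "no child yet" */
};

/* Toy stand-in for request_sock_queue together with its syn_table. */
struct toy_queue {
	struct toy_request *accept_head;	/* like rskq_accept_head */
	struct toy_request *accept_tail;	/* like rskq_accept_tail */
	struct toy_request *syn_table[TOY_SYN_TABLE_SIZE];
};

static unsigned int toy_hash(unsigned int peer_port)
{
	return peer_port % TOY_SYN_TABLE_SIZE;
}

/* SYN received: record a half-open request at the front of its bucket. */
struct toy_request *toy_on_syn(struct toy_queue *q, unsigned int peer_port)
{
	struct toy_request *req = calloc(1, sizeof(*req));
	unsigned int h = toy_hash(peer_port);

	if (req == NULL)
		return NULL;
	req->peer_port = peer_port;
	req->dl_next = q->syn_table[h];
	q->syn_table[h] = req;
	return req;
}

/* Final ACK received: attach the child and move the request to the accept
 * list. Returns 0 on success, -1 if no matching half-open request exists. */
int toy_on_ack(struct toy_queue *q, unsigned int peer_port, int child)
{
	struct toy_request **pp = &q->syn_table[toy_hash(peer_port)];
	struct toy_request *req;

	while (*pp != NULL && (*pp)->peer_port != peer_port)
		pp = &(*pp)->dl_next;
	if (*pp == NULL)
		return -1;

	req = *pp;
	*pp = req->dl_next;		/* unlink from the hash bucket */
	req->dl_next = NULL;
	req->child = child;

	if (q->accept_head == NULL)	/* append at the tail: FIFO order */
		q->accept_head = req;
	else
		q->accept_tail->dl_next = req;
	q->accept_tail = req;
	return 0;
}

/* accept(): take the oldest completed request, keep its child, free the
 * request itself. Returns the child, or 0 if the accept list is empty. */
int toy_accept(struct toy_queue *q)
{
	struct toy_request *req = q->accept_head;
	int child;

	if (req == NULL)
		return 0;
	q->accept_head = req->dl_next;
	if (q->accept_head == NULL)
		q->accept_tail = NULL;
	child = req->child;
	assert(child != 0);	/* completed requests always carry a child */
	free(req);
	return child;
}
```

A zero-initialized struct toy_queue is a valid empty queue. The model ignores locking, timers, and backlog limits, all of which the real kernel has to handle.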

The core of the implementation is the following code:
/* This is not only more efficient than what we used to do, it eliminates
 * a lot of code duplication between IPv4/IPv6 SYN recv processing. -DaveM
 *
 * Actually, we could save lots of memory writes here. tp of listening
 * socket contains all necessary default parameters.
 */
struct sock *tcp_create_openreq_child(struct sock *sk, struct request_sock *req, struct sk_buff *skb)
{
	struct sock *newsk = inet_csk_clone(sk, req, GFP_ATOMIC);

	if (newsk != NULL) {
		const struct inet_request_sock *ireq = inet_rsk(req);
		struct tcp_request_sock *treq = tcp_rsk(req);
		/* Connection-level state of the child (newsk), not of the listener. */
		struct inet_connection_sock *newicsk = inet_csk(newsk);
		struct tcp_sock *newtp;

		/* Now setup tcp_sock */
		newtp = tcp_sk(newsk);
		newtp->pred_flags = 0;
		newtp->rcv_nxt = treq->rcv_isn + 1;
		newtp->snd_nxt = newtp->snd_una = newtp->snd_sml = treq->snt_isn + 1;

		tcp_prequeue_init(newtp);

		tcp_init_wl(newtp, treq->snt_isn, treq->rcv_isn);

		newtp->srtt = 0;
		newtp->mdev = TCP_TIMEOUT_INIT;
		newicsk->icsk_rto = TCP_TIMEOUT_INIT;

		newtp->packets_out = 0;
		newtp->left_out = 0;
		newtp->retrans_out = 0;
		newtp->sacked_out = 0;
		newtp->fackets_out = 0;
		newtp->snd_ssthresh = 0x7fffffff;

		/* So many TCP implementations out there (incorrectly) count the
		 * initial SYN frame in their delayed-ACK and congestion control
		 * algorithms that we must have the following bandaid to talk
		 * efficiently to them.  -DaveM
		 */
		newtp->snd_cwnd = 2;
		newtp->snd_cwnd_cnt = 0;
		newtp->bytes_acked = 0;

		newtp->frto_counter = 0;
		newtp->frto_highmark = 0;

		newicsk->icsk_ca_ops = &tcp_init_congestion_ops;

		tcp_set_ca_state(newsk, TCP_CA_Open);
		tcp_init_xmit_timers(newsk);
		skb_queue_head_init(&newtp->out_of_order_queue);
		newtp->rcv_wup = treq->rcv_isn + 1;
		newtp->write_seq = treq->snt_isn + 1;
		newtp->pushed_seq = newtp->write_seq;
		newtp->copied_seq = treq->rcv_isn + 1;

		newtp->rx_opt.saw_tstamp = 0;

		newtp->rx_opt.dsack = 0;
		newtp->rx_opt.eff_sacks = 0;

		newtp->rx_opt.num_sacks = 0;
		newtp->urg_data = 0;

		if (sock_flag(newsk, SOCK_KEEPOPEN))
			inet_csk_reset_keepalive_timer(newsk,
						       keepalive_time_when(newtp));

		newtp->rx_opt.tstamp_ok = ireq->tstamp_ok;
		if ((newtp->rx_opt.sack_ok = ireq->sack_ok) != 0) {
			if (sysctl_tcp_fack)
				newtp->rx_opt.sack_ok |= 2;
		}
		newtp->window_clamp = req->window_clamp;
		newtp->rcv_ssthresh = req->rcv_wnd;
		newtp->rcv_wnd = req->rcv_wnd;
		newtp->rx_opt.wscale_ok = ireq->wscale_ok;
		if (newtp->rx_opt.wscale_ok) {
			newtp->rx_opt.snd_wscale = ireq->snd_wscale;
			newtp->rx_opt.rcv_wscale = ireq->rcv_wscale;
		} else {
			newtp->rx_opt.snd_wscale = newtp->rx_opt.rcv_wscale = 0;
			newtp->window_clamp = min(newtp->window_clamp, 65535U);
		}
		newtp->snd_wnd = ntohs(skb->h.th->window) << newtp->rx_opt.snd_wscale;
		newtp->max_window = newtp->snd_wnd;

		if (newtp->rx_opt.tstamp_ok) {
			newtp->rx_opt.ts_recent = req->ts_recent;
			newtp->rx_opt.ts_recent_stamp = xtime.tv_sec;
			newtp->tcp_header_len = sizeof(struct tcphdr) + TCPOLEN_TSTAMP_ALIGNED;
		} else {
			newtp->rx_opt.ts_recent_stamp = 0;
			newtp->tcp_header_len = sizeof(struct tcphdr);
		}
#ifdef CONFIG_TCP_MD5SIG
		newtp->md5sig_info = NULL;	/*XXX*/
		if (newtp->af_specific->md5_lookup(sk, newsk))
			newtp->tcp_header_len += TCPOLEN_MD5SIG_ALIGNED;
#endif
		if (skb->len >= TCP_MIN_RCVMSS+newtp->tcp_header_len)
			newicsk->icsk_ack.last_seg_size = skb->len - newtp->tcp_header_len;
		newtp->rx_opt.mss_clamp = req->mss;
		TCP_ECN_openreq_child(newtp, req);

		TCP_INC_STATS_BH(TCP_MIB_PASSIVEOPENS);
	}
	return newsk;
}
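
Two pieces of arithmetic in this function are worth isolating. Each side's SYN consumes one sequence number, so the child's counters start at the initial sequence number plus one, and the window advertised in the final ACK is shifted left by the negotiated scale factor. Below is a minimal user-space sketch of just that arithmetic. The toy_ names and the trimmed-down state struct are hypothetical stand-ins for tcp_sock, not kernel definitions, and the window is assumed to be already converted to host byte order.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, trimmed-down view of the state derived from the handshake. */
struct toy_child_state {
	uint32_t rcv_nxt;	/* next sequence number expected from the peer */
	uint32_t snd_nxt;	/* next sequence number we will send */
	uint32_t snd_una;	/* oldest sequence number not yet acknowledged */
	uint32_t snd_wnd;	/* peer's advertised window, in bytes */
};

/*
 * rcv_isn:    the client's initial sequence number, taken from its SYN
 * snt_isn:    the server's initial sequence number, sent in the SYN+ACK
 * window:     window field of the final ACK, in host byte order
 * wscale_ok:  non-zero if window scaling was negotiated
 * snd_wscale: shift count to apply to windows advertised by the peer
 */
struct toy_child_state toy_init_child(uint32_t rcv_isn, uint32_t snt_isn,
				      uint16_t window, int wscale_ok,
				      unsigned int snd_wscale)
{
	struct toy_child_state st;

	assert(snd_wscale <= 14);	/* largest shift the window scale option allows */

	/* A SYN occupies one sequence number in each direction.
	 * uint32_t arithmetic wraps around, as sequence numbers do. */
	st.rcv_nxt = rcv_isn + 1;
	st.snd_nxt = snt_isn + 1;
	st.snd_una = snt_isn + 1;

	/* Without negotiated scaling the window is used as is. */
	if (!wscale_ok)
		snd_wscale = 0;
	st.snd_wnd = (uint32_t)window << snd_wscale;
	return st;
}
```

For example, a window field of 512 with a negotiated shift of 7 yields a send window of 65536 bytes, which could not be expressed in the 16-bit header field alone.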