Problems transferring data between Docker containers on different hosts, and how to fix them
2015-05-27 14:41
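I currently run mostly Docker 1.5, as test environments for development and operations and in production for game and platform services. Yesterday a game developer reported that syncing data between containers on two different hosts failed with the following error: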
    Corrupted MAC on input.
    Disconnecting: Packet corrupt
    lost connection

A Google search turned up the following explanation of the cause:

"Corrupted MAC on input"

This situation happens when the packet is decrypted, the length field is checked, the MAC is computed over the decrypted data and then checked against the MAC field from the SSH packet (see the picture above). If those two MACs don't match we print the "bad mac" error message.

Possible reasons for "Corrupted MAC on input"

If you see those messages instead of the "Bad packet length" one you can safely assume that the encryption/decryption works fine. If it wasn't then the packet length check could hardly pass a few times in a row - assuming we have seen the message a couple of times at least. That means that we have a data corruption somewhere. There are many situations this could happen. It could be a malfunctioning:

- firewall, or
- NAT, or
- NIC device driver, or
- NIC itself, or
- switch/router along the way, or
- ...something else that corrupted the data in between the two SSH parties

Again, it could also be the SSH implementation itself but as with the "bad packet length" problem that's usually not the case. Note that all those corruptions assume that the TCP packet passes the checksum test but that can easily happen. The checksum is basically a sum of all 16 bit words in the TCP frame; see RFC 793 (Transmission Control Protocol) for the details.

Source: https://blogs.oracle.com/janp/entry/ssh_messages_code_bad_packet
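My containers communicate across hosts over OVS + VXLAN. To check whether that was really the cause, I ran the following tests.
1. Test environment

    Container    Internal IP     Host
    test_mac     172.16.2.114    A
    test_mac_2   172.16.2.115    B
    test_mac_3   172.16.2.116    A

Generate a 3 GB test file in container test_mac on host A:

    09:56:16 # dd if=/dev/zero of=/tmp/test.tgz bs=1G count=3
    3+0 records in
    3+0 records out
    3221225472 bytes (3.2 GB) copied, 5.21114 s, 618 MB/s

2. Testing data sync from a container on the other host (B)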
    root@41255dbf85b4:/tmp
    09:56:21 # scp -P 22 172.16.2.114:/tmp/test.tgz .
    root@172.16.2.114's password:
    test.tgz                  3%   99MB  50.6MB/s   00:58 ETA
    Corrupted MAC on input.
    Disconnecting: Packet corrupt
    lost connection

As you can see, the problem occurs: the transfer aborts with the reported error.
3. Testing data sync between containers on the same host

    09:59:50 # scp -P 22 172.16.2.114:/tmp/test.tgz .
    root@172.16.2.114's password:
    test.tgz                100% 3072MB  80.8MB/s   00:38
    root@5805f5017f89:/tmp

Between containers on the same host the sync completes normally.
4. Fixing data sync between containers across hosts
First, get the PID of container test_mac_2 on host B:

    [root@ip-10-10-125-10 ~]# docker inspect test_mac_2|grep -i pid
            "PidMode": "",
            "Pid": 29416,

The PID is 29416.
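For scripting, the PID can be extracted from that output instead of being read by eye. A minimal sketch, assuming the `"Pid": <number>,` line format shown above; `extract_pid` is a hypothetical helper name. The demonstration feeds it sample text so that it runs without Docker.

```shell
#!/bin/sh
# Hypothetical helper: print the first numeric "Pid" value found on stdin.
# The pattern does not match "PidMode", because it requires `"Pid":` exactly.
extract_pid() {
    sed -n 's/^[[:space:]]*"Pid":[[:space:]]*\([0-9][0-9]*\).*/\1/p' | head -n 1
}

# Demonstration with text in the same format as the output above:
printf '        "PidMode": "",\n        "Pid": 29416,\n' | extract_pid

# On a real host this would be used as:
#   docker inspect test_mac_2 | extract_pid
# A shorter alternative is to ask Docker for the field directly:
#   docker inspect --format '{{.State.Pid}}' test_mac_2
```

The demonstration prints `29416`. The PID changes whenever the container restarts, so it has to be looked up again each time.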
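Then use nsenter to turn off checksumming on eth1, the container's internal network interface. The command has the following form (lowercase -k only lists the current features, uppercase -K changes them):

    nsenter --target 29416 --mount --uts --ipc --net --pid -- /bin/bash -c "ethtool -k eth1"

nsenter is used because it runs the command with full privileges, which are needed to make the change. Entered directly inside the container, the command fails with the following error:

    10:12:52 # ethtool -K eth0 tx off rx off
    Cannot get device feature names: No such device
    root@41255dbf85b4:/

The features of eth1 before the fix: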
    [root@ip-10-10-125-10 ~]# nsenter --target 29416 --mount --uts --ipc --net --pid -- /bin/bash -c "ethtool -k eth1"
    Features for eth1:
    rx-checksumming: on
    tx-checksumming: on
        tx-checksum-ipv4: off [fixed]
        tx-checksum-ip-generic: on
        tx-checksum-ipv6: off [fixed]
        tx-checksum-fcoe-crc: off [fixed]
        tx-checksum-sctp: off [fixed]
    scatter-gather: on
        tx-scatter-gather: on
        tx-scatter-gather-fraglist: on
    tcp-segmentation-offload: on
        tx-tcp-segmentation: on
        tx-tcp-ecn-segmentation: on
        tx-tcp6-segmentation: on
    udp-fragmentation-offload: on
    generic-segmentation-offload: on
    generic-receive-offload: on
    large-receive-offload: off [fixed]
    rx-vlan-offload: on
    tx-vlan-offload: on
    ntuple-filters: off [fixed]
    receive-hashing: off [fixed]
    highdma: on
    rx-vlan-filter: off [fixed]
    vlan-challenged: off [fixed]
    tx-lockless: on [fixed]
    netns-local: off [fixed]
    tx-gso-robust: off [fixed]
    tx-fcoe-segmentation: off [fixed]
    tx-gre-segmentation: on
    tx-ipip-segmentation: on
    tx-sit-segmentation: on
    tx-udp_tnl-segmentation: on
    tx-mpls-segmentation: off [fixed]
    fcoe-mtu: off [fixed]
    tx-nocache-copy: on
    loopback: off [fixed]
    rx-fcs: off [fixed]
    rx-all: off [fixed]
    tx-vlan-stag-hw-insert: on
    rx-vlan-stag-hw-parse: on
    rx-vlan-stag-filter: off [fixed]
    busy-poll: off [fixed]

Now run the command that turns checksumming off:
    [root@ip-10-10-125-10 ~]# nsenter --target 29416 --mount --uts --ipc --net --pid -- /bin/bash -c "ethtool -K eth1 tx off rx off "
    Actual changes:
    rx-checksumming: off
    tx-checksumming: off
        tx-checksum-ip-generic: off
    tcp-segmentation-offload: off
        tx-tcp-segmentation: off [requested on]
        tx-tcp-ecn-segmentation: off [requested on]
        tx-tcp6-segmentation: off [requested on]
    udp-fragmentation-offload: off [requested on]

Then check the eth1 features again:
    Features for eth1:
    rx-checksumming: off
    tx-checksumming: off
        tx-checksum-ipv4: off [fixed]
        tx-checksum-ip-generic: off
        tx-checksum-ipv6: off [fixed]
        tx-checksum-fcoe-crc: off [fixed]
        tx-checksum-sctp: off [fixed]
    scatter-gather: on
        tx-scatter-gather: on
        tx-scatter-gather-fraglist: on
    tcp-segmentation-offload: off
        tx-tcp-segmentation: off [requested on]
        tx-tcp-ecn-segmentation: off [requested on]
        tx-tcp6-segmentation: off [requested on]
    udp-fragmentation-offload: off [requested on]
    generic-segmentation-offload: on
    generic-receive-offload: on
    large-receive-offload: off [fixed]
    rx-vlan-offload: on
    tx-vlan-offload: on
    ntuple-filters: off [fixed]
    receive-hashing: off [fixed]
    highdma: on
    rx-vlan-filter: off [fixed]
    vlan-challenged: off [fixed]
    tx-lockless: on [fixed]
    netns-local: off [fixed]
    tx-gso-robust: off [fixed]
    tx-fcoe-segmentation: off [fixed]
    tx-gre-segmentation: on
    tx-ipip-segmentation: on
    tx-sit-segmentation: on
    tx-udp_tnl-segmentation: on
    tx-mpls-segmentation: off [fixed]
    fcoe-mtu: off [fixed]
    tx-nocache-copy: on
    loopback: off [fixed]
    rx-fcs: off [fixed]
    rx-all: off [fixed]
    tx-vlan-stag-hw-insert: on
    rx-vlan-stag-hw-parse: on
    rx-vlan-stag-filter: off [fixed]
    busy-poll: off [fixed]

rx-checksumming and tx-checksumming have both clearly changed from on to off.
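Picking the two relevant lines out of a long feature list can be automated. A small sketch, assuming the `name: value` layout of the `ethtool -k` output shown above; `csum_state` is a hypothetical helper name, and the demonstration uses sample text so that it runs without ethtool.

```shell
#!/bin/sh
# Hypothetical helper: reduce `ethtool -k` output on stdin to the two
# top-level checksumming lines, printed as name=value.
csum_state() {
    awk '/^[ \t]*(rx|tx)-checksumming:/ {
        sub(/^[ \t]+/, "")      # drop any leading indentation
        sub(/:[ \t]*/, "=")     # "name: value" -> "name=value"
        print
    }'
}

# Demonstration with a fragment in the same format as the output above:
printf 'rx-checksumming: off\ntx-checksumming: off\n\ttx-checksum-ipv4: off [fixed]\n' | csum_state

# On host B this could be used as (PID taken from the earlier step):
#   nsenter --target 29416 --net -- ethtool -k eth1 | csum_state
```

The demonstration prints `rx-checksumming=off` and `tx-checksumming=off`. Sub-features such as `tx-checksum-ipv4` are skipped on purpose.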
Then, from test_mac_2 on host B, sync the data from test_mac on host A:

    10:16:09 # scp -P 22 172.16.2.114:/tmp/test.tgz .
    root@172.16.2.114's password:
    test.tgz                100% 3072MB  46.6MB/s   01:06
    root@41255dbf85b4:/

The data now transfers normally.
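The fix in this section can be wrapped in a small helper for reuse on other containers. This is a sketch under stated assumptions, not the exact command used above: `csum_off_cmd` is a hypothetical name, and it enters only the network namespace (`--net`), so it runs the host's own ethtool binary rather than one installed in the container. It prints the command instead of executing it, so the command can be reviewed first.

```shell
#!/bin/sh
# Hypothetical helper: build the nsenter/ethtool command line.
#   $1: host-side PID of the container
#   $2: interface name inside the container
csum_off_cmd() {
    printf 'nsenter --target %s --net -- ethtool -K %s tx off rx off\n' "$1" "$2"
}

# Review the command for the container from this article:
csum_off_cmd 29416 eth1

# To apply it on host B (as root, with nsenter and ethtool installed on the host):
#   csum_off_cmd 29416 eth1 | sh
```

The review call prints `nsenter --target 29416 --net -- ethtool -K eth1 tx off rx off`. The PID changes when the container restarts, so the lookup and the command need to be repeated after a restart.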
This article comes from the “吟―技术交流” blog. Please keep this attribution when reposting: http://dl528888.blog.51cto.com/2382721/1655631