
ceph (luminous) data disk failure test


Goal

Simulate a data disk failure on ceph (luminous)
Repair the failure described above


Environment

See the notes on manually deploying a ceph (luminous) environment

Current ceph environment for reference

ceph -s

cluster:
id:     c45b752d-5d4d-4d3a-a3b2-04e73eff4ccd
health: HEALTH_OK

services:
mon: 3 daemons, quorum hh-ceph-128040,hh-ceph-128214,hh-ceph-128215
mgr: openstack(active)
osd: 36 osds: 36 up, 36 in

data:
pools:   1 pools, 2048 pgs
objects: 28024 objects, 109 GB
usage:   331 GB used, 196 TB / 196 TB avail
pgs:     2048 active+clean


osd tree (excerpt)

[root@hh-ceph-128214 ceph]# ceph osd tree
ID  CLASS WEIGHT    TYPE NAME                   STATUS REWEIGHT PRI-AFF
-1       216.00000 root default
-10        72.00000     rack racka07
-3        72.00000         host hh-ceph-128214
12   hdd   6.00000             osd.12              up  1.00000 1.00000
13   hdd   6.00000             osd.13              up  1.00000 1.00000
14   hdd   6.00000             osd.14              up  1.00000 1.00000
15   hdd   6.00000             osd.15              up  1.00000 1.00000
16   hdd   6.00000             osd.16              up  1.00000 1.00000
17   hdd   6.00000             osd.17              up  1.00000 1.00000
18   hdd   6.00000             osd.18              up  1.00000 1.00000
19   hdd   6.00000             osd.19              up  1.00000 1.00000
20   hdd   6.00000             osd.20              up  1.00000 1.00000
21   hdd   6.00000             osd.21              up  1.00000 1.00000
22   hdd   6.00000             osd.22              up  1.00000 1.00000
23   hdd   6.00000             osd.23              up  1.00000 1.00000
-9        72.00000     rack racka12
-2        72.00000         host hh-ceph-128040
0   hdd   6.00000             osd.0               up  1.00000 0.50000
1   hdd   6.00000             osd.1               up  1.00000 1.00000
2   hdd   6.00000             osd.2               up  1.00000 1.00000
3   hdd   6.00000             osd.3               up  1.00000 1.00000


Failure simulation

[root@hh-ceph-128214 ceph]# df -h | grep ceph-14
/dev/sdc1       5.5T  8.8G  5.5T    1% /var/lib/ceph/osd/ceph-14
/dev/sdn3       4.7G  2.1G  2.7G   44% /var/lib/ceph/journal/ceph-14
[root@hh-ceph-128214 ceph]# rm -rf  /var/lib/ceph/osd/ceph-14/*
[root@hh-ceph-128214 ceph]# ls /var/lib/ceph/osd/ceph-14/
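
To watch the cluster react to the damage, a watch loop like the one used later in this test works well (a minimal sketch; the 2-second interval is arbitrary):

watch -n 2 ceph -s          # health should drop to HEALTH_WARN with "1 osds down"
watch -n 2 ceph osd tree    # osd.14 should eventually flip from up to down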


Check the current status

ceph -s

cluster:
id:     c45b752d-5d4d-4d3a-a3b2-04e73eff4ccd
health: HEALTH_WARN
1 osds down
Degraded data redundancy: 3246/121608 objects degraded (2.669%), 124 pgs unclean, 155 pgs degraded

services:
mon: 3 daemons, quorum hh-ceph-128040,hh-ceph-128214,hh-ceph-128215
mgr: openstack(active)
osd: 36 osds: 35 up, 36 in

data:
pools:   1 pools, 2048 pgs
objects: 40536 objects, 157 GB
usage:   493 GB used, 195 TB / 196 TB avail
pgs:     3246/121608 objects degraded (2.669%)
1893 active+clean
155  active+undersized+degraded

io:
client:   132 kB/s rd, 177 MB/s wr, 165 op/s rd, 175 op/s wr


Check the osd tree

[root@hh-ceph-128214 ceph]# ceph osd tree
ID  CLASS WEIGHT    TYPE NAME                   STATUS REWEIGHT PRI-AFF
-1       216.00000 root default
-10        72.00000     rack racka07
-3        72.00000         host hh-ceph-128214
12   hdd   6.00000             osd.12              up  1.00000 1.00000
13   hdd   6.00000             osd.13              up  1.00000 1.00000
14   hdd   6.00000             osd.14            down  1.00000 1.00000
15   hdd   6.00000             osd.15              up  1.00000 1.00000
16   hdd   6.00000             osd.16              up  1.00000 1.00000
17   hdd   6.00000             osd.17              up  1.00000 1.00000
18   hdd   6.00000             osd.18              up  1.00000 1.00000
19   hdd   6.00000             osd.19              up  1.00000 1.00000
20   hdd   6.00000             osd.20              up  1.00000 1.00000
21   hdd   6.00000             osd.21              up  1.00000 1.00000
22   hdd   6.00000             osd.22              up  1.00000 1.00000
23   hdd   6.00000             osd.23              up  1.00000 1.00000
-9        72.00000     rack racka12
-2        72.00000         host hh-ceph-128040
0   hdd   6.00000             osd.0               up  1.00000 0.50000
1   hdd   6.00000             osd.1               up  1.00000 1.00000


Relevant error log

[...] is reporting failure:1
2017-11-24 16:09:24.767761 7fdd215c1700  0 log_channel(cluster) log [DBG] : osd.14 10.199.128.214:6804/11943 reported immediately failed by osd.10 10.199.128.40:6820/12617
2017-11-24 16:09:24.996514 7fdd215c1700  1 mon.hh-ceph-128040@0(leader).osd e328 prepare_failure osd.14 10.199.128.214:6804/11943 from osd.6 10.199.128.40:6812/12317 is reporting failure:1
2017-11-24 16:09:24.996545 7fdd215c1700  0 log_channel(cluster) log [DBG] : osd.14 10.199.128.214:6804/11943 reported immediately failed by osd.6 10.199.128.40:6812/12317
2017-11-24 16:09:25.083523 7fdd23dc6700  0 log_channel(cluster) log [WRN] : Health check failed: 1 osds down (OSD_DOWN)
2017-11-24 16:09:25.087241 7fdd1cdb8700  1 mon.hh-ceph-128040@0(leader).log v17642 check_sub sending message to client.94503 10.199.128.40:0/161437639 with 1 entries (version 17642)
2017-11-24 16:09:25.093344 7fdd1cdb8700  1 mon.hh-ceph-128040@0(leader).osd e329 e329: 36 total, 35 up, 36 in
2017-11-24 16:09:25.093857 7fdd1cdb8700  0 log_channel(cluster) log [DBG] : osdmap e329: 36 total, 35 up, 36 in
2017-11-24 16:09:25.094151 7fdd215c1700  0 mon.hh-ceph-128040@0(leader) e1 handle_command mon_command({"prefix": "osd metadata", "id": 30} v 0) v1
2017-11-24 16:09:25.094192 7fdd215c1700  0 log_channel(audit) log [DBG] : from='client.94503 10.199.128.40:0/161437639' entity='mgr.openstack' cmd=[{"prefix": "osd metadata", "id": 30}]: dispatch
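
These entries come from the monitor leader's log; a hedged way to pull them (the default log location is assumed, adjust to your deployment):

grep 'osd.14' /var/log/ceph/ceph-mon.hh-ceph-128040.log | tail -n 20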


Recovery procedure

Delete the osd.14 auth credentials

[root@hh-ceph-128040 tmp]# ceph auth del osd.14
updated
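
A quick check that the key is really gone (hedged sketch); after the delete this should fail with an ENOENT error:

ceph auth get osd.14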


Remove osd.14 from the CRUSH map

[root@hh-ceph-128214 ~]# ceph osd crush remove osd.14
removed item id 14 name 'osd.14' from crush map
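
To confirm the CRUSH entry is gone (sketch):

ceph osd tree | grep -w osd.14    # should print nothing after the removal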


Remove osd.14 from the osdmap

[root@hh-ceph-128214 ~]# ceph osd rm osd.14
removed osd.14
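
The id should now be absent from the osdmap as well; one way to confirm (sketch):

ceph osd dump | grep -w osd.14    # no output expected after removal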


Check the osd tree after removal

Every 2.0s: ceph osd tree                                                                                                                            Sat Nov 25 15:27:41 2017

ID  CLASS WEIGHT    TYPE NAME                   STATUS REWEIGHT PRI-AFF
-1       210.00000 root default
-10        66.00000     rack racka07
-3        66.00000         host hh-ceph-128214
12   hdd   6.00000             osd.12              up  1.00000 1.00000
13   hdd   6.00000             osd.13              up  1.00000 1.00000
15   hdd   6.00000             osd.15              up  1.00000 1.00000
16   hdd   6.00000             osd.16              up  1.00000 1.00000
17   hdd   6.00000             osd.17              up  1.00000 1.00000
18   hdd   6.00000             osd.18              up  1.00000 1.00000
19   hdd   6.00000             osd.19              up  1.00000 1.00000
20   hdd   6.00000             osd.20              up  1.00000 1.00000
21   hdd   6.00000             osd.21              up  1.00000 1.00000
22   hdd   6.00000             osd.22              up  1.00000 1.00000
23   hdd   6.00000             osd.23              up  1.00000 1.00000


Delete the journal file and rebuild the journal filesystem

[root@hh-ceph-128214 ceph]# rm -rf /var/lib/ceph/journal/ceph-14/journal
[root@hh-ceph-128214 /]# umount /dev/sdn3
[root@hh-ceph-128214 /]# mkfs -t xfs -f /dev/sdn3
meta-data=/dev/sdn3              isize=256    agcount=4, agsize=305152 blks
=                       sectsz=4096  attr=2, projid32bit=1
=                       crc=0        finobt=0
data     =                       bsize=4096   blocks=1220608, imaxpct=25
=                       sunit=0      swidth=0 blks
naming   =version 2              bsize=4096   ascii-ci=0 ftype=0
log      =internal log           bsize=4096   blocks=2560, version=2
=                       sectsz=4096  sunit=1 blks, lazy-count=1
realtime =none                   extsz=4096   blocks=0, rtextents=0

[root@hh-ceph-128214 ~]# mount /dev/sdn3 /var/lib/ceph/journal/ceph-14/
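
Note that mkfs gives /dev/sdn3 a fresh filesystem UUID. If the journal mount is pinned by UUID in /etc/fstab, that entry has to be updated as well (hedged sketch; the exact fstab layout depends on your setup):

blkid /dev/sdn3    # read the new UUID and update the matching /etc/fstab entry
mount -a           # confirm fstab still parses and everything mounts cleanly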


Recreate the data partition

[root@hh-ceph-128214 tmp]# umount /dev/sdc1

[root@hh-ceph-128214 /]# dd if=/dev/zero of=/dev/sdc bs=1M count=100
100+0 records in
100+0 records out
104857600 bytes (105 MB) copied, 0.59539 s, 176 MB/s

[root@hh-ceph-128214 tmp]# parted -s /dev/sdc  mklabel gpt
[root@hh-ceph-128214 tmp]# parted /dev/sdc mkpart primary xfs 1 100%
Information: You may need to update /etc/fstab.

[root@hh-ceph-128214 tmp]# mkfs.xfs -f -i size=1024  /dev/sdc1
meta-data=/dev/sdc1              isize=1024   agcount=6, agsize=268435455 blks
=                       sectsz=4096  attr=2, projid32bit=1
=                       crc=0        finobt=0
data     =                       bsize=4096   blocks=1465130240, imaxpct=5
=                       sunit=0      swidth=0 blks
naming   =version 2              bsize=4096   ascii-ci=0 ftype=0
log      =internal log           bsize=4096   blocks=521728, version=2
=                       sectsz=4096  sunit=1 blks, lazy-count=1
realtime =none                   extsz=4096   blocks=0, rtextents=0

[root@hh-ceph-128214 tmp]# mount /dev/sdc1 /var/lib/ceph/osd/ceph-14/
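
The same df check used when the failure was injected confirms the fresh filesystem is mounted (sketch):

df -h | grep ceph-14    # /dev/sdc1 should appear again under /var/lib/ceph/osd/ceph-14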


Initialize the ceph osd (the journal file is recreated automatically)

[root@hh-ceph-128214 /]# ceph-osd -i 14 --mkfs --mkkey
2017-11-24 18:21:42.297329 7fc7dc79bd00 -1 journal FileJournal::_open: disabling aio for non-block journal.  Use journal_force_aio to force use of aio anyway
2017-11-24 18:21:42.473203 7fc7dc79bd00 -1 journal FileJournal::_open: disabling aio for non-block journal.  Use journal_force_aio to force use of aio anyway
2017-11-24 18:21:42.473725 7fc7dc79bd00 -1 read_settings error reading settings: (2) No such file or directory
2017-11-24 18:21:42.782000 7fc7dc79bd00 -1 created object store /var/lib/ceph/osd/ceph-14 for osd.14 fsid c45b752d-5d4d-4d3a-a3b2-04e73eff4ccd
2017-11-24 18:21:42.782044 7fc7dc79bd00 -1 auth: error reading file: /var/lib/ceph/osd/ceph-14/keyring: can't open /var/lib/ceph/osd/ceph-14/keyring: (2) No such file or directory
2017-11-24 18:21:42.782202 7fc7dc79bd00 -1 created new key in keyring /var/lib/ceph/osd/ceph-14/keyring


Create the osd

[root@hh-ceph-128214 ~]# ceph osd create
14
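
ceph osd create normally hands out the lowest free id, which is why 14 comes back here; if another id had been freed in the meantime, a different number could be returned. A quick sanity check (sketch):

ceph osd ls | grep -w 14    # id 14 should be allocated again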


Restore the auth credentials

[root@hh-ceph-128214 tmp]# ceph auth add osd.14 osd 'allow *' mon 'allow profile osd' -i /var/lib/ceph/osd/ceph-14/keyring
added key for osd.14
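
A hedged check that the new key is registered with the intended caps:

ceph auth get osd.14    # should show the key plus caps mon 'allow profile osd', osd 'allow *'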


Restore file ownership

[root@hh-ceph-128214 /]# ls -l /var/lib/ceph/journal/ceph-14/  /var/lib/ceph/osd/ceph-14/
/var/lib/ceph/journal/ceph-14/:
total 2097152
-rw-r--r-- 1 root root 2147483648 Nov 24 18:21 journal

/var/lib/ceph/osd/ceph-14/:
total 36
-rw-r--r-- 1 root root 37 Nov 24 18:21 ceph_fsid
drwxr-xr-x 4 root root 61 Nov 24 18:21 current
-rw-r--r-- 1 root root 37 Nov 24 18:21 fsid
-rw------- 1 root root 57 Nov 24 18:21 keyring
-rw-r--r-- 1 root root 21 Nov 24 18:21 magic
-rw-r--r-- 1 root root  6 Nov 24 18:21 ready
-rw-r--r-- 1 root root  4 Nov 24 18:21 store_version
-rw-r--r-- 1 root root 53 Nov 24 18:21 superblock
-rw-r--r-- 1 root root 10 Nov 24 18:21 type
-rw-r--r-- 1 root root  3 Nov 24 18:21 whoami
[root@hh-ceph-128214 /]# chown ceph:ceph -R  /var/lib/ceph/journal/ceph-14/  /var/lib/ceph/osd/ceph-14/


Start the ceph osd

[root@hh-ceph-128214 tmp]# systemctl status ceph-osd@14
● ceph-osd@14.service - Ceph object storage daemon osd.14
Loaded: loaded (/usr/lib/systemd/system/ceph-osd@.service; disabled; vendor preset: disabled)
Active: failed (Result: start-limit) since Fri 2017-11-24 17:35:00 CST; 1min 51s ago
Process: 106773 ExecStart=/usr/bin/ceph-osd -f --cluster ${CLUSTER} --id %i --setuser ceph --setgroup ceph (code=exited, status=1/FAILURE)
Process: 106767 ExecStartPre=/usr/lib/ceph/ceph-osd-prestart.sh --cluster ${CLUSTER} --id %i (code=exited, status=0/SUCCESS)
Main PID: 106773 (code=exited, status=1/FAILURE)

Nov 24 17:34:40 hh-ceph-128214.vclound.com systemd[1]: Unit ceph-osd@14.service entered failed state.
Nov 24 17:34:40 hh-ceph-128214.vclound.com systemd[1]: ceph-osd@14.service failed.
Nov 24 17:35:00 hh-ceph-128214.vclound.com systemd[1]: ceph-osd@14.service holdoff time over, scheduling restart.
Nov 24 17:35:00 hh-ceph-128214.vclound.com systemd[1]: start request repeated too quickly for ceph-osd@14.service
Nov 24 17:35:00 hh-ceph-128214.vclound.com systemd[1]: Failed to start Ceph object storage daemon osd.14.
Nov 24 17:35:00 hh-ceph-128214.vclound.com systemd[1]: Unit ceph-osd@14.service entered failed state.
Nov 24 17:35:00 hh-ceph-128214.vclound.com systemd[1]: ceph-osd@14.service failed.
[root@hh-ceph-128214 tmp]# systemctl start ceph-osd@14
Job for ceph-osd@14.service failed because start of the service was attempted too often. See "systemctl status ceph-osd@14.service" and "journalctl -xe" for details.
To force a start use "systemctl reset-failed ceph-osd@14.service" followed by "systemctl start ceph-osd@14.service" again.

[root@hh-ceph-128214 tmp]# systemctl reset-failed ceph-osd@14

[root@hh-ceph-128214 tmp]# systemctl start ceph-osd@14

[root@hh-ceph-128214 tmp]# systemctl status ceph-osd@14
● ceph-osd@14.service - Ceph object storage daemon osd.14
Loaded: loaded (/usr/lib/systemd/system/ceph-osd@.service; disabled; vendor preset: disabled)
Active: active (running) since Fri 2017-11-24 17:37:17 CST; 3s ago
Process: 106871 ExecStartPre=/usr/lib/ceph/ceph-osd-prestart.sh --cluster ${CLUSTER} --id %i (code=exited, status=0/SUCCESS)
Main PID: 106877 (ceph-osd)
CGroup: /system.slice/system-ceph\x2dosd.slice/ceph-osd@14.service
└─106877 /usr/bin/ceph-osd -f --cluster ceph --id 14 --setuser ceph --setgroup ceph

Nov 24 17:37:17 hh-ceph-128214.vclound.com systemd[1]: Starting Ceph object storage daemon osd.14...
Nov 24 17:37:17 hh-ceph-128214.vclound.com systemd[1]: Started Ceph object storage daemon osd.14.
Nov 24 17:37:17 hh-ceph-128214.vclound.com ceph-osd[106877]: starting osd.14 at - osd_data /var/lib/ceph/osd/ceph-14 /var/lib/ceph/journal/ceph-14/journal
Nov 24 17:37:18 hh-ceph-128214.vclound.com ceph-osd[106877]: 2017-11-24 17:37:18.035052 7fbaaf369d00 -1 journal FileJournal::_open: disabling aio for non-block ...o anyway
Nov 24 17:37:18 hh-ceph-128214.vclound.com ceph-osd[106877]: 2017-11-24 17:37:18.047920 7fbaaf369d00 -1 osd.14 0 log_to_monitors {default=true}
Nov 24 17:37:18 hh-ceph-128214.vclound.com ceph-osd[106877]: 2017-11-24 17:37:18.054256 7fba96117700 -1 osd.14 0 waiting for initial osdmap
Hint: Some lines were ellipsized, use -l to show in full.
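
The unit still shows as disabled in the status output above. If osd.14 should come back automatically after a reboot, enabling the unit is worth considering (sketch, assuming the systemd layout shown above):

systemctl enable ceph-osd@14
systemctl is-enabled ceph-osd@14    # expect: enabled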


Verification

Current ceph status for reference

ceph -s

cluster:
id:     c45b752d-5d4d-4d3a-a3b2-04e73eff4ccd
health: HEALTH_WARN
Degraded data redundancy: 8965/137559 objects degraded (6.517%), 60 pgs unclean, 206 pgs degraded

services:
mon: 3 daemons, quorum hh-ceph-128040,hh-ceph-128214,hh-ceph-128215
mgr: openstack(active)
osd: 36 osds: 36 up, 36 in    <- note: all 36 osds are up again

data:
pools:   1 pools, 2048 pgs
objects: 45853 objects, 178 GB
usage:   540 GB used, 195 TB / 196 TB avail
pgs:     8965/137559 objects degraded (6.517%)
1842 active+clean
201  active+recovery_wait+degraded
5    active+recovering+degraded

io:
recovery: 168 MB/s, 42 objects/s
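
Recovery is still in flight at this point; a hedged way to follow it until the cluster returns to HEALTH_OK:

watch -n 5 ceph -s     # wait for all pgs to reach active+clean
ceph health detail     # lists any pgs that are still degraded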


Check the osd tree

[root@hh-ceph-128214 ceph]#  ceph osd tree
ID  CLASS WEIGHT    TYPE NAME                   STATUS REWEIGHT PRI-AFF
-1       215.45609 root default
-10        71.45609     rack racka07
-3        71.45609         host hh-ceph-128214
12   hdd   6.00000             osd.12              up  1.00000 1.00000
13   hdd   6.00000             osd.13              up  1.00000 1.00000
14   hdd   5.45609             osd.14              up  1.00000 1.00000
15   hdd   6.00000             osd.15              up  1.00000 1.00000
16   hdd   6.00000             osd.16              up  1.00000 1.00000
17   hdd   6.00000             osd.17              up  1.00000 1.00000
18   hdd   6.00000             osd.18              up  1.00000 1.00000
19   hdd   6.00000             osd.19              up  1.00000 1.00000
20   hdd   6.00000             osd.20              up  1.00000 1.00000
21   hdd   6.00000             osd.21              up  1.00000 1.00000
22   hdd   6.00000             osd.22              up  1.00000 1.00000
23   hdd   6.00000             osd.23              up  1.00000 1.00000
-9        72.00000     rack racka12
-2        72.00000         host hh-ceph-128040
0   hdd   6.00000             osd.0               up  1.00000 0.50000
1   hdd   6.00000             osd.1               up  1.00000 1.00000
2   hdd   6.00000             osd.2               up  1.00000 1.00000
3   hdd   6.00000             osd.3               up  1.00000 1.00000
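
Note that osd.14 re-registered with a CRUSH weight of 5.45609 (apparently derived from the raw disk size) instead of the hand-set 6.00000 used by its siblings. If the original weight is wanted back, something like the following should work (sketch; it will trigger some additional data movement):

ceph osd crush reweight osd.14 6.0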


Summary

When recovering a data disk, the failed osd must be removed first (ceph osd rm osd.14).
The earlier ceph 0.87 release did not require this step during recovery.
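
For reference, the recovery sequence used above, condensed into one hedged checklist (device names and the osd id obviously depend on the failed disk):

ceph auth del osd.14                  # drop the old key
ceph osd crush remove osd.14          # drop the CRUSH entry
ceph osd rm osd.14                    # drop the osd from the osdmap
# ... rebuild the journal and data filesystems and mount them, then:
ceph-osd -i 14 --mkfs --mkkey         # recreate the object store and keyring
ceph osd create                       # re-allocate the id
ceph auth add osd.14 osd 'allow *' mon 'allow profile osd' -i /var/lib/ceph/osd/ceph-14/keyring
chown -R ceph:ceph /var/lib/ceph/osd/ceph-14 /var/lib/ceph/journal/ceph-14
systemctl start ceph-osd@14
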
Tags: ceph