Ceph Outage


skunky

New Member
Apr 25, 2017
Hi guys,
I'm running into a very big issue on the Ceph production cluster.
It all began with an OSD that died. Then another one went (bad coincidence).
After adding new OSDs to the cluster, here's my "ceph -s":
  cluster:
    id:     2806fcbd-4c9a-4805-a16a-10c01f3a9f32
    health: HEALTH_ERR
            1 filesystem is degraded
            2 nearfull osd(s)
            3 pool(s) nearfull
            501181/7041372 objects misplaced (7.118%)
            Reduced data availability: 717 pgs inactive, 1 pg peering
            Degraded data redundancy: 11420/7041372 objects degraded (0.162%), 1341 pgs unclean, 378 pgs degraded, 366 pgs undersized
            22 slow requests are blocked > 32 sec
            68 stuck requests are blocked > 4096 sec
            too many PGs per OSD (318 > max 200)

  services:
    mon: 3 daemons, quorum ceph1,ceph2,ceph3
    mgr: ceph3(active), standbys: ceph1, ceph2
    mds: cephfs-1/1/1 up {0=ceph3=up:replay}, 5 up:standby
    osd: 37 osds: 36 up, 36 in; 1263 remapped pgs

  data:
    pools:   20 pools, 4712 pgs
    objects: 2745k objects, 7965 GB
    usage:   20779 GB used, 61933 GB / 82713 GB avail
    pgs:     15.216% pgs not active
             11420/7041372 objects degraded (0.162%)
             501181/7041372 objects misplaced (7.118%)
             3371 active+clean
             604  active+remapped+backfill_wait
             339  activating+undersized+degraded+remapped
             308  activating+remapped
             47   activating
             12   activating+degraded
             10   undersized+degraded+peered
             9    active+undersized+degraded
             8    active+undersized+degraded+remapped+backfill_wait
             3    active+remapped+backfilling
             1    remapped+peering
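
For reference, these are the standard commands I know of for breaking that output down further (per health check, per OSD and per stuck PG) - nothing here is specific to my setup:

# full text of every check behind HEALTH_ERR (nearfull OSDs, slow requests, ...)
ceph health detail
# per-OSD utilisation, to see which OSDs are the 2 nearfull ones
ceph osd df tree
# list the PGs that are stuck inactive, with their current state
ceph pg dump_stuck inactive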

RBD images can't be mapped for the Proxmox VMs, and CephFS is unavailable :(
Re: CephFS - it is stuck in "replay", and the last lines in the MDS log (it is stuck here) are:

2018-02-10 16:40:56.914031 7f3028ee11c0 0 set uid:gid to 167:167 (ceph:ceph)
2018-02-10 16:40:56.914050 7f3028ee11c0 0 ceph version 12.2.2 (cf0baeeeeba3b47f9427c6c97e2144b094b7e5ba) luminous (stable), process (unknown), pid 1818
2018-02-10 16:40:56.920917 7f3028ee11c0 0 pidfile_write: ignore empty --pid-file
2018-02-10 16:41:15.514658 7f3021d85700 1 mds.ceph3 handle_mds_map standby
2018-02-10 16:41:15.531339 7f3021d85700 1 mds.0.3606 handle_mds_map i am now mds.0.3606
2018-02-10 16:41:15.531349 7f3021d85700 1 mds.0.3606 handle_mds_map state change up:boot --> up:replay
2018-02-10 16:41:15.531359 7f3021d85700 1 mds.0.3606 replay_start
2018-02-10 16:41:15.531364 7f3021d85700 1 mds.0.3606 recovery set is
2018-02-10 16:41:15.531371 7f3021d85700 1 mds.0.3606 waiting for osdmap 181020 (which blacklists prior instance)
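
Since the MDS says it is waiting for osdmap 181020, these are the usual ways (as far as I know) to check how far the OSD map actually is and whether the previous instance is blacklisted - ceph3 is simply where my active MDS runs:

# current osdmap epoch - the MDS above is waiting for epoch 181020
ceph osd stat
# blacklist entries that should cover the prior MDS instance
ceph osd blacklist ls
# state as seen by the MDS daemon itself (run on the node hosting mds.ceph3)
ceph daemon mds.ceph3 status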

The cluster is not balanced yet. Before adding the new OSDs it had balanced out at one point, but there were still 300 PGs inactive, and RBD mapping & CephFS were not available.
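
For the inactive PGs, querying one of them directly usually shows why it won't go active. The pg id below is just a placeholder - I'd take a real one from "ceph pg dump_stuck inactive":

# which OSDs the PG maps to
ceph pg map 2.1f
# detailed peering / recovery state of that PG
ceph pg 2.1f query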

Does anybody have an idea how I could get this cluster back up & running?
Would the "HEALTH_ERR" state be displayed because of "1 filesystem is degraded" or because of "717 pgs inactive"?

Thank you very much guys, please let me know if you have any thoughts on this...

Have a nice weekend,

Leo
 

MiniKnight

Well-Known Member
Mar 30, 2012
NYC
Is that a 3-node, 20-drive setup?

What's the redundancy set to? 2 copies?

I'd check the Ceph IRC channel. That's very scary. I've seen drives fail, but I've had extra capacity online and ready to handle the rebalances.
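
If you're not sure what the pools are set to, something like this should list the replication for every pool:

# replication settings (size / min_size) and pg_num for each pool
ceph osd pool ls detail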
 

skunky

New Member
Apr 25, 2017
7 nodes, 2-7 OSDs each. One of the pools was size 2... :(
Any chance of getting it back up again?
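
For what it's worth, this is roughly how I'd check and (later) fix that pool - the pool name below is just an example, not the real one:

# confirm the current replication of the size-2 pool
ceph osd pool get rbd-prod size
ceph osd pool get rbd-prod min_size
# once the cluster is healthy again, going to 3 copies would be:
# (this triggers a lot of extra backfill, so not while the OSDs are nearfull)
ceph osd pool set rbd-prod size 3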