2014-10-08

Ceph repair inconsistent pg placement groups

If you run Ceph for any length of time, you may find that some placement groups become inconsistent. The Ceph website has a handy list of placement group statuses. The entry for "inconsistent" is what you'd expect: there's a difference between replicas of an object.

ceph pg dump | grep -i incons | cut -f 1 | while read i; do ceph pg repair ${i} ; done

(from here)

Get the cluster as healthy as you can before attempting this. Ideally the inconsistent placement groups should be in the "active+clean+inconsistent" state. That means first resolving any missing OSDs and giving them time to heal. If the OSDs don't seem to cooperate, try restarting them and then retry the above command.
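One way to check that state before repairing is to look at the output of "ceph health detail". The function below is only a sketch: not_ready_pgs is a name made up for this post, and it assumes the relevant lines look like "pg 2.5 is active+clean+inconsistent, acting [0,1,2]", which may differ between Ceph releases.

```shell
# not_ready_pgs: read `ceph health detail`-style text on stdin and print
# "<pgid> <state>" for every inconsistent pg whose state is anything other
# than exactly active+clean+inconsistent.
# ASSUMPTION: lines look like
#   pg 2.5 is active+clean+inconsistent, acting [0,1,2]
not_ready_pgs() {
    awk '$1 == "pg" && $3 == "is" && $4 ~ /inconsistent/ {
        state = $4
        sub(/,$/, "", state)          # drop the trailing comma after the state
        if (state != "active+clean+inconsistent") print $2, state
    }'
}

# Usage (commented out so that sourcing this file runs nothing):
# ceph health detail | not_ready_pgs
```

If this prints nothing, every inconsistent pg it recognised was otherwise active and clean.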

Explanation of the command:
ceph pg dump
gives a list of all pgs and their current status.
| grep -i incons
keeps only the lines containing "incons" - short for inconsistent
| cut -f 1
keeps only the first tab-separated field of each line, which is the pg ID
| while read i; do
loops through each line (one per pg), storing the pg ID in a shell variable called i
ceph pg repair ${i}
instructs ceph to repair the pg
; done
signals the closing of the loop
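If you would rather see which pgs are going to be repaired before anything runs, the same pipeline can be split into two small functions. This is a sketch, not a replacement for the one-liner: the function names and the DRY_RUN variable are invented here, and it makes the same assumption as the original command that the pg ID is the first tab-separated field.

```shell
# list_inconsistent_pgs: read `ceph pg dump`-style text on stdin and print
# the first tab-separated field of every line that mentions "incons".
list_inconsistent_pgs() {
    grep -i incons | cut -f 1
}

# repair_pgs: read one pg ID per line on stdin.
# With DRY_RUN=1 (a variable made up for this sketch) the repair command is
# only printed; otherwise `ceph pg repair` is run for each pg.
repair_pgs() {
    while read -r pg; do
        if [ "${DRY_RUN:-0}" = "1" ]; then
            echo "ceph pg repair ${pg}"
        else
            ceph pg repair "${pg}"
        fi
    done
}

# Usage (commented out so that sourcing this file runs nothing):
# ceph pg dump | list_inconsistent_pgs | DRY_RUN=1 repair_pgs   # preview
# ceph pg dump | list_inconsistent_pgs | repair_pgs             # repair
```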

The above command has always worked for me, but there are things you can try if this command doesn't work.

The Ceph website says that inconsistent placement groups can result from an error during scrubbing or from media errors. Check your other system logs to rule out media errors, because they may indicate a failing storage device.
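A rough way to scan the kernel log for signs of a failing disk is shown below. The function name is made up, and the search patterns are only examples of messages a disk might produce, not an exhaustive or authoritative list; adjust them to whatever your hardware and kernel actually log.

```shell
# media_error_lines: read kernel log text on stdin and print the lines that
# look like disk media errors.
# ASSUMPTION: these patterns are illustrative examples only.
media_error_lines() {
    grep -i -E 'I/O error|medium error|unrecovered read error'
}

# Usage (commented out so that sourcing this file runs nothing):
# dmesg | media_error_lines
```

Any hits are worth following up on the host that owns the affected OSD.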

Good luck!