My hyperconverged Ceph cluster was reporting an inconsistent placement group on one of my OSDs. Proxmox doesn’t offer a way to repair these from the UI, so I had to SSH into the node and run the repair using the Ceph CLI.
- Check for inconsistent placement groups (PGs):
ceph pg dump pgs | grep "inconsistent"
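- Alternatively, an inconsistent PG puts the cluster into HEALTH_ERR, so ceph health detail should also name the affected PGs along with the scrub errors that triggered the warning:
ceph health detail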
- The first column of the output lists the PGID (e.g. 13.d). The first part (13) is the ID of the pool. You can see all the pools by ID and name with:
ceph osd pool ls detail | awk '{print $2, $3}'
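- If you only need the ID-to-name mapping, ceph osd lspools prints just that, without the awk:
ceph osd lspools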
- You may get more info on the exact PGs and objects affected with these commands, where <pool_name> and <pgid> are from the outputs above:
rados list-inconsistent-pg <pool_name>
rados list-inconsistent-obj <pgid>
rados list-inconsistent-snapset <pgid>
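- These commands emit JSON; if the output is hard to read, rados accepts a --format=json-pretty flag that pretty-prints the per-object error details (checksum mismatches, read errors, and so on):
rados list-inconsistent-obj <pgid> --format=json-pretty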
- Start a repair on each inconsistent PG with:
ceph pg repair <pgid>
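- The repair runs as a background deep scrub, so it won’t finish immediately. You can follow the cluster log while it runs, then re-check for inconsistent PGs once it completes:
ceph -w
ceph pg dump pgs | grep "inconsistent"
If the repair succeeded, the grep should come back empty and the cluster should return to HEALTH_OK (barring other issues).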