Proxmox VE 6.0 Released: New ZFS and Ceph Improvements


Patrick

Administrator
Staff member
Dec 21, 2010
I migrated a test box to get the screenshots while @cliffr was writing. It took a good amount of time but worked well.

I am probably going to upgrade the lab's Ceph cluster this evening.
 

niekbergboer

Active Member
Jun 21, 2016
Switzerland
The improvements in Nautilus are a huge step forward. Better performance, stability and tools. Highly recommend anyone using Ceph on Proxmox to take this upgrade soon.
They are; I particularly like the dynamic pg-num management.

The upgrade instructions on the Proxmox site regarding Ceph are good, too. There is an important ordering of events, and if you stick to that you're fine.
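
For anyone who has not read those docs yet, the ordering boils down to roughly the following. This is a condensed sketch of the Luminous-to-Nautilus procedure, not the full guide; the commands are illustrative, so check the official Proxmox/Ceph instructions against your own cluster before running anything:

```bash
# Keep OSDs in the map while daemons restart
ceph osd set noout

# 1. Upgrade packages and restart the monitors, one node at a time
systemctl restart ceph-mon.target
ceph mon dump | grep min_mon_release   # should report "nautilus" once all mons are done

# 2. Restart the managers on each node
systemctl restart ceph-mgr.target

# 3. Upgrade and restart the OSDs, one node at a time, waiting for HEALTH_OK in between
systemctl restart ceph-osd.target

# 4. Only after *all* OSDs are running Nautilus
ceph osd require-osd-release nautilus

# Finally
ceph osd unset noout
```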

The single upgrade "problem" I found occurs when you run hyperconverged, i.e. with MDSes and OSDs on the same machines:

At some point in the upgrade, you will have done a rolling restart of all OSDs in the cluster, and you will have run the ceph-volume tool to scan and activate all your OSDs in the new database. The docs then say that you should do a rolling reboot of the OSD machines to ensure that Ceph correctly starts all OSDs on boot. *DON'T!* At least, not at that point: you will first want to do the MDS upgrade dance, since otherwise it will happen in an uncontrolled manner during the rolling reboot. If you run with just one active MDS (plus standby MDSes), as I do, you're fine. If your OSDs and MDSes run on different machines, this is a moot point, too. However, if you run with multiple active MDSes, you may run into trouble there.

Do the rolling reboot after the MDS upgrade in that case.
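
In case it helps, this is roughly what that MDS dance looks like on a hyperconverged cluster. The file system name "cephfs" and the daemon IDs below are placeholders for whatever your cluster uses, so treat it as a sketch rather than a recipe:

```bash
# After the rolling OSD restart and the ceph-volume conversion
# (roughly: ceph-volume simple scan && ceph-volume simple activate --all),
# but BEFORE the rolling reboot of hyperconverged nodes:

ceph fs status                      # note how many active MDS ranks you run today
ceph fs set cephfs max_mds 1        # collapse to a single active MDS ("cephfs" is a placeholder name)
ceph status                         # wait until only rank 0 remains active

# Stop the standbys, restart the single active MDS on the new version,
# then bring the standbys back (daemon IDs are placeholders):
systemctl stop ceph-mds@standby-a
systemctl restart ceph-mds@active-0
systemctl start ceph-mds@standby-a

# With everything on Nautilus, restore your previous active count and
# only then do the rolling reboot of the OSD/MDS machines:
ceph fs set cephfs max_mds 2
```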

Edit: the one other thing is that you'll *first* want to remove the monitors' port numbers from ceph.conf, *before* you enable the msgr2 (v2) protocol.
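
Concretely, that edit looks something like the following on a Proxmox node. The addresses are made up; the point is dropping the explicit :6789 from mon_host before switching the monitors over:

```bash
# In /etc/pve/ceph.conf, change the mon_host line from something like
#   mon_host = 10.0.0.1:6789 10.0.0.2:6789 10.0.0.3:6789
# to
#   mon_host = 10.0.0.1 10.0.0.2 10.0.0.3

# Only then enable the v2 messenger protocol on the monitors:
ceph mon enable-msgr2

# Each mon should now advertise both a v2: and a v1: address
ceph mon dump
```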

I'm a Cowboy, and I did the upgrade on a live cluster with all VMs running and with RBD and CephFS traffic going on, and that went fine.
 

Laugh|nGMan

Member
Nov 27, 2012
Same here... I did the upgrade on a 3-node Intel NUC (E3815 @ 1.46GHz) home lab with 6x 128GB SanDisk Ultra Fit USB OSDs, with MDSes (one active MDS plus two standby MDSes) and OSDs on the same machines.

Idle cluster power consumption is 21W, including the nodes' OS SSDs (Intel 311 20GB, Intel 520 60GB, Samsung 850 PRO 256GB) but excluding the 1Gb switch. There are 2-3 active LXC containers running Grafana, Telegraf, InfluxDB, and Plex Media Server, plus sometimes Zabbix with MySQL.

Funny, but it's enough to direct-play almost any media sitting on Ceph from Plex Media Server, including 4K to an iPhone 7. Of course... without transcoding.

Cluster total storage used (raw): 0.688 TB (0.739 TB).
Cluster total single-channel memory used (raw): 23.04 GB (24 GB).
Average CPU usage for an idle node without VMs usually sits at 30%, with an IO delay of 6.1%.


The upgrade to Nautilus went smoothly, except for some weird quirks/behavior with the active MDS during reboots.


P.S. Any ideas on an upgrade path for nodes that fits within a 30W total power budget?
Sorry for the off-topic. :)
 