Minisforum MS-01 ProxmoxVE Clusters

jdpdata

Member
Jan 31, 2024
60
32
18
My people!
Welcome! Hoping this thread will provide some of us tips and tricks for setting up networking properly for CEPH. I received my 3rd MS-01 on Monday. Very fast shipping, as I only got notice from DHL the Friday before. Looks like Minisforum is finally cranking these puppies out in numbers. So far I have just clustered the 3 nodes, but haven't had time to set up CEPH yet. Maybe this weekend...
 

dialbat

New Member
Feb 23, 2024
10
0
1
Planning to do CEPH pools. I haven't set it up yet; haven't gotten any free time. But one of my nodes is using 2x PM983a in a ZFS mirror just fine.
I have the same config, just deciding how to organize the 2 M.2s: ZFS mirror or striping. I'll have a NAS for the backups.
 

jdpdata

Member
Jan 31, 2024
60
32
18
PSA: if you update to the latest PVE 8.2 that was released today, it will change your 10G interface names from enp2s0f0 > enp2s0f0np0 and enp2s0f1 > enp2s0f1np1. This will break networking for any VMs you have using those interfaces. To fix, simply nano /etc/network/interfaces and update to the new names. I think I'm going to wipe all my nodes and install PVE 8.2 from scratch. Not sure what else changed from 8.1 that may cause problems later. Thankfully I haven't set up CEPH yet. That would have been a major PITA to redo.
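For example, a bridge stanza that used the old name just needs its bridge-ports line updated (the bridge name and address here are placeholders, yours will differ):

    auto vmbr1
    iface vmbr1 inet static
        address 10.0.10.11/24          # placeholder address
        bridge-ports enp2s0f0np0       # was enp2s0f0 before 8.2
        bridge-stp off
        bridge-fd 0

Then apply with "ifreload -a" or a reboot.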
 

jdpdata

Member
Jan 31, 2024
60
32
18
I have the same config, just deciding how to organize the 2 M.2s: ZFS mirror or striping. I'll have a NAS for the backups.
I don't think CEPH will work with RAID storage. I think you create OSDs for each NVMe and let CEPH manage them. I also have a Synology NAS for backup stores. I may create a "fast" pool for the slot 2 NVMe since those are GEN3x4, and a "slow" pool for the slot 3 NVMe since those are slower GEN3x2. Then create CEPHFS on the slower pool for ISO storage or something. But I don't think it really matters unless you're doing 25G or 100G networking. GEN3x2 supports up to 2GB/s, plenty fast for 10G.
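If I go that route, my rough plan is one OSD per NVMe tagged with a CRUSH device class, then a rule per class so each pool only lands on the drives I want. Untested on my side yet, and the device paths and pool names below are just placeholders:

    # one OSD per NVMe, tagged with a device class
    pveceph osd create /dev/nvme1n1 --crush-device-class fast
    pveceph osd create /dev/nvme2n1 --crush-device-class slow

    # CRUSH rules that only select OSDs of each class
    ceph osd crush rule create-replicated fast_rule default host fast
    ceph osd crush rule create-replicated slow_rule default host slow

    # create pools and point them at their rules
    pveceph pool create fastpool
    pveceph pool create slowpool
    ceph osd pool set fastpool crush_rule fast_rule
    ceph osd pool set slowpool crush_rule slow_rule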
 
  • Like
Reactions: dialbat

dialbat

New Member
Feb 23, 2024
10
0
1
PSA: if you update to the latest PVE 8.2 that was released today, it will change your 10G interface names from enp2s0f0 > enp2s0f0np0 and enp2s0f1 > enp2s0f1np1. This will break networking for any VMs you have using those interfaces. To fix, simply nano /etc/network/interfaces and update to the new names. I think I'm going to wipe all my nodes and install PVE 8.2 from scratch. Not sure what else changed from 8.1 that may cause problems later. Thankfully I haven't set up CEPH yet. That would have been a major PITA to redo.
Have you installed the Intel microcode? Do you know if 8.2 comes with the new microcode, or should it be installed after a fresh 8.2 install?
 

jdpdata

Member
Jan 31, 2024
60
32
18
Have you installed the Intel microcode? Do you know if 8.2 comes with the new microcode, or should it be installed after a fresh 8.2 install?
Yes, I did the microcode update with TTeck's scripts. I use his PVE scripts for everything; it saves me time. I don't know if 8.2 comes with the microcode update, probably not. I will do a fresh PVE 8.2 install this weekend. I've got Plex running now and people are watching TV, so I can't take down my cluster yet.
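After a reboot you can check whether the newer microcode actually loaded with something like:

    journalctl -k | grep -i microcode    # kernel log should mention the loaded revision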
 
  • Like
Reactions: dialbat

Techrantula

New Member
Apr 24, 2024
3
2
3
Trying to figure out how I want to setup my network.

I have a Mikrotik 24-port 1GB switch and a Mikrotik 8-port 10gb SFP+ switch.

With the 4 network ports that come with the MS-01, I was tempted to separate it out like this:
  • 1x 2.5gb for Corosync
  • 1x 2.5gb for Management / VM Traffic
  • 1x 10gb for Ceph Private
  • 1x 10gb for Ceph Public
What do you guys think about this? I don't have the luxury of multiple other switches. I thought about getting an add-on PCIe card for two additional 1Gb uplinks. Then I could put VM traffic on its own uplinks, separate from Management. That would give each traffic type its own NIC.

Typing this out, that actually sounds better. I assume any kind of Intel 2x 1Gb RJ45 PCIe card should work? Something like this?
https://www.amazon.com/Dual-Port-Gigabit-Network-Express-Ethernet/dp/B0BV648GSG/

I would imagine that with just 1Gb uplinks it wouldn't add too much additional heat? I did plan on adding the same SSDs that @jdpdata added to the two extra M.2 22110 slots.
 

anewsome

Member
Mar 15, 2024
32
33
18
I got my Proxmox cluster built earlier this week: 5x MS01 with 3x consumer-class SSD in each. The slow NVMe has Proxmox installed; the other 2 are Ceph OSDs. That's 10x 2TB NVMe for Ceph. My networking probably needs to be reworked, as I currently have the 10G interfaces bonded with LACP and both public and private Ceph addresses pointed at the 10G bond1. CrystalDiskMark tests in the VMs running on Ceph show acceptable performance (for me at least), but I could probably optimize it with a better setup.

Regarding stability, all 5 systems ran solid for a few days, but this morning I woke up to 2 of the 5 nodes down. Couldn't get anything from the HDMI or any of the network on those systems, so I couldn't figure out if it was a kernel panic or whatever. A bit troubling, since Ceph was working overtime trying to rebalance everything. I bounced the two nodes and they came back just fine, Ceph did its thing and they've been running all day now - but I'm very nervous about what exactly happened.

I didn't burn the memory with memtest before I started configuring the cluster and migrating the VMs.
 
  • Like
Reactions: Techrantula

jdpdata

Member
Jan 31, 2024
60
32
18
@Techrantula are your switches L3 capable of inter-VLAN routing? I'm using a USW-Pro-MAX-24-POE with 8x 2.5G ports and a USW-Pro-Aggregation with 28x 10G ports; both are L3 switches. So I have plenty of ports to do LACP link aggregation, and the switches can offload hardware routing to give my UDM-SE a break. I created bond0 (2x 2.5G) and bond1 (2x 10G) in PVE networking, then use VLAN tagging to separate out the different networks (Management, corosync, VMnet, CEPH public/private). Bonding the interfaces allows for redundancy and greater bandwidth. At least that's my plan. I haven't gotten free time to set up CEPH yet, so I don't know if my networking setup will work. So far, from iperf testing, they all communicate just fine.
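Roughly, the 10G bond plus the tagged CEPH VLANs in /etc/network/interfaces look like this (addresses and VLAN IDs are placeholders, and I haven't load-tested it with CEPH yet):

    auto bond1
    iface bond1 inet manual
        bond-slaves enp2s0f0np0 enp2s0f1np1
        bond-mode 802.3ad
        bond-miimon 100
        bond-xmit-hash-policy layer3+4

    # CEPH public on one tagged VLAN, CEPH private on another
    auto bond1.40
    iface bond1.40 inet static
        address 10.40.0.11/24

    auto bond1.41
    iface bond1.41 inet static
        address 10.41.0.11/24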

I think your plan will work just fine too. Just create different VLANs to keep each of those networks separate and you should be OK. The PVE recommended networking is a bit overkill for homelabbers like you and me. Separate switches for each network would get crazy complex and expensive. I think if you have L3 switches and set up VLANs for each network, that should be OK since you're not pushing a ton of traffic like in an enterprise environment.
 

jdpdata

Member
Jan 31, 2024
60
32
18
I got my Proxmox cluster built earlier this week: 5x MS01 with 3x consumer-class SSD in each. The slow NVMe has Proxmox installed; the other 2 are Ceph OSDs. That's 10x 2TB NVMe for Ceph. My networking probably needs to be reworked, as I currently have the 10G interfaces bonded with LACP and both public and private Ceph addresses pointed at the 10G bond1. CrystalDiskMark tests in the VMs running on Ceph show acceptable performance (for me at least), but I could probably optimize it with a better setup.

Regarding stability, all 5 systems ran solid for a few days, but this morning I woke up to 2 of the 5 nodes down. Couldn't get anything from the HDMI or any of the network on those systems, so I couldn't figure out if it was a kernel panic or whatever. A bit troubling, since Ceph was working overtime trying to rebalance everything. I bounced the two nodes and they came back just fine, Ceph did its thing and they've been running all day now - but I'm very nervous about what exactly happened.

I didn't burn the memory with memtest before I started configuring the cluster and migrating the VMs.
That's troubling to hear two nodes shut down randomly. I'm beginning to reconsider setting up CEPH at all due to the greater complexity and greater resource requirements. I've been running my 3-node MS-01 cluster with local ZFS storage and VMs replicating between the nodes with HA. That worked fine for over a year on my old Lenovo Tiny setup and is now working the same on the MS-01s. CEPH is not going to offer me anything extra other than maybe slightly faster migration times. I don't mind waiting 1-2 minutes for a migration. Nothing I run is mission critical.
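For anyone who hasn't tried it, the ZFS replication + HA combo is just a replication job per guest plus an HA resource. From the CLI it's roughly something like this (VM ID, target node and schedule below are placeholders):

    # replicate VM 100 to node pve2 every 15 minutes
    pvesr create-local-job 100-0 pve2 --schedule "*/15"

    # let HA restart/relocate the VM if its node goes down
    ha-manager add vm:100 --state started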
 
  • Like
Reactions: anewsome

Techrantula

New Member
Apr 24, 2024
3
2
3
@Techrantula are your switches L3 capable of inter-VLAN routing? I'm using a USW-Pro-MAX-24-POE with 8x 2.5G ports and a USW-Pro-Aggregation with 28x 10G ports; both are L3 switches. So I have plenty of ports to do LACP link aggregation, and the switches can offload hardware routing to give my UDM-SE a break. I created bond0 (2x 2.5G) and bond1 (2x 10G) in PVE networking, then use VLAN tagging to separate out the different networks (Management, corosync, VMnet, CEPH public/private). Bonding the interfaces allows for redundancy and greater bandwidth. At least that's my plan. I haven't gotten free time to set up CEPH yet, so I don't know if my networking setup will work. So far, from iperf testing, they all communicate just fine.

I think your plan will work just fine too. Just create different VLANs to keep each of those networks separate and you should be OK. The PVE recommended networking is a bit overkill for homelabbers like you and me. Separate switches for each network would get crazy complex and expensive. I think if you have L3 switches and set up VLANs for each network, that should be OK since you're not pushing a ton of traffic like in an enterprise environment.
That makes sense. Yes, I have a Mikrotik CRS326, which does support inter-VLAN routing (doc for anyone who finds this post later and wants to configure this with RouterOS).

I figured the networking was a bit overkill for the homelab. I know a lot of us like to follow 'best practices', but sometimes you work with what you've got. Appreciate the feedback.
 
  • Like
Reactions: jdpdata

jdpdata

Member
Jan 31, 2024
60
32
18
Well, I gave in and spent this afternoon rebuilding my cluster from scratch. Reloaded fresh PVE 8.2-2 on all 3x MS-01, then joined them to the cluster and set up CEPH. Took a couple hours to restore all my VMs/LXCs. They're all working incredibly well. I tested by shutting down a node while pinging a VM running on that node. During shutdown it automatically migrated all VMs to an active node super fast, within seconds. My pings were never interrupted. Incredible! Now I understand why CEPH is so great!! I'm super happy with this setup. My networking setup is below if someone wants to see. Only thing now is figuring out why my NAS storage is using the slower 2.5G network. I want it on the 10G network. Someone give me a clue where to look??

[screenshot: PVE network configuration]
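On the NAS question, I guess the first thing to check is which interface the storage traffic actually leaves on (NAS IP below is a placeholder):

    ip route get 192.168.20.50    # placeholder NAS IP
    # if it egresses the 2.5G bridge, the NAS entry in /etc/pve/storage.cfg
    # probably needs to point at the NAS's 10G address instead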
 
  • Like
Reactions: dialbat

dialbat

New Member
Feb 23, 2024
10
0
1
I don't think CEPH will work with RAID storage. I think you create OSDs for each NVMe and let CEPH manage them. I also have a Synology NAS for backup stores. I may create a "fast" pool for the slot 2 NVMe since those are GEN3x4, and a "slow" pool for the slot 3 NVMe since those are slower GEN3x2. Then create CEPHFS on the slower pool for ISO storage or something. But I don't think it really matters unless you're doing 25G or 100G networking. GEN3x2 supports up to 2GB/s, plenty fast for 10G.
How are temps on your MS-01 with the two Samsung PM983a running?
 

jdpdata

Member
Jan 31, 2024
60
32
18
How are temps on your MS-01 with the two Samsung PM983a running?
The Samsung PM983a NVMe are holding steady at 53-55C. That's with all 3 nodes inside my 19" rack with less than optimal airflow. The Kingston 1TB with heatsink is even lower at ~35C. It would be great if we had a couple mm of clearance to add heatsinks for slots 2/3.

I've been keeping an eye on CPU thermals and really have no major issues to report. The CPU will bump up to 60-70C when Plex is running background tasks, but will quickly drop back down to ~40C. I don't see any reason to change the die paste at this point.
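If anyone wants to compare numbers, the NVMe and CPU temps are easy to pull from the shell (assuming nvme-cli and lm-sensors are installed):

    nvme smart-log /dev/nvme0 | grep -i temperature    # NVMe composite temp
    sensors                                            # CPU package/core temps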
 

anewsome

Member
Mar 15, 2024
32
33
18
That's troubling to hear two nodes shut down randomly. I'm beginning to reconsider setting up CEPH at all due to the greater complexity and greater resource requirements. I've been running my 3-node MS-01 cluster with local ZFS storage and VMs replicating between the nodes with HA. That worked fine for over a year on my old Lenovo Tiny setup and is now working the same on the MS-01s. CEPH is not going to offer me anything extra other than maybe slightly faster migration times. I don't mind waiting 1-2 minutes for a migration. Nothing I run is mission critical.
As it turns out, all 5 nodes were crashing rather regularly. I never had more than a day or so of uptime across the cluster. I detailed a bit of this in a different thread. I'm still on PVE 8.1.11, since I installed 2 days before 8.2 - not upgrading them because I don't feel like dealing with the renamed interfaces.

The cluster has been much more stable now, fingers crossed. Not sure what made the difference. I did the microcode update and I simplified the networking too. The networking on each node is super simple: bond0 is now LACP with both 2.5G ethernets, bond1 is LACP with both 10G ethernets. Ceph public/private is on bond1. Everything else, including corosync, migrations and VM traffic, is over bond0.

The crashes may also have been related to the network switch. I noticed that the CPU on the switch was stuck at 100% for over 7 days straight, and my original attempts at LACP bonding on that switch were failing too (the link would never come up and all communication to the hosts was lost). I contacted support for the switch and they wanted to see all the details of why the LACP bond was failing and the 100% CPU utilization. While taking screenshots of the LACP setup for support, I applied the settings and the LACP bond came right up. The CPU also came down to ~5-10%. I applied the other LACP bond settings across the 5 MS01s and the 4x 1Gb LACP bond for the Synology NAS, and they all came up fine too.

So maybe it was the microcode? Maybe it was the switch? Who knows. I've got over 3 days of uptime on all 5 MS01s now. Let's hope it stays that way.
 

jdpdata

Member
Jan 31, 2024
60
32
18
Glad you got it stable now. I never had any of these issues. Before deploying any nodes I did a 24hr burn-in MemTest. Only after it PASSED did I install fresh PVE 8.2-2. Afterward I applied the Intel microcode. I actually ran a single node for over 2 weeks of testing before deploying my 3-node cluster. Guess I'm lucky.
 
  • Like
Reactions: anewsome