Hey everyone. This is my first time posting here, though I have been watching the channel and using many of the articles on the site for a long time now, and I signed up to join the community and hopefully get some other ideas, opinions, and help with my homelab. I have been running a homelab for many years now (I started back in high school, circa 2001) and am looking at modernizing the hardware I am running to get a little more performance, but more so to reduce energy consumption. I also apologize for this being a super long post, but I felt it was important to provide some context on the use case and the hardware before sharing the options I am currently considering.
My lab currently consists of 7 Proxmox 7 nodes, mostly HPE DL380 G5s (32GB RAM), though there is an older SuperMicro custom build (96GB RAM) and a Dell PE2950 G3 (32GB RAM) in there as well. They all have a 250GB SSD for the Proxmox OS and another as a Ceph OSD, and they are networked together using two DGS-1100-24 switches, with each server having the following gigabit connections:
- 1x IPMI
- 1 x Proxmox Management & WebUI
- 1 x Storage Traffic
- 3 x Virtual Machine Traffic
The Proxmox nodes and the storage side are linked using another DGS-1100-24 that also connects my homelab to the rest of the house and ties my modem into the homelab on a dedicated VLAN. I use a number of VLANs to separate traffic, with the important ones being:
- 100 - Core Network (Switches and WAPs)
- 200 - IPMI Network
- 300 - Compute Network (Proxmox Management)
- 400 - Storage Network
- 500 - Virtual Machine Network
- 600 - Testing Network
- 700 - WAN 1 Network
- 800 - WAN 2 Network
- 900 - Family Network
The first plan was to replace all the Proxmox nodes with five custom-built 4U systems using the following:
- Rosewill 4U 12-Bay Hot Swap Chassis
- Ryzen 5700G CPU
- ATX X570 motherboard
- 64GB RAM (upgradeable to 128GB down the road)
- single ATX power supply
- 2 x consumer SSDs for the Proxmox OS in a ZFS mirror (250GB ish; could be either M.2 or SATA)
- SSDs for Ceph OSDs (would start with possibly 2 x 500GB, with room for 12 total)
- They would either have IPMI on the motherboard or would be connected to a KVM switch that is attached to a Raspberry Pi 4 running PiKVM.
- 2 x Gigabit connections for Proxmox Management & WebUI
- 2 x Gigabit connections for Storage (upgradeable to a dual 10 gigabit RJ45 NIC)
- 2 x Gigabit connections for Ceph (upgradeable to a dual 10 gigabit RJ45 NIC)
- 4 x Gigabit connections for virtual machine traffic (upgradeable to a dual 10 gigabit RJ45 NIC)
The second idea that came to me was to replace all the Proxmox nodes with multiple 1L mini PC systems that would be configured as follows:
- Dell 7080 or Lenovo M920q
- 1 x M.2 SSD for the Proxmox OS (250GB ish), ideally 2 in a ZFS mirror
- 5 x SATA SSD for Ceph OSD (500GB ish to start)
- 2 x Gigabit connections for Proxmox Management & WebUI
- 2 x Gigabit connections for Storage (upgradeable to a dual 10 gigabit RJ45 NIC)
- 2 x Gigabit connections for Ceph (upgradeable to a dual 10 gigabit RJ45 NIC)
- 4 x Gigabit connections for virtual machine traffic (upgradeable to a dual 10 gigabit RJ45 NIC)
- HDMI / DisplayPort = converted to VGA with an adapter so it works with my existing KVM, though I might not need this with vPro; I do not know much about vPro yet, but I believe it is very similar to IPMI in the server world.
- NIC 1 = Proxmox Management & WebUI
- WIFI = Unused unless passed through to a virtual machine
- USB 1 = Mouse/Keyboard connection to the KVM
- USB 2 = USB Hub 01 (This would be a 4-port hub; see the NIC-identification sketch after this list)
- Port 01 = USB gigabit NIC for Proxmox Management & WebUI
- Port 02 = USB gigabit NIC for Storage
- Port 03 = USB gigabit NIC for Ceph
- Port 04 = USB gigabit NIC for virtual machine traffic
- USB 3 = USB Hub 02 (This would be a 4-port hub, details below)
- Port 01 = USB gigabit NIC for Storage
- Port 02 = USB gigabit NIC for Ceph
- Port 03 = USB gigabit NIC for virtual machine traffic
- Port 04 = External USB SSD for Ceph OSD (500GB ish to start)
- USB 4 = USB Hub 03 (This would be a 4-port hub, details below)
- Port 01 = External USB SSD for Ceph OSD (500GB ish to start)
- Port 02 = External USB SSD for Ceph OSD (500GB ish to start)
- Port 03 = External USB SSD for Ceph OSD (500GB ish to start)
- Port 04 = External USB SSD for Ceph OSD (500GB ish to start)
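One practical wrinkle with this USB NIC layout is working out which dongle ended up as which interface on the Proxmox host before assigning them to bridges, since they will all look nearly identical to the OS. Below is a minimal sketch of how I might inventory them (assuming a standard Linux /sys layout; this is just an illustration, not a finished tool): it lists each physical interface with its MAC address and whether it is USB-attached.

```python
#!/usr/bin/env python3
# Minimal sketch: inventory network interfaces on a node, flagging the
# USB-attached ones and printing their MAC addresses so each dongle can be
# mapped to a role (management, storage, Ceph, VM traffic) before the
# bridges are built. Assumes a standard Linux /sys layout.
import os

SYS_NET = "/sys/class/net"

for nic in sorted(os.listdir(SYS_NET)):
    dev_link = os.path.join(SYS_NET, nic, "device")
    if not os.path.exists(dev_link):
        continue  # skip lo, bridges, VLANs, bonds - they have no backing device
    dev_path = os.path.realpath(dev_link)
    with open(os.path.join(SYS_NET, nic, "address")) as f:
        mac = f.read().strip()
    bus = "USB" if "/usb" in dev_path else "onboard/PCIe"
    print(f"{nic:16s} {mac}  {bus}")
```

From there, each MAC address could be pinned to a friendlier, stable name (for example with a systemd .link file) so the Proxmox bridge definitions stay consistent across reboots and re-plugs.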
Right now the big areas I am researching are:
- How much of the 1-gigabit network capacity is actually being used by the traffic on each particular network / NIC. This is to determine whether I really need that many network connections and/or whether I truly need to be looking at 10-gigabit networking in the next 1 to 3 years. I am not sure how to measure this properly and am still working out how to do it. Ideally, I would place a device (think Raspberry Pi) or create a virtual machine that monitors the various networks and/or NICs on each machine and reports how much traffic they are passing and how much bandwidth they are using, so I can see where any bottlenecks are (a rough sketch of what I have in mind is after this list).
- Power consumption. Right now my homelab is drawing around 2,200 watts, which costs around $100 - $150 a month (2,200 watts around the clock works out to roughly 1,580 kWh a month, so those numbers line up at typical rates). I know the Proxmox nodes are using around 200 to 250 watts each. The NAS devices are a lot more power efficient, and although the drives themselves consume a noticeable amount of power, I am willing to keep that consumption and work on the Proxmox node side to get the usage down.
- Standardization and upgradeability. I would ideally like all of the Proxmox nodes to be identical, as that makes migration and HA failover a little easier. I also want to get away from the RAID cards in the Dell and HPE servers and get direct access to the disks for both ZFS and Ceph. I would like to get past the 32GB RAM limit in my current Proxmox nodes as well, since that is mostly where I run into limitations on the number of virtual machines and services I can host, though containerizing services into groups and running them on a smaller number of virtual machines has helped compared to the old one-service-per-VM approach. The last piece is support for PCIe passthrough and CPU feature sets, for example hardware encryption for my virtualized pfSense install or passing through the onboard GPU to a status monitor mounted in the rack (a quick check I could run on candidate hardware is sketched after this list).
- I am also looking at swapping out the DGS-1100 switches for UniFi models, though I am not sure whether I want to jump to 10-gigabit now or run a new UniFi 1-gigabit setup in the interim. I would love to have my switches and WAPs on a single pane of glass for management, mostly for VLANs.
- Some of the services I run are pfSense, Unifi Controller, TrueCommand, Nginx as a reverse proxy, Mailcow, Proxmox Mail Gateway, Portainer, Jellyfin, Navidrome, BookStack, Gitea, Vaultwarden, Snipe-IT, OS Ticket, Sonarr/Radarr/Lidarr/Tdarr, etc., WordPress sites, and a few others I am probably forgetting.
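For the network utilization question above, here is a rough sketch of the kind of per-NIC measurement I have in mind, run directly on a node or in a monitoring VM (the sampling interval is just a placeholder):

```python
#!/usr/bin/env python3
# Rough sketch: sample per-interface throughput by reading /proc/net/dev twice
# and reporting the average Mbit/s over the interval. The interval and any
# filtering of interface names are placeholders to adjust per node.
import time

INTERVAL = 5  # seconds between the two samples

def read_counters():
    counters = {}
    with open("/proc/net/dev") as f:
        for line in f.readlines()[2:]:             # first two lines are headers
            name, data = line.split(":", 1)
            fields = data.split()
            # field 0 = rx bytes, field 8 = tx bytes
            counters[name.strip()] = (int(fields[0]), int(fields[8]))
    return counters

before = read_counters()
time.sleep(INTERVAL)
after = read_counters()

for nic, (rx0, tx0) in sorted(before.items()):
    rx1, tx1 = after.get(nic, (rx0, tx0))
    rx_mbps = (rx1 - rx0) * 8 / INTERVAL / 1_000_000
    tx_mbps = (tx1 - tx0) * 8 / INTERVAL / 1_000_000
    print(f"{nic:16s} rx {rx_mbps:8.2f} Mbit/s   tx {tx_mbps:8.2f} Mbit/s")
```

Logging that output every few minutes for a week or two (or graphing the same counters with something like vnStat or the Proxmox node graphs) should show whether any of the gigabit links ever get close to saturation, which would answer the 10-gigabit question.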
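And for the passthrough / CPU feature question, this is a quick sanity check I could run from a live Linux USB on any candidate box (assuming the hardware-encryption piece means AES-NI; again only a rough sketch):

```python
#!/usr/bin/env python3
# Rough sketch: check for the AES-NI CPU flag and whether any IOMMU groups are
# present (i.e. VT-d / AMD-Vi is active), as a quick pass/fail for PCIe
# passthrough candidates. Assumes an x86 Linux environment.
import os

flags = set()
with open("/proc/cpuinfo") as f:
    for line in f:
        if line.startswith("flags"):
            flags = set(line.split(":", 1)[1].split())
            break

groups_dir = "/sys/kernel/iommu_groups"
groups = os.listdir(groups_dir) if os.path.isdir(groups_dir) else []

print("AES-NI flag: ", "present" if "aes" in flags else "missing")
print("IOMMU groups:", len(groups) or "none (enable VT-d/AMD-Vi and IOMMU on the kernel cmdline)")
```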
I would love to hear any feedback, ideas, thoughts, suggestions, or comments from the community on this overhaul plan. I want to make sure that this next iteration of my homelab not only brings the hardware into a more modern age but is also deployed with the goal of reducing energy usage while expanding the lab's capabilities, and that I learn and implement new and better practices along the way.