lga3647 esxi build *Nov 2024 --> Began Switching from vSphere to Proxmox


BennyT

Active Member
Dec 1, 2018
Louisiana USA
When importing an ESXi VM into Proxmox, one of the fields the importer needs is where to store the converted VM. It was only showing me local-lvm, which is on my Proxmox boot SSD. I wanted to import into my newly created LVM-thin pools, so I first had to tell Datacenter > Storage that I have LVM thin pool(s). After that is done, I can select those thin pools as the destination for my imported and converted VM.
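
For reference, registering an existing thin pool can also be done from the shell with pvesm instead of the GUI. A minimal sketch, using my own VG/thin pool naming scheme as example values:

Bash:
# Register an existing LVM thin pool as Proxmox storage (IDs/names are examples)
pvesm add lvmthin miniprox1_thinpool001 --vgname miniprox1_VG001 --thinpool miniprox1_thinpool001 --content images,rootdir

# Confirm the new storage shows up
pvesm status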


1733107972344.png
1733108187415.png

Now I should be able to Import a VM into one of those custom miniprox1_thinpool# storage areas:
1733108437918.png
 

BennyT

So far so good. My new little Supermicro minitower has 10Gb adapters, but my 10Gb switch was full, so I'm connected to a 1Gb switch. It's moving along pretty well. Here is my first ESXi import's progress so far:

1733110851040.png
 

BennyT

It completed successfully

1733112677406.png

This 100GB vm imported and converted from the ESXi to Proxmox in about 30 minutes over gigabit LAN.
1733112733557.png
1733112748891.png

1733112769897.png

Actually, I won't know if it was successful until I try booting it up.. tomorrow. Time for bed now.

Next up will be to boot into the new converted VM, uninstall vmware-tools, install proxmox variant called qemu-guest-agent and test it out.
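
Inside the guest, that should boil down to something like the following (a hedged sketch for an RPM-based guest like Oracle Linux; the exact removal command depends on whether VMware Tools was installed as the open-vm-tools package or from the tarball):

Bash:
# Remove the VMware tooling (one of these, depending on how it was installed)
dnf remove open-vm-tools
# /usr/bin/vmware-uninstall-tools.pl   # tarball-based installs, if present

# Install and start the QEMU guest agent
dnf install qemu-guest-agent
systemctl enable --now qemu-guest-agent
systemctl status qemu-guest-agent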

Then it's only about 50 more VMs to convert... Many of those are over 500GB, so I may want to connect to my 10Gb switch for the rest of those conversions. I'm very HAPPY with this.

*edit: I later dropped the VM and reimported it, but this time using a 10Gbit connection between hosts, and it finished much faster, as you'd expect:
1733188273983.png
 

BennyT

I've not had much time to work on this in the past few weeks, but my next step will be to install the qemu agent in the imported VMs. That is the Proxmox equivalent to vmware-tools (vmtools), which handles the virtual drivers in the OS.

I ran into some issues a few weeks ago trying to install that qemu agent software in the Oracle Linux guest VM on my Proxmox host. The qemu-guest-agent service was failing to start, maybe because of missing dependency packages and such. I need to figure that out, and it's my top priority for this project before I continue importing and converting other VMs from ESXi to Prox.
 

BennyT

Screenshot of me installing qemu into my newly converted guest vm.
1734647406874.png

But qemu-guest-agent (which manages the drivers, similar to vmware-tools) would fail to start up:
1734647422400.png
Screenshot of the Proxmox 8.3 ESXi Importer plugin settings screen which should fix the qemu_guest_agent:
1734647174650.png

I think I've discovered the reason for my issues installing and starting the qemu-guest-agent (equivalent of VMware Tools). For the Linux guest VM I was converting, I needed to set the virtual network adapters and the SCSI disks to VirtIO. I had those left at the defaults, which were the VMware settings carried over from ESXi. I'm re-importing the guest into Prox from ESXi with those changes to see if that helps.
 

BennyT

Nevermind, that didn't fix it. Now the guest won't even boot up. The Prox guest VM just shows a spinning circle until it gives up and drops into emergency shell mode.

Okay, I think I have it figured out. When I convert my Linux VMs from ESXi to Proxmox, I need to change the import settings shown in my previous post (on the Advanced tab) to VirtIO, but I also need to change a few things in the resulting Proxmox guest VM's .conf file after the import completes.

Edited /etc/pve/qemu-server/101.conf from this:

bios: ovmf
boot: order=sata0;scsi0
cores: 4
cpu: x86-64-v2-AES
efidisk0: miniprox1_thinpool001:vm-101-disk-0,size=4M
memory: 10240
meta: creation-qemu=9.0.2,ctime=1734646745
name: BRTAT20b
net0: virtio=BC:24:11:20:44:85,bridge=vmbr0
net1: virtio=BC:24:11:08:10:6C,bridge=vmbr0
ostype: l26
sata0: none,media=cdrom
scsi0: miniprox1_thinpool001:vm-101-disk-1,size=100G
scsihw: virtio-scsi-pci
smbios1: uuid=4231016d-8b18-8bd2-6e22-39429ef526ec
sockets: 1
vmgenid: 7ea239f7-8420-468d-afb5-b27ee5b909ee


to this:

bios: ovmf
boot: order=sata0;sata1
cores: 4
cpu: x86-64-v2-AES
efidisk0: miniprox1_thinpool001:vm-101-disk-0,size=4M
memory: 10240
meta: creation-qemu=9.0.2,ctime=1734646745
name: BRTAT20b
net0: virtio=BC:24:11:20:44:85,bridge=vmbr0
net1: virtio=BC:24:11:08:10:6C,bridge=vmbr0
ostype: l26
sata1: none,media=cdrom
sata0: miniprox1_thinpool001:vm-101-disk-1,size=100G
scsihw: virtio-scsi-pci
smbios1: uuid=4231016d-8b18-8bd2-6e22-39429ef526ec
sockets: 1
vmgenid: 7ea239f7-8420-468d-afb5-b27ee5b909ee
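
The same change can probably be made with qm instead of vi (a hedged sketch for VM 101; note the quotes, since the semicolon in the boot order would otherwise be swallowed by the shell, and the detached disk may briefly show up as unused0 before it is re-attached):

Bash:
# Detach the disk and the old cdrom device...
qm set 101 --delete scsi0,sata0
# ...then re-attach the disk as sata0 and the cdrom as sata1
qm set 101 --sata0 miniprox1_thinpool001:vm-101-disk-1,size=100G --sata1 none,media=cdrom
# Boot from the sata disk first
qm set 101 --boot 'order=sata0;sata1'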


Now I can boot into the VM. Next I'll uninstall vmware-tools and then install the qemu-guest-agent.
 

BennyT

OH MY WORD! It is working now. qemu-guest-agent is working now in my guest linux VM. I cannot believe how much time I've spent on this. The solution was so easy, but it took me a LONG time to figure it out.

Here is the error I would see.

1734660548599.png
I tried everything I could think of. I've been messing around with the .conf file as mentioned in previous posts. I tried editing grub to diagnose in single-user mode, adding Enterprise Linux repo URLs to my yum repositories to install special virtio packages, and rebuilding the kernel with different virtio drivers; nothing was working. 'lsmod | grep virtio' showed I had everything I needed, but I could not get that qemu-guest-agent service to start and run.

Then I read some guy in a forum saying he had to ENABLE the QEMU guest agent while it was off. That didn't make any sense to me. How could I run 'systemctl enable qemu-guest-agent' on the VM command line if the VM is off? Turns out I misunderstood what they meant. They meant to ENABLE this OPTION in the Proxmox GUI for that VM. THIS IS THE SOLUTION:

1734660858118.png

Now the qemu-guest-agent is working in my guest Linux VMs. So easy, yet hidden from me all this time.
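
For anyone who prefers the CLI, the same option can apparently be flipped with qm while the VM is shut down:

Bash:
# Enable the QEMU Guest Agent option for VM 101 (takes effect on the next boot)
qm set 101 --agent enabled=1

# Verify it landed in the VM config
qm config 101 | grep agent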

1734661296544.png

I probably didn't have to change any of that crap I did to the .conf files. I'll re-import from ESXi again, uninstall VMware Tools in the guest, install qemu-guest-agent, shut down, flip that QEMU Guest Agent Enabled option in the GUI, then restart the VM and test. I bet that is all I had to do. I'll try that tomorrow to ensure I have the steps correct.
 

BennyT

Now that I know how to ENABLE the qemu-guest-agent for a VM (see my previous post above), the next question is what virtual disk controller I should use in Proxmox.

When I import an ESXi guest VM (Linux in this case), the disk controller is VMware PVSCSI. That would still work once the VM is imported and converted into Proxmox, but performance is not good with VMware PVSCSI since Prox doesn't optimize for it. We want to change the SCSI controller from "VMware PVSCSI" to "VirtIO SCSI".
* please refer to JLAURO's post (scroll down in the link --> ) showing his IOPS testing comparing VirtIO SCSI vs VMware PVSCSI in a Proxmox guest: Worth swapping vmware pvscsi to virtio scsi?

However, if we import the guest VM into Prox set up to use VirtIO SCSI, the Linux guest won't boot up at all (Windows guests don't seem to have that problem). To get the Linux guest to boot with the VirtIO SCSI controller I'd have to change the line containing scsi0 to reference sata0 instead, which is weird, right?
cd /etc/pve/qemu-server/
vi 102.conf #<-- my newly imported guest vm#102 config file

scsi0: miniprox1_thinpool001:vm-102-disk-1,size=100G
change to
sata0: miniprox1_thinpool001:vm-102-disk-1,size=100G

But I definitely don't want to do that. I don't want to switch the device from scsi0 to sata0; that's a bad workaround that would allow bootup but with even worse performance than staying with VMware PVSCSI. So, don't change scsi0 to sata0. I'll keep the .conf file as it was after import (VirtIO SCSI controller and scsi0 disk, as shown below):

root@miniprox1:/etc/pve/qemu-server# cat 102.conf
agent: 1
bios: ovmf
boot: order=sata0;scsi0
cores: 4
cpu: x86-64-v2-AES
efidisk0: miniprox1_thinpool001:vm-102-disk-0,size=4M
memory: 10240
meta: creation-qemu=9.0.2,ctime=1734709623
name: BRTAT20c
net0: virtio=BC:24:11:DF:1B:11,bridge=vmbr0
net1: virtio=BC:24:11:29:EE:16,bridge=vmbr0
ostype: l26
sata0: none,media=cdrom
scsi0: miniprox1_thinpool001:vm-102-disk-1,size=100G
scsihw: virtio-scsi-pci
smbios1: uuid=4231016d-8b18-8bd2-6e22-39429ef526ec
sockets: 1
vmgenid: 339a0797-63ce-4e7c-9057-52b416c86ad7
root@miniprox1:/etc/pve/qemu-server#

So how do we boot up if it won't boot with scsi0 and the VirtIO SCSI driver? The solution I've found is to use controller = VirtIO SCSI (shown in the .conf above as scsihw: virtio-scsi-pci) while still using scsi0 disk devices, and then boot into the RESCUE kernel to rebuild/reinstall the main Linux kernel. Not sure why that is needed, but that's my workaround for now:

I boot into the guest VM's Linux rescue kernel by selecting it from the grub menu (because the top-listed kernel doesn't boot up, I must boot into "rescue" and recreate or reinstall a workable kernel):
1734726604542.png

I wanted to use an el8uek kernel (UEK = Oracle Unbreakable Enterprise Kernel), so once I've booted into the rescue kernel I install the newest UEK kernel:
1734726775628.png

Then I add that new kernel so that it appears and defaults in the grub menu:
1734726899284.png
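
From my notes, the commands were roughly the following (a hedged sketch for Oracle Linux 8 on UEFI; it assumes the UEK repo is already enabled, and the kernel version string is a placeholder):

Bash:
# From the rescue kernel: install the latest UEK kernel
dnf install kernel-uek

# List the installed kernels, then make the new UEK the default boot entry
grubby --info=ALL | grep -E '^(index|title|kernel)'
grubby --set-default /boot/vmlinuz-<new-uek-version>   # placeholder version string

# Confirm the default before rebooting
grubby --default-kernel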

Then I can boot into the new kernel:

1734727227776.png

and the qemu-guest-agent service is running too:
1734727864208.png

double checking that the virtio drivers are installed and being utilized:
1734728071730.png

And because I was curious and wanted to know what those are, here is ChatGPT-4o's response on what the various VirtIO drivers are for:

The VirtIO drivers you see in the output each serve a different purpose, particularly when it comes to optimizing I/O performance and memory management in virtualized environments. Here's a breakdown of what each module represents:

1. virtio_balloon
  • Purpose: This module is responsible for dynamic memory management. The balloon driver allows the hypervisor (in this case, Oracle Linux with KVM or QEMU) to dynamically adjust the memory allocated to a virtual machine. It does this by "inflating" and "deflating" the balloon, which mimics memory usage and helps the hypervisor optimize physical memory between guests.
  • Use case: It helps manage overcommitment of memory by adjusting the memory size available to VMs without requiring a reboot. If memory is tight on the host, the balloon driver can release memory from the guest to the host.
2. virtio_net
  • Purpose: This module provides the VirtIO driver for network devices. VirtIO networking enables high-performance network I/O for virtual machines (VMs) by providing a paravirtualized network device.
  • Use case: It offers significantly better performance than emulated network devices (such as the e1000), making it ideal for high-performance networking in VMs.
3. virtio_console
  • Purpose: The VirtIO console driver provides a paravirtualized serial console interface to the guest.
  • Use case: This is used to manage serial devices in the guest (such as virtual serial ports for debugging or logging). It's typically used for management purposes and does not directly impact normal network or disk I/O.
4. virtio_scsi
  • Purpose: This module provides a paravirtualized SCSI driver for block storage devices.
  • Use case: It allows high-performance SCSI devices (such as virtual disks) to be accessed by virtual machines. It is used to provide better I/O performance for VM disk devices compared to the emulated IDE or SATA controllers.
5. virtio_pci
  • Purpose: This driver enables communication between the guest and VirtIO devices over PCI (Peripheral Component Interconnect). It acts as a bridge between the virtual machine and VirtIO devices, including network, block, and other devices.
  • Use case: This is the core module that facilitates communication for other VirtIO drivers (like virtio_net, virtio_blk, etc.) by registering the VirtIO devices on the PCI bus.
6. virtio_pci_legacy_dev
  • Purpose: This is a legacy driver for older versions of VirtIO that were designed to work with specific configurations of the VirtIO PCI device.
  • Use case: It's used for backward compatibility with older virtual machines or systems that use earlier versions of VirtIO. It's generally not needed for newer systems using modern VirtIO drivers but is included for compatibility.
7. virtio_pci_modern_dev
  • Purpose: This is a more recent version of the virtio_pci driver that provides better performance and more features compared to the legacy driver (virtio_pci_legacy_dev).
  • Use case: It is used to enable communication between the guest and VirtIO devices with the latest optimizations and features.
Summary of Key Concepts:
  • Ballooning: The virtio_balloon driver allows dynamic memory management for virtual machines, enabling the host to reclaim memory from the guest when needed.
  • Legacy vs. Modern: The virtio_pci_legacy_dev and virtio_pci_modern_dev drivers are related to compatibility. The legacy driver is for older systems, while the modern version provides enhanced performance and features.
  • Networking and Block Storage: The virtio_net and virtio_scsi drivers are used for network and storage devices, respectively, and provide efficient I/O by avoiding the overhead of emulated devices.


Clear as mud?

I'll probably spend a lot of my vacation next week importing and converting most of my ESXi VMs into Proxmox. Cool stuff.
 

BennyT

I'm also excited because Proxmox allows for a wider variety of PCIe NVMe SSDs. ESXi's compatibility requirements killed off my desire to use my affordable M.2 NVMe SSDs; I had to use hacks unless I bought expensive enterprise Samsung-firmware PCIe SSDs for ESXi. Now I'm finally going to be using speedy PCIe SSDs in my home lab, without hacks :D

The X11DPI has x4x4x4x4 bifurcation on its x16 slots, so we can use at least two, or maybe even three or four, of these Hyper M.2 cards by ASUS. I had been experimenting with these in ESXi years ago, but I gave up because I didn't like hacking it to work with non-compatible consumer PCIe M.2 SSDs. So I'm bringing the Hyper cards project back when I convert my big host to Proxmox. I even bought extra Hyper cards so I can see how many I can fit into the server. Having four x16 slots (actual x16 lanes, not x8 in a physical x16 slot) across dual LGA3647 sockets should give me enough lanes. Excited to get fully onto Proxmox!

Photo of the ASUS Hyper M.2 Gen4 card (I'll be using Gen4 SSDs on the Gen4 card in a Gen3 PCIe slot, but it still works great):
PXL_20241220_211426036.MP.jpg
 

BennyT

Adding more notes for my own future referencing.

Importing Windows (10 and 11) guest VMs from ESXi into Proxmox as virtio scsi. The goal:

- Uninstall VMware-tools from the windows VM
- Install qemu-guest-agent (Proxmox's own variant of the tools)
- Install Virtio Drivers
- Change the virtual disk devices as seen in Proxmox from being SATA to SCSI, for performance.

When I imported a Windows guest VM from ESXi to Proxmox via the Proxmox ESXi importer, it would set up the resulting VM's hardware configuration file (/etc/pve/qemu-server/<vm#>.conf) with the disks as "sata" instead of "scsi", even when the controller was set to VirtIO SCSI. It would boot fine, but I was told that performance would be poor unless the disk is attached as "scsi", not "sata".

Simply changing the .conf to show scsi0 instead of sata0 after import didn't solve it; the VM would NOT boot if the disk was changed to scsi in the .conf file right off the bat. So, here are all of my steps (refined by running them through ChatGPT for a clean, concise listing):

Steps to Migrate a Windows VM from ESXi to Proxmox
  1. On ESXi (Before Shutdown):
    • Uninstall VMware Tools inside the Windows guest VM via Control Panel > Programs and Features > Uninstall VMware Tools.
    • Reboot the Windows guest VM after uninstalling VMware Tools.
  2. On ESXi (Prepare the VM for QEMU Guest Tools Installation):
    • Mount the QEMU .iso from the ESXi Datastore as a CD/ROM device in the guest VM (e.g., mount virtio-win-0.1.266.iso as D: drive in the VM).
    • Install the QEMU Guest Agent inside the guest VM by running the qemu-guest-agent installer from the mounted .iso on the D: drive.
    • Reboot the VM after installing the QEMU Guest Agent.
  3. On ESXi (Shutdown the VM):
    • Shutdown the Windows guest VM in ESXi after installation of the required tools.
  4. On Proxmox (Import the VM):
    • Use the Proxmox ESXi Import Tool:
      • Set the SCSI disk controller to Virtio SCSI.
      • Set the network adapter to VirtIO with a unique MAC address.
      • Disable any CDROM SATA devices (if present).
      • Set the socket count to 1 because my proxmox server has just one socket, and adjust the CPU cores (typically double the per socket core count from before, as I'm now using only one socket).
  5. On Proxmox (Attach VirtIO and QEMU ISO):
    • Upload virtio-win-0.1.266.iso and qemu-guest-agent.iso to local storage and attach them to the VM as CD/ROM devices (they appear as sata1 and sata2 in the .conf excerpt in step 7).
  6. On Proxmox (Install VirtIO and QEMU Drivers):
    • Boot the VM from the Proxmox GUI.
    • Inside the VM, install the VirtIO drivers and QEMU guest agent drivers from the respective CD/ROMs (SATA1 for VirtIO, SATA2 for QEMU).
    • After installation, shutdown the VM in the Proxmox GUI.
  7. Edit the .conf File (Add Both sata0 and scsi0 - these will be duplicates of each other - mapped to the same device, but only booting from sata0 as shown by "boot: order=sata0") :
    • Open the Proxmox terminal and edit the VM’s .conf file located in /etc/pve/qemu-server/101.conf.
      • Add the following lines under the VM configuration file (excerpt shown, these are not displaying all of the .conf):
        boot: order=sata0
        sata0: miniprox1_thinpool001:vm-101-disk-1,size=100G
        sata1: local:iso/virtio-win-0.1.266.iso,media=cdrom,size=707456K
        sata2: local:iso/qemu-guest-agent.iso,media=cdrom,size=178070K
        scsi0: miniprox1_thinpool001:vm-101-disk-1,size=100G
        scsihw: virtio-scsi-pci
  8. On Proxmox (Boot the VM and Check SCSI0):
    • Boot the VM.
    • Inside Windows, check Device Manager. Both sata0 and scsi0 should be recognized.
    • Disk Management will show scsi0 as offline because it points to the same disk as sata0. This is expected.
  9. On Proxmox (Shutdown the VM Again):
    • Shutdown the VM again using the Proxmox GUI.
  10. Edit the .conf File (Remove sata0 and Change Boot Order):
    • Edit the .conf file (excerpt shown) to remove the sata0 reference and change the boot order to boot from scsi0. The configuration should look like this (comment out, or remove the sata references):
      boot: order=scsi0
      scsi0: miniprox1_thinpool001:vm-101-disk-1,size=100G
      #sata0: miniprox1_thinpool001:vm-101-disk-1,size=100G
      #sata1: local:iso/virtio-win-0.1.266.iso,media=cdrom,size=707456K
      #sata2: local:iso/qemu-guest-agent.iso,media=cdrom,size=178070K
      scsihw: virtio-scsi-pci
  11. On Proxmox (Boot the VM and Verify):
    • Boot the VM again.
    • The VM should now boot from the scsi0 disk and run Windows correctly.
    • Check Device Manager and Disk Management to ensure that everything is working as expected.
  12. Configure Network Adapters (Optional):
    • In the Proxmox GUI, configure the network adapters as needed.
    • Ensure that the correct IP addresses are assigned. Use the dnsmasq file on your DNS server to verify the IP mapping.
  13. Reboot the Guest VM:
    • Reboot the VM inside Windows and confirm that everything is working, particularly the QEMU Guest Agent service.
  14. Optional: Add Display (SPICE):
    • In the Proxmox VM GUI, go to Hardware > Add Display and choose SPICE to improve display performance.
Notes:
  • By following these steps, you ensure that both sata0 and scsi0 are configured initially to help Windows recognize the new SCSI disk before fully transitioning to SCSI.
  • This process avoids potential issues where the VM might fail to boot if only scsi0 is added at the start without being recognized.
  • Removing the sata0 reference and adjusting the boot order ensures that the VM will now boot correctly from the scsi0 disk.
It looks chaotic and crazy, but in short it was steps 7 and 8 that were the magic sauce/fix: by adding both sata0 and scsi0 to the .conf file (with scsi0 mapped exactly the same as sata0) and booting, the VM boots from sata0, and scsi0 is recognized but shows offline in the Windows guest (expected, because it is a duplicate of sata0). That lets Windows enable the VirtIO SCSI driver for that offline scsi0 device. Then we shut down again, edit the .conf to remove the sata references, leaving just scsi0, and it will then successfully boot from scsi0 instead of sata0.

These notes are somewhat unique to my setup; for example, the reference to miniprox1_thinpool001 in the .conf files is how I declared and reference my Proxmox storage on the 4TB SSD1 disk.
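
Once the agent is installed in the guest and the option is enabled on the VM, it can be sanity-checked from the Proxmox host (a hedged example using VM 101):

Bash:
# Ask the guest agent to respond (exit code 0 means it's alive)
qm guest cmd 101 ping

# Query the guest's network interfaces through the agent as a further check
qm guest cmd 101 network-get-interfaces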
 

BennyT

Summary of my proxmox storage layout and my Reasoning:

In my Proxmox setup, I'm utilizing thin LVMs (block storage, not mounted to a filesystem path). I chose thin LVM over regular LVM or ZFS because it is possible to deactivate a thin LVM VG (volume group) and reactivate it again later, i.e. I can detach and re-attach a thin LVM volume group, whereas you can't do that with a regular LVM and it's difficult with a ZFS pool of drives.

Each thinLVM is one-to-one with a virtual machine's virtual disk.

1735850418835.png
  • Easy Detachment and Attachment of SSDs:
    • From the diagram above we can see that each SSD is assigned a single Physical Volume (PV). Each Volume Group is assigned a single SSD (PV). Within each VG there is also a single ThinPool. Each VM is made up of one or more ThinLVMs within that ThinPool. This setup allows an SSD to be easily deactivated, detached, and replaced with a new one when full.
      • deactivate a VG (example): vgchange -an miniprox1_VG001 #we can re-activate this later too.
      • slide out the SSD containing VG001 from the physical Proxmox chassis. There's no need to unmount, because I chose not to mount the thin pools; it is block storage and Proxmox handles assigning it as a raw disk to the guest VM. But if we had mounted it (it's an option), then we'd do a 'umount /dev/sda' before physically removing the SSD from the chassis.
      • slide in a new SSD.
      • format the new SSD with a gpt partition, using 'wipe' or 'gdisk' commands.
      • create a new PV for the newly installed SSD: pvcreate /dev/sda
      • pvscan # ensures the new PV is seen.
      • create a new VG with that new PV: vgcreate miniprox1_VG008 /dev/sda
      • create a new ThinPool on that VG: lvcreate -L 3.5T -n miniprox1_thinpool008 --type thin-pool miniprox1_VG008
      • Begin creating new virtual machines in the PROXMOX GUI for that thinpool. Proxmox automatically creates the Virtual Machine's Thin-LMV(s).
      • Later on we can deactivate another VG and re-activate the VG001 we unplugged earlier to access its VMs:
        • first, make a hotswap SSD caddy tray available by deactivating another VG: vgchange -an miniprox1_VG004
        • slide out the SSD that had VG004 from the proxmox chassis.
        • slide in the original SSD and reactivate VG001 which we had deactivated earlier (example): vgchange -ay miniprox1_VG001
    • This design is ideal for my direct attached storage in a minitower with only 4 hot-swap SSD caddy trays, and I didn't want to invest time into setting up a NAS or SAN for my ESXi to Prox VM conversions.
  • Efficient Use of Storage with Thin Provisioning:
    • Thin provisioning allocates space on demand, avoiding waste associated with unused or over-allocated volumes. Regular LVM lacks this feature, requiring upfront allocation.
  • Optimized for Proxmox VM Storage:
    • Thin LVMs integrate directly with Proxmox as block storage for VM disks, with each VM’s disk mapped one-to-one with a Thin-LVM.
    • Regular LVMs would require additional filesystem layers, and ZFS, while functional, involves greater resource consumption and complexity. ZFS is also a POOL of drives and I didn't want a pool of drives. I need to remove a single physical disk when it becomes full of VMs (VMs=ThinLVMs)
  • Low Overhead and Simplicity:
    • Thin LVMs impose minimal CPU and memory demands, ensuring resources remain available for VM workloads.
    • ZFS's advanced features (e.g., checksumming, compression) were not enough to outweigh the ThinLVM advantages for me. Plus, ZFS uses pools of physical disks and I wanted to be able to keep each SSD self contained and easily removable/deactivated for storage offsite.

Okay, all that said, I also wanted a way to monitor the actual disk usage of each SSD. I created a bash shell script utilizing the pvs, lvs and lvdisplay commands, then parsing the output with awk into a report. This script is specific to my setup, where I have one SSD per PV, one PV per VG, and one thin pool per VG. Just keep that in mind. If you're using multiple PVs in a volume group (multiple physical HDDs or SSDs per VG), or if you have multiple thin pools per VG, then this might not display correctly for you (it might, but I've not tested that setup).

Bash:
#!/bin/bash

####################################
# Script Name: thinpool_usage.sh
# Author: Benny R. Tate II
# Company: BRTA - Ben R. Tate and Associates, Inc.
# Created Date: January 1, 2025
# Purpose: To provide a detailed summary of thin-pool usage in Proxmox environments.
#
# Description:
# This script generates a summary of thin-pool storage usage and detailed information
# about logical volumes (LVs) within each thin-pool. The script displays:
#   - Total size of each thin-pool.
#   - Allocated space in GB and MB (with decimal precision).
#   - Actual used space in GB and MB (with decimal precision).
#   - Percentage of total thin-pool size used.
#   - Percentage of allocated thin-pool size used.
#   - The associated Physical Volume (PV) for each thin pool.
#
# Usage:
# Simply run this script without any arguments to get the thin-pool usage summary.
# Place the script in a common directory like /usr/local/bin for easy access.
#
# Example:
# $ thinpool_usage.sh
####################################

# Step 1: Collect thin pool information and VG-PV mapping
lvs --units m --segments -o lv_name,lv_size,data_percent,pool_lv,vg_name --noheadings | awk '
BEGIN {
    # Build a mapping of Volume Groups (VGs) to Physical Volumes (PVs)
    while (( "pvs --noheadings --options pv_name,vg_name" | getline pv_line ) > 0) {
        split(pv_line, pv_info)
        pv_to_vg[pv_info[2]] = pv_info[1]  # Map VG name to its PV
    }
    close("pvs")

    # Identify thin pools explicitly (filter by "twi-" attributes)
    while (( "lvs --noheadings --options lv_name,vg_name,lv_attr,lv_size --units m" | getline thinpool_line ) > 0) {
        split(thinpool_line, tp_info)
        if (tp_info[3] ~ /^twi-/) {  # Thin pool attributes
            thinpools[tp_info[1]] = tp_info[2]  # Map thin pool name to its VG
            thinpool_sizes[tp_info[1]] = substr(tp_info[4], 1, length(tp_info[4]) - 1)  # Store thin pool size in MB
        }
    }
    close("lvs")

    # Print headers for the Thin-Pool Summary section
    printf "Thin-Pool Summary:\n"
    printf "%-25s %-25s %-30s %-30s %-30s %-12s %-12s\n", "Physical Volume", "Thin-Pool", "Total Size", "Allocated GB (MB)", "Used GB (MB)", "% Used Total", "% Used Allocated"
}

{
# Process logical volumes associated with thin pools
if ($4 ~ /_thinpool/) {  # Check if this is part of a thinpool
    pool = $4  # Thin pool name
    vg = thinpools[pool]  # Volume group for the thin pool

    # If the pool size is not already recorded, fetch and store it
    if (!(pool in pool_size)) {
        pool_size[pool] = thinpool_sizes[pool]
    }

    # Accumulate allocated size for the pool
    pool_allocated[pool] += substr($2, 1, length($2) - 1)
    # Accumulate used size for the pool
    pool_used[pool] += substr($2, 1, length($2) - 1) * ($3 / 100)

    # Assign the corresponding PV for the pool
    pool_pv[pool] = (vg in pv_to_vg) ? pv_to_vg[vg] : "Unknown"
}
    # Store details for individual LVs for the Detailed Logical Volumes section
    if ($4 in thinpools && $1 !~ /_thinpool/) {  # Exclude thin pools themselves
        lv_name = $1
        lv_size_mb = substr($2, 1, length($2) - 1)  # Logical volume size in MB
        data_percent = $3  # Percentage of allocated size that is used
        used_mb = lv_size_mb * (data_percent / 100)  # Calculate used size in MB

        # Convert sizes to GB
        lv_size_gb = lv_size_mb / 1024
        used_gb = used_mb / 1024

        # Format LV size display
        lv_size_display = sprintf("%.2fGB (%.2fMB)", lv_size_gb, lv_size_mb)
        used_display = sprintf("%.2fGB (%.2fMB)", used_gb, used_mb)

        # Store LV details for sorting
        lv_details[lv_name] = sprintf("%-25s %-25s %-25s %-10s %-25s",
            lv_name,
            lv_size_display,
            used_display,
            sprintf("%.2f%%", data_percent),
            $4)
        lv_names[lv_count++] = lv_name  # Track the LV names for sorting
    }
}

END {
    # Include all thin pools, even those without logical volumes
    for (pool in thinpools) {
        if (!(pool in pool_size)) {  # Add unreferenced thin pools
            pool_size[pool] = thinpool_sizes[pool]
            pool_allocated[pool] = 0
            pool_used[pool] = 0
            pool_pv[pool] = (thinpools[pool] in pv_to_vg) ? pv_to_vg[thinpools[pool]] : "Unknown"
        }
    }

    # Collect keys for sorting
    n = 0
    for (pool in pool_size) {
        sort_key = sprintf("%s:%s", pool_pv[pool], pool)  # Combine physical volume and thin-pool name for sorting
        keys[n++] = sort_key
    }

    # Manual sorting (Bubble Sort)
    for (i = 0; i < n - 1; i++) {
        for (j = i + 1; j < n; j++) {
            if (keys[i] > keys[j]) {
                temp = keys[i]
                keys[i] = keys[j]
                keys[j] = temp
            }
        }
    }

    # Print the summary in sorted order
    for (i = 0; i < n; i++) {
        split(keys[i], components, ":")
        pv = components[1]
        pool = components[2]

        total_mb = pool_size[pool]
        total_gb = total_mb / 1024
        allocated_gb = pool_allocated[pool] / 1024
        used_gb = pool_used[pool] / 1024
        percent_used_total = (total_mb > 0) ? (pool_used[pool] / total_mb) * 100 : 0
        percent_used_allocated = (pool_allocated[pool] > 0) ? (pool_used[pool] / pool_allocated[pool]) * 100 : 0

        printf "%-25s %-25s %-30s %-30s %-30s %-12s %-12s\n",
            pv,
            pool,
            sprintf("%.2fGB (%.2fMB)", total_gb, total_mb),
            sprintf("%.2fGB (%.2fMB)", allocated_gb, pool_allocated[pool]),
            sprintf("%.2fGB (%.2fMB)", used_gb, pool_used[pool]),
            sprintf("%.2f%%", percent_used_total),
            sprintf("%.2f%%", percent_used_allocated)
    }

    # Add header for the detailed section
    printf "\nDetailed Logical Volumes:\n"
    printf "%-25s %-25s %-25s %-10s %-25s\n", "LV Name", "Allocated Size", "Used GB (MB)", "% Used", "Thin-Pool"

    # Sort the LV names manually
    for (i = 0; i < lv_count - 1; i++) {
        for (j = i + 1; j < lv_count; j++) {
            if (lv_names[i] > lv_names[j]) {
                temp = lv_names[i]
                lv_names[i] = lv_names[j]
                lv_names[j] = temp
            }
        }
    }

    # Print the sorted LV details
    for (i = 0; i < lv_count; i++) {
        print lv_details[lv_names[i]]
    }
}'
Resulting output:
1735849455509.png

df commands won't show any of the volume groups or logical volumes, or where they are mounted on the filesystem. That is because the thin LVMs are not mounted; they are block storage and will not appear in df output unless we mount them. It is possible to mount them, but I've no need to do that, except that without 'df -h' I'm unable to examine the actual used disk space of those thin LVMs. And that is the reason I created the script: to show me that information.
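
For a quick look without the full report, plain lvs already exposes the data-usage percentages the script is built on:

Bash:
# Show thin pools and thin LVs with their allocated size and data usage
lvs -o vg_name,lv_name,lv_size,data_percent,pool_lv --units g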

By the way, the guest VM configuration files are not kept on the thin LVMs. The .conf files, where we configure the network, vdisks, display and so on, are regular ASCII files in /etc/pve/qemu-server/...
1735850060728.png
 

BennyT

Veeam Backup and Replication v12.3 for backing up Proxmox Guest VMs

I wasn't planning to continue using Veeam for backups after moving to Proxmox, but because the latest Veeam v12.3 (released Dec 2024) now supports Proxmox backup jobs, I'm recreating my existing Veeam VMware backup jobs as Veeam Proxmox backup jobs. I'm already familiar with the Veeam GUI; I only need to change those backup jobs to be Proxmox backup jobs. I use Veeam to back up my other non-virtual machines, such as my laptops and physical servers, so I may as well continue using it for Proxmox backups too.

It's pretty much the exact same process to set up a Proxmox VM backup job in Veeam. The only difference is that it requires a Veeam Proxmox "Worker", which is simply a guest VM added in Proxmox to handle the backup heavy lifting, compression, and so on. It's analogous to creating a Veeam Backup Proxy, except this is a distinctly different type of proxy specifically for Proxmox backups.

Creating the link in Veeam to my Proxmox Host:
1736274582809.png

Veeam will dynamically create a "Worker" VM in Proxmox for me by connecting to my Proxmox server. The worker handles all the compression, deduplication, and so on for the backups. The cool thing is that the Worker VM is only powered on during backups, then it shuts itself off so it won't consume resources.
1736274004874.png

1736274146737.png

Here is one of the Proxmox guest VM backup jobs running in veeam
1736273575262.png
 

BennyT

It seems that Veeam Backup Jobs for Proxmox Guest VMs might not be quite ready for primetime.

My Veeam backup jobs of my Proxmox guest VMs: a full backup of a 600GB guest VM might take about 35 minutes total (vbk). An immediate subsequent backup is an incremental and takes less than 3 minutes, with about 20 MB of data transferred (vib).
Then I run another incremental and it takes 35 minutes again, for only 18 MB of incremental data (vib).

I think Veeam is having trouble tracking which blocks changed in the Proxmox guest VM, perhaps because my Proxmox VMs are on a thin LVM inside an LVM thin pool and it's not reading that efficiently compared to a regular (non-thin, i.e. thick) LVM.

So, I'm going to be installing a Proxmox Backup Server (PBS) later this week to see if native Proxmox backups work much better than the Veeam backups.

One other issue I found with using Veeam for backing up Proxmox: the Veeam PowerShell cmdlets don't know how to read or find Veeam Proxmox backup jobs, only VMware and Microsoft hypervisor backup jobs. So my existing PowerShell script, which I had been scheduling via Windows Task Scheduler, won't work for Proxmox backup jobs in Veeam.

*Veeam Proxmox backup jobs do not use CBT to check for changed blocks when doing incrementals. That is likely the performance issue I'm seeing in some incremental backups. I'm reading that Proxmox Backup Server (PBS) should be much more efficient and speedy on its incrementals. I'll find out later this week.
 

BennyT

EDIT: RESOLVED! I issued this ipmitool command to lower the threshold:

ipmitool -I lanplus -H <ipmi_ip_address> -U <ipmi_username> -P <ipmi_password> sensor thresh FAN5 lower 200 200 700

I was completely wrong about not being able to lower the threshold below 500, this worked!

EDIT2: That didn't work the way I intended. It says it lowered the lower thresholds to 200 RPM, but now it is running all fans at maximum speed, regardless of whether I set OPTIMUM or STANDARD fan mode. I think I made things worse.

EDIT3: RESOLVED! I rebooted AGAIN and went into the BIOS. I didn't change anything, because there isn't anything in there relating to fans, but I exited with a SAVE anyway. After exiting the BIOS it rebooted into ESXi and the fans are back to STANDARD speeds, meaning normal speeds, and no longer tripping codes because of my earlier adjustment to the IPMI thresholds. In other words, I just had to reboot a couple of times after adjusting the new thresholds with ipmitool. Working great now.
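
For future reference, the resulting thresholds can be checked with the same ipmitool options used above (placeholders as before):

Bash:
# Show FAN5's current reading and its lower/upper thresholds
ipmitool -I lanplus -H <ipmi_ip_address> -U <ipmi_username> -P <ipmi_password> sensor get FAN5

# Or list all fan sensors at once
ipmitool -I lanplus -H <ipmi_ip_address> -U <ipmi_username> -P <ipmi_password> sdr type Fan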

---------------------------------------------- ORIGINAL ISSUE SHOWN BELOW - NOW RESOLVED BY THE ABOVE ---------------------------------------------
What might be a good fan controller that I can use to keep some of my fans in the Supermicro X11DPI-NT above their minimum low critical threshold? Or might I need to replace my fans?

Yesterday the X11DPI motherboard's IPMI began registering faults on FAN2, FAN5 and FAN6. Those fans (and only those fans) have been dropping to 300, 400 and 500 RPM and tripping the low-RPM critical threshold, causing ALL of the fans to ramp up to MAX for a second and then settle again to 300 RPM, repeating that cycle over and over.

I've shut down the server and cleaned all of the fans, making sure they are not obstructed and spin okay, but I'm no fan expert and can't tell if they are damaged. They look fine to me. Could they be failing and causing this problem? Perhaps I should replace them, as they've been running non-stop for almost 6 years. I hope it's not a sensor issue in the headers or motherboard.

ipmitool:

The lowest I thought I could set the threshold using ipmitool was 500 RPM. IPMI shows those fans dropping to 300 and sometimes 500 RPM, and that trips the low critical threshold fault, which forces ALL fans to MAX for a couple of seconds. I've not had this happen since I first set up this system in January 2019, when my solution was to lower the threshold from its default down to 500 RPM, but now it seems 500 RPM isn't low enough.

The low-threshold fault is only happening on FAN2 (CPU2 cooler = two Noctua 92mm fans on a Y-splitter to the FAN2 header), and on the two 80mm exhaust fans going to the FAN5 and FAN6 headers.

- FAN2 header goes to the 4-pin CPU2 Noctua cooler fans (a Y-splitter, so both fans are controlled by the FAN2 header):
Noctua NH-D9 DX-3647 4U, Premium CPU Cooler for Intel Xeon LGA3647 (Brown) with 2x Noctua NF-A9 HS-PWM (92mm)

- FAN5 and FAN6 headers go to the two exhaust fans:
2x Noctua NF-A8 PWM, Premium Quiet Fan, 4-Pin (80mm)

The 3x240mm fan wall is daisy-chained to the FANB header, which stays at 800 RPM and above and therefore doesn't trip the fault.

CPU1 stays at about 1800 or 1900 RPM (it's not Noctua-cooled, but uses the less efficient Supermicro cooler with a single 80mm fan on the FAN1 header) and therefore doesn't trip the fault.

The 92mm fan in a vacant PCIe slot, which I use to cool the LRDIMM 64GB memory sticks (they can get above 70C if I don't keep a fan on them), stays above 800 RPM and therefore doesn't trip the fault.


Why is it happening? Why are the fans running so slowly now?

I think it is happening now because I've been migrating ALL of my vSphere guest VMs from this X11DPI Supermicro to another Supermicro running Proxmox. So the load on this X11DPI is almost nothing, except for the ESXi and vCenter Server software running on it and one or two lightweight VMs. The temperature readings are very low, so the fans do not need to run very fast.
 

BennyT

I've finally uploaded all of my guest VMs from the big-dog server (that's what I'm calling the X11DPI-NT with ESXi installed: 2x Xeon 6130 16-core CPUs, 396GB RAM, direct-attached 40.5 TB SSD storage) up to the miniprox server (that's what I'm calling my Supermicro minitower with the Xeon D-1541 8-core, 128GB RAM, 16.5 TB SSD storage).

The reason I'm able to fit 40TB of VMs from ESXi into 16TB of storage on Proxmox is that I'm using thin LVMs in Prox, which are a lot more efficient, as long as I don't make the mistake of overprovisioning. I have to monitor my thin pool usage or it can crash the entire thin pool and all VMs in that pool. I need to prepare a script to monitor it and email me if it gets too close.
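
Something like this is probably all that check needs (a minimal sketch, assuming a working mail command on the Proxmox host; the threshold and address are placeholders):

Bash:
#!/bin/bash
# Warn by email if any thin pool's data usage crosses a threshold.
THRESHOLD=85                      # percent full before warning
MAILTO="admin@example.com"        # placeholder address

# List thin pools only (lv_attr starting with "twi") with their data usage
lvs --noheadings -o vg_name,lv_name,lv_attr,data_percent |
awk '$3 ~ /^twi/ {print $1, $2, $4}' |
while read -r vg lv pct; do
    pct=${pct%.*}                 # drop the decimal part
    if [ "${pct:-0}" -ge "$THRESHOLD" ]; then
        echo "Thin pool $vg/$lv is at ${pct}% data usage" |
            mail -s "Proxmox thin pool warning: $vg/$lv" "$MAILTO"
    fi
done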

Now I'm ready to wipe out ESXi for good and reinstall Proxmox 8.3 onto the Big-Dog.

I'll report back after I'm done or if I have any issues. I'm really loving Proxmox VE and Proxmox Backup Server too.
 

BennyT

vSphere, ESXi, and vCenter are gone forever. Yes! And I'm finally using the onboard M.2 NVMe connector on the X11DPI motherboard for the boot disk of this Proxmox installation.
1742695877243.png

1742696100520.png
I've named it bigdogprox. I'm told I should have used "dawg" instead of "dog". That server used to be the ESXi host.

I haven't migrated any VMs to the bigdog yet. All of my VMs are on another minitower called miniprox1 which I've been learning proxmox with.

I'm considering making a Proxmox CLUSTER so I could use the Proxmox GUI to migrate offline VMs from miniprox1 to bigdogprox.
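
If I go the cluster route, it should just be a couple of pvecm commands (a hedged sketch; the cluster name and IP are placeholders, and the joining node must not have any guests yet, which is why bigdogprox would be the one to join):

Bash:
# On miniprox1 (the node that already holds the guests): create the cluster
pvecm create homelab                 # placeholder cluster name

# On bigdogprox (still empty, so it is allowed to join): join the cluster
pvecm add <miniprox1-ip>

# Verify membership and quorum from either node
pvecm status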

Or, because I've backed up all guest VMs from miniprox1 to a PBS (proxmox backup server), I can restore those backups to the bigdog server using the PBS GUI while keeping guest VMs online on miniprox1.

The only caveat with migrating offline VMs via the CLUSTER method is that I dislike it when my DNS server goes offline. If I migrate offline VMs using the cluster, I'll need to ensure all of the relevant IPs and hostnames are copied into my workstation's C:\Windows\System32\drivers\etc\hosts file, because after I shut down that DNS VM none of the server hostnames will resolve to their IPs.

I might do the PBS backup restore method, because I really need to ensure all of the backups are restorable anyway. I've tested restoring a few backups, but it would be nice to test them all.



Here is my miniprox1 server, which is the one I've used to offload all of my ESXi guest VMs for the past few months. These will be pushed back up to the bigdogprox server. Miniprox1 is a Supermicro 5028D-TN4T minitower system with a D-1541 CPU and 2x 10Gbit RJ45 onboard:
1742697128251.png

And here is my Proxmox Backup Server (PBS) 3.3, where I back up all of the above VMs. This is another Supermicro 5028D-TN4T minitower, just like the one above used for miniprox1, but with less memory installed:
1742697623165.png

I also tried to set up a 'balance-xor' bond across multiple physical NICs to my unmanaged switch, but I think it confused the switch, since balance-xor presents the same MAC address on different ports when bonded. I need to get a good, reliable, easy-to-set-up smart switch. The one I have is okay; I just cannot utilize all of the potential bandwidth of my physical network adapters until I get a smart switch.
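
For my notes, the bond lives in /etc/network/interfaces on the Proxmox host. Something like the following would be the safer shape for an unmanaged switch (NIC names, addresses, and the active-backup mode are assumptions, not my exact config):

auto bond0
iface bond0 inet manual
        bond-slaves eno1 eno2          # assumed NIC names; check with 'ip link'
        bond-mode active-backup        # no LACP needed, avoids the shared-MAC confusion of balance-xor
        bond-miimon 100

auto vmbr0
iface vmbr0 inet static
        address 192.168.1.10/24        # placeholder management address
        gateway 192.168.1.1
        bridge-ports bond0
        bridge-stp off
        bridge-fd 0

With ifupdown2 (the Proxmox default), 'ifreload -a' should apply the change without a reboot. active-backup doesn't aggregate bandwidth, but it won't flap MAC addresses on an unmanaged switch the way balance-xor can.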

Next up will be to migrate the VMs from miniprox1 to bigdog. I'll post here how that goes.

If anyone has any questions on my setup, just ask. I have a 200-page Word document with all of my notes and screenshots, so I can assist with setups if anyone has a question.

Thanks,

Benny