VM Freezing with 100% CPU in Proxmox 8.0.3 after about 5 days - anyone else?

discarn8

New Member
Oct 30, 2023
Hello All:

I've got a Debian VM that repeatedly freezes with the CPU at 100% on Proxmox 8.0.3 after about 5 days of uptime. Is anyone else experiencing this? The VM [running motioneye] runs fine for about 5 days, and then I get an alert [nagios] that it is down. Checking the VM in PVE shows the CPU pegged at 100% and memory at 98%. I can get a login prompt at the Proxmox console, but I am unable to type. Stopping the VM and starting it again brings back the feng shui. I also occasionally receive [nagios] alerts that ping has failed on the hypervisor itself, about once every two weeks or so.

DETAILS
---------------------------------------------------------------------------
  • All updates are current, on both the hypervisor and the VM
  • No HA
  • Running on a Dell OptiPlex 7060 Micro, Intel i7-8700T @ 2.4 GHz
  • 10 vCPUs
  • 28 GB RAM
  • BIOS: Default [SeaBIOS]
  • Display: Default / Headless
  • Machine: Default (i440fx)
  • SCSI Controller: VirtIO SCSI single
  • HD: local, qcow2, iothread=1
  • Network: virtio, bridge vmbr0, firewall=1
  • Start at boot: Yes
  • Start/Shutdown order = any
  • ProxMox Kernel: Linux 6.2.16-3-pve #1 SMP PREEMPT_DYNAMIC PVE 6.2.13-3
  • PVE Manager Ver: pve-manager/8.0.3
  • ProxMox HD: NVMe, LVM partitioned
  • VM OS Type: Linux 6.x - 2.6 Kernel, Debian GNU/Linux 11 (bullseye)
  • Boot Order: scsi0, ide2, net0
  • ACPI: Yes
  • KVM hardware virtualization: Yes
  • Freeze CPU at startup: No
  • Use local time for RTC: Default
  • RTC start date: now
  • SMBIOS settings [type1]
  • QEMU Guest Agent: Enabled
  • Protection: No
  • Spice Enhancements: None
  • VM State Storage: Auto
  • Stand-Alone, No cluster
Thanks in advance
 

vudu

Member
Dec 30, 2017
Seems it's a known bug. I'll see if I can find you the info.

Edit: Discussed here.

Check the earlier posts. It seems that upgrading to the testing 6.5.x kernel should suffice as a short-term solution.
 

discarn8

New Member
Oct 30, 2023
Awesome! Thank you for the quick response. I'll give that a shot and then update the thread in about a week or so after letting it cook.

Thanks again
 

jakerouse

New Member
Oct 23, 2023
Seems like you are facing performance issues with your Debian VM on Proxmox, and troubleshooting such problems can involve checking various aspects. Here are some suggestions:

  1. Resource Usage Analysis:
    • Monitor resource usage within the VM using tools like top or htop to identify which process is consuming the most CPU and memory.
    • Check Proxmox's resource graphs to see if there are any spikes in resource usage during the freeze periods.
  2. Logs and Diagnostics:
    • Review system logs within the VM (/var/log/syslog, /var/log/messages, etc.) for any error messages or warnings.
    • Examine Proxmox logs (/var/log/pveproxy.log, /var/log/qemu-server/<vmid>.log, etc.) for relevant information during the problematic periods.
  3. Update VirtIO Drivers:
    • Ensure that VirtIO drivers are up to date within the VM. Outdated or incompatible VirtIO drivers can lead to performance issues.
  4. Check for I/O Bottlenecks:
    • Analyze storage performance, as high I/O wait times could cause the CPU to spike. Consider using tools like iostat to monitor disk I/O.
  5. Guest VM Configuration:
    • Adjust the number of virtual CPUs allocated to the VM. Having too many virtual CPUs for the workload can sometimes lead to inefficiencies.
    • Experiment with different machine types and display options in Proxmox to see if there are any improvements.
  6. Consider VM Ballooning:
    • If memory usage is consistently high, consider adjusting the VM's memory allocation or implementing memory ballooning to dynamically adjust memory usage.
  7. Update Proxmox:
    • Ensure that Proxmox and its components are updated to the latest version. Sometimes, performance issues can be resolved with software updates.
  8. Hardware Issues:
    • Check the physical hardware for any issues, such as faulty RAM or disk problems. Run diagnostics on the host machine to rule out hardware problems.
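
The first two suggestions only help if the data is captured before the VM locks up, since the console is unusable afterwards. Here is a minimal sketch of a snapshot logger to run inside the Debian guest. The function names and the log path /var/log/freeze-watch.log are made up for illustration, and it assumes a procps `ps` (the Debian default).

```sh
#!/bin/sh
# Hypothetical helper, not part of Proxmox or Debian: periodically record
# load, memory use, and the busiest processes so there is something to read
# after the VM has been stopped and restarted.

# Print used memory as a whole-number percentage (truncated).
# Reads a meminfo-style file (default: /proc/meminfo).
mem_used_pct() {
    awk '/^MemTotal:/     { total = $2 }
         /^MemAvailable:/ { avail = $2 }
         END { if (total > 0) printf "%d\n", (total - avail) * 100 / total }' \
        "${1:-/proc/meminfo}"
}

# Append one timestamped snapshot to the given log file.
snapshot() {
    log=${1:-/var/log/freeze-watch.log}   # assumed path, pick your own
    {
        printf '=== %s load=%s mem_used=%s%%\n' \
            "$(date '+%F %T')" \
            "$(cut -d' ' -f1-3 /proc/loadavg)" \
            "$(mem_used_pct)"
        # Header line plus the five processes with the highest CPU share
        # (as ps reports it, averaged over each process's lifetime).
        ps -eo pid,pcpu,pmem,comm --sort=-pcpu | head -n 6
    } >> "$log"
}
```

Calling `snapshot` once a minute from cron or a systemd timer should leave a trail right up to the freeze, and the last few entries would show what was climbing.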
 

SnJ9MX

Active Member
Jul 18, 2019
lol thanks ChatGPT
 

discarn8

New Member
Oct 30, 2023
I finally installed the 6.5 kernel [linux-image-unsigned-6.5.0-060500-generic_6.5.0] and modules [linux-modules-6.5.0-060500-generic_6.5.0] a couple of days ago, and disabled NUMA balancing [echo 0 > /proc/sys/kernel/numa_balancing] for good measure. Now waiting for the 5-6 day mark, which is when it usually freezes.
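
For anyone repeating this test, here is a minimal sketch for confirming that both changes are actually in effect after the reboot. The function names are hypothetical; the only real interfaces used are `uname -r` and the /proc/sys/kernel/numa_balancing file mentioned above.

```sh
#!/bin/sh
# Hypothetical helpers for checking the kernel version and NUMA balancing.

# Succeeds if the kernel release (default: the running kernel, from uname -r)
# is at least MAJOR.MINOR.  Usage: kernel_at_least 6 5 [release-string]
kernel_at_least() {
    rel=${3:-$(uname -r)}
    major=${rel%%.*}            # "6.5.0-060500-generic" -> "6"
    rest=${rel#*.}              # -> "5.0-060500-generic"
    minor=${rest%%[!0-9]*}      # -> "5"
    [ "$major" -gt "$1" ] || { [ "$major" -eq "$1" ] && [ "$minor" -ge "$2" ]; }
}

# Print the NUMA balancing state: disabled, enabled, or unavailable.
# Reads the given file (default: /proc/sys/kernel/numa_balancing).
numa_balancing_state() {
    f=${1:-/proc/sys/kernel/numa_balancing}
    if [ ! -r "$f" ]; then
        echo unavailable
    elif [ "$(cat "$f")" = "0" ]; then
        echo disabled
    else
        echo enabled
    fi
}

# Example:
# kernel_at_least 6 5 && echo "kernel OK: $(uname -r)"
# numa_balancing_state
```

Note that a value written with echo under /proc/sys does not persist across a reboot, so if the new kernel was booted after the echo, the setting is back at its default. Putting `kernel.numa_balancing = 0` in a file under /etc/sysctl.d/ is the usual way to make it stick.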

Thanks again, vudu
 

discarn8

New Member
Oct 30, 2023
And... that would be a negative. VM just froze. Had to stop and restart it.

OK, well - I know where the Proxmox forum is - I'll redirect my focus over there and see if I can find a resolution.

Thanks ALL [even you ChatGPT] for the help