pfSense interrupt issue?


gigatexal

I'm here to learn
I've got an old Lenovo/IBM ThinkStation desktop that I've converted into a pfSense router. I noticed this in the top output and got alarmed. I enabled device polling but disabled all the optional offloading in the Advanced tab.

Any help would be welcome.

Specs: Core 2 Duo E8500.
4GB or so of RAM.
80GB SATA HDD.

Code:
last pid: 96142;  load averages:  0.76,  0.24,  0.09  up 0+22:56:07    21:29:47
129 processes: 5 running, 110 sleeping, 14 waiting

Mem: 15M Active, 88M Inact, 144M Wired, 279M Buf, 3173M Free
Swap: 8192M Total, 8192M Free


  PID USERNAME PRI NICE   SIZE    RES STATE   C   TIME    WCPU COMMAND
    9 root     -16 ki-1     0K    16K CPU1    1   1:25  99.27% [idlepoll]
   11 root     155 ki31     0K    32K RUN     1  22.9H  54.88% [idle{idle: cpu1}]
   11 root     155 ki31     0K    32K RUN     0  22.9H  41.89% [idle{idle: cpu0}]
73306 root      22    0   223M 31844K piperd  0   0:00   0.10% php-fpm: pool lighty (php-fpm)
    0 root     -92    0     0K   256K -       0   1:39   0.00% [kernel{em2 que}]
    0 root     -16    0     0K   256K swapin  1   0:39   0.00% [kernel{swapper}]
    0 root     -92    0     0K   256K -       1   0:37   0.00% [kernel{em1 que}]
    5 root     -16    -     0K    16K pftm    0   0:17   0.00% [pf purge]
   12 root     -60    -     0K   224K WAIT    0   0:17   0.00% [intr{swi4: clock}]
17935 root      20    0 12456K  2176K select  1   0:07   0.00% /usr/local/sbin/apinger -c /var/etc/apinge
    4 root     -16    -     0K    32K -       0   0:07   0.00% [cam{scanner}]
   15 root     -16    -     0K    16K -       0   0:05   0.00% [rand_harvestq]
26910 unbound   20    0 55212K 23028K kqread  0   0:04   0.00% /usr/local/sbin/unbound -c /var/unbound/un
   12 root     -88    -     0K   224K WAIT    1   0:03   0.00% [intr{irq17: uhci1 uhc}]
38467 root      52   20 17136K  2424K wait    1   0:03   0.00% /bin/sh /var/db/rrd/updaterrd.sh
52536 root      20    0 21156K  4508K select  0   0:02   0.00% /usr/local/sbin/miniupnpd -f /var/etc/mini
   20 root      16    -     0K    16K syncer  0   0:02   0.00% [syncer]
33058 dhcpd     20    0 24844K 13124K select  1   0:02   0.00% /usr/local/sbin/dhcpd -user dhcpd -group _
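
For anyone wanting to double-check what polling/offload state the NICs are actually in, the interface flags can be read straight from a shell on the box. This is just a quick sketch using the em1/em2 names visible in the top output above; I'm assuming the GUI checkboxes end up toggling the usual FreeBSD ifconfig capability flags, which I haven't verified on this pfSense version.

Code:
# show the active capability flags (look for POLLING, RXCSUM, TXCSUM, TSO4, etc.
# on the "options=" line); em1/em2 are the interface names from the top output
ifconfig em1 | grep -i options
ifconfig em2 | grep -i options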
Adding to the original post:

Here's what vmstat -i says with polling off.

Code:
$ vmstat -i
interrupt                          total       rate
irq17: uhci1 uhci4+               140003          1
irq20: hpet0                    93869792       1126
irq257: em1                      7254111         87
irq258: em2                      8261133         99
Total                          109525039       1314
And with polling on:

Code:
$ vmstat -i
interrupt                          total       rate
irq17: uhci1 uhci4+                 3713          6
irq20: hpet0                      610265       1125
irq257: em1                          414          0
irq258: em2                          583          1
Total                             614975       1134
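
As far as I know, the "rate" column in vmstat -i is an average since boot rather than an instantaneous figure, so right after toggling polling it can lag reality. A rough sketch for eyeballing the current rate is to diff two snapshots taken a known interval apart:

Code:
# take two snapshots ten seconds apart; the per-IRQ deltas divided by 10
# give the current interrupt rate, independent of the since-boot average
vmstat -i > /tmp/irq.before
sleep 10
vmstat -i > /tmp/irq.after
diff /tmp/irq.before /tmp/irq.after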
 

Danic

Member
Mostly I've read that polling should be avoided and isn't supported by all drivers. Maybe vmstat -i could show which driver is generating all the interrupts? This thread may shed some light on the issue.
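
If you want to confirm whether the em driver on that box even advertises polling support, ifconfig can list the supported capabilities. A minimal sketch below, assuming the interface name from your top output; POLLING only shows up on kernels built with DEVICE_POLLING.

Code:
# -m lists supported media and capabilities; compare the "capabilities="
# line (what the driver can do) against "options=" (what is enabled now)
ifconfig -m em1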
 

gigatexal

I'm here to learn
Checking that thread out now. Here's what vmstat -i says with polling off.

Code:
$ vmstat -i
interrupt                          total       rate
irq17: uhci1 uhci4+               140003          1
irq20: hpet0                    93869792       1126
irq257: em1                      7254111         87
irq258: em2                      8261133         99
Total                          109525039       1314
And with polling on:

Code:
$ vmstat -i
interrupt                          total       rate
irq17: uhci1 uhci4+                 3713          6
irq20: hpet0                      610265       1125
irq257: em1                          414          0
irq258: em2                          583          1
Total                             614975       1134
 

Danic

Member
Look into the CPU power states. I had a Core 2 Quad that had serious timekeeping issues when running powerd (in my case, the CPU governor in Linux) while the CPU power-saving features were disabled in the BIOS. Time issues = latency issues? Also, you may not notice time drift because of the NTP server/client in pfSense.

Just for humor and history: when I was having the time issues, my COD4 server would run like the Matrix lobby scene, all the gunfire in slow motion with random speed-ups. Network file transfers would also send data 'faster' than gigabit; I thought I had some amazing data compression going on, but nope, it was all because the box couldn't keep time. The end solution for me was to enable the BIOS CPU power-saving features (SpeedStep and C-states) and force the CPU governor to max performance.
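
If you want to see what the box is actually doing power-wise before touching the BIOS, these read-only sysctls are a reasonable starting point on FreeBSD/pfSense. The names assume the standard cpufreq/ACPI drivers are loaded; nothing here changes state.

Code:
# current CPU frequency and the available P-states
sysctl dev.cpu.0.freq dev.cpu.0.freq_levels
# C-states the CPU offers and how often each one is actually used
sysctl dev.cpu.0.cx_supported dev.cpu.0.cx_usage
# deepest C-state the OS is allowed to enter
sysctl hw.acpi.cpu.cx_lowest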
 

TuxDude

Well-Known Member
My familiarity with BSD is nowhere near where it is with Linux, but I highly suspect that lots of interrupts from HPET are perfectly normal. HPET interrupts drive low-level kernel timekeeping, process accounting, etc., and having 1,000 or more per second is not unusual.
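
For what it's worth, you can check which hardware the kernel picked for its tick and how fast it is ticking; the ~1,100/s hpet0 rate roughly lines up with the usual kern.hz default of 1000. A quick read-only check, using sysctl names from stock FreeBSD which pfSense should inherit:

Code:
# which event timer and timecounter the kernel is using, and the tick rate
sysctl kern.eventtimer.timer kern.eventtimer.choice
sysctl kern.timecounter.hardware kern.hz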
 

TuxDude

Well-Known Member
Does it? The only shot of 'top' output you posted looks to me like the CPU is spending its time polling the NICs, which is unrelated to HPET interrupts, and it is from when you said you "enabled device polling but disabled all the optional offloading". At least, I'm assuming that was the active configuration during that 'top' capture, and it makes sense to me: using polling on the NICs means the OS is constantly asking the NIC "do you have any packets for me?" over and over and over, which would cause high CPU usage (related to the idlepoll task? my lack of BSD familiarity is making me guess at things here).

When you disabled polling you only showed before/after 'vmstat -i' output and not 'top' output, but we can see that the rate of HPET interrupts is virtually identical in either config, while with polling disabled the NICs are now sending some interrupts as well. With polling disabled the OS is no longer repeating "any packets yet?" over and over and is actually idle (the idle{idle: cpuX} tasks in 'top', though again I'm not familiar with BSD process/idle-time accounting), and when a NIC does receive a packet it sends an interrupt to let the OS know.

As a general best practice, I would recommend keeping polling disabled and enabling as much optional offload as you can (with the caveat that some NICs/drivers/firmwares don't implement them all, or have bugs and are unstable with certain options enabled).
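
If it helps, here's roughly how I'd flip those settings from a shell for a quick test before committing to them in the GUI. The interface name and the one-offload-at-a-time approach are just my assumptions, and changes made with ifconfig don't survive a reboot or an interface reconfigure:

Code:
# turn polling off (only meaningful on a DEVICE_POLLING kernel)
ifconfig em1 -polling
# enable offloads one at a time and watch for problems before adding the next
ifconfig em1 rxcsum txcsum
ifconfig em1 tso4
# confirm what actually took effect
ifconfig em1 | grep -i options
# if the link misbehaves, back the last change out, e.g. ifconfig em1 -tso4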
 

gigatexal

I'm here to learn
This is what was worrying:


   11 root     155 ki31     0K    32K CPU0    0  66.8H 100.00% [idle{idle: cpu0}]
   11 root     155 ki31     0K    32K RUN     1  25.4H 100.00% [idle{idle: cpu1}]


Code:
last pid: 24215;  load averages:  0.00,  0.00,  0.00  up 3+10:05:46    09:16:09
130 processes: 3 running, 113 sleeping, 14 waiting

Mem: 17M Active, 119M Inact, 137M Wired, 415M Buf, 3146M Free
Swap: 8192M Total, 8192M Free


  PID USERNAME PRI NICE   SIZE    RES STATE   C   TIME    WCPU COMMAND
   11 root     155 ki31     0K    32K CPU0    0  66.8H 100.00% [idle{idle: cpu0}]
   11 root     155 ki31     0K    32K RUN     1  25.4H 100.00% [idle{idle: cpu1}]
14617 root      21    0   223M 36936K piperd  1   0:00   0.10% php-fpm: pool lighty (php-fpm)
    9 root     -16 ki-1     0K    16K pollid  0  71.7H   0.00% [idlepoll]
   12 root     -60    -     0K   224K WAIT    0   0:52   0.00% [intr{swi4: clock}]
    5 root     -16    -     0K    16K pftm    0   0:41   0.00% [pf purge]
    0 root     -16    0     0K   256K swapin  0   0:40   0.00% [kernel{swapper}]
26696 root      20    0 12456K  2176K select  1   0:18   0.00% /usr/local/sbin/apinger -c /var/etc/apinge
33916 unbound   20    0 87980K 57644K kqread  0   0:17   0.00% /usr/local/sbin/unbound -c /var/unbound/un
    0 root     -92    0     0K   256K -       0   0:12   0.00% [kernel{em2 que}]
   15 root     -16    -     0K    16K -       0   0:11   0.00% [rand_harvestq]
   12 root     -88    -     0K   224K WAIT    0   0:10   0.00% [intr{irq17: uhci1 uhc}]
47026 root      52   20 17136K  2424K wait    1   0:09   0.00% /bin/sh /var/db/rrd/updaterrd.sh
59223 root      20    0 21156K  4496K select  0   0:07   0.00% /usr/local/sbin/miniupnpd -f /var/etc/mini
    0 root     -92    0     0K   256K -       1   0:07   0.00% [kernel{em1 que}]
    4 root     -16    -     0K    32K -       0   0:07   0.00% [cam{scanner}]
   20 root      16    -     0K    16K syncer  0   0:07   0.00% [syncer]
40985 dhcpd     20    0 24844K 13124K select  0   0:06   0.00% /usr/local/sbin/dhcpd -user dhcpd -group _
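
A couple of other read-only views worth watching alongside this, both standard FreeBSD tools so they should be on a stock pfSense install:

Code:
# per-CPU usage summary instead of the combined line
top -P
# one-second samples; the "id" column on the far right is idle percentage
vmstat 1 5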
 

gigatexal

I'm here to learn
Both NICs are Intel. It's fine now: core temps are much more normal, and the box is silent and performant for my use case. I would like the interrupt counts to be lower, but it is what it is. I'll probably have some more time to tinker with it this weekend, but the internet is so vital to my house that I have to schedule some downtime, lol.
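
If the steady hpet0 rate ever becomes worth chasing, the only knob I'm aware of is the kernel tick rate. Treat this as a hypothetical tweak rather than a recommendation, since it trades timer resolution for fewer timer interrupts, and on pfSense custom loader tunables are usually kept in /boot/loader.conf.local so they survive config regeneration:

Code:
# lower the tick rate from the default 1000 (takes effect after a reboot)
echo 'kern.hz=250' >> /boot/loader.conf.local
# verify afterwards
sysctl kern.hz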