Weird Networking / Firewall Problem


T_Minus

Build. Break. Fix. Repeat
Feb 15, 2015
7,650
2,066
113
One of our web servers (CentOS / cPanel/WHM with CSF/LFD) started blocking connections after the latest CSF/LFD update a couple of weeks ago. I haven't applied the most recent update for fear of some other issue popping up.

Here's a quick backstory to catch up to the current issues.

What happened was our API calls to UPS were going unanswered... we'd post for a rate quote and it would just time out. I spent forever diagnosing it, only to find out that disabling the firewall (CSF/LFD) fixed it. Obviously they weren't using just 1 IP, so I added a handful of the FQDNs that 'ups.com' was using (some of which are Akamai) to CSF, turned it back on, and bam, it works just fine again. I read over my rules and configuration file and can't figure out why the reply from UPS was being blocked in the first place, or what was 'new' in the update that caused this. I had not touched the configuration in well over a year and everything had been working perfectly normally.
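Since some of those hostnames sit behind Akamai, the addresses they resolve to can change over time, which matters if the allow list ends up holding resolved IPs. Here is a minimal sketch for listing what a hostname resolves to right now, so repeated runs can be compared. The function name `resolve_all` and the injectable `resolver` argument are my own for illustration, and port 443 is an assumption about how the API is reached.

```python
import socket


def resolve_all(hostname, resolver=socket.getaddrinfo):
    """Return the sorted, de-duplicated IPs a hostname resolves to right now."""
    try:
        results = resolver(hostname, 443, proto=socket.IPPROTO_TCP)
    except socket.gaierror:
        # Name did not resolve; report it as an empty list rather than crashing.
        return []
    # Each entry is (family, type, proto, canonname, sockaddr); sockaddr[0] is the IP.
    return sorted({info[4][0] for info in results})
```

Calling `resolve_all("www.ups.com")` a few times across a day (the hostname is just an example, use whichever ones the API client actually calls) should show whether the set of addresses moves around.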

I was happy it was fixed, and moved on.


A couple of days later an ecommerce store on the same server had around 10 orders come through duplicated over a 24-hour period that had around 115 total orders. Either the store allowed users to submit their order twice, or it timed out and they re-submitted it. Once again, this was due to not getting a reply back from a 3rd-party API. These orders were never updated and their payment status was 'pending' within the store, but on PayPal and Authorize.net the payments went through just fine. It was not just one processor: both payment methods were affected, while over 100 perfectly fine orders went through during the same 24 hours, many of them within minutes of the 'bad' orders.

We turned on a bunch of logging, reached out to our software vendor for their thoughts, talked to PayPal, Authorize.net, and our network provider, looked for net splits, etc... no issues. No errors in the log files, no geo/location connection between shoppers, etc. After a couple of days of heavy investigation and NO more problems after that 24-hour period, we moved on.

Fast forward to today: it occurred again over the weekend, but this time it happened to 2 stores on the same server. One uses PayPal to accept credit cards on its own site; the other uses PayPal where you have to go to PayPal to pay, and also uses Authorize.net. The problem (duplicate orders) occurred on all 3 payment methods.

So, it's back again, for a short time period, and all is good again now, but I'm determined to nail this problem.

I've not been able to replicate it, but I'm starting to think there's a rate limit / connection limit setting I'm missing someplace, and that during heavy shopping we may be blocking some of these payment processors from sending us their data.
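One way to test that theory is to count what the firewall actually dropped during the bad window and see whether any payment processor addresses show up. Below is a rough sketch that tallies dropped-packet log lines by source IP and destination port. It relies on the `SRC=` and `DPT=` fields that iptables LOG lines carry; the "Blocked" marker text and the helper name `count_blocked` are assumptions, so adjust the marker to whatever the firewall really writes on this server. If processor IPs do appear, csf.conf settings such as CT_LIMIT, CONNLIMIT and PORTFLOOD would be the first places to look.

```python
import re
from collections import Counter

# iptables LOG lines include SRC=<ip> and DPT=<port> fields.
SRC_RE = re.compile(r"\bSRC=(\S+)")
DPT_RE = re.compile(r"\bDPT=(\d+)")


def count_blocked(lines, marker="Blocked"):
    """Count dropped-packet log lines per (source IP, destination port)."""
    counts = Counter()
    for line in lines:
        if marker not in line:
            continue  # not a firewall drop entry (marker text is an assumption)
        src = SRC_RE.search(line)
        if not src:
            continue
        dpt = DPT_RE.search(line)
        port = int(dpt.group(1)) if dpt else None
        counts[(src.group(1), port)] += 1
    return counts
```

Feeding it the lines of the kernel log for the affected weekend and printing `count_blocked(lines).most_common(20)` gives a short list of sources to compare against the processors' published IP ranges.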

At this point I'm looking for some other ideas as to what to investigate next.
 

T_Minus

Ugh, it may be a stupid software bug we reported months ago coming back.
 

T_Minus

Looks like there's bad error handling in this part of the software: when an e-mail connection failed, it halted, which caused customers to resubmit the page and left payment data and other pieces of data unrecorded. This appears to be 1 issue; the other I'm not sure about yet. Oh, the fun of troubleshooting other people's code, 3rd-party connections, and APIs :D
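To illustrate the kind of fix I'd expect for that first issue, here is a minimal sketch, not the vendor's actual code: record the payment first, then treat the confirmation e-mail as a non-fatal step so a mail failure can't stall the order page. All names here (`complete_order`, `record_payment`, `send_confirmation_email`, the `email_pending` flag) are hypothetical.

```python
import logging

logger = logging.getLogger("orders")


def complete_order(order, record_payment, send_confirmation_email):
    """Persist the payment result first, then try the e-mail without letting it abort."""
    # Recording the payment must succeed; real failures here should propagate.
    record_payment(order)
    try:
        send_confirmation_email(order)
    except Exception:
        # Mail problems are logged and flagged for a later retry,
        # but the customer still gets a completed order page.
        logger.exception("confirmation e-mail failed for order %s", order["id"])
        order["email_pending"] = True
    return order
```

The ordering is the point: with the payment saved before the mail attempt, a dead SMTP connection leaves a flagged order instead of a 'pending' one that the customer resubmits.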