STH Colocation Power and Networking Overhaul Today

Patrick

Administrator
Staff member
Dec 21, 2010
11,908
4,871
113
Some folks here may have noticed that STH had a few short bits of downtime today. Saturdays are slow at STH, so they make a good maintenance window.

A few items got fixed today:
  1. The main 1GbE switches were old HP V1910-24G units, recycled from the 2013-era STH architecture. These were removed in favor of 48-port 1GbE / 4x 10Gb SFP+ Dell switches. We needed more ports, so this was an easy call.
  2. We have had a zero U PDU that has been on life support since earlier this year when there was a major power event. Two of the ports did not work.
    1. Moved to two new APC 1U PDUs in the hosting rack
    2. Added an ATS PDU for single PSU architecture items
  3. Update Saturday!
The new PDU infrastructure gives us two power feeds from different facility outlets. It is not perfect, but it is much more robust than the life-support PDU situation we have had. These PDUs have been installed for a few months, but a few items still needed to swap over, and one of our feeds was being used by the life-support PDU, so we needed to hard power off and replug a few items. All of the main hosting nodes are dual-PSU servers, so they never went down during the process.
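For anyone planning a similar swap, here is a rough sketch of the "what goes dark if one feed drops" check. Everything in it is hypothetical: the device names, the feed labels "A" and "B", and the helper functions are made up for illustration and are not the actual STH rack layout. It also treats an item behind the ATS PDU as reachable from both feeds and ignores the transfer time.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Device:
    name: str
    feeds: frozenset  # power feeds that can keep this device up


def survives(device, failed_feed):
    """True if at least one feed other than the failed one powers the device."""
    return any(feed != failed_feed for feed in device.feeds)


def outage_report(devices, failed_feed):
    """Names of devices that would lose power if failed_feed went away."""
    return sorted(d.name for d in devices if not survives(d, failed_feed))


# Hypothetical rack layout, for illustration only.
rack = [
    Device("hosting-node-1", frozenset({"A", "B"})),         # dual PSU, one per feed
    Device("single-psu-box-on-ats", frozenset({"A", "B"})),  # single PSU behind the ATS
    Device("item-still-on-old-pdu", frozenset({"A"})),       # not yet swapped over
]

for feed in ("A", "B"):
    print(f"Feed {feed} down -> affected: {outage_report(rack, feed)}")
```

Anything that shows up in the report for a feed is an item that needs a planned hard power off before that feed is touched.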

A big miss of the day: I became seemingly the 100 millionth person to forget to save a VLAN configuration to flash. When a switch was power cycled, the VLANs were not set, and all heck broke loose for about 30 seconds.
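One way to catch this before a power cycle is to diff the running configuration against the saved startup configuration. The sketch below assumes you have already pulled both configs off the switch as plain text; the sample config lines and the `unsaved_changes` helper are hypothetical, and it does not talk to any real switch. It also assumes `!` marks comment lines, which is common but not universal.

```python
import difflib


def unsaved_changes(running, startup):
    """Return ndiff-style lines present in only one of the two configs."""
    def normalize(text):
        # Drop blank lines and '!' comment lines; keep indentation intact.
        return [
            line.rstrip()
            for line in text.splitlines()
            if line.strip() and not line.lstrip().startswith("!")
        ]

    diff = difflib.ndiff(normalize(startup), normalize(running))
    # "+ " = only in running (would be lost on reboot), "- " = only in startup.
    return [line for line in diff if line.startswith(("+ ", "- "))]


# Hypothetical sample configs, for illustration only.
startup = "vlan 1\ninterface Gi1/0/1\n switchport mode access\n"
running = (
    "vlan 1\nvlan 20\n name hosting\n"
    "interface Gi1/0/1\n switchport mode access\n switchport access vlan 20\n"
)

changes = unsaved_changes(running, startup)
for line in changes:
    print(line)
```

If the list is not empty, save before touching power. On many switch CLIs that is something like `copy running-config startup-config`, but the exact command depends on the switch OS.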

Overall, I am still not 100% happy, but the new infrastructure will hopefully let me sleep better.
 

Jeggs101

Well-Known Member
Dec 29, 2010
1,484
222
63
I saw it go down, but then I reloaded and it worked. I thought it was on my end.
 

Patrick

Good point. IPv6 had been working, but we ran into a few issues. You will likely see it return soon. To keep the site stable, we are in a period where we will not be making this kind of change.