How do you shut down your VDX?

klui

Active Member
Feb 3, 2019
I have a 6740T, and I read about someone whose system running 7.3 doesn't save its running-config. The poster states that pulling the power to shut the switch down caused the Dcmd database to be corrupted during the next boot. The thread is "saving running config on VDX 6740 logical mode" on the Extreme Networks Support Community.

While my switch doesn't have that log entry about Dcmd database corruption, I have noticed the following when it boots up:
2021/03/12-20:03:42, [HASM-1004], 37217, INFO, VDX6740T, Processor reloaded - Reset.
.....done
done
waiting for server to shut down.... done
server stopped

When I first got the switch, it paused for a minute or two at this point and I thought it had a problem. I decided to wait longer and it continued booting. Today the log showed:
waiting for server to shut down.......................Sat Mar 27 05:01:15 GMT 2021 :: Confd: Waiting for Dcmd to become ready...
........................................ failed

But it continued booting after these messages.
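
For anyone who wants to check a captured console log for the same messages, here is a rough Python sketch. It only looks for the marker strings quoted above; the function name and the summary keys are my own invention, and I am assuming the log was saved as plain text from a serial console session. It does not tell you whether the Dcmd database is actually corrupted, only whether these lines appeared.

```python
import re

# Marker strings copied from the console output quoted in this post.
SHUTDOWN_WAIT = "waiting for server to shut down"
DCMD_WAIT = "Confd: Waiting for Dcmd to become ready"


def scan_boot_log(text):
    """Summarize the Dcmd-related markers found in a captured boot log.

    Hypothetical helper: the keys below are illustrative, not anything
    the switch itself reports.
    """
    summary = {
        "shutdown_wait_dots": 0,   # longest run of dots after the wait message
        "dcmd_wait_seen": False,   # Confd reported waiting for Dcmd
        "failed_seen": False,      # a "..... failed" line appeared
    }
    for line in text.splitlines():
        if SHUTDOWN_WAIT in line:
            tail = line.split(SHUTDOWN_WAIT, 1)[1]
            dots = re.match(r"\.*", tail).group(0)
            summary["shutdown_wait_dots"] = max(
                summary["shutdown_wait_dots"], len(dots)
            )
        if DCMD_WAIT in line:
            summary["dcmd_wait_seen"] = True
        if re.fullmatch(r"\.*\s*failed", line.strip()):
            summary["failed_seen"] = True
    return summary
```

Read the saved log into a string, pass it to scan_boot_log(), and compare the summary between boots. A longer run of dots is only a rough proxy for a longer wait.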

Anyway, I found a blog where someone recommended going into foscmd (bash shell) and manually shutting down the switch to prevent Dcmd corruption. I tried it, but it caused a boot loop. Here's the output of that shutdown, which somehow caused HA to go out of sync:
sw0# unhide foscmd
Password: ********
sw0# fos bash | no
bash-2.04# /fabos/bin/shutdowndcmdb
2021/03/27-05:11:53 : shutdowndcmdb : Shutting Down Database ...... (New)
2021/03/27-05:11:53 : shutdowndcmdb : Directly shut down Ccmd postgres service
pg_ctl: PID file "/etc/fabos/Ccmd/WaveDatabase/postmaster.pid" does not exist
Is server running?
2021/03/27-05:11:53 : shutdowndcmdb : Calling DcmClient for Dcm
2021/03/27-05:11:57 : shutdowndcmdb : Completed
bash-2.04#
bash-2.04# sync
bash-2.04# shutdown -h now
Broadcast message from root (pts/0) Sat Mar 27 05:12:05 2021...
The system is going down for system halt NOW !!
INIT: Switching to runlevel: 0
INIT: Sending processes the TERM signal
2021/03/27-05:12:05, [SEC-3022], 467, SW/0 | Active, INFO, sw0, Event: logout, Status: success, Info: Successful logout by user [admin].
2021/03/27-05:12:07, [HASM-1101], 468, SW/1 | Standby, WARNING, VDX6740T, HA State out of sync.
Received SIGTERM, arpd exiting...
Received SIGTERM, bgpd exiting...closing socket with nsm
closing socket with nsm
Disabling wdt: Sat Mar 27 05:12:15 GMT 2021
Unmounting all filesystems: Sat Mar 27 05:12:15 GMT 2021
umount2: Device or resource busy
umount: /mnt: device is busy.
(In some cases useful info about processes that use
the device is found by lsof(8) or fuser(1))
umount2: Device or resource busy
umount: /tmp: device is busy.
(In some cases useful info about processes that use
the device is found by lsof(8) or fuser(1))
umount2: Device or resource busy
umount: /: device is busy.
(In some cases useful info about processes that use
the device is found by lsof(8) or fuser(1))
mount: / is busy
The system is halted Sat Mar 27 05:12:18 GMT 2021
Power down.

After this the system wouldn't boot properly, stopping at:
cpu0/0: failover_register() - registering notifiers...
Fman microcode 51 45 46
FMAN microcode UC size 0x1b64
default MII is 0xc10ac000 for tsec0
gtg_init:Coldboot case
gtg_init(1):Ring bf4f6800 Peer Ring becf6800
Uboot wdt counter value: 0
VSD Created with major number vsd_probe = 254
disk->start 0 cmd->start 0 nstart 0 disk name /vsmgr/vd@usb0/vda00
create_new_partition_table part_no 0 no_of_parts 2
disk->start 7665663 cmd->start 0 nstart 7665663 disk name /vsmgr/vd@usb0/vda01
create_new_partition_table part_no 2 no_of_parts 2
Status:
Status:
Status:
Status:
HV> INFO: task swapper:1 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
******VSD debug data start*******
.
.
.
I had to reinstall. Now when I shut down, I issue reload, wait until u-boot shows "Hit ESC to stop autoboot," press ESC, and then disconnect power. The next startup no longer shows "waiting for server to shut down". The shutdown command is really convenient because it powers off the switch, but the PSU still runs afterward.
2021/03/27-05:51:19, [HASM-1004], 4, INFO, VDX6740T, Processor reloaded - Reset.
Sat Mar 27 05:54:48 UTC 2021 :: Confd: Waiting for Dcmd to become ready...
Services starting COLD recovery
CHASSIS RECOV_ACTIVE
Activating Route Profile [RTPROFILE_TYPE_DEFAULT], ecmp [0]
Activating HW Profile [HWPROFILE_TYPE_DEFAULT]
Activating KAP Profile [DEFAULT]

Not waiting for whatever server to stop reduces the boot time from an eternity to an eternity minus 2 minutes. Boot time is still very long.