Datto MD70-Datto (Gigabyte MD70-HB0) BMC issues

proone

New Member
May 2, 2021
3
1
3
Hi everyone.

I'm a long-time reader, first-time poster. So far I've been successful in finding solutions to some of my home server build issues here from questions that other users have asked, but now I am a bit stuck.

I have a Datto MD70-Datto dual socket LGA2011-3 motherboard, which appears to be a re-branded Gigabyte MD70-HB0 motherboard. Apart from the different PCB colour and the missing SMBus connector (pads are there but no pins), the boards appear to be identical. I am even running the latest BIOS available from Gigabyte.

However I am having issues with the BMC. I have installed the latest AST2400 firmware available from Gigabyte for the MD70-HB0 and the BMC fails to initialise. The board POSTs fine, but after first connecting to power I get an error that the BMC failed to initialise.

I have JTAGed into the BMC and it appears to be stuck in a loop: it tries to bring the eth0 interface up, fails, takes it down, tries again and so on.

The network interface does appear to be working, as I am able to flash the firmware through TFTP in u-boot. Flashing through Linux, DOS and Windows also works fine using socflash.

The BMC does boot, as I am able to use ipmitool in Linux to access it. I can reboot, read FRU, sensor data and so on; however, when I run "ipmitool lan print" as root I get invalid channel 0. I get the same for channels 1, 2, 3 and so on, so it looks like the network interface might not be detected by the BMC. Maybe a driver issue with using the one from Gigabyte?
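The channel-by-channel check described above can be scripted rather than typed by hand. This is only a rough sketch: it assumes ipmitool is on the PATH and that a failing channel either returns a non-zero exit code or prints text containing "invalid channel". The function names and the injectable `run` parameter are made up for illustration.

```python
import subprocess


def lan_print_command(channel):
    """Build the argv for `ipmitool lan print <channel>`."""
    return ["ipmitool", "lan", "print", str(channel)]


def probe_channels(channels, run=None):
    """Return {channel: True/False}, True when LAN settings could be read.

    `run` takes an argv list and returns (exit_code, combined_output).
    By default it really invokes ipmitool; pass a fake for dry runs.
    """
    if run is None:
        def run(argv):
            proc = subprocess.run(argv, capture_output=True, text=True)
            return proc.returncode, proc.stdout + proc.stderr

    results = {}
    for channel in channels:
        code, output = run(lan_print_command(channel))
        # Assumption: a rejected channel is reported via the exit code
        # and/or an "invalid channel" message.
        results[channel] = code == 0 and "invalid channel" not in output.lower()
    return results
```

Run as root, `probe_channels(range(16))` would cover every 4-bit IPMI channel number. If every entry comes back False, that matches the symptom above: the BMC answers IPMI requests but exposes no LAN channel at all.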

Datto does not have a download section on their website, or at least not one that is publicly accessible. I bought the board second hand, and it was probably removed from a Datto appliance, so no official support for me.
I did try to email them at support@datto.com but they have switched to a user-based ticketing system, to which only customers have access.

Long story short - I'd appreciate it if anyone would be able to provide the official BMC AST2400 firmware from Datto. It doesn't matter if it's the latest version or not. I would like to try to install that one, and see if it might be using a different driver for the network interface.

I do want to mention that the BMC did work when I purchased the board. I was able to install the 8.88 BMC firmware from Gigabyte through the web interface, and it did work. However, I don't remember exactly when it started failing. It might be that the firmware worked until I disconnected the board fully from power. I also don't know if the update through the web interface also updated u-boot. Since then I have flashed the BMC multiple times, using different methods and parameters (with or without bootloader, with or without wiping user data and so on).

If the stock Datto firmware won't fix the issue, at least it would get me to a more vanilla point where I would have better chances of troubleshooting.

Thank you,
proone
 

andrewbedia

Active Member
Jan 11, 2013
668
207
43
Actual Datto employee here. Please don't start DMing me asking for help--if you're a customer, use the normal support process. Support is only provided for partners with active service. My advice in this forum comes with no warranty implied or otherwise and may actually void your warranty.

The stock bmc firmware for HB0 should actually work on this board as I've done it before. I can't get you stock firmware as I don't have one of those boards handy. I see you've connected JTAG... but I'm assuming you might be referencing the 3-wire UART interface to see the bmc booting up? Either way, it could be a bad flash. If you have a SOP16 clip, you could try manually flashing it with a SPI flash tool like flashrom.
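Before writing anything with a clip, it is common to read the chip twice and check that both dumps are identical, since a poorly seated clip tends to give inconsistent reads. A minimal sketch of that comparison follows. It assumes the two dump files already exist (for example from two runs of flashrom's read option); the function names are made up for illustration.

```python
import hashlib


def sha256_of(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def dumps_agree(first_dump, second_dump):
    """True when two reads of the same chip produced identical images."""
    return sha256_of(first_dump) == sha256_of(second_dump)
```

If `dumps_agree` returns False, reseating the clip and reading again is probably a better next step than flashing.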
 

proone

Hi Andrew, and thank you for the reply.

Don't worry, I won't bother you through direct messages. I honestly appreciate the support and will take whatever I get. As long as I know the Gigabyte fw should work, it's a good starting point for troubleshooting.
I have ordered a SOP16 clip from China, but it will take about a month to arrive. In the meantime, I'd like to see if I can do a bit more troubleshooting. I'm not a beginner at this, but I'm not an expert either.

And yes, by JTAG I mean the UART port; I read the output using PuTTY.

I thought it could be a bad flash, as I am known to skip a step or two from the documentation when I am excited to play with interesting equipment.
Which is why I retried flashing it multiple times, using different methods: from Windows, from DOS, and over TFTP from the u-boot bootloader through UART.

When I use the DOS utility, the first part runs fine up to "Update Flash Chip O.K.", but then when it has to go through "Update Kernel", "Erase User Configuration" and "Update Bootloader" I get a message each time saying "The BMC is proteced, skip to update firmware image!!", as per the photo below.
proteced.jpg

I also get the message to confirm the MAC Address and FRU Data, which, funny enough, are missing. The MAC addresses for the 2 10Gb NICs are showing all zeros in the BIOS, and the FRU data is generic, as in 0123456.... The serial number is still correct, which is curious.

The MAC address for the management port is showing 29:29:29..., and the IP address as 0.32.0.32.
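Values like these look more like filler than a programmed address. As a rough sketch, a heuristic for flagging that could look like the following. This is an assumption on my part and not an IPMI rule: it simply treats any MAC whose six octets are all the same byte (all zeros, all FF, or one repeated filler value) as unprogrammed. The function names are hypothetical.

```python
def parse_mac(text):
    """Parse 'aa:bb:cc:dd:ee:ff' into six raw bytes."""
    parts = text.strip().split(":")
    if len(parts) != 6:
        raise ValueError("expected six colon-separated octets: %r" % text)
    return bytes(int(part, 16) for part in parts)


def mac_looks_unprogrammed(text):
    """Heuristic: every octet identical suggests erased or filler data."""
    octets = parse_mac(text)
    return len(set(octets)) == 1
```

An all-zero address such as the one the BIOS shows for the 10Gb NICs would be flagged, while a normal vendor-assigned address would not.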

I have run into issues with BMCs in the past, with some Supermicro boards, while working in a data centre, so I tried their ipmicfg tool, as I believe it could write the MAC address. However, whenever I run that, I get an error saying that the BMC is either not in idle, or in error state.
When I use ipmitool under Ubuntu, I am able to do almost anything with the BMC, except for anything that's related to networking.

I will try reflashing the chip directly using the SOP16 clip, once it arrives, however I'm afraid that might not get it sorted.

The board works fine without BMC, and I know the BMC is not something I really-really need, but it bugs me having something with a fault, and being unable to fix it. It's stuff like this that keeps me up at night :)

Again, I appreciate your help, even if you are unable to help any further.

Thank you
 

RageBone

Active Member
Jul 11, 2017
364
97
28
So, i'm wildly speculating but i assume there is some issue with either the Realtek Nic for the BMC or the BMC itself on your board.

I have a MD70-HB0 that i bought defective with very weird issues, and one of those issues was that absolutely nothing was happening on the dedicated NIC and the BMC network settings in the BIOS didn't do anything. It appeared to be otherwise working until i borked it by flashing the wrong image. The further plan was to replace the Realtek NIC, but since i made it worse, that got shelved for other projects.

The second experience i got with Giglebyte Server equipment is a Datto XeonD 1521 Board that is behaving very weirdly.
Those boards are plagued with a Postcode 06 Waiting for BMC timeout that then shuts down the board and keeps it off.
Usually it only times out when you instantly cold-power it on and let it time out while the BMC is still booting but sometimes, it just always times out, even if the BMC has booted 5min ago. I haven't yet found the golden bullet to that issue but i speculate that it might be an issue with the solder-balls on the AST2500.
I usually don't believe such nonsense, but this time, it seems that way because pressing down on it leaves me with the impression that it fixes that issue.

Reproducibility is a bitch with that issue on my board, so i can't say for sure; mine randomly decides to **** or unfuck itself, but every time it was ****ed and i pressed on it, it worked for a while. At least i think so.
IF it is indeed an issue with the solder (balls), thermal ****ery should help temporarily. As in Freezing or heating it up. Sadly that is probably why ****ing baking is so popular, that damn blasphemy.
So you could either put it in a plastic bag that you zip shut and put it in a freezer, or you take a hair dryer and blast the BMC or Realtek Nic.

Sorry for rambling
 

proone

Hi RageBone,

Thank you for taking the time to reply. I sympathise with you on the issues you've had with similar boards.

To be honest, I don't think that there is an issue with the BGA. The situation I am in does not point in that direction:
  • the BMC worked fine from when I bought the board until I messed around with it;
  • the system has not overheated, at least not since I got it. I keep it in a Corsair Obsidian 750D Airflow case with more than adequate cooling. The CPUs, under heavy load, rarely reach 50 degrees Celsius;
  • the RJ45 BMC port does work, as I can use it to transfer the firmware image through TFTP while flashing it from the u-boot bootloader, using the UART port on the board to communicate with it. It just stops working once it starts booting the fw image;
  • the MAC addresses for both the management port and the 2x 10GbE network ports have gone missing, which would make them unusable.

The repeating error I get when connected through the UART port is:
OSINET - eth0 is NCSI root, set interface up
OSINET - polling NCSI initial status...
GetNCSINum - WARNING. Id 0 > Get 0 NCSI number, do retryOSINET - eth0 is NCSI root,re-init set interface down
If you are curious, I have attached the entire boot log below. At the start, before the image boots, the bootloader is able to find 2 interfaces (dedicated and shared) as eth0 and eth1. Once the fw image boots, they are not accessible anymore.
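For anyone wanting to quantify the loop from a saved UART capture, here is a small sketch that counts the up/down cycles. It assumes the capture is saved as plain text and that the messages appear exactly as quoted above. The function names and the threshold of 3 are arbitrary choices for illustration.

```python
import re

# Patterns taken from the messages quoted above.
UP_PATTERN = re.compile(r"eth0 is NCSI root, set interface up")
DOWN_PATTERN = re.compile(r"eth0 is NCSI root,\s*re-init set interface down")


def count_ncsi_cycles(log_text):
    """Return (ups, downs): how often eth0 was raised and torn down."""
    ups = len(UP_PATTERN.findall(log_text))
    downs = len(DOWN_PATTERN.findall(log_text))
    return ups, downs


def looks_stuck(log_text, threshold=3):
    """True when eth0 was re-initialised at least `threshold` times."""
    _, downs = count_ncsi_cycles(log_text)
    return downs >= threshold
```

A capture where the down count keeps growing with the length of the log would suggest that NCSI initialisation never completes, as opposed to a one-off retry during boot.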

Thank you
 

Attachments