Why are there so many Failed/Bad Xeon Scalable 1st & 2nd Gen CPUs on eBay?


eduncan911

The New James Dean
Jul 27, 2015
648
506
93
eduncan911.com
There seems to be an awful lot of Failed and Bad / For Parts / As-Is eBay sales of these CPUs.

Seeing a failed Intel CPU has been rare in my 24 years on eBay. However, searching for anything like "Intel Xeon Silver", Gold, or similar brings back a bunch of bad CPUs.

Why is this? Is the IPC weak or something? Do they burn out? I always chalked up, "bad CPU" to just be user error by the seller - as they seem to work fine for me. However, the large number of bad here makes me think twice.

I also noticed one T SKU listed as a "Thermal and Long-Life Cycle Support" model that had an extended warranty for 10 years.

Are they seriously that fragile?

What kind of validation testing should be performed to verify the CPU is fully functional, and not "bad"?
 

RolloZ170

Well-Known Member
Apr 24, 2016
5,368
1,615
113
If you look at the listings, the CPUs are mostly physically damaged.
One seller started selling CPUs from scrapped servers, and if they sell, why not sell more?
Many buyers think they are smarter and hope to get them running.
By the way, I bought a Platinum 8175M as defective because a customer had returned it as not working.
The Platinum is working fine, but not on normal Supermicro boards...

T SKU listed as a "Thermal and Long-Life Cycle Support" model that had an extended warranty for 10 years
New to me. Do you have a link to the Intel warranty statement, please?

T = extended lifetime (Intel guarantees to deliver replacements for the next 10 years)
 

eduncan911

The New James Dean
Jul 27, 2015
648
506
93
eduncan911.com
Ah, not a quote. The grant to deliver replacements for 10y was what I was recalling.

Yeah, I saw a few with caps missing. But still, there seems to be a lot more than usual.
 

RedX1

Active Member
Aug 11, 2017
132
144
43
I do not know about the Thermal Failure incidence, but there are mechanical issues too.

The PCB substrate is much thinner than previous CPU generations. The ILM has a 6 Point Heat Sink mounting which introduces bending loads into the PCB. This does not occur with the previous clamp type mechanism with a 4 point Heat Sink.

I suspect this, when coupled with the idiotic Intel installation advice video, results in higher CPU failure rates.

How to Install an Intel® Xeon® Processor into an LGA3647...

Take care with this expensive technology.


RedX1
 

Stephan

Well-Known Member
Apr 21, 2017
942
711
93
Germany
A general word of caution about socket 3647 Xeons out of California, especially the silicon valley area... I bought two 8259CL recently:

1) From ebay seller central_valley_computer_parts_inc, who misrepresented the CPU as a "clean pull"; the picture in the ad was obviously not of the CPU that was sent. The CPU came badly dinged 45 deg on one corner. Currently waiting for a refund from ebay. ebay said 2-3 days and it's now 4 weeks.

2) From echo -n "X" | sha256sum = 500353e9eeae2fa68c49801e30a3b9a48390f8f51194a3a50521dc2d15e9be2a (will elaborate once I have the money back), who sent a dinged and bent-back into shape CPU, even though I explicitly stated that they should not go through with the deal if the CPU they are sending had been damaged previously. Currently en-route back to USA, for a refund. Of course I got charged 120 import tax since this was a USA to EU transaction, which I now have to try and recoup. 50 more EUR are gone forever, for sending it back and Fedex handling fee.

Can you tell I am miffed? Wait for it...

While CPU #1 would not even fit into its plastic carrier any longer, CPU #2 seemed fine at first. So I put into the board 6x32 GB of RAM and Passmark Memtest reported no errors. Then I ran stress-ng
Bash:
nice stress-ng --vm $(nproc) --vm-bytes 86% --vm-keep --vm-populate --vm-madvise willneed --verify -v -t 4h --tz --perf
while simultaneously watching ras-mc-ctl and its rasdaemon for errors:
Bash:
#!/bin/sh
systemctl stop rasdaemon
rm -f /var/lib/rasdaemon/ras-mc_event.db
systemctl start rasdaemon

exec watch -n 5 \
        "ras-mc-ctl --summary | \
        grep -v '^$'; echo \"\"; ras-mc-ctl --error-count; echo \"\"; free -h ; echo \"\"; \
        journalctl -b -n 500 | \
        grep -Ev \"( (systemd|systemd-logind|smbd|dbus-daemon|systemd-networkd|polkitd|sshd)\\[[0-9]+\\]: )|kernel: (cdc_ether|usb) \" | \
        tail -n20"
CPU #2 suddenly produced hundreds of generic memory controller errors. Not specific to a certain DIMM, so their errors stayed all zero. But something else was obviously up. On a second run, system froze solid after ~3 hrs. I was done here.
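As a rough cross-check of what ras-mc-ctl reports, the kernel's EDAC counters can also be summed straight from sysfs. This is only a sketch: it assumes an EDAC driver for the platform is loaded and exposes the usual `mc*/ce_count`, `ue_count` and `ce_noinfo_count` files, and the helper name `sum_edac` is made up for illustration. A non-zero "no DIMM info" count would correspond to errors that cannot be pinned on a specific DIMM.

```shell
#!/bin/sh
# sum_edac COUNTER [ROOT]: add up one counter (e.g. ce_count) across all
# memory controllers under ROOT (default: the kernel's EDAC sysfs tree).
# Hypothetical helper, not part of rasdaemon or ras-mc-ctl.
sum_edac() {
    counter=$1
    root=${2:-/sys/devices/system/edac/mc}
    total=0
    for f in "$root"/mc*/"$counter"; do
        [ -r "$f" ] || continue      # glob did not match, or file unreadable
        total=$(( total + $(cat "$f") ))
    done
    echo "$total"
}

# On a machine without EDAC support these simply print 0.
echo "corrected errors:          $(sum_edac ce_count)"
echo "uncorrected errors:        $(sum_edac ue_count)"
echo "corrected, no DIMM info:   $(sum_edac ce_noinfo_count)"
```

Running it before and after a stress run and comparing the numbers gives a quick pass/fail signal; any increase during the run deserves a closer look.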

On a similar note: if you buy ECC RAM, I recommend running Passmark Memtest (the Pro version, if you have it) to scan for easy-to-spot errors. This was the only memtest that would reliably report DIMM errors from the CPU's memory controller. I also tested Microsoft Windows Memory Diagnostic, which reported nothing (ECC corrected the errors). Freeware Memtest86+ also reported nothing. Under Linux it was hard not to trigger the OOM killer when using stress-ng, so I arrived at 86% through trial and error. Using 1/32 of RAM as zram swap, or a true swap file on the order of a few GB, will help. But it took only a few minutes for rasdaemon and ras-mc-ctl to report weird stuff going on in the CPU/RAM complex. There shouldn't be any errors, not within the 4 hours I ran the test for. These were all Samsung M393A4K40BB2-CTD6Y DDR4 RDIMM 2666 MHz. So two out of 18 were broken, maybe also mishandled by the recycling industry.
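For the swap sizing mentioned above, a minimal sketch of the arithmetic: it reads `MemTotal` from `/proc/meminfo` and prints 1/32 of it. The function names are made up for illustration, and the script only prints a size; actually creating the zram device or swap file is left out because that differs between distributions.

```shell
#!/bin/sh
# mem_total_kib [FILE]: MemTotal in KiB from a meminfo-style file
# (defaults to /proc/meminfo). Hypothetical helper name.
mem_total_kib() {
    awk '/^MemTotal:/ { print $2 }' "${1:-/proc/meminfo}"
}

# swap_kib_for MEMTOTAL_KIB: 1/32 of RAM, the ratio suggested in the post.
swap_kib_for() {
    echo $(( $1 / 32 ))
}

# Only report on systems that expose /proc/meminfo (i.e. Linux).
if [ -r /proc/meminfo ]; then
    total=$(mem_total_kib)
    echo "MemTotal:             ${total} KiB"
    echo "1/32 of RAM for swap: $(swap_kib_for "$total") KiB"
fi
```

For a nominal 6x32 GB (192 GiB) that works out to 6 GiB; the real `MemTotal` is a little lower than the installed capacity, so the printed figure will be slightly smaller.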

Edit, today is July 8th 2022... in case anyone is interested...

In case 1, after weeks of waiting and really pushing ebay agents in their chat for a solution, ebay refunded me. Nothing would have happened without me aggressively querying about the case and asking whether further documentation was needed. It was, but ebay neglected to tell me. Funny how they do send you 10 emails when you buy a pen for 10 cents. The case just sat there, without progress. One agent promised a refund, nothing happened.

In case 2, the X was UNIXSurplus. They did not refund shipping (~130 to+from me) and I had to file 20 pages of paperwork to try and reclaim 150 of import tax. So their choice of doing business, which you can see below, cost me 280 bucks right now. Only 130 if German customs refunds me, which is more like a 50:50 affair. Also, item returned to them on June 27th and 9 days later and only after inquiring, a refund was initiated.
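The hash posted in item 2 works as a simple commitment: the name is fixed in advance without being disclosed, and anyone can later check a revealed name against it. A minimal sketch of that check follows; the function names are made up for illustration, and whether a given spelling reproduces the hash depends on exact capitalisation and punctuation, which has not been verified here.

```shell
#!/bin/sh
# Hash published in the post (the commitment to the seller's name).
committed=500353e9eeae2fa68c49801e30a3b9a48390f8f51194a3a50521dc2d15e9be2a

# sha256_of STRING: hex digest of STRING without a trailing newline,
# equivalent to `echo -n STRING | sha256sum` in the post.
sha256_of() {
    printf '%s' "$1" | sha256sum | cut -d' ' -f1
}

# matches_commitment NAME: succeeds only if NAME hashes to the committed value.
matches_commitment() {
    [ "$(sha256_of "$1")" = "$committed" ]
}

# Try every candidate spelling passed on the command line.
for name in "$@"; do
    if matches_commitment "$name"; then
        echo "match:    $name"
    else
        echo "no match: $name"
    fi
done
```

Saved as, say, `check.sh` (a hypothetical file name), it can be run with several spellings at once, e.g. `sh check.sh "Name One" "name_one"`.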

Case 3 (bad RAM) was a German seller, resolved by swapping for good parts.

There also was a "case 4", consisting of Netapp interposers, from California again, ebay again (will I ever learn): the ad showed the right picture, but the wrong item was sent. 50 items, no less. Refunded by seller after sending what was sent and what was advertised.

Lessons... no return policy from seller, red flag. 10k or 20k of sales of products all over the place and you are not exactly buying a pack of sponges for 1.50 but an off-roadmap 3647 CPU, red flag. Seller refuses to send picture of original item to be sent, red flag. Seller refuses to be called, red flag.

Capture.JPG
 

eduncan911

The New James Dean
Jul 27, 2015
648
506
93
eduncan911.com
So, I was watching this and thinking, "this isn't so bad... Yeah, that's correct... Wow, it does have 6 screws. Wth Intel. Ok still, but not too bad... Umm Humm... Yeah...

Wait, is that... No she's not... No! (Ewrrrr... Ewrrrr . BRRRR. BRRRRRRRRRR. BRRRR.)"

No way! That is just the worst thing I've ever seen, and directly from Intel ?!?! Hahaha.

Oh my... ROFL. I actually scared the family.

That's awful man. And thanks for the advice! Bookmarked for next time (everyone should check out the Bookmark feature of the forums).
 

i386

Well-Known Member
Mar 18, 2016
4,245
1,546
113
34
Germany
Why is this? Is the IPC weak or something? Do they burn out? I always chalked up, "bad CPU" to just be user error by the seller - as they seem to work fine for me. However, the large number of bad here makes me think twice.
If you look at the listings, the CPUs are mostly physically damaged.
Maybe people were surprised by how huge and heavy these CPUs are.
I know I was surprised by how massive the first Threadripper was when I had it in my hands... (And I almost dropped an AMD EPYC on a mainboard because I forgot about the weight)
 

RolloZ170

Well-Known Member
Apr 24, 2016
5,368
1,615
113
The seller may just have a bad test motherboard. And a seller does not have to give any warranty on a defective CPU.
 

bayleyw

Active Member
Jan 8, 2014
302
99
28
In case 2, the X was UNIXSurplus. They did not refund shipping (~130 to+from me) and I had to file 20 pages of paperwork to try and reclaim 150 of import tax. So their choice of doing business, which you can see below, cost me 280 bucks right now. Only 130 if German customs refunds me, which is more like a 50:50 affair. Also, item returned to them on June 27th and 9 days later and only after inquiring, a refund was initiated.
I can confirm that UNIXSurplus returns are miserable. Of the two items I've ever bought from them, one was the wrong model variant and one was defective. In both cases, rather than take the return, they tried to wiggle out of it (I think I had to disassemble the server on one occasion to prove it was the wrong part, and wait for them to run 'internal tests' to prove an HBA was defective in the other).

It's not unethical per se to want additional verification, but pretty off-putting from such a large vendor.
 

eduncan911

The New James Dean
Jul 27, 2015
648
506
93
eduncan911.com
Thank you all for the updates!

Maybe we should start a Blacklist/Boycott/EbayVendorsToAvoid thread where we keep track of things such as:

Vendor, eBay order number, description, Was it Resolved?, etc.

The point of capturing the eBay order number will be to give the Vendor a list of public complaints with order numbers, so they can take action to resolve the issues - and be removed from the blacklist/boycott list if users report favorable resolutions.

Several of us are running into issues with another vendor, deals2day-364, over chassis and HDD problems, and they are giving us the runaround.

They responded with, "no one else is having issues". When I posted that, several people PM'd me their order numbers or personal experiences with deals2day-364 screwing up orders, besides the ones who posted publicly.

Most likely will have to be a spreadsheet or something because a "thread of replies" is not the best place, it's just an open discussion. But that OP can be the placeholder for that spreadsheet, while allowing open discussions.

Name and Shame?
 

Wasmachineman_NL

Wittgenstein the Supercomputer FTW!
Aug 7, 2019
1,880
620
113

Like this?
 

Stephan

Well-Known Member
Apr 21, 2017
942
711
93
Germany
Name and Shame?
Not a fan, things can change too fast with a seller, in either direction, and also within different product categories. I prefer teaching others from my own misjudgements, by giving generic hints what to expect and how to judge a deal. In order not to get screwed by big corp or small corp, who are both betting on you just giving up.

Btw, I bought another 8259CL, this time from China. I know what you are thinking. Case #5... we'll see. ;-) I saw the exact CPU before he/she sent it, there is a return policy, low 100s of specialty CPU sales, 100% rating on ebay, fedexed it, so I guess it might even arrive within a week. Just glad I didn't track the time lost to all the near-fraud and stupidity I had to deal with for this side-project, otherwise I could just have bought three 8280 retail with a huge ass Netapp and called it a day.
 

eduncan911

The New James Dean
Jul 27, 2015
648
506
93
eduncan911.com
Eh, not really. That relies on the single OP to keep everything up to date. I'm thinking of opening a Google Form for you to submit to, with the results showing up on a publicly accessible read-only Google Sheet. Or something along those lines.

As for maintaining the list, we can delegate Modify rights to select trusted forum members that will monitor the thread.

Thereby, no single point of failure (or someone to go out of contact for a while).
 

eduncan911

The New James Dean
Jul 27, 2015
648
506
93
eduncan911.com
Not a fan, things can change too fast with a seller, in either direction, and also within different product categories.
I actually agree with this 100%. It's why I ended that statement with a ❓. :) Because that may not be the best way.

However, this "info" is already available via ebay's public feedback system (which I could tie into the spreadsheet with a plugin, based on seller's username and the order number).

We are just aggregating it to a single lookup list, across several vendors, and providing a long-form complaint to be filed (like the private company BBB). :)

It wouldn't be a place for everyone to complain about anything. You would first have to attempt to resolve the issue with the seller. Only when you reach the end of that journey do you submit your review and results.

That way the seller can see they are being watched, and not to give false statements like, "no one else is having any issues." (Which is what I got a few days ago as a direct lie from a seller)
 

eduncan911

The New James Dean
Jul 27, 2015
648
506
93
eduncan911.com
Thinking a little more, maybe it should be like the BBB:

- someone would open a "Case" against a seller.
- seller sees the case, and laughs their head off.
- complaint closes without a response from seller.

x5 times with a seller, blacklisted. Sales drop by losing a lot of STH sales because buyers have now educated themselves of the risk from this vendor.
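The five-strikes rule could be sketched roughly like this, assuming the cases live in a simple CSV with the fields proposed earlier in the thread (vendor, order number, description, resolved). The function name, the `yes`/`no` values and the sample rows are all made up for illustration, and descriptions containing commas would need a proper CSV parser.

```shell
#!/bin/sh
# blacklisted [THRESHOLD]: read cases as CSV on stdin
# (vendor,order,description,resolved) and print every vendor with at
# least THRESHOLD unresolved cases (default 5). Hypothetical helper.
blacklisted() {
    awk -F, -v limit="${1:-5}" '
        $4 == "no" { unresolved[$1]++ }      # count open cases per vendor
        END {
            for (v in unresolved)
                if (unresolved[v] >= limit) print v
        }
    ' | sort
}

# Made-up sample data, checked with a threshold of 2 to keep it short.
blacklisted 2 <<'EOF'
seller_a,1001,wrong item sent,no
seller_a,1002,damaged CPU,no
seller_b,2001,dead DIMM,yes
EOF
```

Because resolved cases are not counted, a seller who fixes their open complaints drops off the list again, which matches the idea of removing vendors after favorable resolutions.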
 

eduncan911

The New James Dean
Jul 27, 2015
648
506
93
eduncan911.com
I think that is what Heatware.com is for actually.

Except, there isn't a resolution there. Just a review site.

I do like their requirement of $1 payment to sign up, to show you are a real person.
 

Wasmachineman_NL

Wittgenstein the Supercomputer FTW!
Aug 7, 2019
1,880
620
113
lmao wtf heatware is paid now? I think I registered there years ago but I never used it.
 

eduncan911

The New James Dean
Jul 27, 2015
648
506
93
eduncan911.com
lmao wtf heatware is paid now? I think I registered there years ago but I never used it.
Maybe they dropped the $1 requirement?

I registered in 2012 and they had a $1 requirement as proof you are a real person and not a bot.

 

nabsltd

Well-Known Member
Jan 26, 2022
422
284
63
Sales drop by losing a lot of STH sales because buyers have now educated themselves of the risk from this vendor.
This would really only work for items that are truly "server" parts. SATA drives, consumer CPUs, non-ECC RAM, etc., likely wouldn't see much of a dent from losing business from all STH members. Even for true "server" parts, is STH big enough to make a dent?