Bug in Intel Atom C2000 series processors?

Discussion in 'Processors and Motherboards' started by smithse79, Feb 6, 2017.

  1. EffrafaxOfWug

    EffrafaxOfWug Radioactive Member

    Joined:
    Feb 12, 2015
    Messages:
    358
    Likes Received:
    128
    Assuming this is a SoC-level problem as opposed to a board-level problem, until a new stepping is established, won't RMA'd motherboards just have a newer version of the same potentially vulnerable chip?
     
    #41
  2. Patrick

    Patrick Administrator
    Staff Member

    Joined:
    Dec 21, 2010
    Messages:
    8,657
    Likes Received:
    2,459
    There is a platform-level fix that all of the vendors are using now.
     
    #42
  3. lofie

    lofie New Member

    Joined:
    Jul 12, 2013
    Messages:
    3
    Likes Received:
    0
    hi,

    anyone heard of any C2000 systems which are not affected?
    i.e. is there any way that a typical pc (bios based) board will not suffer from this bug?

    Have a C2758 system, fairly standard "pc like" motherboard running linux in an "appliance".
    e.g. AMI bios, usb, sata ports, boots from sata ssd.

    The MAC address has an OUI of "Lanner Electronics", so I'm guessing they are the actual makers/designers of the board.
    (fields returned by dmidecode have not been set)

    So far the initial response of the appliance manufacturers is that they have not seen any increased failure rates, but they are looking into it. They are continuing to sell this kit; a call to a sales rep confirmed the sales team did not know of the bug's existence, let alone whether it affected their gear.

    Have a call in with the retailer of the system. Will update here if there is any response.

    - L
     
    #43
  4. Patrick

    Patrick Administrator
    Staff Member

    Joined:
    Dec 21, 2010
    Messages:
    8,657
    Likes Received:
    2,459
    @lofie - keep us posted. I can update the main site with the response.

    The failure rate on most devices is not high. Any C2000 device 2016 or older is vulnerable.
     
    #44
  5. smithse79

    smithse79 Active Member

    Joined:
    Sep 17, 2014
    Messages:
    192
    Likes Received:
    31
    If you're keeping a log of responses, here's how it went for me:

    2/7: I contacted vendor, they had not heard anything from SM but would reach out.
    2/7 (later): Vendor heard back from SM that there is a recall exchange for this motherboard (A1SRM-2758F) and they will set up an advanced replacement RMA from SM to me

    2/9: New board arrives late in the day. Too late to be able to swap them out.
    2/10: Swapped motherboards (took less than 30 minutes to unrack and swap)
    2/10: got shipping labels from SM/vendor and sent old board back via UPS

    Completely painless process for me. It was obvious that Windows knew something had changed, but it booted just fine (Server 2016)
     
    #45
  6. mstone

    mstone Active Member

    Joined:
    Mar 11, 2015
    Messages:
    263
    Likes Received:
    63
    How long ago did you purchase? Were there any obvious blue wire fixes on the motherboard? Is the cpu revision different?
     
    #46
  7. djroketboy

    djroketboy New Member

    Joined:
    Sep 2, 2011
    Messages:
    6
    Likes Received:
    0
    I contacted SM directly yesterday. They had me open a support ticket, then an RMA, and it was approved for exchange. It's an A1SRi-2558F. I purchased mine in April.
     
    #47
  8. smithse79

    smithse79 Active Member

    Joined:
    Sep 17, 2014
    Messages:
    192
    Likes Received:
    31
    Mine was purchased August(?) 2015

    Nothing obvious on the motherboard, and I didn't get the stats on the Stepping, etc. It is a fairly critical server and I needed it back up ASAP.
     
    #48
  9. BlueFox

    BlueFox Active Member

    Joined:
    Oct 26, 2015
    Messages:
    177
    Likes Received:
    79
    #49
    Patrick likes this.
  10. Opensupport.it

    Opensupport.it New Member

    Joined:
    Oct 21, 2015
    Messages:
    19
    Likes Received:
    1
    I have a Supermicro A1SAi-2750F board. I contacted support via email about this, and SM Europe (I'm Italian) told me to open an RMA for the product.

    How does the procedure work?

    If I buy other hardware, how can I tell whether the problem is fixed or whether it is a new revision?
    Is there a public hardware revision? (Supermicro has told me nothing...)

    And I don't understand whether the platform is now reliable and fixed.

    <and.. sorry for my english :) >
     
    #50
  11. trumee

    trumee Member

    Joined:
    Jan 31, 2016
    Messages:
    80
    Likes Received:
    5
    I got my A1SRI-2758F back from Supermicro today. The RMA note said "Reported problem not found. Ran AC power on/off reboot test okay. PC check test okay. Boot into windows 8 and ran burn in test pro okay. Test passed". And that was it! I called the RMA centre and was told they have put the Intel fix on the board. However, there doesn't seem to be a visible change in the board. Not sure what to make of this.
     
    #51
  12. BlueFox

    BlueFox Active Member

    Joined:
    Oct 26, 2015
    Messages:
    177
    Likes Received:
    79
    If they fixed the board, you wouldn't really be able to tell. They should just be removing the old CPU, reballing the board, and attaching the replacement CPU.
     
    #52
  13. GaryD9

    GaryD9 New Member

    Joined:
    Feb 15, 2017
    Messages:
    21
    Likes Received:
    3
    If the fix were made with an updated CPU, the CPU stepping would change. My replacement C2558 board has no visible wire jumpers or changes (compared side-by-side with my older board), is still REV 1, and the CPU has the same B0 stepping :

    Origin="GenuineIntel" Id=0x406d8 Family=0x6 Model=0x4d Stepping=8
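
For anyone comparing boards the same way, here is a minimal Python sketch of how an `Id` value like the one above splits into the family, model and stepping fields. It assumes the `Id` is the raw CPUID leaf 1 EAX signature and uses the standard x86 bit layout; the function name `decode_cpuid_signature` is just an illustrative helper, not part of any tool.

```python
def decode_cpuid_signature(sig: int) -> dict:
    """Split a CPUID leaf-1 EAX signature into display family, model, stepping."""
    stepping = sig & 0xF             # bits 3:0
    base_model = (sig >> 4) & 0xF    # bits 7:4
    base_family = (sig >> 8) & 0xF   # bits 11:8
    ext_model = (sig >> 16) & 0xF    # bits 19:16
    ext_family = (sig >> 20) & 0xFF  # bits 27:20

    # The extended family is only added when the base family is 0xF.
    family = base_family + ext_family if base_family == 0xF else base_family
    # The extended model is only prepended for base families 0x6 and 0xF.
    if base_family in (0x6, 0xF):
        model = (ext_model << 4) | base_model
    else:
        model = base_model
    return {"family": family, "model": model, "stepping": stepping}


info = decode_cpuid_signature(0x406D8)
print({name: hex(value) for name, value in info.items()})
# prints {'family': '0x6', 'model': '0x4d', 'stepping': '0x8'}
```

Two boards that decode to the same family, model and stepping carry the same silicon revision as far as CPUID can tell, which is the comparison being made in this post.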
     
    #53
  14. Evan

    Evan Active Member

    Joined:
    Jan 6, 2016
    Messages:
    574
    Likes Received:
    68
    Remember, not all chips are defective, only the ones in a given time range and maybe from a given fab, so maybe yours was not one that will fail early, and they can check that (of course, a utility the user could run would be more sensible).
     
    #54
  15. GaryD9

    GaryD9 New Member

    Joined:
    Feb 15, 2017
    Messages:
    21
    Likes Received:
    3
    Source?

    Everything I've read indicates that ALL the C2xxx chips with the B0 stepping are impacted. Here's what Intel says about it (from http://www.intel.com/content/dam/ww...ion-updates/atom-c2000-family-spec-update.pdf)
    (Oddly, I thought I read something about Intel producing a new stepping to resolve the issue, but I guess I was mistaken... Based on the current document, there is (will be) no new chip/stepping to resolve the issue, and it's ONLY a platform change that works around it.)
     
    #55
  16. Evan

    Evan Active Member

    Joined:
    Jan 6, 2016
    Messages:
    574
    Likes Received:
    68
    Source was Intel about the dates. Early processors are not affected; that may mean before B0 production, it was not stated. The fab was speculation, hence the "?", as I don't know which fabs are used to make them.

    It does not seem clear whether the bug is really a pure design issue and/or a material or material application issue. At least that's the way I interpret it.

    Remember the C2000 has been produced since 2013, and it's been hinted that it's only the chips that are reaching 18 months now, or will reach 18 months soon, that have the issue. What about the chips produced in 2013?

    I actually can't imagine right now what fix they could implement on existing product. Has anybody seen what they add or do?
     
    #56
  17. GaryD9

    GaryD9 New Member

    Joined:
    Feb 15, 2017
    Messages:
    21
    Likes Received:
    3
    Can you please provide a link to something where Intel says that early processors aren't impacted? I'm not doubting that it exists, but it's something I haven't read.

    Here's an article that has some interesting (if useless) info from intel quotes: Intel's Atom C2000 chips are bricking products – and it's not just Cisco hit

    According to that, Intel declined to comment on when the impacted chips started and stopped shipping. However, they are also quoted on that article as stating: "Additionally, Intel will implement and validate a minor silicon fix in a new product stepping that resolves this issue." THAT statement indicates that it wasn't a case of some bad components mixed in with good, or a bad process at a single fab, etc. Rather, that statement seems to indicate that it was a design flaw (either logical or physical) that they can correct with a silicon fix in a new stepping (that doesn't seem to exist yet.) (If it was just a subset of chips with the issue, then they wouldn't need a silicon fix and new stepping...)

    If the above linked/quoted material is accurate, it would be a design issue. (Keeping in mind that I'm including the choice of materials used as part of the overall design.)

    I've seen many references to "18 months." NONE of them from Intel, and none of them are willing to directly connect 18 months with Intel. However, I think it's fair to look past the smokescreen and see the relationship. In that (likely valid) case, the stuff I've seen indicates that the issue is more likely to be a concern after 18 months. It doesn't indicate that it won't occur until then, and I'm sure there are plenty of 3+ year old chips in 24/7 use that never had an issue.

    As for the 2013 chips that HAVE had the issue... I'm sure the majority of those owners were told that they were out of warranty... or in some cases, they got warranty replacements (if the warranty was long enough) without anyone officially relating it to this specific issue. Even now, with everything we think we know now, it STILL hasn't been officially related. Cisco, etc, have all been VERY careful in what they aren't saying.

    That's an interesting question. I'd also love to know what the "platform level fix" could be that doesn't force a new motherboard revision, doesn't force any type of BIOS (or microcode) update, doesn't change the chip, and doesn't leave any clearly visible signs on an older board that supposedly has been "repaired."

    At the very least, I expected to see some sign of hand SMT re-soldering on the replacement board. I don't see anything on either side. (Of course, I don't know what I'm looking for either, and there are a LOT of solder points on a motherboard!) I have to keep in mind that the "fix" might be as trivial as cutting a trace or replacing one of those extremely tiny components on the board. I might never see that even if I did know what to look for (and where to find it.)

    Edit: Here's a page that supposedly answers your "what about 2013 chips?" question: Intel Atom chips have been dying for at least 18 months – only now is truth coming to light (I say "supposedly" because none of the info in that article is directly confirmable.)
     
    #57
    Last edited: Feb 28, 2017
  18. Evan

    Evan Active Member

    Joined:
    Jan 6, 2016
    Messages:
    574
    Likes Received:
    68
    Let me see what I can find written that I am allowed to share.
    I have re-read the emails where I got the info, and given the dates and the wording I would say it looks like early info; the vendor also stated that it mostly affected products in a date range.

    Having said that, a different vendor (Cisco) has informed us that it's a full recall. I could attach that info, but what our company received just references documents already referenced.

    Not the biggest sample size, as we have very few low-end ASAs etc., but as far as I know we have seen no failures. And I know of no single person with first-hand experience of any failures on any platform yet, so I guess they are few and far between.

    I just can't imagine any big vendor reworking product (like soldering on new chips) and being able to keep the failure rate of the repairs low enough compared to factory-shipped items, but I guess maybe they do and will.
     
    #58
  19. mstone

    mstone Active Member

    Joined:
    Mar 11, 2015
    Messages:
    263
    Likes Received:
    63
    everything I've seen points to mass hysteria caused by bad reporting on very few facts.
     
    #59
  20. Opensupport.it

    Opensupport.it New Member

    Joined:
    Oct 21, 2015
    Messages:
    19
    Likes Received:
    1
    I have had a long email correspondence with Supermicro support, and in one mail they replied:
    "....If you purchase another board that came from us it should not have the issue. Otherwise we’re told we can guarantee the problem is not present on motherboards with serial numbers xx173xxxx and xx73xxxx (produced from march)....."
    They declined to give any other info or answers about the mainboard fix.

    My two cents :)
     
    #60
    Evan likes this.