LGA 1700 Alder Lake "Servers"

Jan 3, 2023
55
19
8
In DDR4 and below, ECC used an extra chip that was exposed to the controller: 8 extra bits for every 64 bits of data width. This was a side-band chip that provided storage for the extra bits the CONTROLLER needed to do the ECC; the ECC itself was the controller's responsibility. DDR5 has on-die ECC, and modules that support ECC additionally provide extra data lines so the controller can detect errors in transit. I don't think it is clear which variants are implemented. It would be nice to find a document that describes this more clearly. I am currently searching.
 
Jan 3, 2023
55
19
8
PassMark MemTest86 - Memory Diagnostic Tool - ECC Technical Details currently comments that on-die ECC in DDR5 is self-contained, internal to the module, and completely opaque to the outside:

On-die ECC
On-die ECC is a new scheme introduced for DDR5 memory which is completely self-contained in the DDR5 memory module.

On-die ECC, unlike the above schemes, does not provide end-to-end protection. The purpose of On-die ECC is to protect the integrity of data stored in the memory cells of DRAM arrays; it does not detect or prevent errors that occur during transmission between the memory controller and the memory module. All ECC detection and correction is performed internally within DRAM memory cells; it is completely invisible to the CPU and memory controller.

To provide full end-to-end protection, On-die ECC would need to be used in conjunction with Side-band ECC.

So DDR5 modules that are reporting ECC are performing variants of side-band ECC or Link ECC.

Link ECC
Link ECC is another new scheme introduced for LPDDR5 memory to augment end-to-end protection for systems with hardware constraints.

Link ECC, by itself, does not provide end-to-end protection; it provides protection for errors that occur during transmission on the channel between the memory controller and the DRAM.

On write operations, the memory controller generates and sends the ECC code along with the write data to the DRAM module. The DRAM module receives the write data, generates its own ECC code and verifies whether it matches with the ECC code sent by the memory controller. If necessary, single-bit errors are corrected accordingly.

In contrast to the other schemes, Link ECC does not detect or prevent errors while data is stored in DRAM cells. To provide full end-to-end protection, Link ECC would need to be used in conjunction with Inline ECC.
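
To make the write path described above concrete, here is a minimal Python sketch. It assumes a plain single-error-correcting Hamming code, which is only a stand-in: the actual code used by LPDDR5 Link ECC is not given in the quoted text. The function names (`data_positions`, `check_bits`, `dram_receive_write`) are hypothetical.

```python
def data_positions(m):
    """1-based codeword positions that hold data: every non power of two."""
    positions = []
    pos = 1
    while len(positions) < m:
        if pos & (pos - 1):  # non-zero only when pos is not a power of two
            positions.append(pos)
        pos += 1
    return positions


def check_bits(data):
    """Hamming check value: XOR of the positions of all set data bits."""
    value = 0
    for pos, bit in zip(data_positions(len(data)), data):
        if bit:
            value ^= pos
    return value


def dram_receive_write(data, sent_check):
    """DRAM side: recompute the check value and compare it with the one sent."""
    positions = data_positions(len(data))
    syndrome = check_bits(data) ^ sent_check
    if syndrome == 0:
        return list(data), "ok"
    if syndrome in positions:
        # A single data bit flipped in transit; the syndrome is its position.
        fixed = list(data)
        fixed[positions.index(syndrome)] ^= 1
        return fixed, "corrected data bit"
    if syndrome & (syndrome - 1) == 0:
        # A single check bit flipped in transit; the data itself is intact.
        return list(data), "corrected check bit"
    return list(data), "uncorrectable"


# Controller side: generate the check value and send it with the data.
write_data = [1, 0, 1, 1, 0, 0, 1, 0]
check = check_bits(write_data)

# Simulate one data bit flipping on the channel.
corrupted = list(write_data)
corrupted[3] ^= 1
received, status = dram_receive_write(corrupted, check)
print(status, received == write_data)
```

A code like this corrects single-bit errors only; two flipped bits can be miscorrected, which is one reason real schemes add an extra detection bit.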

What is unclear is how all of this is reported at the operating system level. This document points out that many of the details depend on the CPU (that is, on its embedded memory controller).
 

RolloZ170

Well-Known Member
Apr 24, 2016
5,139
1,546
113
In DDR4 and below, the ECC was an extra chip that was exposed , 8 bits for 64 bits wide
This has not changed in DDR5, but the 64 bits are split into 2x 32 bits, and each half gets 8 added bits for ECC storage.
DDR4 chips can also have on-die ECC to protect against rowhammer or something similar.
They write "on-die ECC" for marketing reasons, but this is heavily confusing for buyers who want ECC RAM.
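
The width arithmetic in this claim can be checked with a short Python sketch. It assumes a textbook SEC-DED (extended Hamming) code applied per transfer, which is an assumption about the coding scheme and says nothing about what any particular module actually implements. `secded_check_bits` is a hypothetical helper name.

```python
def secded_check_bits(data_bits: int) -> int:
    """Minimum check bits for a SEC-DED (extended Hamming) code."""
    r = 0
    # Single-error correction needs 2**r >= data_bits + r + 1.
    while (1 << r) < data_bits + r + 1:
        r += 1
    # One extra overall parity bit adds double-error detection.
    return r + 1


for width in (64, 32):
    print(width, "data bits need", secded_check_bits(width), "check bits")

# Bus widths under the two layouts discussed in this thread:
ddr4_style = 64 + 8        # one 64-bit channel plus 8 ECC bits
ddr5_claim = 2 * (32 + 8)  # two 32-bit subchannels, 8 ECC bits each
print(ddr4_style, ddr5_claim)
```

Under that assumption, 64 data bits need 8 check bits and 32 data bits need 7, so 8 ECC bits per 32-bit subchannel would be enough for per-transfer SEC-DED, giving 80 bits in total versus 72 for the DDR4-style layout.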
 
Jan 3, 2023
55
19
8
I agree the marketing is confusing, but the on-die ECC is there *and required* for DDR5. And yes, it is required to keep the error rates of DDR5 modules comparable to JEDEC standard DDR4 non-ECC modules. Now that I have been reading these technical documents, it is very clear that not having on-die ECC for DDR5 would result in the modules having error rates far higher than those experienced with DDR4, due to the higher frequencies involved. DDR5 with only on-die ECC will report as 64 bits wide. If it is reporting 72 bits or 80 bits, that is a form of side-band, in-band, or link ECC. We simply need to get the technical specifications to eliminate the confusion.

From the specifications there also appear to be several different modes to access the "sub-channels" you are talking about by the memory controller, and that may be adding to the confusion. The lack of clear documentation is not helpful.
 
Jan 3, 2023
55
19
8
Are you saying all of the documentation in multiple places from Micron itself is wrong? I don't think so. This is in their official specs.
 
Jan 3, 2023
55
19
8
You do realize that is a marketing document, right? The documents I have been digging into are the engineering specifications, and I trust the engineering documents more than the high-level ones. I think there is a choice in how to implement the ECC bit width on each of the two subchannels. It will be interesting to see if we can find corresponding spec documentation from Hynix engineering, not the marketing documents.
 

adman_c

Active Member
Feb 14, 2016
257
135
43
Chicago

Jan 3, 2023
55
19
8
That is the exact same part I am using in my workstation with Linux, and it reports as 72 bits. Are you on Windows, and are you using PowerShell or some other tool to look at the width?
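
For anyone comparing widths on Linux, here is a rough Python sketch that pulls the `Total Width` and `Data Width` fields out of `dmidecode -t memory` output. The `SAMPLE` text is made up for illustration and is not output from any machine in this thread, and `memory_widths` is a hypothetical helper name.

```python
import re

# Made-up sample in the style of `dmidecode -t memory` output.
SAMPLE = """
Memory Device
\tTotal Width: 72 bits
\tData Width: 64 bits
\tSize: 32 GB

Memory Device
\tTotal Width: 64 bits
\tData Width: 64 bits
\tSize: 16 GB

Memory Device
\tTotal Width: Unknown
\tData Width: Unknown
\tSize: No Module Installed
"""


def memory_widths(dmidecode_text):
    """Return (total_width, data_width) for each device that reports both."""
    widths = []
    for block in dmidecode_text.split("Memory Device")[1:]:
        total = re.search(r"Total Width:\s*(\d+) bits", block)
        data = re.search(r"Data Width:\s*(\d+) bits", block)
        if total and data:  # empty slots report "Unknown" and are skipped
            widths.append((int(total.group(1)), int(data.group(1))))
    return widths


for total, data in memory_widths(SAMPLE):
    extra = total - data
    if extra:
        print(f"{total}/{data}: {extra} extra bits exposed to the controller")
    else:
        print(f"{total}/{data}: no extra bits exposed")
```

On a real system the text would come from running `dmidecode -t memory` as root. The values are whatever the firmware puts in its SMBIOS tables, so they show what the BIOS reports rather than proving which ECC scheme is active.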
 

RolloZ170

Well-Known Member
Apr 24, 2016
5,139
1,546
113
@James C. Owens
Good findings. OK, then I was indeed wrong.
But I have checked the pinout of the LGA1700 socket, and there are 2x 40 data lines.
Does this mean that on Alder Lake the 72-bit ECC UDIMM does not work, or just that the BIOS program has to be smarter?