Nice job.
Linux gives these errors. It has been said that this is common with X79 and X99 systems. So is there something missing?
dmesg |grep EDAC
[ 8.584543] EDAC MC: Ver: 3.0.0
[ 11.885601] EDAC sbridge: Seeking for: PCI ID 8086:2fa0
[ 11.885609] EDAC sbridge: Seeking for: PCI ID 8086:2fa0
[ 11.885613] EDAC sbridge: Seeking for: PCI ID 8086:2f60
[ 11.885616] EDAC sbridge: Seeking for: PCI ID 8086:2f60
[ 11.885619] EDAC sbridge: Seeking for: PCI ID 8086:2fa8
[ 11.885623] EDAC sbridge: Seeking for: PCI ID 8086:2fa8
[ 11.885626] EDAC sbridge: Seeking for: PCI ID 8086:2f71
[ 11.885629] EDAC sbridge: Seeking for: PCI ID 8086:2f71
[ 11.885632] EDAC sbridge: Seeking for: PCI ID 8086:2faa
[ 11.885635] EDAC sbridge: Seeking for: PCI ID 8086:2faa
[ 11.885638] EDAC sbridge: Seeking for: PCI ID 8086:2fab
[ 11.885642] EDAC sbridge: Seeking for: PCI ID 8086:2fab
[ 11.885645] EDAC sbridge: Seeking for: PCI ID 8086:2fac
[ 11.885649] EDAC sbridge: Seeking for: PCI ID 8086:2fad
[ 11.885654] EDAC sbridge: Seeking for: PCI ID 8086:2f68
[ 11.885658] EDAC sbridge: Seeking for: PCI ID 8086:2f68
[ 11.885660] EDAC sbridge: Seeking for: PCI ID 8086:2f79
[ 11.885664] EDAC sbridge: Seeking for: PCI ID 8086:2f79
[ 11.885667] EDAC sbridge: Seeking for: PCI ID 8086:2f6a
[ 11.885671] EDAC sbridge: Seeking for: PCI ID 8086:2f6a
[ 11.885673] EDAC sbridge: Seeking for: PCI ID 8086:2f6b
[ 11.885677] EDAC sbridge: Seeking for: PCI ID 8086:2f6b
[ 11.885679] EDAC sbridge: Seeking for: PCI ID 8086:2f6c
[ 11.885684] EDAC sbridge: Seeking for: PCI ID 8086:2f6d
[ 11.885688] EDAC sbridge: Seeking for: PCI ID 8086:2ffc
[ 11.885691] EDAC sbridge: Seeking for: PCI ID 8086:2ffc
[ 11.885695] EDAC sbridge: Seeking for: PCI ID 8086:2ffd
[ 11.885698] EDAC sbridge: Seeking for: PCI ID 8086:2ffd
[ 11.885701] EDAC sbridge: Seeking for: PCI ID 8086:2fbd
[ 11.885705] EDAC sbridge: Seeking for: PCI ID 8086:2fbd
[ 11.885707] EDAC sbridge: Seeking for: PCI ID 8086:2fbf
[ 11.885711] EDAC sbridge: Seeking for: PCI ID 8086:2fbf
[ 11.885718] EDAC sbridge: Seeking for: PCI ID 8086:2fb9
[ 11.885723] EDAC sbridge: Seeking for: PCI ID 8086:2fb9
[ 11.885725] EDAC sbridge: Seeking for: PCI ID 8086:2fbb
[ 11.885730] EDAC sbridge: Seeking for: PCI ID 8086:2fbb
[ 11.885760] EDAC sbridge: CPU SrcID #0, Ha #0, Channel #0 has DIMMs, but ECC is disabled
[ 11.885766] EDAC sbridge: Couldn't find mci handler
[ 11.885768] EDAC sbridge: Couldn't find mci handler
[ 11.885770] EDAC sbridge: Failed to register device with error -19.
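For reference, the last line of that log is the telling one: on Linux, error -19 is -ENODEV ("No such device"), so the driver gave up registering a memory controller after reporting that ECC is disabled. Below is a minimal Python sketch for pulling those two facts out of saved `dmesg` output. The function name `summarize_edac` and the dictionary keys are made up for illustration, and the patterns only cover the message formats shown above.

```python
import errno
import re

# Patterns match only the sb_edac message formats quoted in this thread.
ECC_DISABLED = re.compile(
    r"EDAC sbridge: CPU SrcID #(\d+), Ha #(\d+), Channel #(\d+) "
    r"has DIMMs, but ECC is disabled"
)
REGISTER_FAILED = re.compile(
    r"EDAC sbridge: Failed to register device with error (-?\d+)"
)


def summarize_edac(log_text):
    """Collect channels reported as 'ECC is disabled' and the registration error."""
    summary = {"ecc_disabled_channels": [], "register_error": None, "errno_name": None}
    for line in log_text.splitlines():
        match = ECC_DISABLED.search(line)
        if match:
            # (SrcID, Ha, Channel) as integers
            summary["ecc_disabled_channels"].append(tuple(int(g) for g in match.groups()))
            continue
        match = REGISTER_FAILED.search(line)
        if match:
            code = int(match.group(1))
            summary["register_error"] = code
            # Kernel errors are negative errno values; look up the symbolic name.
            summary["errno_name"] = errno.errorcode.get(abs(code), "unknown")
    return summary


SAMPLE = """\
[ 11.885730] EDAC sbridge: Seeking for: PCI ID 8086:2fbb
[ 11.885760] EDAC sbridge: CPU SrcID #0, Ha #0, Channel #0 has DIMMs, but ECC is disabled
[ 11.885766] EDAC sbridge: Couldn't find mci handler
[ 11.885770] EDAC sbridge: Failed to register device with error -19.
"""

if __name__ == "__main__":
    print(summarize_edac(SAMPLE))
```

To use it on a live system, something like `dmesg | grep EDAC > edac.txt` followed by `summarize_edac(open("edac.txt").read())` should work.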
Hello!
I'm using a JINGSHA X99 board with a Xeon E5-2678 v3 and 16GB PC3-14900R RDIMM.
The CPU and X99 both support ECC, but Linux says it is disabled.
The BIOS is a raw / debug version and I'm not seeing any ECC enable options - did I miss them?
I've attached the BIOS; @Lost_N_BIOS, mind taking a look?
Does it meet the requirements for ECC? I'll check the memory traces tomorrow...
X99_BIOS.zip (4.84 MB)
@e97 - All I can do is check the BIOS settings. Of the ECC-compatible BIOS I've seen, only some have an ECC setting (or two); some have none and work fine with ECC. (So it could just be that the sticks you are trying are not compatible, if you think ECC should be enabled.)
I found the usual setting I've seen @ >> IntelRCSetup >> Memory Configuration >> ECC Support (Default = Auto)
There are TONS of settings there, and I'm not familiar with ECC, so many, or none, may also apply.
@Lost_N_BIOS thank you! I missed that, will try setting it to Enabled and see what happens.
I verified the ECC RDIMMs have ECC enabled and working in another system. This leads me to think it's either a BIOS setting, a BIOS feature, or hardware traces for ECC that are not connected.
You're welcome! Another system is not this system, unless you meant the same exact model?
Not all memory is compatible with all boards, BIOS, chipsets etc. - that's all I meant. Maybe this particular set of memory is incompatible with one of those things; try a different ECC stick and see whether it always fails the same way or not.
That's a good idea. I will try some smaller / slower ECC modules and see if they work.
I don't see any "ECC Support" option at IntelRCSetup >> Memory Configuration >> ECC Support (Default = Auto)
Is it named something different or hidden? How did you find it?
That's the exact name of the setting. Sorry, I thought you were looking in AMIBCP when you mentioned digging around...
MANY settings may be hidden from you; you will either need a mod BIOS to make them visible, or change them directly in AMIBCP and then reflash.
If you want me to make it visible to you, tell me how far into that string of submenus you can see in your current BIOS.
I mean, can you see IntelRCSetup? If yes, can you see the IntelRCSetup >> Memory Config submenu? If yes, but there is no ECC Support in there, then it will need to be made visible, or changed directly and left hidden (up to you).
On your BIOS Main page, can you see "Access Level"? If yes, what does it say: User or Admin/Supervisor?
No worries, I should have mentioned I hadn't modified the BIOS yet.
Access level: Administrator
Here are the Memory Configuration screens:
So using AMIBCP I'll be able to modify / unhide all the options needed?
Update:
The manufacturer got back to me with a new BIOS that has the "Enable ECC Support" option available, but it's still the same with "Auto" or "Enable". I also tried the 8GB sticks and got the same issue.
One thing I did notice: when using fewer than 8 sticks, the channels overlap. I'm thinking it's a bug in the initialization.
eg:
Working ECC system, X79
mc0: csrow0: CPU_SrcID#0_Ha#0_Chan#0_DIMM#0: 0 Corrected Errors
mc0: csrow0: CPU_SrcID#0_Ha#0_Chan#1_DIMM#0: 0 Corrected Errors
mc0: csrow0: CPU_SrcID#0_Ha#0_Chan#2_DIMM#0: 0 Corrected Errors
mc0: csrow0: CPU_SrcID#0_Ha#0_Chan#3_DIMM#0: 0 Corrected Errors
mc0: csrow1: CPU_SrcID#0_Ha#0_Chan#0_DIMM#1: 0 Corrected Errors
mc0: csrow1: CPU_SrcID#0_Ha#0_Chan#1_DIMM#1: 0 Corrected Errors
mc0: csrow1: CPU_SrcID#0_Ha#0_Chan#2_DIMM#1: 0 Corrected Errors
mc0: csrow1: CPU_SrcID#0_Ha#0_Chan#3_DIMM#1: 0 Corrected Errors
this system:
with 8 DIMMs = OK
CPU_SrcID#0_Ha#0_Chan#0_DIMM#0
CPU_SrcID#0_Ha#0_Chan#1_DIMM#0
CPU_SrcID#0_Ha#0_Chan#2_DIMM#0
CPU_SrcID#0_Ha#0_Chan#3_DIMM#0
CPU_SrcID#0_Ha#0_Chan#0_DIMM#1
CPU_SrcID#0_Ha#0_Chan#1_DIMM#1
CPU_SrcID#0_Ha#0_Chan#2_DIMM#1
CPU_SrcID#0_Ha#0_Chan#3_DIMM#1
with 2 DIMMs = OK
CPU_SrcID#0_Ha#0_Chan#0_DIMM#0
CPU_SrcID#0_Ha#0_Chan#1_DIMM#0
with 4 DIMMs = overlapping ??
CPU_SrcID#0_Ha#0_Chan#0_DIMM#0
CPU_SrcID#0_Ha#0_Chan#1_DIMM#0
CPU_SrcID#0_Ha#0_Chan#0_DIMM#0
CPU_SrcID#0_Ha#0_Chan#1_DIMM#0
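The overlap in the 4-DIMM case is easy to miss by eye, so here is a small Python sketch that flags any DIMM label reported more than once. The helper name `find_overlaps` is made up for illustration; the label lists are copied from the listings above.

```python
from collections import Counter


def find_overlaps(labels):
    """Return {label: count} for every DIMM label that appears more than once."""
    counts = Counter(labels)
    return {label: n for label, n in counts.items() if n > 1}


# Labels as reported above for the 2-DIMM and 4-DIMM configurations.
TWO_DIMMS = [
    "CPU_SrcID#0_Ha#0_Chan#0_DIMM#0",
    "CPU_SrcID#0_Ha#0_Chan#1_DIMM#0",
]
FOUR_DIMMS = [
    "CPU_SrcID#0_Ha#0_Chan#0_DIMM#0",
    "CPU_SrcID#0_Ha#0_Chan#1_DIMM#0",
    "CPU_SrcID#0_Ha#0_Chan#0_DIMM#0",
    "CPU_SrcID#0_Ha#0_Chan#1_DIMM#0",
]

if __name__ == "__main__":
    print("2 DIMMs:", find_overlaps(TWO_DIMMS))
    print("4 DIMMs:", find_overlaps(FOUR_DIMMS))
```

With four distinct sticks installed, each label should appear once (as in the working X79 listing), so any non-empty result points at two sticks being mapped to the same channel/slot.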
Great, they unhid the option for you! Is this in the same location I mentioned it would be? If not, send me that BIOS and let me look; I can unhide the one I'm talking about, maybe it could help?
Or, it still could just be that this particular memory isn't playing nice with this board/BIOS.
@raun0 @e97 - Did you guys change the ECC Support setting from Auto to Enabled? If not, that may help; Auto may be setting it to disabled.
Yep. Exactly where you said it would be.
I don't think it's a module compatibility issue since they work fine, only without ECC enabled. I've tried multiple other ECC modules that have been verified to work in the previous-generation X79/C602 and also same-generation X99/C612 systems.
Their engineers said they've confirmed it to be working, so it's likely a misconfiguration on my part or they are still working out the memory timings / settings. I'll know for sure in a few days!
Also this post was helpful in verifying physical trace connection: https://www.bios-mods.com/forum/Thread-R...e-P6T-Deluxe-V2
RDIMM datasheet = ECC pins on RAM modules
DIMM connector datasheet = pins to motherboard connector / layout
Xeon datasheet = ECC pins on CPU
Xeon / Socket Mechanical Guide = CPU pins to motherboard socket / layout
** Be careful as the pins are delicate!!
After finding the pins, use a multimeter with an ohm/resistance mode to check whether they are connected!
Be careful as the pins are delicate!! ** (x2 because it's important)
My alternative is to get an open source BIOS implementation like coreboot working on the board... it will be an adventurous challenge!
Maybe something in the ECC chip itself is not compatible. So you've now tried other sets as well, and none work? Have you tried all of these with only 1-2 sticks at a time?
Is your CPU microcode up to date/latest? If not, maybe that could be causing some issue too; at least it's something you could try updating to see if it helps while you wait on them to reply. It's cool they replied to you quickly and sent out a new BIOS.
Their support is great! It's a recently released motherboard mainly for gamers, hence the LEDs... My use case, professional/workstation use, is not what they typically see, so I'm happy to work with them to get everything working, as the board has great features.
If the modules were the problem, at least the ECC memory controller would be recognized in edac-utils, but it's not showing up.
[ 2.802091] EDAC MC: Ver: 3.0.0
...
[ 36.071500] EDAC sbridge: Seeking for: PCI ID 8086:2fbb
[ 36.071525] EDAC sbridge: CPU SrcID #0, Ha #0, Channel #0 has DIMMs, but ECC is disabled
[ 36.071555] EDAC sbridge: Couldn't find mci handler
[ 36.071570] EDAC sbridge: Couldn't find mci handler
[ 36.071586] EDAC sbridge: Failed to register device with error -19.
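The "controller not showing up" symptom can also be checked without edac-utils, since the kernel exposes registered memory controllers under `/sys/devices/system/edac/mc`. The following is a rough Python sketch, not a replacement for edac-utils; the function name `list_memory_controllers` is made up for illustration, and it only reads the per-controller `ce_count` and `ue_count` files.

```python
from pathlib import Path


def list_memory_controllers(edac_root="/sys/devices/system/edac/mc"):
    """Return {controller name: {counter name: value}} for each registered EDAC controller.

    An empty dict means no memory controller was registered, which matches
    the 'Failed to register device' message in the log above.
    """
    root = Path(edac_root)
    if not root.is_dir():
        return {}
    controllers = {}
    for mc_dir in sorted(root.glob("mc[0-9]*")):
        counts = {}
        # ce_count = corrected errors, ue_count = uncorrected errors
        for name in ("ce_count", "ue_count"):
            counter_file = mc_dir / name
            if counter_file.is_file():
                counts[name] = int(counter_file.read_text().strip())
        controllers[mc_dir.name] = counts
    return controllers


if __name__ == "__main__":
    found = list_memory_controllers()
    if found:
        for name, counts in found.items():
            print(name, counts)
    else:
        print("No EDAC memory controllers registered")
```

On a working ECC system this should list at least `mc0`; on the board discussed here it would be expected to report that nothing is registered.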
Also, in dmidecode only 4 of 8 modules show up, but system memory shows the correct amount.
I tried many configurations and documented the results: 1 stick, 2 sticks, 4 sticks in various sockets and 8 sticks.
Tried setting to "Enable". No luck.
I think it's NVRAM, an MSR, or some other conflicting setting or lack of proper initialization. It's also possible the traces are not connected, so that should be verified, but in my case they are connected.
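To put numbers on the dmidecode mismatch (populated slots versus total size), here is a hedged Python sketch that counts "Memory Device" entries in the text output of `dmidecode -t memory`. The function name `summarize_memory_devices` is made up, and the sample text is an invented, trimmed-down illustration of the usual layout, not output from this board.

```python
import re


def summarize_memory_devices(dmidecode_text):
    """Count populated and empty 'Memory Device' entries and sum their sizes in MB."""
    populated, empty, total_mb = 0, 0, 0
    # dmidecode separates records with blank lines.
    for section in dmidecode_text.split("\n\n"):
        lines = [line.strip() for line in section.splitlines()]
        if "Memory Device" not in lines:
            continue
        size_line = next((l for l in lines if l.startswith("Size:")), None)
        if size_line is None:
            continue
        match = re.match(r"Size:\s*(\d+)\s*(MB|GB)$", size_line)
        if match:
            populated += 1
            factor = 1024 if match.group(2) == "GB" else 1
            total_mb += int(match.group(1)) * factor
        else:
            # e.g. "Size: No Module Installed"
            empty += 1
    return {"populated": populated, "empty": empty, "total_mb": total_mb}


# Invented sample for illustration only.
SAMPLE = """\
Handle 0x0001, DMI type 17, 40 bytes
Memory Device
\tTotal Width: 72 bits
\tData Width: 64 bits
\tSize: 8192 MB

Handle 0x0002, DMI type 17, 40 bytes
Memory Device
\tTotal Width: Unknown
\tData Width: Unknown
\tSize: No Module Installed
"""

if __name__ == "__main__":
    print(summarize_memory_devices(SAMPLE))
```

Comparing `populated` and `total_mb` against the number of sticks physically installed makes the "4 of 8 modules, but correct total" observation easy to document per configuration.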
That's nice to hear. I've only seen them mentioned a few times, and I thought they were some small, cheaper, OEM type that made generic boards.
I know nothing about what you mention above. Do you mean that even if the ECC control chip on the memory stick was bad or not compatible, you'd still see something that you are missing above?
If yes, then it sounds like they need to look into this more for you, especially since they are confirming it's working and OK on their end.
@e97 Dump your current BIOS for me. Since you mentioned NVRAM: yes, it could still be locked to disabled or auto in there, especially if the flashing commands didn't destroy and rebuild a new NVRAM when you flashed in the new BIOS.
I will make you a mod BIOS with it enabled everywhere. Usually, there are three main locations where this can be set as a BIOS setting: in the setup module, in AMITSE/SetupData (this is what AMIBCP changes when you change a default setting value), and in NVRAM.
In NVRAM, there are often two main volumes, especially if you can dump with a programmer, and then there is a third and sometimes a 4th in the internal BIOS volumes too.
If this is an Intel system, dump with FPT. If it's AMD, there may be issues, unless you have a flash programmer or already know which AFU can flash in a mod BIOS on this system.
If Intel, here's how in case you don't know - Check the BIOS main page and see if the ME FW version is shown. If not, download HWINFO64 and, in the large window on the left side, expand motherboard and find the ME area; inside that, get the ME Firmware version.
Once you have that, go to this thread and in section "C" download the matching ME System Tools Package (i.e. if ME FW version = 10.x get the V10 package, if 9.0-9.1 get the V9.1 package, if 9.5 or above get the V9.5 package, etc.)
Intel Management Engine: Drivers, Firmware & System Tools
Once downloaded, inside you will find the Flash Programming Tool folder, and inside that a Windows or Win/Win32 folder. Select that Win folder, hold shift and right click, then choose "open command window here" (not PowerShell).
At the command prompt type the following command and send me the created file to modify >> FPTw.exe -bios -d biosreg.bin
Right after you do that, try to write back the BIOS region dump and see if you get any error. If you do, show me an image of the command entered and the error given >> FPTw.exe -bios -f biosreg.bin
If you are stuck on Win10 and cannot easily get a command prompt, and the method I mentioned above does not work for you, here are some links that should help.
Or, copy all contents from the Flash Programming Tool \ DOS folder to the root of a bootable USB disk and do the dump from DOS (FPT.exe -bios -d biosreg.bin)
https://www.windowscentral.com/how-add-c...creators-update
https://www.windowscentral.com/add-open-...menu-windows-10
https://www.laptopmag.com/articles/open-...ator-privileges
This is the board:
As far as I know, it's the only LGA2011-v3 board that supports DDR3!
Large or small, I don't know. I do know they sell quite a few boards and also sell them to OEMs.
For ECC, modern x86_64 CPUs (made in the last decade) have the memory controller integrated into the CPU itself. The ECC memory is "dumb" in the sense that it just has an extra chip to store 8 check bits for every 64 bits of data; the memory controller in the CPU calculates and verifies them, and the module treats them as more data. Hence non-ECC is 64-bit data and ECC is 72-bit data.
These modules can correct a single bit error and detect but not correct multi-bit errors. They will show this kind of information:
Error Correction Type: Single-bit ECC
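To illustrate what "correct a single-bit error, detect but not correct a multi-bit error" means, here is a toy Python sketch of a SECDED code. It is an assumption-laden illustration: it uses an extended Hamming (8,4) code protecting 4 data bits, whereas real ECC DIMMs carry 8 check bits per 64 data bits and the exact code is up to the memory controller. The function names `encode` and `decode` are made up for this example.

```python
def encode(nibble):
    """Encode 4 data bits into an 8-bit extended Hamming codeword (list of bits).

    Index 0 holds the overall parity bit; indices 1..7 are the classic
    Hamming positions (1, 2 and 4 are check bits, 3, 5, 6 and 7 are data).
    """
    d = [(nibble >> i) & 1 for i in range(4)]
    code = [0] * 8
    code[3], code[5], code[6], code[7] = d
    code[1] = code[3] ^ code[5] ^ code[7]
    code[2] = code[3] ^ code[6] ^ code[7]
    code[4] = code[5] ^ code[6] ^ code[7]
    code[0] = sum(code[1:]) % 2  # makes the parity of all 8 bits even
    return code


def decode(received):
    """Return (nibble, status); status is 'ok', 'corrected' or 'uncorrectable'."""
    code = list(received)
    syndrome = 0
    for pos in range(1, 8):
        if code[pos]:
            syndrome ^= pos  # XOR of the positions of all set bits
    overall_odd = sum(code) % 2 == 1
    if syndrome == 0 and not overall_odd:
        status = "ok"
    elif overall_odd:
        # Odd number of flips: treat as a single-bit error. The syndrome is the
        # flipped position; 0 means the overall parity bit itself flipped.
        code[syndrome] ^= 1
        status = "corrected"
    else:
        # Parity looks even but the syndrome is non-zero: two bits flipped.
        return None, "uncorrectable"
    nibble = code[3] | (code[5] << 1) | (code[6] << 2) | (code[7] << 3)
    return nibble, status


if __name__ == "__main__":
    word = encode(0b1011)
    one_flip = list(word)
    one_flip[5] ^= 1
    two_flips = list(one_flip)
    two_flips[2] ^= 1
    print(decode(word))
    print(decode(one_flip))
    print(decode(two_flips))
```

The same idea scales up: with more check bits, a wider word can be protected, which is where the 64 + 8 = 72 bit width comes from.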
There are more advanced ECC modules like these HP modules that also have a "controller" on the RAM itself and can detect and correct multi-bit errors. Many vendors have this multi-bit ECC under various trade names. They show this kind of info:
Error Correction Type: Multi-bit ECC
I've read there is/was a generation of ECC modules that had the controller on the RAM itself to work with non-ECC CPUs/motherboard but I've not come across these personally.
Looks like a fairly decent board, not what I was thinking when I saw this name. I've only seen them mentioned a few times though. I think I modified an X89 for someone not long ago and we discussed how that was an odd chipset name.
Be sure to see my reply to you here - X99 ECC support (2)