MSI GT63 Titan 8RG QTJ1 Engineering Sample need help with PCI-E/PEX

Hello to everyone!
I replaced the CPU in my laptop with a QTJ1 and now I have a problem. The board is working, but the dGPU in the MXM slot is not :sweat_smile:. There is no PEX_REFCLK.
What I did already:
Checked that CLK_REQ is low. It is low and the PCH sees it: when I remap the pins with the SSD in FIT and break the connection to the GPU, the SSD does not work until I connect GPU_CLKREQ to GND.
Checked that the GPU works at all. Tested it on a desktop PC, it's OK.
Checked DGPU_PWR_EN and PWROK, all good.
Reflashed the EC
Replaced the PCH
Tried this PCH on a G531GW - OK
The PCIe controller is visible in Device Manager - 1901

So I guess this is a software problem. But I have no idea how to find it) MSI is terrible and I can bet this is some intentional block to prevent users from installing better processors.
I will be glad to have ideas on how to find the instructions that may be preventing the clock from being enabled. Or at least which module to look in.

please attach your original bios dump.

E16L4IMS.zip (5.3 MB)
PCI_PATCH.zip (5.3 MB)

E16L4IMS - stock BIOS region, ME modded to support QTJ1
PCI_PATCH - PCIe type hardcoded to “1”, the “ASRock way”: changed the jnz after cmp r11d, 506E0h into a jmp to the correct place. No luck.
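
For anyone following along, the jnz-to-jmp change can be sketched roughly like this. It is only a minimal Python sketch, assuming the jump is the short form (opcode 75 with an 8-bit displacement). The name `force_short_jump`, the sample bytes and the offsets are made up for illustration and are not taken from the actual MSI module.

```python
from typing import Optional

def force_short_jump(code: bytes, offset: int, target: Optional[int] = None) -> bytes:
    """Return a copy of `code` with the short jnz at `offset` turned into a jmp.

    If `target` is given, the displacement is rewritten so the jmp lands there.
    """
    if code[offset] != 0x75:                 # 75 rel8 = jnz/jne short
        raise ValueError("no short jnz at this offset")
    patched = bytearray(code)
    patched[offset] = 0xEB                   # EB rel8 = jmp short
    if target is not None:
        rel = target - (offset + 2)          # rel8 counts from the next instruction
        if not -128 <= rel <= 127:
            raise ValueError("target is out of rel8 range")
        patched[offset + 1] = rel & 0xFF
    return bytes(patched)

# Hypothetical sample: cmp r11d, 506E0h ; jnz +5 ; five nops ; ret
sample = bytes.fromhex("4181fbe0060500" "7505" "9090909090" "c3")
patched = force_short_jump(sample, 7, target=10)
```

A near jnz (0F 85 with a 32-bit displacement) would need different handling, so this only covers the short case.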

The problem is in the BIOS region; the processor and PCH were tested on the G531GW.
REFCLK is present, I was measuring it the wrong way. All PCIe lines are connected between the GPU and the CPU. I also tried replacing all the 0.22 µF capacitors, no luck

E16L4IMS_Mod1.zip (5.2 MB)
See if this one works.

If PCIe works, check the AC/DC loadline settings.
I ran into a loadline issue before. I didn't find the reason, but I fixed it by entering the default values.

Oh my, you are just a god to me now :slight_smile: It’s working. I’ve been struggling with this for months, and you just came and fixed it xD. It turns out that it had to be fixed in several modules, not only in the ones for the standard PCIe patch. How did you think of doing this?

I just reproduced what I had done in the 22nm PCH LGA1151 BIOS. Like you said, “the ASRock way”: I just made the default return value 1, not 4.

The first time I did a CPU upgrade from hexa-core to octa-core on the BGA1440 platform was in 2020.
From my experience on the LGA1151 platform, there should be a CPU thread count limit and a PCI host bridge (or north bridge, or system agent) whitelist for PCIe x16.
I didn't find the thread limit, so I guessed there would only be a PCIe problem.
Then I did the CPU swap, and there was a PCIe problem, so I reproduced the modification of the PCI host bridge verification, and that resolved it.

This one is an HM370, but A stepping. There is a buffer overflow vulnerability there, so I can mod the ME and that lets me run an engineering sample. I used to do this on ASUS and there were no problems, but then I got this MSI. I also tried to modify this function, but I changed it to an unconditional jump to the case where it returns 1. It turns out that I just didn’t find the signature everywhere: in some places it’s not cmp r11d, 806E0h but cmp eax, 806E0h xD
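
One way to avoid missing a variant like that is to search for the immediate itself and then look at the bytes in front of it, instead of searching for one fixed instruction signature. A rough Python sketch, assuming the module body is already extracted and uncompressed; `find_cmp_imm32` and the sample bytes are hypothetical, and a hit only means "looks like a cmp", it still needs a check in a disassembler.

```python
import struct

def find_cmp_imm32(code: bytes, imm: int):
    """List (offset, form) for every place the 32-bit immediate appears."""
    needle = struct.pack("<I", imm)   # little-endian, 0x806E0 -> E0 06 08 00
    hits = []
    pos = code.find(needle)
    while pos != -1:
        if pos >= 1 and code[pos - 1] == 0x3D:
            # 3D id = cmp eax, imm32 (short accumulator form)
            hits.append((pos - 1, "cmp eax, imm32"))
        elif pos >= 2 and code[pos - 2] == 0x81 and code[pos - 1] >= 0xF8:
            # 81 /7 id with a register operand: ModRM is F8..FF
            # r8d..r15d additionally need a REX prefix (40..4F) in front
            rex = pos >= 3 and 0x40 <= code[pos - 3] <= 0x4F
            hits.append((pos - 3 if rex else pos - 2, "cmp r32, imm32"))
        else:
            hits.append((pos, "immediate only, check by hand"))
        pos = code.find(needle, pos + 1)
    return hits

# Hypothetical sample: nop ; cmp r11d, 806E0h ; jnz ; cmp eax, 806E0h ; ret
sample = bytes.fromhex("90" "4181fbe0060800" "7502" "3de0060800" "c3")
hits = find_cmp_imm32(sample, 0x806E0)
```

Both encodings end with the same four immediate bytes, which is why the search keys on those rather than on the opcode.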

True, the jumps are not all the same, but there are always push 4 and pop ebx.
I found these places by searching for IDs like 3e1f (LGA non-Xeon 4c), 3ec2 (LGA non-Xeon 6c).
Sometimes I just replace 3ec4 (BGA 6c) with 3e20 (BGA 8c) after a CPU swap.
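
That kind of search can be sketched in Python roughly like this. It assumes the IDs are stored as 16-bit little-endian values and that push 4 / pop ebx is encoded as 6A 04 5B. The helper names, the 64-byte window and the sample bytes are assumptions of mine for illustration. As far as I know the DXE modules in a full dump are usually compressed, so this is meant for a module body extracted with something like UEFITool, not for the raw SPI image.

```python
import struct

PUSH4_POP_EBX = bytes.fromhex("6a045b")   # push 4 ; pop ebx (pop rbx in 64-bit code)

def find_all(blob: bytes, needle: bytes):
    """Return every offset where `needle` occurs in `blob`."""
    hits, pos = [], blob.find(needle)
    while pos != -1:
        hits.append(pos)
        pos = blob.find(needle, pos + 1)
    return hits

def find_device_id(blob: bytes, dev_id: int):
    """Offsets of a 16-bit device ID stored little-endian (0x3E1F -> 1F 3E)."""
    return find_all(blob, struct.pack("<H", dev_id))

def ids_near_push4(blob: bytes, dev_id: int, window: int = 64):
    """Keep only the ID hits that have a push 4 / pop ebx within `window` bytes."""
    pushes = find_all(blob, PUSH4_POP_EBX)
    return [off for off in find_device_id(blob, dev_id)
            if any(abs(p - off) <= window for p in pushes)]

# Hypothetical sample: cmp ax, 3E1Fh close to push 4 / pop ebx, and a stray 3EC2 far away
sample = (bytes(8) + bytes.fromhex("663d1f3e") + b"\x90" * 4
          + PUSH4_POP_EBX + bytes(100) + bytes.fromhex("c23e"))
candidates = ids_near_push4(sample, 0x3E1F)
```

A two-byte pattern matches in plenty of unrelated places, so the window filter is only there to shorten the list before looking at each hit by hand.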

Haha, now I can't get the correct loadline settings. By default it's 2300))) I tried 1, 20, 40, 60, 110. At 110 the voltage with no load is already too high, but it still crashes under 100% load. On ASUS it was working with 10