[Problem] Using MI50 32GB on an Asrock z170 Pro4S

Hi all, I bought an MI50 32 GB GPU to use for local LLMs, and I wanted to use it on an old ASRock Z170 Pro4S motherboard I still had lying around. However, with the card installed, every boot drops me straight into the BIOS, and the GPU is not recognized by the OS.

I get the following error from dmesg:
[ 54.170295] amdgpu 0000:03:00.0: amdgpu: Fatal error during GPU init
[ 54.170686] amdgpu: probe of 0000:03:00.0 failed with error -12
From what I found online, this seems to be related to the BIOS on the GPU, which needs certain settings on the motherboard. CSM needs to be turned off, which is easy to do in the BIOS, but it also seems that Above 4G Decoding is required.
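A note on the error code: -12 is the kernel's -ENOMEM ("cannot allocate memory"). That would be consistent with the driver being unable to map the card's memory regions, though the log line alone does not prove it. As a minimal sketch, the following Python snippet pulls failed probe lines out of dmesg text and translates the number into its errno name. The function name explain_probe_failures is made up for this example.

```python
import errno
import re

# Matches kernel lines such as:
#   amdgpu: probe of 0000:03:00.0 failed with error -12
PROBE_RE = re.compile(r"probe of (\S+) failed with error -(\d+)")


def explain_probe_failures(dmesg_text):
    """Return a list of (pci_address, errno_name) for each failed probe line."""
    results = []
    for match in PROBE_RE.finditer(dmesg_text):
        address = match.group(1)
        code = int(match.group(2))
        # errno.errorcode maps numeric codes to names, e.g. 12 -> "ENOMEM".
        results.append((address, errno.errorcode.get(code, "UNKNOWN")))
    return results


# The two lines quoted in the post above.
sample = (
    "[ 54.170295] amdgpu 0000:03:00.0: amdgpu: Fatal error during GPU init\n"
    "[ 54.170686] amdgpu: probe of 0000:03:00.0 failed with error -12\n"
)

for address, name in explain_probe_failures(sample):
    print(address, name)
```

On a live system the same function could be fed the output of dmesg instead of the sample string.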

The latest BIOS available is 7.50, found here. I searched through it using UEFITool and found a setting at VarStoreId: 0xCCCC, VarOffset: 0x3, which should be Above 4G Decoding. I enabled it using setup_var.efi, but it still isn’t working.
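One thing that may be worth ruling out is whether the value written with setup_var.efi actually survives a reboot. Below is a rough Python sketch for reading a single byte back from a UEFI variable on Linux through efivarfs. It assumes the variable behind VarStoreId 0xCCCC is exposed under /sys/firmware/efi/efivars; the name and GUID of that variable are not known from this thread, so the path in the usage comment is a placeholder. The helper names read_var_byte and read_var_file are invented for this example. Files in efivarfs start with a 4-byte attributes field, so the offset from UEFITool has to be shifted by 4.

```python
import struct

# efivarfs prepends a 4-byte little-endian attributes field to the variable data.
ATTR_HEADER_SIZE = 4


def read_var_byte(raw, offset):
    """Return (attributes, value) for the byte at `offset` in the variable data."""
    if len(raw) < ATTR_HEADER_SIZE + offset + 1:
        raise ValueError("variable is too short for the requested offset")
    (attributes,) = struct.unpack_from("<I", raw, 0)
    return attributes, raw[ATTR_HEADER_SIZE + offset]


def read_var_file(path, offset):
    """Read an efivarfs file and return (attributes, value) at `offset`."""
    with open(path, "rb") as handle:
        return read_var_byte(handle.read(), offset)


# Hypothetical usage; replace the placeholder with the real variable file:
#   attrs, value = read_var_file("/sys/firmware/efi/efivars/<Name>-<GUID>", 0x3)
#   print(hex(value))
```

If the byte reads back as the enabled value and the card still fails to initialize, the firmware is probably ignoring the setting rather than losing it.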

Does anybody have a clue what I’m missing or what I’m doing wrong?

My setup:
Motherboard: Asrock z170 Pro4s
CPU: i5-6500
RAM: Kingston HyperX Fury HX421C14FBK2/8
PSU: Seasonic M12II 620W

Does the card’s vBIOS have a GOP driver?
I’m asking because, of the 3 vBIOS files on TPU, only one has a second firmware image with a UEFI GOP.

I’m not sure. How can I find that out?

Dump the vBIOS from within the OS using GPU-Z or a similar tool; GOP Updater can then show you whether it contains a GOP driver. Link for the tool.

EDIT: OK, so what’s the hardware device ID (the complete one, including the sub-revision) in Windows Device Manager?
And which BIOS version is the ASRock Z170 board running?

The problem is that the GPU is not picked up by GPU-Z, so extracting the bios is not possible :\

Bios version is 7.50.
The GPU also doesn’t show up in Windows Device Manager, but I can get some info from lspci:
07:00.0 Display controller [0380]: Advanced Micro Devices, Inc. [AMD/ATI] Vega 20 [Radeon Pro VII/Radeon Instinct MI50 32GB] [1002:66a1] (rev 02)
    Subsystem: Advanced Micro Devices, Inc. [AMD/ATI] Device [1002:0834]
    Control: I/O- Mem- BusMaster- SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
    Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- SERR- <PERR- INTx-
    Interrupt: pin A routed to IRQ 16
    Capabilities: [48] Vendor Specific Information: Len=08
    Capabilities: [150 v2] Advanced Error Reporting
        UESta: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP-
            ECRC- UnsupReq- ACSViol- UncorrIntErr- BlockedTLP- AtomicOpBlocked- TLPBlockedErr-
            PoisonTLPBlocked- DMWrReqBlocked- IDECheck- MisIDETLP- PCRC_CHECK- TLPXlatBlocked-
        UEMsk: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP-
            ECRC- UnsupReq- ACSViol- UncorrIntErr- BlockedTLP- AtomicOpBlocked- TLPBlockedErr-
            PoisonTLPBlocked- DMWrReqBlocked- IDECheck- MisIDETLP- PCRC_CHECK- TLPXlatBlocked-
        UESvrt: DLP+ SDES+ TLP- FCP+ CmpltTO- CmpltAbrt- UnxCmplt- RxOF+ MalfTLP+
            ECRC- UnsupReq- ACSViol- UncorrIntErr+ BlockedTLP- AtomicOpBlocked- TLPBlockedErr-
            PoisonTLPBlocked- DMWrReqBlocked- IDECheck- MisIDETLP- PCRC_CHECK- TLPXlatBlocked-
        CESta: RxErr- BadTLP- BadDLLP- Rollover- Timeout- AdvNonFatalErr- CorrIntErr- HeaderOF-
        CEMsk: RxErr- BadTLP- BadDLLP- Rollover- Timeout- AdvNonFatalErr+ CorrIntErr- HeaderOF-
        AERCap: First Error Pointer: 00, ECRCGenCap+ ECRCGenEn- ECRCChkCap+ ECRCChkEn-
            MultHdrRecCap- MultHdrRecEn- TLPPfxPres- HdrLogCap-
        HeaderLog: 00000000 00000000 00000000 00000000
    Capabilities: [270 v1] Secondary PCI Express
        LnkCtl3: LnkEquIntrruptEn- PerformEqu-
        LaneErrStat: 0
    Capabilities: [2a0 v1] Access Control Services
        ACSCap: SrcValid- TransBlk- ReqRedir- CmpltRedir- UpstreamFwd- EgressCtrl- DirectTrans-
        ACSCtl: SrcValid- TransBlk- ReqRedir- CmpltRedir- UpstreamFwd- EgressCtrl- DirectTrans-
    Capabilities: [2b0 v1] Address Translation Service (ATS)
        ATSCap: Invalidate Queue Depth: 00
        ATSCtl: Enable-, Smallest Translation Unit: 00
    Capabilities: [2c0 v1] Page Request Interface (PRI)
        PRICtl: Enable- Reset-
        PRISta: RF- UPRGI- Stopped+ PASID-
        Page Request Capacity: 00000100, Page Request Allocation: 00000000
    Capabilities: [2d0 v1] Process Address Space ID (PASID)
        PASIDCap: Exec+ Priv+, Max PASID Width: 10
        PASIDCtl: Enable- Exec- Priv-
    Capabilities: [320 v1] Latency Tolerance Reporting
        Max snoop latency: 3145728ns
        Max no snoop latency: 3145728ns
    Capabilities: [400 v1] Data Link Feature <?>
    Capabilities: [410 v1] Physical Layer 16.0 GT/s
        Phy16Sta: EquComplete+ EquPhase1+ EquPhase2+ EquPhase3+ LinkEquRequest-
    Capabilities: [440 v1] Lane Margining at the Receiver
        PortCap: Uses Driver-
        PortSta: MargReady- MargSoftReady-
    Kernel modules: amdgpu
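The Control line in this output is worth a closer look: Mem- and BusMaster- mean that memory space decoding and bus mastering are both switched off for the card, which fits a device whose memory regions were never set up. As a small sketch, this Python snippet turns such an lspci flag line into a dictionary so the two bits can be checked in a script. The helper name parse_flags is invented for this example.

```python
import re

# A flag is a name followed directly by "+" (set) or "-" (clear), e.g. "Mem-".
FLAG_RE = re.compile(r"([A-Za-z0-9/]+)([+-])")


def parse_flags(line):
    """Turn 'Control: I/O- Mem+ ...' into {'I/O': False, 'Mem': True, ...}."""
    _, _, flags = line.partition(":")
    return {name: sign == "+" for name, sign in FLAG_RE.findall(flags)}


# The Control line from the lspci output above.
control = parse_flags(
    "Control: I/O- Mem- BusMaster- SpecCycle- MemWINV- VGASnoop- "
    "ParErr- Stepping- SERR- FastB2B- DisINTx-"
)

if not control["Mem"]:
    print("memory space decoding is disabled for this device")
if not control["BusMaster"]:
    print("bus mastering is disabled for this device")
```

On a machine where the card initializes properly, one would expect the same line to show Mem+ and BusMaster+ once the driver has loaded.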

Hmm… isn’t this a device that is only supported on Linux?

Help Flash MI50 to Radeon VII Pro | TechPowerUp Forums

How to test an AMD Instinct Mi50/Mi60 GPU : r/LocalAIServers

Seems like it is. I don’t mind using it on Linux either, but it’s not getting picked up there either. From the second post you linked, it seems that the card is just dead, so I’ll probably have to return it.

I tried the GPU in another PC and it worked there, so the card itself is not the problem. The problem seems to be that the hidden Above 4G Decoding option does not work on my motherboard, even though I set it through setup_var.
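One rough way to compare the two machines is to look at where the card's memory regions end up. On Linux, each PCI device has a sysfs file named resource with one line per region, holding start, end and flags in hexadecimal. The Python sketch below parses that text and lists the regions placed above the 4 GiB boundary. The sample values are made up for illustration, and the device address in the usage comment is taken from the lspci output and may differ on another boot or machine.

```python
FOUR_GIB = 1 << 32


def parse_resources(text):
    """Parse sysfs 'resource' text into (index, start, end) for assigned regions."""
    regions = []
    for index, line in enumerate(text.splitlines()):
        fields = line.split()
        if len(fields) != 3:
            continue
        start, end, _flags = (int(field, 16) for field in fields)
        if start == 0 and end == 0:
            continue  # all zeros: this region was not assigned
        regions.append((index, start, end))
    return regions


def regions_above_4g(text):
    """Return only the assigned regions that start at or above 4 GiB."""
    return [region for region in parse_resources(text) if region[1] >= FOUR_GIB]


# Made-up sample: one unassigned region, one above 4 GiB, one below.
sample = (
    "0x0000000000000000 0x0000000000000000 0x0000000000000000\n"
    "0x0000000800000000 0x00000008001fffff 0x000000000014220c\n"
    "0x00000000f7c00000 0x00000000f7c7ffff 0x0000000000040200\n"
)
for index, start, end in regions_above_4g(sample):
    print(f"region {index}: {start:#x}-{end:#x}")

# Hypothetical usage on a real system:
#   with open("/sys/bus/pci/devices/0000:07:00.0/resource") as handle:
#       print(regions_above_4g(handle.read()))
```

If the working PC shows regions above 4 GiB and this board shows only zeros, that would support the idea that the firmware here never hands out the address space, regardless of what the hidden variable says.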