The first problem was that the RX460 GPU advertised no resizable BAR capability in its PCI config space. It might be possible to fix this by patching PCI memory, but the easier solution for me was to solder on an external ROM chip for the vBIOS and make the system use it. After that, the rebar capability showed up in lspci output.
Then rebardxe had to be inserted into the main BIOS and the BIOS reflashed. This went smoothly.
The next thing is to use a driver that supports SAM. On win10, so far only the amernime drivers with the flex-arch kernel have worked: 22.10.1 and 23.4.1.
These steps give a working 1GB BAR, but it appears to make little to no difference; several people have reported seeing improvements on polaris GPUs only with BARs of 2GB and larger.
Trying to set up 2GB with rebarstate gives a black screen and a soft-bricked laptop, because it needs above 4G decode enabled, which is broken in this BIOS version.
The option is there, though, and can be enabled from a modGRUBShell EFI bootable USB by running this command:
setup_var_cv PCI_COMMON 0x3 0x1 0x1
The main problem after doing that is that PCI resources are missing, so the system just crashes and the display stays stuck on the last static image.
It needs another BIOS patch that may be difficult to identify, but the CPU BKDG describes MMIO base/limit register pairs, and the HE_v1.22.10.19_Portable utility shows that 3 ranges are used by default: 0, 1 and 7.
After some experimenting, it turned out to be enough to initialize one more pair with values suitable for large memory. First I used setpci in Linux, then the same module in GRUB, and finally I patched rebardxe and compiled it so that the registers are set automatically whenever rebarstate is changed to anything other than 0.
Here's an example of the patched code:
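VOID reBarSetupDevice(EFI_HANDLE handle, EFI_PCI_ROOT_BRIDGE_IO_PROTOCOL_PCI_ADDRESS addrInfo)
{
    UINTN epos;
    UINT16 vid, did;
    UINTN pciAddress;
    UINT32 base = 0x48f0003;  // value for the MMIO base register (offset 0x90)
    UINT32 limit = 0xfffff00; // value for the MMIO limit register (offset 0x94)

    gBS->HandleProtocol(handle, &gEfiPciRootBridgeIoProtocolGuid, (void **)&pciRootBridgeIo);

    // program one more MMIO base/limit pair on bus 0, device 0x18, function 1
    // (the limit is written before the base)
    pciAddress = EFI_PCI_ADDRESS(0x0, 0x18, 0x1, 0);
    pciWriteConfigDword(pciAddress, 0x94, &limit);
    pciWriteConfigDword(pciAddress, 0x90, &base);

    // read vendor and device ID of the device being set up
    pciAddress = EFI_PCI_ADDRESS(addrInfo.Bus, addrInfo.Device, addrInfo.Function, 0);
    pciReadConfigWord(pciAddress, 0, &vid);
    pciReadConfigWord(pciAddress, 2, &did);
    // (excerpt ends here)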
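To see what the two values written in the patched code mean, here is a small Python sketch that decodes them. It assumes the register layout described in the Family 15h-style BKDG: bits 31:8 hold address bits 39:16, and bits 0 and 1 of the base register are read enable and write enable. It also assumes the pairs start at offset 0x80 and are 8 bytes apart. The function names are mine, and the layout should be checked against the BKDG for the exact CPU.

```python
# Sketch: decode MMIO base/limit register values into physical addresses.
# Assumed layout: bits 31:8 = address bits 39:16,
# base bit 0 = read enable, base bit 1 = write enable.

def pair_index(base_offset):
    """Pair number for a base register offset (pairs assumed to start at 0x80, 8 bytes apart)."""
    return (base_offset - 0x80) // 8

def decode_base(reg):
    """Return (start address, read enable, write enable)."""
    addr = (reg >> 8) << 16
    return addr, bool(reg & 0x1), bool(reg & 0x2)

def decode_limit(reg):
    """Return the last address covered by the range (low 16 bits are all ones)."""
    return ((reg >> 8) << 16) | 0xFFFF

base_addr, read_en, write_en = decode_base(0x48F0003)
limit_addr = decode_limit(0xFFFFF00)

print("pair", pair_index(0x90))
print("base ", hex(base_addr), "RE" if read_en else "", "WE" if write_en else "")
print("limit", hex(limit_addr))
```

Under these assumptions, offset 0x90 is pair 2, which is not one of the pairs (0, 1 and 7) already in use, and the decoded range runs from 0x48F000000 to 0xFFFFFFFFF.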
Then I modified the DSDT table, specifically the QWordMemory entries that are all zero:
QWordMemory (ResourceProducer, PosDecode, MinFixed, MaxFixed, Cacheable, ReadWrite,
    0x0000000000000000, // Granularity
    0x000000048F000000, // Range Minimum
    0x0000000FFFFFFFFF, // Range Maximum
    0x0000000000000000, // Translation Offset
    0x0000000B71000000, // Length
    ,, , AddressRangeMemory, TypeStatic)
This DSDT change is needed for Linux when running with above 4G decode disabled.
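A quick sanity check for the QWordMemory entry: with a fixed minimum and maximum, the length should equal maximum minus minimum plus one. A minimal Python sketch using the values from the entry above:

```python
# Check that the QWordMemory length matches its fixed min/max range.
range_min = 0x000000048F000000
range_max = 0x0000000FFFFFFFFF
declared_length = 0x0000000B71000000

computed_length = range_max - range_min + 1
print(hex(computed_length), computed_length == declared_length)
```

If the minimum is changed to suit another machine, the length has to be recomputed the same way.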
If it is only needed with above 4G decode enabled, the QWordMemory values above don't need to be changed, but another patch is needed:
the two If (LOr (MALH, MALL)) sections need correct offsets in their local variables.
If (LOr (MALH, MALL))
{
    CreateDWordField (CRS1, \_SB.PCI0._Y07._MIN, MN8L) // _MIN: Minimum Base Address
    Add (0x00CE, 0x04, Local0)
    CreateDWordField (CRS1, Local0, MN8H)
    Store (MABL, MN8L) /* \_SB_.PCI0._CRS.MN8L */
    Store (MABH, MN8H) /* \_SB_.PCI0._CRS.MN8H */
    CreateDWordField (CRS1, \_SB.PCI0._Y07._MAX, MX8L) // _MAX: Maximum Base Address
    Add (0x00D6, 0x04, Local1)
    CreateDWordField (CRS1, Local1, MX8H)
    CreateDWordField (CRS1, \_SB.PCI0._Y07._LEN, LN8L) // _LEN: Length
    Add (0x00E6, 0x04, Local2)
    CreateDWordField (CRS1, Local2, LN8H)
    Store (MABL, MN8L) /* \_SB_.PCI0._CRS.MN8L */
    Store (MABH, MN8H) /* \_SB_.PCI0._CRS.MN8H */
    Store (MALL, LN8L) /* \_SB_.PCI0._CRS.LN8L */
    Store (MALH, LN8H) /* \_SB_.PCI0._CRS.LN8H */
    Store (MAML, MX8L) /* \_SB_.PCI0._CRS.MX8L */
    Store (MAMH, MX8H) /* \_SB_.PCI0._CRS.MX8H */
}
If (LOr (MALH, MALL))
{
    CreateDWordField (CRS2, \_SB.PCI0._Y0E._MIN, MN9L) // _MIN: Minimum Base Address
    Add (0x008C, 0x04, Local0)
    CreateDWordField (CRS2, Local0, MN9H)
    CreateDWordField (CRS2, \_SB.PCI0._Y0E._MAX, MX9L) // _MAX: Maximum Base Address
    Add (0x0094, 0x04, Local1)
    CreateDWordField (CRS2, Local1, MX9H)
    CreateDWordField (CRS2, \_SB.PCI0._Y0E._LEN, LN9L) // _LEN: Length
    Add (0x00A4, 0x04, Local2)
    CreateDWordField (CRS2, Local2, LN9H)
    Store (MABL, MN9L) /* \_SB_.PCI0._CRS.MN9L */
    Store (MABH, MN9H) /* \_SB_.PCI0._CRS.MN9H */
    Store (MALL, LN9L) /* \_SB_.PCI0._CRS.LN9L */
    Store (MALH, LN9H) /* \_SB_.PCI0._CRS.LN9H */
    Store (MAML, MX9L) /* \_SB_.PCI0._CRS.MX9L */
    Store (MAMH, MX9H) /* \_SB_.PCI0._CRS.MX9H */
}
The wrong values that were in the original DSDT, in the same order as the corrected lines above:
Add (0x0670, 0x04, Local0)
Add (0x06B0, 0x04, Local1)
Add (0x0730, 0x04, Local2)
Add (0x0460, 0x04, Local0)
Add (0x04A0, 0x04, Local1)
Add (0x0520, 0x04, Local2)
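Comparing the two sets of constants, every corrected value is exactly the original value divided by 8. That would be consistent with the original DSDT using bit offsets where CreateDWordField expects byte offsets, although this is my reading of the numbers rather than something confirmed elsewhere. The spacing of the corrected offsets (_MAX 8 bytes after _MIN, _LEN 0x18 bytes after _MIN) also matches a descriptor made of consecutive 64-bit fields. A short Python sketch that checks both:

```python
# Original (wrong) and corrected constants, in the order they appear above.
original = [0x0670, 0x06B0, 0x0730, 0x0460, 0x04A0, 0x0520]
patched = [0x00CE, 0x00D6, 0x00E6, 0x008C, 0x0094, 0x00A4]

# Each original value, read as a bit offset, converts to the patched byte offset.
converted = [bits // 8 for bits in original]
all_match = all(bits % 8 == 0 for bits in original) and converted == patched

# Within each section: _MIN, _MAX, _LEN offsets.
for min_off, max_off, len_off in (patched[0:3], patched[3:6]):
    print(hex(min_off), "max-min =", max_off - min_off, "len-min =", len_off - min_off)

print("bit-to-byte conversion matches:", all_match)
```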
Now replace the BIOS DSDT table with the modified one, replace the rebardxe module with the one built from the patched code, and reflash the BIOS.
rebarstate needs to be activated so that the MMIO register writes are executed at each boot.
At this point a resizable BAR of up to 4GB works on the RX460 in Linux (legacy BIOS as well), with above 4G decode disabled, as it is not needed in Linux.
The MMIO patch solves a problem like the one in this comment, "failed to send message 254 ret is 0", when using a legacy-installed Ubuntu.
It also solved a number of other issues, including a black screen when above 4G decode is enabled in UEFI Ubuntu.
Still a work in progress, but so far UEFI Ubuntu boots with above 4G decode and a rebarstate of 4GB. Windows 10 crashes. Will update this once everything is working properly.
It was not possible to fix the ACPI bug easily. The first BSOD had something to do with the AHCI controller; I found its address in the DSDT and changed it to the one Linux assigned to the controller. Then a different BSOD code appeared:
*******************************************************************************
* *
* Bugcheck Analysis *
* *
*******************************************************************************
ACPI_BIOS_ERROR (a5)
The ACPI Bios in the system is not fully compliant with the ACPI specification.
The first value indicates where the incompatibility lies:
This BugCheck covers a great variety of ACPI problems. If a kernel debugger
is attached, use "!analyze -v". This command will analyze the precise problem,
and display whatever information is most useful for debugging the specific
error.
Arguments:
Arg1: 0000000000000002, ACPI_ROOT_PCI_RESOURCE_FAILURE
ACPI could not process the resource list for the PCI root buses
Arg2: ffff8e881deca980, The ACPI Extension for the PCI bus.
Arg3: 0000000000000001, ACPI cannot convert the BIOS' resource list into the proper
format. This probably represents a flaw in the BIOS' list
encoding procedure.
Arg4: ffff8e881dec1d60, Pointer to the QUERY_RESOURCE_REQUIREMENTS IRP
Debugging Details:
------------------
KEY_VALUES_STRING: 1
Key : Analysis.CPU.mSec
Value: 15906
Key : Analysis.Elapsed.mSec
Value: 16041
Key : Analysis.IO.Other.Mb
Value: 0
Key : Analysis.IO.Read.Mb
Value: 0
Key : Analysis.IO.Write.Mb
Value: 0
Key : Analysis.Init.CPU.mSec
Value: 2264
Key : Analysis.Init.Elapsed.mSec
Value: 56629
Key : Analysis.Memory.CommitPeak.Mb
Value: 68
Key : Bugcheck.Code.KiBugCheckData
Value: 0xa5
Key : Bugcheck.Code.LegacyAPI
Value: 0xa5
Key : Failure.Bucket
Value: 0xA5_ACPI!ACPIBusIrpQueryResourceRequirements
Key : Failure.Hash
Value: {c444906d-5cf9-2bd6-38f5-169a2b7b3c42}
Key : Hypervisor.Enlightenments.Value
Value: 0
Key : Hypervisor.Enlightenments.ValueHex
Value: 0
Key : Hypervisor.Flags.AnyHypervisorPresent
Value: 0
Key : Hypervisor.Flags.ApicEnlightened
Value: 0
Key : Hypervisor.Flags.ApicVirtualizationAvailable
Value: 1
Key : Hypervisor.Flags.AsyncMemoryHint
Value: 0
Key : Hypervisor.Flags.CoreSchedulerRequested
Value: 0
Key : Hypervisor.Flags.CpuManager
Value: 0
Key : Hypervisor.Flags.DeprecateAutoEoi
Value: 0
Key : Hypervisor.Flags.DynamicCpuDisabled
Value: 0
Key : Hypervisor.Flags.Epf
Value: 0
Key : Hypervisor.Flags.ExtendedProcessorMasks
Value: 0
Key : Hypervisor.Flags.HardwareMbecAvailable
Value: 0
Key : Hypervisor.Flags.MaxBankNumber
Value: 0
Key : Hypervisor.Flags.MemoryZeroingControl
Value: 0
Key : Hypervisor.Flags.NoExtendedRangeFlush
Value: 0
Key : Hypervisor.Flags.NoNonArchCoreSharing
Value: 0
Key : Hypervisor.Flags.Phase0InitDone
Value: 0
Key : Hypervisor.Flags.PowerSchedulerQos
Value: 0
Key : Hypervisor.Flags.RootScheduler
Value: 0
Key : Hypervisor.Flags.SynicAvailable
Value: 0
Key : Hypervisor.Flags.UseQpcBias
Value: 0
Key : Hypervisor.Flags.Value
Value: 16777216
Key : Hypervisor.Flags.ValueHex
Value: 1000000
Key : Hypervisor.Flags.VpAssistPage
Value: 0
Key : Hypervisor.Flags.VsmAvailable
Value: 0
Key : Hypervisor.RootFlags.AccessStats
Value: 0
Key : Hypervisor.RootFlags.CrashdumpEnlightened
Value: 0
Key : Hypervisor.RootFlags.CreateVirtualProcessor
Value: 0
Key : Hypervisor.RootFlags.DisableHyperthreading
Value: 0
Key : Hypervisor.RootFlags.HostTimelineSync
Value: 0
Key : Hypervisor.RootFlags.HypervisorDebuggingEnabled
Value: 0
Key : Hypervisor.RootFlags.IsHyperV
Value: 0
Key : Hypervisor.RootFlags.LivedumpEnlightened
Value: 0
Key : Hypervisor.RootFlags.MapDeviceInterrupt
Value: 0
Key : Hypervisor.RootFlags.MceEnlightened
Value: 0
Key : Hypervisor.RootFlags.Nested
Value: 0
Key : Hypervisor.RootFlags.StartLogicalProcessor
Value: 0
Key : Hypervisor.RootFlags.Value
Value: 0
Key : Hypervisor.RootFlags.ValueHex
Value: 0
Key : SecureKernel.HalpHvciEnabled
Value: 0
Key : WER.OS.Branch
Value: vb_release
Key : WER.OS.Version
Value: 10.0.19041.1
BUGCHECK_CODE: a5
BUGCHECK_P1: 2
BUGCHECK_P2: ffff8e881deca980
BUGCHECK_P3: 1
BUGCHECK_P4: ffff8e881dec1d60
ACPI_EXTENSION: ffff8e881deca980 -- (!acpikd.acpiext ffff8e881deca980)
ACPI_RESCONFLICT: 0000000000000001 -- (!acpiresconflict 0000000000000001 ffff8e881dec1d60)
acpiresconflict: Failed to initialize type nt!_IO_RESOURCE_REQUIREMENTS_LIST
PROCESS_NAME: System
LOCK_ADDRESS: fffff80147a44be0 -- (!locks fffff80147a44be0)
Resource @ nt!PiEngineLock (0xfffff80147a44be0) Exclusively owned
Threads: ffff8e881de88400-01<*>
1 total locks
PNP_TRIAGE_DATA:
Lock address : 0xfffff80147a44be0
Thread Count : 1
Thread address: 0xffff8e881de88400
Thread wait : 0x1c3
STACK_TEXT:
fffff188`2d0059c8 fffff801`47319022 : fffff188`2d005b30 fffff801`4717e080 00000000`00000000 00000000`00000000 : nt!DbgBreakPointWithStatus
fffff188`2d0059d0 fffff801`47318606 : 00000000`00000003 fffff188`2d005b30 fffff801`472162e0 00000000`000000a5 : nt!KiBugCheckDebugBreak+0x12
fffff188`2d005a30 fffff801`471fe417 : 00000000`00000002 00000000`00000000 00000000`c0000001 ffff8e88`1e6e7110 : nt!KeBugCheck2+0x946
fffff188`2d006140 fffff801`4a4e8182 : 00000000`000000a5 00000000`00000002 ffff8e88`1deca980 00000000`00000001 : nt!KeBugCheckEx+0x107
fffff188`2d006180 fffff801`4a441262 : ffff8e88`1deca980 00000000`00000000 ffff8e88`1dec1d60 ffff8e88`1dec1e30 : ACPI!ACPIBusIrpQueryResourceRequirements+0xbfe2
fffff188`2d006210 fffff801`4704ad55 : 00000000`00000007 ffff8e88`1e6bdce0 00000000`00000000 fffff188`2d006370 : ACPI!ACPIDispatchIrp+0x252
fffff188`2d006290 fffff801`47511b70 : 00000000`00000000 ffff8e88`1e6bdce0 fffff188`2d006370 ffff8e88`1dee6a20 : nt!IofCallDriver+0x55
fffff188`2d0062d0 fffff801`47567ce1 : ffff8e88`1e6bdce0 fffff188`2d006420 ffffffff`800000f8 fffff188`2d006458 : nt!IopSynchronousCall+0xf8
fffff188`2d006340 fffff801`47567b0f : 00000000`00000000 00000000`00000000 00000000`00000000 00000000`00000000 : nt!PpIrpQueryResourceRequirements+0x49
fffff188`2d0063d0 fffff801`47573b3f : 00000000`00000000 00000000`00000000 00000000`00000000 ffffffff`800000f8 : nt!PiQueryResourceRequirements+0x3b
fffff188`2d006450 fffff801`475702d4 : ffff8e88`1dee6a20 ffff8e88`1dee6a20 00000000`00000001 00000000`00000000 : nt!PiProcessNewDeviceNode+0x95f
fffff188`2d006620 fffff801`47181617 : 00000000`00000000 00000000`00000001 fffff188`2d006738 ffff8e88`00000000 : nt!PipProcessDevNodeTree+0x380
fffff188`2d0066f0 fffff801`471811ef : fffff801`435cf820 00000000`00000001 ffff8e88`1e5fe300 fffff801`00000000 : nt!PnpDeviceActionWorker+0x3c7
fffff188`2d0067b0 fffff801`4785d0ef : ffff8e88`1e0bedd0 fffff801`00000007 00000000`00000000 ffffdf02`00000024 : nt!PnpRequestDeviceAction+0x37b
fffff188`2d006820 fffff801`4785d892 : 00000000`00000001 00000000`00000001 ffffffff`800000d8 ffff8e88`1dec5480 : nt!PipInitializeCoreDriversByGroup+0x13f
fffff188`2d0068c0 fffff801`47849b31 : fffff801`00000000 fffff801`47a45700 fffff801`47a45b10 fffff801`435cf7b0 : nt!IopInitializeBootDrivers+0x186
fffff188`2d006a70 fffff801`47865f71 : fffff801`4b939000 fffff801`435cf7b0 fffff801`475b65b0 fffff801`435cf700 : nt!IoInitSystemPreDrivers+0xa71
fffff188`2d006bb0 fffff801`475b65eb : fffff801`435cf7b0 fffff801`47a47008 fffff801`475b65b0 fffff801`435cf7b0 : nt!IoInitSystem+0x15
fffff188`2d006be0 fffff801`47129905 : ffff8e88`1de88400 fffff801`475b65b0 fffff801`435cf7b0 e8104b8d`4800705a : nt!Phase1Initialization+0x3b
fffff188`2d006c10 fffff801`47207318 : fffff801`43943180 ffff8e88`1de88400 fffff801`471298b0 000001ba`007059d4 : nt!PspSystemThreadStartup+0x55
fffff188`2d006c60 00000000`00000000 : fffff188`2d007000 fffff188`2d001000 00000000`00000000 00000000`00000000 : nt!KiStartSystemThread+0x28
SYMBOL_NAME: ACPI!ACPIBusIrpQueryResourceRequirements+bfe2
MODULE_NAME: ACPI
IMAGE_NAME: ACPI.sys
IMAGE_VERSION: 10.0.19041.4355
STACK_COMMAND: .cxr; .ecxr ; kb
BUCKET_ID_FUNC_OFFSET: bfe2
FAILURE_BUCKET_ID: 0xA5_ACPI!ACPIBusIrpQueryResourceRequirements
OS_VERSION: 10.0.19041.1
BUILDLAB_STR: vb_release
OSPLATFORM_TYPE: x64
OSNAME: Windows 10
FAILURE_ID_HASH: {c444906d-5cf9-2bd6-38f5-169a2b7b3c42}
Followup: MachineOwner
---------
0: kd> lmvm ACPI
start end module name
fffff801`4a440000 fffff801`4a50c000 ACPI (pdb symbols) c:\myserversymbols\acpi.pdb\971DB8530BB0BB2CBA7514C46D12F27E1\acpi.pdb
Loaded symbol image file: ACPI.sys
Image path: \SystemRoot\System32\drivers\ACPI.sys
Image name: ACPI.sys
Image was built with /Brepro flag.
Timestamp: 0B752E6D (This is a reproducible build file hash, not a timestamp)
CheckSum: 000CADD8
ImageSize: 000CC000
File version: 10.0.19041.4355
Product version: 10.0.19041.4355
File flags: 0 (Mask 3F)
File OS: 40004 NT Win32
File type: 3.7 Driver
File date: 00000000.00000000
Translations: 0409.04b0
Information from resource tables:
CompanyName: Microsoft Corporation
ProductName: Microsoft® Windows® Operating System
InternalName: ACPI.sys
OriginalFilename: ACPI.sys
ProductVersion: 10.0.19041.4355
FileVersion: 10.0.19041.4355 (WinBuild.160101.0800)
FileDescription: ACPI Driver for NT
LegalCopyright: © Microsoft Corporation. All rights reserved.
0: kd>
Maybe the AHCI bug needs a better fix, maybe it's something else. Either way, I have no more time to waste on it, and no interest either.
Meanwhile, I also got the idea to try to get a BAR larger than 1GB without above 4G decode.
In the end this was possible.
The first step was to boot Linux and get the base addresses of the bridge port the GPU is connected to and of the GPU itself.
Then I used the setpci module in GRUB to write the obtained addresses to both devices before booting Windows from GRUB.
At that point the GPU was assigned to the large memory space, but still with a 1GB BAR.
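One way to read the addresses Linux assigned is the sysfs `resource` file of each PCI device, where every line is `start end flags` in hex. That is an assumption about tooling, since the same information is visible in lspci. A minimal Python sketch of the parsing follows; the sample text uses made-up addresses purely for illustration, and on a real system the text would come from a path like /sys/bus/pci/devices/<address>/resource.

```python
# Sketch: parse the text of a sysfs PCI "resource" file into (index, start, size).
def parse_resource(text):
    regions = []
    for index, line in enumerate(text.splitlines()):
        fields = line.split()
        if len(fields) != 3:
            continue
        start, end, flags = (int(field, 16) for field in fields)
        if start == 0 and end == 0:
            continue  # unused resource slot
        regions.append((index, start, end - start + 1))
    return regions

# Hypothetical sample content, not taken from the laptop in this post.
sample = """\
0x00000000c0000000 0x00000000cfffffff 0x000000000014220c
0x0000000000000000 0x0000000000000000 0x0000000000000000
0x00000000d0000000 0x00000000d01fffff 0x000000000014220c
"""

for index, start, size in parse_resource(sample):
    print(index, hex(start), size // (1024 * 1024), "MB")
```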
I did try to write the BAR size to the GPU's PCI config space from GRUB too (offset 0x209), but that didn't work because Windows overwrote it later.
Then I remembered I could try the same thing from windbg on another laptop, since I already had it set up to catch the BSOD.
Finally, after Windows booted, GPU-Z reported a BAR size of 4GB.
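For reference, the BAR size field in the PCIe resizable BAR control register is an exponent: a value of n means 2^(n+20) bytes, so 0 is 1MB and 12 is 4GB. The sketch below assumes offset 0x209 is the second byte of a control register located at 0x208 (capability at 0x200), which is what the offset in this post suggests but should be confirmed in lspci. The function names are mine.

```python
# Sketch: convert between a BAR size in bytes and the resizable BAR size code.
def rebar_size_bytes(code):
    """Size in bytes for a BAR size code (0 = 1 MB, doubling with each step)."""
    return 1 << (code + 20)

def rebar_code(size_bytes):
    """BAR size code for a power-of-two size of at least 1 MB."""
    if size_bytes < (1 << 20) or size_bytes & (size_bytes - 1):
        raise ValueError("size must be a power of two of at least 1 MB")
    return size_bytes.bit_length() - 1 - 20

GB = 1 << 30
for size in (1 * GB, 2 * GB, 4 * GB):
    print(size // GB, "GB -> code", hex(rebar_code(size)))
```

Under that assumption, the byte written for a 4GB BAR would be 0x0C and for 2GB it would be 0x0B.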
4GB was broken: benchmarks kept crashing, or if one finished there were stalls and the scores dropped because of them (watching hwinfo, VRAM usage kept getting reset to 0MB and the data was reloaded again).
Even Windows fonts were disappearing from some apps.
With 2GB it became stable, but there was no improvement. FPS in valley was slightly worse (by about 0.5FPS),
and the result was the same in game.
Some stuttering started to happen. In the same place in the game where there was a small stutter without the large BAR, there was now something close to a hang, with audio breaking up for maybe half a second before the game resumed. I had thought a large BAR might fix this problem even if it gave no FPS boost, but it only made it worse. The same type of glitch happened several times in other places on the map.
It seems a large BAR by itself is of little to no help, and that other features associated with it are what actually give a performance increase.
Ryzen 5000 and newer CPUs have this in hardware; older CPUs only emulate it in microcode and are many times slower doing so.