[Guide] How to Modify a Polaris (Radeon RX Series) GPU


To get 0.95v on both auto and manual voltages you'd need to set a -250mV offset and change the manual voltage to 1.2v, as the offset is applied to both auto and manual voltages.
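The arithmetic behind that can be sketched as below. This is only an illustration that assumes the offset is simply added to whatever voltage is requested; the function name is made up for the example and is not part of any tool.

```python
def effective_voltage_mv(requested_mv: int, offset_mv: int) -> int:
    """Voltage the core ends up with, ASSUMING the offset is simply added."""
    return requested_mv + offset_mv


OFFSET_MV = -250  # the offset applies to auto and manual voltages alike

# A 1.2v manual value with a -250mV offset lands on the 0.95v target.
print(effective_voltage_mv(1200, OFFSET_MV))
```

Working in millivolts keeps everything as integers, so there is no floating-point rounding to worry about.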



I wouldn't say you are looking at any substantial improvement by replacing them. The biggest gain from overspeccing is generally lower heat output, and you might get more stable voltages under load, but other factors start to come into play for that.

here's the thing:

case 1: disabled powernow in system BIOS, so all CPU cores are at max 3.5GHz, during furmark stress test the card disappears from hwinfo (all fields grayed out) system crashes with BSOD (or without it) and restarts or gets stuck like that until power-cycled

diagnosis: the GPU core MOSFETs overheat and trigger the VR_HOT signal. the temperature measured with a DMM thermal probe on the MOSFET is 86°C at the moment the overheat is triggered. if i have a fan blowing at them the temp remains below 80°C and furmark keeps running (30 minutes tested). at the same time the GDDR5 VRM is close to 100°C but keeps running fine at 1.65V

this problem i can solve by replacing the thermal pads with graphite ones or using an extra cooler

case 2: input 1.20V on the core and set GPU core to 1400MHz, the same VR_HOT signal is triggered within seconds after starting the stress test and the same thing described above happens

diagnosis: looks like overcurrent protection also triggers VR_HOT, because of insufficient current for 1.20V (still not sure if it's a board, system BIOS or MOSFET limitation)

buildzoid wrote in one of his articles that number of phases multiplied by low-side current can give estimated current capabilities, while in another article i read they say high-side is control MOSFET and it limits output current of a phase…

now if i knew for sure low-side is limiting here i'd simply switch to E6932, but the confusing part is their continuous current draw rating against ambient temperature is quite lower than with E6930, for both low and high side.
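to put the two rules of thumb side by side, here is a rough sketch. the phase count and the per-MOSFET ratings below are made-up placeholders, not datasheet values for E6930/E6932 or measurements from this board, and taking the lower of the two ratings is just one cautious way to handle not knowing which side limits.

```python
def estimate_by_low_side(phases: int, low_side_a: float) -> float:
    """Rule of thumb from the thread: phases x low-side current rating."""
    return phases * low_side_a


def estimate_conservative(phases: int, low_side_a: float, high_side_a: float) -> float:
    """If it is unclear which FET limits a phase, assume the weaker one does."""
    return phases * min(low_side_a, high_side_a)


# Placeholder numbers for illustration only.
PHASES = 2
LOW_SIDE_A = 30.0
HIGH_SIDE_A = 20.0

print(estimate_by_low_side(PHASES, LOW_SIDE_A))
print(estimate_conservative(PHASES, LOW_SIDE_A, HIGH_SIDE_A))
```

real ratings also derate with temperature, so the figure to plug in is the one for the temperature the MOSFETs actually run at, not the headline number.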

something interesting regarding GDDR5 VRM too, seems like default switching frequency is about 500kHz, and increasing it to 800kHz (without voltage change) manifests with more EDC errors. will see if i can lower it down to 300kHz and whether it'll have any impact

edit: just remembered that with some win10 version, the max core current limit value was 69.272, with most others it is 69.727. does that smell like a soft limit?

Almost 100°C for the GDDR VRM is getting pretty high; I'd see about cooling that better. The actual VRM should be fine, and there's such a small difference in the core current limit that I'd consider it a non-issue.

after a month of running the laptop with thermal paste i took off the heatsink to find this: one of the GPU phases is not being used at all, as the paste on it is as liquid as it was when i applied it, while on the other inductor it has already become somewhat dried out. likely the machine was sold with some kind of GPU VRM defect (no wonder the seller gave it for 150$ less than the listed price), or even if this defect happened during use it makes the overclocking attempt useless…
what is worse, below the inductors there is about 1.5mm of space to the heatsink; what the dell monkeys did is stick a 0.5mm thermal pad to the heatsink and leave 1mm of empty space to the inductors…

Except for the actual CPU or GPU die everything else should use thermal pads, 6w/mk will do the job nicely.

Hello, I have a Sapphire RX 580 Pulse. I followed the guide and was able to improve the memory bandwidth a little, but I also tried to overclock the card. If I do it with Afterburner (1430 and 2175) the settings are applied, but despite having flashed the BIOS with the new clocks, the card still uses the default settings.
Did I forget to update anything?
Here is my bios: https://drive.google.com/file/d/1U3rx1JS…iew?usp=sharing

Thanks in advance for any help



no doubt in that, this was temporary solution until i get pads delivered. and the result of it really took my attention - one phase has been abused, other unused. i managed to download schematic for the whole laptop, measured all resistors with DMM in GPU VRM circuit and except for the temperature sensing circuit almost all resistances correspond to the schematics.

the ones that differ could be due to different board revision (schematic was for rev A, board is rev D), but even if that's the case i measured these same resistors on the CPU and NB controller and they come up with similar deviation from schematics as the GPU ones do…

so everything seems to be working fine, seems, but it doesn't.

unless intersil refers to such a controller behaviour as a "perfect balancing" as they promise in datasheet

@ket :
Thank you very much for taking the time to write and offer such a fine guide within our Forum. Well done!
To make it easier for visitors to find it, I have "stickied" it.
Thanks again!
Dieter



You are very welcome @Fernando , and thanks!



@Nemo1985 uninstall the Radeon driver from safe mode with Display Driver Uninstaller, then reinstall it; that should sort you out. Also remember to move the power slider to +30%. I made some additional changes to the vBIOS you provided (see attachment):

Min voltage limit: 0.6v
Set default 2D voltage to 0.7v instead of 0.75v (you might be able to go lower for additional power saving, but have a means ready to revert to a previous vBIOS in case you set voltage too low)
Increased manual voltage limit to 1.26v to compensate for vdroop under load (your card doesn't use offsets, I might be able to enable them)
Reduced auto voltages to 1.15v to stop the vBIOS from blasting the card with more voltage than is needed which will save you power and reduce heat
Reduced vDDC from 0.95v to 0.9v, this should actually help improve memory overclocking and again reduce power usage and heat
Increased default TDP from 145w to 160w, this is so when you max out the power slider to +30% the card has pretty much the full scope of power that can be provided to it via the PCI-E slot and 8pin connector
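The reasoning behind the TDP change can be checked with a quick sketch. It assumes the usual connector figures of 75W for the PCI-E slot and 150W for an 8pin, and it treats the vBIOS TDP value as total board power, which is a simplification. The function name is only for illustration.

```python
PCIE_SLOT_W = 75   # assumed limit for the PCI-E x16 slot
EIGHT_PIN_W = 150  # assumed limit for one 8pin connector


def max_power_limit_w(default_tdp_w: int, slider_percent: int) -> float:
    """Power limit once the power slider is applied on top of the default TDP."""
    return default_tdp_w * (100 + slider_percent) / 100


budget = PCIE_SLOT_W + EIGHT_PIN_W
for tdp in (145, 160):
    limit = max_power_limit_w(tdp, 30)
    print(f"{tdp}w default -> {limit}w at +30% (connector budget {budget}w)")
```

Under those assumptions the 160w default with the slider maxed sits close to, but still under, the combined connector budget, while the 145w default leaves more of it unused.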

Have you checked what memory your card has with GPU-Z? It looks like you edited timings for both Samsung and Hynix memory.







Usually board revisions just represent changes in components, nothing too drastic or mega cheapening… but the latter can still happen, as it did with the RX590 I looked at. I don't rate Intersil controllers at all; it is more likely that one phase is soft-disabled, probably where the cooling couldn't handle it.

Mod03.zip (107 KB)

@ket
Thank you very much for your valuable advice and help!
My card has the Hynix memory; I probably made a mess with my first timing mod. In short, it is the very same model as yours.
While waiting for your advice I solved the issue another way: I modified the BIOS of the Nitro+ version, changing the device ID and subvendor ID.
I kept the frequency at 2000MHz and used these settings: "777000000000000022EE9C00106A6D4D906914153C8E160B004684007D0714204A8900A00200712414143F48B1324C1A" (your settings); the GPU clock is at 1430 with 1138 voltage as before.
With those settings I'm able to reach 1430 consistently without using Afterburner to increase the power limit.
In poclmembench I get 209-210GB/s, quite far from your stunning result but better than the default values.
With the current BIOS I improved gaming performance from 78 fps to 82 in the Shadow of the Tomb Raider demo benchmark.


Edit: I tried the modified BIOS; the poclmembench result is obviously lower (around 200GB/s) and gaming performance is lower too.
Then I used your BIOS as a base and put in the PowerTune values of the Nitro+ with PolarisBiosEditor, but the maximum core frequency is still 1366, so I'm guessing there are some hidden settings that keep the frequency lower in our Pulse BIOS?

I further edited my old values; now at 2200MHz memory clock I'm able to reach 228-220GB/s, apparently without artifacts (I used HWiNFO64 as per your guide).

I attached the version I'm using now. I was only able to decrease the idle voltage to 700 as you did; the system apparently worked fine in 3D, but after 2-3 hours the screen went grey and then turned off and I had to hard reset the system. Any suggestions would be appreciated.

Mod06c actual.zip (110 KB)

@Nemo1985 try this timing strap, if you have a strong enough memory controller this should let you hit 2200MHz+. These timings are about as tight as you can make them for the frequency and you should hit 226/228GB/s with them. 777000000000000022EE9C00106A7D4DA06914153C8EC60B004684007D0714204A8900A00200712414143F48BC324C1A.
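To see exactly what changed between the strap Nemo1985 quoted earlier and this one, a byte-by-byte comparison helps. This is just a generic hex diff; it makes no attempt to decode which timing fields the differing bytes belong to.

```python
OLD_STRAP = (
    "777000000000000022EE9C00106A6D4D906914153C8E160B"
    "004684007D0714204A8900A00200712414143F48B1324C1A"
)
NEW_STRAP = (
    "777000000000000022EE9C00106A7D4DA06914153C8EC60B"
    "004684007D0714204A8900A00200712414143F48BC324C1A"
)


def strap_diff(a_hex: str, b_hex: str) -> list[tuple[int, int, int]]:
    """Return (byte offset, old value, new value) for every byte that differs."""
    a, b = bytes.fromhex(a_hex), bytes.fromhex(b_hex)
    if len(a) != len(b):
        raise ValueError("straps must be the same length")
    return [(i, x, y) for i, (x, y) in enumerate(zip(a, b)) if x != y]


for offset, old, new in strap_diff(OLD_STRAP, NEW_STRAP):
    print(f"byte {offset:2d}: {old:02X} -> {new:02X}")
```

Diffing straps like this before flashing is also a cheap way to catch a typo in a 96-character string.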




Thank you very much, I have been able to use those settings at 2150 (at 2200 I get memory errors), which gave me a bandwidth of 225GB/s.
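For context, measured bandwidth can be compared against the theoretical peak. The sketch below assumes a 256-bit memory bus and GDDR5 moving four transfers per pin per reported memory clock; the function name is only for illustration.

```python
def gddr5_peak_gb_s(mem_clock_mhz: int, bus_width_bits: int = 256) -> float:
    """Theoretical peak bandwidth in GB/s, ASSUMING 4 transfers per reported clock."""
    return mem_clock_mhz * 4 * bus_width_bits / 8 / 1000


peak = gddr5_peak_gb_s(2150)
measured = 225  # GB/s, the benchmark figure reported above
print(f"peak {peak:.1f} GB/s, measured {measured} GB/s, "
      f"efficiency {measured / peak:.0%}")
```

Tighter timings raise the measured figure towards the peak, while a higher memory clock raises the peak itself, which is why both are worth tracking when comparing straps.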

@ket
Why does state 5 have a higher voltage on the RX 580 Sapphire Pulse?
It seems that this is the case for a few manufacturers.

1.jpg

@boombastik it's a bug with one of the automatic pointers. You're better off inputting a manual value anyway and limiting auto voltages to no more than 1.15v to stop cards from cooking themselves. Auto voltages are absolutely terrible on Polaris cards; I have no idea how they passed any kind of QA test, as those auto values will blast a card with 1.2v even if it only needs 1.1v or less.

Hello and I'd like to start by thanking you for this very thorough guide.

So I have an XFX RS RX570 8GB, the one with the shitty VRM, 2 BIOSes and Samsung (K4G80325FC) memory. I've been messing with the memory timings for about a week now, after reading your guide. I've been largely unsuccessful, as the 7-8 straps I've tried so far have either resulted in lots of errors in HWiNFO or no errors but black screens/freezes/artifacts during gaming.

Frustrated, today I decided to reflash the original BIOS, run oclmembench and use AMD Memory Tweak XL to check the load timings in real time. I also checked the timings for the idle 300mhz strap. This led to some interesting, yet annoying discoveries which I've outlined in the picture:

Screenshot_23.png



That is what happens during an oclmembench test done at stock settings (1150/2050). Clearly, the card isn't exactly following what the strap tells it to do. Worse, when I apply any setting in MSI Afterburner or OverdriveNTool, the oclmembench results take a hit, and if I check the real timings again, the top 2000mhz memory state now uses the ARB_DRAM_TIMING and ARB_DRAM_TIMING2 values from the bottom 300mhz state. These are very tight timings, like RAS2RAS 31, RP 13. So basically, since the card is idle when I apply settings in MSI Afterburner, those 2 timing sections become "frozen" and stay the same no matter what strap the card is in. This also happens when I set anything in OverdriveNTool.

This is likely why I'm getting either memory errors or crashes when testing modded straps: I have set the card to 1750mhz in the BIOS, I apply the modded timings to the 2000 strap, and once in Windows I apply my 2100mhz overclock in MSI Afterburner. That ends up messing up the ARB_DRAM_TIMING and ARB_DRAM_TIMING2 timings.

The only way to avoid this is to do all my settings in the BIOS and never use any tools to alter them in Windows. This is annoying, since it means I could brick the BIOS if I try aggressive timings, and I'd also be unable to pick different settings for each game (some stuff I play is old and really doesn't need 1400/2100 clocks).

So my questions are:
1. Should I worry about the usually messed up ARB_DRAM_TIMING and ARB_DRAM_TIMING2 timings? Out of the box, at 2050, the card scores 210GB/s in oclmembench and has no issues overclocking to 1400 at 1.125v and staying stable in games. No errors in HWiNFO either. This is despite the messed up ARB_DRAM_TIMING and ARB_DRAM_TIMING2 timings. It's not like I can do anything about this tho.

2. Is there any way to maintain the ability to change card settings in Windows without really screwing up the ARB_DRAM_TIMING and ARB_DRAM_TIMING2 timings? Or should I just settle on a default configuration and configure that via the BIOS?

Some more details about my card: ASIC Quality 73%. VDDCI is 0.838v under load.

could that tweaker xl tool be unstable/broken? it was also giving me trouble on rx560. using several oc tools at once will likely result in crashes. get rid of afterburner and use adrenaline's builtin wattman only

I only use the tool to check the timings. I didn't set anything in it. Wattman is worse than Afterburner in that trying 1400/2100 in Wattman results in the driver crashing. Only MSI Afterburner has given me stable overclocks, until I began playing with the memory timings. I doubt tweaker xl tool is broken as it seems to read everything correctly (I have to fix the fan profiles in the BIOS) and the drop in performance (in oclmembench) and occasional memory errors (in HWiNFO) coincide with tweaker xl tool detecting the 300mhz strap values for ARB_DRAM_TIMING and ARB_DRAM_TIMING2.

they are all just frontends to set the desired clocks, and should not cause unstable overclocks, but the way each of them works may cause incompatibility between them. for example afterburner sets additional registry keys and a soft powerplay table once you tick that OC checkbox, which might set some bits differently than adrenaline/overdriventool, but if you edit the BIOS with all the required changes there should be no problem for adrenaline either…
about tweaker xl: yes, it reads correctly, but it sets timings that reset or autochange only seconds later - thus broken for whatever reason, and this is where i stopped wasting time with it