RAID1 Mirror Corruption on 2008 R2 Server with Intel RSTe Controller with Intel SSD Drives

During my test, parity errors also showed up with driver 3.6.0.1086 (probably due to RAID ROM 4.1.0.1026).

That’s why I made sure to mention that I have SATA OROM 3.8. BTW, Supermicro offered to give me a BIOS that would upgrade my SATA OROM to 4.1. I chose not to install it, since I was getting the impression that the best way to go here was the 3.8 OROM and the 3.6 drivers, as per Stephen Done’s initial report.

Okay, I cut the 3.6 HDD stress test a bit short at 3 1/2 days; parity verification shows no errors. I wanted to move along to testing the 3.8 drivers with SSDs and TRIM disabled. That test is now running, set for 7 days.
Here are the exact details:
SATA OROM 3.8, IRSTe 3.8.0.1111 drivers and GUI, two Intel 530 SSDs in a RAID 1 volume for the OS (Server 2008 R2), TRIM disabled via "fsutil behavior set disabledeletenotify 1" in a command prompt (followed by a reboot of the machine), Passmark BurnIn test running at 60% load on the disks.

I will report back, but my gut instinct tells me that it will be okay. I really believe that the trouble with all of this is TRIM with RAID 1.

I am currently repeating my tests with ‘fsutil behavior set disabledeletenotify 1’, this time having rebooted the server under test after issuing the command (previously I did not reboot, because some MS doc stated that this is not necessary, but under those conditions my test failed, as mentioned above).
I hope that this will help, because if not, I really do not know what to do as a next step…
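Since these runs depend on the TRIM setting actually being in effect, it may be worth reading the setting back before starting a multi-day test. Below is a minimal Python sketch that wraps "fsutil behavior query DisableDeleteNotify". The function names are made up for illustration, and the output format the parser expects ("DisableDeleteNotify = 1", optionally prefixed with a file system name on newer Windows versions) is an assumption that may not hold on localized systems. It only reports the OS setting, not what the RAID driver does with it.

```python
import re
import subprocess


def parse_delete_notify(output: str) -> dict:
    """Return {file_system: True/False}, True meaning TRIM is disabled in the OS.

    Assumed line formats (not guaranteed on every Windows version):
        DisableDeleteNotify = 1
        NTFS DisableDeleteNotify = 0
    Lines without a file system prefix are stored under the key "default".
    """
    result = {}
    pattern = r"(?:(\w+)\s+)?DisableDeleteNotify\s*=\s*(\d)"
    for line in output.splitlines():
        match = re.search(pattern, line, re.IGNORECASE)
        if match:
            file_system = match.group(1) or "default"
            result[file_system] = match.group(2) == "1"
    return result


def query_trim_disabled() -> dict:
    """Run fsutil (Windows only, elevated prompt) and parse its answer."""
    completed = subprocess.run(
        ["fsutil", "behavior", "query", "DisableDeleteNotify"],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_delete_notify(completed.stdout)
```

Calling query_trim_disabled() once before and once after the reboot would at least show whether the OS reports the same value in both cases.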

Just wanted to add this. I found this document from Intel
http://www.intel.com/content/dam/www/pub…tions-paper.pdf
It has some interesting info, especially for you Stephen, regarding trim on the s3500 and s3700 series ssds. Apparently trim is not so beneficial with these units. Also some interesting points about checking alignment.

To theRealBibo:
Is my understanding correct that you have SSDs in RAID 1 running with TRIM disabled and 3.8 series drivers? How are you stress testing? Did your previous test fail?

@billydv
No, as Steve complained, I am not using the enterprise versions, simply because my server (board: Supermicro X9SCM-F) was not delivered with them.
Instead I am using:
Intel® Rapid Storage Technology for Enterprise: Information
Kit installed: 13.6.0.1002
User interface version: 13.6.0.1002
Language: German (Germany)
RAID option ROM version: 11.6.0.1702
Driver version: 13.6.0.1002
ISDI version: 13.6.0.1002
But in this configuration I ran into a comparable, or even the same, problem: regular check & verify faults…
I do stress testing using "h2testw.exe", which is a very useful tool for checking the consistency of hard disks or even USB sticks.
Yes, my previous test with TRIM set to disabled (but without rebooting after changing the setting) failed!
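For readers without h2testw, the general idea of a write-then-verify consistency test can be sketched in a few lines of Python. This is only an illustration of the principle and not h2testw's actual algorithm; the function names, block size and seed handling are all made up for the example.

```python
import hashlib
import os

CHUNK = 1024 * 1024  # 1 MiB per block (arbitrary choice for this sketch)


def _block(seed: int, index: int) -> bytes:
    """Deterministic pseudo-random block derived from the seed and block index."""
    return hashlib.shake_256(f"{seed}:{index}".encode()).digest(CHUNK)


def write_pattern(path: str, blocks: int, seed: int = 0) -> None:
    """Fill a file with reproducible data and ask the OS to flush it to disk."""
    with open(path, "wb") as f:
        for i in range(blocks):
            f.write(_block(seed, i))
        f.flush()
        os.fsync(f.fileno())


def verify_pattern(path: str, blocks: int, seed: int = 0) -> list:
    """Return the indices of blocks that no longer match what was written."""
    bad = []
    with open(path, "rb") as f:
        for i in range(blocks):
            if f.read(CHUNK) != _block(seed, i):
                bad.append(i)
    return bad
```

Note that reading a file back right after writing it may be served from the OS cache, so a test like this is more meaningful with a data set larger than RAM or with a reboot between the write and verify passes. It also only checks what the file system returns, which is not the same thing as the RAID driver's mirror verification.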

@ theRealBibo:

As far as I know, it is generally not a good idea to combine an Intel RAID Utility of the v11.x.x.xxxx series with any Intel RST RAID driver of the v13.x.x.xxxx series.
Furthermore, I doubt that your Intel C204 chipset RAID 1 system is fully supported by the Intel RST drivers v13.x.x.xxxx, which are designed for Intel 8- and 9-Series chipset systems.
The fact that you were able to install the Intel RST RAID drivers v13.6.0.1002 and the related Intel RST Console Software does not mean that you will get the best possible results with them.
Although all Intel SATA RAID controllers from ICH8R up to the 9-Series (except the C600/C600+ chipset series) have the same universal external DeviceID DEV_2822, their internal DeviceIDs and their features are quite different.

RealBibo,
Your motherboard is socket 1155, and the IRST recommended for your SATA chipset on the Supermicro website is the 12 series. I have probably 20-25 computers built in just one location which ALL have RAID 1 or RAID 10 with 13.6 IRST and Windows 8.1. None have ever had a single parity error. We have very different situations: I have had SUPER success with 1155 ASUS boards running 13.6 drivers. Stephen’s problems and mine lie with the C602 chipset and the IRSTe drivers.
What I will say is I don’t understand why TRIM would be enabled on your chipset when it is not enabled on any of my ASUS 1155 boards.
Here is part of the release notes for the 11 series IRST:

1. The Intel® Rapid Storage Technology driver and user interface version of this release is 11.7.0.1013 and the RST OROM/UEFI version hasn’t changed from the previous maintenance release which is 11.6.0.1702.


You may want to try that driver which is available here https://downloadcenter.intel.com/downloa…RST-RAID-Driver

Finally,
The results are in: with 3.8 series drivers, SSDs in RAID 1 and TRIM disabled in the OS, the RAID volume does not fall apart.

[Attached screenshot: Capture6-8-15.JPG]

I believe that this has always been the problem: TRIM simply does not work with RAID 1. I intend to rerun the test, but I’m quite sure the results will be the same.

Just to update: a push of the reset button while everything was running also completed successfully. After restarting, parity verification showed no errors. I am now about to run another 5 day test, having swapped out the Intel 530 240GB drives for Intel S3710 200GB drives. I will report the results back next weekend.

Just reported all results to Intel engineers so that maybe they can help.

Very interesting!
Just to discuss the basics: does anybody have a complete understanding of SSD TRIM?
If my understanding is correct, the RAID driver (simply) has to ‘duplicate’ the TRIM command in RAID 1 and send it to both SSDs (which is why I never understood why TRIM was not supported by the drivers for such a long time).
One of my theories has always been that the TRIM command has a different impact on each SSD, because it’s simply different hardware…
Maybe the verification faults result from different content being returned when reading the raw sectors of the SSDs that have been TRIMmed (meaning sectors whose contents no longer matter to the file system)?
Because I never had serious file system corruption!
Regards
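To make the theory in the post above concrete, here is a toy Python model of it. It is purely an illustration of the hypothesis and not a confirmed explanation of what the IRSTe driver or any particular SSD does. The "drive A returns zeros, drive B returns stale data" behavior is invented for the example; what a drive returns for a trimmed sector depends on the drive.

```python
def verify_mirror(disk_a, disk_b, allocated=None):
    """Return the sector numbers whose contents differ between the two members.

    With allocated=None every raw sector is compared (a block-level check).
    With a set of sector numbers, only sectors the file system still uses
    are compared.
    """
    sectors = range(len(disk_a)) if allocated is None else sorted(allocated)
    return [s for s in sectors if disk_a[s] != disk_b[s]]


# Toy 8-sector mirror, both members identical and fully allocated.
disk_a = [b"data%d" % i for i in range(8)]
disk_b = list(disk_a)
allocated = set(range(8))

# The file system deletes a file in sectors 5 and 6 and TRIM is sent to both drives.
for sector in (5, 6):
    allocated.discard(sector)
    disk_a[sector] = b"\x00" * 5  # hypothetical drive A: reads back zeros after TRIM
    # hypothetical drive B: keeps returning the old data, so disk_b is unchanged

raw_mismatches = verify_mirror(disk_a, disk_b)             # block-level view
fs_mismatches = verify_mirror(disk_a, disk_b, allocated)   # file system view

print("raw sector mismatches:", raw_mismatches)
print("mismatches in allocated sectors:", fs_mismatches)
```

In this model a block-level verify flags the trimmed sectors while everything the file system still references is identical, which would match seeing verification errors without file system corruption.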

Hi Billydv,

I think you have just performed the most interesting test so far - thank you for that!
I hope you have more response from Intel than we did. We only managed some vague Chinese whispers via SuperMicro. None of which fixed anything.
But I will keep my fingers crossed… for all of us!
Good luck!

Steve

Hi Stephen,
I could have stopped simply at the point of knowing that 3.6 works with Server 2008 R2, but I have a big problem: I will be going to Server 2012 R2, which according to Intel requires at a minimum the 3.8 drivers. Now that I have repeatable testing that shows the problem is with the enabling of TRIM, I know that I can install 3.8 drivers on 2012 R2 and simply disable TRIM.

Just wanted to show everyone some emails exchanged between Intel and myself regarding sata OROM version and IRSTe driver versions. Stephen, this is your exact situation.
My email to Intel:
If I could just get one definitive answer in the interim: since the 3.6 series IRSTe drivers are the ones recommended for Server 2008 R2 on your website, can they be used successfully on a motherboard that has SATA OROM 3.8.0.1029? What, if any, disadvantage or danger would there be if I were to use that configuration?
Intel’s response:
Hello Bill,
There should not be any problem if you use an Intel RSTe version older
than the Option ROM version. Intel RST should be able to work fine with
newer and older OROM versions because it has been validated for the SATA
controller that comes in that motherboard.

@billydv
Thank you for this detailed info!
Yes, you are right, I did not realize that the 13.6 drivers are not made to support my C204 chipset (I simply updated to the newest version I found at Intel, because I had some other problems which I hoped to solve with this update, and because it did install…)
I have now finished my stress tests using the 12.8.0.1016 driver (which Supermicro recommends), and what can I say: it works without any verification faults!

Now I am taking my next steps, because I have to migrate some servers from 2008 R2 to 2012 R2. The problem is that the 12.8.0.1016 driver does not seem to support 2012 R2 (according to its readme). I installed it anyway on a test machine running 2012 R2, and my first impression is that everything is working fine.

Any comments on this?

Thank you in advance!

Just a copy of my email to SuperMicro support Europe with an already open ticket regarding this matter.

Hi Quintijn,
I just wanted to give you and your company some info as I have been testing extensively to find the exact cause to our problems. Ultimately what I have found is that while using ssds in a raid 1 configuration, only the 3.6 series drivers from Intel’s website work correctly with no parity errors or mirror corruption under high IO load. Stress testing is done via Passmark Burn in test with disk testing set to 60%. Also, a push of the reset button while system is running also does not damage the volume whatsoever, parity is maintained and verification shows no errors. The main difference with 3.6 series IRSTe drivers is that they do not enable trim in raid 1.
I have also tested extensively with the 3.8 series drivers from Intel’s website. If the driver is installed and used normally, ssd raid 1 volumes will corrupt. If trim is disabled in OS via “fsutil behavior set disabledeletenotify 1”, the 3.8 series drivers are stable, again with both manners of tests, Passmark and a push of the reset button. Neither test showed any errors in parity verification. What I believe is most important for you to know is that my testing with the 4.2 series drivers fails even with hdds. Currently you have 4.2 series drivers on your website as recommended.

Here are detailed results
Server 2008r2, C602 Chipset

1- IRSTe 3.8, 3.9, 4.1, 4.2 with raid 1 ssds fails

2- IRSTe 3.6, 3.8, 3.9, 4.1 with raid 1 hdds succeeds

3- IRSTe 4.2 with raid 1 hdds fails

4- IRSTe 3.6 with raid 1 ssds succeeds (I ran this twice to confirm my results; each run resulted in no parity errors. A 3rd test was to push the reset button on the server while it was running. The server restarted normally and parity verification reported no errors)

5- IRSTe 3.8 with raid 1 ssds succeeds if trim is disabled via "fsutil behavior set disabledeletenotify 1"

The above tests are repeatable. I believe ultimately that trim in raid 1 with ssds is simply broken in spite of the fact that all IRSTe drivers version 3.8 and higher support and enable it. I believe the tests would be repeatable in all motherboards with c600 series chipsets. I would hope that your engineers could discuss the matter with Intel engineers to correct the issue and ultimately either fix trim in raid 1 or simply update all driver versions to disable it. Please keep me posted of any changes regarding this situation.
Thanks,
Billy DeVincentis
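The five results listed in the email above can also be kept as a small lookup table, which makes it easy to see at a glance which combinations were actually tested. The Python sketch below only restates those results (Server 2008 R2, C602 chipset). The field layout and the function name are made up for illustration, and "trim_disabled" refers to the OS setting made with fsutil.

```python
from typing import Optional

# (driver series, media, TRIM disabled in OS) -> True if the volume verified clean
RESULTS = {}

for driver in ("3.8", "3.9", "4.1", "4.2"):       # result 1: SSDs, default settings
    RESULTS[(driver, "ssd", False)] = False
for driver in ("3.6", "3.8", "3.9", "4.1"):       # result 2: HDDs
    RESULTS[(driver, "hdd", False)] = True
RESULTS[("4.2", "hdd", False)] = False            # result 3
RESULTS[("3.6", "ssd", False)] = True             # result 4
RESULTS[("3.8", "ssd", True)] = True              # result 5: TRIM disabled via fsutil


def lookup(driver: str, media: str, trim_disabled: bool = False) -> Optional[bool]:
    """Return True (passed), False (failed) or None if that combination was not tested."""
    return RESULTS.get((driver, media, trim_disabled))


if __name__ == "__main__":
    for key in sorted(RESULTS):
        print(key, "succeeds" if RESULTS[key] else "fails")
```

A None result is a reminder that a combination, for example 3.9 with SSDs and TRIM disabled, is not covered by the list and would need its own test run.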

The RealBibo,
If you are now testing Server 2012 R2, do you have any of Intel’s 12 series IRST drivers installed? I would probably test extensively with the 12.9 series
https://downloadcenter.intel.com/downloa…RST-RAID-Driver
I believe TRIM in RAID 1 is disabled on your chipset anyway, so hopefully you would not see any of these issues. Test both the 12.9 and 13.6 series. If you see any parity issues, disable TRIM in the OS and test again. As I told you previously, I have countless workstations running SSDs in RAID 1 with 13.6 series drivers and have never had any issues; TRIM is disabled by default on those systems.