Raid setup
How to configure RAID in the BIOS
Hosts with PERC H750 RAID Controllers
We have experienced a number of issues with the configuration of this RAID controller card, particularly when there are two different classes of disks installed.
Examples of these are the newer kafka-jumbo servers and the kafka-stretch servers in both eqiad and codfw.
This page of the Dell PERC 11 User's Guide says the following:
Virtual disks (VDs) are enumerated, second based on the virtual disk target ID.
Target IDs are assigned to the VDs in the descending order when they are created. The first created VD is assigned the highest available target ID, and the last created VD is assigned the lowest available target ID.
Therefore, the last created VD is discovered first by the operating system.
Consequently, when creating multiple virtual disks, such as a RAID10 for data and a RAID1 for the operating system, we should always create the operating system volume last, so that the operating system discovers it first.
In addition to this we have to configure which of the two logical volumes is to be the boot drive. See the screenshot.
Even with the boot drive set, the controller has so far proven unpredictable about saving this configuration.
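The ordering rule quoted above can be sketched in a few lines of Python. This is only an illustrative model, not controller code: the function names, the volume names, and the pool of four target IDs are hypothetical, and it assumes (as the quoted text implies) that the operating system discovers virtual disks in ascending target ID order.

```python
def assign_target_ids(creation_order, available_ids):
    """Give the first-created VD the highest available ID, the next one the
    next-highest, and so on (the rule quoted from the PERC 11 guide)."""
    ids = sorted(available_ids, reverse=True)
    return {name: ids[i] for i, name in enumerate(creation_order)}


def os_discovery_order(target_ids):
    """Assumption: the OS discovers VDs in ascending target ID order."""
    return sorted(target_ids, key=target_ids.get)


# Hypothetical pool of target IDs and hypothetical volume names.
pool = range(4)

# Data volume created first, operating system volume created last.
ids = assign_target_ids(["data-raid10", "os-raid1"], pool)
print(ids)
print(os_discovery_order(ids))
```

In this model the operating system volume, created last, receives the lower target ID and is discovered first. Swapping the creation order puts the data volume first, which is the situation to avoid.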
Dell R510
(More detail on how to interact with the console, ctrl-alt-delete, ctrl chars, etc.: Dell#Connecting_to_serial_console)
Target configuration for databases (from Raid and MegaCli):
- raid-10
- 256k stripe
- no read ahead
- writeback cache
Dell R740xd2
This kind of server is used for mass storage (normally, backups).
Target configuration for backups:
HDs:
- raid-6
- 256k stripe
- no read ahead
- writeback cache
2 SSDs:
- Software RAID 1 will be set up on reimage, so these SSDs should show as "not part of a RAID" in the BIOS; nothing to do there.
- Set the first SSD as the bootable disk (unfortunately, only 1 device is bootable at a time). See: https://phabricator.wikimedia.org/T277323#7107664
To completely nuke the existing RAID and start over...
- follow instructions in Build a new server to get into the BIOS (powercycle the server, console com2, etc.)
- hit CTRL-R when prompted to enter the RAID controller setup (PERC H800 Integrated BIOS Configuration Utility 2.02-0025)
  - I've had the best luck hitting it several times right about when the HW initialization goes from 0% to 100%.
- delete the existing RAID set
  - cursor over the Disk Group line, hit F2
  - scroll down to 'Delete Disk Group'
  - say "Yes" to the "Are you sure?" question. (tab to Yes, hit enter)
- create a new RAID set
  - select the 'No Configuration Present' line, hit enter
  - set RAID level to 10
  - leave PD per Span at 2
  - select all Physical Disks (space bar to X each one)
  - leave VD Size alone
  - set VD Name to 'allr10' or something useful.
  - select Advanced Settings
    - set Strip Element Size to 256KB
    - set Read Policy to No Read Ahead
    - set Write Policy to Write Back
    - check Initialize
  - choose OK
  - "Initialization will destroy data on the virtual disk. Are you sure?" choose OK
  - "Initialization finished" choose OK
  - You should see "Spanned Disk Group: 0, Raid 10"
  - You should see a single line under Virtual disks "Virtual Disk: 0, allr10, 11175.00GB" (or whatever size your disks are)
- hit 'esc' to exit
  - "Are you sure" choose OK
  - You should see "Press Ctrl-Alt-Delete to reboot"
- press esc-R esc-r esc-R to reboot
When you're done, press ctrl-\ to get out of the console, then type 'exit' to disconnect from the mgmt interface.
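As a sanity check on the size shown for the new virtual disk, the usable capacity can be estimated from the RAID level and disk count. The sketch below is a rough model that ignores controller metadata overhead; the function name is hypothetical, and the 12 disks of 1862.5GB each are an assumption chosen only because they reproduce the 11175.00GB figure in the example above.

```python
def usable_capacity_gb(raid_level, disk_count, disk_size_gb):
    """Estimate usable capacity for the two RAID levels used on this page."""
    if raid_level == 10:
        # RAID 10 with 2 PDs per span: mirrored pairs, striped together,
        # so half of the raw capacity is usable.
        if disk_count < 4 or disk_count % 2 != 0:
            raise ValueError("RAID 10 needs an even number of disks, at least 4")
        return (disk_count // 2) * disk_size_gb
    if raid_level == 6:
        # RAID 6 spends the equivalent of two disks on parity.
        if disk_count < 4:
            raise ValueError("RAID 6 is modelled here with at least 4 disks")
        return (disk_count - 2) * disk_size_gb
    raise ValueError("only RAID 6 and RAID 10 are modelled in this sketch")


# Assumed disk count and per-disk size (not stated on this page).
print(usable_capacity_gb(10, 12, 1862.5))  # 11175.0
print(usable_capacity_gb(6, 12, 1862.5))   # 18625.0
```

If the size reported by the controller is far from the estimate, that can be a hint that not all physical disks were selected when the virtual disk was created.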