eMLC SSD RAID

Replacing a failed SSD module in a Power 740 eMLC PCI-E SAS adapter under AIX

Initial thoughts

  • WARNING: the Power 740 cannot be opened while the system is powered on (it powers off when the lid is opened). A full system shutdown (power off the managed system) is required before you can change the internal SSD disks or PCI cards.
  • Before shutting down, note the location code of the adapter (there is a diagram on the cover of the machine) and of the disk (e.g. "P1-C1-C4-T1-D1").
  • Remove the RAID adapter and its child devices from the AIX configuration with rmdev (e.g. 'rmdev -Rdl sissas2').
  • Power down the system, open the cover and remove the adapter. The low-profile eMLC adapters are located in slot C2 or C4.
  • Identify the disk based on the location code
  • The SSD modules are removed in pairs. First open the small blue metal lever until it disengages from the small pin (approx. 90°), then open the two small blue plastic levers that have the location codes printed on them. Pull both SSD modules together out of the SATA socket horizontally (approx. 1.5-2 cm); this takes little force, and both modules move together. To reach the disk next to the adapter circuit board, remove the outer disk first.
  • Replace the failed module and re-seat the pair, then carefully put the adapter back into the PCI-E slot.
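
The software side of the steps above (record the location codes, then remove the devices before powering off) could be sketched as follows. This is only a rough helper, not part of the original procedure: the function names run and pre_shutdown and the DRY_RUN switch are made up for illustration, and it assumes that 'lscfg -vl' output is enough to record the adapter's location code. By default it only prints the commands.

```shell
#!/bin/sh
# Hypothetical helper: print (or, with DRY_RUN=0, execute) the pre-shutdown
# commands for one SAS RAID adapter. Nothing is changed unless DRY_RUN=0.

run() {
    if [ "${DRY_RUN:-1}" = 1 ]; then
        echo "+ $*"          # dry run: show the command only
    else
        "$@"                 # real run: execute it
    fi
}

pre_shutdown() {
    adapter=$1
    run lscfg -vl "$adapter"     # shows the location code and VPD of the adapter
    run rmdev -Rdl "$adapter"    # removes the adapter and its child devices
}

# Dry-run demo for the adapter used in this article.
pre_shutdown sissas2
```

Run it once in dry-run mode, check the printed commands, and only then repeat with DRY_RUN=0 on the real system.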

Device names

sissasX   SAS RAID adapter
hdiskX    RAID array
pdiskX    physical disk

Procedure

In this case a 4-disk RAID 0 array failed when a single pdisk died.

After the replacement, the previous array could not be reassembled; the new pdisk formed a separate array of its own.

After the dead disk is replaced, this is the state of the pdisks on the sissas2 adapter:

# sissasraidmgr -L -l sissas2
------------------------------------------------------------------------
Name      Resource  State       Description              Size
------------------------------------------------------------------------
sissas2   FFFFFFFF  Available   PCIe RAID & SSD SAS Adapter 3Gb
hdisk2    00FF0000  Optimal     RAID 0 Array           177.8GB
 pdisk6   00040000  Active      SSD Array Member       177.8GB
*unknwn*  00FFFFFF  Failed      RAID 0 Array          SN81C171D4
 pdisk4   00000000  RWProtected SSD Array Member       177.8GB
 pdisk5   00010000  RWProtected SSD Array Member       177.8GB
 pdisk7   00050000  RWProtected SSD Array Member       177.8GB
 *unknwn* 00040000  Missing     Array Member               N/A

Both arrays should be deleted, so that all four pdisks become array candidates again and a new 4-disk array can be created.

Adding the disks to the other array is not recommended:

"...performance will not be optimal when using this option, since the included device will not contain parity and the data will not be re-striped"

Delete the failed array first. It has no hdisk name (*unknwn*), so it has to be identified by its serial number with -e.

Note that the 'SN' prefix shown in the listing must be omitted.

# sissasraidmgr -D -l sissas2 -e SN81C171D4
0940-028 SN81C171D4 is an invalid serial number.
# sissasraidmgr -D -l sissas2 -e 81C171D4
pdisk4 Defined
pdisk5 Defined
pdisk7 Defined
*unknwn* array with serial number 81C171D4 removed
# sissasraidmgr -L -l sissas2 -a0 -j1
------------------------------------------------------------------------
Name      Resource  State       Description              Size
------------------------------------------------------------------------
sissas2   FFFFFFFF  Available   PCIe RAID & SSD SAS Adapter 3Gb
hdisk2    00FF0000  Optimal     RAID 0 Array           177.8GB
 pdisk6   00040000  Active      SSD Array Member       177.8GB
pdisk4    00000000  Active      SSD Array Candidate    177.8GB
pdisk5    00010000  Active      SSD Array Candidate    177.8GB
pdisk7    00050000  Active      SSD Array Candidate    177.8GB
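
The serial-number handling above can be scripted. The following is a minimal sketch, assuming the listing format stays as shown in this article (state in the third column, 'SN'-prefixed serial in the last column of a failed array); the function name failed_array_serials is made up for illustration.

```shell
#!/bin/sh
# Read a 'sissasraidmgr -L' listing on stdin and print the serial number of
# every failed array, with the leading "SN" stripped so that it can be
# passed to 'sissasraidmgr -D ... -e'.
failed_array_serials() {
    awk '$3 == "Failed" && $NF ~ /^SN/ { print substr($NF, 3) }'
}

# Demo on lines from the listing captured earlier (no adapter needed).
failed_array_serials <<'EOF'
sissas2   FFFFFFFF  Available   PCIe RAID & SSD SAS Adapter 3Gb
hdisk2    00FF0000  Optimal     RAID 0 Array           177.8GB
 pdisk6   00040000  Active      SSD Array Member       177.8GB
*unknwn*  00FFFFFF  Failed      RAID 0 Array          SN81C171D4
 pdisk4   00000000  RWProtected SSD Array Member       177.8GB
 *unknwn* 00040000  Missing     Array Member               N/A
EOF
```

On a live system the input would come from 'sissasraidmgr -L -l sissas2' instead of the here-document.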

Delete the new array that contains the replacement disk:

# sissasraidmgr -D -l sissas2 -d hdisk2
hdisk2 deleted
pdisk6 Defined
# sissasraidmgr -L -l sissas2 -a0 -j1
------------------------------------------------------------------------
Name      Resource  State       Description              Size
------------------------------------------------------------------------
sissas2   FFFFFFFF  Available   PCIe RAID & SSD SAS Adapter 3Gb
pdisk4    00000000  Active      SSD Array Candidate    177.8GB
pdisk5    00010000  Active      SSD Array Candidate    177.8GB
pdisk6    00040000  Active      SSD Array Candidate    177.8GB
pdisk7    00050000  Active      SSD Array Candidate    177.8GB
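
Before creating the array it can be worth confirming that every pdisk really is a candidate. A small sketch, assuming the listing format shown above; candidate_pdisks is a hypothetical helper name, and the script only prints the create command rather than running it.

```shell
#!/bin/sh
# Read a 'sissasraidmgr -L' listing on stdin and print the names of all
# pdisks described as "Array Candidate" on one line, separated by spaces.
candidate_pdisks() {
    awk '$1 ~ /^pdisk/ && /Array Candidate/ { printf "%s%s", sep, $1; sep = " " }
         END { print "" }'
}

# Demo on the listing above; prints the command, does not execute it.
members=$(candidate_pdisks <<'EOF'
sissas2   FFFFFFFF  Available   PCIe RAID & SSD SAS Adapter 3Gb
pdisk4    00000000  Active      SSD Array Candidate    177.8GB
pdisk5    00010000  Active      SSD Array Candidate    177.8GB
pdisk6    00040000  Active      SSD Array Candidate    177.8GB
pdisk7    00050000  Active      SSD Array Candidate    177.8GB
EOF
)
echo "sissasraidmgr -C -r0 -z '$members'"
```

If the printed list is shorter than expected, one of the disks is still an array member or is missing, and the listing should be checked again before creating anything.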

Create the array

-r0 selects RAID 0; -z takes the quoted list of member pdisks.

# sissasraidmgr -C -r0 -z 'pdisk4 pdisk5 pdisk6 pdisk7'
.
# sissasraidmgr -L -l sissas2 -a0 -j1
------------------------------------------------------------------------
Name      Resource  State       Description              Size
------------------------------------------------------------------------
sissas2   FFFFFFFF  Available   PCIe RAID & SSD SAS Adapter 3Gb
hdisk2    00FF0000  Optimal     RAID 0 Array           711.2GB
 pdisk4   00000000  Active      SSD Array Member       177.8GB
 pdisk5   00010000  Active      SSD Array Member       177.8GB
 pdisk6   00040000  Active      SSD Array Member       177.8GB
 pdisk7   00050000  Active      SSD Array Member       177.8GB
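
As a quick sanity check: RAID 0 has no redundancy, so the array size should equal the sum of its members (4 x 177.8GB = 711.2GB here). A sketch of that check, assuming the listing format shown above and a single RAID 0 array on the adapter; check_raid0_size is a made-up name.

```shell
#!/bin/sh
# Read a 'sissasraidmgr -L' listing on stdin, compare the RAID 0 array size
# with the sum of its member sizes, and return non-zero if they differ.
check_raid0_size() {
    awk '
        $4 == "RAID" && $5 == "0" { array = $NF + 0 }   # "711.2GB" -> 711.2
        /Array Member/            { sum += $NF + 0 }    # "177.8GB" -> 177.8
        END {
            printf "array=%.1fGB members=%.1fGB\n", array, sum
            ok = (sprintf("%.1f", array) == sprintf("%.1f", sum))
            exit (ok ? 0 : 1)
        }'
}

# Demo on the listing above.
check_raid0_size <<'EOF'
sissas2   FFFFFFFF  Available   PCIe RAID & SSD SAS Adapter 3Gb
hdisk2    00FF0000  Optimal     RAID 0 Array           711.2GB
 pdisk4   00000000  Active      SSD Array Member       177.8GB
 pdisk5   00010000  Active      SSD Array Member       177.8GB
 pdisk6   00040000  Active      SSD Array Member       177.8GB
 pdisk7   00050000  Active      SSD Array Member       177.8GB
EOF
```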

That's all!

You can start using the array as a regular PV (here as hdisk2) again.

smitty must DIE!! ;-))

References

sissasraidmgr -h

SAS RAID controllers for AIX (PDF)

PCIe RAID and SSD SAS adapter 3 Gb