Make unRAID Trust the Parity Drive, Avoid Rebuilding Parity Unnecessarily

The 'Trust My Array' Procedure

What is this? And why would you do it?

There are various situations where unRAID may decide that the parity drive is no longer valid and want to rebuild it from scratch, even though you KNOW that your current parity drive is completely valid. Since rebuilding parity involves reading every sector of every data drive and rewriting the entire parity drive, it is natural to want to avoid that if possible.

Some of the situations where this may apply:

  • Disk drive rearrangements, such as changing the disk numbers or physically moving the drives to different ports, may confuse unRAID (more commonly in unRAID v4), and result in unRAID assuming the parity drive is no longer valid for the current configuration.
For v6, see "What is the safe way to rearrange disk numbers, assignments, slots, etc?"
For v4 or v5, see "What is the safe way to rearrange disk numbers, assignments, slots, etc?"
  • Disk controller or cable faults that result in loss of communications to one or more drives, causing the drive(s) to be marked Disabled, displayed with a <red> ball or X, even though they are fine and were not written to
  • A disk drive that fails to spin back up, causing the drive to be marked Disabled, even though it is fine and was not written to
  • A drive is randomly not recognized by the BIOS on boot, perhaps because of a flaky motherboard, causing the drive to be dropped from the array
  • Parity drive was removed, then re-assigned, but no important changes were made to it or any of the data drives (small changes will result in a few parity errors on the ensuing parity check)
  • Upgrading from unRAID v4.3.1 or earlier, to v4.3.2 or later, AND you have Western Digital or Hitachi drives with very long model names (see this)
  • There may be other situations too, but if you KNOW that you have the same set of drives that are associated with (and were used to calculate) the current parity drive, then this procedure should restore the array and re-validate the parity drive. It does not matter if the drives are numbered differently, or moved to different cables, ports, and controllers. But the parity drive MUST be correctly assigned(!), and the exact same set of drives must be assigned, in whichever slots you want them
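The reason slot order does not matter can be sketched with a toy model. This assumes single parity computed as a plain XOR across the data drives, with one byte standing in for each drive; the function name `xor_parity` and the byte values are made up for illustration, and the sketch does not model a second parity drive.

```shell
#!/bin/sh
# Toy model (assumption): each "drive" holds one byte and single parity
# is the bitwise XOR of all data drives.

# xor_parity BYTE... -> prints the XOR of all arguments
xor_parity() {
    p=0
    for byte in "$@"; do
        p=$(( p ^ byte ))
    done
    echo "$p"
}

original=$(xor_parity 90 60 240)     # disk1, disk2, disk3 in their old slots
rearranged=$(xor_parity 240 90 60)   # same three drives, different slots

echo "parity before rearranging: $original"
echo "parity after rearranging:  $rearranged"
```

Both lines print the same value, because XOR gives the same result regardless of the order of its inputs. Swapping one of the drives for a different one would change the result, which is why the exact same set of drives is required.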

Situations where this should NEVER be used:

  • It should NEVER be used if you have a failed disk or are in the process of replacing a failed disk.
  • It should NEVER be used if you are replacing a drive with one of a larger size.
  • It should NEVER be used if you are adding or removing disks in the array.
It is ONLY to be used if exactly the same drives last used to perform a full parity calculation are present and working, AND only those same drives are currently assigned to the array. If any are missing, or any disks have been added, deleted, or have failed since the last full parity calculation, this procedure will falsely lead you to think parity protection is in effect when it is NOT; parity will be invalid.

A situation where this procedure should probably not be used:

  • When a drive is disabled because a "write" to the drive failed, and you wish to keep the data that was in the process of being written
If a disk is off-line, it is because a "write" to it failed. If the drive went off-line while you were saving the only copy of important files (music, pictures, movies, etc.), then the failed drive does not contain any of the data that was written to it.
Any files written to the failed drive were recorded in parity, but not to the physical data drive. It is entirely possible to load the entire failed disk with files, but only update parity and not the physical data disk, because it is disabled.
If you use this procedure, it will force the disabled drive back online, and when the resulting parity check proceeds, parity will be updated to reflect the data on the physical disk. In other words, using this procedure will bring the disk back without the new files that were being written when it went off-line, and without any data written to it since that time. You will effectively roll back the clock, as if they were never written to the array.
A subsequent parity check will contain many errors, as it is brought in sync with the physical disk contents. Remember, the disk was taken off-line because a write to it failed.
If you want to preserve the data that was written to the failed drive, you must unassign the drive, start the array, stop the array, re-assign the drive, and let unRAID reconstruct its contents from parity and all the other data drives, which together hold a full copy of what was written to the array. You will not have parity protection while the rebuild runs, but you will have the latest data written.
So, you can choose: get back the data you were writing to the failed disk by NOT using this procedure, or get back immediate parity protection, knowing that parity must contain errors (because a write to the physical drive failed) and so may not be usable in all recovery situations until a full parity check has been performed.
In other words, the "Trust My Array" procedure should probably NOT be used if you were writing files to the drive that was disabled.
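The trade-off can be illustrated with a small sketch. It assumes a toy array of two data drives holding one byte each and single XOR parity; all names and values are hypothetical and chosen only for the example.

```shell
#!/bin/sh
# Toy model (assumption): one byte per drive, single XOR parity.

xor() { echo $(( $1 ^ $2 )); }

disk1=90
disk2_physical=60     # what is really on the disabled disk
new_data=200          # written to "disk2" while it was disabled

# The write reached parity only: parity now encodes disk1 and the NEW data.
parity=$(xor "$disk1" "$new_data")

# Choice 1: rebuild disk2 from parity plus the other data drive.
rebuilt=$(xor "$parity" "$disk1")

# Choice 2: trust the array. A parity check then recomputes parity from
# the physical disks, which no longer matches the stored parity.
recomputed_parity=$(xor "$disk1" "$disk2_physical")

echo "rebuilt disk2 holds: $rebuilt (the new data)"
echo "trusted disk2 holds: $disk2_physical (the old data)"
echo "stored parity $parity, recomputed parity $recomputed_parity"
```

In this model the rebuild recovers the new data, while trusting the array keeps the old data and leaves a parity mismatch for the check to correct.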

For unRAID v6

These instructions assume you are using unRAID v6
Sometimes you can get away without doing a New Config (which requires re-entering ALL drives). It may not work in all unRAID versions, and it clearly doesn't work in all situations. It often does work, however, if all you have done is reassign drives or swap drive assignments. It may not work if you have empty slots within your array assignments; that is, drive assignments should be contiguous (e.g. 3 drives should be 1,2,3; not 1,3,4 or 1,2,9).
So how do you know if it will work for you? Just try starting the array! If it works, you're done! There's nothing more to do (although a full parity check is a very good idea, to ensure that every parity bit really is good). If, however, the array refuses to start and shows an error instead, use the following procedure.
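As a quick way to check your notes for gaps before trying to start the array, the contiguity rule can be sketched as a small shell function. The function name `slots_contiguous` is hypothetical and it only inspects the slot numbers you pass in; it does not read anything from unRAID.

```shell
#!/bin/sh
# slots_contiguous SLOT... -> succeeds only if the slot numbers are
# exactly 1, 2, ..., N (given in any order) with no gaps.
slots_contiguous() {
    expected=1
    for slot in $(printf '%s\n' "$@" | sort -n); do
        [ "$slot" -eq "$expected" ] || return 1
        expected=$(( expected + 1 ))
    done
    return 0
}

if slots_contiguous 1 2 3; then echo "1 2 3: contiguous"; fi
if ! slots_contiguous 1 3 4; then echo "1 3 4: gap in assignments"; fi
if ! slots_contiguous 1 2 9; then echo "1 2 9: gap in assignments"; fi
```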

This procedure starts by removing all drive assignments. You will then need to reenter all of them from your notes, making any changes you desire.
  1. Take a screenshot of your current array assignments, or make good notes of them
  2. Stop the array (if it is started)
  3. Go to <Tools> and click <New config>
  4. Reassign all disks using your notes or screenshot
  5. Double check that your Parity disk is assigned correctly!
  6. Click the check box "Parity is already valid" (make sure it is checked)
  7. Start the array
  8. Done!
  9. It is strongly recommended to do a parity check now, to make sure parity is fine

For unRAID v6.2, there's word of a simpler way, that avoids having to use New Config. Just unassign the parity drive, start and stop the array (to make it forget the parity drive assignment), reassign the parity drive, and start the array after clicking the "Parity is already valid" checkbox.

For unRAID v4 and v5

These instructions assume you are using a version of unRAID prior to v6

Note: THIS PROCEDURE DOES NOT WORK THE SAME IN ANY RECENT 5.X SERIES OF UNRAID. If you refresh the web-interface, the commands you entered will be undone. You will then be overwriting parity. Consider yourself warned.

Here is a procedure provided by Tom of Lime Technology. Do this only if you know your configuration is completely valid, no disabled or missing disks, all disks correctly assigned, and you are sure that your parity drive is good. The data drives do NOT have to be assigned to the same slots as they were previously.

  • Boot unRAID, but DO NOT START the array. Stop the array if it has started.
  • Make sure that all of your disks are correctly assigned, not disabled or missing. Note: they do not have to be assigned to the same slots they were originally assigned to, except for the Parity drive(!), but it MUST be the same set of drives.
  • Open a console, either at the unRAID server or in a terminal session with SSH or Telnet. Make sure you are in the home directory, which is /root. If you are unsure, type cd and press the Enter key, and you will be there.
  • If you are running any version of unRAID that is PRIOR to v4.5.4, then at the unRAID Web Management page, click the Restore button, after first checking the "I'm sure I want to do this" box.
  • If you are running unRAID version v4.5.4 or later, then log in as root at your system console or via SSH or Telnet and type the following command:
  • If necessary, refresh your unRAID Web Management Main page in your browser. For unRAID 4.7, make sure you DO NOT refresh the unRAID Web Management Main page in your browser (for reasons that are explained here).
  • On the unRAID Web Management Main page, this should result in all disk status symbols/balls turning <Blue>. The server status should indicate "Stopped. Initial Configuration".
  • Now at the unRAID console or SSH or Telnet prompt, type this command: (Please see this thread for v5 updated syntax. Upon verification, info here will be updated too.)
mdcmd set invalidslot 99
  • The output of this command should be this:

In the case of unRAID 4.7, there will be no such output (again, for reasons that are explained here).

  • Now click the Start button. All the disk status indicators should turn <Green>; the system state should be Started; and there should be a parity check in progress. You can let the parity check complete, or you can cancel it. In most cases, you should let it finish. If you were correct and parity was valid, the parity check will not find any errors. If you were wrong, then the parity check will find and correct the errors, and report them. By the time the parity check reports on the parity errors, they will have already been corrected.

For more information, see the original post and thread.

For version 4.7 and 5.0 series, see this post. See also this post.

Other procedure

This appears to be a procedure to remove a drive from an unRAID array by zeroing it. I have no idea why it was put in this wiki page!

For version 6.0, follow the below procedure to remove a drive from the array:

1) Start a screen session or work on the physical console of the server, as this may take more than a day

2) With the array started, empty the drive of all contents (move all files to another drive)

3) Stop Samba : "/root/samba stop"

4) Unmount the drive to be removed : "umount /dev/md?" (where ? is the unRAID disk number)

4.1) If the command fails, issue the "lsof /dev/md?" command to see which processes have a hold of the drive. Stop these processes

4.2) If AFP is stubborn, consider : "killall afpd"

4.3) Try "umount /dev/md?" again

5) Restart Samba : "/root/samba start"

6) At this point the drive should show 0 (zero) Free Space in the web GUI. If it does, move on to step 7

6.1) If, instead of showing zero free, it shows an incorrect size, you may experience very slow writing speeds

6.2) In this case, clear enough of the partition that no filesystem is recognized and stop/restart the array

6.3) To make the filesystem unrecognizable : "dd if=/dev/zero of=/dev/md? bs=1G count=1"

6.4) Stop the array

6.5) Restart the array

6.6) Confirm the drive is now listed as unformatted (and is therefore not mounted)

7) Write zeros to the drive with the dd command "dd if=/dev/zero bs=2048k | pv -s 4000000000000 | dd of=/dev/md? bs=2048k"

7.1) The pv pipe acts as a progress indicator

7.2) Replace "4000000000000" with the size of your drive in bytes (note that so-called 4TB drives are 4 trillion bytes, not 4TiB)

7.3) Wait for a very long time until the process is finished

7.4) If writing to the drive is very slow, then cancel and go back to step 6.1

8) Stop the array

9) Make a screenshot of your drive assignments

10) From the 'Tools' menu choose the 'New Config' option in the 'UnRAID OS' section.

10.1) this is equivalent to issuing an 'initconfig' command in the console

10.2) This will rename super.dat to super.bak, effectively clearing array configuration

10.3) All drives will be unassigned

11) Reassign all drives to their original slots, while referring to your screenshot. Leave the drive to be removed unassigned.

11.1) At this point all drives will be BLUE and unRAID is waiting for you to press the Start Button

11.2) Assuming all is good, check the "Parity is valid" box and press the 'Start' button

12) At this point, provided all is OK, all drives should be GREEN and a parity-check will start.

12.1) Note this is a parity check, not a rebuild. If everything went well, it should find 0 parity errors.

13) If everything does appear to be OK (0 Sync Errors) and you want to remove the drive straight away:

13.1) Cancel the parity check

13.2) Stop the array

13.3) Shut down the server

13.4) Remove the drive

13.5) Power up the server

13.6) The array should start on its own

13.7) It may be a good idea to complete a parity check
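Step 7.2 asks for the drive size in bytes. The arithmetic behind the "4 trillion bytes, not 4TiB" remark can be sketched as follows; the helper names `tb_to_bytes` and `tib_to_bytes` are made up for illustration, and the commented `blockdev` line assumes that util-linux is available on the server.

```shell
#!/bin/sh
# Decimal terabytes (as printed on the drive label) vs binary tebibytes.
tb_to_bytes()  { echo $(( $1 * 1000 * 1000 * 1000 * 1000 )); }
tib_to_bytes() { echo $(( $1 * 1024 * 1024 * 1024 * 1024 )); }

echo "4 TB  = $(tb_to_bytes 4) bytes"
echo "4 TiB = $(tib_to_bytes 4) bytes"

# Assumption: blockdev (util-linux) is present on the server. If so, it
# reports the exact size of a block device in bytes, for example:
#   blockdev --getsize64 /dev/md?
```

The value passed to `pv -s` only drives the progress estimate, so a label-based figure is usually close enough; it does not limit how much `dd` writes.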

Notes: The reason this works is that you are operating on the md? device, which is the parity-protected partition for the data disk you want to remove. Fill that partition with zeros and parity will not be affected by its presence, the same way as when you are adding a pre-cleared drive, only in reverse.
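A toy calculation shows why an all-zero drive has no effect on parity. It assumes single XOR parity, with one byte standing in for each drive; the function name and values are hypothetical.

```shell
#!/bin/sh
# Toy model (assumption): one byte per drive, single XOR parity.

# xor_parity BYTE... -> prints the XOR of all arguments
xor_parity() {
    p=0
    for byte in "$@"; do
        p=$(( p ^ byte ))
    done
    echo "$p"
}

with_zeroed_drive=$(xor_parity 90 60 0)   # third drive filled with zeros
without_drive=$(xor_parity 90 60)         # third drive removed entirely

echo "parity with zeroed drive:  $with_zeroed_drive"
echo "parity with drive removed: $without_drive"
```

Both values are identical because XOR with zero leaves the other operand unchanged, so the existing parity stays valid once the zeroed drive is left out of the new configuration.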

Full discussion in this forum thread.