File System Conversion

From unRAID
Revision as of 17:17, 23 January 2019 by Nia (talk | contribs) (The procedure)



This page discusses the conversion of file systems on a drive, various methods for doing it, and why you may or may not want to do it. The procedures below will be based on a conversion from ReiserFS to XFS, but the principles are the same for whatever conversion you wish to do, just replace ReiserFS with your source file system and replace XFS with your destination file system.

This page is derived from a post originally in the Format XFS on replacement drive / Convert from RFS to XFS thread, with modifications suggested by johnnie.black. The mirror procedure here is a revision of the original procedure by bjp999, found in an early post in that thread. For more information, you may want to read more of that thread.


Preface

When I finally began the process of converting old drives to XFS, I found a few ideas that make the process simpler and less of a disruption. (I've started calling my method the Mirror method, fully described below.) Parity protection is always preserved, and normal server operation can continue at almost all times, except for brief moments of swapping drive assignments. There's very little planning needed, very little reconfiguration needed, and the array ends up with everything where it started, looking the same. I use the rsync command with the -avPX options, as it accurately performs copies of all data, metadata, and extended attributes of all files and folders. In addition, from statements online, I understand that rsync ALWAYS checksums every bit of data transferred all the way to its reconstruction at the destination. That makes the additional lengthy checksum verification almost redundant, so I include that verification only as an option, for the most paranoid. I've come to feel that an rsync transfer is the only option most users need, and that speeds this process up.
Then I perform a swap procedure that preserves all shares and data, and doesn't require any lengthy parity checks or builds. It's not a physical drive swap, just a swap of the drive assignments. The swap procedure that works in v6.0 and v6.1 does not work for v6.2 or later, and the one that works for v6.2 or later does not work for prior versions. This wiki page is for v6.2 or later; see this post if you are running an earlier version.
I exclude the swapping drive from User Shares, so you never have to worry about files duplicated within shares, and therefore don't need a temporary copy location, and never need to move any files.
You can perform parity checks at any time if you wish, but they are optional, provided you are sure there was a recent successful one before you started.
Update and Important Warning! My system is static, and my array only changes when I manually copy files and make changes, which is why I completely forgot that other users' arrays are dynamic, constantly changing in the background due to Dockers, plugins, VMs, scheduled backups from other networked machines, and the Mover copying and moving files around. That's a big problem when you are converting a drive, because you cannot allow any changes to it once the copying begins. As the copy sweeps through the folders of the original drive, any new files added to folders already processed will be lost! It is your responsibility to make sure that NOTHING can make changes to a drive being copied and converted! That may mean stopping your Dockers, plugins, VMs, and the Mover, if any of those could possibly add files or make changes to the drive being converted. And you should check for and temporarily disable any scheduled backups from other machines. Yes, this can require some careful planning! You may want to consider rebooting into Safe Mode (stops plugins from loading), and disabling any VMs and the Docker service and the Mover.
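One way to get a rough idea of whether something is still writing to a drive is to drop a marker file just before the copy starts and later list anything with a newer modification time. The sketch below is only an illustration: the function names and the marker path are made up for this example, and it relies on modification times, so files written by tools that preserve old timestamps will not show up. Treat an empty result as a hint, not as proof.

```shell
# Hypothetical helpers (names are illustrative, not part of unRAID).

# Create the marker just before the copy starts.
# $1 = path of the marker file; keep it OFF the drive being converted.
start_marker() {
    touch "$1"
}

# Print every file or folder under $2 whose modification time is newer
# than the marker file $1. Prints nothing if no such entry exists.
changed_since() {
    find "$2" -mindepth 1 -newer "$1" -print
}

# Example usage (commented out; adjust the disk numbers for your system):
#   start_marker /tmp/disk10.start
#   rsync -avPX /mnt/disk10/ /mnt/disk11/
#   changed_since /tmp/disk10.start /mnt/disk10
```

If the last command prints anything, some process touched the source drive after the copy began, and it is worth finding out which one before going any further.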

File systems

A summary of the unRAID file systems
One strong opinion against ReiserFS

Should you or should you not convert?

Converting Drives from ReiserFS to XFS
One strong opinion for converting from ReiserFS

Preliminary steps

Preliminary parity check
If you have a parity drive, we recommend running a parity check first, so that you won't run into any drive problems during the conversion process. Whether or not you have valid parity, you want to fix drive issues before you start. You don't want to run into drive problems in the middle of a drive conversion. Plus, there's no point in trying to preserve parity if it's no good!
Preparing the first empty drive
To convert drives from one file system to another, the drive has to be formatted with the new file system. Obviously, formatting a drive causes all data on that drive to be lost, so if there's data on the drive it MUST be saved elsewhere first! There's no way around it, you HAVE to free a drive up! To convert your array, you MUST start with a free and empty data drive, whether you add it, or you juggle data around to create it.
To convert an array, most users will just add a new and tested drive to the array. But some users either don't want to add a drive, or cannot add a drive, perhaps because they are out of SATA ports or physical drive slots. They can still convert their array, but they have some preliminary work to do first:
Preliminary step if you don't add a new drive: Select one of the largest data drives, preferably with the least amount of data to be moved, and move ALL of its data to the rest of the drives. You can use the unBALANCE plugin to spread the data around the rest of the array. Or you can use the built in Midnight Commander (mc) or any other file manager to manually move the data where you want it. Any data left on this drive WILL BE LOST! If you don't have spare room on the array, then you will have to find room elsewhere, off the array.

Methods for drive and array conversion

Basically, to convert a single drive from one file system to another, you have to copy all the data off, format the drive to the new file system, then either copy the data back or use the empty drive for other data. Easy and obvious, right? But it's the little details that complicate the process.
To convert an entire array of drives, different methods have been designed, for different situations, each with its pros and cons. Different users prefer different methods.
Please do not be misled by users saying there's nothing to this, that all you have to do is "copy and format, copy and format, copy and ..."
If you proceed thinking it's that simple, you may lose data, and worse, won't even know it is lost until later when you can't find something. You also can get bogged down in trying to figure out where everything should get moved to.
Background writing!
  • It's likely that some of those who started this without planning or care have lost data. Why? Because it's easy to think that when you issue a copy command (doesn't matter which tool you use), that when it finishes, then everything on the drive has been copied, and now you can format the drive and move on to the next drive. But what if something is writing to the drive while you are copying everything off? Every time the copy tool finishes copying a folder, it moves on to the next, and it never comes back! So if new files were added to folders already copied, they won't be seen, and they won't be copied, and they will be lost when you format the drive! And you won't even know it, until sometime later when you wonder where that file went.
  • Many users' arrays are dynamic, constantly changing in the background due to Dockers, plugins, VMs, scheduled backups from other networked machines, and the Mover copying and moving files around. That's a big problem when you are converting a drive, because you cannot allow any changes to it once the copying begins. As the copy sweeps through the folders of the original drive, any new files added to folders already processed will be lost! It is your responsibility to make sure that NOTHING can make changes to a drive being copied and converted! That may mean stopping your Dockers, plugins, VMs, the Mover, and the APP Backup plugin, if any of those could possibly add files or make changes to the drive being converted. And you should check for and temporarily disable any scheduled backups from other machines. Yes, this can require some careful planning! You may want to consider rebooting into Safe Mode (stops plugins from loading), and disabling any VMs and the Docker service and the Mover, plus disabling any externally scheduled backups to your server.
  • Users that use unBALANCE to do the moving need to be especially careful, because unBALANCE does not copy the data, it moves the data to other drives, thereby emptying the drive. What is particularly concerning about this is that if you have tools writing to the array in the background, a near empty drive is the most attractive target for the next writes! Before you stop the array to reformat the drive, check once more that the drive is truly empty! You want to make sure that something else wasn't writing additional files and folders to the drive you *think* you just emptied.
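The "check once more that the drive is truly empty" advice can be turned into a small console check. This is a minimal sketch, assuming the drive is mounted at a path such as /mnt/disk10 (an example path); the function name is hypothetical.

```shell
# Succeeds (exit status 0) only when nothing at all is left under the
# given mount point; otherwise shows up to 20 leftover entries.
# $1 = mount point of the drive you believe is empty (e.g. /mnt/disk10)
drive_is_empty() {
    leftovers=$(find "$1" -mindepth 1 -print | head -n 20)
    if [ -n "$leftovers" ]; then
        printf 'NOT empty, found for example:\n%s\n' "$leftovers" >&2
        return 1
    fi
    return 0
}

# Example usage (commented out):
#   drive_is_empty /mnt/disk10 && echo "disk10 looks empty"
```

Run it immediately before stopping the array to format, since anything written after the check would still be lost.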
Planning! Space management!
  • The only method that requires almost no planning is the Mirror method, as the only thing to decide is the order of the drives to convert, and that order has to be from the largest to the smallest drive.
  • The next easiest case is where all of your drives are the same size, and you have lots of free space, and you do not set Inclusion and Exclusion rules for your User Shares - you let all shares use all drives. Then you can use unBALANCE to empty your drives, and set it each time to migrate all of one drive's files and folders to the rest of the drives, and it should take care of deciding where it should go. It will go faster if you take the time to be careful that you always target the empty drive, otherwise some sets of files will be moved multiple times (moved to one drive, then when that drive is converted, moved again). If you start with an empty drive, then it's easy. If you have to empty an existing data drive, then its files will have to move twice.
  • If you have no Inclusion/Exclusion rules, but your drives are not the same size, then you will have to be careful, because the empty drive you want to copy to may not be big enough for the current drive's files. Then some planning will save you some overall conversion time, getting the order of conversion correct, and avoiding as best you can, moving file sets multiple times.
  • If you do have Inclusion/Exclusion rules, then planning is strongly recommended. Some users will just wing it, and that's a strategy, but you may easily find yourself copying certain sets of files multiple times, and running into difficulties deciding where to move file sets. In my judgement, this takes much more time and space management than most users think will be needed. As much as User Share users don't want to manage where their shares are, and how much of each share is on each disk, this would be the one time when that is really necessary. I would make good notes of what drives are used by each share, and how much of each share is on each drive, so that you can plan where each of those shares or parts of shares will end up, partly for space management, and partly to know how to reconfigure the share Inclusions.
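For the note-taking suggested above, a short loop over the disk mount points can list how much of each share sits on each drive. This is a sketch under the assumption that data disks are mounted as disk1, disk2, and so on below one directory (/mnt on unRAID) and that each top-level folder on a disk corresponds to a share; the function name is made up for illustration.

```shell
# Print one tab-separated line per share folder: disk, share name, size in KiB.
# $1 = directory that holds the diskN mount points (/mnt on unRAID)
share_usage() {
    # disk[0-9]* deliberately skips other folders such as "disks"
    for share in "$1"/disk[0-9]*/*/; do
        [ -d "$share" ] || continue              # glob matched nothing
        disk=$(basename "$(dirname "$share")")
        name=$(basename "$share")
        size=$(du -sk "$share" | cut -f1)
        printf '%s\t%s\t%s\n' "$disk" "$name" "$size"
    done
}

# Example usage (commented out): save the layout before you start.
#   share_usage /mnt > /boot/share_layout_before.txt
```

The output file name in the usage line is only an example; any location off the drives being converted will do.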
Tools!
rsync
(work in progress)
cp & mv
(work in progress)
mc (Midnight Commander)
(work in progress)
unBALANCE
(work in progress)
Other copy tools over the network
Someone mentioned copying over the network, and my first reaction was "You've got to be kidding!!" But yes, it's a valid way to do it. It is likely to take weeks longer, not just days longer. It won't be mentioned in the methods below, but it can be used in place of any other copying method. Because it involves more pathways and more risk, more users will want to verify all copies, which will double the extra weeks needed. Lest some think that any old copy will be good enough, this is not about copying just a few files, it's about copying the entire data collection of a user, some of which may not be backed up (less important videos for example). You are about to move the entire 5TB, 10TB, 20TB, or 50TB of a user's collection, one to four weeks of copying. Copy verification becomes much more important on that scale.
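To put rough numbers on copy times, the arithmetic is just data size divided by sustained speed. The throughput figures below are assumptions chosen for illustration, not measurements; your own sustained speed will differ.

```shell
# Rough copy time in whole hours (rounded down).
# $1 = data size in TB (10^12 bytes), $2 = sustained throughput in MB/s
copy_hours() {
    echo $(( $1 * 1000000 / $2 / 3600 ))
}

copy_hours 10 50    # 10 TB at an assumed 50 MB/s -> prints 55
copy_hours 20 10    # 20 TB at an assumed 10 MB/s -> prints 555
```

At the slower assumed rate, 555 hours is about 23 days for a single pass, before any verification pass is added.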
(work in progress)
Factors that make a difference, in deciding which method will work best for you
  • Do you want to preserve parity? Do you even have a parity drive built? How about dual parity?
  • Do you care where your data is placed? In other words, do you like knowing exactly which disk each of the various sets of your data reside on? Or do you not care so long as your shares stay the same?
  • If you strictly use User Shares and don't care about data placement, are you using inclusions and exclusions, or are all shares allowed to use all drives?
- Here's a test - say you have 10 shares and one of them is a small share with about 20 files. Would you prefer to keep that share on a single drive, or would you not care if its files were scattered across the array, one file on every drive? (Either way, the share looks the same)
  • Do you already have a large empty drive available? Or will you have to juggle data around to be able to clear each drive?
  • Does the array need to be fully operational during this process? (It's going to take days)
  • Is the overall speed more important to you, how fast you finish the whole conversion, or is apparent ease of use more important, even if it takes longer?
  • How important to you is knowing that every bit and byte was correctly transferred?
  • Do you have only a few drives to convert, or are there a lot?
PLEASE read the Important Warning in the Preface at the top of this page! If you have ANY processes, internal or external, that may write to the drive you are copying off, then those new files may be lost! They may be written to a part of the disk where your copy tool has already finished and moved past. Most especially, if you use a tool like unBALANCE that moves the files, not just copies them, then the drive you are emptying is going to become the most attractive target for new files written to the shares assigned to that drive! Before you format, if at all possible, make sure there isn't anything new on the drive that hasn't been saved, or you will lose it!
And now for the methods: (Note: we'll use ReiserFS as the file system to convert, and XFS as the file system to convert to, but you can substitute any pair of file systems)
Mirror each disk with rsync, preserving parity
Note: this is the main step by step procedure described farther down the page -> Mirroring procedure to convert drives
Start with a large empty drive formatted with XFS, pick a data drive to convert, then copy it all to the empty drive, then swap the XFS drive with the original drive. Then format the ReiserFS drive to XFS, and do it all over again with another data drive, until they are all converted.
Advantages: parity is preserved throughout; no planning needed; array looks identical before and after; data remains exactly on the same drive it started on; besides globally excluding the initial cleared drive, no User Share configuration has to be done at all; no file or folder is ever copied more than once, which makes this one of the fastest methods; except for brief intervals, normal system operation continues unaffected
Disadvantages: users have found this method to seem convoluted, the disk swapping is too complicated for some; because parity is preserved, copying is slower; second parity drive must be rebuilt
Share based, no inclusions, no parity
Note: only for systems with no share inclusions and exclusions, that is, all shares are allowed to use all data drives
Start by unassigning the parity drives, then use rsync or unBALANCE or MC to save all of the data on a ReiserFS drive to the other data drives; then format it with XFS, and repeat on another data drive, continuing until all drives are converted; then rebuild parity
Advantages: faster if lots of data, because no parity drive; simpler to understand, no drive swapping
Disadvantages: requires rebuilding parity drives afterward (which can take a day or 2, which is why it's overall not faster if not much data); some planning and space management needed; small shares may be scattered across the drives; no parity until rebuilt; some data may be moved multiple times, slowing down the overall speed; normal system operation is possible but without parity protection
Share based with inclusions, no parity
Start by unassigning the parity drives, and turning off User Shares, then use rsync or unBALANCE or MC to save all of the data on a ReiserFS drive to the other data drives; then format it with XFS, and repeat on another data drive, continuing until all drives are converted; then rebuild parity and correct the User Share configuration
Advantages: faster if lots of data, because no parity drive; simpler to understand, no drive swapping; with shares turned off, you are less likely to have additional unsaved files written to the drive being emptied, and possibly lost
Disadvantages: requires rebuilding parity drives afterward (which can take a day or 2, which is why it's overall not faster if not much data); some planning and space management needed; small shares may be scattered across the drives; no parity until rebuilt; no User Shares until conversion is complete (if you keep User Shares enabled, may have duplication warnings and issues with wrong inclusions); some data may be moved multiple times, slowing down the overall speed; normal system operation is probably not possible; afterward, you will have to find all of your sets of data and correct the share inclusion and exclusion configuration (you may need to keep notes of where each data set moves to)
Share based, no inclusions, preserving parity
Note: only for systems with no share inclusions and exclusions, that is, all shares are allowed to use all data drives
Use rsync or unBALANCE or MC to save all of the data on a ReiserFS drive to the other data drives; then format it with XFS, and repeat on another data drive, continuing until all drives are converted
Advantages: simpler to understand, no drive swapping
Disadvantages: some planning and space management needed; small shares may be scattered across the drives; some data may be moved multiple times, slowing down the overall speed; normal system operation is possible
Share based with inclusions, preserving parity
Start by turning off User Shares, then use rsync or unBALANCE or MC to save all of the data on a ReiserFS drive to the other data drives; then format it with XFS, and repeat on another data drive, continuing until all drives are converted; then correct the User Share configuration
Advantages: simpler to understand, no drive swapping; with shares turned off, you are less likely to have additional unsaved files written to the drive being emptied, and possibly lost
Disadvantages: some planning and space management needed; small shares may be scattered across the drives; no User Shares until conversion is complete (if you keep User Shares enabled, may have duplication warnings and issues with wrong inclusions); some data may be moved multiple times, slowing down the overall speed; normal system operation is probably not possible; afterward, you will have to find all of your sets of data and correct the share inclusion and exclusion configuration (you may need to keep notes of where each data set moves to)
(work in progress)

Mirroring procedure to convert drives

Note: this is the Mirror each disk with rsync, preserving parity method
Important Notes!
  • This procedure requires unRAID v6.2 or later. If you are running any release of unRAID v6.0 or v6.1, then you need to go to the original post, which has a different / better way to swap the drives (that swap method doesn't work in v6.2).
  • The procedure below is based on a conversion from ReiserFS to XFS, but the particular file systems are not important. Just substitute the actual file systems you are converting, to and from. In our example, ReiserFS will be the source file system, and XFS will be the resultant file system.
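If you are not sure which file system each data disk currently uses, the console can report it. The following is a small sketch, assuming a Linux `df` that supports the `-T` option and the /mnt/diskN layout; the function name is hypothetical.

```shell
# Print the file system type of the file system that holds path $1.
fs_type() {
    df -PT "$1" | awk 'NR==2 {print $2}'
}

# List every mounted data disk with its file system type.
# (Paths assume unRAID's /mnt/diskN layout; nothing is printed if none exist.)
for d in /mnt/disk[0-9]*; do
    [ -d "$d" ] || continue
    printf '%s\t%s\n' "$d" "$(fs_type "$d")"
done
```

The same check is handy after each conversion, to confirm that the slot you just finished now reports the new file system.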
System assumptions before you begin
  • You are running unRAID v6.2 or later (if not, use the procedure in this post instead)
  • You have one Parity drive, not two! If you have dual parity, this procedure will invalidate the second parity drive. You can still use this procedure, but you might as well unassign the second parity drive until you have finished converting all of the drives you wish to convert. Then you can reassign it and let it rebuild. Why? The first parity drive does not care about drive positioning within the array, so we can freely swap the drives around. The second parity drive *does* care, so no drives can move or be swapped, at all!
  • You have a given number of data drives you wish to convert (we will use an example of 10 data drives)
  • You have prepared a swap drive, a new empty drive that is as large or larger than the largest data drive; preferably you have Precleared it or otherwise tested it, and you know it's a good drive (read A preliminary step above if you aren't sure what the swap drive is)
  • Note: the swap drive should not be formatted with XFS (the resultant file system); if it is, then you will need to clear it or format it to something else first
Whether you added a drive, or cleared off a drive, we'll call this empty drive the swap drive. The procedure below will use an example of an existing array with 10 data drives plus a new and added data drive, but the instructions are basically the same. This swap drive, no matter how you make it, will always have the same unRAID disk number (Disk 11 in the examples below). So while the swap drive begins as the drive you just added or emptied, the swap drive will actually be a different physical drive for each conversion, but the SAME unRAID disk number (Disk 11 in our example) throughout the procedure.
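Before committing to a conversion order, it can help to confirm that the data on the drive to be converted should fit on the swap drive. This is a rough sketch only: it compares the space used by the files with the free space on the target and ignores differences in file system overhead, so leave a margin. The function names are made up for this example.

```shell
# Space used by everything under $1, in KiB.
used_kb() {
    du -sk "$1" | cut -f1
}

# Space still available on the file system holding $1, in KiB.
avail_kb() {
    df -Pk "$1" | awk 'NR==2 {print $4}'
}

# Succeeds when everything under $1 should fit in the free space of $2.
fits_on() {
    [ "$(used_kb "$1")" -le "$(avail_kb "$2")" ]
}

# Example usage (commented out; Disk 10 is the source, Disk 11 the swap drive):
#   fits_on /mnt/disk10 /mnt/disk11 && echo "disk10 should fit on disk11"
```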
A summary of this method
  • Steps 1 - 7 are just prep, figuring out a strategy, and preparing the initial drive. Only takes a few minutes (unless you need to run a parity check)
  • Steps 8 - 9 are about copying, saving your data, with optional additional verification - can take quite a while!
  • Steps 10 - 18 are about swapping the converted drive for the original, and preparing the drive for the next copy, takes only a few minutes
  • Drive is done, now loop back to Step 8 to start copying the next drive
The procedure

  1. If you have not run a successful Parity Check recently, do so now. You want to be certain that the array is perfect before you start
  2. Prepare a strategy for the order of drive conversion. Because you can't replace a larger drive with a smaller drive (unless the total file space used will fit on the smaller drive), you will have to order the conversions so that your largest data drive is first, then the next largest, then the next, with the smallest data drive being last. Obviously, the order doesn't matter for drives that are the same size.
  3. If your empty swap drive is not already installed and assigned, install it, and with the array stopped, assign it to the next empty drive slot (for our example, we will assign it to Disk 11)
  4. If you have enabled User Shares (and most users have), go to Settings -> Global Share Settings and add your swap disk to ' Excluded disk(s) ' (for our example, we would put disk11)
  5. Click on the disk name of your swap drive (e.g. 'Disk 11') and change the format to XFS if it isn't already, then click Apply and Done
  6. Start the array; your empty swap drive should show as 'Unmountable', and a Format button will be present
  7. Click the check box for formatting, then click the Format button; it takes a few minutes, says it's formatting; when done, array should show an additional drive, almost completely empty, formatted with XFS
  8. At the console or within a screen session, copy all data from your drive to be converted to the empty swap drive; use an rsync command based on the following, except change the drive numbers as appropriate for your system; type it exactly with the same slashes, upper and lower case matter; this command will take a long time but parity will be fully preserved; when complete, prompt should return with no errors showing; your array now has 2 drives that are identical except for their format, their file system (one of them is excluded from shares)
    rsync -avPX /mnt/disk10/ /mnt/disk11/
    (in our example, we are copying our large disk10 to the new and empty swap drive)
  9. This step is optional, as the previous rsync automatically checksums each transfer. But if you would like to verify that the end-to-end transfer was perfect, perform the next rsync command below; it will take a long time, and probably nothing will be copied unless the drive has been updated (see warning below!) since the full copy above; there's no progress info, it's over when the prompt returns
    rsync -rcvPX /mnt/disk10/ /mnt/disk11/
  10. Stop the array; we are now going to swap the drive assignments
  11. Click on Tools, then New Config, then Retain current configuration: (from version 6.6 onwards it's Preserve current assignments), then select All, then check Yes I want to do this, click Apply then Done
    Important Warning! Doing a New Config will reset the file system type for all disks to Auto! While usually this is not a problem, especially with the latest unRAID, in some circumstances this can lead to unmountable disk(s). If that happens, then you need to select the correct file system for those disk(s). If in doubt, ask for help!
  12. Go back to the Main page and click on the dropdown for the swap drive (e.g. Disk 11) and unassign it (click on "unassigned" or "no device")
  13. Click on the dropdown for the other drive (the one being converted, e.g. Disk 10 to start), and reassign it as the physical drive of the swap drive, the drive that was empty (e.g. Disk 11)
  14. Click on the dropdown for the slot of the swap drive (e.g. Disk 11) and reassign it to the physical drive that was being converted (e.g. Disk 10)
  15. Important! Click on each drive name (e.g. Disk 10 and Disk 11) and swap the file system format of the drive - if it's ReiserFS change it to XFS, if it's XFS change it to ReiserFS; it's important to swap the disk formats as well as the physical drive assignments
    At this point, you have now swapped the 2 drives, which is fine as they are identical (except for file system format); parity remains valid because the same drives are assigned, their slot does not matter; however if you have a second parity drive, it's now invalid!
  16. You should see all array disks with blue icons, and a warning (All data on the parity drive will be erased when array is started), and a check box for Parity is already valid. VERY IMPORTANT! Click the check box! Make sure that it's checked to indicate that Parity is already valid or your Parity disk will be rebuilt! Then click the Start button to start the array. It should start up without issue (and without erasing and rebuilding parity), and look almost identical to what it looked like before the swap, with no parity check needed. However the XFS disk is now online and its files are now being shared as they normally would. Check it all if you are in doubt.
    Before you click the Start button, you may still see the warning about the parity drive being erased, but if you have put a check mark in the checkbox for Parity is already valid, then the parity drive will NOT be erased or touched. It is considered to be already fully valid. But if you do start the array without the box checked, no harm done. It will rebuild the parity drive exactly as it already is, taking a day or 2 and you'll have to wait for it to finish, but the only thing lost is your time!
  17. If you are sure it's all fine, stop the array and click the empty swap disk slot (e.g. still Disk 11), and change the format to XFS, then click Apply and Done
  18. Start the array; the Format button should be available, format it now; when done, your empty disk slot now has a fresh and empty disk formatted with XFS and ready to fill again; your data drive has completed the conversion process and is already back online, with all files and shares intact, but formatted with XFS
You are now ready to convert the next drive, so circle back to Step 8 and repeat these steps (Step 8 through Step 18), substituting your next drive to be converted; the empty and excluded swap disk slot will always be the same (e.g. always Disk 11 in our example), the other will change as you convert different data drives
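As a quick sanity check after the copy and before stopping the array for the swap, you can compare the number of files on the two drives. This is a cheap sketch and not a substitute for the optional checksum verification; it only catches missing files, not damaged ones. The function names are illustrative.

```shell
# Count regular files under a directory.
count_files() {
    find "$1" -type f | wc -l | tr -d ' '
}

# Succeeds when both trees hold the same number of regular files.
same_file_count() {
    [ "$(count_files "$1")" -eq "$(count_files "$2")" ]
}

# Example usage (commented out; disk numbers follow the example above):
#   same_file_count /mnt/disk10 /mnt/disk11 || echo "File counts differ, do NOT swap yet"
```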

Aftermath
When done, you have an empty XFS drive appended to your system, probably your smallest drive, and still excluded. It's up to you what you want to do with it. You can leave it as is, or you can unassign it and rebuild parity, or you can use the parity preserving remove-a-drive procedure, instructions elsewhere. Remember, it's probably still globally excluded from shares.
I do recommend that if you are going to try this procedure, you read through the steps and notes carefully until you fully understand them, and understand the importance of each detail. Missing a step or typing the wrong disk number could be disastrous!
If you wish, you can perform parity checks at any point during and after. I don't believe they are necessary; I only did one before starting, and I believe I did another after it was all done.
Important Warning! If you run the verification copy in Step 9, and it actually copies files, then it is likely you have a process still changing the drive! These newly copied files were not there for the Step 8 copy! You need to determine what process (Docker, plugin, VM, an external backup, or the Mover) made the changes to this drive, and stop it. Then you may need to run Step 9 again, because the process may have made even more changes to folders, after the Step 9 rsync process had moved past those folders. In summary, if the Step 9 copy actually copies any files, then you should probably repeat Step 9 until nothing is copied.
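If you would rather see what a verification pass would copy without actually copying it, rsync can do a checksum comparison as a dry run. Note that this is a variation, not the Step 9 command itself: it adds the `-n` (dry run) and `-i` (itemize changes) options, and the function name is made up for illustration. It still reads every file on both drives, so expect it to take about as long as Step 9.

```shell
# List files that a checksum-based rsync would transfer, without copying anything.
#   -r recurse, -c compare by checksum, -n dry run, -i itemize each change
# Lines starting with ">f" are files that would be transferred.
# $1 = source drive, $2 = destination drive
pending_files() {
    rsync -rcni "$1"/ "$2"/ | grep '^>f' || true
}

# Example usage (commented out; disk numbers follow the example above):
#   if [ -z "$(pending_files /mnt/disk10 /mnt/disk11)" ]; then
#       echo "Nothing left to copy"
#   else
#       echo "Source drive changed; find the writer, stop it, then rerun Step 9"
#   fi
```

An empty result on two consecutive runs is a reasonable sign that nothing is still writing to the source drive.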
I've checked the above pretty carefully, but if you see any errors, PLEASE let me know ASAP! I'm sure it can be improved. Steps 17 and 18 are a repeat of 5, 6, and 7, but it seemed safer to do it this way.
Please let us know of any issues or suggestions!