HOW TO CLONE A DISK WITH DDRESCUE – GNU DDRESCUE ALSO KNOWN AS GDDRESCUE – THE BETTER DDRESCUE TOOL
###############################################
** First off, I am not responsible for any data loss caused by anything in this guide. If you are reading this, these are strictly notes for myself that I write down to organize my thoughts.
 
BEFORE YOU BEGIN
################
* Read this whole guide!
* Get new drives of the same size as the original (cloning to a bigger drive is possible but not recommended, as it is not a true clone and might not behave the same afterwards)
* Note the serial number of the source and destination drives. Because I have been asked before: the source is the drive you are cloning FROM and the destination is the drive you are cloning TO. In a perfect world, before the clone the source drive has all of your meaningful data (and maybe some disk errors, which the clone procedure will attempt to clean up) and the destination drive is empty or has only unimportant stuff on it. After the clone the source drive remains untouched (it has only been read from, nothing has been written to it) and the destination drive is hopefully an exact copy of the source (hopefully without the errors)
* Personally, based on my research and pure opinion, I consider a drive bad once it has 50 reallocated sectors and 1 ATA error. Of course exceptions exist: for example GREEN DRIVES tend to log a lot of ATA errors and still function properly, because their green features (power saving, aka randomly turning off) cause them.
* I would only clone to a drive that has 0 ATA errors and 0 reallocated sectors before the clone. If it gets errors after the clone, it might be time to consider another drive and clone again.
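
The destination-drive rule above can be turned into a quick check. This is only a sketch, not part of the original procedure: the name dest_is_clean is made up for illustration, and it assumes the usual "smartctl -a" layout, where the attribute table carries the raw value in the 10th column and an "ATA Error Count: N" line is printed only when errors have been logged.

```shell
# dest_is_clean: hypothetical helper. Reads "smartctl -a" output on stdin and
# succeeds only when Reallocated_Sector_Ct is 0 and no ATA errors are logged.
# Assumption: the raw value is the 10th column of the attribute table.
dest_is_clean() {
    awk '
        $2 == "Reallocated_Sector_Ct" { realloc = $10 }   # raw reallocated count
        /^ATA Error Count:/           { ata = $4 }        # only present if errors exist
        END { exit (realloc + 0 == 0 && ata + 0 == 0) ? 0 : 1 }
    '
}

# Usage (sdd is just an example device name):
#   smartctl -a /dev/sdd | dest_is_clean && echo "ok to use as destination"
```

The check is deliberately strict; for drives that are known exceptions (such as the green drives mentioned above) you would still have to judge the numbers yourself.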
 
NOTES
######
* IDE drives have the notation hd# and SATA drives have the notation sd# (USB-attached drives also show up as sd#), where # is a letter that uniquely identifies the drive for the current session. If your drives show up as hd devices, change all of my commands from /dev/sd<whatever> to /dev/hd<whatever>
* Note we are using the command ddrescue here (not to be confused with dd_rescue), which is the newer and better clone command
* I noticed that USB drive enclosures work best with VirtualBox, as directly connected SATA disks are awkward to use with VirtualBox
* My old guide on cloning, which uses the older dd_rescue command (not as good, but the guide has extra notes): https://sites.google.com/a/infotinks.com/main/linux—disk-cloning-guide
* It is very important to be 100% certain which device file is the right drive (in Linux everything is a file, so a full drive is just a file like /dev/sda, unlike Windows where a disk is a magical entity that looks like C: or DRIVE_LETTER:). That is why I have you run the dmesg, cat, fdisk and smartctl commands: to verify the drives by drive letter and also by serial number.
 
THE STEPS
#########
 
Pick the VirtualBox or boot CD method; you will know which to pick after reading the guide. I recommend VirtualBox because it does not require rebooting the cloning computer. However, VirtualBox does require USB drive enclosures (unless you figure out how to make VirtualBox see your whole drive as a device, in which case be my guest; just be warned that if you lose all of your data it isn't my fault).
 
VIRTUAL BOX
===========
 
1. Download and install VirtualBox (made by Oracle) with default options 
2. Download the Knoppix image that matches your computer's architecture (the computer where you are going to do the clone procedures)
3. Run Virtual Box
4. Start up a VM with the Knoppix image
 
* With VirtualBox it is best to use USB drive enclosures. As stated before, good luck trying to get VirtualBox to recognize your directly connected SATA or IDE drives.
 
BOOT CD STYLE
=============
 
1. Download the Knoppix image that matches your cloning computer's architecture (the computer where you are going to do the clone procedures)
2. Burn the ISO to a CD
3. Turn off the computer that will be used for the cloning and put in that CD
4. Boot into the Knoppix CD
 
* With a linux computer
 
IN KNOPPIX
##########
 
1. Open up a terminal so that you can start typing commands
 
2. Here we go:
 
ATTACH DRIVES AND FIND OUT INFO
===============================
 
It will go like this: we attach the source drive, run commands to get information, attach the destination drive, then run the same commands to get the updated information. This helps us differentiate between the first drive (source) and the second drive (destination). (The order you plug the drives in doesn't matter in the real world; it just makes this guide flow better.) Finally we run a mass command to check every connected drive's serial number and errors (we will use the error counts later to see whether they grew after the clone; hopefully they didn't, especially on the destination drive).
All commands are shown on their own lines, like the example below. Comments start with a hash mark. Commands are posted in the order that they should be typed. Some commands will take a while to process. Some commands change nothing on the system (read-only commands), meaning they only spit out information for you to see. Other commands do writes, meaning they can do some damage, or some good (hopefully the latter). All commands in the computer universe are either read or write or both.
# comment: read this, but don't type this line (if you do type it in, make sure to include the # with it, so that bash knows to ignore it). bash ignores lines that start with #.
command1
# this is command2, and should be run after command1
command2
(Step 1)
See current drives:
# this next command shows all of the detected drives and their partitions. We need to know this because once we add more drives, we can tell the old drives (which we see now) from the new drives (which we will see later)

cat /proc/partitions 

# I recommend recording what you see here as the "before adding source drive" state; just note it on a piece of paper or copy it to a notepad program
 
(Step 2)
Attach source drive:
See the current drives, how the newly added drive was appended to the system messages (dmesg), and that the drive is now shown in /proc/partitions:
# dmesg shows the running kernel log of the Linux system (so it shows when new hardware is attached, fails, etc.). We don't need to see the whole dmesg output, so we just look at the end of it by "piping" it to tail (which shows the last 10 lines by default). If that isn't enough for you, you can view the whole output with: "dmesg"
dmesg | tail
# now that we added the source drive and dmesg should have reported it being added, we will see it in "cat /proc/partitions". If you don't see anything, wait a minute. If you still don't see anything and you're sure the drive has been attached, the drive could be completely damaged. I recommend trying a different port/slot, but if that doesn't work, take it to a data recovery service that can clone completely gone-bad drives (one such vendor is http://www.datarecoverynj.com/)
cat /proc/partitions
# here is another way to view the partition details of all drives. You will see the source drive
fdisk -l
# it's important to note the letter of the drive: we attached the drive, and we see it came up as "sda", and it probably also has 3 partitions (more or less): "sda1", "sda2", "sda3"
# we simply need to remember that "sda" is our source drive
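
Instead of comparing the two /proc/partitions listings by eye, the before/after comparison can be sketched as a small function. The name new_devices and the snapshot file names are made up for illustration; the only assumption is the standard /proc/partitions layout, where the device name is the 4th column.

```shell
# new_devices: hypothetical helper. Prints the device names that appear in the
# second /proc/partitions snapshot but not in the first.
# Save the snapshots with:  cat /proc/partitions > before.txt   (and after.txt)
new_devices() {
    awk '
        # first file: remember every name (skip the header and blank lines)
        NR == FNR { if ($4 != "" && $4 != "name") seen[$4] = 1; next }
        # second file: print the names we have not seen before
        $4 != "" && $4 != "name" && !($4 in seen) { print $4 }
    ' "$1" "$2"
}

# Usage:
#   cat /proc/partitions > before.txt    # before attaching the drive
#   cat /proc/partitions > after.txt     # after attaching the drive
#   new_devices before.txt after.txt
```

Whatever it prints (for example a disk plus its partitions) is the drive you just attached.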
 
(Step 3)
Attach destination drive (the drive that we will be cloning the bad drive to; again, to state the obvious, the bad drive is the source drive)
 
(Step 4)
See current drives again:
# similar process as when we attached the source drive
dmesg | tail
cat /proc/partitions
fdisk -l
# the destination drive probably shows up as "sdb" and, being a brand new drive, probably has no partitions, so we don't see things like "sdb1", "sdb2", etc.
 
(Step 5)
Verify the drives one by one by serial number:
Example to check sda:
smartctl -a /dev/sda
# smartctl prints a lot of output with a lot of blank lines; here is a trick to make the empty lines go away for easier reading:
# smartctl -a /dev/sda | grep .
# or, since we are only interested in the serial number:
# smartctl -a /dev/sda | grep -i "serial"
# grep is a line-by-line search utility in Linux. Grepping for a . matches every non-empty line. The -i switch means don't be case sensitive, so searching for "serial" also finds "SeRIAL" or the more likely "Serial".
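
If you wrote the serial numbers down earlier, the comparison can be sketched like this. The function name serial_of and the serial shown in the usage comment are made up; the sketch assumes smartctl prints the serial on a line of the form "Serial Number:    <value>".

```shell
# serial_of: hypothetical helper. Reads smartctl output on stdin and prints
# only the value of the "Serial Number:" line.
serial_of() {
    # split each line at the colon (plus any following spaces) and compare
    # the label case-insensitively
    awk -F': *' 'tolower($1) == "serial number" { print $2; exit }'
}

# Usage (device name and serial are examples only):
#   [ "$(smartctl -a /dev/sdc | serial_of)" = "WD-EXAMPLE12345" ] && echo "sdc is the drive I noted"
```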
 
(Step 6)
Check every drive and make note of the serial number and the errors on the drives. Hopefully your new drives, the destination drives, have 0 ATA errors and 0 reallocated sectors (if not, I would consider getting new drives to clone to, although exceptions do exist):
# this is a mini script that I recommend copy-pasting in, but if you can't, retype it. It runs smartctl on every drive letter a through z and shows only the serial number, ATA errors, model number, user capacity, etc. (the info we need):

for i in a b c d e f g h i j k l m n o p q r s t u v w x y z; do echo "===drive sd$i==="; smartctl -a /dev/sd$i | grep -Ei "reallocated_sector|ata error|serial|model|user capacity"; done

# or try this one, which also looks at current_pending sectors and offline_uncorrectable (some disks don't have reallocated sectors or ATA errors but are still bad according to other variables such as current_pending sector and offline_uncorrectable)

for i in a b c d e f g h i j k l m n o p q r s t u v w x y z; do echo "===drive sd$i==="; smartctl -a /dev/sd$i | grep -Ei "reallocated_sector|ata error|serial|model|user capacity|current_pending|offline_uncorrectable"; done
 
(Step 7)
Once you know which disk is the source and which is the destination, the clone can begin. Make sure not to include the partition number. For example, here is a disk: sda, or hdc. Here is a partition: sda1, or hdc1. We don't want the partition form.
 
Again: know which is the source drive and which is the destination drive. Make a note of it, like "sdc is the source drive and sdd is the destination drive".
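
The disk-versus-partition rule can be checked mechanically before typing any clone command. This is a minimal sketch; the name is_whole_disk is made up, and it only understands the sd/hd naming used in this guide (other naming schemes are simply rejected).

```shell
# is_whole_disk: hypothetical helper. Succeeds for whole-disk names such as
# sda or /dev/hdc and fails for partition names such as sda1 or /dev/hdc1.
# Only the sd*/hd* naming from this guide is recognised.
is_whole_disk() {
    case "${1#/dev/}" in                  # strip a leading /dev/ if present
        sd*[0-9]|hd*[0-9]) return 1 ;;    # ends in a digit: that is a partition
        sd[a-z]*|hd[a-z]*) return 0 ;;    # letters only: whole disk
        *)                 return 1 ;;    # anything else: not recognised
    esac
}

# Usage:
#   is_whole_disk /dev/sdc  && echo "sdc is a whole disk"
#   is_whole_disk /dev/sdc1 || echo "sdc1 is a partition, do not clone with it"
```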
 
MOVING TO GOOD FOLDER
=====================
SIDENOTE: above I talked about sda being the source drive and sdb being the destination drive. Here we switch gears to something more realistic (as the system you're using probably already has another system drive in it), so the source drive will have a letter like sdc and the destination drive will be sdd
 
(Step 8)
For this example lets pretend sdc is the source, and sdd is the destination.
cd ~
pwd

SIDENOTE: the above just gets you to the currently logged in user's home directory, which should be /root/ if you're the root user, /home/user1 if you're user1, or /home/knoppix if you're the knoppix user (the Knoppix operating system has a knoppix user which is like its main admin user; the root user still has more privileges than the knoppix user, but for this article the default knoppix user should be sufficient). Keep this in mind: if any command gives you an error saying "you don't have permissions" or something like it, precede the command with "sudo". So if you didn't have permission to run a command like "ls -lisah", run it as "sudo ls -lisah" (in real life ls -lisah is allowed for any user; I'm just using it to make the point).

Record what folder you are in; we are going to create the log file here
 
THE CLONE
==========
(Step 9)
Here come the cloning commands: "ddrescue -n <source> <destination> <logfile>" and then again without "-n" but with "-r1"
# The first command clones data from sdc to sdd but skips over any areas that give it trouble (-n tells ddrescue not to dig into the error areas; it is the --no-split option in older GNU ddrescue releases and --no-scrape in newer ones). It keeps track of what it did and did not copy in ddrlog.txt, so the next time a command runs against that log (the same command rerun after being cancelled, or the next command) it knows where to resume. The second command reads ddrlog.txt, which after the first command finishes records where the errors are (the areas that did not get copied), and goes back to copy those bad areas. If any block gives it trouble, it retries it one extra time (-r1), so bad blocks are tried a total of 2 times before it gives up and moves on to the next block. If you want 6 tries, change it to -r5 (the first try plus 5 retries is 6 in total).

ddrescue -v -n /dev/sdc /dev/sdd ddrlog.txt
ddrescue -v -r1 /dev/sdc /dev/sdd ddrlog.txt

# --- everything below, is if the commands didnt run due to permission error, or the program asking to use "--force" --- #

# --- move on to next step if above commands worked --- #

# if the program complains about a permissions issue, try it with sudo (Knoppix does not ask for a sudo password):

sudo ddrescue -v -n /dev/sdc /dev/sdd ddrlog.txt
sudo ddrescue -v -r1 /dev/sdc /dev/sdd ddrlog.txt  

# --- move on to next step if above commands worked --- #

# if program complains about needing to use "--force", run it like this:

ddrescue -v -f -n /dev/sdc /dev/sdd ddrlog.txt
ddrescue -v -f -r1 /dev/sdc /dev/sdd ddrlog.txt 

# or if you need to run it with sudo and force:

sudo ddrescue -v -f -n /dev/sdc /dev/sdd ddrlog.txt
sudo ddrescue -v -f -r1 /dev/sdc /dev/sdd ddrlog.txt 

If the system crashes, restart the command it crashed on; it will pick up where it left off, because the logfile keeps track of the program's progress.
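
Because source and destination are so easy to mix up, it can help to print the two commands for review before running anything. The sketch below is an illustration only: clone_plan is a made-up name, it just echoes the two passes described above and refuses the obvious mistake of giving the same device twice. It does not run ddrescue itself.

```shell
# clone_plan: hypothetical helper. Prints (does not run) the two ddrescue
# passes for a given source, destination and logfile.
# The logfile defaults to ddrlog.txt, the name used in this guide.
clone_plan() {
    if [ -z "$1" ] || [ -z "$2" ]; then
        echo "usage: clone_plan <source> <destination> [logfile]" >&2
        return 2
    fi
    if [ "$1" = "$2" ]; then
        echo "refusing: source and destination are the same device" >&2
        return 1
    fi
    echo "ddrescue -v -n $1 $2 ${3:-ddrlog.txt}"     # pass 1: good areas first
    echo "ddrescue -v -r1 $1 $2 ${3:-ddrlog.txt}"    # pass 2: retry bad areas once
}

# Usage:
#   clone_plan /dev/sdc /dev/sdd
```

Read the printed lines against the serial numbers you noted, then type or paste them yourself (adding sudo or -f as described above if needed).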
 
(optional step 10)
OPTIONALLY: If you encounter a crash, you can clone from the back of the drive, retrying all of the troubled areas starting from the end of the disk
# We add "-R" for reverse (again, this is optional, as the gddrescue algorithm attacks the cloning problem in its own interesting way that probably doesn't require going in reverse; at least that's according to its "man" page)

ddrescue -v -R -n /dev/sdc /dev/sdd ddrlog.txt
ddrescue -v -R -r1 /dev/sdc /dev/sdd ddrlog.txt

# -- if it complains about permissions use sudo -- #

sudo ddrescue -v -R -n /dev/sdc /dev/sdd ddrlog.txt
sudo ddrescue -v -R -r1 /dev/sdc /dev/sdd ddrlog.txt

# -- if it complains about "--force" -- #

ddrescue -v -f -R -n /dev/sdc /dev/sdd ddrlog.txt
ddrescue -v -f -R -r1 /dev/sdc /dev/sdd ddrlog.txt

# -- if it complains about "--force" and you need to use sudo -- #

sudo ddrescue -v -f -R -n /dev/sdc /dev/sdd ddrlog.txt
sudo ddrescue -v -f -R -r1 /dev/sdc /dev/sdd ddrlog.txt

This program will not recover the same sector twice (once it has been recovered and logged) because of the way it keeps the log file, so it is not a waste of time to repeat the same forward commands over and over if you experience crashes. It also doesn't hurt to run a reverse command.
 
Curious to know what all of these command line switches mean? -v is verbose, so you see more output. -f (--force) confirms that you really want to overwrite the destination device. -R works from the back of the disk. -r1 retries bad areas once. -n copies the good areas first and skips the errors. That is why the first full pass uses "-n": it clones all the good areas and logs all of the areas it skipped (the errored areas), then the "-r1" pass goes back and retries those skipped areas.
 
THE END
=======
 
(Step 11)
When it's done, make note of the errors and see whether the source drive's error numbers grew (if they did, that just confirms the clone needed to happen: the drive was going bad, and errors grow even faster on bad drives). Also make note of the destination drive (its error numbers shouldn't have grown):
# just like in the previous step:

for i in a b c d e f g h i j k l m n o p q r s t u v w x y z; do echo "===drive sd$i==="; smartctl -a /dev/sd$i | grep -Ei "reallocated_sector|ata error|serial|model|user capacity"; done

# or try this one, which also looks at current_pending sectors and offline_uncorrectable (some disks don't have reallocated sectors or ATA errors but are still bad according to other variables such as current_pending sector and offline_uncorrectable)

for i in a b c d e f g h i j k l m n o p q r s t u v w x y z; do echo "===drive sd$i==="; smartctl -a /dev/sd$i | grep -Ei "reallocated_sector|ata error|serial|model|user capacity|current_pending|offline_uncorrectable"; done
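
If you saved the smartctl output of a drive to a file before the clone and again afterwards, the growth can be listed side by side. A minimal sketch, with assumptions: smart_delta and the file names are made up, the first file must contain the attribute table, and the raw value is taken from the 10th column of that table.

```shell
# smart_delta: hypothetical helper. Takes two saved "smartctl -a" outputs for
# the same drive (before, after) and prints "name before -> after" for the
# three error counters this guide cares about.
smart_delta() {
    awk '
        $2 ~ /^(Reallocated_Sector_Ct|Current_Pending_Sector|Offline_Uncorrectable)$/ {
            if (NR == FNR) before[$2] = $10                       # first file: remember raw value
            else printf "%s %d -> %d\n", $2, before[$2], $10      # second file: show the change
        }
    ' "$1" "$2"
}

# Usage (file names are examples):
#   smartctl -a /dev/sdc > sdc_before.txt     # before the clone
#   smartctl -a /dev/sdc > sdc_after.txt      # after the clone
#   smart_delta sdc_before.txt sdc_after.txt
```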
 
(Step 12)
When it's done, make sure the process isn't running anymore; you shouldn't see ddrescue in any of the outputs.
# ps aux lists the processes of all users on the system
ps aux
# we are only interested in ddrescue. grep usually finds itself in the process list, so we tell grep to exclude itself using 'grep -v "grep"', which means don't show lines that contain the word "grep"
ps aux | grep "ddrescue" | grep -v "grep"

# another variation: putting [] around any one character of ddrescue keeps grep itself from showing up
ps aux | grep "[d]drescue"
# SIDENOTE: here is an article explaining why [] works: http://unix.stackexchange.com/questions/74185/how-can-i-prevent-grep-from-showing-up-in-ps-results
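
To see the [] trick work without a running clone, it can be tried on a canned process list. The sample text and the name count_ddrescue below are made up for illustration; only the grep pattern comes from the step above.

```shell
# count_ddrescue: hypothetical helper. Reads "ps aux" style output on stdin
# and prints how many lines belong to a ddrescue process.
count_ddrescue() {
    grep -c "[d]drescue"
}

# Made-up sample of a process list: one real ddrescue and the grep itself.
sample_ps='root  4021  ddrescue -v -n /dev/sdc /dev/sdd ddrlog.txt
root  4077  grep [d]drescue'

# The pattern [d]drescue matches the text "ddrescue", but the grep line only
# contains the text "[d]drescue", so it is not counted.
printf '%s\n' "$sample_ps" | count_ddrescue    # prints 1
```

On the live system the equivalent is "ps aux | count_ddrescue"; a result of 0 means no ddrescue is running.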
 
(Step 13)
Then just shut down Knoppix; this ensures all of the connections to the disks are closed. Honestly, just unplugging the drive is okay once the clone is done and "ps aux" shows ddrescue is no longer running.
 
To shutdown gracefully:
shutdown -h now
 
 
DDRESCUE COMMAND USEAGE AND CLONE ALGORITHM 
###########################################
 
THE USAGE
=========
-v, --verbose: Verbose mode. Further -v's (up to 4) increase the verbosity level.

-r n, --retries=n: Exit after given number of retry passes. Defaults to 0. -1 means infinity. Every bad sector is tried only one time per pass. To retry bad sectors detected on a previous run, you must specify a non-zero number of retries.

-R, --reverse: Reverse direction of copying, retrying, and the sequential part of splitting, running them backwards from the end of the input file.

There are a lot more options; we are just using those.
 
THE ALGORITHM
=============
The algorithm of ddrescue is as follows (the user may interrupt the process at any point, but be aware that a bad drive can block ddrescue for a long time until the kernel gives up):
 
1) Optionally read a logfile describing the status of a multi-part or previously interrupted rescue. If no logfile is specified or is empty or does not exist, mark all the rescue domain as non-tried.
 
2) (First phase; Copying) Read the non-tried parts of the input file, marking the failed blocks as non-trimmed and skipping beyond them, until all the rescue domain is tried. Only non-tried areas are read in large blocks. Trimming, splitting and retrying are done sector by sector. Each sector is tried at most two times; the first in this step as part of a large block read, the second in one of the steps below as a single sector read.
 
3) (Second phase; Trimming) Read forwards one sector at a time from the leading edge of the largest non-trimmed block, until a bad sector is found. Then read backwards one sector at a time from the trailing edge of the same block, until a bad sector is found. For each non-trimmed block, mark the bad sectors found as bad-sector and mark the rest of that block as non-split. Repeat until there are no more non-trimmed blocks.
 
4) (Third phase; Splitting) Read forwards one sector at a time from the center of the largest non-split block, until a bad sector is found. Then read backwards one sector at a time from the center of the same block, until a bad sector is found. If the logfile is larger than '--logfile-size', read the smallest non-split blocks until the number of entries in the logfile drops below '--logfile-size'. Repeat until all remaining non-split blocks have less than 5 sectors. Then read the remaining non-split blocks sequentially.
 
5) (Fourth phase; Retrying) Optionally try to read again the bad sectors until the specified number of retries is reached.
 
6) Optionally write a logfile for later use.
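
The block states named in the algorithm are what ddrescue records in the logfile, so the logfile can tell you how far a rescue got. The sketch below rests on assumptions about the file layout: comment lines start with #, the first non-comment line is the current-position line, and every following line is "position size status" in hexadecimal, where + means rescued, - means bad sector, and ?, * and / mean non-tried, non-trimmed and non-split. The name logfile_summary is made up.

```shell
# logfile_summary: hypothetical helper. Adds up the block sizes in a ddrescue
# logfile and prints byte totals for rescued, bad and still-pending areas.
# The function body is in ( ) so its variables do not leak into your shell.
logfile_summary() (
    rescued=0 bad=0 pending=0 seen_status_line=0
    while read -r pos size status; do
        case "$pos" in ''|'#'*) continue ;; esac    # skip blank and comment lines
        if [ "$seen_status_line" -eq 0 ]; then      # first data line: current position
            seen_status_line=1
            continue
        fi
        case "$status" in
            '+') rescued=$(( rescued + size )) ;;   # sizes are hex like 0x00010000
            '-') bad=$(( bad + size )) ;;
            *)   pending=$(( pending + size )) ;;   # ?, * or / : not finished yet
        esac
    done < "$1"
    echo "rescued=$rescued bad=$bad pending=$pending"
)

# Usage:
#   logfile_summary ddrlog.txt
```

After the "-n" pass you would expect pending to be non-zero on a damaged drive; after the "-r1" pass whatever is left under bad is what ddrescue could not read.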
 
EXTRA DOCUMENTATION
###################
ddrescue vs gddrescue: this article used GNU ddrescue, which on Debian-based systems such as Knoppix is packaged as gddrescue and has the command line name ddrescue. The older program is, annoyingly, packaged under the name ddrescue and has the command line name dd_rescue. Just remember the underscore command is the older one.
