Deletion Recovery

Here I will present two tricks for getting back that very important letter you just erased. The first is a pure software solution and is specific to the Linux kernel (2.6), while the second is more hardware oriented and can be done using most other operating systems.

Rescuing open files

If a process happens to still have the file you just deleted open, it's easy. Look up the PID and file descriptor of the file you are looking for, e.g.:

ls -l /proc/[0-9]*/fd/* | grep just_deleted.txt

If just_deleted.txt has already been removed, the above command could yield:

lrwxrwxrwx 1 jengelh root 64 Aug 28 11:45 /proc/1234/fd/3 -> /tmp/just_deleted.txt (deleted)

In kernel 2.6, following one of these procfs links behaves differently from calling readlink(2) on it: reading the link (as `ls -l` did above) returns a string with " (deleted)" appended, so ls -l `readlink /proc/1234/fd/3` will fail, whereas following the link takes you directly to the inode. You can therefore grab the file as follows:

cat /proc/1234/fd/3 >/tmp/rescued.txt
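The same lookup and copy can be sketched in Python. This is only an illustration under assumptions: the function names find_deleted_fds and rescue are made up for this example, and the proc_root parameter exists only so the code can be pointed at a test directory. It relies on the " (deleted)" suffix described above.

```python
import os
import shutil


def find_deleted_fds(name, proc_root="/proc"):
    """Return the fd links under proc_root that point at a deleted file
    whose path ends in `name` (hypothetical helper for illustration)."""
    hits = []
    for pid in sorted(os.listdir(proc_root)):
        if not pid.isdigit():
            continue  # skip non-process entries such as "self" or "sys"
        fd_dir = os.path.join(proc_root, pid, "fd")
        try:
            fds = os.listdir(fd_dir)
        except OSError:
            continue  # process exited, or we lack permission
        for fd in fds:
            link = os.path.join(fd_dir, fd)
            try:
                target = os.readlink(link)  # string form, with the suffix
            except OSError:
                continue
            if target.endswith(name + " (deleted)"):
                hits.append(link)
    return hits


def rescue(link, dest):
    """Copy the data behind an fd link to dest. open() follows the link
    to the inode, just like the cat command does."""
    with open(link, "rb") as src, open(dest, "wb") as out:
        shutil.copyfileobj(src, out)
```

Typical use would be to call find_deleted_fds("just_deleted.txt") and pass each returned path to rescue() together with a destination on another filesystem.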

Restoring a deleted file

The following method works well only if you know part of the contents of the file. With text files this is easy, but not so with binary data such as pictures. You need something you can search for among the sheer number of gigabytes.

When you notice that you have deleted a file, act quickly. Unmount the affected partition, or use a rescue CD if it is the root partition. You can also get away with a running system. Many filesystems only delete the directory entry for a file. Some others may zero it. Pulling the power may be a last resort.

Depending on how the filesystem organizes its files, you might have a "safe area". That means, each directory has a "start position" to put its files on disk. Even if the directory grows, you can hope -- in a filesystem where there is enough free disk space -- that /var/log is "far away" enough from /home/deleted.

The use of a live CD is recommended to minimize the potential impact of writes to the partition. However, if you cannot interrupt disk access (by unmounting, pulling cables or the power, etc.), for example because it is a server that needs 99% uptime, keep the disk I/O as low as possible. Do not try to shut down the services the normal way (by going through runlevels or the Shutdown button). Perhaps freeze processes that might become active (e.g. cron) by using killall -STOP cron. (Unfreeze later with killall -CONT cron.)

After securing the disk

The best way to get at a raw disk is to use some kind of Linux or UNIX derivative. Windows is totally unusable for such a task at this time. Two ways exist:

Finding the start position - with a hex editor

I prefer the hex editor ht when it comes to fixed strings (most of the time). Since version 2.x, ht supports large files (> 4 GB) so you do not generally need the Perl hack below. Open said device using `ht /dev/hda5`, then use F7 to search for the string. If a match is found, you can scroll up/down to see if the file looks complete. If that is not the case, try searching again, since you might have just found a match in the filesystem's journal rather than the data section.

A precompiled package for ht is available in my SUSE repository.

Finding the start position - with a little Perl

perl -le 'open F,"<","/dev/hda5" or die $!;while(<F>){if(/int main\(int argc/){print"(found) ",tell(F),"\a";<STDIN>}printf"(none) %d MB\n",tell(F)/1048576 if $. % 40960==0}'

You can write that as a shell command without newlines. Replace /dev/hda5 with the device to scan, and "int main\(int argc" with the text you are actually looking for; keep the Perl syntax for regexes in mind! This small program prints a status line showing how many megabytes it has processed so far. When your text is found, it prints the position of the next newline following your text and waits for you to press Enter before continuing. With that number (whose line could look like this):

(found) 11807724893

you look for your file around that position. You can use ht for that, too. Before there was ht with Large File Support however, I had written a tool that works like a mixture of the tools od/hexdump, head and tail; it is called tailhex and can be found in HXtools.

tailhex -e 11807724893 /dev/hda5 | less -MSi

Lower the number to start reading the disk's contents earlier - remember, the original number indicates the next newline after your search text (a limitation of the perl snippet). When you have found a good "start spot" where your file [might] begin, note its position (tailhex: the hexadecimal number on the left) and start getting the file off. (Hint: most files begin on a block boundary, which is 4096 bytes in most default setups. Might also be less, but a power of two.)
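Since the Perl snippet only reports the newline after the match, here is a rough Python sketch that reports the byte offset where the search text itself begins, and rounds an offset down to a block boundary. It is an illustration rather than a replacement for ht or tailhex; the names find_all and align_down, the 1 MiB chunk size and the 4096-byte default block size are assumptions chosen for this example.

```python
def find_all(path, needle, chunk_size=1 << 20):
    """Yield the byte offset of every occurrence of `needle` (bytes) in
    the file or device at `path`, reading it in fixed-size chunks."""
    if not needle:
        raise ValueError("needle must not be empty")
    overlap = len(needle) - 1
    with open(path, "rb") as f:
        base = 0   # offset within the file of buf[0]
        buf = b""
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            buf += chunk
            pos = buf.find(needle)
            while pos != -1:
                yield base + pos
                pos = buf.find(needle, pos + 1)
            # Keep a tail shorter than the needle so that a match
            # spanning two chunks is still found, but never twice.
            keep = buf[-overlap:] if overlap else b""
            base += len(buf) - len(keep)
            buf = keep


def align_down(offset, block_size=4096):
    """Round offset down to the start of its block (block size assumed)."""
    return offset - offset % block_size
```

For example, iterating over find_all("/dev/hda5", b"int main(int argc") gives candidate positions, and align_down() of each is a reasonable first guess for where to start looking. Note that the needle here is a plain byte string, not a regex, so the parenthesis is not escaped.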

Extracting the file

dd_rescue -s 11807700000 -S 0 -m 1048576 /dev/hda5 /tmp/rescued.txt

This will start reading from byte position 11807700000 of /dev/hda5 and start writing at byte position 0 of rescued.txt. It will read one megabyte (1048576 bytes).

Please write the rescued file to another partition if possible. If not, choose a directory that sounds "far away" from /home/deleted. For me, it actually was /boot, which began near 0 MB on the disk. I was lucky (nah, I knew how to do it ;-) that my lost file was near 10 GB.
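If dd_rescue is not at hand, the same seek-and-copy step can be sketched in Python. This is a minimal illustration, not an equivalent tool: the function name extract and its parameters are invented for this example, and unlike dd_rescue it makes no attempt to continue past read errors.

```python
def extract(device, start, length, dest, chunk_size=1 << 16):
    """Copy up to `length` bytes, starting at byte offset `start` of
    `device`, into a new file `dest`. Returns the number of bytes copied,
    which is smaller than `length` if the device ends first."""
    copied = 0
    with open(device, "rb") as src, open(dest, "wb") as out:
        src.seek(start)
        while copied < length:
            chunk = src.read(min(chunk_size, length - copied))
            if not chunk:
                break  # end of device reached
            out.write(chunk)
            copied += len(chunk)
    return copied
```

With the numbers from the command above, the call would be extract("/dev/hda5", 11807700000, 1048576, "/tmp/rescued.txt").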

Conclusion

That's it. rescued.txt may have some bogus leading and/or trailing bytes which you can cut off with a normal 8-bit capable editor. It may occur that some characters are damaged. Well in that case, fix them up. It is better to do some postprocessing than to have to do it ALL over again.

Sometimes works with binary data too

Another scenario was that some stupid Windows2000 once told me that there was some sort of corruption on the USB stick. I dunno wtf it did, but it erased the superblock of the USB stick. Since it was VFAT (=simple directory structure), and the data I was looking for were photos, the job was easy. "EXIF" as the start marker, and the runlength could be obtained via the (still intact!) bytes that seemed like a directory.
