
On Debian 8.7 I had a ZFS pool (using ZFS on Linux, obviously, not Oracle or Solaris ZFS).

I needed to extend the ZFS pool from a mirror on 2 disks to a raidz on 4 disks. I made a backup - only one copy of the data, which was my first mistake.

I thought that zpool destroy would not work until I removed all datasets (volumes), so I ran zfs destroy first (this was my second mistake).

After that I issued zpool destroy, repartitioned all 4 disks, and found out that the backup was damaged.

So I started my recovery adventure. The first good thing about ZFS is that it can import destroyed pools. After zpool destroy yourPoolName you can run zpool import -D to see the list of destroyed pools. You can then import one using zpool import -D yourPoolName, or, if you have destroyed several pools with the same name, you can import it by the id shown in the zpool import -D output.
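A minimal sketch of that sequence (the pool name and id here are placeholders):

    # List destroyed pools that are still discoverable on the attached disks
    zpool import -D

    # Import a destroyed pool by name
    zpool import -D yourPoolName

    # Or by the numeric id shown in the listing, if several destroyed
    # pools share the same name
    zpool import -D 1234567890123456789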

zpool import -D requires the partitions to be in their original places, exact to the sector. I used fdisk to recreate the partitions with the exact start and end sector numbers, and cfdisk to set the partition type (because it's more user friendly :) ). Then you should run partprobe to make sure the OS knows about the changed partition tables.
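If a dump of the old partition table exists, sfdisk can restore it sector-exact instead of retyping the numbers in fdisk. A sketch, assuming such a dump was made beforehand and /dev/sda is one of the pool disks:

    # Dump a partition table to a file while it still exists - cheap insurance
    sfdisk -d /dev/sda > sda.parts

    # Restore it sector-exact onto the repartitioned disk later
    sfdisk /dev/sda < sda.parts

    # Make the OS re-read the changed partition table
    partprobe /dev/sda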

zpool import -D worked like a charm and I had my pool online in perfect health again!.. But with the full consequences of zfs destroy: all the data was missing.

ZFS stores changes to files and to the file system as transactions, which are committed to disk in transaction groups (TXGs). My further research showed that I would have to roll back the last transaction groups.

There are 2 ways to roll back ZFS transaction groups:

  1. using a special zpool import with the -T option
  2. using the zfs_revert-0.1.py script

First of all you need to find the last good TXG; zpool history -il helped me here.
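For example (the output format below is from memory, so treat it as approximate):

    # -i shows internally logged events, -l shows the long format
    zpool history -il yourPoolName

    # Look for lines like:
    #   2017-03-20.14:31:22 [internal destroy txg:123456] dataset = ...
    # The last good TXG is the one just before the first destructive command.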

For the first way, you invoke something like: zpool import -o readonly=on -D -f -T <LAST-GOOD-TXG> poolName (with additional parameters if you like: -F, -m, -R). Unfortunately this command worked only with the current TXG. Going back even to the second-to-last TXG didn't work and produced error messages like "device is unavailable". It looks like this feature works (or used to work) on Solaris only. Pity.
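Spelled out, my attempt looked roughly like this (the TXG number and pool name are placeholders):

    # Try to import the destroyed pool read-only at an older TXG
    zpool import -o readonly=on -D -f -T 123455 yourPoolName

    # On ZFS on Linux this failed for anything older than the current TXG,
    # with errors along the lines of "device is unavailable"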

I analyzed the code of zfs_revert-0.1.py; it looks clear and promising. I used this tool, but it seems I had to delete too many TXGs. After that, zpool import -D was unable to detect the pool anymore.
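The invocation is along these lines; the flag names and values below are from memory, so verify them against the script's source before running it, and run it only on dd copies, never on the original disks:

    # HYPOTHETICAL invocation - check the script itself before use
    # bs = sector size in bytes, tb = total number of sectors on the partition
    python zfs_revert-0.1.py -bs=512 -tb=976773168 /dev/sdb1

    # The script scans the ZFS labels for uberblocks, lists the TXGs it
    # finds, and zeroes out the uberblocks newer than the TXG you choose.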

Currently I have restored one of the older backups. I also have dd dumps of the 2 disks which were mirrored, but taken after zfs destroy and zpool destroy. It looks like we will just live with the data from the older backup and stop the recovery process. Nevertheless, I will be glad to try to recover the data if somebody suggests what to do in such a situation.
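For reference, the dumps were taken along these lines (device and target paths are examples):

    # Raw images of both former mirror members; conv=noerror,sync keeps
    # going over unreadable sectors instead of aborting
    dd if=/dev/sdb of=/mnt/spare/sdb.img bs=1M conv=noerror,sync
    dd if=/dev/sdc of=/mnt/spare/sdc.img bs=1M conv=noerror,sync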

Further recovery would be done in VMware Workstation, so I will need to find a way to import the zpool inside a VM (the disk IDs will probably change).
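One approach that should survive changed device names (a sketch, not yet tested in the VM): ZFS identifies pool members by their on-disk labels rather than by device path, so you can point the import at whatever device directory the VM provides.

    # Scan a specific directory of device nodes for pool members
    zpool import -d /dev/disk/by-id -D yourPoolName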

Question: What can I try next?

Lessons learned:

  1. Always keep at least 2 copies of your data. When you are manipulating the main storage you need a backup of the backup.
  2. zfs destroy is not needed, and is very dangerous, if you are going to do zpool destroy anyway.

Comments: It's obvious that during recovery you should completely stop all writes to the disks where the damaged data was stored.
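One way to enforce that on Linux (device names are examples) is to mark the disks read-only at the block layer before experimenting, in addition to importing with -o readonly=on:

    # Refuse writes to the devices at the kernel level
    blockdev --setro /dev/sdb
    blockdev --setro /dev/sdc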

Useful commands:

    zpool import -D
    zpool import -o readonly=on -D -f originalPoolName newPoolName
    zpool status tank
    zpool online dozer c2t11d0
    zpool scrub tank
    zpool history -il
    zpool export tank
    zpool import dozer zeepool

Links:

  1. Tools
  2. Information about damaged ZFS
  3. ZFS Import
  • Nice writeup, but you should probably split your question up into the question part and the answer part.
    – Frederik
    Apr 6, 2017 at 9:05
  • I'm still working on it. The next steps are to try different versions of FreeBSD, Illumos and Solaris to recover the ZFS using both methods described above. So the question is still open: "what can be done in this situation?" As soon as I finish my work on it I will post an answer.
    Apr 6, 2017 at 13:05
  • Hey, I'm having a hard time: stackoverflow.com/questions/47044009/… Your post is of help, though I can't quite figure out the syntax of the commands. Is zpool import -o readonly=on -D -f -T <pool | 1592026> pool2 correct? I tried zpool import -o readonly=on -D -f -T pool pool2 but that didn't seem right at all judging by the output. I'm worried that zpool import -D and zpool import both say "no pools available to import", but zpool history -il does display a sizeable history. What am I looking for for the ID? 5000? 5000/5?
    – tatsu
    Oct 31, 2017 at 20:40
  • @tatsu: sorry, I cannot help with the syntax of these commands; try man. And BEFORE doing ANYTHING to ZFS, MAKE a FULL COPY of all volumes! Every import deletes some ZFS history.
    Nov 1, 2017 at 11:23
  • I don't know how to make a full copy
    – tatsu
    Nov 1, 2017 at 12:25
