Subject: nilfs recovery after cheap SSD failure
From: Marcel (Felix) Giannelia <felix@xxxxxxxxxx>
Date: Mon, 13 Feb 2012 21:16:51 -0800

Hi,

I tried to send this as two posts last week, but judging from the various archives of this list, those posts didn't make it. So here is the tale of how a cheap SSD corrupted a nilfs2 filesystem and how I was able to recover it, consolidated and edited a bit:

INITIAL SYMPTOMS

My netbook froze intermittently without warning, then began showing DMA write command errors for the SSD in dmesg. NILFS initially didn't react to this, but then the machine froze solid for a few minutes. When it came back there were errors from NILFS, and the SSD had disappeared, so I was forced to reboot. After the reboot, /home refused to mount, giving the following error:

NILFS: Invalid checkpoint (checkpoint number=116290)
NILFS: error while loading last checkpoint (checkpoint number=116290)

BACKGROUND

After some reading, I discovered that the SSD in question, a Samsung P-SSD1800, is incredibly cheap: it uses flash rated for only 3000 write cycles, appears to have no wear-levelling, and will silently return data for a block on which there's been a write error. For example, say a cell contains "abc" and you try to write "def" to it. The attempt might cause the SSD to return a DMA write error, but if you later read that block, you have no idea what's in it -- it might be "abc", "dbc", or "000" -- and the device will still report a successful read.

Had I known those things, I probably would not have put NILFS on this thing (seeing as NILFS has those nasty fixed-location superblocks that get updated all the time...). However, what's done is done, and NILFS's checksumming *did* prevent silent data corruption in this case. And considering I bought this netbook used after it had XP (with a pagefile!) on it, 6 months of heavy use is pretty good.

INVESTIGATION & DATA RECOVERY

(Everything described below I did on a dd copy of the filesystem, on a different machine.)

I had backups, and among them I happened to have a text file listing the snapshots I'd made, so I tried mounting a few of those known-good checkpoint numbers. Interestingly, I got the exact same mount error (i.e. same cp number and everything).

It took me a while, but I was able to track down a copy of fsck0.nilfs2 (which used to be available under nilfs2-utils-devel when that git tree was hosted on nilfs.org, but is not present in the new git tree on github.com). The copy I got was from here:

github.com/konis/nilfs-utils/tree/fsck0

(In case anyone finding this later runs into compile errors about "undefined reference to `le64_to_cpu'", le16_to_cpu, and O_LARGEFILE not being defined, insert these two lines at the top of fsck0.nilfs2.c:

#include <nilfs.h>
#define O_LARGEFILE 0100000

and the compile command sequence is:

aclocal && autoheader && libtoolize -c --force && automake -a -c && autoconf
./configure
make
)

Anyway, I ran fsck0.nilfs2 on the loop device of the dd image, and it told me that the filesystem was completely clean:

======
# fsck0.nilfs2 -v /dev/mapper/netbook
Super-block:
    revision = 2.0
    blocksize = 4096
    write time = 2012-02-08 16:54:52
    indicated log: blocknr = 195390
        segnum = 95, seq = 18922, cno=116290
Clean FS.
A valid log is pointed to by superblock (No change needed):
    blocknr = 195390
        segnum = 95, seq = 18922, cno=116290
        creation time = 2012-02-08 16:34:31
======

My guess is that the log got written correctly and with a valid checksum, but that some other important part of that checkpoint got gibbled. On mount, the check routines had already declared the log valid by the time they reached the gibbled part, and NILFS didn't know what to do from there.

In order to fix it, I opened the image in a hex editor and jumped to address 4096 * 195390 (blocksize times the blocknr of the indicated log in the output above), where I deliberately corrupted a few bytes.
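
For anyone who would rather script that step than use a hex editor, here is a minimal sketch in Python. It assumes you are working on a copy of the image, as above. The name corrupt_block and the file name netbook.img are illustrative only and not part of nilfs-utils, and inverting the first four bytes of the block is just one arbitrary way to "corrupt a few bytes":

```python
BLOCK_SIZE = 4096  # "blocksize" as reported by fsck0.nilfs2


def corrupt_block(image_path, blocknr, block_size=BLOCK_SIZE, nbytes=4):
    """Invert the first `nbytes` bytes of block `blocknr` in an image file.

    Returns the byte offset that was modified. Only run this on a copy.
    """
    offset = blocknr * block_size  # same arithmetic as 4096 * 195390
    with open(image_path, "r+b") as img:
        img.seek(offset)
        original = img.read(nbytes)
        if len(original) != nbytes:
            raise ValueError("block lies beyond the end of the image")
        img.seek(offset)
        # XOR with 0xFF guarantees every touched byte actually changes.
        img.write(bytes(b ^ 0xFF for b in original))
    return offset
```

With the numbers from the fsck0.nilfs2 output, corrupt_block("netbook.img", 195390) would modify the bytes starting at offset 4096 * 195390 = 800317440.
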
I ran fsck0.nilfs2 again, and this time it gave me the option to roll back the (now) corrupted log.

...After which the filesystem mounted cleanly! Sort of. Some files/directories weren't readable and returned nasty messages like this in the dmesg:

=====
NILFS warning (device dm-0): nilfs_ifile_get_inode_block: unable to read inode: 3421
attempt to access beyond end of device
dm-0: rw=0, want=3465637700349233464, limit=4610048
NILFS warning (device dm-0): nilfs_ifile_get_inode_block: unable to read inode: 3421
NILFS: bad btree node (blocknr=191506): level = 125, flags = 0xa, nchildren = 56
NILFS error (device dm-0): nilfs_bmap_lookup_at_level: broken bmap (inode number=3)
=====

But the important thing is that all the files I had created that day and most of the ones from the last week (i.e. the ones from after I got lazy with the backups :P) were readable. Files farther back in time seemed to have a higher probability of being lost.

I tried both mounting some earlier checkpoints and a few repeats of the "deliberately corrupt the last log and roll back" procedure, but files that were unreadable remained so.

That's about it; perhaps that'll help someone else someday. Thanks for a great filesystem (which really can't be blamed for this on such a crappy SSD) -- and if you could get those superblocks to move around a bit and be a little less write-amplified, that would be cool :)

~Felix.