Data corruption

Oh, yuck. ZFS-8000-8A. :(

-------------------
Mon May 29 16:23:47 [bash:5.2.15 jobs:0 error:0 time:35]
root@charm:/home/jj5
# zpool status -v
  pool: fast
 state: ONLINE
status: One or more devices has experienced an error resulting in data
        corruption.  Applications may be affected.
action: Restore the file in question if possible.  Otherwise restore the
        entire pool from backup.
   see: https://openzfs.github.io/openzfs-docs/msg/ZFS-8000-8A
config:

        NAME                                              STATE     READ WRITE CKSUM
        fast                                              ONLINE       0     0     0
          mirror-0                                        ONLINE       0     0     0
            nvme-Samsung_SSD_990_PRO_2TB_S6Z2NJ0W215171W  ONLINE       0     0     2
            nvme-Samsung_SSD_990_PRO_2TB_S6Z2NJ0W215164J  ONLINE       0     0     2

errors: Permanent errors have been detected in the following files:

        /fast/vbox/218-jj-wrk-8-charm-prod-vbox/218-jj-wrk-8-charm-prod-vbox.vdi
-------------------
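
The affected file is a VirtualBox disk image I have a good copy of elsewhere, so the plan ZFS-8000-8A suggests is straightforward enough: put a good copy of the file back, then scrub so ZFS re-checks everything and the error list can clear. Roughly what I expect that to look like (the backup path here is just a placeholder for wherever my good copy lives):

# cp /backup/vbox/218-jj-wrk-8-charm-prod-vbox.vdi /fast/vbox/218-jj-wrk-8-charm-prod-vbox/ # placeholder backup path
# zpool scrub fast
# zpool status -v fast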

Resolving ZFS issue on ‘trick’

I’m gonna follow these instructions to replace a disk in one of my ZFS arrays on my workstation ‘trick’. I’ve ordered myself a new 6TB Seagate Barracuda hard drive for $179. I hope nobody minds if I make a few notes for myself here…

-------------------
Mon Sep 12 19:24:53 [bash:5.0.17 jobs:0 error:0 time:179]
root@trick:/home/jj5
# zpool status data
  pool: data
 state: DEGRADED
status: One or more devices has experienced an unrecoverable error.  An
        attempt was made to correct the error.  Applications are unaffected.
action: Determine if the device needs to be replaced, and clear the errors
        using 'zpool clear' or replace the device with 'zpool replace'.
   see: http://zfsonlinux.org/msg/ZFS-8000-9P
  scan: scrub canceled on Mon Sep 12 19:24:53 2022
config:

        NAME                                     STATE     READ WRITE CKSUM
        data                                     DEGRADED     0     0     0
          mirror-0                               DEGRADED     0     0     0
            scsi-SATA_ST6000VN0041-2EL_ZA16N49H  DEGRADED 11.1K     0 54.6K  too many errors
            scsi-SATA_ST6000VN0041-2EL_ZA16N4ZH  ONLINE       0     0     0

errors: No known data errors
-------------------
Mon Sep 12 19:40:13 [bash:5.0.17 jobs:0 error:0 time:1099]
root@trick:/home/jj5
# zdb
data:
    version: 5000
    name: 'data'
    state: 0
    txg: 2685198
    pool_guid: 1339265133722772877
    errata: 0
    hostid: 727553668
    hostname: 'trick'
    com.delphix:has_per_vdev_zaps
    vdev_children: 1
    vdev_tree:
        type: 'root'
        id: 0
        guid: 1339265133722772877
        create_txg: 4
        children[0]:
            type: 'mirror'
            id: 0
            guid: 802431090802465148
            metaslab_array: 256
            metaslab_shift: 34
            ashift: 12
            asize: 6001160355840
            is_log: 0
            create_txg: 4
            com.delphix:vdev_zap_top: 129
            children[0]:
                type: 'disk'
                id: 0
                guid: 9301639020686187487
                path: '/dev/disk/by-id/scsi-SATA_ST6000VN0041-2EL_ZA16N49H-part1'
                devid: 'ata-ST6000VN0041-2EL11C_ZA16N49H-part1'
                phys_path: 'pci-0000:00:17.0-ata-3'
                whole_disk: 1
                DTL: 28906
                create_txg: 4
                com.delphix:vdev_zap_leaf: 130
                degraded: 1
                aux_state: 'err_exceeded'
            children[1]:
                type: 'disk'
                id: 1
                guid: 4734211194602915183
                path: '/dev/disk/by-id/scsi-SATA_ST6000VN0041-2EL_ZA16N4ZH-part1'
                devid: 'ata-ST6000VN0041-2EL11C_ZA16N4ZH-part1'
                phys_path: 'pci-0000:00:17.0-ata-4'
                whole_disk: 1
                DTL: 28905
                create_txg: 4
                com.delphix:vdev_zap_leaf: 131
    features_for_read:
        com.delphix:hole_birth
        com.delphix:embedded_data

I think the commands I’m gonna need are:

# zpool offline data 9301639020686187487
# zpool status data
# shutdown # and replace disk
# zpool replace data 9301639020686187487 /dev/disk/by-id/scsi-SATA_ST6000DM003-2CY1_WSB076SN
# zpool status data
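
Before the shutdown I want to be sure I pull the right physical drive, so I’ll match the serial number in the device name above (ZA16N49H) against what the kernel sees, with something like:

# lsblk -I 8 -d -o NAME,SIZE,SERIAL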

I love being a programmer

My ZFS RAID array is resilvering. It’s a long recovery process. A report on progress looks like this:

Every 10.0s: zpool status                                             love: Tue May  4 22:32:27 2021

  pool: data
 state: DEGRADED
status: One or more devices is currently being resilvered.  The pool will
        continue to function, possibly in a degraded state.
action: Wait for the resilver to complete.
  scan: resilver in progress since Sun May  2 20:26:52 2021
        1.89T scanned out of 5.03T at 11.0M/s, 83h19m to go
        967G resilvered, 37.54% done
config:

        NAME                       STATE     READ WRITE CKSUM
        data                       DEGRADED     0     0     0
          mirror-0                 ONLINE       0     0     0
            sda                    ONLINE       0     0     0
            sdb                    ONLINE       0     0     0
          mirror-1                 DEGRADED     0     0     0
            replacing-0            DEGRADED     0     0     0
              4616223910663615641  UNAVAIL      0     0     0  was /dev/sdc1/old
              sdc                  ONLINE       0     0     0  (resilvering)
            sdd                    ONLINE       0     0     0
        cache
          nvme0n1p4                ONLINE       0     0     0

errors: No known data errors

So that 83h19m to go wasn’t in units I could easily grok; what I wanted to know was how many days. Lucky for me, I’m a computer programmer!

First I wrote watch-zpool-status.php:

#!/usr/bin/env php
<?php

main( $argv );

function main( $argv ) {

  // read the `zpool status` output piped in on stdin
  $status = fread( STDIN, 999999 );

  // pull the hours and minutes out of e.g. ", 83h19m to go"
  if ( ! preg_match( '/, (\d+)h(\d+)m to go/', $status, $matches ) ) {

    die( "could not read zpool status.\n" );

  }

  $hours = intval( $matches[ 1 ] );
  $minutes = intval( $matches[ 2 ] );

  // convert everything to minutes, then to days
  $minutes += $hours * 60;

  $days = $minutes / 60.0 / 24.0;

  $report = number_format( $days, 2 );

  echo "days remaining: $report\n";

}

And then I wrote watch-zpool-status.sh to run it:

#!/bin/bash

watch -n 10 'zpool status | ./watch-zpool-status.php'

So now it reports that there are 3.47 days remaining. Good to know!

Replace a disk in a ZFS pool

So smartd is suddenly emailing me about problems with one of the disks in my ZFS pool. I have ordered a replacement disk and am waiting for it to arrive. While the smartd email says there is a problem, `zpool status` says everything is fine. So I’m running a `zpool scrub` to see if ZFS can pick up on the disk errors.
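
For the record, kicking off the scrub and checking on it is just:

sudo zpool scrub data
sudo zpool status data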

Preparing for the disk replacement, I searched the web and found an article called “Replace a disk in a ZFS pool”.

I found the serial number of the faulty disk with `lsblk -I 8 -d -o NAME,SIZE,SERIAL`. The process is then:

  1. Shut down the server
  2. Replace the faulty disk
  3. Boot the server
  4. Run zpool replace: sudo zpool replace data sdc
  5. Check zpool status: sudo zpool status data

I hope it turns out to be that easy! Now I just wait for my scrub to complete and my disk to arrive.
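
As with ‘trick’ above, I could also pass the new disk by its /dev/disk/by-id path instead of sdc, so the name survives any device reshuffling. Either way, once the replace kicks off the pool will resilver onto the new disk, and I can keep an eye on it the same way as last time:

watch -n 10 'zpool status data'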

Update

Click through on the ‘see:’ link in the output below for some excellent documentation about how to handle this error:

Every 2.0s: zpool status                                              love: Fri Apr 30 07:52:40 2021

  pool: data
 state: ONLINE
status: One or more devices has experienced an unrecoverable error.  An
        attempt was made to correct the error.  Applications are unaffected.
action: Determine if the device needs to be replaced, and clear the errors
        using 'zpool clear' or replace the device with 'zpool replace'.
   see: http://zfsonlinux.org/msg/ZFS-8000-9P
  scan: scrub in progress since Thu Apr 29 02:30:54 2021
        4.32T scanned out of 5.03T at 42.9M/s, 4h48m to go
        466K repaired, 85.93% done
config:

        NAME         STATE     READ WRITE CKSUM
        data         ONLINE       0     0     0
          mirror-0   ONLINE       0     0     0
            sda      ONLINE       0     0     0
            sdb      ONLINE       0     0     0
          mirror-1   ONLINE       0     0     0
            sdc      ONLINE       0     0     4  (repairing)
            sdd      ONLINE       0     0     0
        cache
          nvme0n1p4  ONLINE       0     0     0

errors: No known data errors
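
Per the action text above, once the scrub completes the choice is between clearing the errors and replacing the disk. The replacement is already on its way, so I expect I’ll be replacing sdc, but for reference clearing the counters would just be:

sudo zpool clear data sdc
sudo zpool status data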