
This is a journal entry detailing several weeks of struggle configuring and troubleshooting a ZFS drive array with RancherOS.

  • Completed RancherOS ZFS setup, worked fine
  • Tested a hard fault (cable pull mid-10GB dd write; a rough sketch of the test follows this list)
  • Resilvered OK, but the Docker process kept restarting
  • 5 weeks of frustration & procrastination
  • /var/log/docker.log: `error initializing graphdriver: prerequisites for driver not satisfied`
  • `system-docker logs docker`: crashes
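
For reference, the hard-fault test looked roughly like this (a sketch; the device name and test file are illustrative, not the exact commands from that day):

$ sudo dd if=/dev/zero of=/mnt/zpool1/testfile bs=1M count=10240 &   # ~10 GB write in the background
# ...pull the cable on one pool member mid-write...
$ sudo zpool status zpool1      # pool reports DEGRADED, the pulled device UNAVAIL
# reconnect the drive and bring it back online:
$ sudo zpool online zpool1 sdX
$ sudo zpool status zpool1      # wait for "scan: resilvered ... with 0 errors"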

tail /var/log/docker.log:

time="2018-02-08T06:17:35Z" level=warning msg="The \"-g / --graph\" flag is deprecated. Please use \"--data-root\" instead"
time="2018-02-08T06:17:35.465609853Z" level=info msg="libcontainerd: new containerd process, pid: 19813"
Error starting daemon: error initializing graphdriver: prerequisites for driver not satisfied (wrong filesystem?)
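
With hindsight, that "wrong filesystem?" hint is the whole story: Docker's zfs storage driver checks that its data-root actually sits on a ZFS filesystem and refuses to start if it does not. A couple of quick checks worth running at this point (same pool and dataset names as this setup; adjust for yours):

$ mount | grep zpool1                            # is anything from the pool actually mounted?
$ sudo zfs get -r mounted,mountpoint zpool1      # what ZFS thinks the mount state is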

sudo system-docker logs docker:

time="2018-02-08T06:17:35Z" level=info msg="Starting Docker in context: console"
time="2018-02-08T06:17:35Z" level=info msg="Getting PID for service: console"
time="2018-02-08T06:17:35Z" level=info msg="console PID 1502"
time="2018-02-08T06:17:35Z" level=info msg="[PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin HOSTNAME=rancher HOME=/]"
time="2018-02-08T06:17:35Z" level=info msg="Running [docker-runc exec -- f40c371139561c6cc3c9d489ede7100c28ef656686dcab480342eebbc849936a env PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin HOSTNAME=rancher HOME=/ ros docker-init daemon --log-opt max-file=2 --log-opt max-size=25m --graph /mnt/zpool1/docker --host unix:///var/run/docker.sock --storage-driver zfs --group docker]"
time="2018-02-08T06:17:35Z" level=info msg="Found /usr/bin/dockerd"

sudo zfs list:

NAME                                                                                  USED  AVAIL  REFER  MOUNTPOINT
zpool1                                                                                696M  14.0T   192K  /mnt/zpool1
zpool1/docker                                                                         683M  14.0T  41.8M  /mnt/zpool1/docker
zpool1/docker/04a363ea485d64c8b46f747f70d4a71bf9a3b5fedc7f530806e622aa9d6f00e6        181M  14.0T   181M  legacy
zpool1/docker/06a6019326ae6b3a9274be3bc2e41e3864d41e3f3622f00f3ebb47861b734db1        519K  14.0T   318M  legacy
zpool1/docker/06a6019326ae6b3a9274be3bc2e41e3864d41e3f3622f00f3ebb47861b734db1-init   336K  14.0T   318M  legacy
(... 3 dozen other containers...)

The Key

It wasn’t until a GitHub comment suggested checking ZFS mount status with sudo zfs list -o name,mountpoint,mounted that the lightbulb went on:

NAME                                                                                 MOUNTPOINT          MOUNTED
zpool1                                                                               /mnt/zpool1              no
zpool1/docker                                                                        /mnt/zpool1/docker       no
zpool1/docker/04a363ea485d64c8b46f747f70d4a71bf9a3b5fedc7f530806e622aa9d6f00e6       legacy                   no
zpool1/docker/06a6019326ae6b3a9274be3bc2e41e3864d41e3f3622f00f3ebb47861b734db1       legacy                   no

These datasets were not actually mounted! OK, well, let’s mount up then…

$ sudo zfs mount -a
cannot mount '/mnt/zpool1': directory is not empty
cannot mount '/mnt/zpool1/docker': directory is not empty

After some reading, it turns out that the resilver had left the datasets needing a remount, and ZFS refuses to mount over a non-empty directory. The real data is still safely backed on the physical disks (thankfully), but the leftover files in the `/mnt/zpool1` directory need to be wiped before the datasets will mount again.
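
Before wiping anything, it's worth confirming that the directory only holds stray files written while the datasets were unmounted, and not the real data (a few sanity checks, assuming the same layout as above):

$ sudo zfs get mounted zpool1 zpool1/docker    # both should report "no" at this point
$ ls -la /mnt/zpool1 /mnt/zpool1/docker        # anything here was written to the bare directory
$ sudo du -sh /mnt/zpool1                      # should be tiny compared to what zfs list reports for the datasets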

$ sudo system-docker stop docker
$ sudo rm -rf /mnt/zpool1/*
$ sudo zfs mount -a
$ sudo zfs list -o name,mountpoint,mounted
NAME                                                                                 MOUNTPOINT          MOUNTED
zpool1                                                                               /mnt/zpool1             yes
zpool1/docker                                                                        /mnt/zpool1/docker      yes
zpool1/docker/04a363ea485d64c8b46f747f70d4a71bf9a3b5fedc7f530806e622aa9d6f00e6       legacy                   no
zpool1/docker/06a6019326ae6b3a9274be3bc2e41e3864d41e3f3622f00f3ebb47861b734db1       legacy                   no

(The Docker-created child datasets use legacy mountpoints, so `zfs mount -a` skips them; Docker mounts those itself when it needs them. The parent datasets being mounted is what matters here.) Did it work??

$ sudo system-docker start docker
$ docker ps -a
CONTAINER ID        IMAGE                               COMMAND                  CREATED             STATUS                   PORTS               NAMES
462cd473b54f        rancher/healthcheck:v0.3.1          "/.r/r /rancher-en..."   2 hours ago         Up 2 hours                                   r-healthcheck-healthcheck-1-467d5b2a
9281f836947f        rancher/scheduler:v0.7.5            "/.r/r /rancher-en..."   2 hours ago         Up 2 hours                                   r-scheduler-scheduler-1-4c0c98fe
97057d6bf1ac        rancher/net:v0.11.2                 "/rancher-entrypoi..."   5 weeks ago         Up 2 hours                                   r-ipsec-ipsec-router-1-1c794f92
63825ffd24df        rancher/net:holder                  "/.r/r /rancher-en..."   5 weeks ago         Up 2 hours                                   r-ipsec-ipsec-1-262fe98e
f281a0a7a7a0        rancher/dns:v0.15.0                 "/rancher-entrypoi..."   5 weeks ago         Up 2 hours                                   r-network-services-metadata-dns-1-9fc234c5
82d156b5c20a        rancher/net:v0.11.2                 "/rancher-entrypoi..."   5 weeks ago         Up 2 hours                                   r-ipsec-ipsec-cni-driver-1-13eae71d
d842c8e0f210        rancher/network-manager:v0.7.0      "/rancher-entrypoi..."   5 weeks ago         Up 2 hours                                   r-network-services-network-manager-1-40ae3e9b
ad8279f86762        rancher/metadata:v0.9.1             "/rancher-entrypoi..."   5 weeks ago         Up 2 hours                                   r-network-services-metadata-1-aa356054
5af8b36d8ae2        rancher/agent:v1.2.8                "/run.sh run"            5 weeks ago         Up 2 hours                                   rancher-agent
70f60c419794        emcniece/docker-pushbullet-reboot   "/bin/sh /entrypoi..."   5 weeks ago         Exited (0) 2 hours ago                       pushbullet-reboot

Yes! Minus the two hours since the fix, but you get the idea. I performed a few reboots and ran some ZFS diagnostics to confirm health; everything looks good.
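
The diagnostics were along these lines (a sketch, not an exhaustive list):

$ sudo zpool status -v zpool1                 # all devices ONLINE, no read/write/cksum errors
$ sudo zpool scrub zpool1                     # kick off a full scrub
$ sudo zpool status zpool1                    # "scrub repaired 0B ... with 0 errors" once it finishes
$ sudo zfs list -o name,mountpoint,mounted    # everything still mounted after each reboot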

What a relief… now I can get back to building out the Plex stack!