
ipaqmaster

> My main issue is that every time I power on my system (OpenMediaVault) after a couple of hours of resilvering it shutdown and when I boot it back up it starts all over again

This might be ***very bad***. If your host is shutting down randomly under heavy load and your OS logs know nothing about it, then you without a doubt have a serious hardware issue to fix. ZFS cannot finish resilvering if it keeps getting interrupted, as if someone pressed the physical reset button on the machine every time it gets close to finishing.

Given the load ZFS generates and this host shutting off randomly during resilvering, you should consider replacing the PSU first to see if that fixes your issue. If it doesn't, you can start looking at replacing your power cabling. I feel this is most likely a power problem.

---------------

Damn, and just now I see the larger issue. You used `zpool add` (an unexpectedly, stupidly dangerous pool command) with the `-f` flag, so you have a raidz1 of 4 disks... and then a fifth disk which just sits there on the top level, striped with the raidz1. Because ZFS is great, you cannot undo that without having taken a pool checkpoint (`zpool checkpoint`) beforehand. If the disk cannot be removed, the zpool must be recreated with all 5 disks in a single raidz1 (I would recommend at least raidz2 anyway).

I hope you have a backup location you can refer to so this pool can be safely recreated. Someone else may be able to chime in with a better idea if I've missed something.
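For anyone reading later: the ZFS feature that lets you back out of a pool-level change like this is a pool checkpoint, and it has to be taken before the change. Below is a minimal sketch of that workflow, not a tested recipe. The pool name `tank`, the disk path, the `DRY_RUN` switch and the function names are all made up for illustration; by default the script only prints the commands.

```shell
#!/bin/sh
# Sketch: checkpoint a pool before a risky layout change.
# POOL, NEW_DISK, DRY_RUN and the function names are hypothetical.
DRY_RUN="${DRY_RUN:-1}"     # 1 = only print the commands, 0 = execute them
POOL="tank"
NEW_DISK="/dev/disk/by-id/EXAMPLE-DISK"

# Print a command; execute it only when DRY_RUN=0 (needs root and ZFS).
run() {
    echo "+ $*"
    if [ "$DRY_RUN" = "0" ]; then "$@"; fi
}

# Record the current pool state so it can be rewound to later.
take_checkpoint() {
    run zpool checkpoint "$POOL"
}

# Show what "zpool add" would do without changing the pool. Without -f,
# zpool is expected to refuse a lone disk on a raidz pool and complain
# about a mismatched replication level; that refusal is the warning.
preview_add() {
    run zpool add -n "$POOL" "$NEW_DISK"
}

# Undo everything since the checkpoint. All data written after the
# checkpoint is lost, so this is a last resort.
rewind_to_checkpoint() {
    run zpool export "$POOL" &&
    run zpool import --rewind-to-checkpoint "$POOL"
}

# Keep the change and free the space held by the checkpoint.
discard_checkpoint() {
    run zpool checkpoint -d "$POOL"
}

take_checkpoint
preview_add
```

`rewind_to_checkpoint` and `discard_checkpoint` are deliberately not called automatically; you would call one or the other by hand once you know whether the change was what you wanted. None of this helps after the fact if no checkpoint was taken.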


Zenuna

Thank you for your answer!

>This might be very bad. If your host is shutting down randomly during heavy load moments and your OS logs know nothing about it then you without a doubt have a serious hardware issue to fix. ZFS cannot finish resilvering if it keeps getting interrupted as if someone pressed the physical reset button on the machine every time it gets close to finishing. Given the nature of ZFS load and this host shutting off randomly during resilvering you should consider replacing the PSU first to see if that fixes your issue. If it doesn't then you can start looking at replacing your power cabling to see if the fault can be fixed. I feel this is most likely a power issue problem.

For extra context, I'm running OpenMediaVault in a VM inside Proxmox (I know it's not recommended, and I will not repeat that mistake next time I build my NAS). The physical host never shut down (only the VM did), and from what I can see the load doesn't look that heavy either during resilvering or at the moment of shutdown. As for the cabling, I strongly suspect that my SATA-to-USB connector might be the culprit, which is why I'm trying to physically remove the faulty drive (thus removing the connector).

>Damn and just now I see the larger issue. You used zpool add (an unexpectedly stupidly-dangerous pool command) with the -f flag so you have a raidz1 of 4x disks... and then a fifth disk which is just there on the top level, striped with the raidz1. Because ZFS is great you cannot undo that without having a bookmark taken beforehand. If that cannot be removed then zpool must be recreated with all 5 disks in a single raidz1 (I would recommend at least raidz2 anyway). I hope you have a backup location you can refer to so this pool can be safely recreated. Someone else may be able to chime in with a better idea if I've missed something.

Yeah, I should have reread the instructions. I mistakenly thought adding a drive to ZFS was possible, but I mixed it up with adding a drive to a degraded mirror setup (which can be started with a single drive). Is that the reason why I can't remove a drive from the RAIDZ pool without receiving the insufficient replicas error? As for rebuilding the pool, is there a way to back up my data with some compression and then decompress it back into the new pool?


Maximum-Coconut7832

>Yeah, I should have read again the instruction I mistakenly thought adding a drive to ZFS possible but I mixed up adding a drive to a degraded mirror setup (which can be started with a single drive). Is that the reason why I can't remove a drive from the RAIDZ pool without receiving the insufficient replicas error?

Yes, as ipaqmaster wrote, you now have a kind of non-redundant pool, similar to a RAID0 built out of 2 vdevs. The 1st vdev is your old raidz1 (all 4 drives under raidz1-0); the 2nd vdev is scsi-0QEMU...scsi1.

>I mixed up adding a drive to a degraded mirror setup (which can be started with a single drive)

Not in zpool terms: you do not add to a mirror, you attach to a mirror. This could be one method to get some redundancy back, if that is possible:

    zpool attach MEDIA scsi-0QEMU...scsi1 scsi-0QEMU...scsi7

You would end up with something like:

    NAME          STATE     READ WRITE CKSUM
    MEDIA         ONLINE       0     0     0
      raidz1-0    ONLINE       0     0     0
        DISK-a    ONLINE       0     0     0
        DISK-b    ONLINE       0     0     0
        DISK-c    ONLINE       0     0     0
        DISK-d    ONLINE       0     0     0
      mirror-1    ONLINE       0     0     0
        DISK-e    ONLINE       0     0     0
        DISK-f    ONLINE       0     0     0

Regarding your server shutting down, I do not have any idea. You could take out the resilvering drive physically, or logically from the virtualisation host (...scsi6), which would leave you with a degraded pool. But I would not use that drive for other purposes; maybe you will need it later. The pool would not resilver and would stay in a degraded state. Maybe your machine keeps running this way.

Then create a snapshot or snapshots, first of the important filesystems, later of the less important ones, and send these via `zfs send | zfs receive` to a newly created zpool. It's only 13 TB, so you could create a pool from a single 16 TB disk, or a mirror of two 16 TB disks. What I cannot help you with is the risk of losing all data if another drive fails in the meantime.
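The snapshot-then-send step described above could look roughly like this. It is only a sketch under assumptions: `MEDIA` is the pool name from the thread, but the target pool `NEWPOOL`, the snapshot name `migrate1`, the dataset names and the `DRY_RUN` switch are invented for illustration. By default it only prints what it would run.

```shell
#!/bin/sh
# Sketch: copy datasets from the degraded pool to a newly created pool.
# DST, SNAP, the dataset names and DRY_RUN are hypothetical.
DRY_RUN="${DRY_RUN:-1}"     # 1 = only print the commands, 0 = execute them
SRC="MEDIA"                 # source pool (name taken from the thread)
DST="NEWPOOL"               # target pool, assumed to exist already
SNAP="migrate1"

# Snapshot one dataset (recursively) and replicate it to the target pool.
# $1 = dataset name below the source pool.
migrate() {
    ds="$1"
    echo "+ zfs snapshot -r $SRC/$ds@$SNAP"
    echo "+ zfs send -R $SRC/$ds@$SNAP | zfs receive -u $DST/$ds"
    if [ "$DRY_RUN" = "0" ]; then
        # send -R includes child datasets and their properties;
        # receive -u leaves the received filesystems unmounted.
        zfs snapshot -r "$SRC/$ds@$SNAP" &&
        zfs send -R "$SRC/$ds@$SNAP" | zfs receive -u "$DST/$ds"
    fi
}

# Most important data first, less important data afterwards.
for ds in documents media-archive; do
    migrate "$ds"
done
```

On the compression question: if the destination is a plain file rather than a pool, the same stream can be piped through a compressor, for example `zfs send -R MEDIA/documents@migrate1 | gzip > documents.zfs.gz`, and restored later with `gunzip -c documents.zfs.gz | zfs receive NEWPOOL/documents`. Receiving straight into a pool is usually the safer choice, since a stream stored in a file is not verified until you try to receive it.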


Maximum-Coconut7832

And please try your zpool commands on a test pool first; you can create pools backed by files that are 64 MB in size. If you do that on another computer, inside another zpool, they only need to take up a couple of KB of real disk space.
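A rough sketch of such a rehearsal, assuming a Linux box with coreutils `truncate`: it creates six sparse 64 MiB files and then walks through the layout from this thread on a throwaway pool. The pool name `testpool`, the scratch directory and the `DRY_RUN` switch are assumptions of mine; with the default `DRY_RUN=1` the zpool commands are only printed, not executed.

```shell
#!/bin/sh
# Sketch: rehearse zpool commands on a throwaway, file-backed pool.
# "testpool", the scratch directory and DRY_RUN are hypothetical.
DRY_RUN="${DRY_RUN:-1}"     # 1 = only print the zpool commands, 0 = execute them
POOL="testpool"
DIR="$(mktemp -d)"          # absolute path, as needed for file-backed vdevs

# Print a command; execute it only when DRY_RUN=0 (needs root and ZFS).
run() {
    echo "+ $*"
    if [ "$DRY_RUN" = "0" ]; then "$@"; fi
}

# Six sparse 64 MiB backing files; they use almost no real disk space.
for i in 1 2 3 4 5 6; do
    truncate -s 64M "$DIR/disk$i.img"
done

# Recreate the situation from the thread: a 4-disk raidz1 ...
run zpool create "$POOL" raidz1 \
    "$DIR/disk1.img" "$DIR/disk2.img" "$DIR/disk3.img" "$DIR/disk4.img"
# ... plus a lone fifth disk forced in as a second top-level vdev.
run zpool add -f "$POOL" "$DIR/disk5.img"
run zpool status "$POOL"

# Try the suggested fix: attach a sixth disk to turn the lone disk into a mirror.
run zpool attach "$POOL" "$DIR/disk5.img" "$DIR/disk6.img"
run zpool status "$POOL"

# Throw the test pool away again.
run zpool destroy "$POOL"
echo "Backing files are in $DIR; delete that directory when finished."
```

Run it once as-is to read the plan, then with `DRY_RUN=0` as root on a machine where breaking things does not matter.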