Custom storage plugins for Proxmox


nephri

That's a bit over my head :p
From my side, it's for personal use only.

I successfully moved an iSCSI disk to an NFS raw image.
I had to reconfigure the LUN on FreeNAS to use a 512-byte block size instead of 4k.
After that, the migration succeeded and I switched the LUN back to 4k.
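For anyone who wants to script that kind of move, something along these lines should work (a sketch only; the VM ID, disk name and target storage name are examples, not from this setup):
Code:
  # move a VM disk to an NFS-backed storage as a raw image
  qm move_disk 102 virtio0 nfsStorage --format raw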

For the extent naming, I would prepend the PVE name to the volume name, like "pve:volFAST/vm-102-disk1".
 

nephri

So, here is a first version of the FreeNAS plugin (tested with FreeNAS 9.10):

Be careful: the attachment is really a .tar.gz, but I had to rename its extension for the forum to accept it, so it is not actually a .txt.
You should do "Save as" and rename it with a .tar.gz extension.


What you have to do to use it:

Code:
 - copy the freenasCustomStorage.tar.gz onto your PVE cluster
 - cd /
 - tar xvfz /<path>/freenasCustomStorage.tar.gz
You have to follow the "Storage: ZFS over iSCSI" page in the Proxmox VE wiki to set up the
/etc/pve/priv/zfs SSH keys.
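For reference, the wiki's key setup boils down to roughly this (a sketch; xxx.xx.x.xxx stands for your portal address, and the key file is named after it):
Code:
  mkdir -p /etc/pve/priv/zfs
  ssh-keygen -f /etc/pve/priv/zfs/xxx.xx.x.xxx_id_rsa
  ssh-copy-id -i /etc/pve/priv/zfs/xxx.xx.x.xxx_id_rsa.pub root@xxx.xx.x.xxx
  # log in once so the host key gets accepted
  ssh -i /etc/pve/priv/zfs/xxx.xx.x.xxx_id_rsa root@xxx.xx.x.xxx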

You have to configure the FreeNAS user that is used to connect through the REST API:
Code:
  vi /usr/share/perl5/PVE/Storage/Custom/LunCmd/FreeNas.pm
     >> Adapt lines containing 'freenas_user' and 'freenas_password'
Also, you have to configure your storage:
Code:
  vi /etc/pve/storage.cfg
with a definition of a new storage like this:
Code:
freenas: iscsiFreeNas
        portal xxx.xx.x.xxx
        pool volFAST
        target iqn.2016-12.fr.nephri.iscsi:pve
        iscsiprovider istgt
        blocksize 4k
        content images
        sparse 0
        nowritecache 1
After that, we have to restart the PVE daemon (once the FreeNAS iSCSI target is set up):
Code:
  systemctl restart pvedaemon
You can check if your storage is up with this command:
Code:
  pvesm status

If you have trouble, there are some logs in /var/log/syslog.
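For example, while retrying the operation:
Code:
  tail -f /var/log/syslog | grep -i -E 'pvedaemon|freenas'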

On the FreeNAS side:

You have to start the iSCSI service.
After that, you have to configure it:
- define a target base name in "Target Global Configuration"; in my example I used "iqn.2016-12.fr.nephri.iscsi"
- define a portal in "Portals" with "Discovery Auth Method" set to "None"
- configure allowed initiators in "Initiators" with ALL/ALL
- create a target in "Targets"; in my example I used "pve". Select your portal and initiators, and set "Auth Method" to "None".
- that's it :p

TODO
I want to define a proper way to configure freenas_user and freenas_password.
I would like to support CHAP authentication, because at the moment FreeNAS has to be configured completely open.
 

Attachments


_alex

Great,
I did a quick setup of the FreeNAS 10 beta and can also bring up 9.10 / current.
I will give it a try and see if I can help with the TODOs.

Did you use this in a PVE cluster or on a single machine?
I guess shared 1 should be set in storage.cfg to use the storage cluster-wide ...
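For reference (just against the example definition above, not tested cluster-wide yet), that would simply be one more line in the storage entry:
Code:
  freenas: iscsiFreeNas
          ...
          shared 1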

Alex
 

nephri

I used the term "cluster", but it's a single node... I don't know how the storage works in a full cluster environment.

I don't know whether FreeNAS 10 still has the REST API available or whether some changes were made to it.
 

nephri

In terms of change requests inside PVE, I think the best approach is to avoid writing another plugin besides ZFSPlugin.pm.
ZFSPlugin.pm could use for LunCmd the same mechanism that Storage uses for Custom plugins.

For example:
- ZFSPlugin.pm could scan PVE::Storage::Custom::LunCmd for all files matching "*ZFS.pm"
- each LunCmd module would provide a sub that returns the custom "iscsiprovider" value corresponding to that implementation
- pvemanagerlib.js should take the "*ZFS.pm" modules into account for the GUI form

After that, we could imagine enhancing ZFSPlugin.pm itself with features like CHAP auth, etc.
 

_alex

Hi,
with the ZFS plugin you are right, there should rarely be a need not to use it directly, so it should handle the delegation to the appropriate LunCmd module.

But I guess with the way current custom plugins are handled, there needs to be a module that holds the configuration / options and also the name; the API version mechanism is also done this way.

Making pvemanagerlib dynamic, and/or including the ExtJS bits via the script that builds the index, is something that should really happen. The problem I see is that it would require the author of a storage plugin to work with ExtJS, which is definitely a pain if someone is not familiar with ExtJS :(

I guess it's quite OK to require adding the entry to storage.cfg manually like it is now, as the dialogs in the GUI would have to handle dependencies, or at least validate against them, which for sure would not make it easier to implement.

Today I managed to get cloning of VMs to use zfs send/recv instead of converting images via qemu, which is a magnitude faster. Cloning from older snapshots also works now with my ZFS-over-iSCSI storage :)
Still missing is the case where the VM is running and the clone is made from the current state, but this should be a trivial task for tomorrow.
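Roughly, the difference is this (illustrative only; the device paths, dataset and snapshot names are placeholders, not the actual plugin code):
Code:
  # stock path: qemu reads every block of the source image and writes it back out
  qemu-img convert -f raw -O raw /dev/source-lun /dev/target-lun
  # zfs path: let the storage box copy the dataset itself
  ssh root@xxx.xx.x.xxx 'zfs send volFAST/vm-102-disk-1@clonesnap | zfs recv volFAST/vm-105-disk-1'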

These things, together with CHAP and RDMA (powered on the IB switch today), should definitely go into ZFSPlugin.pm.
 

nephri

All of this sounds good to include in ZFSPlugin, if possible.

Will you start a mail on pve-devel to discuss the way to go with these?
 

_alex

Should be possible; hopefully a push/patch will be accepted.
For cloning from snapshots via zfs send/recv, two small changes in QemuServer.pm are also necessary. For one case I still need to make sure it doesn't break other storage implementations.

For now I'll just finish it as-is and do some more testing and cosmetics before suggesting it on the pve-devel list, just to make sure there are no bigger issues left.
 

_alex

Hi Nephri,
sorry, I didn't have much time the last 2 days, but I finally got your plugin set up and running; looks good :)

For the user / password, you can do this to get them from the config: just add them to properties and options, and then remove the defined() test in FreeNas.pm after adding the credentials in storage.cfg.



Code:
sub properties {
    return {
        dummy => {
            description => "DUMMY",
            type => 'boolean',
        },
        # credentials for the FreeNAS REST API, now read from storage.cfg
        freenas_user => {
            type => 'string',
        },
        freenas_password => {
            type => 'string',
        },
    };
}

sub options {
    return {
        nodes => { optional => 1 },
        disable => { optional => 1 },
        portal => { fixed => 1 },
        target => { fixed => 1 },
        pool => { fixed => 1 },
        blocksize => { fixed => 1 },
        iscsiprovider => { fixed => 1 },
        nowritecache => { optional => 1 },
        sparse => { optional => 1 },
        comstar_hg => { optional => 1 },
        comstar_tg => { optional => 1 },
        content => { optional => 1 },
        shared => { optional => 1 },
        # make the credentials part of the storage definition
        freenas_user => { fixed => 1 },
        freenas_password => { fixed => 1 },
    };
}
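The storage.cfg entry then carries the credentials as well, roughly like this (the values here are just placeholders):
Code:
  freenas: iscsiFreeNas
          ...
          freenas_user root
          freenas_password secret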
CHAP-Auth will work the same / similar way then ...

Alex
 

_alex

Attached is the whole Custom folder with both plugins and a shared implementation of ZFSPlugin (ZFSPluginPlus) that handles snapshots via zfs send/recv.

Also attached is QemuServer.pm with changes in qemu_img_convert (qemu_drive_mirror is still missing and should be changed in a similar way).

There are still some minor things to do, like cleaning up temporary zvols made from snapshots, CHAP auth and RDMA, but it should mostly work.
 

Attachments

nephri

Hi,

I will test all of this this weekend.

I just read some of your code quickly and I have a question about your implementation of "materialize_snapshot".

If I understand correctly, Proxmox's "Take snapshot" delegates to a ZFS snapshot.
materialize_snapshot is a way to create a zvol usable by an iSCSI LUN in order to start the VM from it.
For that, you create a zvol and perform a send | recv.
Why not use the "zfs clone" command? It lets you create a zvol from a snapshot really quickly, even writable, as long as you stay on the same pool (rough comparison below).

But I didn't see such a feature in the Proxmox GUI!! So maybe it's another feature?
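For reference, the two approaches on the FreeNAS side look roughly like this (the dataset names just follow the earlier examples):
Code:
  # zfs clone: instant, copy-on-write, but stays linked to its origin snapshot
  zfs clone volFAST/vm-102-disk-1@snap1 volFAST/vm-102-disk-1-tmp
  # zfs send | recv: fully independent copy, but streams all the data and needs the space
  zfs send volFAST/vm-102-disk-1@snap1 | zfs recv volFAST/vm-102-disk-1-tmp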
 

_alex

Hi,
materialize_snapshot is used when a volume is activated on a snapshot; this is what the original ZFSPlugin didn't allow.

It's handy when you take a clone via the GUI from an older snapshot (other than current), which also isn't possible with the current implementation.

So, you can take a snap of a (running) VM and then clone the VM at the state of that snap.

I was thinking about zfs clone, too.
The problem is that the clone will always stay linked, which means it is never an independent volume.
So, if you want to use something based on the clone, you can never delete the origin volume.
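That linkage shows up directly on the ZFS side, for example (names are placeholders):
Code:
  zfs destroy volFAST/vm-102-disk-1@snap1
  # cannot destroy 'volFAST/vm-102-disk-1@snap1': snapshot has dependent clones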

For the use case of getting a clone of a running VM at a snapshot, that is maybe not what makes sense.
But it would be cool to be able to select in the GUI whether a clone or an independent volume should be created.

The timing of when the snap of the volume is activated by QemuServer.pm is a bit unlucky; I think it's even done twice when cloning a VM, and sometimes there is actually no chance to clean up such 'materialized snaps' after things are done.
That is definitely something I'll have a closer look at, and I'll also rethink whether clones as a first step would be better for the case where those snapshots will never be used/accessed in full afterwards.

If you're going to give it a try on the weekend I might send an updated version; I want to do some review again tomorrow.

Alex
 

nephri

Yes, I understand; it's maybe better to do a send | recv.

But do you handle the VM clone when the storage is not the same? In that case you should go through the standard qemu path, even if both are ZFS over iSCSI, when the portal and/or pool are not the same.

For me, a missing feature in Proxmox is the ability to clone a single disk with these options:
- selecting the source disk
- selecting the source snapshot
- selecting the destination VM
- selecting the destination storage

In one of my rescue plans, I create a new disk on a dedicated VM, then destroy the zvol and the LUN.
I clone my source disk myself on FreeNAS with the expected volume name and recreate the LUN.
After that I can start the dedicated VM, mount the filesystem and explore it.
When the rescue operation is done, I destroy the disk.
I do that when I want to explore a disk without using a cloned VM (because I don't want to boot from it).

If I had a "clone disk" option, I could do all that in one operation.

For such an operation, I manually use "zfs clone" because it's really fast for this purpose.
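In shell terms the manual rescue path looks roughly like this (a sketch; the dataset and VM names are just examples, and the extent/LUN part still happens in the FreeNAS GUI):
Code:
  # on FreeNAS: snapshot the source disk and clone it under the name the rescue VM expects
  zfs snapshot volFAST/vm-102-disk-1@rescue
  zfs clone volFAST/vm-102-disk-1@rescue volFAST/vm-200-disk-1
  # then recreate the extent/LUN for vm-200-disk-1, start the rescue VM and mount the fs
  # when done, destroy the disk again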
 

_alex

nephri said: "Yes, I understand; it's maybe better to do a send | recv."
Well, I'm not sure it's really 'better', as send/recv needs enough space to create the new volume.
There are several cases a volume is activated for, and for some a clone would be sufficient,
e.g. if the target storage is not the same as the source storage. There it would be OK to have a clone and then let qemu_convert work from that.

As I said, the timing of activating volumes in QemuServer is a bit unhappy, as it's not really clear what the activated volume will be used for.

Maybe it's an option to do a clone instead of send/recv first, and then send/recv from that (and use space on the storage) only when it's actually something like a copy operation on the same storage.

nephri said: "But do you handle the VM clone when the storage is not the same? In that case you should go through the standard qemu path, even if both are ZFS over iSCSI, when the portal and/or pool are not the same."
That is not handled at all; in that case qemu_convert works from the source, which is the 'materialized snap', and writes a new volume to the target storage.


nephri said: "For me, a missing feature in Proxmox is the ability to clone a single disk [...] For such an operation, I manually use "zfs clone" because it's really fast for this purpose."
That would be like cloning the VM and working on a clone of its volumes?
Or do you mean mounting a clone into an existing VM?

For me, coming from Citrix XenServer, one of the features I miss the most is detaching and (re)attaching volumes to VMs. But I guess this would be a major change, as some work in the GUI would also be necessary, e.g. checking which volumes on the different storages are not attached to any VM, etc.
 

_alex

[Attachment: Bildschirmfoto 2017-01-06 um 12.28.32.png (screenshot of the Clone dialog)]

I really wonder how to allow something like a 'linked clone' in this dialog, which could then live on a clone instead of an independent dataset.

I tried this in the volume_has_feature sub, without success:
clone => { base => 1, snap => 1 },

But looking at the XHR requests the GUI sends for the dialog, it doesn't even ask for the available modes :(
Maybe it is hardcoded somewhere in the GUI ...
 

nephri

When you remove a disk, it is not destroyed but marked as "unused", but you can't easily assign it to another VM.
Effectively, I need to clone a volume instead of a VM (because I reuse an existing one) and assign the clone to another VM.
So it's an exceptional use case, and I can handle it by hand.
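The closest manual workaround seems to be specifying the full volume id by hand (hedged; the VM IDs and volume name are just examples, and depending on the PVE version it may complain about the vm-102-* ownership naming):
Code:
  # attach an existing volume to another VM by giving its full volume id
  qm set 200 -scsi1 iscsiFreeNas:vm-102-disk-1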

Yes, I would love to have another option besides "Full Clone", like a "Linked Clone".

Slightly off-topic, the "Backup" options are also strange to me:
you have "Mode" with "Snapshot", "Suspend" and "Stop".

But I would expect something like:
- Lock policy: "Suspend" or "Stop"
- Mode: "vzdump", "vzdump from snapshot", "snapshot"

"vzdump from snapshot" would be equal to the current "Snapshot".
"snapshot" would do a simple snapshot on the underlying storage and nothing else.

I would automate backups on Proxmox using "zfs snapshot", but with the VM off. The storage server is then responsible for exporting the snapshot to some other volumes.
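In other words, roughly this from a cron job instead of vzdump (a sketch; the names are placeholders):
Code:
  qm shutdown 102                     # lock policy: stop the VM (wait until it is down)
  ssh root@xxx.xx.x.xxx 'zfs snapshot volFAST/vm-102-disk-1@backup-20170107'
  qm start 102
  # the storage server then exports/replicates the snapshot on its own, e.g.
  # zfs send volFAST/vm-102-disk-1@backup-20170107 | zfs recv backup/vm-102-disk-1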
 

_alex

Hi,
I will post a zip tomorrow where zfs clone is used for the initial activation, and send/recv from a snap of that clone is only done for the copy if needed.
With this, the expensive send/recv only happens when it's actually needed.
I had a minor cleanup / "check if the clone exists" issue left before I had to leave in the afternoon.

Why would you want to take the VM down for backup if you could create a snap while it runs, too?
 

_alex

Attached are the latest ZIPs, which work on clones as long as possible.
It was a bit of a hassle to clean up / prevent errors from duplicate zvols etc.
Because of this, what is still missing / unchanged is cloning a running VM from its current state.
That uses qemu drive_mirror; I guess it could be much faster with snap + clone + zfs send/recv, too.

I tested a lot with SCST (the LUN numbers counted over 200 in the end), as my FreeNAS is terribly slow (the VM lives on LVM on my rusty boot disk), but I did basic testing with it, too.
 

Attachments

dbo

Hello,
I tried the scstzfs module on my servers; it connects and correctly reports the size of the ZFS pool, but no zvols or LUNs are listed on the target dataset. Is that normal?

Before finding your work, I had started my own implementation of an SCST plugin. I took another direction, based on the comstar module, mapping the lun_cmds to scstadmin commands. It is much simpler, as most of the work is already done by scstadmin, and maybe more robust, but it needs some additional work to be feature complete. I will try to merge it