OpenZFS is designed around a shim layer between ZFS's internal Solaris-ish API usage and the native OS's APIs. This allows the same ZFS code to run essentially unmodified on many *nixes.
In ZoL, this is called SPL (Solaris Porting Layer), and is one of the two kernel modules required to use ZoL. Linux does not export the proper APIs that ZFS requires to work here, and SPL fills that gap.
ZFS modules do "break" often due to internal API changes, but the fix is usually shipped in ZFS stable before the kernel itself is shipped. Only people who follow ZoL development closely ever see the sausage being made.
It is highly unlikely that Linux can ever do anything that ends up with a situation that the SPL and ZFS kernel modules can't easily #ifdef their way out of. It wouldn't make sense for Linux to, either.
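To make the "#ifdef their way out" point concrete, here is a rough userspace sketch of the pattern, not actual SPL code. Everything in it is invented for illustration: `HAVE_HOST_COPY_V2`, `host_copy_v1`, `host_copy_v2` and `compat_copy` are hypothetical names, and I'm assuming a configure-time check would define the feature macro depending on which kernel is being built against.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical feature macro (an assumption for this sketch). A build-time
 * check would define it when the host offers the newer interface.
 */
/* #define HAVE_HOST_COPY_V2 1 */

#ifdef HAVE_HOST_COPY_V2
/* Stand-in for a newer host API: size comes first, returns bytes copied. */
static size_t host_copy_v2(size_t dst_size, char *dst, const char *src)
{
    size_t n = strlen(src);

    if (n >= dst_size)
        n = dst_size - 1;
    memcpy(dst, src, n);
    dst[n] = '\0';
    return n;
}
#else
/* Stand-in for an older host API: size comes last, returns nothing. */
static void host_copy_v1(char *dst, const char *src, size_t dst_size)
{
    strncpy(dst, src, dst_size - 1);
    dst[dst_size - 1] = '\0';
}
#endif

/*
 * The shim. Portable code only ever calls compat_copy(), so a change in the
 * host API is absorbed here instead of rippling through the whole code base.
 * Copies at most dst_size - 1 bytes and returns the number of bytes copied.
 */
static size_t compat_copy(char *dst, const char *src, size_t dst_size)
{
    assert(dst != NULL && src != NULL && dst_size > 0);
#ifdef HAVE_HOST_COPY_V2
    return host_copy_v2(dst_size, dst, src);
#else
    host_copy_v1(dst, src, dst_size);
    return strlen(dst);
#endif
}
```

Callers see the same `compat_copy()` signature and behavior whichever branch gets compiled in, which is the whole trick: the breakage is confined to one small file.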
> ZFS modules do "break" often due to internal API changes, but the fix is usually shipped in ZFS stable before the kernel itself is shipped. Only people who follow ZoL development closely ever see the sausage being made.
Eh, it depends on the distribution. I use Fedora, and there have been quite a few times when a kernel update has resulted in ZoL becoming useless for a while until they push an updated version. The solution, of course, is just booting the earlier kernel until ZoL catches up.
In fact, this is happening right now. The 4.20 kernel was pushed to updates on Fedora (at least my system) on 1/23. The latest stable ZoL doesn't work with it. They're on RC3 for the next release, though, and that does. Hopefully it comes out soon.
In the meantime, I'm running the last 4.19 kernel. shrug
> I use Fedora, and there have been quite a few times when a kernel update has resulted in ZoL becoming useless for a while until they push an updated version.
Alternatively, you can use the variation of a distro, if it exists, that provides a more stable kernel and package version set. In this case, that would be RHEL/CentOS.
For a workstation, that can be annoying, but as you said, just use an older kernel for a bit longer. Perhaps mark the kernel package to be skipped in normal package operations, and have a cron job that checks the kernel specifically and emails you when updated versions exist.
For a server, I imagine it's rarely a problem, since those should be running more stable distros anyway. 99% of the time the kernel there is older and back-patched (it may be the current kernel, but still weeks after it was released, right at point release updates), which should result in a stable API for ZoL.
The trade-offs there don't seem too onerous to me. A little hand configuration for a workstation (of which there's likely one or two for a person to deal with) is fine as long as there's smooth server support, which is generally more important to have solid, both because it can be harder to fix if there's a problem and because it can scale from none to many per person.
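The version test at the heart of such a job is small. A minimal sketch, assuming you maintain by hand the newest major.minor kernel your installed ZoL is known to build against; `pack_version`, `parse_release` and `kernel_supported` are names made up for this example, and the release strings in the usage note are only illustrative:

```c
#include <stdio.h>

/*
 * Pack major.minor into a single comparable integer. Same idea as the
 * kernel's KERNEL_VERSION() macro, but ignoring the patch level.
 */
static int pack_version(int major, int minor)
{
    return (major << 8) | minor;
}

/*
 * Parse the leading "major.minor" of a release string.
 * Returns the packed version, or -1 if the string cannot be parsed.
 */
static int parse_release(const char *release)
{
    int major, minor;

    if (release == NULL || sscanf(release, "%d.%d", &major, &minor) != 2)
        return -1;
    if (major < 0 || minor < 0 || minor > 255)
        return -1;
    return pack_version(major, minor);
}

/*
 * Returns 1 if the kernel is at or below the newest supported major.minor,
 * 0 if it is newer, and -1 if the release string is unparseable.
 * max_major/max_minor are supplied by the caller (an assumption of this
 * sketch: nothing here asks ZoL what it actually supports).
 */
static int kernel_supported(const char *release, int max_major, int max_minor)
{
    int v = parse_release(release);

    if (v < 0)
        return -1;
    return v <= pack_version(max_major, max_minor) ? 1 : 0;
}
```

With 4.19 as the ceiling, `kernel_supported("4.19.15-300.fc29.x86_64", 4, 19)` says yes and `kernel_supported("4.20.3-200.fc29.x86_64", 4, 19)` says no. A real job would feed it the candidate package's version string, or the `release` field from `uname(2)` for the running kernel, and send the email on a 0.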
Sure, it doesn't bother me a whole bunch, I just thought I'd point out that, at least with Fedora, ZoL is occasionally behind tracking the most recent kernel release.
How does RHEL help there? They ship kernel updates. Those kernel updates once broke the HP b120i proprietary driver (HP releases a new version of this driver for every minor RHEL release). I don't see how a fakeraid driver is fundamentally different from ZFS.
Kernel updates are back-patched. That means that in between point release updates (which generally happen every 6-12 months) the kernel version stays the same, and any bug fixes are ported into the older kernel that is shipped. Point releases may update the kernel version (I think?), but generally keep it the same too, and they backport some features into the older kernel, not just bug/security fixes. You can see here[1] for RHEL versions and the kernels they ship with.
If a security fix breaks your ZoL integration, my guess is you're actually better off waiting for that to play out and resolve itself than to expect it to work. If a feature back port breaks it, that might be a little more annoying, but I imagine it will be fixed in short order, and you only have to worry about that once every 6-12 months (and it's well publicized).
> Those kernel updates once broke HP b120i proprietary driver (HP releases new version of this driver for every minor RHEL release).
If HP is releasing closed source drivers for RHEL, I imagine they would want to be on the certified hardware list and test, or at a minimum seek access to the beta spins of the point releases (which I think is where it broke) so they can test before it comes out. I'm not sure I blame Red Hat for HP trying to specifically support RHEL and failing to do so, given the systems I know they have in place to help companies in just that situation (because it helps RHEL users).
In any case, all I'm really noting is that between Fedora, which ships a new kernel version every kernel update (AFAIK), and RHEL/CentOS, which ship largely the same kernel with only the specific changes needed the majority of the time, keeping ZoL working should be vastly easier on RHEL systems (and in fact, on any OS which does back patching of kernels, which I believe includes SuSE and the LTS releases of Ubuntu).
It's not really the same kernel. They backport a lot of features with every minor release. They still call it 2.6.x or whatever, but it really is different. I know that RHEL has some subset of the internal kernel API that they promised to keep stable within a major release, so if HP failed to rely on those APIs, it's their problem, but it might happen.
On this tangent, other out of tree patchsets like OpenVZ have had similar issues where the kernel has massively changed between versions, and forward porting their changes is challenging at best, even with a massive userbase.
That allows ZoL to run on many different *nixes, which means that if Linux made a drastically breaking change, you could use it on another OS, sure.
And the Linux kernel has made numerous breaking changes to their APIs that ZoL has been able to work around. So it has happened, they've just been able to deal with it.
Even if a breaking change that ZoL can't work around is improbable, it is still possible. The Linux kernel could majorly overhaul an API in a major version release, in a way that ZoL can't handle. Given ZoL's status as a separate kernel module not under a GPL license, it's entirely possible that no amount of yelling gets the Linux maintainers to change their mind. In fact, Wowfunhappy notes a discussion along those lines is happening currently, based on function symbols being removed for an API required for Mac hardware support.
And sure, that compatibility layer means that users could switch to FreeBSD, or Solaris, etc, and keep using ZFS. If that happens, does Delphix change their target platform again to move to FreeBSD? Or do they come up with another solution, and stop supporting ZoL?
Except only recently we have seen an instance of access to an API being removed that broke the build of ZoL. While this one is at worst a performance regression, it does show that the mainline kernel is very much able to break ZoL with an API change.
Which would essentially be impossible.