
More:

- WSL2 sometimes corrupts .zsh_history and git https://github.com/microsoft/WSL/issues/5026

- WSL2 corrupts ext4 filesystem https://github.com/microsoft/WSL/issues/5895



This is probably HyperV. I’ve seen exactly this ext4 corruption in production on windows server 2012R2 with CentOS 7. Even to the point that the machine remounts root read only. Unfortunately our windows operations guys are severely lacking in diagnostic savvy and just reboot the machine over and over again or blast it and provision a new one and don’t analyse the problem.

From what I’ve seen it’s a combination of the storage drivers and the storage virtualisation in HyperV rather than a specific issue. I imagine it’s something similar in WSL.

I really don’t trust it as a platform at all. It’s barely better with windows guests.
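For anyone who wants to catch the "root remounted read-only" symptom before someone reboots the box, here is a minimal sketch. It assumes a Linux guest with the usual /proc/mounts layout (device, mount point, fstype, options, ...); the function name `root_is_read_only` is made up for illustration.

```python
def root_is_read_only(mounts_text: str) -> bool:
    """Return True if the last mount entry for "/" carries the "ro" option.

    Expects text in /proc/mounts format:
    device mountpoint fstype options dump pass
    """
    read_only = False
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4 or fields[1] != "/":
            continue
        # Later entries shadow earlier ones, so keep the last match.
        read_only = "ro" in fields[3].split(",")
    return read_only


if __name__ == "__main__":
    try:
        with open("/proc/mounts") as f:
            print("root read-only:", root_is_read_only(f.read()))
    except OSError:
        print("/proc/mounts not available on this system")
```

Splitting the options on commas matters: an option like `errors=remount-ro` should not be mistaken for `ro`.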


Had a serious talk with a Unix manager over a decade ago who was convinced Windows ops didn't require as much expertise as Unix/Linux. It was a common misconception that MS seemed to encourage. As someone who came over from Windows, I knew better. That attitude continues to influence standard practices, hiring and, most importantly, training and education opportunities for Windows admins -- to the detriment of all. I've also had my collisions with Hyper-V, and have come away with the same impressions as you have.


Agreed. I've done both and if you ask me Windows ops is vastly more difficult because everything is brittle, inconsistent, unreliable and rarely repeatable. It requires great skill, determination and persistence to navigate issues like this. Unfortunately as you suggest, the outcome is hiring as cheap as possible and fixing all issues by not changing anything other than replacing everything every few years. There is rarely any day to day admin I see other than planning the next major rollout with some vain hope it'll have fewer problems than the last one.


I don't know if I'd call it brittle, I would call it super complex (when you get into wmic & friends) and harder to get info online, compared to Linux, because everyone has to tinker with Linux while only a minority of power sysadmins dig that deep into Windows.


> Unfortunately as you suggest, the outcome is hiring as cheap as possible and fixing all issues by not changing anything other than replacing everything every few years.

I think you’ve hit the nail on the head there. It seems like every organisation I’ve been part of which has a significant Windows presence is either static or planning a rollout of and migration to some new magical enterprise software that replaces the old enterprise software they purchased and this time it’ll definitely make everything better. It’s amazing to me how much money gets spent on per-seat licensing for what essentially amounts to no noticeable improvement for anyone involved. But sure, I’m sure this company-wide spyware of choice will be the one that finally means we can just stop caring about security or provisioning machines, right folks?

It’s like the folks doing it have mastered the art of finding busywork that’s just complicated enough that folks signing the cheques can’t really tell they’re burning money. In that sense it’s beautiful I suppose.

Just, erm, ignore the fact a dozen different developers have essentially root access to production databases... at least they can’t install software!


It's not that brittle, MS maintains backward compatibility for a lot of APIs, but ... the upper layers on top are just vendor-ware shit 99% of the time. (Even/especially their own config/setup wizards/GUIs.)

I mean a bash script that handles no errors and outputs nothing just screams madness. The same thing wrapped in a .MSI .. well, you'll never know what hit you, and if it's your job to somehow unfuck this, it's virtually a piece of literal hell itself, slowly rotting and eroding people's soul and mind.


I've worked with good windows admins.

I have so much respect for them.


MCSE was a punchline 20 years ago so this is a misconception almost as old as the entire profession of Windows Admins.


The only time BTRFS broke badly for me was when we ran guests on Hyper-V.


I can confirm that Hyper-V snapshots break ext4 just about every time.


After upgrading to WSL2, I started having issues with a Virtualbox VM. Turns out it didn't play nicely with HyperV. I went back to WSL1.


Newer VirtualBox releases can run virtual machines on top of Hyper-V as a virtualization engine. It is slower than VirtualBox' own engine, but overall it is still a better experience than Hyper-V Manager.


Networking doesn't work properly if you do this. It's a mess.


Disagree, it's unusably slow... I had to move to the Hyper-V Vagrant driver because vbox, while it would boot the VM, was so slow it defeated the purpose of speeding up development.


That’s because HyperV is a type 1 hypervisor and vbox is a type 2 hypervisor. They don’t mix well :)

Best option is still vbox and putty IMHO. VScode will work with it and SSH fine.

Or say fuck it, buy a Mac and do all your Linux work in the cloud.


> Unfortunately our windows operations guys are severely lacking in diagnostic savvy and just reboot the machine over and over again

What are you talking about, that is how you diagnose a Windows box...


But why? WSL1 was something like Wine in reverse, but WSL2 is actually Linux.


The problem is likely not in the Ext4 code, but in the block I/O driver (which is Hyper-V specific, IIRC) or even in Hyper-V itself. Several reports mention Windows shutdowns, sleep or hibernation, so it may be a simple unclean shutdown of the VM.

A bigger problem would be if Hyper-V is either ignoring memory barriers, or caching writes to the disk and losing them when the Hyper-V service is shut down. But that would likely affect more than just WSL, so we'd have seen the problem sooner (or so I vehemently hope).


Huh, interesting. I run a variety of linux based services at home. For years I ran them on a Hyper-V VM (because my computer was technically my gaming machine). I only recently migrated everything to a cluster of Raspberry Pi devices.

I used to have occasional problems with this setup, and it was always some kind of drive corruption or mounting issue. I wonder if this is related?


I recall ext4 had[1] some issues[2] with data loss due to unclean shutdowns.

I assumed that had all been fixed by now, but yeah, these things can get tricky fast.

[1]: https://lwn.net/Articles/322823/

[2]: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/317781/...
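For context, the application-side defence usually recommended against this class of data loss is write-to-temp, fsync, then rename. A minimal sketch follows; the helper name `atomic_write` is hypothetical, and the guarantee only holds if the storage stack underneath actually honors the flush, which is exactly what is in doubt elsewhere in this thread.

```python
import os
import tempfile


def atomic_write(path: str, data: bytes) -> None:
    """Replace `path` with `data` so a crash leaves the old or the new file."""
    directory = os.path.dirname(os.path.abspath(path))
    # The temp file must live on the same filesystem for the rename to be atomic.
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()             # push Python's buffer to the kernel
            os.fsync(tmp.fileno())  # ask the kernel to push it to the device
        os.replace(tmp_path, path)  # atomic rename on POSIX filesystems
    except BaseException:
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
    # fsync the directory so the rename itself is durable (POSIX only).
    if hasattr(os, "O_DIRECTORY"):
        dir_fd = os.open(directory, os.O_RDONLY | os.O_DIRECTORY)
        try:
            os.fsync(dir_fd)
        finally:
            os.close(dir_fd)
```

Note that `mkstemp` creates the file with restrictive permissions, so a real tool would probably want to copy the mode of the file it replaces.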


> Hyper-V service is shut down

As far as I know, when WSL2 is activated, Hyper-V runs as a type-1 hypervisor, and both Linux and Windows run as virtual machines on top of it.

thus if it crashes, you get a bsod


VirtualBox has (had?) similar issues in certain configurations where it maintains a small write cache and doesn't honor IO barriers, which leads to journaled/CoW filesystems reporting an inconsistent state that should have been prevented by journaling.


That's probably the cause.

WSL2 being Linux means that, unlike WSL1 which directly uses the host NTFS filesystem, it's probably using an emulated block device to hold its filesystem. If that emulated block device doesn't correctly honor write barrier requests from the Linux kernel, it could explain the corruption.
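To make the barrier argument concrete, here is a deliberately simplified toy model. It is not how jbd2, ext4 or Hyper-V are implemented; `ToyDisk`, `journaled_update` and `is_consistent` are all made-up names. The device has a volatile write cache, and the "filesystem" relies on a flush between the journal data and the commit record.

```python
class ToyDisk:
    """Block device model with a volatile write cache (illustration only)."""

    def __init__(self, honor_flush: bool) -> None:
        self.honor_flush = honor_flush
        self.durable = {}  # block name -> value, survives a crash
        self.cache = {}    # block name -> value, lost unless written back

    def write(self, block: str, value: str) -> None:
        self.cache[block] = value

    def flush(self) -> None:
        # A correct device writes back its cache before acknowledging a flush.
        if self.honor_flush:
            self.durable.update(self.cache)
            self.cache.clear()

    def crash(self, written_back: set) -> dict:
        """Power loss: only the cached blocks in `written_back` reached disk."""
        for block in written_back:
            if block in self.cache:
                self.durable[block] = self.cache[block]
        self.cache.clear()
        return dict(self.durable)


def journaled_update(disk: ToyDisk) -> None:
    disk.write("journal_data", "new contents")
    disk.flush()  # barrier: the data must land before the commit record
    disk.write("journal_commit", "committed")


def is_consistent(state: dict) -> bool:
    # A commit record without the data it describes is corruption.
    return "journal_commit" not in state or "journal_data" in state


if __name__ == "__main__":
    for honor in (True, False):
        disk = ToyDisk(honor_flush=honor)
        journaled_update(disk)
        # Worst case: the cache happened to write back only the commit record.
        state = disk.crash(written_back={"journal_commit"})
        print(f"honor_flush={honor}: consistent={is_consistent(state)}")
```

With the flush honored, the data is already durable before the commit record is even issued, so any crash leaves a consistent state. With the flush ignored, the cache is free to persist the commit record alone, and the journal then vouches for data that never reached the disk.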


Wild guesses:

* the kernel is not properly shut down (and sometimes some buffers are not flushed)

* the virtual block device and/or its linux driver has bugs


FreeBSD has native support for Linux binaries by mapping system calls, and it's fairly reliable when it works. What's nice is that when it works, it works, adding support for system calls improves coverage, and since underlying things like the FS aren't virtualized, it tends to be pretty reliable.
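A toy sketch of the "mapping system calls" idea, for readers who haven't seen it. This is nothing like the real FreeBSD code (which lives in the kernel and is written in C); the `HANDLERS` table and `dispatch` function are invented for illustration. The only real-world facts used are the Linux x86-64 syscall numbers for write (1) and getpid (39).

```python
import errno
import os

# Linux x86-64 syscall numbers for the two calls this toy layer supports.
LINUX_SYS_WRITE = 1
LINUX_SYS_GETPID = 39


def _write(fd: int, data: bytes) -> int:
    # Forward to the host's native write; returns the number of bytes written.
    return os.write(fd, data)


def _getpid() -> int:
    return os.getpid()


# Coverage grows by adding entries to this table.
HANDLERS = {
    LINUX_SYS_WRITE: _write,
    LINUX_SYS_GETPID: _getpid,
}


def dispatch(number: int, *args):
    """Translate a Linux syscall number into a host call, or fail with ENOSYS."""
    handler = HANDLERS.get(number)
    if handler is None:
        return -errno.ENOSYS  # unmapped call: "function not implemented"
    return handler(*args)
```

Because each handler calls straight into the host, file operations hit the host filesystem directly, with no virtual block device in between.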


Yeah; Windows had something like that, too. It was WSL1 (or just "WSL"). I also tend to think that was the better approach.


It didn't extend to use cases like containers, that would have basically required MS to rewrite large parts of the Linux kernel's core code for namespaces, mount points etc.


Sure. Running a Linux VM (WSL2) just to use containers seems to kind of defeat the point, though. You might as well just run your containers in VMs.


The use case is for developers to use their Linux tools with Windows integration. WSL1 only did the latter half well, "traditional" VMs only did the former. WSL2 does both, however that brings both advantages and disadvantages of VMs.



