There were some popular comments on HN recently saying that manual threading has got much easier in Rust lately and it's now easier to just not use async.
I have no idea about this myself, it's just a bit of public opinion that felt worth noting in the back of my mind!
Imho this has always been true. Multithreading is very mature in Rust, and generally a pleasure to work with (especially if you lean towards channels over mutexes). Async still feels a bit tacked on and has a number of footguns and UX problems.
The main reason to use async in rust is that seemingly every network library uses async. And of course servers that would have to start a huge number of threads without async, though from a performance standpoint many servers could honestly live without it.
"The main reason to use async in rust is that seemingly every network library uses async."
Yes. I was calling that "async contamination" a few years back when that trend started. There's a strong lobby for doing web backend stuff in Rust, and they want it to work like JavaScript.
On the thread side, deadlocks can be a problem. You don't get any help from the language in keeping your lock hierarchy consistent. There are systems that can detect at compile time that one path locks A before B, and another path locks B before A. Rust needs such analysis.
It turns out that the built-in mutexes are not "fair". Actually, they're worse than not fair. If you have two thread loops that do "get item, lock, handle item, unlock", one of them can be starved out completely. This happens even though there's a moment when the mutex is unlocked and something is waiting on it. It's a consequence of an optimization in std::sync::Mutex for the fast case.
The "parking_lot" crate has fair mutexes, but it doesn't have panic poisoning and is not unwind safe, so catching failed threads to get a clean shutdown is hard. This is an issue for GUI programs, because failing to catch panics means the user experience of a failure is that the window closes with no message.
"On the thread side, deadlocks can be a problem. You don't get any help from the language in keeping your lock hierarchy consistent."
The 98% solution to this problem is to dodge it entirely. Communicate with things like channels, mailboxes, and actors, or other higher-level primitives. This is how Go programs hold together in general despite not even having as much support as Rust does for this sort of analysis.
The rule I've given to my teams is: never take more than one lock. As soon as you think you need to, move to an actor-owned resource or something. The "real" rule (as I'm sure you know, but for others) with multiple locks is "always take them in the same deterministic order" (and even that's a summary; it really also ought to discuss how only mutexes that can ever be taken at the same time have to be considered in order, and heck, for all I know it gets complicated beyond that too; at this point I bug out entirely), but this is, in my opinion, the true source of "thread hell" as written about in the 90s. I don't think it was threading that was intrinsically the problem; yes, threads are more complicated than a single thread, but they're manageable with some reasonable guard rails. What was insane was trying to do everything with locks.
In theory, this rule breaks down at some point because of some transactionality need or another; in practice, I've gotten very far with it between a combination of being able to move transactionality into a single actor in the program or offload it to a transactionally-safe external DB. For which, much thanks to the wizards who program those, test those, and make them just a primitive I can dip into when I need it. Life savers, those folk.
Yes, I might have been better off with an actor model. I'm writing a high-performance metaverse client, which has both the problems of an MMO game and of a web browser. Must maintain the frame rate, while content you don't control floods in.
I'm coming from Go, so what did you mean by "if you lean towards channels"? Did you mean std::sync::mpsc? Seems like a rather crippled channel if you can only have a single consumer.
I recently wrote a program that used rayon to do a multithreaded but heavily I/O-bound computation. It was super simple to write that and get it working. It executed hundreds of millions of tasks, but you just tell rayon how big the thread pool should be and execute everything on a parallel iterator. It looks just like single-threaded code. As simple as this sort of thing can be.
But in theory, because of the I/O, it should be much more efficient to run this async. So I tried doing it async on a single thread to start with. That was a bit more difficult to get working, just because of all the messiness around async. And it was much slower because of the single thread.
I then tried to get that working with multiple threads. That proved problematic. I tried libraries like par_streams but ran into issues. I ended up using buffer_unordered on top of Tokio to get a simpler setup with a pool of Tokio tasks.
This ended up giving quite unpredictable performance and being very heavy on memory usage. It seemed like it was going to need quite a bit of tweaking to get it to work as smoothly as the multithreaded version. I ended up abandoning it, since I didn’t have time to debug the black box of Tokio’s internal scheduling.
The lesson I took from this is that unless you have a really strong need to use async, multithreaded is going to be a lot easier and more reliable to implement, and the code will be much more transparent, not littered with awaits and complex Future type signatures.
Unfortunately, async is the only straightforward way to do cancellation, so there are still a lot of scenarios where you can't just use threading easily.
I'm not sure why people are disagreeing with this: how would you cancel UdpSocket::recv without an async runtime? If you're using just the sync UdpSocket, you will need to go set a read timeout and repeatedly poll and insert your own cancellation points, or otherwise jury-rig your own mechanism up with syscalls to select/epoll/whatever, and be careful the whole time you're not leaking anything. However, if you're using an async runtime, you only have to drop the future and the recv is cancelled and the FD is dropped from the scheduler's waitset and everyone is happy.
It depends on what the OS gives you. As far as I know the old BSD sockets-like Linux I/O syscalls are not really cancellable (the process would have to send a signal to itself to get interrupted, or close the fd, or pthread_kill ... https://stackoverflow.com/questions/17822025/how-to-cleanly-... )
And that's usually where the zero-cost abstractions [zero-cost compared to hand-rolling it] come in, but it turns out that's not enough: we'd also need composability (decomposability, to be precise), because async Rust is doing a lot, especially if all you need is the cancellation feature.
Stepping back, it all depends on the development budget and goals. If time is short but there's plenty of RAM and performance budget, just go with Tokio. If that's too bloated, go with something low-level, e.g. mio, and if you need every last bit of performance, then you are basically bargaining with the kernel anyway, so io_uring and XDP (eBPF) ... and along the way there are potential stops at Seastar (a C++ framework) and dark alleys like DPDK (https://talawah.io/blog/linux-kernel-vs-dpdk-http-performanc...) :)