Please, name these academic Unixes! I would love to go see what they do. Down-thread there's a mention of minix, which does the normal thing: demand paging, context switches only the page table directory pointer, and process memory images are moved around the storage hierarchy indirectly, through page faults. Which other academic Unix or Unix-like did you have in mind?
Linux is indeed not the owner of the concept of PID 0. It's fortunate that I didn't say that! It is, however, not frequently involved with paging in and out memory.
xv6 and its many forks are what I'm thinking about.
You address this somewhat in the post:
> Going back to the Wikipedia article, it seems the author of that edit wanted to write “swapping”, in the classic Unix V5 sense of swapping out whole processes as a consequence of scheduling. But the edit didn’t clarify that “swapping” was being used in an archaic sense that was likely to confuse the modern reader.
> context switches only the page table directory pointer
Swapping out the PTD pointer is exactly what I'm thinking of. I'm wrong, because I didn't have the common colloquial meaning of "swapping" (paging out memory to disk) in mind.
I think it's a little strange that such a meaning has come to dominate. At least in a classroom setting, it is still fairly common to discuss the operation of the scheduler as "swapping pages".
Yeah, admittedly it's confusing terminology generally, because it's still natural to say you're swapping the page tables out when you do a context switch on current systems. I probably do at some point in the post!
The distinction I was trying to get at was that, in early Unix, all process bytes were being actively streamed to and from disk as part of scheduling because the hardware didn't yet have a concept of virtual memory. So, if you wanted to make a program ready to run, you had to fully load it into memory, and shove anything else out of the way right then and there. That makes the scheduling function 5% deciding what should run, and 95% playing memory sokoban to make that happen.
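To make the "memory sokoban" part concrete, here's a toy model of a swap-based scheduler step. It is only a sketch: `Proc`, `make_runnable`, and `core_size` are names I made up for illustration, sizes are in arbitrary units, and the eviction order (first resident process found) is an assumption rather than what any real early Unix did.

```python
from dataclasses import dataclass


@dataclass
class Proc:
    pid: int
    size: int               # size of the whole process image, arbitrary units
    resident: bool = False  # True if the image is currently in core


def make_runnable(target, procs, core_size):
    """Bring `target` fully into core, evicting whole processes as needed.

    Returns the number of units copied to or from disk (hypothetical metric).
    """
    if target.size > core_size:
        raise ValueError("process image is larger than core")
    if target.resident:
        return 0  # already loaded, nothing to move

    copied = 0
    used = sum(p.size for p in procs if p.resident)
    # Evict entire resident images until the target fits.
    for victim in procs:
        if used + target.size <= core_size:
            break
        if victim.resident and victim is not target:
            victim.resident = False
            used -= victim.size
            copied += victim.size  # whole image written out to disk

    target.resident = True
    copied += target.size          # whole image read back in from disk
    return copied
```

In this model, picking the process is one line of policy, and everything else is moving whole images around, which is roughly the 5%/95% split described above.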
OTOH, on systems with paged virtual memory, the scheduler is almost entirely "what's a good thing to run?", and implementing that decision is updating a couple of pointers. The only place the memory hierarchy creeps in is if the scheduling algorithm wants to be fancy and account for things like NUMA nodes in its ranking of tasks.
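For contrast, here's a toy sketch of the paged case, where the switch itself moves no process memory. `Task`, `Cpu`, `pick_next`, and `context_switch` are hypothetical names, the priority-based pick is an assumed stand-in for a real scheduling policy, and `page_table_base` just models a register like CR3 on x86.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Task:
    pid: int
    page_table_root: int  # stand-in for the address of the page directory
    priority: int = 0


@dataclass
class Cpu:
    current: Optional[Task] = None
    page_table_base: int = 0  # models a register such as CR3 on x86


def pick_next(run_queue):
    """Policy: decide what is a good thing to run (here, highest priority)."""
    return max(run_queue, key=lambda t: t.priority)


def context_switch(cpu, nxt):
    """Mechanism: update a couple of pointers, no process bytes are copied."""
    cpu.current = nxt
    cpu.page_table_base = nxt.page_table_root
```

Any actual movement of memory to and from disk would happen later and indirectly, through page faults, not in this code path.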
I think it's reasonable, looking at it in isolation, to describe this part of the kernel as a "swapper", or the operation as "swapping". I think where it turns into a bear trap is when presenting these concepts to folks less familiar with kernel internals, where words like "swap" and "pages" are firmly the domain of the memory subsystem. And so, if I hand them a task and say "this is the swapper", IMO the majority will interpret that as being a component of virtual memory management, and they wouldn't be at fault for thinking that.
Empirically this happened in the 2008 Wikipedia edit: "swapping" mutated to "paging" because in modern vmm-land that's a valid synonym, and that in turn became "this task is sometimes called 'sched' for historical reasons, and it handles paging" on the web. And cue a decade of confused students and Stack Overflow users asking followups like "but if this task does paging, why does Linux have all these kswapd threads?" That to me suggests that, for better or worse, the memory subsystem owns those words now, and the rest of the kernel has to be very careful if it uses them to mean something else, if it wants to avoid casual onlookers creating false associations. Something something naming things is still the hardest thing in computer science :)