Maybe I'm just really out of my depth on this, but it feels like there's not a lot of information about _why_ these particular steps and tools are used. Why are 4 different Linux images needed? Why are there all of these steps to create a one-time use "init.iso"? Is it just so the cloud-init script can run? I see the call to mkisofs is referencing "cidata", but it's the only place in the whole page that "cidata" shows up. Does that mean mkisofs is cloud-init aware? And why use emulation instead of virtualization?
I guess part of why I'm asking is because I've set up virtual machines in UTM on Apple Silicon before, and I never had to go through all of this just to get Linux installed and configured. The post makes me wonder if there's something I'm maybe missing, but it doesn't give any explanation for me to be able to figure out if that's the case. Maybe the post is meant more just as a checklist for the person that wrote it, for their own reference? But the way the post reads doesn't quite sound that way.
Hmm... that's all coming out sounding more critical than I mean to. I just want more info and I am curious about the approach.
If you just need a single Linux VM, you don't need to fiddle with cloud-init. If you want repeatability or automation, that's when you'd reach for it, whether for setting up per-project VMs, sharing VM configs between developers, etc.
Also, you don't need all 4 Linux images, just the one you want to run as your guest OS. Whether you need emulation or virtualization depends on the guest OS CPU architecture.
From my understanding, you really need only one image – the article is just providing four to suit your tastes (Fedora/Ubuntu, aarch64/x86_64). The images are Linux installers which “understand” cloud-init (because it’s “industry standard”), so if you place the {user,meta}-data files in the right place (a volume named CIDATA, it seems[0]), they can configure your installation without you having to click through the tedious installation process, install packages by hand, and so on.
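For what it's worth, that also answers the mkisofs question upthread: mkisofs itself isn't cloud-init aware at all; the `-volid cidata` is what makes the guest's cloud-init (its NoCloud datasource) pick the ISO up. A minimal sketch of the whole init.iso step, with illustrative file contents:

```
# cloud-init's NoCloud datasource scans for a volume labelled
# "cidata"/"CIDATA" containing user-data and meta-data files
cat > meta-data <<'EOF'
instance-id: demo-vm
local-hostname: demo-vm
EOF

cat > user-data <<'EOF'
#cloud-config
users:
  - name: dev                      # illustrative first-boot user
    sudo: ALL=(ALL) NOPASSWD:ALL
EOF

# -volid is the only cloud-init-relevant part of this command
mkisofs -output init.iso -volid cidata -joliet -rock user-data meta-data
```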
I don’t understand why anyone would go the route of emulation in 2025, but if someone wants to run an x86_64 image with UTM, well, that’s the only route – I’d suggest just going with an aarch64 image. Things were a bit rougher back in 2020, but stuff got much better and I don’t remember any compatibility problems these days.
My thought was: why use UTM? Most of this can be achieved with qemu alone :). But it showed me something new. The cloud-init tool was new to me. From my toolbox I would have used ansible or something. But I think it's very interesting that this all runs automatically during first boot.
But I agree one needs to read between the lines to understand what the purpose of this post is. As you said, it reads like an overcomplicated install setup.
> My thought was why use UTM? Most of this can be achieved with qemu alone
Afaik UTM uses QEMU under the hood, but provides a nice UI on top for the basic use cases. It also has a library of prepared images, so your VM is only a few clicks away from the intention to have one.
It can also modify the VM, resize storage after creation, etc.
Of course all of it can be done with QEMU alone, but this makes it easier to deal with than remembering tons of QEMU command line arguments.
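For a sense of what that looks like, here's roughly the shape of the invocation UTM assembles for an Apple Silicon guest (a sketch only; paths and sizes are illustrative, and the UEFI firmware file ships in qemu's share directory):

```
qemu-system-aarch64 \
  -machine virt -accel hvf -cpu host \
  -smp 4 -m 4096 \
  -drive if=pflash,format=raw,readonly=on,file=edk2-aarch64-code.fd \
  -drive if=virtio,format=qcow2,file=disk.qcow2 \
  -nographic
```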
Guess you misunderstood me. I know that UTM is built on top of qemu. I use it as well. I mean, when you're already using this init-image tooling etc., why click through the UI to set up a VM? One would think to offload this to a script as well, because in the post's steps UTM is just a means to start the resulting image.
> My thought was why use UTM? Most of this can be achieved with qemu alone :)
qemu needs to be studied a bit; UTM is fairly intuitive.
I recently decided to learn how to create VMs with bare qemu (using the command-line).
As I have an ARM MacBook for work, UTM helped me a ton with aarch64 virtual machines because I could enable the debug log and see which qemu options/flags/switches UTM would use.
Unrelated: I have some ideas about writing a tool that aims to be a "spiritual successor" to Vagrant (from HashiCorp), but focused on targeting qemu rather than VirtualBox.
Anyone interested? Please let me know (upvote or comment)
Theoretically the entire docker workflow can be translated to VMs. In practice it is a shit show.
The biggest problem by far is building a VM image, because it consists of multiple highly irritating steps.
1. Building custom packages for the target distribution.
Since we aren't using containers, we would in principle need a full VM per application. This is not a good idea in practice. We want to avoid containers, but we still want something very much like docker images at the application level. The obvious answer is building distro-specific packages.
Building distro packages is annoying, because the developer machine doesn't necessarily run the same OS as the servers. This means that building the package requires you to spin up a temporary virtual machine or a docker container. Let me tell you, it is by far easier to build your packages inside a docker container, and that's why I never even bothered with the VM route, even in situations where I'm deploying VM images. There needs to be a VM-based alternative to "docker build" that doesn't necessarily spit out a VM image, but rather spits out the result of your build (e.g. packages) onto a mounted directory on the host.
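To make that concrete, the container version of the pattern is a one-liner. A sketch for Alpine (image tag, output path, and the root/keygen shortcuts are illustrative; abuild normally runs as an unprivileged user with signing keys already set up):

```
# build inside a throwaway container matching the target distro,
# collect the resulting packages in ./out on the host
docker run --rm \
  -v "$PWD":/src -w /src \
  -v "$PWD/out":/out \
  alpine:3.20 \
  sh -c 'apk add alpine-sdk && abuild-keygen -a -n && abuild -F -r -P /out'
```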
If you've never built your own Alpine packages, try writing an APKBUILD. It is very easy.
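It really is just shell: a few metadata variables plus functions that abuild calls in order. A minimal illustrative sketch:

```
# APKBUILD (hypothetical package; run `abuild checksum` to fill in sha512sums)
pkgname=myapp
pkgver=1.0.0
pkgrel=0
pkgdesc="Example application"
url="https://example.com/myapp"
arch="all"
license="MIT"
options="!check"          # skip the test phase in this sketch
source="$pkgname-$pkgver.tar.gz"

build() {
    # $builddir defaults to $srcdir/$pkgname-$pkgver
    make -C "$builddir"
}

package() {
    make -C "$builddir" DESTDIR="$pkgdir" install
}
```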
2. Building VM images
Now let's say we are done and just want to build our VM images. There are already distro-specific tools like https://github.com/alpinelinux/alpine-make-vm-image. What you want to do is install the packages created in the first step, run a simple bash script for finishing touches, and set up cloud-init for the first boot. Unlike a Dockerfile, this should be kept very simple, because the packages are already doing everything the Dockerfile is expected to do. The only thing I would overcomplicate here is directly integrating a package repository into the tool to make it effortless.
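Going by that tool's README, step 2 then boils down to roughly one command (package list and script name are illustrative):

```
sudo ./alpine-make-vm-image \
    --image-format qcow2 \
    --image-size 2G \
    --packages "$(cat packages.txt)" \
    --script-chroot \
    myapp.qcow2 -- ./setup.sh
```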
3. Running the VM
At this point everything should be quite simple. The primary use case is to run the VM image locally on a developer computer before deployment. Some quality-of-life features like docker-style port proxying and directory mounts would be nice. This is by far the easiest part because tools like virt-manager already exist.
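Even bare qemu covers those two conveniences with a couple of flags (ports and paths are illustrative; the guest mounts the share over 9p):

```
qemu-system-x86_64 \
  -m 2048 -accel kvm \
  -drive if=virtio,format=qcow2,file=app.qcow2 \
  -nic user,hostfwd=tcp::8080-:80 \
  -virtfs local,path="$PWD/data",mount_tag=data,security_model=mapped-xattr
# guest side: mount -t 9p -o trans=virtio data /mnt/data
```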
Cloud-init isn't a replacement for ansible unless you're building your own VM images. cloud-init only runs on the first boot. You would use it to install and set up ansible.
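In user-data terms the handoff is tiny (repo URL and playbook name are hypothetical):

```
#cloud-config
# runs once, on first boot only; after that, ansible owns the machine
packages:
  - ansible
runcmd:
  - ansible-pull -U https://example.com/infra.git site.yml
```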