fock's comments

Good to see that hands are still not solved...


> It also strikes me as a uniquely difficult challenge to track down the decision maker who is willing to take the risk on revamping these systems (AI or not).

Here that person is a manager who got demoted from ~500 reports to ~40 and then convinced his new boss that it's a good idea to reuse his team for his personal AI strategy, which will make him great again.


I work at a shop (what you'd call a specialized provider for finance) which still has the "transaction" workload on IBM z/OS (IMS/DB2). The parts we manage (in OpenShift) interface with that (as well as other systems), and I have heard of people, and seen the commits, moving PL/I to COBOL. In 2021. Given COBOL's nature, those apps easily run to more than 1k LoC.

We also sublease our mainframes to at least 3 other ventures, one of which is very outspoken about having left the mainframe behind. I guess that's true if you view outsourcing as (literally) leaving it behind with the competitor of your new system... It seems to be the same for most banks, none of which publicly have mainframes anymore, yet for weird reasons they still hire people for them offshore.

Given that our (and IBM's!) services are not cheap, I think either a) our customers are horribly dysfunctional at anything but earning money slowly and steadily (...) or b) they actually might depend on those mainframe jobs. So if you are IBM, or a startup adding AI to IBM, I guess the numbers might add up to the claims.


how large are the clusters then?


We have on-prem with heavy spikes (our batch workload can easily utilize the 20 TB of memory in the cluster), and we just don't care much and add 10% every year to the hardware requested. Compared to employing people or paying other vendors (relational databases with many TB-sized tables...), this is just irrelevant.

Sadly devs are incentivized by that, and moving towards the cloud might be a fun story. Given the environment, I hope they scrap the effort sooner rather than later, buy some Oxide systems for the people who need to iterate faster than the usual process of getting a VM allows, and replace/reuse the 10% of the company occupied with the cloud (mind you: no real workload runs there yet...) to actually improve local processes...


Somewhat unrelated, but you just tied wasteful software design to high IT salaries, and also suggested a reason why Russian programmers might, on the whole, seem far more effective than we are.

I wonder: if MSFT had simply cut dev salaries by 50% in the 90s, would it have had any measurable effect on Windows quality by today?


I guess for IMS/CICS/TPF/... the IBM mainframe is a perfectly fine appliance compared to the alternatives. While not exactly transaction processors, SAP HANA, Oracle Exadata and co. all market themselves to the same customer groups; SAP even sells full banking systems for medium-sized banks.

Your point that TCO is lower than a well-executed alternative seems very dubious to me though. Maybe lower than cloud, and certainly lower than whatever crap F100 consultants sold you, but running database unloads with basic ETL for a few dozen terabytes per month and creating an MSU bill in the millions is just ridiculous. The thing which probably lowers the TCO is that EVERY mainframe dev/ops person in existence is essentially a fin-ops expert, formed by decades of cloud-style billing. Also, experience on a platform where your transaction processing historically has KB-range size limits, data set qualifiers are max. 44 chars, files (which you allocate by cylinders) don't expand by default, and whatever else you miss from your '80s computing experience naturally leads to people writing relatively efficient software.

In general even large customers seem to agree with me on that (see Amadeus throwing out TPF years ago), with even banks mostly outrunning the milking machine called IBM. What is and will be left is governments: captured by inertia and corruption (at the top) and kept alive by underpaid lifelong experts (at the bottom) who have never seen anything else.

> during the AWS outage this week.

Also the reliability promises around mainframes are "interesting" from what I've seen so far. The (IBM) mainframe today is a distributed system (many LPARs/VMs and software making use of it) which people are encouraged to run at maximum load. Now when one LPAR goes down (and possibly pulls down your distributed storage subsystem) and you don't act fast to drop the load, you end up in a situation not at all unlike what AWS experienced this week: critical systems are limping along, while the remaining workload has random latency spikes which your customers (mostly Unix systems...) are definitely going to notice...

The non-IBM way of running VMs on a Linux box and calling it a mainframe just seems like a scam if sold for anything but decommissioning. So I guess those vendors are left with governments at this point.


> The (IBM) mainframe today is a distributed system (many LPARs/VMs and software making use of it)

Not really. While you can partition the machine, you can also have one very large partition and much smaller ones for isolated environments. It also has multiple redundancy paths for pretty much everything, so you can just treat it as a machine where hardware never fails. It's a lot more flexible than a rack of 2U servers or some blade chassis. It is designed to run at 100% capacity with failover spares built in. This is all transparent to the software. You don't need to know a CPU core failed or some memory died - that's all managed by the system. You'll only notice that a couple of transactions failed and were retried. You are right in that mainframe operations are very different from Linux servers, and that a good mainframe operator knows a lot about how to write performant software.


And incidentally all documentation recommends not extending your LPARs beyond what is available on a single CPC "node" (see 2-23 in [0] for a nice (and honest...) block diagram). If you extend your LPAR across all CPCs, I doubt that many of the HA and hot-swap features continue to work (also there are bugs...). E.g. you won't hot-swap memory when it's all utilized:

> Removing a CPC drawer often results in removing active memory. With the flexible memory option, removing the affected memory and reallocating its use elsewhere in the system is possible.

So while you can have single-system images on a relatively large multi-node setup, I doubt many people are doing that (at the place I know, no LPAR has TBs of memory...). Also, in the given price range you can easily get SSI images for Linux too: https://www.servethehome.com/inventec-96-dimm-cxl-expansion-...

If you don't need single-system images, VMware and Xen advertise literally the same features on a blade chassis, minus the redundant hardware per blade, which is not really necessary when you can just migrate the whole VM...

Also, if you define the whole chassis as having 120% capacity, running it at 100% capacity becomes trivial too. And this is exactly what IBM is doing, keeping spare CPUs and memory around in all correctly spec'ed setups: https://en.wikipedia.org/wiki/Redundant_array_of_independent...
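To make that concrete, here is a toy calculation; the 120%/100% split is an illustrative assumption, not IBM's actual spares ratio:

```python
# Toy calculation: if the chassis physically holds 20% more CPUs/memory than its
# rated ("100%") capacity, then "running at 100%" still leaves real headroom.
# Numbers are illustrative assumptions, not an actual IBM spares ratio.
rated_capacity = 100       # what the vendor sells as "100%"
physical_capacity = 120    # what is actually installed, including spares

utilization_of_physical = rated_capacity / physical_capacity
print(f"'100% load' uses only {utilization_of_physical:.0%} of the installed hardware")
# -> '100% load' uses only 83% of the installed hardware
```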

You are right though that the hardware was and is pretty cool, and that that kind of building for reliability has largely died out. Also, up until ARM/Epyc arrived maximum capacity was above average, but that is gone too. Together with the market segment likely not buying for performance, I doubt many people today are running workloads which "require" a mainframe...

[0] https://www.redbooks.ibm.com/redbooks/pdfs/sg248951.pdf


> building for reliability has largely died out.

A real shame, but offloading reliability to software engineers makes the hardware cheaper, something IBM mainframes aren't known for.

> doubt many people today are running workloads which "require" a mainframe...

It seems to me mainframes are built with profoundly different requirements from the ordinary hyperscaler server, with a lot more connectivity and specialized IO processors than raw CPU power. The CPUs are really fast, but it's the IO capacity that really sets them apart from a top-of-the-line Dell or HPE.

If IBM really wanted to make the case for companies to host their Linux workloads on LinuxONE hardware, they'd make Linux on s390x significantly cheaper than x86 on their own cloud. I am sure they could, but they don't seem willing to do so.


Quite some time ago I implemented NFS for a small HPC cluster on a 40GbE network. A colleague set up RDMA later, since at the start it didn't work with the Ubuntu kernel available. Full NVMe on the file server too. While the raw performance using ZFS was kind of underwhelming (mdadm+XFS was about 2x faster), network performance was fine, I'd argue: serial transfers easily hit ~4 GB/s on a single node, and 4K benchmarking with fio was comparable to a good SATA SSD (IOPS + throughput) on multiple clients in parallel!
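For reference, a 4K random-read run of that kind could be driven roughly like this; a minimal sketch, assuming fio is installed, /mnt/nfs is a hypothetical mount point, and the job parameters (queue depth, job count, runtime) are illustrative rather than the ones actually used back then:

```python
# Minimal sketch: run a 4K random-read fio job against an NFS mount and report IOPS/bandwidth.
# Assumes fio is installed; /mnt/nfs and all job parameters are illustrative assumptions.
import json
import subprocess

def run_fio_4k_randread(directory="/mnt/nfs", jobs=4, iodepth=32, runtime_s=60):
    cmd = [
        "fio",
        "--name=randread-4k",
        "--ioengine=libaio",       # async I/O on Linux
        "--direct=1",              # bypass the page cache so the network/storage path is measured
        "--rw=randread",
        "--bs=4k",
        f"--iodepth={iodepth}",
        f"--numjobs={jobs}",
        "--size=4g",
        f"--runtime={runtime_s}",
        "--time_based",
        f"--directory={directory}",
        "--group_reporting",
        "--output-format=json",
    ]
    out = subprocess.run(cmd, capture_output=True, text=True, check=True)
    read_stats = json.loads(out.stdout)["jobs"][0]["read"]
    return read_stats["iops"], read_stats["bw"]  # fio reports bw in KiB/s

if __name__ == "__main__":
    iops, bw_kib = run_fio_4k_randread()
    print(f"4K randread: {iops:.0f} IOPS, {bw_kib / 1024:.0f} MiB/s")
```

Running the same job from several clients in parallel is what gives the aggregate numbers mentioned above.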


And there have been continuous ports since then: https://github.com/Godzil/ftape/tree/master - note the caveats, which apparently all disappeared here...


Looks like that hasn't been updated in 6 years, and only supports the 2.6.x kernel.

I doubt it would have been significantly easier to start the porting effort from that vs. the original 2.4.x source.


And of course this didn't take into account that you posted that, because I got directed straight here by AI!


No. In the short time I've worked at a z/OS shop, they have had to IPL twice. And an IPL takes ages...

Now, if you can live with the weird environment and your people know how to program what is essentially a distributed system described in terms no one else uses: I guess it's still OK, given the competition is all executing IBM's playbook too.


Entire mainframe IPL, or just LPAR?

My understanding is that usually you subdivide into a few LPARs and then reboot the production ones on a schedule to prevent drift and to ensure that yes, unplanned IPLs will work.


And sadly not too many words on how they made sure those shelves don't vibrate and squeak horribly - which they will if not placed on perfectly smooth surfaces... I could somehow picture something like this working out nicely using metal structural framing - but the price point then probably comes close to quite nice carpentry if you add some bells and whistles.

