
Really excited for more people to get to use Metal. Let me know if you have any questions.




Why is Metal not offered for single-instance deploys? Our app does not need this kind of uptime. We would be happy with a node going down once in a while (no data loss, of course) and a little bit of downtime, if it saved us the roughly 66% of the cost that goes to running 2 additional nodes that will never see action.
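For reference, the 66% figure follows from dropping 2 of 3 nodes. A minimal sketch of that arithmetic, assuming every node in the cluster is billed at the same price (the function name and that equal-price assumption are illustrative, not taken from any pricing page):

```python
def single_node_savings(nodes_in_cluster: int = 3, nodes_kept: int = 1) -> float:
    """Fraction of the bill saved by running fewer nodes.

    Assumes every node costs the same (an assumption for illustration).
    """
    return (nodes_in_cluster - nodes_kept) / nodes_in_cluster


# Dropping 2 of 3 equally priced nodes saves two thirds of the cost.
print(f"{single_node_savings():.1%}")
```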

It's a durability thing: we need to make sure writes are replicated to at least one other node. There might be avenues to get Metal down to a single node in the future.

I definitely think there are use-cases out there which are fine with daily backups. Not every use-case requires high availability or high durability.

Take a case where durability is irrelevant: people building caches in Postgres (so as to have only one datastore and not need Redis as well). It's not a big deal if the cache blows up; just force everyone to log in again. Would love to see the vendor reduce complexity on their end and pass the savings through to the customer.

Edit: per your other reply about using replication to handle resizing, maybe be upfront with customers that a single-node discount comes with additional latency / downtime. Then, for resizing, you could break connections, take a backup, and restore the backup on a resized node?


Do such small caps on CPU/RAM mean that multiple customers are sharing the same server? Is there concern for noisy neighbors here, either on IOPS or if another customer's workload grows to take the full available storage on the NVMe? What kind of downtime would be needed to switch to a larger size?

We've engineered in protections from noisy neighbors in both CPU and I/O usage, and we do not over-commit resources.

If your workload or another customer's grows and needs to size up, we launch three whole new database servers of the appropriate size (whether that's more CPU+RAM, more storage, or both), restore the most recent backups there, catch up on replication, and then orchestrate changing the primary.
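The resize sequence described above can be sketched as a toy in-memory simulation. This is only an illustration of the order of the steps, not the vendor's actual orchestration: `Node`, `Cluster`, `resize`, and the integer `applied_lsn` log position are all hypothetical names made up for this example.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    size: str
    applied_lsn: int = 0  # how far into the write log this node has applied


@dataclass
class Cluster:
    primary: Node
    replicas: list = field(default_factory=list)


def resize(cluster: Cluster, new_size: str, backup_lsn: int, node_count: int = 3) -> Cluster:
    # 1. Launch whole new servers of the target size.
    new_nodes = [Node(f"new-{i}", new_size) for i in range(node_count)]

    # 2. Restore the most recent backup onto each new server.
    for node in new_nodes:
        node.applied_lsn = backup_lsn

    # 3. Catch up on replication: apply writes made since the backup was taken.
    for node in new_nodes:
        node.applied_lsn = cluster.primary.applied_lsn

    # 4. Change the primary; the remaining new servers become replicas.
    return Cluster(primary=new_nodes[0], replicas=new_nodes[1:])


old = Cluster(primary=Node("old-0", "small", applied_lsn=120))
new = resize(old, "large", backup_lsn=100)
print(new.primary.name, new.primary.size, new.primary.applied_lsn)
```

In a real system step 3 runs continuously while the old primary keeps taking writes, so the switch in step 4 only has to cover a short final gap.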

Downtime when you resize typically amounts to needing to reconnect, i.e. it's negligible.
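On the application side, "needing to reconnect" usually means retrying once a connection drops. A minimal sketch, assuming your driver surfaces a dropped connection as `ConnectionError`; `run_with_reconnect`, `connect`, and `operation` are hypothetical names, and a real driver may raise its own exception type instead:

```python
import time


def run_with_reconnect(connect, operation, retries: int = 3, delay: float = 0.0):
    """Run operation(conn), reconnecting and retrying if the connection drops.

    `connect` returns a fresh connection; `operation` takes a connection.
    Both are supplied by the caller (hypothetical interface).
    """
    conn = connect()
    for attempt in range(retries + 1):
        try:
            return operation(conn)
        except ConnectionError:
            if attempt == retries:
                raise  # out of retries, surface the error
            time.sleep(delay)  # brief pause before reconnecting
            conn = connect()
```

Retrying like this is only safe for operations that are idempotent or that run inside a transaction that was rolled back when the connection dropped.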



