Google: "An SLA normally involves a promise to someone using your service that its availability should meet a certain level over a certain period, and if it fails to do so then some kind of penalty will be paid. This might be a partial refund of the service subscription fee paid by customers for that period, or additional subscription time added for free."
"Partial refund". That's a very low standard for a service level agreement, but typical of Google. Your whole business is down, it's their fault, and all you get is a partial refund on the service.
A service level agreement is really a service packaged with an insurance product. The insurance product part should be evaluated as such - does it cover enough risk and is the coverage amount high enough? You can buy business interruption insurance from insurance companies, and should price that out in comparison with the cost and benefits of an SLA. If this is crucial to your core business, as with an entire retail chain going down because a cloud-based point of sale system goes down, it needs to be priced accordingly.
> "Partial refund". That's a very low standard for a service level agreement, but typical of Google.
It's a standard across the industry, pretty much since the beginning of SLAs.
They're not insurance, and not meant to compensate you if your business is disrupted. That's on you. (And there are many ways to protect your business from provider outages.)
SLA payouts are meant to be mildly punitive, and to align incentives -- in aggregate, the SLA payouts add up and can hurt Google if there are a lot of customers affected by frequent outages.
This doesn't make any sense to me. How is Google supposed to be in a position to price the business risk of individual customers into a standard SLA that they offer to all their customers? That would require Google to charge different amounts of money per customer (commensurate with the business risk placed on Google's services for that customer), running actuarial numbers to ensure that Google would have the means to pay out when the SLA is violated. Doing so would place undue burden on customers, who would need to prove business risk before buying the service, and many customers are unaware of the real business risk of downtime (having not run the numbers) anyway.
With that said... Maybe it's a good product idea to sell varying levels of SLA violation insurance alongside the service covered by the SLA. The default, free level of insurance covers the cost of the service itself, as it does today, but perhaps a customer could buy premium insurance from Google against SLA violations, increasing the payouts to offset business risk. After all, who better to put a price on the risk than Google themselves? So Google can probably offer a better price on offsetting the risk than a third-party insurer that doesn't have access to Google's internal data.
The evil part of outages is that no matter how many resources you pour into building a more reliable system, they still happen. This is true for every company, including Google. So when a company is choosing between cloud providers, it compares these SLAs against what it could achieve itself. Usually it's pretty hard for a random shop to reach a good service level on its own. So I don't see the "business risk" here. Risks are present all the time; a CTO should try hard to minimize them, but there's no way to remove them.
Selling insurance for SLAs seems like an interesting idea, but this kind of insurance might be really similar to earthquake insurance: SLA violations tend to be uncommon (otherwise why commit to one?), but when one does happen it might be a huge cascading failure. Would you like to buy one? All the quirks of earthquake insurance apply.
On the other side, Google has zero incentive to violate SLAs. A. You really cannot control how large the violation will be. B. Damage to the brand >>>>>> money paid out.
> "Partial refund". That's a very low standard for a service level agreement, but typical of Google.
It seems to be the standard. The most generous SLA I've seen is 5% off the monthly bill for each 30 minutes of downtime (up to 100%). If I'm down for 10 hours, waiving one month of bills doesn't come close to the damage done.
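To put numbers on that, here's a rough sketch in Python of how a credit schedule like that works out. The $2,000 monthly bill is a made-up figure, and I'm assuming only completed 30-minute blocks count toward the credit; a real SLA will define its own rounding and exclusions.

```python
def sla_credit(monthly_bill, downtime_minutes, pct_per_block=5, block_minutes=30):
    """Credit owed under a '5% per 30 minutes of downtime, capped at 100%' schedule.

    Assumption: partial blocks earn nothing (only full 30-minute blocks count).
    """
    blocks = downtime_minutes // block_minutes      # completed blocks of downtime
    credit_pct = min(100, blocks * pct_per_block)   # cap at the full monthly bill
    return monthly_bill * credit_pct / 100


# Hypothetical $2,000/month bill, 10 hours of downtime:
# 600 minutes -> 20 blocks -> 100% -> the whole month is waived, and that's the ceiling.
print(sla_credit(2000, 10 * 60))
```

However long the outage runs past that point, the payout never exceeds what you paid for the service, which is the whole complaint.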
An SLA seems to be more of a promise than an agreement, because if the service goes down you're SOL and the provider gets a slap on the wrist (partial refund).
I'm not aware of any SLA from any cloud provider or ISP that offers anything other than a partial refund and/or credit. This is most certainly not specific to Google.
Check out the contract for a lottery or gambling system provider. They usually provide that the service provider is responsible for all losses for downtime or other errors on the provider's part, including fraud and theft. GTech pays about 0.5% of their revenue in penalties.
>A service level agreement is really a service packaged with an insurance product.
Now, such a service may be sold on the very high end... but in the general case, that's not what "Service Level Agreement" usually means.
(As an aside, I strongly suggest you get your business insurance from a party other than your service provider; serious outages can bankrupt service providers as-is... if they had to pay out customer damages, that would become a lot more likely.)
> "Partial refund". That's a very low standard for a service level agreement, but typical of Google. Your whole business is down, it's their fault, and all you get is a partial refund on the service.
The SLA is the contract. While this may not be possible, you'd normally have to negotiate a higher payout for a higher service cost, but otherwise it's fixed based on the amount you pay, not on the amount your business makes.
See: [1]
[1] https://www.researchgate.net/publication/226123605_Managing_...