To keep your server warm, you can manually configure a scheduled task for your Zappa function that pings it every 5 minutes so the container stays cached.
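As a sketch of what that manual setup can be replaced with: later versions of Zappa (the Miserlou/Zappa repo linked at the end of the thread) expose this directly via `keep_warm` settings, so you don't have to wire up the scheduled task yourself. The stage name and `app_function` below are placeholders, not values from this thread:

```json
{
    "production": {
        "app_function": "your_app.app",
        "keep_warm": true,
        "keep_warm_expression": "rate(5 minutes)"
    }
}
```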
The cost of running the warmer (~8640 calls at the 300s timeout, or roughly 3M seconds, at 512MB) comes out to about $18, and that's not counting the actual requests from end users. It's interesting but costly as a substitute.
Each call should only run for the 100ms billing minimum, since it's just a ping and the instance is usually already loaded (although AWS will likely recycle the instance from time to time, outside of your control). That's roughly 8640 calls/month, which is about 864 seconds of execution time. At 512MB you get the equivalent of 800,000 free seconds per month, so this is totally negligible: even if it were all billed above the free tier, it would be about 7/10 of a cent per month.
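The arithmetic in that estimate is easy to check; a quick sketch restating the comment's assumptions (one ping every 5 minutes, billed at the 100ms minimum, 512MB memory):

```python
# Back-of-the-envelope check of the "ping every 5 minutes" cost claim.
PINGS_PER_MONTH = 30 * 24 * 60 // 5         # 8640 invocations in a 30-day month
BILLED_SECONDS = PINGS_PER_MONTH * 0.1      # 864 s, at the 100 ms billing minimum
GB_SECONDS = BILLED_SECONDS * (512 / 1024)  # 432 GB-s at 512 MB
PRICE_PER_GB_S = 0.00001667                 # Lambda's per-GB-s price quoted below

# Even ignoring the free tier entirely, this is well under a cent:
cost = GB_SECONDS * PRICE_PER_GB_S          # ~$0.0072, i.e. about 7/10 of a cent
print(PINGS_PER_MONTH, BILLED_SECONDS, round(cost, 4))
```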
From my understanding of the pricing (I could be wrong), even if the call itself takes 100ms, because the timeout is set to 300s so the app stays in cache, you'll be charged for the full 300s (the app is "running" for that long).
The 3M seconds figure is roughly 8640 (number of calls a month to keep it in cache) * 300s (the timeout, set so it stays in memory). My understanding of AWS Lambda is that if you keep it up for 300 seconds, you get charged for that long, because billing is #reqs * #secs. Here's the example from the pricing page (with my comment at the end of the Total compute line):
The monthly compute price is $0.00001667 per GB-s and the free tier provides 400,000 GB-s.
Total compute (seconds) = 3M * (1s) = 3,000,000 seconds # (roughly equal to 8640 * 300 ~ 2.6M seconds in the case of this app)
Total compute (GB-s) = 3,000,000 * 512MB/1024 = 1,500,000 GB-s
Total compute – Free tier compute = Monthly billable compute GB-s
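Under that reading of the pricing, the numbers do reproduce the ~$18 figure claimed earlier; a sketch using the round 3M-second figure from the pricing example:

```python
# Reproducing the "$18" estimate under the (disputed) assumption that each
# keep-warm call is billed for the full 300 s timeout, not the 100 ms minimum.
PINGS_PER_MONTH = 8640
TIMEOUT_S = 300
SECONDS = PINGS_PER_MONTH * TIMEOUT_S   # 2,592,000 s, rounded up to 3M above
GB_SECONDS = 3_000_000 * (512 / 1024)   # 1,500,000 GB-s, using the round 3M figure
BILLABLE = GB_SECONDS - 400_000         # subtract the 400,000 GB-s free tier
cost = BILLABLE * 0.00001667            # ~$18.34
print(SECONDS, GB_SECONDS, round(cost, 2))
```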
I don't think the timeout is related to the caching. When your function returns, it is over. The caching is not transparent; I don't know how it works under the hood, but it seems that if you just call it every 10 minutes it stays hot.
Would love an AWS engineer to shed some light on this, though.
If you have enough traffic, you will wind up with multiple containers in use at once, so a call to the general pool running your Lambda function will most likely keep only one of those containers from being recycled.
It looks to me like you're misunderstanding the recommendation here, which is to hit a fast endpoint to ensure that Lambda keeps your function somewhere in its caches. This is similar to what one might do with Heroku.
The idea is that actually distributing the code to a front-line server seems to add to boot time, so if the Lambda is not warm, it will be a little slower. If you keep it warm with at least occasional traffic headed to the server, you avoid this penalty.
If you look above, you'll see someone else did the math for you, and you're actually only talking about 864s of execution time, not 3M.
It also sort of looks like you just pulled up the pricing example (which describes 3M requests taking 1s each) and worked back from it, because when you do the math you describe, it works out to ~2.5M.
To head off a few questions at the pass:
Here are the hacks necessary to make this work: https://github.com/Miserlou/Zappa#hacks
Here's how to avoid the cold-start problem: https://github.com/Miserlou/django-zappa#keeping-the-server-...