I've seen 30s on AWS, so it's not that surprising. They have now improved it greatly though.
And yet I still believe it's a great technology, as always it's a matter of putting it on the right use case. Message consumption from a queue or topic, low traffic and low criticality API are two great use cases.
No, you've seen 30s on a random implementation running on AWS.
If you write your lambdas without knowing what you're doing then you can't blame the technology for being grossly misused by you.
Case in point: developing AWS Lambdas for the JDK runtime. The bulk of the startup time is not the Lambda itself but the way the Lambda code is initialized: clients, auth, and the like. I've worked on JDK Lambdas where cold starts were close to 20s due to the choice of dependency injection framework (Guice, the bane of JDK Lambdas, since it wires dependencies reflectively at runtime), which was shaved down to half that number simply by migrating to Dagger, which generates the wiring at compile time. I've worked on other JDK Lambdas that saw similar reductions in cold starts just by paying attention to how a Redis client was configured.
Just keep in mind that cold start time is mostly the time it takes your own code to initialize. It isn't the Lambda; it's your code. If you tell the Lambda to needlessly run a lot of crap at startup, you can't blame the Lambda for doing exactly what you told it to do.
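To make that concrete, here's a minimal Java sketch of the pattern. The names (`Handler`, `ExpensiveClient`) are made up for illustration; a real function would implement `RequestHandler` from `aws-lambda-java-core`, but the init-timing point is the same:

```java
// Sketch: where your code runs relative to a Lambda cold start.
// Hypothetical names; in a real function this class would implement
// com.amazonaws.services.lambda.runtime.RequestHandler (aws-lambda-java-core).
public class Handler {

    // Static and field initializers run once, during the cold start.
    // Heavy work here (DI container bootstrap, eagerly built clients)
    // is exactly what inflates the cold-start number.
    private ExpensiveClient client; // not created yet

    // Lazy alternative: defer expensive setup until first use, so the
    // cold start only pays for what the first request actually needs.
    private ExpensiveClient client() {
        if (client == null) {
            client = new ExpensiveClient(); // built on first invocation, not at init
        }
        return client;
    }

    public String handleRequest(String input) {
        return client().call(input);
    }

    // Stand-in for a Redis/DB client that is costly to construct.
    static class ExpensiveClient {
        String call(String in) {
            return "ok:" + in;
        }
    }
}
```

Whether the eager or lazy version is right depends on the use case, but either way the time lives in your code, not in Lambda's machinery.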
That sort of time could easily be reached with Lambdas that require VPC access, where a new ENI needed to be provisioned for each Lambda container. I don't think I've seen 30s, but I could easily see 5-10s in this case. And since VPC access is required to reach a self-hosted DB rather than another AWS service, it isn't that uncommon. I believe they have since improved start times in this scenario significantly.
And yet it magically came down to 10s when Amazon improved their system: specifically, joining a VPC became much faster.
And don't get me wrong: yes, I was running some init code, but not that much: loading config from SSM and connecting to a DB. I did bundle a lot of libs that didn't need to be there. But:
- fact is, it took 30s
- the use case didn't need it to be faster so I didn't care much
30s is probably an edge case. Did this use the Java/JVM runtime without AOT/GraalVM? I can't imagine any other runtime causing a 30s cold start. Care to share more details?