Not necessarily. For low-frequency workloads with reasonably long step times, Lambda can still make sense. (E.g. when videos appear in this S3 bucket, process them.)
You might only drop videos in once a week, but when you do you want to run some code against them. There are plenty of distributed-workflow reasons to run long-running Lambdas infrequently rather than spinning an EC2 instance up and down.
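The "when videos appear in this bucket, process them" setup above is just S3 event notifications feeding a Lambda. A minimal sketch of the handler side, assuming the standard S3 event shape (the `process_video` step is hypothetical and left as a comment):

```python
def handler(event, context):
    """Lambda entry point for an S3 ObjectCreated notification."""
    results = []
    # Each record in an S3 event names the bucket and object key
    # that triggered this invocation.
    for record in event.get("Records", []):
        bucket = record["s3"]["bucket"]["name"]
        key = record["s3"]["object"]["key"]
        # process_video(bucket, key) would go here; for the sketch
        # we just collect the references.
        results.append((bucket, key))
    return results
```

Wiring the bucket notification to the function is done in the S3/Lambda configuration, not in code.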
Lambdas are underpowered and often a poor choice for compute-heavy workloads. Unless there's urgency in processing the infrequent videos, it might make more sense to let messages back up in a queue and use spot instances to drain the queue and process the videos, especially from a cost perspective. Though I acknowledge that this is a more complex setup.
As qvrjuec mentioned in a sibling comment, hardware is limited. I seem to remember CPU speeds being listed alongside available memory for AWS Lambda, but the pricing page seems to just list memory now[0]. At the highest end, you're still limited to ~10.2 GB of memory, which is considerably lower than what's available via EC2. And while I have no personal experience with the finer-grained EC2 pricing that was announced[1], it sounds like that may be a better fit for the scenario described above. We can nitpick on these architectural details, but my response was largely that there are other architectural alternatives that could be more ideal; especially in response to a comment that seems to dismiss the value of pricing at finer time intervals.
> We can nitpick on these architectural details, but my response was largely that there are other architectural alternatives that could be more ideal; especially in response to a comment that seems to dismiss the value of pricing at finer time intervals.
Not trying to nitpick anything; just curious what was meant by "underpowered". Seems like there's still a breadth of compute-intensive use cases that are more appropriate for Lambda, e.g. where cost is more sensitive than latency and the request volume is too low for a dedicated EC2 instance to make economic sense. This has been where I've spent most of my career, but no doubt there are many use cases where this doesn't hold.
Limitations on hardware one can run a lambda function on and constraints on execution time mean they are "underpowered" compared to other options, like ECS Fargate tasks.
Does Fargate allow you to run on beefier hardware? I know you can bring your own hardware with vanilla ECS. I’m aware of the execution time constraints (15 minutes), but I thought we were talking about 60s?
Right, but the first thought I had was: couldn't you fan out and run a Lambda for each frame or a group of related frames? (E.g. batch HLS processing would be really easy!) If so, you're back to short Lambdas again. It's really the sweet spot for using Lambda after all: a big job can often be broken down into lots of little jobs, etc.
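To make the fan-out concrete, here's a rough sketch: split the frame range into chunks and dispatch one payload per chunk. The `invoke` callable is a stand-in for something like a boto3 `lambda.invoke` call with `InvocationType="Event"`; it's injected here so the chunking logic stays visible:

```python
import json

def chunk_frames(total_frames, chunk_size):
    """Split [0, total_frames) into contiguous (start, end) ranges."""
    return [(start, min(start + chunk_size, total_frames))
            for start in range(0, total_frames, chunk_size)]

def fan_out(total_frames, chunk_size, invoke):
    # `invoke` stands in for an async Lambda invocation; each worker
    # Lambda would process only its own frame range.
    for start, end in chunk_frames(total_frames, chunk_size):
        invoke(json.dumps({"start": start, "end": end}))
```

Each worker invocation then stays comfortably short, which is the sweet spot being described.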
Plausibly. But that might be more effort than just writing the code to ingest a video file (or some other big data blob) in the simplest, most straightforward way possible.
Lambda has a 15-minute limit. I'm not sure exactly how it compares to EC2, but for a low-duty-cycle application it still makes sense! It is also pretty easy to connect a Lambda to SNS or SQS.
Instead of processing 10 messages off of SQS per lambda we process 10, then start polling for more using the same lambda, and don't stop until the lambda is just about to die.
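That "poll until just about to die" loop can be sketched with the Lambda context object's real `get_remaining_time_in_millis()` method; the `queue.receive(n)` interface below is a hypothetical stand-in for an SQS `receive_message` call:

```python
SAFETY_MARGIN_MS = 30_000  # stop polling well before the hard timeout

def drain(queue, context, batch_size=10):
    """Keep pulling batches from `queue` until the function is about
    to hit its timeout, then let the invocation end cleanly."""
    processed = 0
    while context.get_remaining_time_in_millis() > SAFETY_MARGIN_MS:
        batch = queue.receive(batch_size)
        if not batch:
            break  # queue drained, nothing left to do
        for msg in batch:
            # handle the message, then delete it from the queue
            processed += 1
    return processed
```

The safety margin matters: if a batch is mid-flight when the timeout hits, those messages just reappear after their visibility timeout, but you've wasted the work.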
Forgive me if this is naive, but why not trigger a Lambda for each message separately? I think AWS will automatically reuse warm Lambda instances instead of spinning them down.