Apache Spark can use Kubernetes as a scheduler out of the box, though I don't know if OP is using Spark.
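For anyone curious what that looks like in practice, here is a rough sketch of a `spark-submit` invocation pointed at a Kubernetes cluster. The `k8s://` master prefix, `--deploy-mode cluster`, `spark.executor.instances` and `spark.kubernetes.container.image` come from Spark's Kubernetes support. The helper name `spark_submit_command`, the API server URL, the image name and the application path are made-up placeholders, so swap in your own.

```python
import shlex


def spark_submit_command(api_server, image, app, executors=2, name="example-job"):
    """Build a spark-submit argv that targets a Kubernetes cluster.

    Hypothetical helper for illustration; it only assembles the command
    and does not run it.
    """
    return [
        "spark-submit",
        # The k8s:// prefix tells Spark to use Kubernetes as the scheduler.
        "--master", f"k8s://{api_server}",
        # In cluster mode the driver itself runs in a pod.
        "--deploy-mode", "cluster",
        "--name", name,
        "--conf", f"spark.executor.instances={executors}",
        "--conf", f"spark.kubernetes.container.image={image}",
        app,
    ]


# All values below are placeholders, not real endpoints or images.
cmd = spark_submit_command(
    api_server="https://kubernetes.example.com:6443",
    image="registry.example.com/spark:latest",
    # local:// means the file is already inside the container image.
    app="local:///opt/app/job.py",
)
print(shlex.join(cmd))
```

The same command should work against GKE, EKS or a local cluster; only the API server address and the image registry change.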
A lot of data tools are starting to target Kubernetes directly as a runtime, so running them on GKE/EKS is a bit simpler: it's officially supported, and you can run the same setup locally and in the cloud with no vendor lock-in.
ECS in a scaling group works well if your app is stateless, but as soon as you need to scale workers dynamically, do service discovery, or handle orchestration, you end up building some of the features Kubernetes already provides.