First, K8S doesn't force anyone to use YAML. It might be idiomatic, but it's certainly not required. `kubectl apply` has supported JSON since the beginning, IIRC. The API endpoints themselves speak JSON and gRPC, and you can produce JSON or YAML from whatever language you prefer. Jsonnet is quite nice, for example.
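As a concrete sketch (the resource name is just an example), the same manifest people usually write in YAML can be fed to `kubectl apply` as JSON:

```
# Apply a manifest written in JSON rather than YAML.
kubectl apply -f - <<'EOF'
{
  "apiVersion": "v1",
  "kind": "ConfigMap",
  "metadata": { "name": "demo-config" },
  "data": { "greeting": "hello" }
}
EOF
```

YAML is a superset of JSON anyway, so any tool that emits JSON (Jsonnet, or just your language's standard library) produces valid input.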
Second, I'm curious as to why dependencies are a thing in Helm charts and why dependency ordering is being advocated, as though we're still living in a world of dependency ordering and service-start blocking on Linux or Windows. One of the primary idioms in Kubernetes is looping: if the dependency's not available, your app is supposed to treat that as a recoverable error and try again until the dependency becomes available. Or, crash, in which case the ReplicaSet controller will restart the app for you.
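A minimal sketch of that idiom (the function names are invented for illustration, not any real client library): a startup loop that retries until the dependency answers, instead of encoding ordering anywhere:

```python
import time

def wait_for_dependency(check, retries=30, delay=2.0):
    """Retry `check` until it succeeds or we give up and crash.

    `check` is any callable that raises on failure -- e.g. a ping to a
    database. If the dependency never comes up, we raise, the process
    exits, and the ReplicaSet controller restarts the Pod for us.
    """
    for _ in range(retries):
        try:
            return check()
        except Exception:
            time.sleep(delay)  # dependency not ready yet; try again
    raise RuntimeError("dependency never became available")
```

The point is that the app itself owns the waiting: an unavailable dependency is a recoverable condition, and crashing after exhausting retries is also fine, because the controller restarts the Pod.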
You can't have dependency conflicts in charts if you don't have dependencies (cue "think about it" meme here), and you install each chart separately. Helm does let you install multiple versions of a chart if you must, but woe be unto those who do that in a single namespace.
If an app truly depends on another app, one option is to include the dependency in the same Helm chart! Helm charts have always allowed you to have multiple application and service resources.
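For example (the chart name and file names here are purely illustrative), a single chart can simply ship both workloads side by side, and they get installed together:

```
mychart/
  Chart.yaml
  templates/
    web-deployment.yaml
    web-service.yaml
    db-statefulset.yaml   # the "dependency" lives in the same chart
    db-service.yaml
```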
> One of the primary idioms in Kubernetes is looping
Indeed, working with kubernetes I would argue that the primary architectural feature of kubernetes is the "reconciliation loop". Observe the current state, diff it against the desired state, apply the diff. Over and over again. There is no "fail" or "success" state, only what we observe and what we wish to observe. Any difference between the two is iterated away.
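A toy sketch of that loop (all names invented for illustration): observe, diff, apply, repeat, with no terminal success or failure state:

```python
def reconcile(observe, desired, apply_diff):
    """One pass of a reconciliation loop over dict-shaped state.

    `observe` returns the current state, `desired` is what we want,
    and `apply_diff` receives only the keys that differ.
    """
    current = observe()
    diff = {k: v for k, v in desired.items() if current.get(k) != v}
    if diff:
        apply_diff(diff)
    return diff  # empty dict means current state matches desired

def reconcile_until_stable(observe, desired, apply_diff, max_iters=100):
    """Run passes until converged (a real controller loops forever)."""
    for _ in range(max_iters):
        if not reconcile(observe, desired, apply_diff):
            return True
    return False
```

A real controller never stops at "converged"; it keeps watching, so any later drift is just another diff to iterate away.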
I think it's interesting that the dominant "good enough technology" of mechanical control, the PID feedback loop, is quite analogous to this core component of kubernetes.
PID feedback loop, OODA loop, and blackboard systems (an AI design model) are all valid metaphors that k8s embodies, with the first two being well known enough that they were common in presentations/talks about K8s around 1.0.
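To make the analogy concrete, here's a toy proportional-only controller (just the P of PID; the gain and values are arbitrary) doing the same observe/diff/apply dance against a setpoint:

```python
def p_controller_step(measured, setpoint, gain=0.5):
    """One step of a proportional controller: correction = gain * error."""
    error = setpoint - measured          # the "diff"
    return measured + gain * error       # the "apply"

# Drive a value toward the setpoint, like a thermostat nudging a heater:
# each iteration shrinks the remaining error rather than declaring
# success or failure.
value, setpoint = 15.0, 20.0
for _ in range(20):
    value = p_controller_step(value, setpoint)
```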
What you're describing is a Controller[0]. I love the example they give of a room thermostat.
But the principle applies to other things that aren't controllers. For example a common pattern is a workload which waits for a resource (e.g. a database) to be ready before becoming ready itself. In a webserver Pod, for example, you might wait for the db to become available, then check that the required migrations have been applied, then finally start serving requests.
So you're basically progressing from a "wait for db loop" to a "wait for migrations" loop then a "wait for web requests" loop. The final loop will cause the cluster to consider the Pod "ready" which will then progress the Deployment rollout etc.
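In manifest terms, that final loop is a readinessProbe; a sketch with made-up path and port:

```
# Hypothetical webserver Pod snippet: the kubelet polls this endpoint,
# and the Pod only becomes Ready (gating the Deployment rollout) once
# the app has gotten past its db/migration checks and starts serving.
readinessProbe:
  httpGet:
    path: /healthz    # illustrative path
    port: 8080        # illustrative port
  periodSeconds: 5
```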
i developed a system like this (with a reconciliation loop, as you call it) some years ago. there are most definitely failed states (for multiple reasons), but as part of the loop you can have logic to fix them up in order to bring things to the desired state.
we had integrated monitoring/log analysis to correlate failures with "things that happen"
> Or, crash, in which case, the ReplicaSet controller will restart the app for you.
This does not work well enough. Right now I have an issue where Keycloak takes a minute to start, and a dependent service, which crashes on start without Keycloak, takes like 5-10 minutes to start, because the ReplicaSet controller starts to throttle it and it'll wait minutes for nothing, even after Keycloak has started. Eventually it'll work, but I don't want to wait 10 minutes if I could wait 1 minute.
I think the proper way to solve this is to develop an init container which waits for the dependency to be up before passing control to the main container. But I'd prefer for Kubernetes to allow me to explicitly declare start dependencies. My service WILL crash if that dependency is not up, so what's the point of even trying to start it, just to throttle it a few tries later?
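A sketch of that init-container workaround (service name, port, and image are placeholders for whatever the dependency actually is):

```
# Hypothetical Pod spec fragment: the main containers are not started
# until this init container exits successfully.
initContainers:
  - name: wait-for-keycloak
    image: busybox
    command: ["sh", "-c",
      "until nc -z keycloak 8080; do echo waiting; sleep 2; done"]
```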
A dependency is a dependency. You can't just close your eyes and pretend it doesn't exist.
I’d contend that you’re optimizing for initial deployment speed rather than overall resilience. Backing off with increasing delays before retrying a dependent service call is a best practice for production services. The fact that you’re seeing this behavior on initial rollout is inconvenient, but it’s also self healing. It might take a bit longer than you like, but if you’re really that impatient, there are workarounds like the one you described.
Big +1 to dependency failure should be recoverable.
I was part of an outage caused by a fail-closed behavior on a dependency that wasn't actually used and was being turned down.
Dependencies among servers are almost always soft. Just return a 500 if you can't talk to your downstream dependency. Let your load balancer route around unhealthy servers.
You say "supposed to". That's great when building your own software stack in house, but how much software is available that can run on Kubernetes but was created before it existed? Somebody figured out it could run in Docker, and then later someone realized it's not that hard to make it run in Kubernetes because it already runs in Docker.
You can make an opinionated platform that does things how you think is the best way to do them, and people will do it how they want anyway with bad results. Or you can add the features to make it work multiple ways and let people choose how to use it.
The counter argument is that footguns and attractive nuisances are antithetical to resilience. People will use features incorrectly that they may never have needed in the first place; and every new feature is a new opportunity to introduce bugs and ambiguous behaviors.
> One of the primary idioms in Kubernetes is looping: if the dependency's not available, your app is supposed to treat that is a recoverable error and try again until the dependency becomes available.
This is gonna sound stupid, but people see the initial error in their logs and freak out. Or your division's head sees your demo and says "Absolutely love it. Before I demo it though, get rid of that error". Then what are you gonna do? Or people keep opening support tickets saying "I didn't get any errors when I submitted the deployment, but it's not working. If it wasn't gonna work, why did you accept it"
You either do what one of my colleagues does, add some weird-ass logic of "store error logs and only display them if they fire twice (no three, 4? scratch that, 5 times) with a 3 second delay in between, except for the last one, that can take up to 10 seconds; after that, if this was a network error, sleep for another 2 minutes, and at the very end make sure to have a `logger.Info("test1")`".
Or you say "fuck it" and introduce a dependency order. We know that it's stupid, but...
This sounds like an opportunity to educate your colleagues and introduce a higher level of functionality to your deployment mechanisms. There’s a difference between Kubernetes stating a deployment of a given component is successful and the CI/CD pipeline confirming the entire application’s deployment is successful. The former frequently happens before—sometimes long before—the latter does. If the boss is seeing errors, it’s because the deployment hasn’t finished and someone or something is falsely suggesting to them that it is.
Nearly 1/5th of the article is dedicated to criticizing YAML as the de facto language people use to work with it, and implicitly blaming Kubernetes for this fault.
What? It's one of the most often repeated arguments against Kubernetes. Even in the article that this entire thread is about, YAML is mentioned repeatedly.