
> - once a pod has been scheduled onto a node, k8s never reschedules it anywhere else. The node may be experiencing problems, and thus the pod is affected, but k8s doesn't do anything to heal that -- it could simply delete the pod so it gets rescheduled somewhere else.

Can you please elaborate on this? When you are using replication controllers or deployments, don't they drive the cluster toward the desired/goal state, which is N replicas of a pod? So when a node is shut down, shouldn't they reschedule those dead pods somewhere else to satisfy the goal state?
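
For reference, this is the desired-state contract in question: a Deployment declares a replica count and the controllers keep reconciling toward it, recreating pods that disappear. A minimal sketch using the official Python client (the kubernetes package); the name, labels, image, and namespace here are made up:

    # Sketch: declare "3 replicas of this pod" and let the controllers reconcile.
    # Assumes a reachable cluster via the local kubeconfig.
    from kubernetes import client, config

    config.load_kube_config()

    labels = {"app": "demo"}  # hypothetical labels
    deployment = client.V1Deployment(
        metadata=client.V1ObjectMeta(name="demo"),
        spec=client.V1DeploymentSpec(
            replicas=3,  # the desired/goal state the parent comment mentions
            selector=client.V1LabelSelector(match_labels=labels),
            template=client.V1PodTemplateSpec(
                metadata=client.V1ObjectMeta(labels=labels),
                spec=client.V1PodSpec(
                    containers=[client.V1Container(name="app", image="example/app:1.0")],
                ),
            ),
        ),
    )
    client.AppsV1Api().create_namespaced_deployment(namespace="default", body=deployment)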




You may have misunderstood me. The case I'm talking about is when the node reports Ready, but the pod itself is not functioning properly.

One common issue we have is a pod getting stuck in a restart loop (for whatever reason, including resource starvation). k8s just keeps restarting it for days on that node, instead of simply rescheduling it after X restarts or some other condition.

https://github.com/kubernetes/kubernetes/issues/13385
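
A sketch of the workaround the thread is circling around: delete a controller-managed pod once it has restarted more than some threshold, so its Deployment/ReplicaSet recreates it and the replacement goes through scheduling again (possibly onto another node). This uses the official Python client; the namespace and threshold are made up, and none of this is built-in k8s behavior:

    # Sketch: evict pods that have exceeded a hypothetical restart threshold.
    from kubernetes import client, config

    config.load_kube_config()
    v1 = client.CoreV1Api()

    RESTART_LIMIT = 5  # hypothetical "X restarts" threshold

    for pod in v1.list_namespaced_pod(namespace="default").items:
        if not pod.metadata.owner_references:
            continue  # skip bare pods; nothing would recreate them after deletion
        restarts = sum(s.restart_count for s in (pod.status.container_statuses or []))
        if restarts >= RESTART_LIMIT:
            # Deleting the pod makes its controller create a fresh replica, which is
            # scheduled again and may land on a different node.
            v1.delete_namespaced_pod(name=pod.metadata.name, namespace=pod.metadata.namespace)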


Isn't that what custom liveness probes are for?


No - a liveness probe just restarts the pod if the check fails; it doesn't kill it so it can be rescheduled elsewhere.


I think a failed liveness probe will restart the container, not the whole pod.


You're right. We always have one container per pod, so I hadn't fully thought about this.
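
For concreteness, a liveness probe is declared on a container, and when it fails the kubelet kills and restarts that container in place (subject to the pod's restartPolicy); the pod itself is not rescheduled. A sketch with the Python client; the endpoint, port, image, and thresholds are made up:

    # Sketch: the liveness probe lives on the container spec, not on the pod.
    from kubernetes import client

    probe = client.V1Probe(
        http_get=client.V1HTTPGetAction(path="/healthz", port=8080),  # hypothetical endpoint
        initial_delay_seconds=5,
        period_seconds=10,
        failure_threshold=3,  # after 3 consecutive failures the kubelet kills the container
    )

    container = client.V1Container(
        name="app",
        image="example/app:1.0",  # hypothetical image
        liveness_probe=probe,
    )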


Isn't that because the pod's restartPolicy is set to Always? What if you set it to Never? Does that fail the whole pod?
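
For what it's worth, the field is restartPolicy and its values are Always, OnFailure, and Never. With Never, a container killed by a failed liveness probe is not restarted and the pod ends up in the Failed phase; Deployments only allow Always, so Never/OnFailure really apply to bare pods and Jobs. A sketch with the Python client; the name, image, and namespace are made up:

    # Sketch: restartPolicy is set per pod, not per container.
    from kubernetes import client, config

    config.load_kube_config()

    pod = client.V1Pod(
        metadata=client.V1ObjectMeta(name="one-shot"),  # hypothetical name
        spec=client.V1PodSpec(
            restart_policy="Never",  # default is "Always"
            containers=[client.V1Container(name="app", image="example/app:1.0")],
        ),
    )
    client.CoreV1Api().create_namespaced_pod(namespace="default", body=pod)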



