Hello everyone,
After the manual scaling task, the pods were stuck in Pending status.
After the autoscaling task, the pods were stuck in Pending status as well.
Even in the tasks where the DeploymentConfig had already been prepared, the pods just died (timeout).
So you can't tell what happened or what to do about it in the exam.
When I checked the events, I found an error for the pods like this (sorry, but I can't remember the exact wording):
0/5 nodes are available, 2 nodes taint {node.worker} that can not tolerate, 3 nodes taint {node-role.kubernetes.io/master} that can not tolerate.
What should I do about this? I'm sure I won't pass the exam if I can't solve this problem.
I have taken DO280, and I don't think I saw this problem in the course.
Regards.
Hello shcmzzj
I had exactly the same problem during my exam. I also didn't pass, and I think this "tolerations" issue contributed.
After revisiting the "taints and tolerations" topic, I can see now that you either need to modify the DeploymentConfig to include the correct tolerations or remove the taints from the nodes.
Which one is correct? I still don't know.
Thank you, dgaona77
I checked the docs on "taints and tolerations" too, and I think both approaches can resolve this problem.
I also discussed this with a colleague who passed this exam. His choice was to modify all the DeploymentConfigs.
I think I will use the same solution as him in my next exam, though I think removing the taints would be faster.
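To see why either approach would clear the scheduler message quoted earlier, here is a minimal Python sketch of how taints and tolerations are matched. It is only a model of the matching rules, not the scheduler's code. The `node.worker` key is the one recalled from memory in the first post, and the `NoSchedule` effect on the worker nodes is an assumption.

```python
def tolerates(toleration, taint):
    """Return True if one toleration matches one taint."""
    # An empty effect on the toleration matches every effect.
    if toleration.get("effect") and toleration["effect"] != taint["effect"]:
        return False
    # An empty key (only valid with operator Exists) matches every key.
    if toleration.get("key") and toleration["key"] != taint["key"]:
        return False
    if toleration.get("operator", "Equal") == "Exists":
        return True
    return toleration.get("value", "") == taint.get("value", "")


def schedulable(pod_tolerations, node_taints):
    """A node is usable if every NoSchedule/NoExecute taint is tolerated."""
    hard = [t for t in node_taints if t["effect"] in ("NoSchedule", "NoExecute")]
    return all(any(tolerates(tol, t) for tol in pod_tolerations) for t in hard)


def count_available(pod_tolerations, nodes):
    """Count the nodes whose taints the pod tolerates."""
    return sum(schedulable(pod_tolerations, taints) for taints in nodes)


master = {"key": "node-role.kubernetes.io/master", "effect": "NoSchedule"}
worker = {"key": "node.worker", "effect": "NoSchedule"}  # key and effect assumed
tainted = [[master]] * 3 + [[worker]] * 2
untainted = [[master]] * 3 + [[]] * 2  # worker taints removed

worker_tol = {"key": "node.worker", "operator": "Exists", "effect": "NoSchedule"}

print(count_available([], tainted))            # pod without tolerations
print(count_available([worker_tol], tainted))  # toleration added to the pod
print(count_available([], untainted))          # taints removed from workers
```

In this model the first case finds no usable node, while the other two each find the two worker nodes. On a real cluster, removing a taint would look something like `oc adm taint nodes <node> <key>:NoSchedule-`, with the key and effect taken from `oc describe node <node>`.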
Tolerations are defined at the pod level, not at the dc or deployment level. If you add a toleration to a pod and then try to scale up, the new replicas won't have the same toleration, so they will stay in Pending state.
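One way to read this: a toleration added to a single running pod is not copied to new replicas, whereas a toleration placed in the pod template of the DeploymentConfig (`spec.template.spec.tolerations`) is applied to every pod created from it. The sketch below only builds such a patch body as JSON. The helper name `toleration_patch` and the key `example-key` are made up for illustration.

```python
import json


def toleration_patch(key, effect, value=None):
    """Build a patch body that sets one toleration on a pod template."""
    toleration = {"key": key, "effect": effect}
    if value is None:
        # Tolerate the taint regardless of its value.
        toleration["operator"] = "Exists"
    else:
        toleration["operator"] = "Equal"
        toleration["value"] = value
    # The toleration goes under the pod template, not on a running pod.
    return {"spec": {"template": {"spec": {"tolerations": [toleration]}}}}


patch = toleration_patch("example-key", "NoSchedule")  # hypothetical key
print(json.dumps(patch, indent=2))
```

The printed JSON could then be passed to something like `oc patch dc/<name> --type=merge -p '<json>'`. Note that a merge patch replaces the whole `tolerations` list, so any existing entries would need to be included as well.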
Same thing here. I have read DO280, and there was nothing useful in it (( Only one line in the theory part of the Pod Scheduling chapter, saying that taints and tolerations exist in OpenShift.
But the exam prerequisites said that DO280 is the course for this exam. I can't understand how this could happen. Maybe it was "a challenge"?
But what should you do if, for example, the task conditions don't allow you to change the dc?
Then there is only one way: remove the taints.
See chapter 6, Controlling Pod Placement. Besides taints and tolerations, node labels can also leave a pod in Pending status until the node label and the pod's node selector match.
@joseph_joy you are right about node labeling, it can also be a likely cause of the pod being stuck in a pending state.
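The node label point can be illustrated with a small sketch: a pod with a `nodeSelector` can only land on a node that carries every label in that selector. The label names and values below are hypothetical.

```python
def selector_matches(node_selector, node_labels):
    """True if every key/value pair in the pod's nodeSelector is on the node."""
    return all(node_labels.get(k) == v for k, v in node_selector.items())


# Hypothetical labels on a worker node.
node_labels = {"kubernetes.io/hostname": "worker-1", "env": "dev"}

print(selector_matches({"env": "prod"}, node_labels))  # selector asks for env=prod
print(selector_matches({"env": "dev"}, node_labels))   # selector asks for env=dev
```

If the first case describes the cluster, either the selector in the pod template or the label on the node has to change, for example with something like `oc label node <node> env=prod --overwrite`.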
@shcmzzj I recently failed the exam and I had similar questions. I did change the label on the node to match that of the DeploymentConfig. I also added the toleration in the DeploymentConfig since the nodes were tainted. However, I had an error relating to tolerationSeconds because the taint on the node was NoSchedule. I found an article on this here -> https://medium.com/kubernetes-tutorials/making-sense-of-taints-and-tolerations-in-kubernetes-446e750...
I think the easier thing to do is to remove the taint on the node. I don't know whether this will have an impact on the subsequent questions in the exam
tolerationSeconds is valid only with effect=NoExecute. If your toleration has effect=NoSchedule, you can't have tolerationSeconds in it (only key, operator, value, effect).
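As a rough pre-check before editing the DeploymentConfig, the rule above can be written down as a small Python function. This is only a sketch of two validation rules (the `tolerationSeconds` one from this thread, plus the rule that `value` must be empty with `operator: Exists`), not the full API validation. The function name and the key `example-key` are hypothetical.

```python
def validate_toleration(tol):
    """Return a list of problems found in a toleration dict (empty if none)."""
    problems = []
    # tolerationSeconds only makes sense for taints that evict running pods.
    if "tolerationSeconds" in tol and tol.get("effect") != "NoExecute":
        problems.append("tolerationSeconds requires effect NoExecute")
    # With operator Exists the value is ignored and must be left empty.
    if tol.get("operator", "Equal") == "Exists" and tol.get("value"):
        problems.append("value must be empty when operator is Exists")
    return problems


bad = {"key": "example-key", "operator": "Exists",
       "effect": "NoSchedule", "tolerationSeconds": 3600}
good = {"key": "example-key", "operator": "Exists", "effect": "NoSchedule"}

print(validate_toleration(bad))
print(validate_toleration(good))
```

The first toleration is flagged because it combines `NoSchedule` with `tolerationSeconds`; dropping that field, as in the second one, leaves nothing to report.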
Applications are supposed to run on worker nodes, so I would probably just remove the taints from the worker nodes unless the question tells me not to. I usually don't see any taints on worker nodes in the production clusters I work with.
It would be good to add some taints and tolerations labs to the DO280 training material.