carbajgu
Mission Specialist

openshift / dc / troubleshooting / out of ideas

Hello community!

I just passed the EX280 exam (on OpenShift 4.10). I completed almost everything and had more than an hour left to troubleshoot this particular scenario:

 

The pods in this project were up: one pod was Running and a deploy pod was Completed.

To check whether the pod was working, I tried:

oc rsh pod/thepod-0001 curl http://localhost:8080

oc rsh pod/thepod-0001 curl http://xx.xx.xx.xx:8080

I got a message like this:

ERRO[0000] exec failed: xxxxxxxxx:XXX: starting container process caused "not recognize curl: executable file not found in $PATH"

Looks like curl was not in the container. So I tried to open a session in this pod with: 

oc rsh pod/thepod-0001

But I got a message like this:

ERRO[0000] exec failed: xxxxxxxxx:XXX:XXX: starting container process caused "not recognize sh: executable file not found in $PATH"

So it looks like the container can't find sh (or bash), as if it can't reach /bin or /usr/bin, where these commands live. I checked the logs with oc logs thepod-0001 and there were no error messages; I just saw two lines that said something like it connected to "port 8080" and "port 8443".

I tried to debug with:

oc debug dc/thepod --image registry.access.redhat.com/ubi8/ubi

oc debug dc/thepod --image registry.redhat.io/ubi8/ubi

oc debug pod/thepod-0001 --image registry.access.redhat.com/ubi8/ubi

but the prompt never came back; the command just froze, so I had to cancel it.
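When oc debug waits without returning a prompt, one thing that can be checked from a second terminal is whether the debug pod was created at all and what the project events say. The following is only a minimal sketch under assumptions: debug_pod_status is a hypothetical helper name (not an oc subcommand), and the project name is whatever project the dc lives in.

```shell
#!/bin/sh
# Sketch: look at pod state and recent events while `oc debug` is still waiting.
# debug_pod_status is a hypothetical helper, not part of the oc CLI.

debug_pod_status() {
    project="$1"
    # List all pods in the project; a debug pod, if one was created, shows up here
    # together with its status and the node it was scheduled on.
    oc get pods -n "$project" -o wide
    # List events with the most recent last, to spot image pull,
    # scheduling, or quota problems.
    oc get events -n "$project" --sort-by=.lastTimestamp
}
```

Calling debug_pod_status with the project name prints both listings; a debug pod stuck in Pending or ImagePullBackOff would point at scheduling or the registry rather than at the application image.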

I bounced the pods:

   oc delete pods --all

   oc rollout latest dc/thepod

Everything stayed the same. I also compared the whole definition with a dc from another project:

oc get dc/thepod -o yaml > ./dc_proj1.yaml

oc get dc/otherpod -o yaml -n otherproject > ./dc_proj2.yaml

vim -d ./dc_proj1.yaml ./dc_proj2.yaml

Both definitions were the same (apart from the obvious differences). They also declared the same image, so I didn't understand what was wrong.
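Two dc definitions can name the same image reference while the running pods use different image digests, so comparing what the pods actually resolved may be worth a look. This is a minimal sketch, not a confirmed fix for this scenario: pod_image_ids is a hypothetical helper name, and the project names passed to it are placeholders.

```shell
#!/bin/sh
# Sketch: print each pod's name and the image ID(s) its containers are running.
# pod_image_ids is a hypothetical helper, not part of the oc CLI.

pod_image_ids() {
    project="$1"
    # .status.containerStatuses[*].imageID holds the resolved image digest,
    # which can differ between pods even when the dc specs name the same image.
    oc get pods -n "$project" \
        -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.status.containerStatuses[*].imageID}{"\n"}{end}'
}
```

Running pod_image_ids once per project (for example, the current project and otherproject) and comparing the digests shows whether both pods really run the same image content.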

Besides that, I ran oc get pods -o wide and moved the pod to another node (with nodeSelector). The nodes looked OK (Ready), and since the pods were Completed/Running I didn't think this could be the issue.

I also checked the cm and its declaration in the dc; all was OK.

I googled this kind of error, and I only found results about Dockerfile error messages.

 

What else should I have checked? I tried everything I knew about troubleshooting.

2 Replies
jflores
Cadet

Hello

I also got the same error.

I haven't found any solution.

carbajgu
Mission Specialist

Hi jflores!

So, I am not alone! Are you able to reproduce the error? Do you have the oc new-app command that triggers this issue? It would be great if the whole community could troubleshoot your cluster. In my case this happened on OCP 4.10. I would appreciate it if you could share more information about your case.
