TudorRaduta
Community Manager

Wednesday Challenge: The Process That Won't Die

Handling Zombie Processes 

Happy Wednesday, everyone.

Whether you are just starting your Linux journey or already working in production, this one is for you. Today we look at a classic situation that appears in interviews and on real servers.

Most of us learn that kill -9 is the "ultimate" way to stop a process. But what if you use it correctly and the process still refuses to disappear?

This challenge helps you understand how Linux processes really work, which is a key part of the "Operate running systems" objective.

The Scenario:

You are troubleshooting a legacy application. You run top and notice a process named legacy_app_worker.

You decide to stop it:

[root@server ~]# kill -9 4055

You check again with ps aux | grep 4055, but the process is still there:

root  4055  0.0  0.0  0  0 ?  Z  10:00  0:00 [legacy_app_worker] <defunct>

You try kill -9 again. No error, no change. The entry refuses to go away.
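If you want to reproduce the scenario on a test machine, here is a minimal sketch. It assumes Linux with bash and the procps version of ps; the stand-in parent (`sleep 30`) is only an illustration, not the legacy application from the scenario.

```shell
#!/usr/bin/env bash
# Create a short-lived zombie for practice.
# The subshell starts `sleep 1` in the background and then exec()s into
# `sleep 30`, which becomes the parent but never calls wait().
(sleep 1 & exec sleep 30) &
parent=$!

sleep 2   # give the child time to exit and become a zombie

# Find the defunct child of our stand-in parent and show it.
zombie=$(ps -o pid= --ppid "$parent" | tr -d ' ')
ps -o pid,ppid,state,cmd -p "$zombie"

# SIGKILL is accepted without error, but the entry stays in state Z.
kill -9 "$zombie"
state=$(ps -o state= -p "$zombie" | tr -d ' ')
echo "state after kill -9: $state"

# Clean up: once the parent exits, the zombie is re-parented
# (normally to PID 1) and reaped.
kill "$parent"
```

The zombie only lives as long as the stand-in parent, so nothing is left behind after the last line.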

Your Challenge:

You are looking at a zombie process. Your task is to explain it and clean it up.

  1. The Theory: In your own words, why did kill -9 not work here? (Hint: can you kill something that is already dead?)
  2. The Diagnosis: You cannot kill the zombie directly, but you can find its parent. What exact ps command would you run to show the PID, PPID, state, and command for process 4055 so you can see its parent process ID clearly?
  3. The Fix: After you find the parent (for example PID 4001), what is your next step to remove the zombie from the process table without rebooting the server?
  • Bonus Question: Zombie processes show 0.0 CPU and memory usage, but they still cause problems in real systems. Why do we care about cleaning them up?

If you are preparing for an exam, this is a great small challenge to understand processes beyond the basic "kill the PID" approach.

Let us see how you would explain and fix this. Post your answers below.

Ad_astra
Flight Engineer

1) The kill -9 command did not work because the process is already 'dead'. The -9 or SIGKILL signal has no effect here. The process has already been removed from memory and the signal does not get processed. This is often a sign that the parent application has not handled the child process correctly. 

2) The command ps -O ppid= 4055 will show the process id, the parent process id, the process state and the command that launched the process.

3) To remove the zombie process (without a reboot) we have two options. We can try sending the SIGCHLD signal to the parent process with kill -s SIGCHLD <parent process id>, which only helps if the parent responds by calling wait(). If that fails, then we can kill the parent process; the zombie is then adopted by PID 1, which reaps it.
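The two options above can be sketched as one small script. This is only an illustration: the function names (`parent_of`, `is_zombie`, `clear_zombie`) are hypothetical, the one-second pauses are arbitrary, and it assumes bash with the procps version of ps.

```shell
#!/usr/bin/env bash
# Hypothetical helpers; the names are illustrative only.

# Print the parent PID of a process (empty if the PID no longer exists).
parent_of() {
    ps -o ppid= -p "$1" | tr -d ' '
}

# Succeed if the PID is still present and in zombie (Z) state.
is_zombie() {
    [ "$(ps -o state= -p "$1" | tr -d ' ')" = "Z" ]
}

# Option 1: ask the parent to reap. Option 2 (only if needed): stop the parent.
clear_zombie() {
    local zombie=$1 parent
    parent=$(parent_of "$zombie")
    [ -n "$parent" ] || { echo "PID $zombie not found"; return 0; }
    if [ "$parent" -le 1 ]; then
        echo "refusing to signal PID $parent"
        return 1
    fi

    kill -s CHLD "$parent"      # gentle: helps only if the parent calls wait()
    sleep 1
    is_zombie "$zombie" || { echo "reaped after SIGCHLD"; return 0; }

    echo "still a zombie; sending SIGTERM to parent $parent"
    kill -s TERM "$parent"      # disruptive: this stops the parent itself
    sleep 1
    is_zombie "$zombie" && return 1
    echo "reaped after parent exit"
}
```

With the scenario's numbers you would run `clear_zombie 4055`. On a production box, check what the parent actually is before letting the script reach the SIGTERM step.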

Bonus:

Whilst zombie processes are shown as not using any memory, they still use a small amount of kernel memory for their process table entry. Each one also holds a process ID (PID), which counts against the number of PIDs the OS can use. As the number of zombie processes increases, this can affect the operating system's ability to create new processes.

Roshani_A
Cadet

A zombie process is already dead.
It has finished execution, but its parent has not collected its exit status yet.

kill -9 sends SIGKILL to force-kill a process,
but a zombie cannot be killed, because it has already terminated.
Only its entry in the process table remains.
  • 78 Views

Just a typo in the answer: "amount" is spelled with a single m.

P.S. A long time ago, Midnight Commander caused thousands of zombie processes on one of our community (student) servers.

  • 51 Views

Why kill -9 didn’t work:
A zombie process is already dead. Only its entry is left in the process table, so signals like kill -9 cannot stop it.

Find the parent process:
ps -o pid,ppid,state,cmd -p 4055

Fix:
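Find the parent PID (for example 4001). First nudge the parent to reap its child with SIGCHLD. If the zombie is still there, restart or kill the parent; the zombie is then adopted by PID 1 (init/systemd), which reaps it:
kill -SIGCHLD 4001
kill 4001
(run the second command only if the zombie remains)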

Why clean zombies:
Too many zombie processes fill the process table and use up available PIDs, which can stop new programs from starting. They cause system problems even though they use no CPU and almost no memory.
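To get a feel for how close a system is to that point, a rough check could look like the sketch below. It assumes Linux with the procps version of ps and a readable /proc/sys/kernel/pid_max; per-user limits (ulimit -u) can be hit earlier and are not checked here.

```shell
#!/usr/bin/env bash
# Count zombie entries and compare with the kernel's PID ceiling.
zombies=$(ps -eo state= | grep -c 'Z')
pid_max=$(cat /proc/sys/kernel/pid_max)
echo "zombies: $zombies (kernel.pid_max: $pid_max)"

# Group zombies by parent PID to see which program is failing to reap.
ps -eo ppid=,state= |
    awk '$2 == "Z" { n[$1]++ }
         END { for (p in n) print n[p], "zombie(s) under parent PID", p }'
```

A handful of zombies is usually harmless; a count that keeps growing under the same parent PID points at a bug in that parent.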
