When a parent process dies...
Daniel Ramsbrock
2005-09-20 01:21:23 UTC
"Also consider what should happen with a killed process's child
processes: Their parent pointer is now invalid, and so they should be
adjusted accordingly."

Obviously, the owner pointer of each child needs to be reset, but do we
also decrement the refCount by one? It would seem not, since that would
imply that the child won't become a zombie when it exits in the case
where the parent is killed while the child is still processing.

Thanks,

Daniel
cs412017
2005-09-20 01:57:55 UTC
1. This was also one of the questions that I had. The problem with not
decrementing the refCount of the child processes is that the parent's
reference is then never released. Hence, the refCount will always be 1
for these child processes after they finish, and they will never be
reaped.

On another note, suppose a child of the parent being killed has already
finished, i.e., alive = false and refCount = 1. If we manually decrement
the refCount of this child when the parent gets killed (instead of
calling Detach_Thread on it), the child will never get reaped either:
its refCount is now zero, but nothing has handed it to the reaper. This
led me to the conclusion that Detach_Thread should be called for all
child processes when their parent gets killed, along with setting owner
to 0. Is this conclusion correct?

2. Another question that I had was whether we have to wake up the join
queue of the process being killed. The project spec. states:

"The semantics are that the given process is killed immediately."

If we wake up the join queue of the process to be killed, then the
parent of the process (which was in the join queue) would call
Detach_Thread on the process.
This is fine in a general setting. However, since the process is now
being killed, if it gets removed before the parent calls Detach_Thread,
this could lead to problems (kthread would not exist when
Detach_Thread(kthread) is called from the parent). This led me to the
conclusion that we have to wake up the join queue of the thread being
killed. But the problem here is that we would now have to wait for the
parent to detach, and only then would this process get killed, so the
statement from the project spec quoted above might not hold true.

So the question is whether we should manually reap (call Reap_Thread)
the process being killed, or wait until the parent detaches and let the
reaping take its course. If we manually reap, the refCounts get out of
sync with Detach_Thread; if we wait until the parent detaches, the
process is not "killed immediately".

I hope I'm on the right track! Any answers to this question?

Thanks.
c***@CSIC.UMD.EDU
2005-09-20 02:26:04 UTC
For 1): you can decrement the refCount of the child and, when decrementing,
check whether the child is alive. If it is not alive, you can manually send
it to Reap_Thread; otherwise do nothing.
Saurabh Srivastava
2005-09-20 15:42:28 UTC
------

The semantics of Detach_Thread dictate that it needs to be called whenever
a reference to a thread is broken. A Sys_Kill request for the parent
'means' that the reference to the child will be broken.

So instead of manually decrementing the refCount, calling Detach_Thread
makes more sense because it will additionally call Reap_Thread, if
required. I.e., if the child process is running (in which case it has a
self reference), it will keep running, which is proper. If, on the other
hand, it has finished execution (the only reference in that case is the
one from the parent), then removing the parent's reference causes the
reaper to get the child. Again proper.
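
The two cases above can be traced with a small user-space model. This is only a sketch under stated assumptions, not the GeekOS source: the struct and the Toy_* functions are hypothetical stand-ins that mimic the behaviour described in this thread (a running thread holds a self reference, the owner holds one more, and dropping the last reference hands the thread to the reaper).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel thread struct; only the fields
 * discussed in this thread are modelled. */
struct Toy_Thread {
    int refCount;              /* self reference + owner reference */
    bool alive;                /* false once the thread has exited */
    bool reaped;               /* toy flag: set when handed to the reaper */
    struct Toy_Thread *owner;  /* parent, or NULL if there is none */
};

/* Toy reaper: a real kernel would queue the thread so its memory gets freed. */
static void Toy_Reap(struct Toy_Thread *t)
{
    t->reaped = true;
}

/* Drop one reference; reap when the last reference goes away. */
static void Toy_Detach(struct Toy_Thread *t)
{
    assert(t->refCount > 0);
    if (--t->refCount == 0)
        Toy_Reap(t);
}

/* The thread exits: it stops being alive and drops its self reference. */
static void Toy_Exit(struct Toy_Thread *t)
{
    t->alive = false;
    Toy_Detach(t);
}

/* The parent is being killed: clear each child's owner pointer and break
 * the parent's reference with Toy_Detach instead of a bare decrement. */
static void Toy_Orphan_Children(struct Toy_Thread *parent,
                                struct Toy_Thread **kids, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (kids[i]->owner == parent) {
            kids[i]->owner = NULL;
            Toy_Detach(kids[i]);
        }
    }
}
```

In this model a child that is still running drops from 2 references to 1 and keeps going, while a child that had already exited drops from 1 to 0 and is reaped on the spot. A bare `refCount--` would reach 0 without ever calling the reaper, which is the leak described earlier in the thread.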

-------

Now consider what you need to do to have proper handling of threads that
are waiting on the process being killed. You cannot ignore the fact that
they need to be woken up; because they are waiting on the join queue for
the process and if they are not woken up then they will have pointers to
invalid memory when the killed process is removed from the system.

Now coming to what we mean by the process being 'killed immediately'. What
we mean by that is _not_ that the process should be completely removed
from the system by end of the call. Such semantics would make it difficult
for the kernel to do its normal bookkeeping of the processes.

Rather, what we really mean is that the process being killed, let's say
Proc_K, should not get a chance to execute further. Suppose Proc_K is at
some point in its execution, a context switch happens, and some other
thread calls Sys_Kill on Proc_K. 'Killed immediately' means that once
that call returns, Proc_K never gets to execute again. It can remain in
the system (or rather, will have to remain in the system) until its
resources are deallocated. The parents can wake up and other bookkeeping
tasks can be done, but the process should never be scheduled for
execution in the future.
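
A rough sketch of that reading of 'killed immediately', again as a user-space toy rather than kernel code. Everything here is an assumption made for illustration: the Toy_* names, the array-based run queue, and the choice to drop the victim's self reference inside the kill routine. The point is only that the victim leaves the run queue and its waiters are released, while the struct itself survives until the remaining references are dropped.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_MAX 8

/* Hypothetical process record for this sketch. */
struct Toy_Proc {
    int pid;
    bool alive;       /* false means: must never be scheduled again */
    int refCount;     /* the record stays allocated while this is > 0 */
    int joinWaiters;  /* number of threads blocked in Wait() on this process */
};

/* Hypothetical run queue: a small unordered array of runnable processes. */
struct Toy_Run_Queue {
    struct Toy_Proc *slot[TOY_MAX];
    size_t count;
};

/* Remove every occurrence of p by swapping in the last entry. */
static void Toy_Remove_From_Run_Queue(struct Toy_Run_Queue *q,
                                      struct Toy_Proc *p)
{
    size_t i = 0;
    while (i < q->count) {
        if (q->slot[i] == p) {
            q->slot[i] = q->slot[q->count - 1];
            q->count--;
        } else {
            i++;
        }
    }
}

/* Peek at the next runnable process, or NULL if the queue is empty. */
static struct Toy_Proc *Toy_Pick_Next(struct Toy_Run_Queue *q)
{
    return q->count > 0 ? q->slot[0] : NULL;
}

/* Kill: make the victim unschedulable, release its waiters, and drop its
 * self reference. Returns how many waiters were woken. The record is not
 * freed here; that happens once the remaining references are detached. */
static int Toy_Kill(struct Toy_Run_Queue *q, struct Toy_Proc *victim)
{
    int woken = victim->joinWaiters;

    victim->alive = false;
    Toy_Remove_From_Run_Queue(q, victim);
    victim->joinWaiters = 0;           /* waiters would be made runnable here */
    assert(victim->refCount > 0);
    victim->refCount--;                /* self reference only */
    return woken;
}
```

After `Toy_Kill` returns, the victim can no longer be picked by the scheduler, yet a parent that was woken from the join queue still finds a valid record to detach from.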

--------

Hope this clarifies the scenario. If not, please send in comments asap.


David Marcin
2005-09-20 15:56:55 UTC
Are you saying then that it is the intended behavior of kill to create
zombie processes with the hope that the parent will be nice and Wait()
on them?

It would then be possible for kill to be called and still have the
processes show up in the process list as zombies indefinitely because
the parent did not call Wait(), correct?

If so, that makes bookkeeping much simpler :-)
Daniel Ramsbrock
2005-09-20 17:10:10 UTC
So does this mean, that out of these two scenarios, only 1. will produce
a zombie?

1. A parent spawns a foreground process but does not Wait() on it. This
will always produce a zombie when the child calls Exit(), correct?
2. A parent (which may or may not be waiting on the child) is killed
while the child is running. Once the child finishes, I previously had it
becoming a zombie because it still had the parent's reference (i.e.
refCount = 1 even after the child called Exit()). However, if we call
Detach_Thread on the child when killing the parent (which decrements
refCount to 1), then the child would no longer be a zombie on exit
(since its self-reference would be removed when calling Exit(), thus
making refCount 0). The child would die a "normal" death on exit,
getting handed over to the Reaper. Is this correct?

Daniel
c***@CSIC.UMD.EDU
2005-09-20 17:35:42 UTC
In 2)

Say a parent spawns a process in the foreground but does not wait on it,
and the child finishes, decrementing its refCount, so the child becomes a
zombie as you pointed out.

However, now consider that you are killing the parent and you call
Detach_Thread on the child (which is a zombie). You will detach it, and
this will remove both the parent and the zombie.

Is that correct?
Saurabh Srivastava
2005-09-20 18:30:20 UTC
| Say a parent spawns a process in foreground but does not wait on it.
| And the child finishes decrementing its refCount so the child will
| become a Zombie as you pointed out.
|
| However, now consider you are killing the parent and you call
| Detach_Thread on the child ( which is a Zombie ) you will detach that,
| this will remove both the parent and the Zombie.

Again, this behaviour would seem logical.

Look at it this way: the only reason we _wanted_ a zombie in the system
was because the kernel could not differentiate between the following two
scenarios:

1 - the parent thread just forgot to call Wait on the fg child.
2 - the parent thread is just lazy and will be calling Wait after 5 hours
of operation, to pick up the child's exitcode.

The kernel has no way of knowing what the user process is doing (or
rather, will be doing), i.e. it cannot differentiate between cases 1 and
2. So it has no option _but_ to keep the child around (as a zombie) till
it has more information. More information could come in the form of the
parent dying. Now the kernel knows that, whichever of the cases (1/2) it
was, there is no point in keeping the child around. The parent has been
killed and cannot ask for the child's exit code anytime in the future.
So yes, we can let go of the child too.

Saurabh Srivastava
2005-09-20 18:22:01 UTC
| So does this mean, that out of these two scenarios, only 1. will produce
| a zombie?
|
| 1. A parent spawns a foreground process but does not Wait() on it. This
| will always produce a zombie when the child calls Exit(), correct?
| 2. A parent (which may or may not be waiting on the child) is killed
| while the child is running. Once the child finishes, I previously had it
| becoming a zombie because it still had the parent's reference (i.e.
| refCount = 1 even after the child called Exit()). However, if we call
| Detach_Thread on the child when killing the parent (which decrements
| refCount to 1), then the child would no longer be a zombie on exit
| (since its self-reference would be removed when calling Exit(), thus
| making refCount 0). The child would die a "normal" death on exit,
| getting handed over to the Reaper. Is this correct?


This is right. Removing the reference held by the parent when the parent
is killed is logical, and so in case 2 no zombie should be created.

If, on the other hand, the user process fails to follow the rules of
spawning a foreground process (i.e. it has to Wait on it), then the kernel
has no way of knowing that it will not do so at some time in the
future, necessitating the creation of a zombie.
c***@CSIC.UMD.EDU
2005-09-20 02:19:04 UTC
You can still have a zombie when a parent process spawns a child process in
the background but does not wait on it, and the child finishes before the parent.

In such a case you would have to use your Sys_Kill system call to kill this zombie
thread to free space.

But yeah, I am not sure whether we should decrement the refCount referred to in your
original post.

I think I would decrement it, as this is a condition I can check and rectify to
avoid zombies.