DragonFly bugs List (threaded) for 2011-05
Re: panic: assertion: p->p_lock == 0 in kern_wait
On Mon, Apr 25, 2011 at 09:13:04PM +0900, YONETANI Tomokazu wrote:
> On Sun, Apr 24, 2011 at 11:36:27AM +0900, YONETANI Tomokazu wrote:
> > > With regards to getting rid of the timeout in the tsleep and using a
> > > proactive wakeup(), we have to avoid calling wakeup() for 1->0
> > > transitions unless someone is known to be waiting on p_lock. This
> > > can be implemented by adding a WAITING flag to the field and using
> > > atomic_cmpset_int() to handle the (WAITING | 1) -> (0) transition and
> > > then calling wakeup() if WAITING was set.
> > >
> > > I will augment the sys/refcount.h API and add refcount_wait() and
> > > refcount_release_wakeup() which encapsulate the appropriate atomic
> > > ops. I will leave it up to you if you want to then use the new API
> > > functions for PHOLD/PRELE, which would give the tsleep case a
> > > proactive wakeup() instead of having to wait for it to timeout.
> >
> > So what I need to do is to change PHOLD/PRELE to use refcount_acquire/
> > refcount_release_wakeup and replace p->p_lock loop with
> > refcount_release_wakeup? I'll give it a try.
>
> I've been running the kernel with patch(es) attached to this message
> and so far it's running fine under load. It reduced the number of
> non-zero p->p_lock just before calling proc_remove_zombie() even without
> holding proc_token around the first wait loop.
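
As an illustration of the WAITING-flag scheme quoted above, here is a minimal
userland sketch that uses C11 atomics in place of the kernel's
atomic_cmpset_int(). The names REF_WAITING, ref_acquire(), ref_mark_waiting()
and ref_release_needs_wakeup() are hypothetical and are not the sys/refcount.h
API; the sketch only decides when a wakeup would be needed and leaves out the
actual sleep and wakeup() calls.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define REF_WAITING 0x80000000u	/* hypothetical flag: a waiter is interested */
#define REF_MASK    0x7fffffffu	/* low bits hold the reference count */

/* Take a reference; a WAITING bit that is already set is left alone. */
void ref_acquire(atomic_uint *cnt)
{
	atomic_fetch_add(cnt, 1);
}

/*
 * Waiter side: set WAITING if references are still held.
 * Returns true if the caller should sleep, false if the count is already 0.
 */
bool ref_mark_waiting(atomic_uint *cnt)
{
	unsigned int old = atomic_load(cnt);

	for (;;) {
		if ((old & REF_MASK) == 0)
			return false;
		/* on failure, old is reloaded with the current value */
		if (atomic_compare_exchange_weak(cnt, &old, old | REF_WAITING))
			return true;
	}
}

/*
 * Drop a reference. Returns true only for the (WAITING | 1) -> 0
 * transition, i.e. when the caller would have to issue a wakeup.
 */
bool ref_release_needs_wakeup(atomic_uint *cnt)
{
	unsigned int old = atomic_load(cnt);

	for (;;) {
		assert((old & REF_MASK) > 0);
		if (old == (REF_WAITING | 1u)) {
			/* last reference and a waiter exists: clear count and flag */
			if (atomic_compare_exchange_weak(cnt, &old, 0u))
				return true;
		} else {
			/* plain decrement; the WAITING bit, if any, is preserved */
			if (atomic_compare_exchange_weak(cnt, &old, old - 1u))
				return false;
		}
	}
}
```

With this shape, a releaser would call wakeup() only when
ref_release_needs_wakeup() returns true, so a 1 -> 0 transition with no
waiter costs no wakeup at all.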

I added a small piece of code to PHOLD/PRELE that records the last p->p_lock
holder in p->p_pad0 (far from perfect, but better than nothing) and found that
it's always sysctl_kern_proc() that calls PHOLD() at a bad time.
My guess is that this happens because it walks through zombproc and PHOLD()s
the processes, some of which are just about to be reaped. So I added the code
in the patch below to skip such processes; the relevant part of kern_wait()
waits for processes whose p->p_nthreads > 0, so I thought it should be fine,
no?

I think I need to wait a few more days before pbulk can spot other
possible bad callers of PHOLD().

Best Regards,
YONETANI Tomokazu.

diff --git a/sys/kern/kern_proc.c b/sys/kern/kern_proc.c
index 6d760e2..942ce6b 100644
--- a/sys/kern/kern_proc.c
+++ b/sys/kern/kern_proc.c
@@ -945,6 +945,11 @@ sysctl_kern_proc(SYSCTL_HANDLER_ARGS)
 	if (!PRISON_CHECK(cr1, p->p_ucred))
 		continue;
+
+	/* don't touch processes about to be reaped */
+	if (p->p_nthreads == 0)
+		continue;
+
 	PHOLD(p);
 	error = sysctl_out_proc(p, req, flags);
 	PRELE(p);
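
The holder-tracking instrumentation mentioned in the message is not shown in
the patch. A rough userland sketch of the idea, under the assumption that it
simply remembers which function last took a hold, might look like the
following. struct fake_proc, PHOLD_TRACED(), PRELE_TRACED(), held_by() and
last_holder_is() are all hypothetical names; p_last_holder stands in for the
role p_pad0 played, and the counter is deliberately non-atomic, so this is
single-threaded illustration only.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Stand-in for struct proc with only the fields the sketch needs. */
struct fake_proc {
	int		p_lock;		/* hold count */
	const char	*p_last_holder;	/* function that last took a hold */
};

/* Take a hold and remember the calling function's name. */
#define PHOLD_TRACED(p) do {				\
	(p)->p_lock++;					\
	(p)->p_last_holder = __func__;			\
} while (0)

/* Drop a hold; the recorded holder is kept for later inspection. */
#define PRELE_TRACED(p) do {				\
	assert((p)->p_lock > 0);			\
	(p)->p_lock--;					\
} while (0)

/* Name of the last holder if the process is still held, else NULL. */
const char *held_by(const struct fake_proc *p)
{
	return (p->p_lock != 0) ? p->p_last_holder : NULL;
}

/* True if the process is still held and 'name' was the last holder. */
int last_holder_is(const struct fake_proc *p, const char *name)
{
	const char *holder = held_by(p);

	return holder != NULL && strcmp(holder, name) == 0;
}

/* Hypothetical caller that takes a hold, as a list walker might. */
void fake_sysctl_walk(struct fake_proc *p)
{
	PHOLD_TRACED(p);
}
```

A check placed where the hold count is expected to be zero could then print
held_by(p) to name the most recent caller of the hold macro.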