DragonFly bugs List (threaded) for 2008-08

Re: panic: assertion: layer2->zone == zone in hammer_blockmap_free

From:	Matthew Dillon <dillon@xxxxxxxxxxxxxxxxxxxx>
Date:	Sat, 2 Aug 2008 20:24:56 -0700 (PDT)

:Matt, you sent me two other messages privately, but I think this message
:covers what you asked me in them. apollo doesn't like my IP address, so
:I need to configure my mail to go through ISP to do so (and I haven't, yet).

Oh, it's the .ppp. in the reverse DNS. I need to change over to spamcop
or something.

:$ ls -l /HAMMER
:total 0
:lrwxr-xr-x 1 root wheel 26 Jul 19 15:09 obj -> @@0xffffffffffffffff:00001
:lrwxr-xr-x 1 root wheel 26 Jul 19 16:24 slave -> @@0xffffffffffffffff:00002
:
:/HAMMER is the only HAMMER filesystem on this machine and is mounted without
:nohistory flags.
:
:...
:It experienced two types of major crashes until now: the first one was
:triggered by an attempt of cross-device link in the middle of July.
:The other was triggered by network code (reused socket on connect).
:According to /var/log/messages, the recovery was run only once, though.
:
: Jul 19 11:34:52 firebolt kernel: HAMMER(HAMMER) Start Recovery 30000000002c7350 - 30000000002c93f0 (8352 bytes of UNDO)(RW)
: Jul 19 11:34:53 firebolt kernel: HAMMER(HAMMER) End Recovery
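
As a quick sanity check on the recovery message quoted above, the two hexadecimal offsets can be subtracted to confirm the reported UNDO size. The sketch below does only that arithmetic. The zone extraction at the end rests on an assumption that is not stated in this thread: that HAMMER keeps the zone number in the top 4 bits of a 64-bit offset.

```python
# Offsets copied verbatim from the "Start Recovery" log line.
start = 0x30000000002C7350
end = 0x30000000002C93F0

# Size of the UNDO range that recovery had to replay.
undo_bytes = end - start
print(undo_bytes)  # 8352, matching "(8352 bytes of UNDO)"

# ASSUMPTION: the zone lives in the top 4 bits of the 64-bit offset.
zone = start >> 60
print(zone)  # 3 under that assumption
```

The difference agrees with the byte count the kernel printed, so the log line is internally consistent.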

Ok, I'm not so worried about the net crash. The cross-device link
crashes (before we fixed it) are interesting... those could be important.

Did you newfs_hammer the filesystem after the cross-device link crashes
or is it possible that some cruft from those crashes leaked through to
current-day? The timestamp in that filesystem's FSID reads July 18th,
which was right around when you reported that issue.

:I use mirror-copy to sync the slave. ${.OBJDIR} for buildworld usually
:grows up to 2 Gbytes, and ${WRKDIR}s for pkgsrc can reach around 1 Gbyte
:if I build a meta-package. I usually mirror-copy after buildworld or building
:packages, remove the directories in master, then mirror-copy again to
:see if removing files or directories is properly propagated to slave.

Yah, that's pretty much what I've been doing for testing too.
I usually also throw in a reblock run in a sleep loop, and an
occasional prune-everything in its own sleep loop. Running everything
in parallel is a pretty good test.
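
The parallel test described above could be driven by something like the following. It is a minimal sketch, not the setup either of us actually uses: the sleep intervals are made up, the /HAMMER/obj and /HAMMER/slave paths are taken from the ls output earlier in this message, and the helper names (run_loop, start_stress, main) are hypothetical. Only the hammer directives themselves (mirror-copy, reblock, prune-everything) come from the thread.

```python
import subprocess
import threading


def run_loop(cmd, interval, stop, runner=subprocess.run):
    """Run cmd repeatedly, sleeping `interval` seconds between runs."""
    while not stop.is_set():
        runner(cmd, check=False)   # a failing run should not end the loop
        stop.wait(interval)        # sleeps, but wakes early if stop is set


def start_stress(jobs, stop, runner=subprocess.run):
    """Start one sleep loop per job so that all of them overlap."""
    threads = [
        threading.Thread(target=run_loop, args=(cmd, interval, stop, runner))
        for cmd, interval in jobs
    ]
    for t in threads:
        t.start()
    return threads


# ASSUMPTION: intervals (in seconds) are arbitrary illustration values.
JOBS = [
    (["hammer", "mirror-copy", "/HAMMER/obj", "/HAMMER/slave"], 600),
    (["hammer", "reblock", "/HAMMER/obj"], 300),
    (["hammer", "prune-everything", "/HAMMER/obj"], 900),
]


def main():
    """Run all loops until Ctrl-C, then let each loop finish its current run."""
    stop = threading.Event()
    threads = start_stress(JOBS, stop)
    try:
        for t in threads:
            t.join()
    except KeyboardInterrupt:
        stop.set()
        for t in threads:
            t.join()
```

Calling main() as root while a buildworld or pkgsrc build runs would approximate the "everything in parallel" load. Each loop gets its own thread so that a long reblock does not delay the next mirror-copy.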

:I remember interrupting reblock on /HAMMER/obj, but I haven't done
:mirror-copy to slave after that, so I don't think it's something to do
:with it.

Ok. Interrupting reblocking should be fine. It isn't a real interrupt:
the kernel code polls for the signal at a safe point.
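
The general pattern of polling for a signal at a safe point can be illustrated in user space. This is only an analogy and not the HAMMER kernel code: the handler records that a signal arrived, and the work loop looks at that record only between complete units of work, so nothing is ever abandoned half-done. All names here (install_handler, process_all, and so on) are hypothetical.

```python
import signal

_interrupted = False  # set by the handler, read only at safe points


def _on_signal(signum, frame):
    """Record the request; do no real work inside the handler."""
    global _interrupted
    _interrupted = True


def install_handler(signum=signal.SIGINT):
    """Route the signal to the flag instead of aborting mid-operation."""
    signal.signal(signum, _on_signal)


def interrupt_requested():
    """Report whether a stop has been requested since startup."""
    return _interrupted


def process_all(items, handle, should_stop=interrupt_requested):
    """Handle items one at a time, checking for a stop request between them."""
    done = 0
    for item in items:
        handle(item)        # each unit of work runs to completion
        done += 1
        if should_stop():   # safe point: nothing is half-finished here
            break
    return done
```

After install_handler() is called, a Ctrl-C during process_all() lets the current item finish and then stops the loop, which is the property that makes interrupting such a loop harmless.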

:> * Are the ~500K inodes mostly associated with the slave or unrelated?
:
:They are mostly associated with /HAMMER/source and /HAMMER/obj.
:
:Cheers.

I'm crossing my fingers and hoping that the issue was related
to the cross-device link crashes. If your filesystem still has some
cruft from those crashes then I will undo the test locally so I can
reproduce the cross-link crashes and see if I can corrupt the
filesystem that way.

If you can, please wipe that filesystem and continue testing fresh,
and see if you can reproduce that panic (or any bug).

While trying to reproduce your panic today I found another, unrelated
bug which I will commit a fix for tomorrow. There's a small window
of opportunity where reblocking live data can interfere with programs
accessing that live data. It only affects the data though, not the
meta-data, so it can't be related to the panic you got.

I am also planning on writing a 'hammer fsck' feature to clean up
corrupted freemaps & (maybe) B-Trees... kinda a last-resort directive.
It will probably take most of next week to do.

-Matt
Matthew Dillon
<dillon@backplane.com>

Follow-Ups:
- Re: panic: assertion: layer2->zone == zone in hammer_blockmap_free
  - From: YONETANI Tomokazu <qhwt+dfly@les.ath.cx>

References:
- panic: assertion: layer2->zone == zone in hammer_blockmap_free
  - From: YONETANI Tomokazu <qhwt+dfly@les.ath.cx>
- Re: panic: assertion: layer2->zone == zone in hammer_blockmap_free
  - From: Matthew Dillon <dillon@apollo.backplane.com>
- Re: panic: assertion: layer2->zone == zone in hammer_blockmap_free
  - From: YONETANI Tomokazu <qhwt+dfly@les.ath.cx>
