DragonFly bugs List (threaded) for 2009-10
[
Date Prev][
Date Next]
[
Thread Prev][
Thread Next]
[
Date Index][
Thread Index]
Re: Re: panic: assertion: leaf->base.obj_id == ip->obj_id in hammer_ip_delete_range
:It's at ~y0netan1/crash/{kernel,vmcore}.9 on leaf.
:
The file it blew up on was a nohistory-flagged file being deleted
about 2 seconds after it was created. The parts of the B-Tree that
the hammer cursor was sitting on look fine, so it's a software bug
in the iteration somewhere that the assertion caught.
I think I may have found the issue but it's a very rare and difficult
edge case so I just can't be sure yet. I will need to do some testing
before I'm even willing to post the patch.
Basically we have this situation in the B-Tree leaf:
F1 = records related to file 1 (cached records not yet inserted)
F2 = records related to file 2 (cached records not yet deleted,
and not on media)
F3 = records related to file 3 (records on-media)
We are trying to delete F2, so hammer_ip_delete_range() is called
for F2's records. There happen to be no records on-disk but it
doesn't know that yet.
BTREE NODE ON MEDIA:
F3 F3 F3
^ index
The iteration looking for records for F2 is essentially at the end,
having not found any records for F2. But then before the iteration
can finish the following happens:
* All the records for F3 are deleted due to the rm -rf (i.e.
some other inode flushing its deletion at the same time F2
is).
* Because the node is now empty the btree tries to remove the
entire node.
* This pushes the cursor for the F2 delete_range iterations up to
the parent.
* BUT btree_remove() is unsucessful at removing the entire node
due to hitting a deadlock, so now the cursor for F2 is pointing
at the parent instead of pointing at the node.
* Records for file F1 are inserted via another unrelated parallel
flush op. Since the node could not be deleted the records
are inserted in the node. And since the cursor for F2 is now
pointing at the parent, it is effectively pointing at the
records for F1 (which are indexed prior to the F2 cursor's begin
point).
* On the next iteration of the cursor it pushes down from the parent
node to the node that could not deleted and finds the records for
F1. This is the bug. The cursor for F2's deletion is now indexed
at F1 which is prior to the cursor begin point.
BTREE NODE ON MEDIA (BUG):
F1 F1 F1
^ index for F2's iteration
BTREE NODE ON MEDIA (DESIRED):
F1 F1 F1
^ index for F2's iteration
The iteration code assumes that the index already points to a record
greater or equal to the starting point, so it only checks that the index
has not gone past the ending point. But due to the insertion of the
records for F1 the index is now pointing to a point BEFORE the starting
point of the range we want to delete.
The KKASSERT catches the situation and quite properly panics the system
in order to avoid corrupting the on-disk media.
The solution is basically that I cannot call hammer_cursor_removed_node()
until the node is actually removed. That is, if F2's cursor is still
pointing at the node instead of the parent, then when F1's records are
inserted F2's cursor's node index will be incremented and stay past them,
instead of pointing at the parent which then pushes down into the node
and 'finds' the F1 records.
This creates a fairly severe complication in btree_remove() due to the
depth-first recursion it does. The patch looks trivial but if I make
a mistake we wave goodbye to the B-Tree. Hence I have to carefully
test the patch before I'm willing to even post it. So... it is going
to be a day or two I think.
I think this bug has been hit occassionally by several people over the
last year, including myself. Fortunately that assertion prevented
the media from getting corrupted (at the cost of panicing the system :-)).
-Matt
Matthew Dillon
<dillon@backplane.com>
[
Date Prev][
Date Next]
[
Thread Prev][
Thread Next]
[
Date Index][
Thread Index]