DragonFly BSD
DragonFly kernel List (threaded) for 2013-07

[GSOC] HAMMER2 compression feature week5 report


From: Daniel Flores <daniel5555@xxxxxxxxx>
Date: Sat, 20 Jul 2013 19:35:16 +0200


Hello everyone,

Here is my report for week 5. First of all, when the week had just started, a
lot of bugs in the read and write paths were fixed with the invaluable help of
my mentor. A bit later, another small bug, which affected files that couldn't
be compressed, was fixed as well.

As a result, the compression/decompression feature using the LZ4 algorithm
finally started to work. There is still a bug, though, which manifests itself
when large files are decompressed: sometimes the decompressed file isn't the
same as the original, even though it decompresses correctly after remounting
the HAMMER2 partition. Since the on-media data is clearly intact and only the
cached state changes across a remount, the bug must be somewhere in the read
path. However, the whole feature does work correctly in many cases, which is
encouraging. It's not ready for use yet, though, and there is a lot of testing
and bug hunting to be done during the next week and, possibly, beyond that.
It's very important for a file system and all its features to be rock-solid,
so there will be a lot of exhaustive tests.

Another thing I did was optimize both the read and write paths. When I first
created them my main goal was just to get them to work, so they were highly
inefficient. Just to give an example, when I checked the performance, the
write path with LZ4 compression was approximately 5.6 times slower than the
path without compression, and the read path with decompression was
approximately 3 times slower. It wasn't LZ4 that caused this, but all the
buffers that I used so generously.

Now I have gotten rid of the intermediary buffer in the read path: it
decompresses directly from the physical buffer into the logical one. As a
result, the speed of the read path with decompression is the same as that of
the read path without it, or at least the difference is virtually
unnoticeable.
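
Just to illustrate the idea (this is a simplified user-space sketch using
liblz4, not the actual HAMMER2 code; the buffer names are made up), the
difference between the old and new read paths looks roughly like this:

#include <lz4.h>
#include <string.h>

/*
 * Sketch only: 'pbuf' stands for the physical (on-media) buffer holding
 * the compressed block, 'lbuf' for the logical buffer returned to the
 * caller.
 */

/* Old approach: decompress into a temporary buffer, then copy. */
static int
read_decompress_copy(const char *pbuf, int clen, char *lbuf, int lsize)
{
	char tmp[65536];		/* the intermediary buffer */
	int dlen;

	dlen = LZ4_decompress_safe(pbuf, tmp, clen, sizeof(tmp));
	if (dlen < 0 || dlen > lsize)
		return (-1);
	memcpy(lbuf, tmp, dlen);	/* extra copy we want to avoid */
	return (dlen);
}

/* New approach: decompress directly from physical to logical buffer. */
static int
read_decompress_direct(const char *pbuf, int clen, char *lbuf, int lsize)
{
	return (LZ4_decompress_safe(pbuf, lbuf, clen, lsize));
}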

I couldn't get rid of that buffer in the write path for now, but it is used
more efficiently, and as a result the write path with compression is now
approximately 2 times slower than the write path without it. That's not very
noticeable for small files, most of which are compressed and written in less
than a second, but it does make a noticeable difference for big files. The
main cause of the slowness seems to be the use of the buffer, not the
compression itself, since LZ4 is a very efficient algorithm. If we find a way
to get rid of that buffer, I expect a big increase in speed.
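
As a rough sketch of why that buffer is there (again simplified user-space
code with liblz4, not the actual HAMMER2 write path), the compressed size
isn't known until LZ4 finishes, so the data is first compressed into a bounce
buffer and only copied into the physical buffer if it shrank enough:

#include <lz4.h>
#include <string.h>

/*
 * Sketch only: compress the logical buffer into a bounce buffer; if the
 * result fits into the physical buffer, copy it there and return the
 * compressed length, otherwise return 0 and store the block uncompressed.
 */
static int
write_compress(const char *lbuf, int lsize, char *pbuf, int psize)
{
	char bounce[LZ4_COMPRESSBOUND(65536)];	/* the intermediary buffer */
	int clen;

	clen = LZ4_compress_default(lbuf, bounce, lsize, sizeof(bounce));
	if (clen <= 0 || clen > psize)
		return (0);		/* didn't shrink enough */
	memcpy(pbuf, bounce, clen);	/* the copy we would like to avoid */
	return (clen);
}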

I'll post the exact timings later, when I have more results from different
test cases (most likely in next week's report).

Another thing that was implemented this week is zero-checking, so the default
option for HAMMER2 compression seems to be working now. There is still a lot
of testing to be done on it, so I can't guarantee that it works correctly at
this point, but the initial tests show successful results. I also have to
incorporate it into the second option, which is LZ4; hopefully I'll do that
today.
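
In case it helps to see what zero-checking means, here is a minimal sketch
(not the actual HAMMER2 code, and the real version may use a different scan):
if the whole logical block turns out to be zero, no data block needs to be
allocated or written at all:

#include <stddef.h>
#include <stdint.h>

/*
 * Sketch only: return non-zero if the buffer contains nothing but zero
 * bytes, so the filesystem can skip writing a data block entirely.
 * Assumes len is a multiple of 8, which holds for power-of-two blocks.
 */
static int
block_is_zero(const void *buf, size_t len)
{
	const uint64_t *p = buf;
	size_t i;

	for (i = 0; i < len / sizeof(uint64_t); ++i) {
		if (p[i] != 0)
			return (0);
	}
	return (1);
}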

During the next week and the remaining part of the weekend I'll be bug
hunting, testing all the implemented features with more complex cases and
trying to optimize the write path further. There is also some work to be done
on the LZ4 source files, because they contain more functions than we need.

I'd appreciate any comments, suggestions and criticism. You can check my work
in my leaf repository, branch "hammer2_LZ4" [1].


Daniel

[1] git://leaf.dragonflybsd.org/~iostream/dragonfly.git
