Proposal for solving CD Size problems
Phillip Lougher
phillip.lougher at gmail.com
Thu Sep 28 22:47:19 BST 2006
On 9/28/06, Tim Schmidt <timschmidt at gmail.com> wrote:
> On 9/28/06, Phillip Lougher <phillip.lougher at gmail.com> wrote:
> > More than five times slower! I'm glad I've resisted calls to move Squashfs
> > over to LZMA compression, the compression improvements are not worth the speed
> > slowdown. I'm rather surprised SLAX and Puppy have moved over to Squashfs-LZMA.
>
> Not worth the slowdown? I hardly think there's enough information
> here to make that conclusion.
I wrote Squashfs, and there's enough information there for me to make
that conclusion. I've spent many hours over the last few years
analysing traces of Squashfs' performance, and I have a very good
understanding of what effect such a slowdown will have on Squashfs.
> It is slower, but (as shown earlier),
> there is room for optimization. Also, if the extra time spent
> uncompressing data would normally be spent doing nothing, waiting for
> hardware probing or the like, then there's no real loss.
>
Three cases to be considered:
1. Block to be decompressed is on disk only. Overhead to get
decompressed data is seek-time + block I/O + decompression, and this
can't be done in parallel. Five times slower decompression even in
this case is no loss only if the ratio of seek+I/O time to
decompression time makes decompression overhead negligible, i.e. 5
times nothing = nothing. Taking some measurements I did a couple of
years ago (http://tree.celinuxforum.org/CelfPubWiki/SquashFsComparisons),
for example reading a squashfs re-encoded Ubuntu livecd from CDROM
took 5 minutes 15.46 seconds with System time of 51.12 seconds, i.e.
16% of the time was decompression overhead. Five times this is
certainly not "no loss"; figuring this into the stats would give something
like a total of 8 minutes 39 seconds with decompression time of 4
minutes 15 seconds, or roughly 2 times slower. Reading from hard disk, where
seek time and block I/O are a smaller percentage of the overhead, makes the
performance loss even worse.
2. Block to be read is not in page cache but it is in the block cache
compressed (the block cache lies below the filesystem). Overhead to
read block is all decompression time, i.e. five times slower. Major
performance loss.
3. Block to be read is in page cache. Already decompressed, no slowdown.
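
As a rough check of the arithmetic in case 1, here is a minimal Python sketch. It assumes, as the figures above do, that all of the System time is decompression and that I/O and decompression do not overlap. The function name `project` and its parameters are made up for illustration; they are not part of Squashfs or any tool.

```python
def project(total_s, decomp_s, factor):
    """Project read time if decompression becomes `factor` times slower.

    Assumes seek/I-O time and decompression time simply add up
    (no overlap), which is the model used in case 1.
    Returns (new_total_s, new_decomp_s, slowdown_ratio).
    """
    io_s = total_s - decomp_s          # seek time + block I/O, unchanged
    new_decomp_s = decomp_s * factor   # slower decompressor
    new_total_s = io_s + new_decomp_s
    return new_total_s, new_decomp_s, new_total_s / total_s


def fmt(seconds):
    """Format seconds as 'M min S.SS s'."""
    minutes, rest = divmod(seconds, 60)
    return f"{int(minutes)} min {rest:.2f} s"


# Figures quoted in case 1: CDROM read of the re-encoded Ubuntu livecd.
total = 5 * 60 + 15.46   # wall-clock time in seconds
system = 51.12           # System time, taken here as decompression time

new_total, new_decomp, ratio = project(total, system, 5)
print("decompression share now:", f"{system / total:.1%}")
print("projected total:        ", fmt(new_total))
print("projected decompression:", fmt(new_decomp))
print("projected slowdown:     ", f"{ratio:.2f}x")

# Case 2 (compressed block already in the block cache): no I/O at all,
# so the whole cost scales with the decompressor.
_, _, cached_ratio = project(1.0, 1.0, 5)
print("case 2 slowdown:        ", f"{cached_ratio:.2f}x")
```

Under these assumptions the projected total comes out just under 8 minutes 40 seconds, a ratio of about 1.65, which is the figure rounded to 2 times above. The smaller the I/O share of the total, the closer the ratio gets to the full factor of five.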
> In other words, performance in this case can only really be measured
> on real hardware with a wall clock. Measuring performance with
> artificial benchmarks like this one isn't giving us the whole picture.
Using Unsquashfs reading from a warm buffer cache is actually a very
good benchmark. It gives a very accurate indication of decompression
overhead without any interference from I/O. Everything else can be
inferred from previous results.
Scott Remnant previously mentioned a real-life slowdown of 2 times
which includes I/O overhead. This is identical to my 2 times figure
from item 1 calculated using previous results and my understanding.
Phillip