[Jaunty] Proposing some ext4 patches

Stefan Bader stefan.bader at canonical.com
Tue Jun 23 15:32:00 UTC 2009


Theodore Tso wrote:
> On Mon, Jun 22, 2009 at 07:39:12PM +0200, Stefan Bader wrote:
>> After we had https://bugs.launchpad.net/bugs/389555 I had a look at Ted's 
>> 2.6.28-stable repo to figure out whether there might be some more 
>> dangers lurking. Since his 2.6.28.10 tag there is a difference of 21 
>> patches between our ext4 tree and his. Quite a few of them are in a gray 
>> zone: they carry little risk but do not seem critical enough to qualify 
>> for SRU. This would leave 7 (or 8) which we probably should consider...
>> And (question to the SRU team) if we do, could we use one tracking report?
>>
>> Stefan
>>
>> 05 (suggest skip)
>> http://kernel.ubuntu.com/git?p=smb/ubuntu-jaunty.git;a=commitdiff;h=774c43079ddc04a92030dd31109421518e1fcf14
>> Modifications to avoid lock contention.
> 
> Sorry, the patch commit message is a bit misleading.  This fixes a
> lock ordering problem (detected by lockdep) that could potentially
> lead to a system lockup.  The patch was originally designed to fix a
> performance problem, but then we discovered it also fixed a lockdep
> warning, and we dropped a reference to a kernel bugzilla entry w/o
> updating the commit description:
> 
> http://bugzilla.kernel.org/show_bug.cgi?id=12787
> 

Ah, thanks for that update.

>> 11 (?)
>> http://kernel.ubuntu.com/git?p=smb/ubuntu-jaunty.git;a=commitdiff;h=d9ec01eafda7ec7b5fd63b623c86bd95dbd8349a
>> Fix to not discard preallocations on close. Not sure of the impact here.
> 
> We could not discard preallocations on close if there were any delayed
> allocation blocks; this led to preallocations not getting discarded
> until much, much later, which could prevent the block allocator from
> allocating blocks efficiently.
> 
> The patch checks so that once an inode has all of its delayed
> allocation blocks allocated, and there are no open r/w
> filedescriptors, we discard the preallocated blocks so they can be
> used by another file.
> 
> This helps to promote a better (less fragmented) layout of block
> allocations on disk.  It doesn't fix a critical bug, so this patch is
> one that you can decide to skip.
> 

Thanks for the clarification. Yes, in that case we would rather skip it.

>> 20 (maybe skip)
>> http://kernel.ubuntu.com/git?p=smb/ubuntu-jaunty.git;a=commitdiff;h=2742d4833ba07f06a18ad2df750b9f3a712864a4
>> Use a large (non-zero) block number for delayed allocation buffers. Those 
>> should never be written but if this is tried it is more obvious where 
>> this comes from.
> 
> .... instead of blowing away the boot block / partition table, which
> could lead to all sorts of user complaints.  :-)
> 

Heh, got that only once and failed to blame ext3/4 for it. IOW, it was a bug 
somewhere else (the mmc driver).

> At the time we weren't completely convinced that the code was
> bug-free, but we haven't had any reports of people triggering an
> attempted write with the very large non-zero block number, so it's
> probably safe to skip this.
> 
> 
> I suspect that the hard-to-debug hang described in Launchpad #330824
> (Soft lockups when deleting files from ext4 partitions) may very well
> be caused by either (a) a failed backport, or (b) a subtle patch
> dependency that triggered a bug due to a skipped patch.  I would
> therefore encourage you to use the xfstests suite to test the
> resulting ext4 filesystem.  The xfstests can be found here:
> 
> 	url = git://git.kernel.org/pub/scm/fs/xfs/xfstests-dev.git
> 
> I believe you will need the following packages for it to build and
> run: xfsprogs, xfslibs-dev, libaio1, libaio-dev.  I might be missing
> one or two, but those seem to be the critical ones.  Hopefully any
> others will be obvious.  Cd to the top-level of the xfstests
> directory, and run ./configure; make.
> 
> To run the XFS tests, you will need to have one (and preferably two)
> partitions, called TEST and SCRATCH.  TEST should be a mounted ext4
> partition that can be mounted and unmounted.  SCRATCH should be an
> empty device that you don't mind getting reformatted (many of the
> tests will reformat the SCRATCH partition).  SCRATCH is optional; so
> if you don't have a 2nd partition, you can simply omit setting the
> SCRATCH_DEV and SCRATCH_MNT environment variables.
> 
> Set the environment variables:
> 
> TEST_DEV       device file (e.g., /dev/sda1) containing the TEST partition
> TEST_DIR       mount point of the TEST partition
> SCRATCH_DEV    device file (e.g., /dev/sda2) containing the SCRATCH partition
> SCRATCH_MNT    mount point of the SCRATCH partition
> 
> (note TEST_DIR vs. SCRATCH_MNT; don't blame me, blame the SGI
> engineers.  :-)
> 
> Then run "./check -ext4 -g auto" as root in the top-level xfstests
> directory.  Everything should pass; if not, then there's probably
> something wrong with the Ubuntu backports.  Bug reports with mainline
> kernels should be sent to linux-ext4 at vger.kernel.org.  If the mainline
> kernel passes, and the Ubuntu backports don't, the sooner it is
> detected, the easier it will be to try to find the problem with
> bisection searches.
> 
> 						- Ted
> 

Great, yes, I definitely would like to get testing done with the updated ext4 
code in Jaunty. So I will go and get kernels and the environment prepared and 
the testing done.
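
For reference, the setup Ted describes could be sketched roughly as follows.
This is only an outline under assumptions: the mount points /mnt/test and
/mnt/scratch, the helper names build_xfstests, scratch_ok and run_xfstests,
and the use of apt-get are mine and not from this thread; the device names
are just the examples quoted above. Nothing runs until a function is called.

```shell
#!/bin/sh
# Sketch only. Placeholders: adjust devices and mount points to your machine.
: "${TEST_DEV:=/dev/sda1}"    # device holding the TEST partition (example)
: "${TEST_DIR:=/mnt/test}"    # hypothetical mount point of TEST
export TEST_DEV TEST_DIR

# SCRATCH is optional; many tests reformat it, so leave these unset if you
# have no spare partition.
# export SCRATCH_DEV=/dev/sda2 SCRATCH_MNT=/mnt/scratch

# Install the packages named in the mail, fetch and build xfstests.
build_xfstests() {
    apt-get install xfsprogs xfslibs-dev libaio1 libaio-dev &&
    git clone git://git.kernel.org/pub/scm/fs/xfs/xfstests-dev.git &&
    cd xfstests-dev &&
    ./configure && make
}

# Succeeds only if both SCRATCH variables are set or both are unset.
scratch_ok() {
    if [ -n "${SCRATCH_DEV:-}" ] && [ -z "${SCRATCH_MNT:-}" ]; then
        return 1
    fi
    if [ -z "${SCRATCH_DEV:-}" ] && [ -n "${SCRATCH_MNT:-}" ]; then
        return 1
    fi
    return 0
}

# Run as root from the top-level xfstests directory.
run_xfstests() {
    scratch_ok || {
        echo "set both SCRATCH_DEV and SCRATCH_MNT, or neither" >&2
        return 1
    }
    ./check -ext4 -g auto
}
```

Typical use would be to source the file as root, call build_xfstests once,
and then call run_xfstests from the xfstests-dev directory for each kernel
under test. The scratch_ok check is just a guard against setting only one of
the two SCRATCH variables.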

Thanks again for the feedback.

Stefan


-- 

When all other means of communication fail, try words!





