[Bug 1020436] Re: Cannot read superblock after FC multipath failover

Peter Petrakis peter.petrakis at canonical.com
Mon Jul 23 14:40:04 UTC 2012


Hmm, we've got some contradictory indicators here.

lvs claims that the volumes are active, but the probe itself
shows problems reading them.

XFS is telling us that it cannot write its journal to disk:
[1039812.311433] Filesystem "dm-4": Log I/O Error Detected. Shutting down filesystem: dm-4

"fs/xfs/xfs_rw.c"

/*
 * Force a shutdown of the filesystem instantly while keeping
 * the filesystem consistent. We don't do an unmount here; just shutdown
 * the shop, make sure that absolutely nothing persistent happens to
 * this filesystem after this point.
 */
void
xfs_do_force_shutdown(
...
        if (flags & SHUTDOWN_CORRUPT_INCORE) {
                xfs_cmn_err(XFS_PTAG_SHUTDOWN_CORRUPT, CE_ALERT, mp,
    "Corruption of in-memory data detected.  Shutting down filesystem: %s",
                        mp->m_fsname);
                if (XFS_ERRLEVEL_HIGH <= xfs_error_level) {
                        xfs_stack_trace();
                }
        } else if (!(flags & SHUTDOWN_FORCE_UMOUNT)) {
                if (logerror) {
                        xfs_cmn_err(XFS_PTAG_SHUTDOWN_LOGERROR, CE_ALERT, mp,
                "Log I/O Error Detected.  Shutting down filesystem: %s",
                                mp->m_fsname);

The logs conveniently tell us where it was called from too:

void
xlog_iodone(xfs_buf_t *bp)
{

...

        /*
         * Race to shutdown the filesystem if we see an error.
         */
        if (XFS_TEST_ERROR((XFS_BUF_GETERROR(bp)), l->l_mp,
                        XFS_ERRTAG_IODONE_IOERR, XFS_RANDOM_IODONE_IOERR)) {
                xfs_ioerror_alert("xlog_iodone", l->l_mp, bp, XFS_BUF_ADDR(bp));
                XFS_BUF_STALE(bp);
                xfs_force_shutdown(l->l_mp, SHUTDOWN_LOG_IO_ERROR);
                /*
                 * This flag will be propagated to the trans-committed
                 * callback routines to let them know that the log-commit
                 * didn't succeed.
                 */
                aborted = XFS_LI_ABORTED;

I assume dm-4 is the LV that XFS is mounted on. Did you run the dd test
on that?

I'm starting to wonder whether the LVM device filter is lying to us: after failover,
something changes that misrepresents the LV, and then XFS bails out.

If you can perform that dd successfully for every PV that backs dm-4, then there's
something wrong with the DM map for those LVs after failover occurs.
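
For reference, here is a rough sketch of how the devices backing dm-4 could be
found and read-tested. It is not the exact "dd test" discussed earlier in the
bug; the function names, the SYSFS_ROOT override, and the 16 MiB read size are
all assumptions chosen for illustration. It relies only on the sysfs
"slaves" directory that the kernel exposes for stacked block devices.

```shell
#!/bin/sh
# Hypothetical helper: print the block devices directly underneath a
# device-mapper node, as listed in its sysfs "slaves" directory.
# SYSFS_ROOT is an assumption added so the function can be pointed at
# a fake tree; on a real system it defaults to /sys.
list_slaves() {
    dev="$1"
    root="${SYSFS_ROOT:-/sys}"
    for s in "$root/block/$dev/slaves"/*; do
        [ -e "$s" ] || continue      # unmatched glob: no slaves
        basename "$s"
    done
}

# Hypothetical helper: read the first 16 MiB of a device with O_DIRECT
# so the read goes to the device rather than the page cache.
# Prints "OK <dev>" or "FAIL <dev>".
read_test() {
    if dd if="$1" of=/dev/null bs=1M count=16 iflag=direct 2>/dev/null; then
        echo "OK   $1"
    else
        echo "FAIL $1"
    fi
}

# Usage (as root), assuming dm-4 is the LV in question:
#   for d in $(list_slaves dm-4); do read_test "/dev/$d"; done
```

Running list_slaves again on each device it prints walks one level further
down, from the multipath maps to the individual sd paths.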

OK, what I need from you now is a before-and-after capture (using the same fault injection method) of:
0) ls -lR /dev/ > dev_major_minor.log
1) lvs -o lv_attr
2) pvdisplay -vvv
3) lvdisplay -vvv
4) dmsetup table -v
5) "dd test" on all block devices: LV, multipath (mp), and sd
6) dmesg output

Please attach this as a single tarball with a timestamp in the filename
and a directory structure of:

foo.tgz
  before/
  after/
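
The collection steps above could be scripted roughly as follows. This is only
a sketch: the function names, the log file names (other than
dev_major_minor.log) and the tarball name are made up for illustration, and
the dd test from step 5 is left out because the device list is site-specific.

```shell
#!/bin/sh
# Run one command and save its stdout and stderr as <outdir>/<name>.log.
# A non-zero exit status is appended to the log instead of being lost.
capture() {
    outdir="$1"; name="$2"; shift 2
    "$@" > "$outdir/$name.log" 2>&1 \
        || echo "exit status: $?" >> "$outdir/$name.log"
}

# Collect steps 0-4 and 6 into the directory named by $1,
# which should be "before" or "after".
collect_state() {
    phase="$1"
    mkdir -p "$phase"
    capture "$phase" dev_major_minor ls -lR /dev/
    capture "$phase" lvs             lvs -o lv_attr
    capture "$phase" pvdisplay       pvdisplay -vvv
    capture "$phase" lvdisplay       lvdisplay -vvv
    capture "$phase" dmsetup_table   dmsetup table -v
    capture "$phase" dmesg           dmesg
}

# Bundle before/ and after/ into a timestamped tarball and print its name.
pack() {
    name="failover-$(date +%Y%m%d-%H%M%S).tgz"
    tar czf "$name" before after && echo "$name"
}

# Usage (as root):
#   collect_state before
#   ... inject the fault and let failover happen ...
#   collect_state after
#   pack
```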

If this all checks out, then what's probably happening is that when
multipath begins the failover process, there's enough of a delay that
XFS simply bails out before I/O is ready to be sent down the
remaining paths. Grouping paths by priority (group_by_prio) may perform
better here and is something you can test.
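
As a starting point for that test, a minimal /etc/multipath.conf fragment
might look like the one below. Treat it as an assumption about your setup
rather than a recommended configuration: a matching entry in a devices
section, if there is one for your array, takes precedence over defaults, and
the right prioritizer depends on the storage.

```
defaults {
    path_grouping_policy    group_by_prio
}
```

The maps need to be reloaded after the change before repeating the fault
injection.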

I looked at the XFS mount arguments and didn't find anything that would
make it more lenient in these situations. 

If you can manage it, testing a LUN formatted with ext3 under the same
conditions would help determine whether the filesystem is part of the problem.

Thanks.

-- 
You received this bug notification because you are a member of Ubuntu
Server Team, which is subscribed to multipath-tools in Ubuntu.
https://bugs.launchpad.net/bugs/1020436