[Bug 1434222] [NEW] Spurious valgrind errors due to memcpy replacement getting autovectorised

Launchpad Bug Tracker 1434222 at bugs.launchpad.net
Tue Mar 31 14:45:26 UTC 2015


You have been subscribed to a public bug:

== Comment: #0 - Anton Blanchard <antonb at au1.ibm.com> - 2015-03-18 00:05:56 ==
I'm seeing enormous numbers of this type of error when using valgrind:

==95540== Invalid read of size 8
==95540==    at 0x408D038: memcpy (in /usr/lib/valgrind/vgpreload_memcheck-ppc64le-linux.so)
==95540==    by 0x10414F5B: mem_file_write (in /usr/bin/gdb)
==95540==    by 0x10414CF3: null_file_fputs (in /usr/bin/gdb)
==95540==    by 0x104160E3: fputs_unfiltered (in /usr/bin/gdb)
==95540==    by 0x1040E20B: fprintf_unfiltered (in /usr/bin/gdb)

In this case I ran "valgrind gdb". The issue here is that the valgrind
memcpy replacement code is getting autovectorised (since we are building
the package with -O3 on Ubuntu):

   0x000000000408d034:  rldicr  r9,r5,0,59
=> 0x000000000408d038:  lxvd2x  vs33,0,r9
   0x000000000408d03c:  xxswapd vs33,vs33
   0x000000000408d040:  vperm   v13,v1,v0,v12
   0x000000000408d044:  xxlor   vs32,vs33,vs33
   0x000000000408d048:  xxswapd vs0,vs45
   0x000000000408d04c:  stxvd2x vs0,r10,r5
   0x000000000408d050:  addi    r5,r5,16
   0x000000000408d054:  bdnz    0x408d034

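In this case the source and destination are not 16B aligned, and gcc has
decided to realign things via a permute. The problem is that this code
rounds each load address down to a 16B boundary (the rldicr above), so it
will always read too much data (which it just throws away). A safe
optimisation, since an aligned 16B load cannot cross a page boundary, but
one which confuses valgrind.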

The simple fix is to override any optimise flags and build
shared/vg_replace_strmem.c with -O2.

Some of the commit messages on shared/vg_replace_strmem.c suggest we
would like these loops to be autovectorised for performance, but I'm not
sure if we can do that and avoid gcc tricks that read in too much data.

== Comment: #1 - William J. Schmidt <wschmidt at us.ibm.com> - 2015-03-18 09:23:23 ==
Hi Anton,

Note there is a pending fix to GCC that will avoid the realignment code
for POWER8, where unaligned load cost is much lower than previously.
See https://bugzilla.linux.ibm.com/show_bug.cgi?id=122395.

The current status is that the GCC trunk is closed until GCC 5 releases.
Once that occurs, I will be backporting the fix to 5, 4.9, and 4.8 where
it can get picked up at the next opportunity by each of the distros.  We
will also provide it in the next releases of the Advance Toolchain (AT7,
AT8, AT9).

== Comment: #2 - David Heller <hellerda at us.ibm.com> - 2015-03-19 01:28:25 ==
So is the short term fix to build valgrind (or at least the one module) with -O2, and is that what we want to ask Canonical to do?

== Comment: #3 - William J. Schmidt <wschmidt at us.ibm.com> - 2015-03-19 09:37:57 ==
For 15.04, yes, that would be best.  The GCC schedules make it impossible for us to fix the compiler in time for 15.04.

Note that a less invasive change to the build would be to replace -O3
with -O3 -fno-tree-vectorize.  I'd predict this will still solve the
problem.

We will be fixing this properly in time for 15.10, so Canonical should
treat this as a one-time change.

** Affects: valgrind (Ubuntu)
     Importance: Undecided
         Status: New


** Tags: architecture-ppc64le bugnameltc-122870 severity-medium targetmilestone-inin1504
-- 
Spurious valgrind errors due to memcpy replacement getting autovectorised
https://bugs.launchpad.net/bugs/1434222
You received this bug notification because you are a member of Ubuntu Foundations Bugs, which is subscribed to valgrind in Ubuntu.


