[Bug 1940505] Re: monero FTBFS on riscv64

Bug Watch Updater 1940505 at bugs.launchpad.net
Fri Aug 20 09:26:17 UTC 2021


Launchpad has imported 10 comments from the remote bug at
https://sourceware.org/bugzilla/show_bug.cgi?id=25181.

If you reply to an imported comment from within Launchpad, your comment
will be sent to the remote bug automatically. Read more about
Launchpad's inter-bugtracker facilities at
https://help.launchpad.net/InterBugTracking.

------------------------------------------------------------------------
On 2019-11-10T03:31:16+00:00 Yitingwang16 wrote:

If there are two callers in two .text sections (for example sections A,
B), followed by two sections with R_RISCV_ALIGN type relocations
(section C, D), then followed by the called functions in sections behind
(section E), there is a chance to reproduce the issue.

I draw a picture to illustrate the problem:

                   ____________________________________
       high        |                                   |
        ^       -> | section E: cc, dd                 | <-
 	|       |  |___________________________________|  |
        |       |  |                                   |  |
        |       |  | section D: align2 (R_RISCV_ALIGN) |  |
        |       |  |___________________________________|  |
        |       |  |                                   |  |
        |       |  | section C: align1 (R_RISCV_ALIGN) |  |
        |       |  |___________________________________|  |
        |       |  |                                   |  |
        |       |  | section B : ff  (R_RISCV_CALL)    |--
        |       |  |___________________________________|
        |       |  |                                   |
  address low   -- | section A : _start (R_RISCV_CALL) |
                   |___________________________________|

Consider this situation: Section A, B, C, D, E are linked in sequence.
Section C and D have R_RISCV_ALIGN type relocations. Function _start()
calls function cc(), function ff() calls dd(). The size of each section
is 8, 8, 64, 0xfffbc, 10 bytes before relaxation.

In the first relaxation pass (info->relax_pass ==0), the R_RISCV_CALL
relocations in section A and B couldn't be relaxed to R_RISCV_JAL,
because the call from _start() to cc() and the call from ff() to dd()
don't fit with 21-bit offset. The offset is 0x10001c indeed. In the
first relaxation pass, nothing is done.

In the third relaxation pass (info->relax_pass ==2), the R_RISCV_ALIGN
relocations in section C and D could be relaxed. The sizes of section C
and D were changed to 34 and 0xfffba bytes. Now, the offset of the call
from _start() to cc() is 0xffffe, the offset from ff() to dd() equals to
this value too.

In the second round of relaxation, in the first pass (info->relax_pass ==0), the R_RISCV_CALL relocation in section A could be relaxed to R_RISCV_JAL. After the relaxation, section A's size is reduced by 4 bytes. So, section B's base address and every symbol go down 4 bytes forward. However section C, D and E will not go down 4 bytes, because of the .balign restriction placed in section C and D.
But, when linker processes the relaxation of section B, it uses the original section B symbol addresses to calculate the offset of R_RISCV_CALL relocation (from ff() to dd()). The offset is within 1M bytes offset (0xffffe).  So, the R_RISCV_CALL relocation is relaxed to R_RISCV_JAL. This is a mistake!
When linker finally performs the relocation (in perform_relocation()), the section B's symbol has been adjusted, then the bug is exposed.

I created a simple test case to trigger this issue:

    riscv64-unknown-linux-gnu-gcc -nostdlib -o a.out a.s align_1.s align_2.s c.s
    /tmp/ccgFgKhw.o: In function `ff':
    (.text+0x0): relocation truncated to fit: R_RISCV_JAL against symbol `dd' defined in .text section in /tmp/ccR0Ha6r.o

Reply at:
https://bugs.launchpad.net/ubuntu/+source/monero/+bug/1940505/comments/0

------------------------------------------------------------------------
On 2019-11-10T03:37:51+00:00 Yitingwang16 wrote:

Created attachment 12070
test case

Reply at:
https://bugs.launchpad.net/ubuntu/+source/monero/+bug/1940505/comments/1

------------------------------------------------------------------------
On 2019-11-10T03:40:31+00:00 Yitingwang16 wrote:

I have a fix locally which changes the common ld/ldlang.c. I can post to
binutils mailing list for review.

I am not sure whether there is a possibility of a similar problem
occurring *between* passes instead of within a single pass. I.e. what if
a call in B is within range and so gets relaxed in one pass, then in a
later pass B moves down due to relaxation in a previous section - but
the destination stays in the same place due to alignment? Is that
possible?

Reply at:
https://bugs.launchpad.net/ubuntu/+source/monero/+bug/1940505/comments/2

------------------------------------------------------------------------
On 2019-11-10T05:13:03+00:00 Yitingwang16 wrote:

I've not figured out how to send a proper patch to binutils mailing
list.

github is much easier. See https://github.com/riscv/riscv-binutils-
gdb/pull/185

Reply at:
https://bugs.launchpad.net/ubuntu/+source/monero/+bug/1940505/comments/3

------------------------------------------------------------------------
On 2019-11-12T00:03:59+00:00 Wilson-5 wrote:

The way that this should work is that if the call crosses section
boundaries, then we need to use the max alignment of any section between
the call instruction and the target, including the call and target
sections.  Except we don't have code to compute that yet, so we just use
the maximum alignment of all sections in the output.  If the call does
not cross section boundaries, then it should only use the alignment of
the section it is in.  This alignment value is added in to the offset to
make sure that we don't relax something that might not fit after
alignment is handled.  This has the downside that sometimes we refuse to
relax something that might be relaxable because of a section with large
alignment constraint.  But it is safe, in that we never accidentally
relax something that should not be relaxed.

This is how _bfd_riscv_relax_lui and _bfd_riscv_relax_pc work.  They use
max_alignment unconditionally, and set max_alignment to the section
alignment if the lui and gp are in the same section and it isn't the abs
section.

This is unfortunately not how _bfd_riscv_relax_call works.  It only adds
in the alignment if the call and target are not in the same section.
I'd call this a bug and that it needs to work the same as the other two
functions.

I attached a proposed patch that fixes _bfd_riscv_relax_call and adds
your example as a testcase.

Reply at:
https://bugs.launchpad.net/ubuntu/+source/monero/+bug/1940505/comments/4

------------------------------------------------------------------------
On 2019-11-12T00:05:14+00:00 Wilson-5 wrote:

Created attachment 12071
untested patch to fix _bfd_riscv_relax_call

Reply at:
https://bugs.launchpad.net/ubuntu/+source/monero/+bug/1940505/comments/5

------------------------------------------------------------------------
On 2019-11-12T00:10:50+00:00 Wilson-5 wrote:

You can create a patch with git diff and then attach it to the bug
report.  It needs to be mailed to the binutils list if it gets checked
in, but you can always ask someone else to do that for you.

Contributing a patch to Binutils requires an FSF copyright assignment.
I don't see an assignment in your name, and it isn't clear who you work
for.  Though from github I see you are in Beijing, so Alibaba and Huawei
would be my first two guesses.  I see GCC assignments for both of them,
but not Binutils assignments.  If you are serious about trying to
contribute to GNU toolchain, then you should try to get a copyright
assignment.  Meanwhile, it might be simpler if someone who already has
an assignment writes a patch.

Generally, I agree with Nelson Chu's github comment that we would prefer
to avoid a ldlang.c change if we can.  We might be able to get a better
result by changing ldlang.c instead of the RISC-V port, but a change to
ldlang.c requires a lot more testing because it affects all targets.

Reply at:
https://bugs.launchpad.net/ubuntu/+source/monero/+bug/1940505/comments/6

------------------------------------------------------------------------
On 2019-11-12T10:06:30+00:00 Yitingwang16 wrote:

Hi Jim,

I am very grateful to you for fixing the defect very quickly and giving
me the instructions on how to contributing a patch to Binutils! My
colleague has verified your patch works fine. :)

I am a WindRiver engineer, working on VxWorks. I mainly do system
software and know very little toolchains. I just tried to contribute a
patch. I see what you and Nelson Chu mean, we'd better not modify the
common file, however, I didn't know how to fix elfnn-riscv.c. I sure am
glad you resolved the defect, thank you again. :)

Reply at:
https://bugs.launchpad.net/ubuntu/+source/monero/+bug/1940505/comments/7

------------------------------------------------------------------------
On 2019-11-12T23:54:48+00:00 Cvs-commit wrote:

The master branch has been updated by Jim Wilson
<wilson at sourceware.org>:

https://sourceware.org/git/gitweb.cgi?p=binutils-
gdb.git;h=c6261a00c3e70dd8e508062ea43a1bcb6d547621

commit c6261a00c3e70dd8e508062ea43a1bcb6d547621
Author: Jim Wilson <jimw at sifive.com>
Date:   Tue Nov 12 15:50:48 2019 -0800

    RISC-V: Fix ld relax failure with calls and align directives.
    
    Make _bfd_riscv_relax_call handle section alignment padding same as
    the _bfd_riscv_relax_lui and _bfd_riscv_relax_pc functions already
    do.  Use the max section alignment if section boundaries are crossed,
    otherwise the alignment of the containing section.
    
    	bfd/
    	PR 25181
    	* elfnn-riscv.c (_bfd_riscv_relax_call): Always add max_alignment to
    	foff.  If sym_sec->output_section and sec->output_section are the same
    	and not *ABS* then set max_alignment to that section's alignment.
    
    	ld/
    	PR 25181
    	* testsuite/ld-riscv-elf/call-relax-0.s: New file.
    	* testsuite/ld-riscv-elf/call-relax-1.s: New file.
    	* testsuite/ld-riscv-elf/call-relax-2.s: New file.
    	* testsuite/ld-riscv-elf/call-relax-3.s: New file.
    	* testsuite/ld-riscv-elf/call-relax.d: New test.
    	* testsuite/ld-riscv-elf/ld-riscv-elf.exp: Run call-relax test.
    
    Change-Id: Iaf65cee52345abf1955f36e8e72c4f6cc0db8d9a

Reply at:
https://bugs.launchpad.net/ubuntu/+source/monero/+bug/1940505/comments/8

------------------------------------------------------------------------
On 2019-11-12T23:57:03+00:00 Wilson-5 wrote:

Fixed on mainline.

Reply at:
https://bugs.launchpad.net/ubuntu/+source/monero/+bug/1940505/comments/9


** Changed in: binutils
       Status: Unknown => Fix Released

** Changed in: binutils
   Importance: Unknown => Medium

-- 
You received this bug notification because you are a member of Ubuntu
Foundations Bugs, which is subscribed to binutils in Ubuntu.
https://bugs.launchpad.net/bugs/1940505

Title:
  monero FTBFS on riscv64

Status in binutils:
  Fix Released
Status in binutils package in Ubuntu:
  New
Status in monero package in Ubuntu:
  New

Bug description:
  monero FTBFS on riscv64 due to toolchain issues:

  [ 64%] Linking CXX executable signature_fuzz_tests
  cd "/<<PKGBUILDDIR>>/obj-riscv64-linux-gnu/tests/fuzz" && /usr/bin/cmake -E cmake_link_script CMakeFiles/signature_fuzz_tests.dir/link.txt --verbose=1
  /usr/bin/c++ -g -O2 -ffile-prefix-map=/<<PKGBUILDDIR>>=. -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -pthread -DNO_AES  -fno-strict-aliasing -std=c++11 -D_GNU_SOURCE   -Wall -Wextra -Wpointer-arith -Wundef -Wvla -Wwrite-strings -Wno-error=extra -Wno-error=deprecated-declarations -Wno-unused-parameter -Wno-unused-variable -Wno-error=unused-variable -Wno-error=undef -Wno-error=uninitialized -Wlogical-op -Wno-error=maybe-uninitialized -Wno-error=cpp -Wno-reorder -Wno-missing-field-initializers  -fPIC  -Wformat -Wformat-security -fstack-protector -fstack-protector-strong -fstack-clash-protection -fno-strict-aliasing -ftemplate-depth=900 -Wl,-Bsymbolic-functions -Wl,-z,relro  -pie -Wl,-z,relro -Wl,-z,now -Wl,-z,noexecstack  -rdynamic CMakeFiles/signature_fuzz_tests.dir/signature.cpp.o CMakeFiles/signature_fuzz_tests.dir/fuzzer.cpp.o -o signature_fuzz_tests  ../../lib/libwallet.a ../../src/cryptonote_core/libcryptonote_core.a ../../src/p2p/libp2p.a ../../contrib/epee/src/libepee.a ../../src/device/libdevice.a -Wl,-Bstatic -lrt -Wl,-Bdynamic -ldl ../../src/rpc/librpc_base.a ../../src/mnemonics/libmnemonics.a ../../src/device_trezor/libdevice_trezor.a ../../src/cryptonote_core/libcryptonote_core.a ../../src/multisig/libmultisig.a ../../src/blockchain_db/libblockchain_db.a ../../external/db_drivers/liblmdb/liblmdb.a ../../src/ringct/libringct.a ../../src/cryptonote_basic/libcryptonote_basic.a ../../src/device/libdevice.a -lhidapi-libusb ../../src/ringct/libringct_basic.a ../../src/blocks/libblocks.a ../../src/checkpoints/libcheckpoints.a ../../src/hardforks/libhardforks.a ../../src/net/libnet.a ../../src/common/libcommon.a ../../src/crypto/libcncrypto.a ../../contrib/epee/src/libepee.a ../../external/easylogging++/libeasylogging.a -lrandomx -lunbound -lboost_regex -lboost_date_time -lssl -lcrypto -lzmq -lpgm -lnorm -lgssapi_krb5 -lsodium ../../src/libversion.a -lminiupnpc -lboost_chrono -lboost_serialization -lboost_filesystem -lboost_system -lboost_thread -lboost_program_options -Wl,-Bstatic -lrt -Wl,-Bdynamic -ldl  -latomic 
  ../../external/db_drivers/liblmdb/liblmdb.a(mdb.c.o): in function `mdb_node_add':
  ./obj-riscv64-linux-gnu/external/db_drivers/liblmdb/./external/db_drivers/liblmdb/mdb.c:8245:(.text+0x44e6): relocation truncated to fit: R_RISCV_JAL against `mdb_assert_fail.constprop.0'
  collect2: error: ld returned 1 exit status
  make[3]: *** [tests/fuzz/CMakeFiles/signature_fuzz_tests.dir/build.make:166: tests/fuzz/signature_fuzz_tests] Error 1
  make[3]: Leaving directory '/<<PKGBUILDDIR>>/obj-riscv64-linux-gnu'
  make[2]: *** [CMakeFiles/Makefile2:4165: tests/fuzz/CMakeFiles/signature_fuzz_tests.dir/all] Error 2
  make[2]: Leaving directory '/<<PKGBUILDDIR>>/obj-riscv64-linux-gnu'
  make[1]: *** [Makefile:163: all] Error 2
  make[1]: Leaving directory '/<<PKGBUILDDIR>>/obj-riscv64-linux-gnu'
  dh_auto_build: error: cd obj-riscv64-linux-gnu && make -j1 "INSTALL=install --strip-program=true" VERBOSE=1 returned exit code 2
  make: *** [debian/rules:47: binary-arch] Error 25
  dpkg-buildpackage: error: debian/rules binary-arch subprocess returned exit status 2

To manage notifications about this bug go to:
https://bugs.launchpad.net/binutils/+bug/1940505/+subscriptions




More information about the foundations-bugs mailing list