[Bug 2062118] Re: autopkgtests fail on s390x (segfault)
Pragyansh Chaturvedi
2062118 at bugs.launchpad.net
Thu Nov 21 07:32:04 UTC 2024
I spent some time on this, as the tests in utest/ were segfaulting.
Turns out this is an endianness issue in libtraceevent.
In libtraceevent/src/event_parse.c, we have:
```
/**
 * tep_alloc - create a tep handle
 */
struct tep_handle *tep_alloc(void)
{
	struct tep_handle *tep = calloc(1, sizeof(*tep));
	if (tep) {
		tep->ref_count = 1;
		tep->host_bigendian = tep_is_bigendian();
	}
	return tep;
}
```
So on s390x, tep->host_bigendian is TEP_BIG_ENDIAN, but
tep->file_bigendian stays at its default value (TEP_LITTLE_ENDIAN).
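To illustrate why the field ends up little endian: calloc() zero-fills the
handle, and nothing sets file_bigendian afterwards. A minimal sketch, assuming
TEP_LITTLE_ENDIAN has the value 0; struct mock_tep, mock_tep_alloc() and
host_is_bigendian() are hypothetical stand-ins, not the real libtraceevent
definitions:
```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Assumed to mirror the library's enum; what matters is that little endian is 0. */
enum mock_endian { MOCK_LITTLE_ENDIAN = 0, MOCK_BIG_ENDIAN = 1 };

/* Hypothetical stand-in holding only the fields discussed here. */
struct mock_tep {
	int ref_count;
	int host_bigendian;
	int file_bigendian;
};

/* Runtime check: returns 1 on a big-endian host, 0 otherwise. */
static int host_is_bigendian(void)
{
	unsigned char bytes[4] = { 0x01, 0x02, 0x03, 0x04 };
	uint32_t val;

	memcpy(&val, bytes, sizeof(val));
	return val == 0x01020304;
}

/* Same shape as tep_alloc(): only host_bigendian is set explicitly. */
static struct mock_tep *mock_tep_alloc(void)
{
	struct mock_tep *tep = calloc(1, sizeof(*tep));

	if (tep) {
		tep->ref_count = 1;
		tep->host_bigendian = host_is_bigendian();
		/* file_bigendian keeps calloc's zero, i.e. little endian */
	}
	return tep;
}
```
On a big-endian host such as s390x, the two fields of a freshly allocated
handle therefore disagree until something sets file_bigendian.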
Then in libtracefs/src/kbuffer_parse.c, we have:
```
enum {
	KBUFFER_FL_HOST_BIG_ENDIAN = (1<<0),
	KBUFFER_FL_BIG_ENDIAN = (1<<1),
	KBUFFER_FL_LONG_8 = (1<<2),
	KBUFFER_FL_OLD_FORMAT = (1<<3),
};
#define ENDIAN_MASK (KBUFFER_FL_HOST_BIG_ENDIAN | KBUFFER_FL_BIG_ENDIAN)
...
static int do_swap(struct kbuffer *kbuf)
{
	return ((kbuf->flags & KBUFFER_FL_HOST_BIG_ENDIAN) + kbuf->flags) &
		ENDIAN_MASK;
}
```
kbuf->flags is populated from the tep_handle object. So the tests
fail because libtraceevent assumes the files it opens are stored in
little-endian format, while on s390x they are actually big endian.
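The arithmetic in do_swap() is easier to follow as a truth table. The sketch
below applies the same expression to a bare flags word; endian_flags(),
swap_needed() and print_swap_table() are hypothetical helpers written for this
illustration, and only the two endian flags are reproduced:
```c
#include <stdio.h>

enum {
	KBUFFER_FL_HOST_BIG_ENDIAN = (1 << 0),
	KBUFFER_FL_BIG_ENDIAN      = (1 << 1),
};

#define ENDIAN_MASK (KBUFFER_FL_HOST_BIG_ENDIAN | KBUFFER_FL_BIG_ENDIAN)

/* Hypothetical helper: build the two endian flags from host/file settings. */
static unsigned int endian_flags(int host_big, int file_big)
{
	unsigned int flags = 0;

	if (host_big)
		flags |= KBUFFER_FL_HOST_BIG_ENDIAN;
	if (file_big)
		flags |= KBUFFER_FL_BIG_ENDIAN;
	return flags;
}

/* Same arithmetic as do_swap(), applied to a bare flags word. */
static int swap_needed(unsigned int flags)
{
	return ((flags & KBUFFER_FL_HOST_BIG_ENDIAN) + flags) & ENDIAN_MASK;
}

/* Print the swap decision for all four host/file combinations. */
static void print_swap_table(void)
{
	for (int host_big = 0; host_big <= 1; host_big++)
		for (int file_big = 0; file_big <= 1; file_big++)
			printf("host_big=%d file_big=%d swap=%s\n",
			       host_big, file_big,
			       swap_needed(endian_flags(host_big, file_big)) ?
			       "yes" : "no");
}
```
The result is nonzero exactly when the two flags differ. The host-big,
file-little row is the s390x case: the data gets byte-swapped even though it
was written in the host's own byte order.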
My fix was to change `tep->host_bigendian = tep_is_bigendian();` to
`tep->host_bigendian = tep->file_bigendian = tep_is_bigendian();`
We can assume by default that the host and FS endianness are the
same. If they differ, the user must set the correct endianness through
the event-parse API (tep_set_file_bigendian).
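The intended behaviour can be sketched without the library. This is only an
illustration of the default-plus-override idea: struct mock_tep,
mock_tep_alloc_fixed() and mock_set_file_bigendian() are hypothetical
stand-ins, with the last one playing the role of tep_set_file_bigendian:
```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in holding only the two endianness fields. */
struct mock_tep {
	int host_bigendian;
	int file_bigendian;
};

/* Runtime check: returns 1 on a big-endian host, 0 otherwise. */
static int host_is_bigendian(void)
{
	unsigned char bytes[4] = { 0x01, 0x02, 0x03, 0x04 };
	uint32_t val;

	memcpy(&val, bytes, sizeof(val));
	return val == 0x01020304;
}

/* Proposed default: assume the file uses the host's byte order. */
static struct mock_tep *mock_tep_alloc_fixed(void)
{
	struct mock_tep *tep = calloc(1, sizeof(*tep));

	if (tep)
		tep->host_bigendian = tep->file_bigendian = host_is_bigendian();
	return tep;
}

/* Explicit override for data recorded on a machine of the other endianness. */
static void mock_set_file_bigendian(struct mock_tep *tep, int big)
{
	if (tep)
		tep->file_bigendian = big;
}
```
With this default, a caller reading locally recorded data needs no extra
call; only a caller parsing data from a foreign-endian machine has to
override the setting.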
I am not sure whether this should go upstream as well, or even whether
it is the right fix. But it does fix the tests:
```
Run Summary:    Type    Total      Ran   Passed Failed Inactive
              suites        1        1      n/a      0        0
               tests       36       36       35      1        0
             asserts 16407066 16407066 16407064      2      n/a
Elapsed time = 22.623 seconds
```
--
https://bugs.launchpad.net/bugs/2062118
Title:
autopkgtests fail on s390x (segfault)
Status in Ubuntu on IBM z Systems:
Triaged
Status in libtracefs package in Ubuntu:
Triaged
Bug description:
As part of the QA added to libtracefs, it was found to trigger a segfault on s390x.
This isn't just a failing test; it seems libtracefs is still deeply broken on s390x.
Under the time pressure of the noble release, the decision was
simplified to "The tests didn't make it worse, we just know about it
now", and the work continued (so as not to leave these platforms behind
and be unable to add them later, while knowing it is still incomplete for now).
That does not mean we can ignore the failures for long. We certainly
need to work on making this fully functional in tests and in real
usage. Hence this spin-off bug from the MIR work in bug 2051925 to
track the further efforts.
Example test log:
https://autopkgtest.ubuntu.com/results/autopkgtest-noble/noble/s390x/libt/libtracefs/20240417_184123_8ab96@/log.gz