Mir stress testing

Thomi Richards thomi.richards at canonical.com
Tue May 21 02:13:04 UTC 2013


Hi mir-folk,


I've been working on some stress tests for the mir code, and I've
encountered some interesting results.

tl;dr: I can crash the mir_demo_server using nothing more than
mir_connect_sync and mir_connection_release.

The code I'm using is rather hacky in places, and is very much
work-in-progress. However, you can find it here:

lp:~thomir/+junk/mir-stress

Ultimately I'll include this in the lp:mir source tree, but until it's
of a reasonable quality it can live in a junk tree. To go from nothing
to crashing your mir server, run:

bzr branch lp:~thomir/+junk/mir-stress
cd mir-stress/
mkdir build
cd build
cmake ..
make

# start your mir_demo_server here

./mir-stress -n 10

This will stress the mir server for 10 seconds using multiple threads.
If you omit the -n parameter, it will run for 600 seconds (10 minutes).
By default, the number of threads started equals the number of cores on
your machine, but you can set it explicitly with -t. To start 8
threads:

./mir-stress -n 10 -t 8
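
For readers who want the shape of the test without fetching the branch, here is a minimal sketch of a threaded connect/release loop. It is not the actual mir-stress code: the helper name stress_connections and its parameters are hypothetical, and the connect and release steps are passed in as callables so the sketch compiles without the Mir client library.

```cpp
#include <algorithm>
#include <atomic>
#include <cassert>
#include <chrono>
#include <cstddef>
#include <thread>
#include <vector>

// Hypothetical helper (not taken from mir-stress): repeatedly acquire and
// release a connection from several threads until `duration` has elapsed.
// `connect` and `release` must be safe to call concurrently.
// Returns the total number of completed connect/release cycles.
template <typename Connect, typename Release>
std::size_t stress_connections(Connect connect, Release release,
                               unsigned thread_count,
                               std::chrono::milliseconds duration)
{
    assert(duration.count() >= 0);

    // A thread_count of 0 means "one thread per core".
    // hardware_concurrency() may itself return 0 if the count is unknown.
    if (thread_count == 0)
        thread_count = std::max(1u, std::thread::hardware_concurrency());

    std::atomic<std::size_t> cycles{0};
    auto const deadline = std::chrono::steady_clock::now() + duration;

    std::vector<std::thread> workers;
    for (unsigned i = 0; i < thread_count; ++i)
    {
        workers.emplace_back([&] {
            // Each worker performs at least one cycle, then loops
            // until the shared deadline passes.
            do
            {
                auto connection = connect();
                release(connection);
                ++cycles;
            } while (std::chrono::steady_clock::now() < deadline);
        });
    }

    for (auto& worker : workers)
        worker.join();

    return cycles.load();
}
```

With the Mir client library, the two callables would presumably wrap mir_connect_sync and mir_connection_release; the exact arguments those functions take are not shown here and should be checked against the client headers.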

For me, less than a second after starting, the mir-stress application
reports the following errors:

ERROR: void mir::client::MirSocketRpcChannel::on_header_read(const boost::system::error_code&)ERROR: 
... ERROR: void mir::client::MirSocketRpcChannel::on_header_read(const boost::system::error_code&)
... Connection reset by peer
Connection reset by peer
void mir::client::MirSocketRpcChannel::on_header_read(const boost::system::error_code&)
... Connection reset by peer
ERROR: void mir::client::MirSocketRpcChannel::on_header_read(const boost::system::error_code&)
... Connection reset by peer


... and the mir server seems to hang. The only way I can get a usable system again is to restart lightdm.

Initially I thought that perhaps the mir client library was not built to be thread-safe, so I added an option to make mir-stress spawn multiple processes instead of multiple threads. Just add the -p parameter:

./mir-stress -p -n 10

This spawns $(num_cores) processes, with exactly the same results.
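
The process-per-worker mode can be sketched in the same spirit. Again, this is an illustration under assumptions rather than the mir-stress implementation: run_in_processes is a hypothetical name, and it relies only on POSIX fork and waitpid. Each child runs the supplied work function in its own address space, so no client-library state is shared between workers.

```cpp
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#include <cassert>
#include <vector>

// Hypothetical helper (not taken from mir-stress): run `work` in
// `process_count` forked child processes and wait for all of them.
// `work` returns the child's exit code (0 means success).
// Returns how many children exited normally with status 0.
template <typename Work>
unsigned run_in_processes(Work work, unsigned process_count)
{
    assert(process_count > 0);

    std::vector<pid_t> children;
    for (unsigned i = 0; i < process_count; ++i)
    {
        pid_t const pid = fork();
        if (pid == 0)
        {
            // Child: do the work, then leave immediately without running
            // the parent's atexit handlers or returning into its code.
            int exit_code = 1;
            try
            {
                exit_code = work();
            }
            catch (...)
            {
                exit_code = 1;
            }
            _exit(exit_code);
        }
        if (pid > 0)
            children.push_back(pid);
        // pid < 0: fork failed; that worker is simply not counted.
    }

    unsigned succeeded = 0;
    for (pid_t const child : children)
    {
        int status = 0;
        if (waitpid(child, &status, 0) == child &&
            WIFEXITED(status) && WEXITSTATUS(status) == 0)
        {
            ++succeeded;
        }
    }
    return succeeded;
}
```

A child killed by a signal, or one that exits non-zero, is not counted as a success, which makes it easy to tell a client-side failure from a clean run.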

I've tried using gdbserver to debug the crashing mir_demo_server, but I've not been able to get a decent stack trace out of gdb, even when I compiled mir_demo_server from source. The best I can get is:

#0  0x00007ffff6acb037 in ?? ()
#1  0x00007ffff6ace698 in ?? ()
#2  0x0000000000000020 in ?? ()
#3  0x0000000000000000 in ?? ()

...which isn't very useful. I notice that in the CMakeLists.txt in the
root directory we set the CXX_FLAGS to include "-g", which, AIUI, should
be all that's needed.


So, at this point, I have a few questions for you all:

1) Is the mir client API designed to be thread-safe?

2) Have I actually uncovered a genuine issue, or is this a case of
"Thomi is doing something dumb"? Can anyone else here reproduce these
results?


Cheers,

-- Thomi Richards
