How to reproduce autotest fails

From Qt Wiki
Latest revision as of 13:53, 23 October 2024

The current CI system suffers from high CPU load. Because the virtual machines share resources, tests can be starved of CPU time or memory under high load, and autotests may then fail or crash intermittently. Here are a few pointers on how to try to reproduce such issues on different platforms.

As a rule of thumb, you need to put load on your system, lower the test's priority (increase its niceness), limit the available memory, and pin the test to a fixed processor (set its affinity).

If you don't want to jeopardize your own machine, you can ask someone from the CI team to clone a VM for you. Anyone from the release team can also help you create a VM. Just ask in #qt-qa or #qt-labs on IRC. You can also use minicoin to bring up a usually-suitable VM for the platforms it supports. If you create a new bug report for the failing autotest, remember to add the labels 'autotest' and 'flaky' to the label field so that it gets tracked properly.

When diagnosing a crashing test, passing the command line option "-nocrashhandler" to the test suppresses dumping of the stack trace by QTestlib and makes it possible to attach a debugger or have it launched automatically by the OS (post-mortem).

Linux

1. Use 'stress' or 'stress-ng' to put CPU, I/O and memory load on your system

  • Install
Ubuntu and/or Debian:
sudo apt-get install stress
openSUSE:
sudo zypper install stress-ng
  • Example usage:
Ubuntu and/or Debian:
stress --cpu 8 --io 4 --vm 2 --vm-bytes 128M
openSUSE:
stress-ng --cpu 8 --io 4 --vm 2 --vm-bytes 128M
  • Check with 'top' that it is running.

See also: https://linux.die.net/man/1/stress and https://www.cyberciti.biz/faq/stress-test-linux-unix-server-with-stress-ng/
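To keep the load generator from running on after the test has finished, the two can be tied together. The following is a minimal bash sketch, not part of the original page: `with_load` is a hypothetical helper name, and it assumes the load command stops its workers when it receives SIGTERM (if workers linger, `killall stress` cleans up).

```shell
#!/bin/bash
# Hypothetical helper: run a command while a load generator runs in the
# background, then stop the load generator again.
# $1 is the load command as one string; the rest is the test command.
with_load() {
    local load_cmd=$1; shift
    $load_cmd &                 # intentionally unquoted: split into words
    local load_pid=$!
    "$@"                        # run the test itself
    local status=$?
    kill "$load_pid" 2>/dev/null
    wait "$load_pid" 2>/dev/null
    return "$status"            # report the test's exit status
}
```

A possible invocation, reusing the example from above: `with_load "stress --cpu 8 --io 4 --vm 2 --vm-bytes 128M" ./tst_example`.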


2. Increase niceness

   nice -n 19 ./test

See: http://bencane.com/2013/09/09/setting-process-cpu-priority-with-nice-and-renice/


3. Use 'taskset' to set process affinity

   taskset -c 1

This means "use the second core" (CPUs are numbered from 0). On its own this is only the option; 'taskset' also needs a command to run.

To launch a test with that affinity:

   taskset -c 1 ./tst_foo

From the 'taskset' man page: -c, --cpu-list "specify a numerical list of processors instead of a bitmask. The list may contain multiple items, separated by comma, and ranges. For example, 0,5,7,9-11." See more information: https://linux.die.net/man/1/taskset
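Without -c, 'taskset' expects a hexadecimal bitmask instead of a CPU list, so "taskset 0x2 ./tst_foo" is equivalent to "taskset -c 1 ./tst_foo". As a rough illustration of how the two notations relate, here is a small bash sketch; `cpus_to_mask` is a hypothetical helper, not something shipped with taskset, and it only accepts individual CPU numbers (no ranges).

```shell
#!/bin/bash
# Hypothetical helper: convert a list of CPU numbers into the
# hexadecimal bitmask form (bit N set means "CPU N is allowed").
cpus_to_mask() {
    local mask=0 cpu
    for cpu in "$@"; do
        mask=$(( mask | (1 << cpu) ))
    done
    printf '0x%x\n' "$mask"
}

cpus_to_mask 1        # prints 0x2
cpus_to_mask 0 5 7    # prints 0xa1
```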


4. Run tests in a loop over and over

Launch 'stress' (see step 1 above), then run the tests in a loop:

       for i in {0..100}; do taskset -c 1 ./tst_example >> log.txt 2>&1; done

Or just use the looping functionality of CTest:

       ctest --repeat-until-fail 100

You can also try to run the same tests from two different terminals and set the process affinity.
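The plain loop above always runs to the end and mixes all runs into one log. A variant that stops at the first failure and keeps one log per run could look like the following bash sketch; `run_until_fail` and the `run_N.log` file names are made up for this illustration.

```shell
#!/bin/bash
# Hypothetical helper: run a command up to $1 times and stop at the
# first run that exits with a non-zero status. Each run logs to its
# own file in the current directory.
run_until_fail() {
    local max=$1; shift
    local i
    for ((i = 1; i <= max; i++)); do
        if ! "$@" > "run_$i.log" 2>&1; then
            echo "failed on run $i, see run_$i.log"
            return 1
        fi
    done
    echo "no failure in $max runs"
}
```

For example: `run_until_fail 100 taskset -c 1 ./tst_example`.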


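5. Get a backtrace from a segmentation fault/core dump when there is no crash log

In a different terminal window, set the following. Note that libSegFault.so was removed in glibc 2.35, so the LD_PRELOAD line only applies to older distributions, and the library path may differ between distributions; the core dump approach works without it.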

       export LD_PRELOAD=/lib/x86_64-linux-gnu/libSegFault.so
       ulimit -c unlimited

After the crash, in the same terminal where you set LD_PRELOAD, load the core file into gdb:

       gdb ./testCrash ./core

Then, in gdb, print the backtrace:

       bt
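When a test crashes repeatedly, the interactive steps can be condensed into one non-interactive call. The bash sketch below assumes the core file is written to the current directory under the name "core"; on many systems /proc/sys/kernel/core_pattern sends it elsewhere, so adjust the path accordingly. `backtrace_from_core` is a hypothetical helper name.

```shell
#!/bin/bash
# Hypothetical helper: print backtraces of all threads from a core file
# without entering the interactive gdb prompt.
# $1 is the test binary, $2 the core file (defaults to ./core).
backtrace_from_core() {
    local binary=$1 core=${2:-core}
    if [ ! -f "$core" ]; then
        echo "no core file at $core (check 'ulimit -c' and core_pattern)" >&2
        return 1
    fi
    gdb -batch -ex "thread apply all bt" "$binary" "$core"
}
```

For example: `backtrace_from_core ./testCrash ./core > backtrace.txt`.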

6. Use rr

See https://rr-project.org/ and especially chaos mode for rr https://robert.ocallahan.org/2016/02/introducing-rr-chaos-mode.html to get threads to run with different priorities, hopefully reproducing races.

7. Limit the available memory

systemd-run --scope -p MemoryMax=500M ./tst_example

8. Combine several approaches

Limit memory, maximise niceness, use one core (hard to verify that these all work in conjunction, but they seem to):

systemd-run --scope -p MemoryMax=500M --user nice -n 19 taskset -c 0 ./tst_example
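If the memory limit and the core need to vary between runs, the combined command can be wrapped in a small function. This is only a sketch under the same assumptions as the command above; `run_constrained` and the `DRY_RUN` variable are names invented for this example.

```shell
#!/bin/bash
# Hypothetical helper: run a test with a memory limit, maximum niceness
# and a single-core affinity.
# $1 is the memory limit (e.g. 500M), $2 the CPU number, the rest is
# the test command. With DRY_RUN set, only print the composed command.
run_constrained() {
    local mem=$1 cpu=$2; shift 2
    local cmd=(systemd-run --scope --user -p "MemoryMax=$mem"
               nice -n 19 taskset -c "$cpu" "$@")
    if [ -n "${DRY_RUN:-}" ]; then
        echo "${cmd[*]}"
    else
        "${cmd[@]}"
    fi
}
```

For example, `DRY_RUN=1 run_constrained 500M 0 ./tst_example` prints the command line shown above instead of executing it.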

macOS

1. Stress testing the CPU

The 'yes' command prints the word “yes” in an endless loop, which keeps one processor core fully busy per instance. In a terminal, run:

   yes > /dev/null & yes > /dev/null & yes > /dev/null & yes > /dev/null &

Check with 'top' that you see 4 'yes' processes running. Run "killall yes" to kill all instances.

See: http://osxdaily.com/2012/10/02/stress-test-mac-cpu/
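To start one 'yes' process per core and stop exactly those processes afterwards (rather than every 'yes' on the machine), something like the following bash sketch could be used. The function names and the `YES_PIDS` array are hypothetical.

```shell
#!/bin/bash
# Hypothetical helpers: start N 'yes' processes as CPU load and remember
# their PIDs so that only these processes are stopped later.
start_yes_load() {
    local n=$1 i
    YES_PIDS=()
    for ((i = 0; i < n; i++)); do
        yes > /dev/null &
        YES_PIDS+=("$!")
    done
}

stop_yes_load() {
    kill "${YES_PIDS[@]}" 2>/dev/null
    wait "${YES_PIDS[@]}" 2>/dev/null
    YES_PIDS=()
}
```

On macOS the number of cores can be queried with sysctl, for example: `start_yes_load "$(sysctl -n hw.ncpu)"`, then run the test, then `stop_yes_load`.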

Windows

1. Stress testing the CPU:

Install CPUSTRES.EXE from https://blogs.msdn.microsoft.com/vijaysk/2012/10/26/tools-to-simulate-cpu-memory-disk-load/. Activate all threads (select them with the tick marks), then set the 'Thread Priority' of the threads to 'time critical' or 'highest' and their 'Activity' to 'Busy'.


2. Launch the test on a single CPU:

    start /B /WAIT /affinity 1 test.exe

The affinity value is a hexadecimal bitmask: 1 == use CPU 0, 2 == use CPU 1, 4 == use CPU 2, etc. See the table at: https://blogs.msdn.microsoft.com/santhoshonline/2011/11/24/how-to-launch-a-process-with-cpu-affinity-set/


3. Run the tests in a loop:

   for /L %i in (1, 1, 10) do start /B /WAIT /affinity 1 tst_example.exe >> log.txt 2>&1

(Note that the >> ensures that the output is appended to log.txt, rather than overwriting its contents after each run - see https://technet.microsoft.com/en-us/library/bb490982.aspx)

3.1. Run several tests in a loop simultaneously

You can also try to run the same tests from two different terminals and set the process affinity.

Note: When writing a .bat script, use: for /L %%i in (1, 1, 10)....