Fix: src.ctf.lttng-live: manually clear sessions vector in ~lttng_live_msg_iter()
author Simon Marchi <simon.marchi@efficios.com>
Sat, 31 Aug 2024 02:39:50 +0000 (22:39 -0400)
committer Simon Marchi <simon.marchi@efficios.com>
Wed, 11 Sep 2024 03:25:36 +0000 (23:25 -0400)
commit d263b0b1505d4581badc00fea75c6303bc84c825
tree 4139002f795355fcde029b255edcf10a7472a806
parent 4d6634b83c864b6f2d40619cfe6dedb1c55701c7

Starting with 751aaa6218f ("src.ctf.lttng-live: make
lttng_live_msg_iter::sessions an std::vector"), when a live message
iterator gets destroyed while some stream iterators are still active, we
hit this assertion failure:

    (╯°□°)╯︵ ┻━┻  /home/simark/src/babeltrace/src/plugins/ctf/lttng-live/lttng-live.cpp:153: ~lttng_live_msg_iter(): Assertion `this->active_stream_iter == 0` failed.

When a live message iterator completes successfully, by consuming
everything the relay has to offer until all stream iterators are ended,
there are no more active stream iterators when reaching
~lttng_live_msg_iter() (they all get destroyed by
next_stream_iterator_for_trace() when they end).  So the
`this->active_stream_iter == 0` assertion in the destructor holds.

But when an unexpected exit occurs, for instance because of an error or
a SIGINT interruption, the message iterator is destroyed while there are
still some stream iterators active.

Prior to 751aaa6218f, lttng_live_msg_iter_destroy() (which has since
been renamed to ~lttng_live_msg_iter()) had:

    if (lttng_live_msg_iter->sessions) {
        g_ptr_array_free(lttng_live_msg_iter->sessions, TRUE);
    }

Clearing the `sessions` array caused the `lttng_live_session` objects to
be destroyed, which caused the `lttng_live_trace` objects to be
destroyed, which caused the stream iterators to be destroyed.  When
reaching the assertion, the count of active stream iterators was
therefore 0.

With the refactor introduced by 751aaa6218f, we rely on the destruction
of the `sessions` vector for all this to happen.  As a data member, the
vector is destroyed only after the body of the user-provided destructor
has run, therefore after the assertion.  At the time the assertion is
checked, there are still some stream iterators alive.
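
The ordering described above can be reproduced with a minimal sketch.
The `StreamIter` and `MsgIter` classes and the
`activeSeenInDestructorBody()` helper are hypothetical stand-ins, not
the real lttng-live types.  The sketch only shows that a destructor body
runs before the data members are destroyed:

```cpp
#include <cassert>
#include <memory>
#include <vector>

namespace order_sketch {

// Hypothetical stand-in for a stream iterator: it bumps a shared counter
// while alive, playing the role of `active_stream_iter`.
class StreamIter
{
public:
    explicit StreamIter(int& active) : _active {&active}
    {
        ++*_active;
    }

    ~StreamIter()
    {
        --*_active;
    }

    StreamIter(const StreamIter&) = delete;
    StreamIter& operator=(const StreamIter&) = delete;

private:
    int *_active;
};

// Hypothetical stand-in for the message iterator.
class MsgIter
{
public:
    MsgIter(int& active, int& seenInBody) : _active {&active}, _seenInBody {&seenInBody}
    {
    }

    ~MsgIter()
    {
        // The destructor body runs first: `_streams` is still populated
        // here, so the counter has not been decremented yet.
        *_seenInBody = *_active;

        // `_streams` is destroyed only after this body returns.
    }

    void addStream()
    {
        _streams.push_back(std::unique_ptr<StreamIter>(new StreamIter(*_active)));
    }

private:
    int *_active;
    int *_seenInBody;
    std::vector<std::unique_ptr<StreamIter>> _streams;
};

// Returns how many stream iterators were still alive when the body of
// ~MsgIter() ran.
inline int activeSeenInDestructorBody(int streamCount)
{
    assert(streamCount >= 0);

    int active = 0;
    int seen = -1;

    {
        MsgIter iter {active, seen};

        for (int i = 0; i < streamCount; ++i) {
            iter.addStream();
        }
    }

    return seen;
}

} // namespace order_sketch
```

With two streams added, the destructor body observes a count of 2, which
is the situation in which an assertion placed there would fail.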

Fix this by manually clearing the sessions vector before the assertion,
to mimic the old behavior.
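
As an illustration only, here is a sketch of that pattern using
hypothetical stand-in types (`StreamIter`, `MsgIter`, and the
`activeSeenAfterManualClear()` helper are assumptions, not the actual
lttng-live code).  Clearing the owning vector at the top of the
destructor body destroys the owned objects before the check:

```cpp
#include <cassert>
#include <memory>
#include <vector>

namespace fix_sketch {

// Hypothetical stand-in for a stream iterator: it bumps a shared counter
// while alive, playing the role of `active_stream_iter`.
class StreamIter
{
public:
    explicit StreamIter(int& active) : _active {&active}
    {
        ++*_active;
    }

    ~StreamIter()
    {
        --*_active;
    }

    StreamIter(const StreamIter&) = delete;
    StreamIter& operator=(const StreamIter&) = delete;

private:
    int *_active;
};

// Hypothetical stand-in for the message iterator.
class MsgIter
{
public:
    MsgIter(int& active, int& seenInBody) : _active {&active}, _seenInBody {&seenInBody}
    {
    }

    ~MsgIter()
    {
        // Destroy the owned objects now instead of waiting for the
        // implicit member destruction that follows this body.
        _streams.clear();

        // Every StreamIter destructor has run, so the check holds.
        *_seenInBody = *_active;
        assert(*_active == 0);
    }

    void addStream()
    {
        _streams.push_back(std::unique_ptr<StreamIter>(new StreamIter(*_active)));
    }

private:
    int *_active;
    int *_seenInBody;
    std::vector<std::unique_ptr<StreamIter>> _streams;
};

// Returns how many stream iterators were still alive when the check in
// ~MsgIter() ran.
inline int activeSeenAfterManualClear(int streamCount)
{
    int active = 0;
    int seen = -1;

    {
        MsgIter iter {active, seen};

        for (int i = 0; i < streamCount; ++i) {
            iter.addStream();
        }
    }

    return seen;
}

} // namespace fix_sketch
```

In this sketch the count seen by the check is 0 regardless of how many
streams were added before destruction.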

Add a test that purposefully sends metadata with a syntax error, which
triggers the bug.

The `kill_server_on_cli_failure` parameter added to
get_cli_output_with_lttng_live_server() keeps the test script from
trying to kill the Python live server in the new test.  Since
babeltrace successfully connects to the server and then disconnects, the
server exits on its own.  Trying to kill it anyway would print a "kill
1234 failed: no such process" message.  Not fatal, but not welcome
either.  It's not pretty, but it works.

Change-Id: Id777e7ac0294d89ea236dd05651c8fbe708b2be4
Signed-off-by: Simon Marchi <simon.marchi@efficios.com>
Reviewed-on: https://review.lttng.org/c/babeltrace/+/13204
Tested-by: jenkins <jenkins@lttng.org>
Reviewed-by: Philippe Proulx <eeppeliteloop@gmail.com>
src/plugins/ctf/lttng-live/lttng-live.cpp
tests/data/ctf-traces/1/live/invalid-metadata/channel0_0 [new file with mode: 0644]
tests/data/ctf-traces/1/live/invalid-metadata/index/channel0_0.idx [new file with mode: 0644]
tests/data/ctf-traces/1/live/invalid-metadata/metadata [new file with mode: 0644]
tests/data/plugins/src.ctf.lttng-live/invalid-metadata.json [new file with mode: 0644]
tests/plugins/src.ctf.lttng-live/test-live.sh