lttng-tools.git
2 years ago  Fix: sessiond: action-executor: misquoted strings in logging
Jérémie Galarneau [Fri, 10 Dec 2021 21:13:27 +0000 (16:13 -0500)] 
Fix: sessiond: action-executor: misquoted strings in logging

Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I5b37472c0b9e49116ec542c57346a78359d732d7

2 years ago  Tests: live kernel: no plan printed when non-root
Jérémie Galarneau [Fri, 10 Dec 2021 19:10:13 +0000 (14:10 -0500)] 
Tests: live kernel: no plan printed when non-root

The live kernel test does not produce valid TAP output when it is
skipped because it is not run as root.

Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I4a10de494260084216ddb1b8f4ee27a546c4d8ed

2 years ago  Fix: sessiond: assert on lttng_ht_add_unique_str on ltt_sessions_ht_by_name
Jonathan Rajotte [Thu, 9 Dec 2021 20:14:26 +0000 (15:14 -0500)] 
Fix: sessiond: assert on lttng_ht_add_unique_str on ltt_sessions_ht_by_name

Observed issue
==============

The lttng-sessiond asserts with the following backtrace on lttng create:

 #0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
 #1  0x00007ffff7ab5859 in __GI_abort () at abort.c:79
 #2  0x00007ffff7ab5729 in __assert_fail_base (fmt=0x7ffff7c4b588 "%s%s%s:%u: %s%sAssertion `%s' failed.\n%n", assertion=0x5555556ab0a6 "node_ptr == &node->node", file=0x5555556ab085 "hashtable.c", line=298, function=<optimized out>) at a
 #3  0x00007ffff7ac6f36 in __GI___assert_fail (assertion=assertion@entry=0x5555556ab0a6 "node_ptr == &node->node", file=file@entry=0x5555556ab085 "hashtable.c", line=line@entry=298, function=function@entry=0x5555556ab380 <__PRETTY_FUNCTIO
 #4  0x000055555560be44 in lttng_ht_add_unique_str (ht=<optimized out>, node=0x7fffe0026c58) at hashtable.c:298
 #5  0x000055555558fb6a in add_session_ht (ls=0x7fffe0024970) at session.c:372
 #6  session_create (name=<optimized out>, uid=1000, gid=1000, out_session=out_session@entry=0x7fffedfddbd8) at session.c:1308
 #7  0x000055555559b219 in cmd_create_session_from_descriptor (creds=<optimized out>, creds=<optimized out>, home_path=<optimized out>, descriptor=<optimized out>) at cmd.c:3040
 #8  cmd_create_session (cmd_ctx=cmd_ctx@entry=0x7fffedfe5fa0, sock=<optimized out>, return_descriptor=return_descriptor@entry=0x7fffedfddd68) at cmd.c:3176
 #9  0x00005555555cc341 in process_client_msg (sock_error=0x7fffedfddd10, sock=0x7fffedfddd0c, cmd_ctx=0x7fffedfe5fa0) at client.c:2177
 #10 thread_manage_clients (data=<optimized out>) at client.c:2742
 #11 0x00005555555c5fff in launch_thread (data=0x55555571b780) at thread.c:66
 #12 0x00007ffff7c8b609 in start_thread (arg=<optimized out>) at pthread_create.c:477
 #13 0x00007ffff7bb2293 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

The issue can be reproduced with modifications to the rotation thread
code and the following scenario:

 $ lttng create test
 $ lttng enable-event -u -a
 $ lttng start
 # Run any app so that we have a complete, valid session (might not be necessary).
 $ lttng destroy --no-wait
 $ lttng create test
 ^ Should assert here.

The diff to be applied:

 diff --git a/src/bin/lttng-sessiond/rotation-thread.cpp b/src/bin/lttng-sessiond/rotation-thread.cpp
 index ac149c845..c11f068ed 100644
 --- a/src/bin/lttng-sessiond/rotation-thread.cpp
 +++ b/src/bin/lttng-sessiond/rotation-thread.cpp
 @@ -565,6 +565,8 @@ int handle_job_queue(struct rotation_thread_handle *handle,
  {
         int ret = 0;

 +       sleep(5);
 +
         for (;;) {
                 struct ltt_session *session;
                 struct rotation_thread_job *job;

Note that the initial report for this issue came from a system under
load on which the `lttng destroy` completion check failed and an `lttng
create` was then performed. As of today, the exact reason why the
completion check failed is not known. Still, we can "fix" the race
leading to the lttng-sessiond assertion, considering that a user might
use the `--no-wait` variant of `lttng destroy` and could easily end up
in this situation.

Cause
=====

Note: all `lttng create` commands have the same session name passed as
argument.

On `lttng destroy` the ltt_session object is flagged as destroyed
(ltt_session::destroyed). The removal of the object from the hash
table (ltt_sessions_ht_by_name) will be performed during the
`session_release` which is driven by the session refcount.

A reference on the `ltt_session` object is held for the
rotation initiated by the `lttng destroy` command. The rotation is
enqueued by the rotation thread.

At this point the system is busy and the rotation thread does not run.
We simulate this with a `sleep(5)` during the `handle_job_queue`.

The `lttng destroy --no-wait` returns. If the `--no-wait` option is not
passed the `lttng destroy` command will work as expected and wait for
completion. We can SIGINT the `lttng destroy` command and perform a
`lttng create` yielding the same backtrace.

On `lttng create`, `session_create` validates that the name does not
conflict with an existing session using `session_find_by_name`. It is
important to note that `session_find_by_name` also discriminates on the
`session->destroyed` flag (introduced by [1]).

The `ltt_sessions_ht_by_name` hash table was introduced by [2] to remove
the need to lock the session list to sample a session id during the
queueing of actions to be executed related to a trigger. The assumption
was made that, during the creation phase, the session would
always be unique in that hash table based on its name. This is simply
not true since multiple sessions with the same name can coexist as long
as only a single one is marked as "not destroyed". This is an important
concept due to the refcounting of the object and the features relying on
the lifetime of the object (i.e. rotation). This is mostly valid when
talking about the global session list.

Solution
========

Move the hash table removal earlier during the release of the session
object.

Move the removal from `del_session_ht`, which is done during the
`session_release` function, to the `lttng_session_destroy` function.

It is safe to do so since currently the only user of that hash table
(the action executor) does not care much about destroyed sessions at
that point.

This ensures that we maintain the uniqueness property of the key (name)
for that hash table on insertion.

The alternative was to expose a hash table that could contain
duplicates and force the handling of a set on all lookups.
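
The early removal can be illustrated with a small, self-contained model.
This is only a sketch under stated assumptions: every `sketch_*` name is
hypothetical, a fixed-size array stands in for `ltt_sessions_ht_by_name`,
and the real locking and RCU handling are left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

#define SKETCH_MAX_SESSIONS 8
#define SKETCH_NAME_MAX 64

/* Hypothetical, simplified stand-in for the session object. */
struct sketch_session {
	char name[SKETCH_NAME_MAX];
	bool destroyed;
	int refcount;
};

/* Stand-in for the by-name hash table: at most one entry per name. */
struct sketch_session *sketch_by_name[SKETCH_MAX_SESSIONS];

struct sketch_session *sketch_index_lookup(const char *name)
{
	for (int i = 0; i < SKETCH_MAX_SESSIONS; i++) {
		if (sketch_by_name[i] &&
				strcmp(sketch_by_name[i]->name, name) == 0) {
			return sketch_by_name[i];
		}
	}
	return NULL;
}

void sketch_index_remove(const struct sketch_session *s)
{
	for (int i = 0; i < SKETCH_MAX_SESSIONS; i++) {
		if (sketch_by_name[i] == s) {
			sketch_by_name[i] = NULL;
		}
	}
}

struct sketch_session *sketch_session_create(const char *name)
{
	/* Refuse names that are too long or already present in the index. */
	if (strlen(name) >= SKETCH_NAME_MAX || sketch_index_lookup(name)) {
		return NULL;
	}
	for (int i = 0; i < SKETCH_MAX_SESSIONS; i++) {
		if (!sketch_by_name[i]) {
			struct sketch_session *s = calloc(1, sizeof(*s));

			if (!s) {
				return NULL;
			}
			strcpy(s->name, name);
			s->refcount = 1;
			sketch_by_name[i] = s;
			return s;
		}
	}
	return NULL;
}

void sketch_session_get(struct sketch_session *s)
{
	s->refcount++;
}

/* Returns true when the last reference was dropped and the object freed. */
bool sketch_session_put(struct sketch_session *s)
{
	assert(s->refcount > 0);
	if (--s->refcount > 0) {
		return false;
	}
	sketch_index_remove(s); /* No-op if destroy already removed it. */
	free(s);
	return true;
}

/* Flag the object and drop it from the index before the final release. */
void sketch_session_destroy(struct sketch_session *s)
{
	s->destroyed = true;
	sketch_index_remove(s);
	sketch_session_put(s); /* Drop the creation reference. */
}
```

With this ordering, a destroyed session that is still referenced (for
instance by a pending rotation) no longer occupies its name in the index,
so creating a new session with the same name inserts a unique key.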

Known drawbacks
===============

None.

References
==========
[1] https://git.lttng.org/?p=lttng-tools.git;a=commit;h=e32d7f274604b77bcd83c24994e88df3761ed658
[2] https://git.lttng.org/?p=lttng-tools.git;a=commit;h=e1bbf98908a6399f39a9a8bc95bd8b59cecaa816

Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I2f1d0d6c04ee7210166e9847a850afbe6eaa7609

2 years ago  Fix: sessiond: snapshot: leak of trace chunk
Jérémie Galarneau [Wed, 3 Nov 2021 02:31:15 +0000 (22:31 -0400)] 
Fix: sessiond: snapshot: leak of trace chunk

Valgrind reports a leak after every snapshot record command:

==827791== 430 (280 direct, 150 indirect) bytes in 1 blocks are definitely lost in loss record 34 of 37
==827791==    at 0x48435FF: calloc (vg_replace_malloc.c:1117)
==827791==    by 0x223D01: zmalloc (macros.h:45)
==827791==    by 0x224B79: lttng_trace_chunk_allocate (trace-chunk.c:387)
==827791==    by 0x224E41: lttng_trace_chunk_create (trace-chunk.c:427)
==827791==    by 0x150B55: session_create_new_trace_chunk (session.c:656)
==827791==    by 0x164A11: snapshot_record (cmd.c:5113)
==827791==    by 0x1651EE: cmd_snapshot_record (cmd.c:5302)
==827791==    by 0x196E74: process_client_msg (client.c:2166)
==827791==    by 0x198AF1: thread_manage_clients (client.c:2742)
==827791==    by 0x18E245: launch_thread (thread.c:66)
==827791==    by 0x4B9E258: start_thread (in /usr/lib/libpthread-2.33.so)
==827791==    by 0x4CB45E2: clone (in /usr/lib/libc-2.33.so)

session_set_trace_chunk() on line 5162 returns a reference to the
current trace chunk which is never released.
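
The ownership rule behind this leak can be sketched as follows. This is
an illustrative model only: the `sketch_*` names are hypothetical, and it
assumes a setter that hands the previously installed chunk back to the
caller through an output parameter.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical reference-counted chunk and its owner. */
struct sketch_chunk {
	int refcount;
};

struct sketch_holder {
	struct sketch_chunk *current;
};

void sketch_chunk_get(struct sketch_chunk *c)
{
	if (c) {
		c->refcount++;
	}
}

void sketch_chunk_put(struct sketch_chunk *c)
{
	if (c) {
		assert(c->refcount > 0);
		c->refcount--;
	}
}

/*
 * Install new_chunk. The reference held on the previous chunk is
 * transferred to the caller through *old; the caller must release it.
 */
void sketch_set_chunk(struct sketch_holder *h, struct sketch_chunk *new_chunk,
		struct sketch_chunk **old)
{
	sketch_chunk_get(new_chunk);
	*old = h->current;
	h->current = new_chunk;
}

/* A caller that does not leak: it puts the reference it was handed. */
void sketch_replace_chunk(struct sketch_holder *h,
		struct sketch_chunk *new_chunk)
{
	struct sketch_chunk *previous = NULL;

	sketch_set_chunk(h, new_chunk, &previous);
	sketch_chunk_put(previous);
}
```

Omitting the final `sketch_chunk_put()` in this model leaves the previous
chunk with one extra reference after every replacement, which is the
kind of leak Valgrind reports above.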

This also causes tests/regression/tools/snapshots/test_ust_long to fail
due to file descriptor exhaustion (presumably from using too many
directory file descriptors) when it is executed by an unprivileged user.

The CI doesn't catch this since the long regression test suite is
executed as root.

Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I4b6e45df48c3daafa2294c80ccd8a2b4d91401e1

2 years ago  Fix: test: use BABELTRACE_BIN instead of babeltrace
Jonathan Rajotte [Tue, 23 Nov 2021 17:15:42 +0000 (12:15 -0500)] 
Fix: test: use BABELTRACE_BIN instead of babeltrace

Observed issue
==============

The system tests job fails on the multi-session test since the move to bt2.

Cause
=====

The test uses `babeltrace` instead of `BABELTRACE_BIN`.

Solution
========

Use `BABELTRACE_BIN`.

Add a babeltrace bail out.

While there, fix easy shellcheck warnings.

Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I441d736e85c447c5765bffd520ec2f267c86048f

2 years ago  Fix: action executor: ref count imbalance for session object
Jonathan Rajotte [Thu, 11 Nov 2021 20:02:54 +0000 (15:02 -0500)] 
Fix: action executor: ref count imbalance for session object

Observed issue
==============

The following scenario leads to a hang on `lttng destroy`.

 # Start lttng-sessiond under gdb
 $ gdb lttng-sessiond
     set pagination off
     set non-stop
     start
     break action_executor_snapshot_session_handler

 $ lttng add-trigger --name my_trigger --condition=event-rule-matches --type=user:tracepoint --name=sample_component:message --action=snapshot-session my_snapshot
 $ lttng create --snapshot my_snapshot
 $ lttng enable-event -u -a
 $ lttng start

 # Start an app producing a single sample_component:message event

 # gdb should break on thread 6

 # inside gdb
     thread 6

 $ lttng destroy my_snapshot
 $ lttng create --snapshot my_snapshot
 $ lttng enable-event -u -a
 $ lttng start

 # inside gdb use `continue`

 $ lttng destroy my_snapshot

  The destroy command hangs:

  Destroying session my_snapshot.... ....

Cause
=====

The scenario forces the usage of the following code path:

 if (session->id != LTTNG_OPTIONAL_GET(item->context.session_id)) {
  624├───────────────> DBG("Session id for session `%s` (id: %" PRIu64
  625│                     " is not the same that was sampled (id: %" PRIu64
  626│                     " at the moment the work item was enqueued for %s` action of trigger `%s`",
  627│                                 session_name, session->id,
  628│                                 LTTNG_OPTIONAL_GET(item->context.session_id),
  629│                                 get_action_name(action),
  630│                                 get_trigger_name(work_item->trigger));
  631│                 ret = 0;
  632│                 goto error_unlock_list;
  633│         }

At that point a reference on the session object was taken on line:

 610│         session = session_find_by_name(session_name);

But the reference is never put on the `error_unlock_list` path,
resulting in a reference count imbalance.

Solution
========

Call `session_put` on that code path.
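
The shape of the fix can be sketched with a minimal model. All `sketch_*`
names below are hypothetical and a single global slot stands in for the
session list; the point is only that every exit taken after a successful
lookup goes through the label that releases the reference.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified session object. */
struct sketch_session {
	uint64_t id;
	int refcount;
};

/* One-slot stand-in for the session list. */
struct sketch_session *sketch_registered;

/* Lookup returns the session with a reference held, or NULL. */
struct sketch_session *sketch_find_session(void)
{
	if (sketch_registered) {
		sketch_registered->refcount++;
	}
	return sketch_registered;
}

void sketch_put_session(struct sketch_session *s)
{
	assert(s->refcount > 0);
	s->refcount--;
}

/* Returns 1 if the action ran, 0 if it was skipped. */
int sketch_handle_work_item(uint64_t sampled_id)
{
	int ran = 0;
	struct sketch_session *s = sketch_find_session();

	if (!s) {
		goto end;
	}
	if (s->id != sampled_id) {
		/* Same name, different session: skip, but still release. */
		goto put_session;
	}
	ran = 1; /* The action itself would be executed here. */
put_session:
	sketch_put_session(s);
end:
	return ran;
}
```

In this model the reference count is unchanged after the handler returns,
whether the action ran or was skipped because the sampled id differs.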

Note that most of the handlers have the same problem, which was
introduced by commit 72365501d3148ca977a09bad8de0ec51b427bdd8 [1].

Known drawbacks
===============

None.

References
==========
[1] https://github.com/lttng/lttng-tools/commit/72365501d3148ca977a09bad8de0ec51b427bdd8

Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I23c3c089866df74854bbfe64320310c4b28ee41d

2 years ago  Fix: relayd: `!vsession->current_trace_chunk` assertion failed
Mathieu Desnoyers [Thu, 2 Dec 2021 22:33:55 +0000 (17:33 -0500)] 
Fix: relayd: `!vsession->current_trace_chunk` assertion failed

Observed issue
==============

When performing:

  #!/bin/bash

  lttng create py_syscalls --live

  lttng enable-event -u -a
  lttng enable-event -k -a

  lttng start

  babeltrace2 -i lttng-live net://localhost/host/raton/py_syscalls

The relay daemon hits this assertion:

  Thread 8 (Thread 0x7fffeeffd700 (LWP 167040) "lttng-relayd"):
  #0  0x00007ffff7b1618b in raise () from /lib/x86_64-linux-gnu/libc.so.6
  #1  0x00007ffff7af5859 in abort () from /lib/x86_64-linux-gnu/libc.so.6
  #2  0x00007ffff7af5729 in ?? () from /lib/x86_64-linux-gnu/libc.so.6
  #3  0x00007ffff7b06f36 in __assert_fail () from /lib/x86_64-linux-gnu/libc.so.6
  #4  0x00005555555889bb in viewer_session_attach (vsession=0x7fffdc001400, session=session@entry=0x7fffe8001180) at viewer-session.c:80
  #5  0x000055555557bcff in viewer_attach_session (conn=0x7fffd0001140) at live.c:1275
  #6  process_control (conn=0x7fffd0001140, recv_hdr=0x7fffeeffcaf0) at live.c:2341
  #7  thread_worker (data=<optimized out>) at live.c:2515
  #8  0x00007ffff7ccd609 in start_thread () from /lib/x86_64-linux-gnu/libpthread.so.0
  #9  0x00007ffff7bf2293 in clone () from /lib/x86_64-linux-gnu/libc.so.6

Cause
=====

This assert appears to be entirely wrong.

It checks that the "viewer session" has a NULL current trace chunk when
attaching a session to a viewer session, but in the case where a viewer
session has multiple sessions (e.g. with kernel and ust tracing
combined), we are attaching each session individually to the viewer
session, and we set the current trace chunk of the viewer session when
we attach the first session to it.

So it is expected to be non-NULL when attaching the second session.

Solution
========

Remove the assertion.

Known limitations
=================

None.

Fixes: #1335
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I4d8f5d5347b4588144ddf449976cae5a94b81b3a

2 years ago  Fix: tests: fix unused-but-set warning in test_fd_tracker.c
Simon Marchi [Wed, 10 Nov 2021 13:42:25 +0000 (08:42 -0500)] 
Fix: tests: fix unused-but-set warning in test_fd_tracker.c

When building with clang-14 on Ubuntu 20.04, I get:

      CC       test_fd_tracker.o
    /home/smarchi/src/lttng-tools/tests/unit/test_fd_tracker.c:169:15: error: variable 'fds_set_to_minus_1' set but not used [-Werror,-Wunused-but-set-variable]
            unsigned int fds_set_to_minus_1 = 0;
                         ^

The compiler seems right, so remove fds_set_to_minus_1.  It might be
that the intention was to assert something using this variable, but I
couldn't figure it out.

Change-Id: I12bfd07bca7829de8d5b85d375d9b52bd84d677a
Signed-off-by: Simon Marchi <simon.marchi@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
2 years ago  Fix: sessiond: fix possible buffer overflow warning
Simon Marchi [Wed, 10 Nov 2021 13:39:22 +0000 (08:39 -0500)] 
Fix: sessiond: fix possible buffer overflow warning

When compiling with clang-14 on Ubuntu 20.04, I get:

      CC       lttng-syscall.lo
    /home/smarchi/src/lttng-tools/src/bin/lttng-sessiond/lttng-syscall.c:70:13: error: 'fscanf' may overflow; destination buffer in argument 4 has size 255, but the corresponding specifier may require size 256 [-Werror,-Wfortify-source]
                                    &index, name, &bitness) == 3) {
                                            ^

I think the compiler is right: we read a string of length up to 255 into
a buffer of size 255.  We need one more byte for the NUL terminator;
fix that.
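
The sizing rule can be shown with a small example. It is a sketch only:
`sketch_parse_entry` and the `"%u %255s %u"` line format are hypothetical
and do not reproduce the format parsed by lttng-syscall.c. What it relies
on is standard scanf behaviour: a `%255s` conversion stores up to 255
characters plus a terminating NUL, so the destination needs 256 bytes.

```c
#include <assert.h>
#include <stdio.h>

#define SKETCH_NAME_LEN 255

/*
 * Parse "<index> <name> <bitness>". The name buffer must hold
 * SKETCH_NAME_LEN + 1 bytes: up to 255 characters plus the NUL.
 * Returns 0 on success, -1 if the line does not match.
 */
int sketch_parse_entry(const char *line, unsigned int *index, char *name,
		unsigned int *bitness)
{
	assert(line && index && name && bitness);
	return sscanf(line, "%u %255s %u", index, name, bitness) == 3 ? 0 : -1;
}
```

A caller would declare the destination as `char name[SKETCH_NAME_LEN + 1];`
so that the longest accepted name still fits with its terminator.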

Change-Id: I6b2eec401af3ef6230dd4b6c8559032de9b54584
Signed-off-by: Simon Marchi <simon.marchi@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
2 years ago  Fix: tests: app unregistering is not guaranteed by app lifetime
Jonathan Rajotte [Mon, 23 Aug 2021 21:12:28 +0000 (17:12 -0400)] 
Fix: tests: app unregistering is not guaranteed by app lifetime

Observed issue
==============

The per-pid timer based rotation tests fail on a minimal ptest
yocto image.

The test suite reports that the second archive is not empty as it
expects.

Note that the yocto/OE image is running under QEMU without
KVM.

Cause
=====

Since the image is running under QEMU without KVM, the overall
processing capability of the VM is quite limited.

The test seems to assume that the app will be unregistered by the time
the second rotation is issued.

Note that the observable lifetime of an app is not equal to the
lttng-sessiond/consumerd app visibility since we deal with app
unregistration via a polling mechanism.

Note that, as far as I understand, this is a testing issue only.

It is still relevant in the context of rotation to validate that the second
rotation archive does NOT contain info for a "dead" app under per-pid
configuration.

Solution
========

Move the rotation timer operation after the point where the app has
registered and is then considered unregistered from the point of view of
lttng-sessiond/lttng-consumerd. This should give us a more robust
approach.

Known drawbacks
===============

None.

Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: Ie8c542d29ef8bdb325efc05de14e80b179c68754

2 years ago  Fix: lttng-ctl: tracing_group memory leaks
Jonathan Rajotte [Tue, 19 Oct 2021 19:22:39 +0000 (15:22 -0400)] 
Fix: lttng-ctl: tracing_group memory leaks

Observed issue
==============

liblttng-ctl leaks memory if `lttng_set_tracing_group` is called at least
once by an API client.

 joraj@~/lttng/master/lttng-tools-dev [master][]$ valgrind --leak-check=full lttng --group=joraj list
 ==24823== Memcheck, a memory error detector
 ==24823== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
 ==24823== Using Valgrind-3.15.0 and LibVEX; rerun with -h for copyright info
 ==24823== Command: lttng --group=joraj list
 ==24823==
 Error: No session daemon is available
 ==24823==
 ==24823== HEAP SUMMARY:
 ==24823==     in use at exit: 8 bytes in 1 blocks
 ==24823==   total heap usage: 55 allocs, 54 frees, 87,023 bytes allocated
 ==24823==
 ==24823== 8 bytes in 1 blocks are definitely lost in loss record 1 of 1
 ==24823==    at 0x483B7F3: malloc (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_memcheck-amd64-linux.so)
 ==24823==    by 0x4BA7DC7: __vasprintf_internal (vasprintf.c:71)
 ==24823==    by 0x4C4B742: __asprintf_chk (asprintf_chk.c:34)
 ==24823==    by 0x48687D9: asprintf (stdio2.h:181)
 ==24823==    by 0x48687D9: lttng_set_tracing_group (lttng-ctl.c:2620)
 ==24823==    by 0x4011B89: call_init.part.0 (dl-init.c:72)
 ==24823==    by 0x4011C90: call_init (dl-init.c:30)
 ==24823==    by 0x4011C90: _dl_init (dl-init.c:119)
 ==24823==    by 0x4001139: ??? (in /usr/lib/x86_64-linux-gnu/ld-2.31.so)
 ==24823==    by 0x2: ???
 ==24823==    by 0x1FFEFFFCFE: ???
 ==24823==    by 0x1FFEFFFD04: ???
 ==24823==    by 0x1FFEFFFD12: ???
 ==24823==
 ==24823== LEAK SUMMARY:
 ==24823==    definitely lost: 8 bytes in 1 blocks
 ==24823==    indirectly lost: 0 bytes in 0 blocks
 ==24823==      possibly lost: 0 bytes in 0 blocks
 ==24823==    still reachable: 0 bytes in 0 blocks
 ==24823==         suppressed: 0 bytes in 0 blocks
 ==24823==
 ==24823== For lists of detected and suppressed errors, rerun with: -s
 ==24823== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 0 from 0)

Cause
=====

The pointer allocated in the library constructor is not freed on
subsequent assignment.

Solution
========

Free the pointer.
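
A minimal sketch of the pattern, assuming a module-level string that is
first set by a constructor and later replaced by a setter. The names
`sketch_tracing_group` and `sketch_set_tracing_group` are hypothetical,
and plain malloc()/memcpy() is used here instead of asprintf() to keep
the example portable.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical module-level value, owned by this file. */
char *sketch_tracing_group;

/* Replace the stored name. Returns 0 on success, -1 on error. */
int sketch_set_tracing_group(const char *name)
{
	char *copy;
	size_t len;

	if (!name) {
		return -1;
	}
	len = strlen(name) + 1;
	copy = malloc(len);
	if (!copy) {
		return -1;
	}
	memcpy(copy, name, len);

	/* Release the value installed by an earlier call before replacing it. */
	free(sketch_tracing_group);
	sketch_tracing_group = copy;
	assert(sketch_tracing_group != NULL);
	return 0;
}
```

Allocating the copy before freeing the old value keeps the previous name
intact if the allocation fails.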

Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: Ie1d4c45df2764a88c74d56de691783df9215633c

2 years ago  Fix: use <unistd.h> instead of <sys/unistd.h>
Francis Deslauriers [Tue, 2 Nov 2021 13:33:02 +0000 (09:33 -0400)] 
Fix: use <unistd.h> instead of <sys/unistd.h>

Fixes: #1330
Signed-off-by: Francis Deslauriers <francis.deslauriers@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I07cabde5a0295de06f7c6f42dd12de803b57c907

2 years ago  Fix: Tests: unchecked `close()` return value
Francis Deslauriers [Mon, 1 Nov 2021 19:31:26 +0000 (15:31 -0400)] 
Fix: Tests: unchecked `close()` return value

CID 1465101 (#1 of 1): Unchecked return value (CHECKED_RETURN)
9. check_return: Calling close without checking return value (as is done
elsewhere 177 out of 185 times).

CID 1465100 (#1 of 1): Unchecked return value (CHECKED_RETURN)
4. check_return: Calling close without checking return value (as is done
elsewhere 177 out of 185 times)

CID 1465099 (#1 of 1): Unchecked return value (CHECKED_RETURN) 4.
check_return: Calling close without checking return value (as is done
elsewhere 177 out of 185 times).

CID 1465098 (#1 of 1): Unchecked return value (CHECKED_RETURN) 4.
check_return: Calling close without checking return value (as is done
elsewhere 177 out of 185 times).

CID 1465097 (#1 of 1): Unchecked return value (CHECKED_RETURN) 4.
check_return: Calling close without checking return value (as is done
elsewhere 177 out of 185 times).

Reported-by: Coverity Scan
Signed-off-by: Francis Deslauriers <francis.deslauriers@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I8e2552c75ab7cec5aa3707e2c1c4d9f2484b501a

2 years ago  Fix: relayd: live: mishandled initial null trace chunk
Jérémie Galarneau [Mon, 1 Nov 2021 19:43:55 +0000 (15:43 -0400)] 
Fix: relayd: live: mishandled initial null trace chunk

Observed issue
==============

As reported in #1323 (https://bugs.lttng.org/issues/1323), crashes of
the relay daemon are observed when running the user space clear tests.

The crash occurs with the following stack trace:
  #0  0x000055fbb861d6ae in urcu_ref_get_unless_zero (ref=0x28) at /usr/local/include/urcu/ref.h:85
  #1  lttng_trace_chunk_get (chunk=0x0) at trace-chunk.c:1836
  #2  0x000055fbb86051e2 in make_viewer_streams (relay_session=relay_session@entry=0x7f6ea002d540, viewer_session=<optimized out>, seek_t=seek_t@entry=LTTNG_VIEWER_SEEK_BEGINNING, nb_total=nb_total@entry=0x7f6ea9607b00, nb_unsent=nb_unsent@entry=0x7f6ea9607aec, nb_created=nb_created@entry=0x7f6ea9607ae8, closed=<optimized out>) at live.c:405
  #3  0x000055fbb86061d9 in viewer_get_new_streams (conn=0x7f6e94000fc0) at live.c:1155
  #4  process_control (conn=0x7f6e94000fc0, recv_hdr=0x7f6ea9607af0) at live.c:2353
  #5  thread_worker (data=<optimized out>) at live.c:2515
  #6  0x00007f6eae86a609 in start_thread () from /lib/x86_64-linux-gnu/libpthread.so.0
  #7  0x00007f6eae78f293 in clone () from /lib/x86_64-linux-gnu/libc.so.6

The race window during which this occurs seems very small as it can take
hours to reproduce this crash. However, a minimal reproducer could be
identified, as stated in the bug report.

Essentially, the same crash can be reproduced by attaching a live viewer
to a session that has seen events being produced, been stopped and been
cleared.

Cause
=====

The crash occurs as an attempt is made to take a reference to a viewer
session’s trace chunk as viewer streams are created. The crux of the
problem is that the code doesn’t expect a viewer session’s trace chunk
to be NULL.

The viewer session’s current trace chunk is initially set, when a viewer
attaches to the viewer session, to a copy of the corresponding
relay_session’s current trace chunk.

A live session always attempts to "catch-up" to the newest available
trace chunk. This means that when a viewer reaches the end of a trace
chunk, the viewer session may not transition to the "next" one: it jumps
to the most recent trace chunk available (the one being produced by the
relay_session). Hence, if the producer performs multiple rotations
before a viewer completes the consumption of a trace chunk, it will skip
over those "intermediary" trace chunks.

A viewer session updates its current trace chunk when:
  1) new viewer streams are created,
  2) a new index is requested,
  3) metadata is requested.

Hence, as a general principle, the viewer session will reference the
most recent trace chunk available _even if its streams do not point to
it_. It indicates which trace chunk viewer streams should transition to
when the end of their current trace chunk is reached.

The live code properly handles transitions to a null chunk. This can be
verified by attaching a viewer to a live session, stopping the session,
clearing it (thus entering a null trace chunk), and resuming tracing.

The only issue is that the case where the first trace chunk of a viewer
session is "null" (no active trace chunk) is mishandled in two places:
  1) in make_viewer_streams(), where the crash is observed,
  2) in viewer_get_metadata().

Solution
========

In make_viewer_streams(), it is assumed that a viewer session will have
a non-null trace chunk whenever a rotation is not ongoing. This is
reflected by the fact that a reference is always acquired on the viewer
session’s trace chunk.

That code is one of the three places that can cause a viewer session’s
trace chunk to be updated. We still want to update the viewer session to
the most recently seen trace chunk (null, in this case). However, there
is no reference to acquire and the trace chunk to use for the creation
of the viewer stream is NULL. This is properly handled by
viewer_stream_create().

The second site to change is viewer_get_metadata() which doesn’t handle
a viewer metadata stream not having an active trace chunk at all.
Thankfully, the protocol allows us to express this condition by
returning the LTTNG_VIEWER_NO_NEW_METADATA status code when a viewer
metadata stream doesn’t have an open file and doesn’t have a current
trace chunk.
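
Both changes can be sketched with a simplified model. Everything below is
hypothetical (`sketch_*` names, a bare reference counter, a two-value
status enum); it only illustrates taking a reference when there is a
chunk to reference, and reporting "no new metadata" when a metadata
stream has neither an open file nor a current trace chunk.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical reference-counted trace chunk. */
struct sketch_chunk {
	int refcount;
};

struct sketch_viewer_session {
	struct sketch_chunk *current; /* May legitimately be NULL. */
};

struct sketch_metadata_stream {
	bool has_open_file;
	struct sketch_chunk *chunk; /* May legitimately be NULL. */
};

enum sketch_metadata_status {
	SKETCH_METADATA_OK,
	SKETCH_NO_NEW_METADATA,
};

void sketch_chunk_get(struct sketch_chunk *c)
{
	assert(c); /* Calling this on NULL is the crash described above. */
	c->refcount++;
}

void sketch_chunk_put(struct sketch_chunk *c)
{
	if (c) {
		assert(c->refcount > 0);
		c->refcount--;
	}
}

/* Point the viewer session at the producer's latest chunk, NULL included. */
void sketch_viewer_session_update(struct sketch_viewer_session *vs,
		struct sketch_chunk *latest)
{
	if (vs->current == latest) {
		return;
	}
	if (latest) {
		sketch_chunk_get(latest); /* Only when there is a chunk. */
	}
	sketch_chunk_put(vs->current);
	vs->current = latest;
}

enum sketch_metadata_status sketch_get_metadata(
		const struct sketch_metadata_stream *stream)
{
	if (!stream->has_open_file && !stream->chunk) {
		/* Nothing to read from yet: ask the viewer to retry later. */
		return SKETCH_NO_NEW_METADATA;
	}
	return SKETCH_METADATA_OK;
}
```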

Surprisingly, this bug didn’t trigger in the case where a transition to
a null chunk occurred _after_ attaching to a viewer session.

This is because viewers will typically ask for metadata as a result of an
LTTNG_VIEWER_FLAG_NEW_METADATA reply to the GET_NEXT_INDEX command. When
a session is stopped and all data was consumed, this command returns
that no new data is available, causing the viewers to wait and ask again
later.

However, when attaching, babeltrace2 (at least, and probably babeltrace 1.x)
always asks for an initial segment of metadata before asking for an
index.

Known drawbacks
===============

None.

Fixes: #1323
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I516fca60755e6897f6b7170c12d706ef57ad61a5

2 years ago  Fix: configure.ac: reporting SDT uprobe as a UST feature
Francis Deslauriers [Thu, 30 Sep 2021 18:43:11 +0000 (14:43 -0400)] 
Fix: configure.ac: reporting SDT uprobe as a UST feature

Signed-off-by: Francis Deslauriers <francis.deslauriers@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I86638d6a148b04e7131e4af7ec830c5e56817fdc

2 years ago  Fix: Tests: leaking epoll fd
Francis Deslauriers [Thu, 7 Oct 2021 18:52:27 +0000 (14:52 -0400)] 
Fix: Tests: leaking epoll fd

Signed-off-by: Francis Deslauriers <francis.deslauriers@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I5ec4fcdb87159f35932c20e7314cda764d14967c

2 years ago  Typo: occurences -> occurrences
Francis Deslauriers [Mon, 25 Oct 2021 15:32:24 +0000 (11:32 -0400)] 
Typo: occurences -> occurrences

Signed-off-by: Francis Deslauriers <francis.deslauriers@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I719e26febd639f3b047b6aa6361fc6734088e871

3 years ago  Update version to v2.13.1
Jérémie Galarneau [Mon, 18 Oct 2021 21:09:27 +0000 (17:09 -0400)] 
Update version to v2.13.1

Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
3 years ago  Fix: ust: app stuck on recv message during UST comm timeout scenario
Jonathan Rajotte [Thu, 8 Jul 2021 18:17:51 +0000 (14:17 -0400)] 
Fix: ust: app stuck on recv message during UST comm timeout scenario

Observed issue
==============

The following scenario leads to the UST thread being "stuck" on recvmsg
on the notify socket.
The problem manifests itself when an application is unresponsive during
the ustctl_start_session call. Note that the default timeout for UST
communication is 5 seconds.

  # Start an instrumented app
  ./app
  gdb lttng-sessiond
  # put a breakpoint on ustctl_start_session
  lttng create my_session
  lttng enable-event -u -a
  lttng start
  # The breakpoint should hit. Do not continue.
  kill -s SIGSTOP $(pgrep app)
  # Continue lttng-sessiond.
  sleep 5 # This makes sure lttng-sessiond unregisters the app from its point of view
  kill -s SIGCONT $(pgrep app)
  gdb -p $(pgrep app)
  thread apply all bt

App stack trace:

  Thread 3 (Thread 0x7fe2c6f58700 (LWP 48172)):
  #0  __libc_recvmsg (flags=0, msg=0x7fe2c6f56ac0, fd=4) at ../sysdeps/unix/sysv/linux/recvmsg.c:28
  #1  __libc_recvmsg (fd=fd@entry=4, msg=msg@entry=0x7fe2c6f56ac0, flags=flags@entry=0) at ../sysdeps/unix/sysv/linux/recvmsg.c:25
  #2  0x00007fe2c7a010ba in ustcomm_recv_unix_sock (sock=sock@entry=4, buf=buf@entry=0x7fe2c6f56ea0, len=len@entry=48) at lttng-ust-comm.c:308
  #3  0x00007fe2c7a037c3 in ustcomm_register_channel (sock=4, session=session@entry=0x7fe2c0000ba0, session_objd=<optimized out>, channel_objd=<optimized out>, nr_ctx_fields=nr_ctx_fields@entry=0, ctx_fields=<optimized out>, chan_id=0x7fe2c6f5716c, header_type=0x7fe2c0012b18) at lttng-ust-comm.c:1544
  #4  0x00007fe2c7a10787 in lttng_session_enable (session=0x7fe2c0000ba0) at lttng-events.c:444
  #5  0x00007fe2c7a0b785 in lttng_session_cmd (objd=1, cmd=128, arg=140611977311672, uargs=0x7fe2c6f57800, owner=0x7fe2c7a5da00 <local_apps>) at lttng-ust-abi.c:576
  #6  0x00007fe2c7a07d6d in handle_message (lum=0x7fe2c6f57590, sock=3, sock_info=0x7fe2c7a5da00 <local_apps>) at lttng-ust-comm.c:1003
  #7  ust_listener_thread (arg=0x7fe2c7a5da00 <local_apps>) at lttng-ust-comm.c:1712
  #8  0x00007fe2c7993609 in start_thread (arg=<optimized out>) at pthread_create.c:477
  #9  0x00007fe2c78ba293 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

  ...

Cause
=====

When the app continues after the timeout on the lttng-sessiond side, the
actual start_session message is received on the application side, and
UST, on the app side, then sends commands on the notify socket. On the
lttng-sessiond side, the command is received but no reply is sent.

This is because the lookup against the ust_app_ht_by_notify_sock hash
table (find_app_by_notify_sock) returns nothing, since the app is
unregistered at this point and the hash table node was removed on
unregistration.

Solution
========

When the app lookup fails, return an error that will trigger the cleanup
of the notify socket.
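
The control flow can be sketched as follows. This is a simplified model
with hypothetical `sketch_*` names and a one-slot table standing in for
`ust_app_ht_by_notify_sock`; it only shows a failed lookup being reported
as an error, so that the caller can tear down the notify socket instead
of leaving the peer blocked in recvmsg().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical registered application, keyed by its notify socket. */
struct sketch_app {
	int notify_sock;
};

/* One-slot stand-in for the by-notify-socket hash table. */
struct sketch_app *sketch_registered_app;

struct sketch_app *sketch_find_app_by_notify_sock(int sock)
{
	if (sketch_registered_app &&
			sketch_registered_app->notify_sock == sock) {
		return sketch_registered_app;
	}
	return NULL;
}

/*
 * Handle one command received on a notify socket.
 * Returns 0 when a reply was sent, or a negative value when the caller
 * must clean up (close) the notify socket.
 */
int sketch_handle_notify_command(int sock, bool *reply_sent)
{
	struct sketch_app *app = sketch_find_app_by_notify_sock(sock);

	assert(reply_sent);
	*reply_sent = false;
	if (!app) {
		/* App already unregistered: no reply can be sent; fail. */
		return -1;
	}
	*reply_sent = true; /* The reply would be sent here. */
	return 0;
}
```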

Known drawbacks
===============
None

Note
====
Subsequent error paths in reply_ust_register_channel,
add_event_ust_registry, and add_enum_ust_registry might lead to the same
type of problem since no reply is sent to the app. Still, for those
cases the complete application/notify socket should not be destroyed
since the error paths relate to either a session or a sub-object of a
session.

Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Change-Id: Iea0dc027ca1ee772e84c7e545114f1be69fd1f63
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
3 years ago  Fix: ust: UST communication can return -EAGAIN
Jonathan Rajotte [Wed, 23 Jun 2021 02:17:03 +0000 (22:17 -0400)] 
Fix: ust: UST communication can return -EAGAIN

Observed issue
==============

The following scenario leads to an abort on event creation. The
problem manifests itself when an application is unresponsive. Note that
the default timeout for UST communication is 5 seconds.

  # Start an instrumented app
  ./app
  gdb lttng-sessiond
  # put a breakpoint on ustctl_create_event.
  lttng create my_session
  lttng enable-event -u -a
  lttng start
  # The breakpoint should hit. Do not continue.
  kill -s SIGSTOP $(pgrep app)
  # Continue lttng-sessiond.
  # lttng-sessiond will abort.

Note that this is not the expected behaviour for UST. An expected
communication failure with a single app should not invalidate the
complete channel, compromise its setup, or result in an abort.

Note that similar scenarios exist for the following ustctl call sites,
where the failure of a single app leads to error reporting and/or error
propagation to upper-level objects.

Problematic callsites:
   ustctl_set_exclusion
   ustctl_set_filter
   ustctl_disable_channel

These callsites are also fixed by this patch.

Cause
=====

For an unresponsive application, EAGAIN is returned and is treated as an
"unknown" hard error.

In this particular case, the abort() call was introduced by commit
88e3c2f5610b9ac89b0923d448fee34140fc46fb [1]. It is not clear whether this
is a leftover from a debugging session, since this is the only call site
where an abort is issued on a communication failure via ustctl.

Solution
========

Handle EAGAIN coming from ustctl_* and treat it the same way a
dying application is handled. The only minor difference is that we WARN
on communication timeout. While not the most useful thing for a CLI
client, it could help users of lttng-sessiond diagnose timeout
situations.

Most call sites already handled "unknown" errors correctly. For those
call sites, we simply end up providing more information about the
timeout instead of mentioning that "-11" was returned.

Note that the reclamation of "app" is handled by the poll loop and
ust_app_unregister, since the socket is shut down by lttng-ust internally
on error, including EAGAIN.

Note that the application will try to register itself back to the
lttng-sessiond based on its configuration.
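
The classification described above could look roughly like the sketch
below. It is an illustration under assumptions, not the patch itself:
`classify_ust_ret` and the `APP_COMM_*` values are hypothetical, and
`-EPIPE` is used here as the stand-in for the "dying application" return
code.

```c
#include <errno.h>
#include <stdio.h>

enum app_comm_status {
	APP_COMM_OK,
	APP_COMM_APP_GONE,	/* App died or timed out: skip it, keep the channel. */
	APP_COMM_HARD_ERROR,	/* Unexpected error: propagate to the caller. */
};

/* Map a negative errno-style return value to the action to take. */
static enum app_comm_status classify_ust_ret(int ret, int pid)
{
	if (ret >= 0) {
		return APP_COMM_OK;
	}

	switch (ret) {
	case -EPIPE:
		/* Assumed "application is dead" code for this sketch. */
		return APP_COMM_APP_GONE;
	case -EAGAIN:
		/* Same handling as a dying app, plus a warning. */
		fprintf(stderr, "Warning: communication with application timed out: pid = %d\n", pid);
		return APP_COMM_APP_GONE;
	default:
		return APP_COMM_HARD_ERROR;
	}
}
```

With this shape, a single unresponsive app only affects itself: the
caller skips it and carries on with the remaining apps.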

Known drawbacks
===============
None

Note
====

Some logging call sites used the ppid of the app instead of the pid.
Those have been changed to pid.

References
==========
[1] https://github.com/lttng/lttng-tools/commit/88e3c2f5610b9ac89b0923d448fee34140fc46fb

Fixes: #1384
Change-Id: If364b5d48e7fd2b664276a0fb1b7eec2c45ed683
Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
3 years agoFix: ust: segfault on lttng start on filter bytecode copy
Jonathan Rajotte [Mon, 12 Jul 2021 20:44:38 +0000 (16:44 -0400)] 
Fix: ust: segfault on lttng start on filter bytecode copy

Observed issue
==============

A segmentation fault is observed for multiple UST timeout scenarios.

Backtrace:

 #0  __memmove_avx_unaligned_erms () at ../sysdeps/x86_64/multiarch/memmove-vec-unaligned-erms.S:384
 #1  0x0000557fe0395df9 in copy_filter_bytecode (orig_f=0x7f9c5802b790) at ust-app.c:1196
 #2  0x0000557fe0397702 in shadow_copy_event (ua_event=0x7f9c58025ff0, uevent=0x7f9c58033560) at ust-app.c:1824
 #3  0x0000557fe039ac46 in create_ust_app_event (ua_sess=0x7f9c5802ec20, ua_chan=0x7f9c58025cc0, uevent=0x7f9c58033560, app=0x7f9c5c001da0) at ust-app.c:3192
 #4  0x0000557fe03a054d in ust_app_channel_synchronize_event (ua_chan=0x7f9c58025cc0, uevent=0x7f9c58033560, ua_sess=0x7f9c5802ec20, app=0x7f9c5c001da0) at ust-app.c:5096
 #5  0x0000557fe03a0772 in ust_app_synchronize (usess=0x7f9c580074a0, app=0x7f9c5c001da0) at ust-app.c:5173
 #6  0x0000557fe03a0a70 in ust_app_global_update (usess=0x7f9c580074a0, app=0x7f9c5c001da0) at ust-app.c:5255
 #7  0x0000557fe03a00e0 in ust_app_start_trace_all (usess=0x7f9c580074a0) at ust-app.c:4987
 #8  0x0000557fe0355c6a in cmd_start_trace (session=0x7f9c5800a190) at cmd.c:2668
 #9  0x0000557fe0382e70 in process_client_msg (cmd_ctx=0x7f9c58003d70, sock=0x7f9c74bf44e0, sock_error=0x7f9c74bf44e4) at client.c:1527
 #10 0x0000557fe03848a2 in thread_manage_clients (data=0x557fe06d9440) at client.c:2200
 #11 0x0000557fe037d1cb in launch_thread (data=0x557fe06d94b0) at thread.c:75
 #12 0x00007f9c796af609 in start_thread (arg=<optimized out>) at pthread_create.c:477
 #13 0x00007f9c795b6293 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

The scenario:

  # Start an instrumented app
  ./app
  gdb lttng-sessiond
  # put a breakpoint on ustctl_set_filter
  lttng create my_session
  lttng enable-event -u tp:tp_test
  lttng start
  lttng enable-event -u __dummy --filter 'my_field == "user34"'
  # The tracepoint should hit. Do not continue.
  kill -s SIGSTOP $(pgrep app)
  # Continue lttng-sessiond.
  # enable-event will return an error. This is a bug in itself, still let's
  # continue with the current bug.
  lttng stop
  # Start a new app that will register.
  ./app &
  sleep 1
  lttng start
  # lttng-sessiond should segfault.

Cause
=====

During the "lttng enable-event" command, the timeout error bubbles up
all the way to event_ust_enable_tracepoint and is different from
LTTNG_UST_ERR_EXIST. `trace_ust_destroy_event` is called and frees the
`uevent` object. Note that, contrary to what the comment says, `uevent`
has already been added to the channel event hash table at this point.

On the next `lttng start` command, the event node is still present in
the hash table and is iterated on. lttng-sessiond segfaults on the first
data access to the previously freed memory.

The problem was introduced by commit
88e3c2f5610b9ac89b0923d448fee34140fc46fb [1], which essentially moved the
call site of `add_unique_ust_event` before the `ust_app_*_event_glb` calls.

Solution
========

Go to `end` label to prevent freeing of the uevent object.

Note that app synchronization should not force an error at the channel
level, since a single app can fail but the whole channel should not.

The `error` label is now obsolete.

Known drawbacks
===============

None.

References
==========

[1] https://github.com/lttng/lttng-tools/commit/88e3c2f5610b9ac89b0923d448fee34140fc46fb

Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Change-Id: Ifaf3f4c71bb2da869c7b441aaa4b367f8f7cbdd6
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
3 years agoFix: sessiond: previously created channel cannot be enabled
Jonathan Rajotte [Thu, 7 Oct 2021 20:19:41 +0000 (16:19 -0400)] 
Fix: sessiond: previously created channel cannot be enabled

Observed issue
==============

A previously created channel cannot be enabled back once a session is
started.

Cause
=====

The check validating that the session was started is too early in the
`cmd_enable_channel` function.

Solution
========

Move the check to the creation code path, which is taken when the
channel is not found.
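
The reordering can be pictured with the sketch below. The types and the
`enable_channel` function are hypothetical simplifications of
`cmd_enable_channel`; only the position of the "session started" check
matters.

```c
#include <stdbool.h>

enum cmd_status { CMD_OK = 0, CMD_ERR_SESSION_STARTED = -1 };

struct channel { bool exists; bool enabled; };
struct session { bool has_been_started; struct channel chan; };

static enum cmd_status enable_channel(struct session *s)
{
	if (s->chan.exists) {
		/* Existing channel: enabling it again is always allowed. */
		s->chan.enabled = true;
		return CMD_OK;
	}

	/* Channel not found: creation is refused once the session was started. */
	if (s->has_been_started) {
		return CMD_ERR_SESSION_STARTED;
	}

	s->chan.exists = true;
	s->chan.enabled = true;
	return CMD_OK;
}
```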

Known drawbacks
===============

None.

Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Change-Id: I8e7d62b7e97246e65f1cf9022270293a6dd34cc9
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
3 years agoBuild fix: Missing message in LTTNG_DEPRECATED invocation
Jérémie Galarneau [Fri, 15 Oct 2021 21:03:38 +0000 (17:03 -0400)] 
Build fix: Missing message in LTTNG_DEPRECATED invocation

Coverity scan build jobs fail since LTTNG_DEPRECATED expects a string
and none is provided at the lttng_metadata_regenerate use site.

Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I7e6701abd24c679f578b0adead771ac93b6566cd

3 years agoFix: notification-thread: handling event from a removed tracer event src
Francis Deslauriers [Mon, 27 Sep 2021 13:42:54 +0000 (09:42 -0400)] 
Fix: notification-thread: handling event from a removed tracer event src

Issue
=====
The issue is caused by a race condition where the `lttng_poll_wait()`
returns a _REMOVE_TRACER_EVENT_SOURCE event followed by an actual
notification event on the removed event source fd.

This causes the notification thread to remove the fd from the potential
notification sources list and later fail to find that same fd in the
next iteration.

This race condition can cause the notification thread to hang
indefinitely or trigger failed assertions within the `fini_thread_state()`
function.

Fix
===
When removing a tracer event source, force the notification thread's
`lttng_poll_wait()` loop to restart so that events from the removed fd
are ignored.

Use the `restart_poll` flag for that purpose (see note below).
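
As a rough illustration of the flag's role (a simplified model, not the
notification thread's code): the event types and `process_batch` below
are hypothetical, and one call stands for the handling of one batch of
events returned by a single wait.

```c
#include <stdbool.h>
#include <stddef.h>

enum event_kind { EV_REMOVE_SOURCE, EV_NOTIFICATION };

struct poll_event {
	enum event_kind kind;
	int fd;
};

/*
 * Process one batch of events. Returns the number of notification events
 * that were handled and sets *restart_poll when the caller must wait again
 * before looking at any further event.
 */
static size_t process_batch(const struct poll_event *events, size_t count,
		bool *restart_poll)
{
	size_t handled = 0;

	*restart_poll = false;
	for (size_t i = 0; i < count; i++) {
		if (events[i].kind == EV_REMOVE_SOURCE) {
			/* The rest of the batch may refer to the removed fd. */
			*restart_poll = true;
			break;
		}
		handled++;
	}
	return handled;
}
```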

Reproducer
==========
It's easy to reproduce this issue by adding a `usleep(5000)` just before
the `lttng_poll_wait()` call in the notification thread.

Note
====
It's the second time that I fix this issue.

It was first fixed by this commit by adding the `restart_poll` flag:
  commit 8b5240601e4ddf6127e4291b7194dd5179cb35b5
  Author: Francis Deslauriers <francis.deslauriers@efficios.com>
  Date:   Thu Dec 10 15:41:29 2020 -0500

    notification-thread: drain all tracer notification on removal

and later, this other commit refactored that code but accidentally removed
the use of `restart_poll`:
  commit 34bf4f69e49d8a69331a6aa6826ef1f155e20ede
  Author: Francis Deslauriers <francis.deslauriers@efficios.com>
  Date:   Wed May 26 16:05:16 2021 -0400

    notification-thread: remove fd from pollset on LPOLLHUP and friends

Signed-off-by: Francis Deslauriers <francis.deslauriers@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I6da0ed4374b612934adc72fb88d5c142505c5d53

3 years agoinclude: add missing "extern"
Simon Marchi [Wed, 6 Oct 2021 15:41:19 +0000 (11:41 -0400)] 
include: add missing "extern"

Change-Id: I37574b25adede7c639a04c508f6e4be8256339d9
Signed-off-by: Simon Marchi <simon.marchi@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
3 years agoinclude: remove spurious spaces in condition/session-rotation.h
Simon Marchi [Wed, 6 Oct 2021 14:57:24 +0000 (10:57 -0400)] 
include: remove spurious spaces in condition/session-rotation.h

Change-Id: Ia525d24c3b4098dff5c50fb2c5d93c16f6e08f5c
Signed-off-by: Simon Marchi <simon.marchi@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
3 years agotests: fix header of regression/ust/getcpu-override/run-getcpu-override
Simon Marchi [Tue, 5 Oct 2021 20:10:18 +0000 (16:10 -0400)] 
tests: fix header of regression/ust/getcpu-override/run-getcpu-override

The "SPDX-License-Identifier:" header is not in a comment, so is
interpreted as a bash command.  This is harmless, but it appears in the
test output:

    ok 13 - Start tracing for session sequence-cpu
    # Launching app with getcpu-plugin wrapper
    ./tests/regression/ust/getcpu-override//run-getcpu-override: 2: SPDX-License-Identifier:: not found
    ok 14 - Application with wrapper done

Fix that, and add a proper copyright notice, based on the other files
that were added at the same time as this one.

Change-Id: Icdf5e2fd5aec4080b2e5cad10cca4813bad26394
Signed-off-by: Simon Marchi <simon.marchi@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
3 years agofix: wrong define used for GCC version check
Michael Jeanson [Thu, 5 Aug 2021 20:48:51 +0000 (16:48 -0400)] 
fix: wrong define used for GCC version check

As far as I can tell, the __GNUC_MAJOR__ define has never existed, the
proper define for the major version is __GNUC__. See
https://gcc.gnu.org/onlinedocs/cpp/Common-Predefined-Macros.html for
more details.
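
For reference, a version check built on the documented macros can be
sketched as follows. `GCC_AT_LEAST` is a hypothetical helper name, not a
macro from lttng-tools. An undefined identifier such as `__GNUC_MAJOR__`
evaluates to 0 inside `#if`, which is why the wrong define fails
silently instead of producing an error.

```c
/* __GNUC__ holds the major version, __GNUC_MINOR__ the minor version. */
#if defined(__GNUC__)
#define GCC_AT_LEAST(major, minor) \
	(__GNUC__ > (major) || (__GNUC__ == (major) && __GNUC_MINOR__ >= (minor)))
#else
#define GCC_AT_LEAST(major, minor) 0
#endif

/* Report at run time whether the compiler claims at least GCC 4.5. */
static int built_with_gcc_4_5_or_later(void)
{
#if GCC_AT_LEAST(4, 5)
	return 1;
#else
	return 0;
#endif
}
```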

Change-Id: I0d47d524e7efd204fd2f8976311c62e872eb6170
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
3 years agoFix: userspace-probe: unreported error on string copy error
Jérémie Galarneau [Mon, 4 Oct 2021 16:41:51 +0000 (12:41 -0400)] 
Fix: userspace-probe: unreported error on string copy error

Issue
=====

String copy errors, either due to the length or an allocation failure,
are not reported by
lttng_userspace_probe_location_tracepoint_create_from_payload
and don't log a clear error message.

This allowed truncation bugs like the one fixed in b45a296 to go
unnoticed.

Fix
===

Return an "invalid" status code and log a more descriptive error
message.

Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: Ia07cac7cba315ea79337262e9082dd06eb60950f

3 years agoFix: userspace-probe: truncating binary path for SDT
Francis Deslauriers [Fri, 1 Oct 2021 20:10:24 +0000 (16:10 -0400)] 
Fix: userspace-probe: truncating binary path for SDT

Issue
=====
This issue was uncovered when we enabled the testing of the SDT
userspace probe instrumentation on the CI, where file paths are
especially long.

The reported error is:
  -    rule: ma-probe-sdt (type: kernel:uprobe, location type: SDT, location: /root/workspace/dev_gerrit_lttng-tools_rootbuild/arch/amd64/babeltrace_version/stable-2.0/build/std/conf/agents/liburcu_version/master/node/amd64-rootnode/test_type/base/src/lttng-tools/tests/utils/testapp/userspace-probe-sdt-binary/.libs/userspace-probe-sdt-binary:foobar:tp1)
  +    rule: ma-probe-sdt (type: kernel:uprobe, location type: SDT, location: /root/workspace/dev_gerrit_lttng-tools_rootbuild/arch/amd64/babeltrace_version/stable-2.0/build/std/conf/agents/liburcu_version/master/node/amd64-rootnode/test_type/base/src/lttng-tools/tests/utils/testapp/userspace-probe-sdt-binary/.libs/userspace-probe-s:foobar:tp1)

The important part to notice is that the path to the binary is truncated
compared to what is expected by the test case.

The problem is caused by the
`lttng_userspace_probe_location_tracepoint_create_from_payload()`
function, which duplicates (strdup()) the path string using the wrong
defined value.

Fix
===
Use LTTNG_PATH_MAX rather than LTTNG_SYMBOL_NAME_LEN to copy the binary
path.
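
The effect of bounding a copy with the wrong limit can be reproduced
with the sketch below. `bounded_dup` is a hypothetical helper, and the
two limits are illustrative values assumed for this example; the real
ones are defined in the lttng-tools headers.

```c
#include <stdlib.h>
#include <string.h>

/* Illustrative limits, assumed for this sketch. */
#define SYMBOL_NAME_LEN 256
#define PATH_LEN_MAX 4096

/* Duplicate src, keeping at most max_len - 1 characters. */
static char *bounded_dup(const char *src, size_t max_len)
{
	size_t len = strlen(src);
	char *copy;

	if (len > max_len - 1) {
		len = max_len - 1;	/* Silent truncation happens here. */
	}

	copy = malloc(len + 1);
	if (!copy) {
		return NULL;
	}
	memcpy(copy, src, len);
	copy[len] = '\0';
	return copy;
}
```

A 300-character path bounded by `SYMBOL_NAME_LEN` comes back 255
characters long, while the same path bounded by `PATH_LEN_MAX` is
preserved in full.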

Signed-off-by: Francis Deslauriers <francis.deslauriers@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I24cbf413baba405bf4c4b534ccbc2b18f8d5d43f

3 years agoFix: lttng: add-trigger: don't provide a default event rule type
Jérémie Galarneau [Thu, 1 Jul 2021 19:50:47 +0000 (15:50 -0400)] 
Fix: lttng: add-trigger: don't provide a default event rule type

There is no reason for an event rule to have a default type. The
--type parameter is required.

Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: Ic7f03453fac410c96ca6bb3b3ca0bdfb297a10d1

3 years agoFix: statements with side-effects in assert statements
Francis Deslauriers [Thu, 19 Aug 2021 21:14:46 +0000 (17:14 -0400)] 
Fix: statements with side-effects in assert statements

Background
==========
When building with the NDEBUG definition the `assert()` statements are
removed.

Issue
=====
Currently, a few `assert()` statements in the code base contain
expressions that have side effects, so removing them changes the
behavior of the program.

Fix
===
Extract the statements with side effects out of the `assert()`
statements.
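
The pattern looks like this in isolation. `register_item` is a
hypothetical function with a side effect, used only to contrast the two
forms.

```c
#include <assert.h>

static int counter;

/* Hypothetical operation with a side effect; returns 0 on success. */
static int register_item(void)
{
	counter++;
	return 0;
}

static void fragile(void)
{
	/* With -DNDEBUG the whole expression vanishes: the call never runs. */
	assert(register_item() == 0);
}

static void robust(void)
{
	const int ret = register_item();

	/* Only the check disappears under NDEBUG; the call always happens. */
	assert(ret == 0);
	(void) ret;	/* Avoids an unused-variable warning under NDEBUG. */
}
```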

Signed-off-by: Francis Deslauriers <francis.deslauriers@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I0b11c8e25c3380563332b4c0fad15f70b09a7335

3 years agoFix: lttng_trace_archive_location_serialize is called on freed memory
Jonathan Rajotte [Thu, 16 Sep 2021 15:20:07 +0000 (11:20 -0400)] 
Fix: lttng_trace_archive_location_serialize is called on freed memory

Observed issue
==============

The following backtrace has been reported [1].

 #0  __GI_raise (sig=sig@entry=6) at /usr/src/debug/glibc/2.31+gitAUTOINC+f84949f1c4-r0/git/sysdeps/unix/sysv/linux/raise.c:50
 #1  0x0000003123025528 in __GI_abort () at /usr/src/debug/glibc/2.31+gitAUTOINC+f84949f1c4-r0/git/stdlib/abort.c:79
 #2  0x0000000000419884 in lttng_trace_archive_location_serialize (location=0x7f1c9c001160, buffer=0x7f1cb961c320) at /usr/src/debug/lttng-tools/2.13.0-r0/lttng-tools-2.13.0/src/common/location.c:230
 #3  0x00000000004c8f06 in lttng_evaluation_session_rotation_serialize (evaluation=0x7f1cb000a7f0, payload=0x7f1cb961c320) at /usr/src/debug/lttng-tools/2.13.0-r0/lttng-tools-2.13.0/src/common/conditions/session-rotation.c:539
 #4  0x00000000004a80fa in lttng_evaluation_serialize (evaluation=0x7f1cb000a7f0, payload=0x7f1cb961c320) at /usr/src/debug/lttng-tools/2.13.0-r0/lttng-tools-2.13.0/src/common/evaluation.c:42
 #5  0x00000000004bc24f in lttng_notification_serialize (notification=0x7f1cb961c310, payload=0x7f1cb961c320) at /usr/src/debug/lttng-tools/2.13.0-r0/lttng-tools-2.13.0/src/common/notification.c:63
 #6  0x0000000000458b7d in notification_client_list_send_evaluation (client_list=0x7f1cb0008f90, trigger=0x7f1ca40113d0, evaluation=<optimized out>, source_object_creds=0x7f1cb000a874, client_report=0x475840 <client_handle_transmission_status>, user_data=0x7f1cb0006010) at /usr/src/debug/lttng-tools/2.13.0-r0/lttng-tools-2.13.0/src/bin/lttng-sessiond/notification-thread-events.c:4379
 #7  0x0000000000476586 in action_executor_generic_handler (item=0x7f1cb0009600, work_item=0x7f1cb000a820, executor=0x7f1cb0006010) at /usr/src/debug/lttng-tools/2.13.0-r0/lttng-tools-2.13.0/src/bin/lttng-sessiond/action-executor.c:696
 #8  action_work_item_execute (work_item=0x7f1cb000a820, executor=0x7f1cb0006010) at /usr/src/debug/lttng-tools/2.13.0-r0/lttng-tools-2.13.0/src/bin/lttng-sessiond/action-executor.c:715
 #9  action_executor_thread (_data=0x7f1cb0006010) at /usr/src/debug/lttng-tools/2.13.0-r0/lttng-tools-2.13.0/src/bin/lttng-sessiond/action-executor.c:797
 #10 0x0000000000462327 in launch_thread (data=0x7f1cb00060b0) at /usr/src/debug/lttng-tools/2.13.0-r0/lttng-tools-2.13.0/src/bin/lttng-sessiond/thread.c:66
 #11 0x0000003123408ea4 in start_thread (arg=<optimized out>) at /usr/src/debug/glibc/2.31+gitAUTOINC+f84949f1c4-r0/git/nptl/pthread_create.c:477
 #12 0x00000031230f8dcf in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

This can be easily reproduced with the following session and trigger
configuration:

 lttng create test
 lttng enable-event -u -a
 lttng start
 # Register two similar triggers via a dummy C program since rotation
 # completed condition is not exposed on the CLI for now. Yielding the
 # following triggers:
 lttng list-triggers
 - name: trigger0
   owner uid: 1000
   condition: session rotation completed
     session name: test
     errors: none
  action:notify
   errors: none
 - name: trigger1
   owner uid: 1000
   condition: session rotation completed
     session name: test
     errors: none
  action:notify
   errors: none

  lttng rotate <- abort happens here.

Cause
=====

The problem lies in how the location (`lttng_trace_archive_location`)
object is assigned to the `lttng_evaluation` objects. A single location
object can end up being shared between multiple `lttng_evaluation` objects
since we iterate over all triggers and create an `lttng_evaluation` object
with the location each time as needed.

See `src/bin/lttng-sessiond/notification-thread-events.c:1956`.

The location object is then freed when the first notification is
completely serialized. The second serialization ends up holding a
reference to a freed `lttng_trace_archive_location` object.

Solution
========

Implement ref counting for the lttng_trace_archive_location object.
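
A minimal, single-threaded sketch of the idea follows. The `location_*`
names are hypothetical and the counter is a plain integer, which is an
assumption made for brevity; the real object may well use a dedicated
reference-counting primitive.

```c
#include <stdlib.h>

struct location {
	unsigned int refcount;
	int payload;
};

/* The creator owns the first reference. */
static struct location *location_create(int payload)
{
	struct location *loc = calloc(1, sizeof(*loc));

	if (loc) {
		loc->refcount = 1;
		loc->payload = payload;
	}
	return loc;
}

/* Each additional holder (e.g. each evaluation) takes its own reference. */
static void location_get(struct location *loc)
{
	loc->refcount++;
}

/* Drop one reference; returns 1 when the object was freed. */
static int location_put(struct location *loc)
{
	if (--loc->refcount == 0) {
		free(loc);
		return 1;
	}
	return 0;
}
```

With two holders, the first `location_put` leaves the object alive for
the second serialization, and only the last one frees it.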

Note
====

This also fixes a leak that was present in `cmd_destroy_session_reply`.

The location is created by `session_get_trace_archive_location` and is
never `destroyed`/`put`.

Known drawbacks
===============

None.

References
==========

[1] https://bugs.lttng.org/issues/1325

Fixes: #1325
Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Change-Id: I99dc595ee5b0288c727b193ed061f5273752bd24
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
3 years agoFix: sessiond: ust session is inactive during ust_app_global_update
Jonathan Rajotte [Mon, 13 Sep 2021 20:49:48 +0000 (16:49 -0400)] 
Fix: sessiond: ust session is inactive during ust_app_global_update

Observed issue
==============

The following scenario leads to an abort of lttng-sessiond.

 lttng-sessiond (with kernel tracing available)
 lttng create system-trace --snapshot -U /tmp/snapshot
 lttng enable-channel -k system-trace --subbuf-size=4k --num-subbuf=256
 lttng enable-event -c system-trace -k 'sched_wak*' -s system-trace
 lttng start system-trace

 lttng enable-event -u -a

Fails as expected with:
 Error: Events: The command tried to enable an event in a new domain for
 a session that has already been started once. (channel channel0,
 session system-trace)

Launch any ust app such as easy_ust from the lttng-ust repository.

The following backtrace is generated:

 (gdb) bt
 #0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
 #1  0x00007ffff7af0859 in __GI_abort () at abort.c:79
 #2  0x00007ffff7af0729 in __assert_fail_base (fmt=0x7ffff7c86588 "%s%s%s:%u: %s%sAssertion `%s' failed.\n%n", assertion=0x55555564b765 "usess->active", file=0x555555649a60 "ust-app.c", line
 #3  0x00007ffff7b01f36 in __GI___assert_fail (assertion=0x55555564b765 "usess->active", file=0x555555649a60 "ust-app.c", line=5123, function=0x55555564ecf0 <__PRETTY_FUNCTION__.14199> "ust_
 #4  0x00005555555d1f5e in ust_app_global_update (usess=0x7fffe001fb90, app=0x7fffac000b80) at ust-app.c:5123
 #5  0x00005555555b60d4 in update_ust_app (app_sock=82) at dispatch.c:71
 #6  0x00005555555b7025 in thread_dispatch_ust_registration (data=0x5555556a07f0) at dispatch.c:409
 #7  0x00005555555ad5ab in launch_thread (data=0x5555556a0810) at thread.c:65
 #8  0x00007ffff7ce6609 in start_thread (arg=<optimized out>) at pthread_create.c:477
 #9  0x00007ffff7bed293 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

This also happens for the track command. You can replace the `lttng
enable-event -u -a` with `lttng track --userspace --vuid=0` then launch
an app and the same backtrace gets generated.

Cause
=====

During `process_client_msg` the `create_ust_session` function is called
and a ust session is assigned to the "system-trace" session with a
state of `active` set to 0 (false). This is not a problem.

The problem seems to lie with a single call site for
`ust_app_global_update` in `update_ust_app`. The status of the ust
session is not checked before calling the `ust_app_global_update`. It is
important to note that all `ust_app_global_update_all` callsites guard
the call with a check against the status of the session.

Solution
========

Guard the call to `ust_app_global_update` with a check of the ust
session active state.
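
In sketch form, with hypothetical stand-ins for the session structure
and for the two functions involved:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct ust_session_stub {
	bool active;
	int updates;
};

/* Stands in for the function that asserts on the session being active. */
static void global_update(struct ust_session_stub *usess)
{
	assert(usess->active);
	usess->updates++;
}

/* Caller side: only update the app when the session is active. */
static void update_app(struct ust_session_stub *usess)
{
	if (!usess || !usess->active) {
		return;
	}
	global_update(usess);
}
```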

Known drawbacks
===============

None.

Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I14d25d99d0609689247cdfa86130bd0219613581

3 years agoFix: common: error query for trigger action protocol error
Jonathan Rajotte [Tue, 14 Sep 2021 20:10:36 +0000 (16:10 -0400)] 
Fix: common: error query for trigger action protocol error

Observed issue
==============

When listing a trigger with a single non-list action the CLI reports an
error in the protocol resulting in an output with no error accounting
for the action.

 $ lttng list-triggers
  - name: trigger0
    owner uid: 1000
    condition: session rotation ongoing
      session name: test
      errors: none
   action:notify
  Error: Failed to query errors of trigger 'trigger0' (owner uid: 1000): Protocol error occurred

Cause
=====

The `action_path` associated with the query has an index count of 0, as
it should, considering that the single root action element is not a
`list` object.

Inside `lttng_action_path_create_from_payload` a payload view is
initialized with a `len` of 0 since `header->index_count` is 0 as it
should.

The payload view is then validated and is considered invalid since the
validation checks for `len` > 0. The error then bubbles up.

Solution
========

Since the payload view is considered invalid when its length is zero,
simply handle this special case and directly call
`lttng_action_path_create` with the appropriate parameters.
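
The special case can be pictured as below. `struct view`,
`view_is_valid` and `parse_path` are hypothetical simplifications; the
only point is that an index count of 0 is accepted before a zero-length
view gets a chance to be rejected.

```c
#include <stddef.h>
#include <stdint.h>

struct view {
	const uint64_t *data;
	size_t len;
};

/* A view is only considered valid when it is non-empty. */
static int view_is_valid(const struct view *v)
{
	return v->data != NULL && v->len > 0;
}

/* Returns the number of indexes in the path, or -1 on protocol error. */
static long parse_path(const uint64_t *buf, size_t index_count)
{
	struct view v;

	if (index_count == 0) {
		/* A valid, empty path: do not build a zero-length view. */
		return 0;
	}

	v.data = buf;
	v.len = index_count * sizeof(*buf);
	if (!view_is_valid(&v)) {
		return -1;
	}
	return (long) index_count;
}
```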

Known drawbacks
===============

None.

Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I8f302c3aa78835342c665793908dc02f0a9dece4

3 years agoFix: common: un-hide two rate policy functions
Simon Marchi [Tue, 21 Sep 2021 14:31:55 +0000 (10:31 -0400)] 
Fix: common: un-hide two rate policy functions

These functions are part of the liblttng-ctl API/ABI, they should not be
hidden.

Change-Id: Ic04bb4e7a0bfd0c7d661228b7ccf5d17dccfd9ba
Signed-off-by: Simon Marchi <simon.marchi@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
3 years agoFix: include: remove unneeded declaration of lttng_session_descriptor_get_session_name
Simon Marchi [Tue, 21 Sep 2021 13:30:09 +0000 (09:30 -0400)] 
Fix: include: remove unneeded declaration of lttng_session_descriptor_get_session_name

There is a declaration of lttng_session_descriptor_get_session_name in
both session-descriptor.h and session-descriptor-internal.h.  Since this
is a function exposed by the API, the one in -internal.h is not needed,
remove it.

Since the removed declaration had LTTNG_HIDDEN, this has the effect of
making the lttng_session_descriptor_get_session_name symbol of
liblttng-ctl exported / part of the ABI. I think it was a mistake that
it wasn't previously exported.

Change-Id: I79d383f012d161a6df42240c6849b1b3af109def
Signed-off-by: Simon Marchi <simon.marchi@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
3 years agoFix: Tests: race condition in test_ns_contexts_change
Francis Deslauriers [Wed, 8 Sep 2021 14:16:23 +0000 (10:16 -0400)] 
Fix: Tests: race condition in test_ns_contexts_change

Issue
=====
The test script doesn't wait for the test application to complete before
stopping the tracing session. The race is that depending on the
scheduling the application is not always done generating events when the
session is stopped.

Fix
===
Make the test script wait for the termination of the test app before
stopping the session.

Signed-off-by: Francis Deslauriers <francis.deslauriers@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I29d9b41d2a2ed60a6c42020509c2067442ae332c

3 years agoFix: Tests: race condition in test_event_tracker
Francis Deslauriers [Tue, 7 Sep 2021 21:10:31 +0000 (17:10 -0400)] 
Fix: Tests: race condition in test_event_tracker

Background
==========
The `test_event_tracker` file contains test cases in which the
event-generating app is executed in two distinct steps. Those two steps
are preparation and execution.
  1. the preparation consists of launching the app in the background, and
  2. the execution consists of actually generating the event that should
     or should not be traced depending on the test case.

This is useful to test the tracker feature since we want to ensure that
already running apps are notified properly when changing their tracking
status.

Issue
=====
The `test_event_vpid_track_untrack` test case suffers from a race
condition that is easy to reproduce on Yocto.

The issue is that events sometimes end up in the trace when none are
expected.

This is due to the absence of synchronization point at the launch of the
app which leads to the app being scheduled in-between the track-untrack
calls leading to events being recorded to the trace.

It's easy to reproduce this issue on my machine by adding a `sleep 5`
between the track and untrack calls and setting the `NR_USEC_WAIT`
variable to 1.

Fix
===
Use the testapp `--sync-before-last-event-touch` flag to make the app
create a file when all but the last event have been generated. We then
have the app wait until we create a file (`--sync-before-last-event`)
before it generates that last event. This way, we are sure no event will
be generated while running the track and untrack commands.

Notes
=====
- This issue affects other test cases in this file.
- This commit fixes a typo in the test header.
- This commit adds `diag` calls to help track which test the output
  relates to when reading the log.

Signed-off-by: Francis Deslauriers <francis.deslauriers@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: Ia2b68128dc9a805526f9748f31ec2c2d95566f31

3 years agoFix: man: lttng-rotate: trace file count/size limitation does not apply
Jérémie Galarneau [Fri, 13 Aug 2021 15:15:15 +0000 (11:15 -0400)] 
Fix: man: lttng-rotate: trace file count/size limitation does not apply

Reported-by: Zach Kramer <Zach.Kramer@cognex.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I337fd06a12d145bdd97c14b4b1894e3676945f63

3 years agoFix: runas: less-than-zero comparison of an unsigned value
Francis Deslauriers [Fri, 6 Aug 2021 13:40:20 +0000 (09:40 -0400)] 
Fix: runas: less-than-zero comparison of an unsigned value

Fixes two defects found by Coverity related to unsigned integers being
treated as signed.
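
This class of defect typically looks like the sketch below, which
assumes the size comes from a call that returns a signed value such as
sysconf(3). `passwd_buffer_size` and the fallback value are
hypothetical. The negative return has to be tested before the value is
stored in an unsigned type.

```c
#include <stddef.h>
#include <unistd.h>

static size_t passwd_buffer_size(void)
{
	/* sysconf() returns a signed long, and -1 when there is no limit. */
	const long raw = sysconf(_SC_GETPW_R_SIZE_MAX);

	if (raw < 0) {
		return 16384;	/* Arbitrary fallback chosen for this sketch. */
	}
	return (size_t) raw;
}
```

Had `raw` been declared as `size_t`, the `raw < 0` test could never be
true and -1 would be turned into a huge buffer size.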

Reported by Coverity:
    CID 1461333:  Control flow issues  (NO_EFFECT)
    This less-than-zero comparison of an unsigned value is never true. "buf_size < 0UL".

    CID 1461332:  Integer handling issues  (NEGATIVE_RETURNS)
    "buf_size" is passed to a parameter that cannot be negative.

Signed-off-by: Francis Deslauriers <francis.deslauriers@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: Id6d4a71960f2ef34f14c05e66ef5d934b7a3e524

3 years agoFix: runas: supplementary groups are ignored on lttng save
Francis Deslauriers [Fri, 23 Jul 2021 20:27:00 +0000 (16:27 -0400)] 
Fix: runas: supplementary groups are ignored on lttng save

Observed issue
==============

On `lttng save` the following is reported to the user:

 $ sudo -u my_user lttng save -o /tmp/my_dir my_session_name
 Error: Permission denied

Note that:
  * the running lttng-sessiond is root,
  * "my_user" is part of the tracing group,
  * "my_user" primary group is "my_user" and is part of group "my_dummy_group"
  * The "/tmp/my_dir" has the following permissions:

    drwxrwx--- 2 root my_dummy_group 4096 Jul 26 16:39 /tmp/my_dir/

Cause
=====

The supplementary groups are not initialized when the run-as process
demotes itself to the user "my_user" to perform the recursive mkdir
required by the `lttng save` command.

From the point of view of the kernel, at the moment of performing the
mkdir call, the permissions look like this:

 euid: uid of "my_user"
 egid: primary gid of "my_user"
 supplementary group list: "root"

Note that the kernel does not treat the presence of the root group in
the supplementary group list in any special way. Since "root gid" !=
"my_dummy_group gid" the directory creation is refused.

Solution
========

Use initgroups(3) to initialize the supplementary group list.
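
A sketch of a demotion sequence that includes the supplementary groups
is shown below. `demote_to_user` is a hypothetical helper, not the
run-as worker's code, and it needs to be called by a privileged process
to succeed.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE
#endif
#include <grp.h>
#include <pwd.h>
#include <sys/types.h>
#include <unistd.h>

/* Returns 0 on success, -1 on failure. */
static int demote_to_user(const char *user_name)
{
	const struct passwd *pw = getpwnam(user_name);

	if (!pw) {
		return -1;
	}

	/* Load the user's supplementary groups while still privileged. */
	if (initgroups(pw->pw_name, pw->pw_gid) < 0) {
		return -1;
	}

	/* Change the group first: it is no longer permitted after seteuid(). */
	if (setegid(pw->pw_gid) < 0 || seteuid(pw->pw_uid) < 0) {
		return -1;
	}
	return 0;
}
```

Without the initgroups() call, the process keeps the supplementary
groups of its original user, which is the situation described above.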

Known drawbacks
===============

None.

Signed-off-by: Francis Deslauriers <francis.deslauriers@efficios.com>
Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I58656a3107e4f7b59a2391a4759988401cad7a2b

3 years agoDocs: lttng-event-rule(7): --exclude does not exist, use --exclude-name
Jérémie Galarneau [Tue, 3 Aug 2021 18:27:03 +0000 (14:27 -0400)] 
Docs: lttng-event-rule(7): --exclude does not exist, use --exclude-name

Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I92bb8e1b362d121172368897e6a9d4f538d4c68d

3 years agosessiond: logging typo: {triger, triggger} -> trigger
Francis Deslauriers [Thu, 8 Jul 2021 19:46:02 +0000 (15:46 -0400)] 
sessiond: logging typo: {triger, triggger} -> trigger

Signed-off-by: Francis Deslauriers <francis.deslauriers@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: Ida8faafc4c12f9817d3ee097bb648c10bd5ff854

3 years agoFix: lttng: free sessions in cmd_destroy
Simon Marchi [Mon, 2 Aug 2021 01:02:39 +0000 (21:02 -0400)] 
Fix: lttng: free sessions in cmd_destroy

When doing `lttng destroy`, I get:

    Direct leak of 4385 byte(s) in 1 object(s) allocated from:
        #0 0x7f74ae025459 in __interceptor_calloc /build/gcc/src/gcc/libsanitizer/asan/asan_malloc_linux.cpp:154
        #1 0x7f74add4129a in zmalloc /home/simark/src/lttng-tools/src/common/macros.h:45
        #2 0x7f74add42b9d in recv_sessiond_optional_data /home/simark/src/lttng-tools/src/lib/lttng-ctl/lttng-ctl.c:494
        #3 0x7f74add42f9a in lttng_ctl_ask_sessiond_fds_varlen /home/simark/src/lttng-tools/src/lib/lttng-ctl/lttng-ctl.c:596
        #4 0x7f74add41714 in lttng_ctl_ask_sessiond_varlen_no_cmd_header /home/simark/src/lttng-tools/src/lib/lttng-ctl/lttng-ctl-helper.h:58
        #5 0x7f74add41747 in lttng_ctl_ask_sessiond /home/simark/src/lttng-tools/src/lib/lttng-ctl/lttng-ctl-helper.h:78
        #6 0x7f74add4a922 in lttng_list_sessions /home/simark/src/lttng-tools/src/lib/lttng-ctl/lttng-ctl.c:2105
        #7 0x56472bcbdf80 in cmd_destroy /home/simark/src/lttng-tools/src/bin/lttng/commands/destroy.c:330
        #8 0x56472bd00764 in handle_command /home/simark/src/lttng-tools/src/bin/lttng/lttng.c:237
        #9 0x56472bd01218 in parse_args /home/simark/src/lttng-tools/src/bin/lttng/lttng.c:426
        #10 0x56472bd0151a in main /home/simark/src/lttng-tools/src/bin/lttng/lttng.c:475
        #11 0x7f74ad963b24 in __libc_start_main (/usr/lib/libc.so.6+0x27b24)

This is due to cmd_destroy not free'ing the result of
lttng_list_sessions. Fix that.
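
The ownership pattern behind this fix can be sketched as follows.
`list_items` is a hypothetical stand-in for a listing call such as
lttng_list_sessions (assumed here to allocate an array that the caller
must free); none of the names below come from lttng-tools.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for an entry returned by a listing call. */
struct item {
	char name[32];
};

/*
 * Allocates `*items` and returns the number of entries, or a negative
 * value on error. The caller owns `*items` and must free() it.
 */
int list_items(struct item **items)
{
	const int count = 2;
	struct item *array = calloc(count, sizeof(*array));

	if (!array) {
		return -1;
	}

	strcpy(array[0].name, "session-a");
	strcpy(array[1].name, "session-b");
	*items = array;
	return count;
}

/* Returns the number of entries whose name matches, or -1 on error. */
int count_matching(const char *name)
{
	struct item *items = NULL;
	int i, matches = 0;
	const int count = list_items(&items);

	if (count < 0) {
		matches = -1;
		goto end;
	}

	for (i = 0; i < count; i++) {
		if (strcmp(items[i].name, name) == 0) {
			matches++;
		}
	}

end:
	/* Single exit path: the array is released on success and on error. */
	free(items);
	return matches;
}
```

Initializing the pointer to NULL and freeing it on a single exit path
keeps the error path leak-free as well, since free(NULL) is a no-op.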

Change-Id: Iff2e75e6ec1cdcd0bdfdbbc3d5099422e592905b
Signed-off-by: Simon Marchi <simon.marchi@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>

3 years agoFix: lttng: free domains and channels in get_session_stats_str
Simon Marchi [Mon, 2 Aug 2021 00:33:23 +0000 (20:33 -0400)] 
Fix: lttng: free domains and channels in get_session_stats_str

When doing `lttng stop`, I get:

    Direct leak of 656 byte(s) in 1 object(s) allocated from:
        #0 0x7f970719e459 in __interceptor_calloc /build/gcc/src/gcc/libsanitizer/asan/asan_malloc_linux.cpp:154
        #1 0x7f9706eba29a in zmalloc /home/simark/src/lttng-tools/src/common/macros.h:45
        #2 0x7f9706ebbb9d in recv_sessiond_optional_data /home/simark/src/lttng-tools/src/lib/lttng-ctl/lttng-ctl.c:494
        #3 0x7f9706ebbf9a in lttng_ctl_ask_sessiond_fds_varlen /home/simark/src/lttng-tools/src/lib/lttng-ctl/lttng-ctl.c:596
        #4 0x7f9706eba714 in lttng_ctl_ask_sessiond_varlen_no_cmd_header /home/simark/src/lttng-tools/src/lib/lttng-ctl/lttng-ctl-helper.h:58
        #5 0x7f9706eba747 in lttng_ctl_ask_sessiond /home/simark/src/lttng-tools/src/lib/lttng-ctl/lttng-ctl-helper.h:78
        #6 0x7f9706ec4604 in lttng_list_channels /home/simark/src/lttng-tools/src/lib/lttng-ctl/lttng-ctl.c:2262
        #7 0x55837235c4e7 in get_session_stats_str /home/simark/src/lttng-tools/src/bin/lttng/utils.c:499
        #8 0x55837235bf73 in print_session_stats /home/simark/src/lttng-tools/src/bin/lttng/utils.c:445
        #9 0x55837231cc12 in stop_tracing /home/simark/src/lttng-tools/src/bin/lttng/commands/stop.c:138
        #10 0x55837231d062 in cmd_stop /home/simark/src/lttng-tools/src/bin/lttng/commands/stop.c:229
        #11 0x55837235e63e in handle_command /home/simark/src/lttng-tools/src/bin/lttng/lttng.c:237
        #12 0x55837235f0f2 in parse_args /home/simark/src/lttng-tools/src/bin/lttng/lttng.c:426
        #13 0x55837235f3f4 in main /home/simark/src/lttng-tools/src/bin/lttng/lttng.c:475
        #14 0x7f9706adcb24 in __libc_start_main (/usr/lib/libc.so.6+0x27b24)

    Direct leak of 308 byte(s) in 1 object(s) allocated from:
        #0 0x7f970719e459 in __interceptor_calloc /build/gcc/src/gcc/libsanitizer/asan/asan_malloc_linux.cpp:154
        #1 0x7f9706eba29a in zmalloc /home/simark/src/lttng-tools/src/common/macros.h:45
        #2 0x7f9706ebbb9d in recv_sessiond_optional_data /home/simark/src/lttng-tools/src/lib/lttng-ctl/lttng-ctl.c:494
        #3 0x7f9706ebbf9a in lttng_ctl_ask_sessiond_fds_varlen /home/simark/src/lttng-tools/src/lib/lttng-ctl/lttng-ctl.c:596
        #4 0x7f9706eba714 in lttng_ctl_ask_sessiond_varlen_no_cmd_header /home/simark/src/lttng-tools/src/lib/lttng-ctl/lttng-ctl-helper.h:58
        #5 0x7f9706eba747 in lttng_ctl_ask_sessiond /home/simark/src/lttng-tools/src/lib/lttng-ctl/lttng-ctl-helper.h:78
        #6 0x7f9706ec421c in lttng_list_domains /home/simark/src/lttng-tools/src/lib/lttng-ctl/lttng-ctl.c:2220
        #7 0x55837235c3d3 in get_session_stats_str /home/simark/src/lttng-tools/src/bin/lttng/utils.c:484
        #8 0x55837235bf73 in print_session_stats /home/simark/src/lttng-tools/src/bin/lttng/utils.c:445
        #9 0x55837231cc12 in stop_tracing /home/simark/src/lttng-tools/src/bin/lttng/commands/stop.c:138
        #10 0x55837231d062 in cmd_stop /home/simark/src/lttng-tools/src/bin/lttng/commands/stop.c:229
        #11 0x55837235e63e in handle_command /home/simark/src/lttng-tools/src/bin/lttng/lttng.c:237
        #12 0x55837235f0f2 in parse_args /home/simark/src/lttng-tools/src/bin/lttng/lttng.c:426
        #13 0x55837235f3f4 in main /home/simark/src/lttng-tools/src/bin/lttng/lttng.c:475
        #14 0x7f9706adcb24 in __libc_start_main (/usr/lib/libc.so.6+0x27b24)

This is due to the get_session_stats_str function not free'ing the
results of lttng_list_channels and lttng_list_domains.  Fix that.

Change-Id: I4c200d3df41bf09bdce8eadb000abbff7fe5a751
Signed-off-by: Simon Marchi <simon.marchi@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>

3 years agoUpdate version to v2.13.0
Jérémie Galarneau [Mon, 2 Aug 2021 20:49:46 +0000 (16:49 -0400)] 
Update version to v2.13.0

Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>

3 years agoTests fix: unix socket: leaked socket of connection to child
Jérémie Galarneau [Mon, 19 Jul 2021 21:21:17 +0000 (17:21 -0400)] 
Tests fix: unix socket: leaked socket of connection to child

The child_connection socket is only used by the parent in the
credentials passing test. The teardown assumes the reverse which causes
the socket to be leaked.

1458471 Resource leak

The system resource will not be reclaimed and reused, reducing the
future availability of the resource.

In test_creds_passing: Leak of memory or pointers to system
resources (CWE-404)

Reported-by: Coverity Scan
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I2ead9abbfc189ffbdd71a27f6376d0b001cdc2a3

3 years agoFix: sessiond: notification: missing unlock on client skip
Jérémie Galarneau [Mon, 19 Jul 2021 21:17:39 +0000 (17:17 -0400)] 
Fix: sessiond: notification: missing unlock on client skip

Skipping a client must be performed by using the dedicated "skip_client"
label which will unlock the client's lock before continuing the loop
rather than using 'continue' directly.
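
The control flow can be sketched as below. The `client` structure and
its boolean stand-in for a lock are hypothetical simplifications; only
the "skip_client" label name comes from this commit.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical client with a trivial stand-in for its lock. */
struct client {
	bool locked;
	bool subscribed;
	int sent;
};

static void client_lock(struct client *c) { c->locked = true; }
static void client_unlock(struct client *c) { c->locked = false; }

/* Sends to subscribed clients; every iteration releases the lock it took. */
void send_to_clients(struct client *clients, size_t count, bool trigger_is_hidden)
{
	size_t i;

	for (i = 0; i < count; i++) {
		struct client *c = &clients[i];

		client_lock(c);

		if (trigger_is_hidden || !c->subscribed) {
			/* A bare `continue` here would leave `c` locked. */
			goto skip_client;
		}

		c->sent++;
skip_client:
		client_unlock(c);
	}
}
```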

Currently, a client will remain locked when a hidden trigger emits
a notification to which it is subscribed.

1458230 Missing unlock

May result in deadlock if there is another attempt to acquire the lock.

In notification_client_list_send_evaluation: Missing a release of a lock
on a path (CWE-667)

Reported-by: Coverity Scan
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I8b69395b91b0ea59ae5e0beadebd9099db623121

3 years agoUpdate version to v2.13.0-rc3
Jérémie Galarneau [Fri, 16 Jul 2021 18:57:49 +0000 (14:57 -0400)] 
Update version to v2.13.0-rc3

Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>

3 years agoliblttng-ctl: hide logger_thread_name
Jérémie Galarneau [Fri, 16 Jul 2021 18:47:53 +0000 (14:47 -0400)] 
liblttng-ctl: hide logger_thread_name

Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I4eb5a86029c6220ad4f48d382ec26126fd82e443

3 years agoliblttng-ctl: hide MI trigger command variables
Jérémie Galarneau [Fri, 16 Jul 2021 18:42:24 +0000 (14:42 -0400)] 
liblttng-ctl: hide MI trigger command variables

Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I45eec5bb0fd3353c8f1257b3c94ef08440114b21

3 years agoCleanup: rename `get_domain_str()` -> `lttng_domain_type_str()`
Francis Deslauriers [Tue, 25 May 2021 19:57:59 +0000 (15:57 -0400)] 
Cleanup: rename `get_domain_str()` -> `lttng_domain_type_str()`

Both functions currently exist in the code base and accomplish the same
goal. Let's keep only one of them.

Signed-off-by: Francis Deslauriers <francis.deslauriers@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I2254b846f0b5bdc883c86d970fde7daffa9e6155

3 years ago.gitignore: Add hidden trigger test
Jérémie Galarneau [Fri, 16 Jul 2021 17:29:07 +0000 (13:29 -0400)] 
.gitignore: Add hidden trigger test

Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: Iab0fe77c0d4607d5469a7aa57d6bd784d47d8609

3 years agoTest: unix socket: test credential passing
Jérémie Galarneau [Thu, 15 Jul 2021 00:44:44 +0000 (20:44 -0400)] 
Test: unix socket: test credential passing

Since the credential passing over UNIX sockets now makes use of the pid,
the compatibility wrappers have become more complex as each platform
appears to define its own way of accessing this information.

This new test:
  - creates a named unix socket,
  - forks,
  - gets the parent and child to connect,
  - sends the child's credentials as a data payload and as credentials
    verified by the kernel,
  - checks, in the parent, that the two sets of credentials are equal.

This is more of a sanity check for the compatibility wrappers used on
non-Linux platforms.
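
The steps of such a test can be sketched as follows. This is a
Linux-only illustration that reads the kernel-verified credentials with
SO_PEERCRED; the actual test goes through the lttng-tools compatibility
wrappers, and the function name, payload structure, and socket path
below are assumptions.

```c
#define _GNU_SOURCE /* for struct ucred on glibc */
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/un.h>
#include <sys/wait.h>
#include <unistd.h>

struct creds_payload {
	pid_t pid;
	uid_t uid;
	gid_t gid;
};

/* Returns 0 when the self-reported and kernel-reported credentials match. */
int check_creds_passing(void)
{
	struct sockaddr_un addr;
	struct creds_payload payload;
	struct ucred verified;
	socklen_t len = sizeof(verified);
	int listen_fd, conn_fd = -1, ret = -1;
	pid_t child;

	memset(&addr, 0, sizeof(addr));
	addr.sun_family = AF_UNIX;
	snprintf(addr.sun_path, sizeof(addr.sun_path),
			"/tmp/creds-sketch-%ld.sock", (long) getpid());
	unlink(addr.sun_path);

	/* Bind and listen before forking so the child can connect at once. */
	listen_fd = socket(AF_UNIX, SOCK_STREAM, 0);
	if (listen_fd < 0) {
		return -1;
	}
	if (bind(listen_fd, (struct sockaddr *) &addr, sizeof(addr)) < 0 ||
			listen(listen_fd, 1) < 0) {
		goto end;
	}

	child = fork();
	if (child < 0) {
		goto end;
	}

	if (child == 0) {
		/* Child: report its own credentials as a plain data payload. */
		struct creds_payload mine = { getpid(), getuid(), getgid() };
		int fd = socket(AF_UNIX, SOCK_STREAM, 0);
		int ok = fd >= 0 &&
			connect(fd, (struct sockaddr *) &addr, sizeof(addr)) == 0 &&
			write(fd, &mine, sizeof(mine)) == (ssize_t) sizeof(mine);

		_exit(ok ? 0 : 1);
	}

	/* Parent: compare the payload with what the kernel says about the peer. */
	conn_fd = accept(listen_fd, NULL, NULL);
	if (conn_fd >= 0 &&
			read(conn_fd, &payload, sizeof(payload)) == (ssize_t) sizeof(payload) &&
			getsockopt(conn_fd, SOL_SOCKET, SO_PEERCRED, &verified, &len) == 0 &&
			payload.pid == verified.pid &&
			payload.uid == verified.uid &&
			payload.gid == verified.gid) {
		ret = 0;
	}

	waitpid(child, NULL, 0);
end:
	if (conn_fd >= 0) {
		close(conn_fd);
	}
	close(listen_fd);
	unlink(addr.sun_path);
	return ret;
}
```

The socket is bound and put in listening mode before the fork so that
the child's connect() cannot race with the parent's setup.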

Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: Ic0a6213afca7cc95a00617b052e7a145fc88625c

3 years agoBuild fix: retrieve unix socket peer PID on non-unix platforms
Jérémie Galarneau [Wed, 14 Jul 2021 19:19:15 +0000 (15:19 -0400)] 
Build fix: retrieve unix socket peer PID on non-unix platforms

The previous attempt at extending the credential retrieval wrapper was
broken and didn't build on FreeBSD, macOS, and cygwin.

A platform-specific way of retrieving the PID of a unix peer is
implemented for FreeBSD (getsockopt using LOCAL_PEERCRED, note that the
cr_pid field is only available from FreeBSD 13 and up),
macOS (getsockopt using LOCAL_PEERPID, macOS 10.8+), and
Solaris (getpeerucred).
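
A rough sketch of such a per-platform wrapper follows. The function
name `get_peer_pid` is hypothetical, the Linux branch is added for
comparison, cygwin is left out, and only the Linux path has been
exercised; the other branches are written from the platforms'
documented interfaces and should be treated as untested assumptions.

```c
#define _GNU_SOURCE /* for struct ucred on glibc */
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/un.h>

#if defined(__FreeBSD__)
#include <sys/ucred.h> /* struct xucred */
#elif defined(__sun__)
#include <ucred.h> /* getpeerucred(), ucred_getpid() */
#endif

/* Hypothetical wrapper: stores the peer's PID in `*pid`, returns 0 or -1. */
int get_peer_pid(int sock, pid_t *pid)
{
#if defined(__linux__)
	struct ucred cred;
	socklen_t len = sizeof(cred);

	if (getsockopt(sock, SOL_SOCKET, SO_PEERCRED, &cred, &len) < 0) {
		return -1;
	}
	*pid = cred.pid;
	return 0;
#elif defined(__FreeBSD__)
	struct xucred cred;
	socklen_t len = sizeof(cred);

	/* cr_pid only exists from FreeBSD 13. */
	if (getsockopt(sock, SOL_LOCAL, LOCAL_PEERCRED, &cred, &len) < 0) {
		return -1;
	}
	*pid = cred.cr_pid;
	return 0;
#elif defined(__APPLE__)
	socklen_t len = sizeof(*pid);

	return getsockopt(sock, SOL_LOCAL, LOCAL_PEERPID, pid, &len) < 0 ? -1 : 0;
#elif defined(__sun__)
	ucred_t *cred = NULL;

	if (getpeerucred(sock, &cred) < 0) {
		return -1;
	}
	*pid = ucred_getpid(cred);
	ucred_free(cred);
	return 0;
#else
	(void) sock;
	(void) pid;
	return -1;
#endif
}
```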

Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: Ifcf522c70ee4c2e0799293ae0961f41aebff5056

3 years agoFix: sessiond: notification: find_tracer_event_source returns NULL
Jérémie Galarneau [Mon, 12 Jul 2021 22:42:57 +0000 (18:42 -0400)] 
Fix: sessiond: notification: find_tracer_event_source returns NULL

Due to a bad edit of the original patch (my bad!)
find_tracer_event_source_element always returns NULL.

Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I7febee1d803034a06d5063a2cc9179c4edef4809

3 years agoTests: MI: add `diag` statements to test functions
Francis Deslauriers [Thu, 8 Jul 2021 16:35:58 +0000 (12:35 -0400)] 
Tests: MI: add `diag` statements to test functions

Signed-off-by: Francis Deslauriers <francis.deslauriers@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: Ie56e23a3d0796d1edb07e2fd7cdc259816ac0133

3 years agoCleanup: fix comments in `duplicate_{stream,channel}_object()`
Francis Deslauriers [Thu, 21 Jan 2021 17:00:03 +0000 (12:00 -0500)] 
Cleanup: fix comments in `duplicate_{stream,channel}_object()`

Signed-off-by: Francis Deslauriers <francis.deslauriers@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I5089d09880d21842bf264f6c30ec7fd5e72b93df

3 years agoTests: add hidden trigger visibility test
Jérémie Galarneau [Fri, 9 Jul 2021 17:00:48 +0000 (13:00 -0400)] 
Tests: add hidden trigger visibility test

Add a regression test for the previous commit that verifies that
internal triggers used by the session daemon to implement various
features (automatic session rotations based on their consumed size, in
this instance) are not visible to users of liblttng-ctl.

The test is written in C to use the library directly. This is needed
since the `lttng` client filters out anonymous triggers and thus would
not allow us to see those triggers, which are anonymous by default.

Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I1b8fca648953b8cba49a9888593b3486457d01b2

3 years agoFix: sessiond: list-triggers: don't return internal triggers
Jérémie Galarneau [Fri, 9 Jul 2021 17:00:56 +0000 (13:00 -0400)] 
Fix: sessiond: list-triggers: don't return internal triggers

The session daemon uses triggers internally. For instance, the trigger
and notification subsystem is used to implement the automatic rotation
of sessions based on a size threshold.

Currently, a user of the C API will see those internal triggers if it is
running as the same user as the session daemon. This can be unexpected
by user code that assumes it will be alone in creating triggers.
Moreover, it is possible for external users to unregister those triggers
which would cause bugs.

As the triggers gain more capabilities, it is likely that the session
daemon will keep using them to implement features internally. Thus,
an internal "is_hidden" property is introduced in lttng_trigger.

A "hidden" trigger is a trigger that is not returned by the listings.
It is used to hide triggers that are used internally by the session
daemon so that they can't be listed nor unregistered by external
clients.

This is a property that can only be set internally by the session
daemon. As such, it is not serialized nor set by a
"create_from_buffer" constructor.

The hidden property is preserved by copies.
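
The rules above can be illustrated with a simplified model. The
structures and function names below are hypothetical and much smaller
than the real lttng_trigger; they only show that the property is absent
from the wire format, reset by buffer-based construction, kept by
copies, and honoured by listings.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified trigger object. */
struct trigger {
	char name[32];
	bool is_hidden; /* internal only: never part of the wire format */
};

/* Wire format: carries the name only, so `is_hidden` cannot be forged. */
struct trigger_wire {
	char name[32];
};

void trigger_serialize(const struct trigger *t, struct trigger_wire *out)
{
	memcpy(out->name, t->name, sizeof(out->name));
}

/* Objects built from a received buffer always start out visible. */
void trigger_create_from_buffer(const struct trigger_wire *in, struct trigger *out)
{
	memcpy(out->name, in->name, sizeof(out->name));
	out->is_hidden = false;
}

/* Copies preserve the hidden property. */
void trigger_copy(const struct trigger *src, struct trigger *dst)
{
	*dst = *src;
}

/* Writes the visible triggers to `out` and returns how many were written. */
size_t list_visible_triggers(const struct trigger *all, size_t count,
		struct trigger *out)
{
	size_t i, visible = 0;

	for (i = 0; i < count; i++) {
		if (!all[i].is_hidden) {
			out[visible++] = all[i];
		}
	}

	return visible;
}
```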

Note that notifications originating from a "hidden" trigger will not
be sent to clients that are not within the session daemon's process.

Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I61b7949075172fcd428289e2eb670d03c19bdf71

3 years agounix: receive pid on non-linux platforms
Jérémie Galarneau [Thu, 8 Jul 2021 21:57:45 +0000 (17:57 -0400)] 
unix: receive pid on non-linux platforms

Add a `pid` to the lttng_sock_cred structure definition used on
non-Linux platforms and receive the peer's PID when receiving
credentials.

Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I9c92f6dda6441deca58f9cc85f846f5031cceb6e

3 years agoClean-up: sessiond: return an lttng_error_code from list_triggers
Jérémie Galarneau [Thu, 8 Jul 2021 18:39:59 +0000 (14:39 -0400)] 
Clean-up: sessiond: return an lttng_error_code from list_triggers

Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I5d44b508a2a5211894c0cc7b6d51a9a03dc8b3f2

3 years agonotification-thread: remove fd from pollset on LPOLLHUP and friends
Francis Deslauriers [Wed, 26 May 2021 20:05:16 +0000 (16:05 -0400)] 
notification-thread: remove fd from pollset on LPOLLHUP and friends

When an app dies, it's possible that the notification thread gets an
epoll event (`LPOLLHUP`) that the socket was closed before it gets the
_REMOVE_TRACER_SOURCE command for that source.

In such cases, the notification thread should simply remove the file
descriptor from the pollset and drain the notification on that file
descriptor. It should _not_ remove the _source_element object from the
list.

The removal from the list should only be done when it receives the
_REMOVE_TRACER_SOURCE command.

Signed-off-by: Francis Deslauriers <francis.deslauriers@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I9525315f9e92d0f6ae5e84e26b83a6b7207dce54

3 years agoTests: fix: list triggers: bc missing on system
Jérémie Galarneau [Wed, 7 Jul 2021 18:59:02 +0000 (14:59 -0400)] 
Tests: fix: list triggers: bc missing on system

`bc` is not part of the test suite's dependencies and can be replaced,
in this instance, by a use of `printf`.

This use of `bc` caused a number of failures on the CI's Lava workers.

Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I1a1b24a23325754c26ebedfdb6b7728378381d97

3 years agoClean-up: event-expr: remove unreachable code
Jérémie Galarneau [Mon, 5 Jul 2021 18:25:40 +0000 (14:25 -0400)] 
Clean-up: event-expr: remove unreachable code

1452699 Logically dead code

The indicated dead code may have performed some action; that action will
never occur.

In lttng_event_expr_array_field_element_create: Code can never be
reached because of a logical contradiction (CWE-561)

Reported-by: Coverity Scan
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I301e73c8e0cc7b9c4fb889e5bf7ef30d6ecf7d9f

3 years agoFix: lttng: remove-trigger: null dereference on MI initialization error
Jérémie Galarneau [Mon, 5 Jul 2021 18:21:19 +0000 (14:21 -0400)] 
Fix: lttng: remove-trigger: null dereference on MI initialization error

Failures to create an MI writer instance will result in a dereference of
the MI writer when attempting to close the command's output element.

1457842 Dereference after null check

Either the check against null is unnecessary, or there may be a null
pointer dereference.

In cmd_add_trigger: Pointer is checked against null but then
dereferenced anyway (CWE-476)

Reported-by: Coverity Scan
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I0bc71bf6c83df7d9d938cf93a12d5f6cf6d7ae36

3 years agoFix: lttng: list-trigger: leak of error query in query callbacks
Jérémie Galarneau [Mon, 5 Jul 2021 18:18:27 +0000 (14:18 -0400)] 
Fix: lttng: list-trigger: leak of error query in query callbacks

1457841 Resource leak

The system resource will not be reclaimed and reused, reducing the
future availability of the resource.

In mi_error_query_trigger_callback: Leak of memory or pointers to system
resources (CWE-404)

Reported-by: Coverity Scan
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I4e2cde41d77e5299d1758e8c9387b0a1c63efd17

3 years agoFix: lttng: add-trigger: null dereference on MI initialization error
Jérémie Galarneau [Mon, 5 Jul 2021 18:16:00 +0000 (14:16 -0400)] 
Fix: lttng: add-trigger: null dereference on MI initialization error

Failures to create an MI writer instance will result in a dereference of
the MI writer when attempting to close the command's output element.

1457842 Dereference after null check

Either the check against null is unnecessary, or there may be a null
pointer dereference.

In cmd_add_trigger: Pointer is checked against null but then
dereferenced anyway (CWE-476)

Reported-by: Coverity Scan
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I98b844d2f1c7abd43bd42ee472759de57b34484e

3 years agolttng: add-trigger: print generated trigger name
Jérémie Galarneau [Wed, 30 Jun 2021 22:41:24 +0000 (18:41 -0400)] 
lttng: add-trigger: print generated trigger name

Print the generated trigger name when `add-trigger` succeeds.

Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: Id858880260513b9a10c4ce5022a95c476e3e32aa

3 years agosessiond: generate trigger name: name triggers with the 'trigger' prefix
Jérémie Galarneau [Wed, 30 Jun 2021 22:47:52 +0000 (18:47 -0400)] 
sessiond: generate trigger name: name triggers with the 'trigger' prefix

Generated trigger names currently have the form TN, where N is the
number of generated trigger names over the lifetime of the session
daemon.

The form 'triggerN' seems more in line with autogenerated names such
as channel names (e.g. 'channel0').
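
A small sketch of such a generator, assuming a simple counter kept for
the lifetime of the process; the structure and function names are
hypothetical and not taken from the session daemon.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical generator state: number of names handed out so far. */
struct name_generator {
	uint64_t next_offset;
};

/* Writes "trigger<N>" into `buf`; returns 0 on success, -1 if it doesn't fit. */
int generate_trigger_name(struct name_generator *gen, char *buf, size_t len)
{
	const int ret = snprintf(buf, len, "trigger%" PRIu64, gen->next_offset);

	if (ret < 0 || (size_t) ret >= len) {
		return -1;
	}

	gen->next_offset++;
	return 0;
}
```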

Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: Id61cd4716bb4c080d9242853c366e14542f60f7c

3 years agoRevert "lttng: add-trigger: print generated trigger name"
Jérémie Galarneau [Thu, 1 Jul 2021 13:12:21 +0000 (09:12 -0400)] 
Revert "lttng: add-trigger: print generated trigger name"

This reverts commit 8310270a50784aced2af5b21ab23bc7bd9dee47f.

This change is still under review.

Change-Id: If75aa02e2e5daa0bfbcf30bea0a2b54c4aca1fd4
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>

3 years agolttng: add-trigger: print generated trigger name
Jérémie Galarneau [Wed, 30 Jun 2021 22:41:24 +0000 (18:41 -0400)] 
lttng: add-trigger: print generated trigger name

Print the generated trigger name when `add-trigger` succeeds. Also,
no message is emitted when a trigger is successfully registered as
the command will print an error message if any error occurs.

There is also no need to parrot the trigger's name if it was specified
by the user.

Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I9607fbd358298b036bd533834143eb5e9d185cd0

3 years agoMI: xsd: bump to 4.1
Jonathan Rajotte [Mon, 7 Jun 2021 22:11:01 +0000 (18:11 -0400)] 
MI: xsd: bump to 4.1

No breaking changes were made to the xsd. Only objects related to
triggers, event-rules, actions, condition, and error-query were added.
They do not interfere with the current MI.

Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: Ia057c0fbea34f8e5c48cb8d8d307f004acc95a00

3 years agoTests: trigger: mi: use utils.sh xsd versions for xml diff
Jonathan Rajotte [Mon, 7 Jun 2021 22:03:17 +0000 (18:03 -0400)] 
Tests: trigger: mi: use utils.sh xsd versions for xml diff

Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: Ic1536218b468d300ceb3d16ca160b8a8b891edfc

3 years agoTests: utils: regroup xml utils to utils.sh
Jonathan Rajotte [Mon, 7 Jun 2021 21:56:37 +0000 (17:56 -0400)] 
Tests: utils: regroup xml utils to utils.sh

Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: Idfa0f05d1bde75f4b02c903699281a86494b435f

3 years agoTests: MI: {add, list, remove}-trigger
Jonathan Rajotte [Wed, 26 May 2021 22:08:14 +0000 (18:08 -0400)] 
Tests: MI: {add, list, remove}-trigger

Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: Ica66a759d961cc122c1a1b81ce69fa54b0e78c78

3 years agoMI: xsd: add objects type definition related to trigger
Jonathan Rajotte [Thu, 27 May 2021 01:53:19 +0000 (21:53 -0400)] 
MI: xsd: add objects type definition related to trigger

Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: If28306f8aaf24890a6d834e9ff69bd00de3da295

3 years agoMI: xsd: sort output_type
Jonathan Rajotte [Thu, 27 May 2021 01:51:26 +0000 (21:51 -0400)] 
MI: xsd: sort output_type

Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: If2206e4a1c7a54d6d6bc4887c1925b12f035232b

3 years agoMI: xsd: sort command_string_type
Jonathan Rajotte [Thu, 27 May 2021 01:48:20 +0000 (21:48 -0400)] 
MI: xsd: sort command_string_type

Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I8df9d69aeaf93050c405ff876ad697efac7c4021

3 years agoAdd pretty_xml utils
Jonathan Rajotte [Wed, 26 May 2021 21:17:02 +0000 (17:17 -0400)] 
Add pretty_xml utils

This util reads from stdin and outputs an indented/formatted xml.
It is equivalent to "xmllint --format -".

It will be used for MI trigger testing. For testing we will essentially
diff the output of the command against the expected output. While a
nicely formatted multi-line output is not necessary for a machine to
do the diff, the human that will have to debug it will surely appreciate
it.

Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: Ie1597644941c55ce3e59f7ff16f196ac36325179

3 years agoMove xml utils from mi subfolder to xml-utils folder
Jonathan Rajotte [Wed, 26 May 2021 20:39:12 +0000 (16:39 -0400)] 
Move xml utils from mi subfolder to xml-utils folder

Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I268dc544bf4f72f61a701ac3efd0b12488cc2f64

3 years agoFix: lttng_triggers count is not equal to the size of the sorted trigger array
Jonathan Rajotte [Fri, 28 May 2021 18:32:37 +0000 (14:32 -0400)] 
Fix: lttng_triggers count is not equal to the size of the sorted trigger array

Since anonymous triggers can be present in the original lttng_triggers
and are not added to the sorting list, the count to be used
while iterating on the sorted list must be the size of the list itself
and not that of lttng_triggers.
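
The mismatch can be illustrated with a simplified model in which an
anonymous trigger has a NULL name. The structure and function names are
hypothetical; the point is that the loop over the sorted array must use
the value returned by the filtering step, not the original count.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical trigger entry: anonymous triggers have a NULL name. */
struct trigger_entry {
	const char *name;
};

static int compare_names(const void *a, const void *b)
{
	const struct trigger_entry *const *ta = a;
	const struct trigger_entry *const *tb = b;

	return strcmp((*ta)->name, (*tb)->name);
}

/*
 * Fills `sorted` with pointers to the named triggers, sorted by name, and
 * returns how many were stored. This can be smaller than `count`.
 */
size_t sort_named_triggers(const struct trigger_entry *all, size_t count,
		const struct trigger_entry **sorted)
{
	size_t i, sorted_count = 0;

	for (i = 0; i < count; i++) {
		if (all[i].name) {
			sorted[sorted_count++] = &all[i];
		}
	}

	qsort(sorted, sorted_count, sizeof(*sorted), compare_names);
	return sorted_count;
}
```

Iterating up to `count` instead of the returned value would read
entries of `sorted` that were never written.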

Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: Ifb1802345199cb20fbb6d401f316be918b8a6443

3 years agoMI: {add, list, remove} trigger
Jonathan Rajotte [Wed, 26 May 2021 17:04:41 +0000 (13:04 -0400)] 
MI: {add, list, remove} trigger

Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: Ie16c5c3a894b921e032a99ed3deda4ed5da17e78

3 years agoMI: implement all objects related to trigger machine interface
Jonathan Rajotte [Fri, 7 May 2021 01:26:17 +0000 (21:26 -0400)] 
MI: implement all objects related to trigger machine interface

Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: Idb2045135b1ba87853d6214b149afbe27bb7a1ca

3 years agoMove event-expr-to-bytecode to event-expr
Jonathan Rajotte [Thu, 11 Feb 2021 15:40:39 +0000 (10:40 -0500)] 
Move event-expr-to-bytecode to event-expr

Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I74a4b823ae7bbcbb062dbb9a2a0f84785bca287a

3 years agoMove event-expr from liblttng-ctl to libcommon
Jonathan Rajotte [Thu, 11 Feb 2021 15:18:38 +0000 (10:18 -0500)] 
Move event-expr from liblttng-ctl to libcommon

Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I31c65cd7f63fa4e1c918285b02ab2ab2e82549f6

3 years agoMI: support double element
Jonathan Rajotte [Thu, 4 Feb 2021 20:57:56 +0000 (15:57 -0500)] 
MI: support double element

Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I97411fea238d8b1275028d3d04a6f4f376624001

3 years agoFix: rotation client example: leak of handle on error
Jérémie Galarneau [Wed, 16 Jun 2021 19:08:21 +0000 (15:08 -0400)] 
Fix: rotation client example: leak of handle on error

1452927 Resource leak

The system resource will not be reclaimed and reused, reducing the
future availability of the resource.

In setup_session: Leak of memory or pointers to system
resources (CWE-404)

CID 1452927 (#1 of 1): Resource leak (RESOURCE_LEAK)8. leaked_storage:
Variable chan_handle going out of scope leaks the storage it points to

Reported-by: Coverity Scan
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I4c215ac4a86f9f70fd5c9d3aa13f944d3d7a2cc7

3 years agoSilence warnings on GCC 4.8 with -Wmaybe-uninitialized
Michael Jeanson [Mon, 14 Jun 2021 15:18:19 +0000 (11:18 -0400)] 
Silence warnings on GCC 4.8 with -Wmaybe-uninitialized

We still build on SLES12 with GCC 4.8 in which '-Wmaybe-uninitialized'
doesn't seem to be the sharpest tool in the shed. Add explicit
initialization of 'ret' to silence the warnings.

Change-Id: I1f9de535b6be48357735af106ff555ab9eceb730
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>

3 years agodoc/man/common-footer.txt: add missing non-breaking space
Philippe Proulx [Tue, 15 Jun 2021 03:07:32 +0000 (23:07 -0400)] 
doc/man/common-footer.txt: add missing non-breaking space

Signed-off-by: Philippe Proulx <eeppeliteloop@gmail.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: Ibefd4e7448920f0f346697eea5e1b5d250a93d1f

3 years agoRename "tracing session" -> "recording session"
Philippe Proulx [Tue, 15 Jun 2021 02:52:02 +0000 (22:52 -0400)] 
Rename "tracing session" -> "recording session"

Starting from LTTng 2.13, _tracing_ is defined as attempting to execute
one or more actions when emitting an event, which is very close to the
trigger definition.

To highlight that a tracing session is only about event recording,
rename this concept to _recording session_.

This patch mostly changes the manual pages, although I also updated some
C source and other files which contain user-facing text to use the new
term.

I didn't update logging messages because debugging scripts could still
refer to "tracing sessions".

The lttng-concepts(7) manual page mentions that the "recording session"
term was "tracing session" before LTTng 2.13.

Signed-off-by: Philippe Proulx <eeppeliteloop@gmail.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I620d6b6be9e0f1dac14c0bc5e26094c3b3711c75

3 years agodoc/man: use double quotes when referring to internal section
Philippe Proulx [Mon, 14 Jun 2021 17:05:37 +0000 (13:05 -0400)] 
doc/man: use double quotes when referring to internal section

This patch adds double quotes to all the manual page internal section
references using their full name. Those references often have the
following AsciiDoc form:

    See the <<id,Full section name>> section below.

With this patch, this would be converted to:

    See the ``<<id,Full section name>>'' section below.

In the rendered manual page, before this patch:

    See the Full section name section below.
            ¯¯¯¯ ¯¯¯¯¯¯¯ ¯¯¯¯
With this patch:

    See the “Full section name” section below.

The purpose of this patch is, thanks to the change in
`doc/man/manpage.xsl`, to remove the italic style for the text of
internal links. Because there's no way to create dynamic internal links
in a manual page, this style causes internal links to look weird when
they're not a full section name, for example:

    Note that the trigger doesn't need to [...]
                  ¯¯¯¯¯¯¯
The HTML rendering of LTTng-tools manual pages can still benefit from
internal links. This patch makes it possible to add more internal links
without degrading the visual style of manual pages when rendered in a
terminal.

Signed-off-by: Philippe Proulx <eeppeliteloop@gmail.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I1a5ef7eab7ff1e66c137e16b51a9c9074e43f583

3 years agodoc/man: update type/domain options for common event rule spec.
Philippe Proulx [Tue, 18 May 2021 14:14:47 +0000 (10:14 -0400)] 
doc/man: update type/domain options for common event rule spec.

This patch updates manual pages to follow the recent `--type` and
`--domain` option changes of the lttng-add-trigger(1) command, which
accepts the common event rule specification options of
lttng-event-rule(7).

Signed-off-by: Philippe Proulx <eeppeliteloop@gmail.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I6064734534e773bf4f03b5f1e849b57134583039

3 years ago.gitreview: Set default branch to 'stable-2.13'
Michael Jeanson [Tue, 15 Jun 2021 19:05:16 +0000 (15:05 -0400)] 
.gitreview: Set default branch to 'stable-2.13'

Change-Id: Ia321edf68795a2a560b38947d5d888536cae08fa
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>

3 years agoFix: use of uninitialised bytes valgrind warning
Francis Deslauriers [Wed, 16 Jun 2021 16:10:42 +0000 (12:10 -0400)] 
Fix: use of uninitialised bytes valgrind warning

Issue
=====

Valgrind reports usage of uninitialised stack allocated memory:
  ==2961363== Thread 9 Client manageme:
  ==2961363== Syscall param sendmsg(msg.msg_iov[0]) points to uninitialised byte(s)
  ==2961363==    at 0x521418D: __libc_sendmsg (sendmsg.c:28)
  ==2961363==    by 0x521418D: sendmsg (sendmsg.c:25)
  ==2961363==    by 0x53411B: lttcomm_send_unix_sock (unix.c:294)
  ==2961363==    by 0x48AA8C: send_unix_sock (client.c:896)
  ==2961363==    by 0x484F45: thread_manage_clients (client.c:2865)
  ==2961363==    by 0x480FB4: launch_thread (thread.c:66)
  ==2961363==    by 0x5208608: start_thread (pthread_create.c:477)
  ==2961363==    by 0x5346292: clone (clone.S:95)
  ==2961363==  Address 0x7575389 is 25 bytes inside a block of size 16,384 alloc'd
  ==2961363==    at 0x483DFAF: realloc (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_memcheck-amd64-linux.so)
  ==2961363==    by 0x4EB618: lttng_dynamic_buffer_set_capacity (dynamic-buffer.c:166)
  ==2961363==    by 0x4EB52C: lttng_dynamic_buffer_append (dynamic-buffer.c:55)
  ==2961363==    by 0x48CBA1: setup_lttng_msg (client.c:125)
  ==2961363==    by 0x48AD70: setup_lttng_msg_no_cmd_header (client.c:860)
  ==2961363==    by 0x489825: process_client_msg (client.c:2253)
  ==2961363==    by 0x484A97: thread_manage_clients (client.c:2807)
  ==2961363==    by 0x480FB4: launch_thread (thread.c:66)
  ==2961363==    by 0x5208608: start_thread (pthread_create.c:477)
  ==2961363==    by 0x5346292: clone (clone.S:95)
  ==2961363==  Uninitialised value was created by a stack allocation
  ==2961363==    at 0x485FE4: process_client_msg (client.c:928)

After some digging, I found that this warning was caused by the padding
of the `struct lttng_session_list_schedules_return` during the
`LTTNG_SESSION_LIST_ROTATION_SCHEDULES` command.

All the fields of the stack-allocated struct are initialised by the
designated initializer, but the padding is not.

These padding bytes are reported by Valgrind as being used
uninitialised.

Fix
===

Remove the padding by adding the LTTNG_PACKED attribute to the nested
structs in `struct lttng_session_list_schedules_return`.
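
As a minimal sketch of why packing removes the problem, the two structs
below compare a padded and a packed layout. The struct and function names
are hypothetical and do not reproduce the lttng-tools definitions; the
attribute is written out as `__attribute__((__packed__))`, which is
assumed here to be what LTTNG_PACKED stands for.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layouts for illustration; not the lttng-tools definitions. */
struct schedule_padded {
	uint8_t set;    /* the compiler typically inserts padding after this */
	uint64_t value;
};

struct schedule_packed {
	uint8_t set;
	uint64_t value; /* placed immediately after `set`, no padding */
} __attribute__((__packed__));

/* A packed struct is exactly as large as the sum of its fields. */
static_assert(sizeof(struct schedule_packed) ==
		sizeof(uint8_t) + sizeof(uint64_t),
		"packed struct must not contain padding");

/* Number of bytes in each struct that belong to no field. */
static size_t padded_gap(void)
{
	return sizeof(struct schedule_padded) -
			sizeof(uint8_t) - sizeof(uint64_t);
}

static size_t packed_gap(void)
{
	return sizeof(struct schedule_packed) -
			sizeof(uint8_t) - sizeof(uint64_t);
}
```

A designated initializer only guarantees the value of the named fields,
so the bytes counted by `padded_gap()` may be left uninitialised when the
whole struct is copied into a message buffer; with the packed layout
there are no such bytes.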

Notes
=====

In light of the actual root cause, this stack trace is not really
useful.

The realloc call that grows the buffer makes it hard to find the actual
uninitialised stack allocation because Valgrind reports the realloc call
as the problematic site.

I was able to track this issue by adding a "consuming" step in the
`lttng_dynamic_buffer_append()` function. This consuming step would sum
all the bytes of the `buf` parameter so as to force Valgrind to check
each byte and not wait until the `sendmsg()` call. This way, I was able
to get a more precise location of the root cause of the issue.
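
The "consuming" step described above could look like the following
sketch. The helper name `consume_bytes` and the `consume_sink` variable
are hypothetical and not part of lttng-tools. Since Memcheck reports
uninitialised values when they influence control flow, the sketch
branches on the sum rather than only computing it.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical sink so the branch below has an observable side effect. */
static volatile unsigned int consume_sink;

/*
 * Hypothetical debugging helper: read every byte of `buf` and branch on
 * the result, so that Memcheck evaluates the definedness of the data
 * here instead of at a later sendmsg() call.
 */
static unsigned int consume_bytes(const void *buf, size_t len)
{
	const uint8_t *bytes = buf;
	unsigned int sum = 0;
	size_t i;

	for (i = 0; i < len; i++) {
		sum += bytes[i];
	}

	/* The branch depends on every byte that was summed. */
	if (sum != 0) {
		consume_sink = sum;
	}

	return sum;
}
```

Calling such a helper on the `buf` parameter at the top of
`lttng_dynamic_buffer_append()` would make the report point at the
append that introduced the uninitialised bytes.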

Signed-off-by: Francis Deslauriers <francis.deslauriers@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: Ib4a729575e9117cf95716ad25e1417c833f4232b

3 years agoFix: build: libcommon fd-tracker dependency is not available
Jonathan Rajotte [Mon, 7 Jun 2021 18:21:06 +0000 (14:21 -0400)] 
Fix: build: libcommon fd-tracker dependency is not available

Observed issue
==============

A build configured with:

  ./configure --disable-bin-lttng --disable-bin-lttng-crash --disable-bin-lttng-sessiond --disable-bin-lttng-relayd

Fails at build time with:

  make[3]: *** No rule to make target '../../src/common/fd-tracker/libfd-tracker.la', needed by 'libcommon.la'.  Stop.
  make[3]: *** Waiting for unfinished jobs....
  CC       lttng-elf.lo

Cause
=====

fd-tracker is required by libcommon. This dependency was introduced by
commit 8bb66c3cd60938352927ee865759433387324250 [1].

The build of libfd-tracker is controlled at the configure level by
build_lib_fd_tracker, which in turn is enabled or disabled by the
--enable/disable-bin-* options.

For the observed issue, the --enable-bin-lttng-consumerd option alone
does not enable the build of libfd-tracker.

Solution
========

All dependencies of libcommon are now always built. All binaries require
libcommon to be present anyway.

This patch also fixes a problem where the examples under doc are built
even if liblttng-ctl is not built.

Known drawbacks
===============

None.

References
==========

[1]
http://git.lttng.org/?p=lttng-tools.git;a=commit;h=8bb66c3cd60938352927ee865759433387324250

Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I94f5d7cdadcb4f8ff9c2617a675659c1f9eb4709

3 years agoClean-up: mark lttng_error_query communication header as const
Jérémie Galarneau [Wed, 9 Jun 2021 22:04:28 +0000 (18:04 -0400)] 
Clean-up: mark lttng_error_query communication header as const

Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I166ef90aee0d4d7da9ce1002cbbe2a35eea88757

3 years agoAdd condition-targeting error query
Jérémie Galarneau [Wed, 9 Jun 2021 21:58:45 +0000 (17:58 -0400)] 
Add condition-targeting error query

Notifications discarded by the tracers are reported at the level of a
trigger. As those errors are specific to triggers with an "event-rule
matches" condition, they should be reported through a condition-specific
error query.

Note that a condition error query is created from a trigger: there is no
ambiguity since, unlike actions, conditions cannot be nested.

Given the proximity of the final 2.13 release, the code which populated
trigger error query results is simply used to populate the condition
error query results when the condition is of type "event-rule matches".

No trigger-scope errors can be reported for the moment. However, such
error reports will be added in the future.

Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: Ie1ac3668142041beb6fd61574ccef506707c55b2

3 years agoaction list: missing renames from previous name "group"
Francis Deslauriers [Tue, 8 Jun 2021 21:21:30 +0000 (17:21 -0400)] 
action list: missing renames from previous name "group"

Signed-off-by: Francis Deslauriers <francis.deslauriers@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I4373984c2bea96dc67880b1bbb361fb8fbc014ca
