Mathieu Desnoyers [Wed, 9 Sep 2015 15:56:36 +0000 (11:56 -0400)]
Enhance relayd error reporting
relay_process_data has error cases that don't print any error to the
console. Add those cases, and enhance the information provided by error
output within handle_index_data().
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Mathieu Desnoyers [Wed, 9 Sep 2015 15:56:34 +0000 (11:56 -0400)]
Fix: relayd: handle consumerd crashes without leak
We can be clever about indexes partially received in cases where we
received the data socket part, but not the control socket part: since
we're currently closing the stream on behalf of the control socket, we
*know* there won't be any more control information for this socket.
Therefore, we can destroy all indexes for which we have received only
the file descriptor (from data socket). This takes care of consumerd
crashes between sending the data and control information for a packet.
Since those are sent in that order, we take care of consumerd crashes.
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Mathieu Desnoyers [Wed, 9 Sep 2015 15:56:33 +0000 (11:56 -0400)]
Fix: LPOLLHUP and LPOLLERR when there is still data in pipe/socket
The event mask returned by poll/epoll is a bitwise mask made of all the
events observed. On bidirectional sockets, there are cases where
combinations of LPOLLHUP/LPOLLERR and LPOLLIN/LPOLLPRI can be raised at
the same time.
Currently the overall behavior in sessiond, consumerd and relayd is to
handle LPOLLHUP or LPOLLERR immediately, whether or not there is still
data to read in the socket. Unfortunately, this behavior may discard the
last information made available on the pipe or socket.
Audit all uses of LPOLLHUP and LPOLLERR on sockets on which we expect
data to ensure that we deal with LPOLLIN or LPOLLPRI, and catch the
hangup when read or recvmsg returns 0. Keep the LPOLLHUP and LPOLLERR
handling, but only when LPOLLIN is not raised, just in case some
unforeseen error happens when sending the reply.
There is one case where it is correct to handle LPOLLHUP and LPOLLERR
directly without caring about LPOLLIN: sockets where we are expected to
write and then read the reply (e.g. command sockets). It is then OK
for a dedicated thread to watch for LPOLLHUP and LPOLLERR.
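The "data first, hangup second" ordering described above can be sketched
with plain poll(2). This is only an illustration, not the code from this
patch: it uses the POSIX POLL* flags as stand-ins for the LPOLL* wrappers,
and the function names drain_until_hangup() and demo_pipe_hangup() are
made up for the example.

```c
#include <assert.h>
#include <poll.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Drain a file descriptor, giving priority to readable data over hangup.
 * Returns the number of bytes read before the hangup was observed, or -1
 * on error.
 */
long drain_until_hangup(int fd)
{
	long total = 0;
	char buf[64];

	assert(fd >= 0);
	for (;;) {
		struct pollfd pfd = { .fd = fd, .events = POLLIN | POLLPRI };

		if (poll(&pfd, 1, -1) < 0) {
			return -1;
		}
		if (pfd.revents & (POLLIN | POLLPRI)) {
			/* Data first: a 0-byte read is how the hangup is caught. */
			ssize_t len = read(fd, buf, sizeof(buf));

			if (len < 0) {
				return -1;
			}
			if (len == 0) {
				return total;
			}
			total += len;
		} else if (pfd.revents & (POLLHUP | POLLERR | POLLNVAL)) {
			/* Nothing left to read: HUP/ERR can be handled directly. */
			return (pfd.revents & POLLHUP) ? total : -1;
		}
	}
}

/* Write len bytes into a pipe, close the write end, then drain it. */
long demo_pipe_hangup(const char *msg, size_t len)
{
	int fds[2];
	long ret;

	assert(msg);
	if (pipe(fds) < 0) {
		return -1;
	}
	if (write(fds[1], msg, len) != (ssize_t) len) {
		close(fds[0]);
		close(fds[1]);
		return -1;
	}
	close(fds[1]);	/* Readable data and hangup are now both pending. */
	ret = drain_until_hangup(fds[0]);
	close(fds[0]);
	return ret;
}
```

Handling the hangup branch first in this sketch would return before the
pending bytes are read, which is the data loss described above.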
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jonathan Rajotte [Tue, 8 Sep 2015 22:32:12 +0000 (18:32 -0400)]
Fix: double RCU unlock on event_agent_disable_all
Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Mathieu Desnoyers [Mon, 7 Sep 2015 14:36:09 +0000 (10:36 -0400)]
Fix: unbalanced RCU read-side lock in enable event command
When the event validation fails, an unpaired RCU unlock is performed, thus
underflowing the RCU nesting counter.
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Mathieu Desnoyers [Mon, 7 Sep 2015 14:36:08 +0000 (10:36 -0400)]
Add rcu_read_ongoing() assertions around process_client_msg
process_client_msg expects the RCU read-side lock not to be held when it
is called. Validate this using rcu_read_ongoing() at the entry and exit
points of this function. This allows us to quickly catch unbalanced RCU
read-side locks within commands.
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jérémie Galarneau [Sun, 6 Sep 2015 23:40:42 +0000 (19:40 -0400)]
Clean-up and simplify event_agent_disable_all
event_agent_disable_all contains comments which make no sense since
they were blindly copy-pasted from event_agent_enable_all.
Also add an error_unlock label instead of open coding the unlock
on error.
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jérémie Galarneau [Sun, 6 Sep 2015 23:40:13 +0000 (19:40 -0400)]
Document locking assumption of agent_find_event()
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Philippe Proulx [Wed, 2 Sep 2015 16:55:47 +0000 (12:55 -0400)]
Fix: disable agent events by name
The event_agent_disable() function only disables the first
agent event matching a given name. However, if multiple agent
events exist with different loglevels, but share the same name,
we want all of them to be disabled at once.
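As a rough illustration of the intended behaviour, the sketch below walks
every entry instead of stopping at the first match. It uses a plain array
rather than the hash table used by the session daemon, and the struct and
function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified agent event: same name, different loglevels. */
struct agent_event_sketch {
	const char *name;
	int loglevel_value;
	int enabled;
};

/*
 * Disable every event called name, whatever its loglevel.
 * Returns the number of matching events.
 */
size_t disable_events_by_name(struct agent_event_sketch *events,
		size_t count, const char *name)
{
	size_t i, matched = 0;

	assert(events && name);
	for (i = 0; i < count; i++) {
		if (strcmp(events[i].name, name) != 0) {
			continue;
		}
		/* No early return here: keep going after the first match. */
		events[i].enabled = 0;
		matched++;
	}
	return matched;
}
```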
Signed-off-by: Philippe Proulx <eeppeliteloop@gmail.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Philippe Proulx [Wed, 2 Sep 2015 15:31:35 +0000 (11:31 -0400)]
sessiond: add loglevels_match()
The UST and agent event loglevel matching algorithms are the same,
so factor out this code into a common utility.
Signed-off-by: Philippe Proulx <eeppeliteloop@gmail.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Philippe Proulx [Wed, 2 Sep 2015 05:54:23 +0000 (01:54 -0400)]
Fix: include loglevel type in agent event's primary key
Refs: #913
Signed-off-by: Philippe Proulx <eeppeliteloop@gmail.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Philippe Proulx [Wed, 2 Sep 2015 02:53:30 +0000 (22:53 -0400)]
Fix: include loglevel type in UST event's primary key
Refs: #913
Signed-off-by: Philippe Proulx <eeppeliteloop@gmail.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Philippe Proulx [Wed, 2 Sep 2015 01:52:40 +0000 (21:52 -0400)]
sessiond: use `loglevel_value` and `loglevel_type` names
By using the `loglevel_value` and `loglevel_type` names instead
of `loglevel` for one or the other, some unsettling
inconsistencies are exposed.
This patch only changes the names to show the weird stuff, e.g.:
key.loglevel_type = loglevel_value;
A future patch will fix this.
The only `loglevel` names left untouched are those in public headers
as well as those in the tools<->UST ABI.
Signed-off-by: Philippe Proulx <eeppeliteloop@gmail.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Mathieu Desnoyers [Fri, 4 Sep 2015 23:53:19 +0000 (19:53 -0400)]
Tests: kernel wildcards
Fixes #920
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jérémie Galarneau [Sun, 6 Sep 2015 17:52:00 +0000 (13:52 -0400)]
Tests: fix wildcard test path
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Philippe Proulx [Tue, 1 Sep 2015 22:04:51 +0000 (18:04 -0400)]
doc: document untrack command in lttng(1)
Refs: #917
Signed-off-by: Philippe Proulx <eeppeliteloop@gmail.com>
Philippe Proulx [Tue, 1 Sep 2015 22:00:49 +0000 (18:00 -0400)]
doc: document track command in lttng(1)
Refs: #917
Signed-off-by: Philippe Proulx <eeppeliteloop@gmail.com>
Jérémie Galarneau [Sun, 6 Sep 2015 03:51:33 +0000 (23:51 -0400)]
Remove dot after enable-event message
The other domains' enable event confirmation messages don't have
a trailing dot. Removing this one for consistency.
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Antoine Busque [Wed, 2 Sep 2015 00:21:00 +0000 (20:21 -0400)]
Fix: don't print the default channel name when enabling agent events
Enabling an event in the python domain erroneously reported the
channel as being the default `channel0`. Instead, don't report the
channel name when enabling an event in an agent domain.
Fixes: #910
Signed-off-by: Antoine Busque <abusque@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Antoine Busque [Tue, 1 Sep 2015 23:48:43 +0000 (19:48 -0400)]
Fix: fail gracefully on --exclude on unsupported domains
Trying to use event name exclusions on unsupported domains other than
kernel (i.e. log4j, jul, and python) would hang the client. Instead,
report the error appropriately.
Fixes: #909
Signed-off-by: Antoine Busque <abusque@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Antoine Busque [Tue, 1 Sep 2015 23:12:28 +0000 (19:12 -0400)]
Fix: initialize live_timer to 0 for snapshot session
The live timer was being initialized to -1 for snapshot sessions,
instead of the expected default value of 0 used elsewhere in the code.
Fixes #879
Signed-off-by: Antoine Busque <abusque@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Antoine Busque [Tue, 1 Sep 2015 22:53:57 +0000 (18:53 -0400)]
Fix: correct mismatched function signatures
The extern declaration of `_lttng_create_session_ext` in `create.c`
had a superfluous `live_timer` parameter not present in the actual
function definition in `lttng_ctl.c`. The -1 value with which it was
called was therefore unused.
Signed-off-by: Antoine Busque <abusque@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jérémie Galarneau [Sun, 6 Sep 2015 03:03:51 +0000 (23:03 -0400)]
Clearer error reporting when failing to launch session daemon
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Philippe Proulx [Thu, 27 Aug 2015 15:52:33 +0000 (11:52 -0400)]
Daemonize sessiond on `lttng create`
Since the session daemon forked by `lttng create` shares its
standard output/error FDs when not using `--daemonize`, redirecting
the standard output/error of this command to another program "hangs"
because the session daemon never terminates.
Example that's not working (when sessiond is not running):
lttng create | wc
or:
lttng 2>&1 | wc
Using sessiond's `--daemonize` option makes it close its FDs. This
option also ensures that when the sessiond process exits, it has forked
itself as a daemon and is ready to accept commands. Therefore we don't
need to catch SIGCHLD and SIGUSR1; just waitpid() on sessiond's PID and
make sure it exited normally and with an exit status of 0 to continue.
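The waitpid() check described above could look roughly like the sketch
below. It is not the code from this patch: the child simply calls _exit()
where the real one would exec the session daemon with `--daemonize`, and
spawn_and_wait() is a name made up for the example.

```c
#include <assert.h>
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Fork a child which exits with child_status, then wait for it.
 * Returns 0 only if the child exited normally with an exit status of 0,
 * -1 otherwise.
 */
int spawn_and_wait(int child_status)
{
	pid_t pid, waited;
	int status;

	assert(child_status >= 0 && child_status <= 255);
	pid = fork();
	if (pid < 0) {
		return -1;
	}
	if (pid == 0) {
		/* Child: stand-in for the process that daemonizes, then exits. */
		_exit(child_status);
	}
	do {
		waited = waitpid(pid, &status, 0);
	} while (waited < 0 && errno == EINTR);
	if (waited != pid) {
		return -1;
	}
	if (!WIFEXITED(status) || WEXITSTATUS(status) != 0) {
		return -1;	/* Killed by a signal, or non-zero exit status. */
	}
	return 0;
}
```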
Signed-off-by: Philippe Proulx <eeppeliteloop@gmail.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Mathieu Desnoyers [Sat, 5 Sep 2015 23:58:29 +0000 (19:58 -0400)]
Fix: consumer signal handling race
If a signal comes in after ctx has been destroyed, it will try to use a
closed file descriptor.
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Philippe Proulx [Wed, 26 Aug 2015 17:40:18 +0000 (13:40 -0400)]
Fix: list_ust_events(): dangling pointer
Fixes #908
Signed-off-by: Philippe Proulx <eeppeliteloop@gmail.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Philippe Proulx [Sat, 22 Aug 2015 06:47:53 +0000 (02:47 -0400)]
Fix: MI: close domain when listing multiple agent domains
Without this patch, each agent domain gets nested under
the previous one.
Signed-off-by: Philippe Proulx <eeppeliteloop@gmail.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Mathieu Desnoyers [Fri, 4 Sep 2015 23:53:18 +0000 (19:53 -0400)]
Tests: expand UST wildcard tests, move to regression/tools
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Mathieu Desnoyers [Fri, 4 Sep 2015 23:00:14 +0000 (19:00 -0400)]
Tests: kernel filtering
Requires the new lttng-test.ko lttng-modules test module.
Fixes #921
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jonathan Rajotte [Fri, 4 Sep 2015 22:02:48 +0000 (18:02 -0400)]
Fix: use pid element instead of process element
v2: Include change to xsd. Looks like I forgot to squash it. I'll have to make
an offering to the git reflog god on this one.
For stable 2.7
This reverts part of the changes introduced by [1] and [2].
The use of the process element breaks the existing MI XML API.
[1]
46ef4d0715faeef52cd2242b5b895c74507e223a
[2]
a585578f837d992f00eba4f090c8ba251d9de94e
Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jonathan Rajotte [Thu, 3 Sep 2015 21:48:36 +0000 (17:48 -0400)]
Fix: race between kconsumerd and sessiond on tear down
v2: minimize indentation by using return on condition.
Kconsumerd and sessiond both hold a reference on lttng-modules. This can lead to a race
in modprobe_remove_lttng_all, which might fail to unload modules because
some of them do not have a ref count equal to zero at the time.
waitpid is used to force a synchronization on the child (kconsumerd) termination.
This has also been applied to UST consumers for the sake of consistency.
Fixes: #878
Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jérémie Galarneau [Sat, 5 Sep 2015 19:35:42 +0000 (15:35 -0400)]
Fix: Buggy string comparison in ust registry ht_match_event
The second strncmp compares the first "strlen(event->signature) != 0"
characters of the event signatures because of a missing parenthesis.
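The effect of the misplaced parenthesis can be reproduced in isolation.
The two helpers below are illustrative only (their names are not from the
patch); they contrast the buggy shape of the call with the corrected one.

```c
#include <assert.h>
#include <string.h>

/*
 * Buggy shape: the "!= 0" ends up inside the length argument, so the
 * length is 0 or 1 and at most one character is compared.
 */
int signatures_differ_buggy(const char *a, const char *b)
{
	assert(a && b);
	return strncmp(a, b, strlen(a) != 0);
}

/* Fixed shape: compare over the full length, then test the result. */
int signatures_differ_fixed(const char *a, const char *b)
{
	assert(a && b);
	return strncmp(a, b, strlen(a)) != 0;
}
```

With the buggy version, any two signatures sharing their first character
are reported as matching.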
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jérémie Galarneau [Sat, 5 Sep 2015 19:33:54 +0000 (15:33 -0400)]
Fix: Bad cast of lttng_kernel_instrumentation to lttng_event_type
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jérémie Galarneau [Sat, 5 Sep 2015 19:23:32 +0000 (15:23 -0400)]
Fix: Implicit cast from lttng_loglevel_type to lttng_ust_loglevel_type
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jonathan Rajotte [Thu, 3 Sep 2015 21:52:07 +0000 (17:52 -0400)]
Fix: lttng-crash: remove tmp working directory
Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jérémie Galarneau [Sat, 5 Sep 2015 17:55:51 +0000 (13:55 -0400)]
Clean up: Coding style conformance adjustments in lttng-crash.c
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jérémie Galarneau [Sat, 5 Sep 2015 17:53:31 +0000 (13:53 -0400)]
Fix: lttng-crash: DIR leak in delete_trace() on error
Implement a single return point in delete_trace() which ensures
that trace_dir is not leaked on error.
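A minimal sketch of the single-return-point pattern follows. It is not
delete_trace() itself: count_entries() and its max_entries parameter are
invented for the example, the latter only to provide an error path in the
middle of the directory walk.

```c
#include <assert.h>
#include <dirent.h>
#include <string.h>

/*
 * Count the entries of a directory, excluding "." and "..".
 * Returns the count, or -1 on error. Every path leaves through "end",
 * so the DIR handle is closed exactly once.
 */
int count_entries(const char *path, int max_entries)
{
	int ret = 0;
	DIR *dir;
	struct dirent *entry;

	assert(path);
	dir = opendir(path);
	if (!dir) {
		ret = -1;
		goto end;
	}
	while ((entry = readdir(dir)) != NULL) {
		if (!strcmp(entry->d_name, ".") ||
				!strcmp(entry->d_name, "..")) {
			continue;
		}
		if (ret >= max_entries) {
			/* Error mid-walk: an early return here would leak dir. */
			ret = -1;
			goto end;
		}
		ret++;
	}
end:
	if (dir && closedir(dir)) {
		ret = -1;
	}
	return ret;
}
```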
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jérémie Galarneau [Sat, 5 Sep 2015 16:02:10 +0000 (12:02 -0400)]
Fix: Possible passing of NULL pointer to memcpy()
_cmd_enable_event() does not jump to the error label when the memory
allocation of the filter bytecode copy fails. This causes the NULL
return of zmalloc to be passed to memcpy() directly.
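The pattern of the fix, checking the allocation before copying into it,
can be sketched as follows. The struct layout and function names are
assumptions made for the example, and calloc() stands in for zmalloc().

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a filter bytecode blob. */
struct bytecode_sketch {
	size_t len;
	unsigned char data[];
};

/* Build a blob from raw bytes; returns NULL on allocation failure. */
struct bytecode_sketch *bytecode_create(const void *data, size_t len)
{
	struct bytecode_sketch *bc;

	assert(data || len == 0);
	bc = calloc(1, sizeof(*bc) + len);
	if (!bc) {
		return NULL;
	}
	bc->len = len;
	if (len) {
		memcpy(bc->data, data, len);
	}
	return bc;
}

/* Duplicate a blob; memcpy() is only reached with a valid destination. */
struct bytecode_sketch *bytecode_copy(const struct bytecode_sketch *src)
{
	struct bytecode_sketch *copy;
	size_t size;

	assert(src);
	size = sizeof(*src) + src->len;
	copy = calloc(1, size);
	if (!copy) {
		goto error;	/* The jump that was missing in the buggy path. */
	}
	memcpy(copy, src, size);
	return copy;
error:
	return NULL;
}
```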
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jérémie Galarneau [Sat, 5 Sep 2015 15:57:52 +0000 (11:57 -0400)]
Fix: Overwrite of ret in relay_recv_metadata
relay_recv_metadata() interchangeably uses ret and size_ret.
This causes ret to take various (positive) values in case
of success, most often corresponding to the size of the metadata
padding which was written during the call.
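One way to avoid this kind of overwrite is to keep the status code and
the byte counts in separate variables. The sketch below illustrates that
idea on a made-up write_with_padding() helper; it is not
relay_recv_metadata() and treats short writes as errors for brevity.

```c
#include <assert.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Write a payload followed by zero padding.
 * Returns 0 on success and -1 on error, never a byte count.
 */
int write_with_padding(int fd, const void *buf, size_t len, size_t padding)
{
	int ret = 0;		/* Status only: 0 or -1. */
	ssize_t size_ret;	/* Byte counts only. */
	char zeros[16];

	assert(buf);
	assert(padding <= sizeof(zeros));
	memset(zeros, 0, sizeof(zeros));

	size_ret = write(fd, buf, len);
	if (size_ret < 0 || (size_t) size_ret != len) {
		ret = -1;
		goto end;
	}
	size_ret = write(fd, zeros, padding);
	if (size_ret < 0 || (size_t) size_ret != padding) {
		ret = -1;
		goto end;
	}
end:
	return ret;
}
```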
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jérémie Galarneau [Sat, 5 Sep 2015 15:44:01 +0000 (11:44 -0400)]
Silence undefined return value warning
clang-analyzer complains that "ret" may be returned uninitialized
which can't happen with a valid session configuration. For this to
occur, either libxml2 would have to return a bogus ChildElementCount
(return non-zero when there are actually no child nodes) _or_ the node
would have children of an unexpected type, which would be caught by
the validation performed against the XSD.
Nonetheless, the value is initialized here to silence this warning.
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jérémie Galarneau [Sat, 5 Sep 2015 15:35:43 +0000 (11:35 -0400)]
Silence use-after-free static analysis warning
clang-analyzer complains that cds_list_for_each_entry_safe()
makes use of "wait_node" after free. However, wait_node is only
used in __typeof__().
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jérémie Galarneau [Sat, 5 Sep 2015 02:04:12 +0000 (22:04 -0400)]
Fix: Wait for in-flight data before closing a stream
A stream's closing conditions are evaluated in three places:
1) When a close command is received
2) When the control connection owning it is closed
3) The stream has received all of its data following
a close request.
These checks are performed in try_stream_close().
A known downside of this approach is that a stream will never
be closed if it has not received all of its data.
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Mathieu Desnoyers [Fri, 4 Sep 2015 19:44:23 +0000 (15:44 -0400)]
Fix: unpublish stream on close
Fixes a race where data connection can still add indexes after close,
preventing graceful teardown of the stream.
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jonathan Rajotte [Thu, 3 Sep 2015 21:52:06 +0000 (17:52 -0400)]
Fix: lttng-crash: fd leak
At the same time, make sure to have a single exit-point in copy_file().
Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jérémie Galarneau [Fri, 4 Sep 2015 21:34:44 +0000 (17:34 -0400)]
Fix: Invalid parameter error reported when untracking PID
The LTTng client reports an "Invalid parameter" error if a PID is
untracked from the userspace domain before any PID is tracked in that
domain.
Fixes #931
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Mathieu Desnoyers [Wed, 2 Sep 2015 14:45:07 +0000 (10:45 -0400)]
Fix: kernel track/untrack error handling
Fixes #918
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jonathan Rajotte [Mon, 24 Aug 2015 15:08:29 +0000 (08:08 -0700)]
Fix: Python agent tests are always skipped
v2: Change the configure report section to emulate the java ust agent test
reporting.
Introduce --enable-test-python{2,3}-agent and --enable-test-python-agent-all
flag on configure.
Configure searches for the python agent for both Python 2 and 3 and
enables or skips their associated tests based on the result.
When using the --enable-test-python{2,3}-agent and --enable-test-python-agent-all
flags, a strict check of the tests' dependencies is performed and
configure fails instead of simply disabling the tests.
--disable* flags can be used to force tests skipping.
Also fixes a minor bug in agent test on enabling event with filtering.
Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jérémie Galarneau [Fri, 4 Sep 2015 01:45:10 +0000 (21:45 -0400)]
Tests: Fix flaky live test client
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jérémie Galarneau [Fri, 4 Sep 2015 01:43:26 +0000 (21:43 -0400)]
Fix: Announce empty streams on live attach
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Mathieu Desnoyers [Thu, 3 Sep 2015 21:17:30 +0000 (17:17 -0400)]
Fix: relayd: file rotation and live read
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Mathieu Desnoyers [Thu, 3 Sep 2015 21:17:29 +0000 (17:17 -0400)]
Fix: relay: viewer_get_next_index handle null vstream
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Mathieu Desnoyers [Thu, 3 Sep 2015 21:17:28 +0000 (17:17 -0400)]
Fix: relayd: make viewer streams consider metadata sent
The metadata stream does not use prev seq, and is therefore not sent to
viewers if we depend on prev seq. Use the metadata_received field
instead to achieve the same purpose: if a viewer tries to attach to a
session that has not received metadata yet, it will get an error
(metadata stream cannot be found when attaching).
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Mathieu Desnoyers [Tue, 25 Aug 2015 18:49:34 +0000 (14:49 -0400)]
Fix: don't expose empty streams
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Mathieu Desnoyers [Thu, 3 Sep 2015 04:38:26 +0000 (00:38 -0400)]
Fix: relayd: don't check new metadata on get packet
We only care about this when we get the next index. There may be new
metadata that appears between get next index and get packet, but it
should not matter.
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Mathieu Desnoyers [Tue, 25 Aug 2015 13:26:30 +0000 (09:26 -0400)]
Fix: relayd: don't check for new streams in get packet
Only needed in get next index.
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Mathieu Desnoyers [Tue, 25 Aug 2015 13:00:50 +0000 (09:00 -0400)]
Fix: ask new streams HUP
Test closed streams for content to check for HUP.
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Mathieu Desnoyers [Tue, 25 Aug 2015 11:53:07 +0000 (07:53 -0400)]
Fix: reply error if get packet vstream fails
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Mathieu Desnoyers [Tue, 25 Aug 2015 11:47:08 +0000 (07:47 -0400)]
Fix: relayd reply error to client if cannot find viewer stream
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Mathieu Desnoyers [Tue, 25 Aug 2015 11:43:16 +0000 (07:43 -0400)]
Fix: relayd reply with error if cannot find metadata
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Mathieu Desnoyers [Tue, 25 Aug 2015 02:50:09 +0000 (22:50 -0400)]
Fix: ust-app: protect app socket protocol with lock
Many threads can access the application socket (cmd handling thread and
application handling thread) concurrently. Therefore, we need to protect
it with a mutex.
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Mathieu Desnoyers [Mon, 24 Aug 2015 19:04:03 +0000 (15:04 -0400)]
Cleanup: privatize consumer_allocate_relayd_sock_pair
Only used in a single compilation unit.
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Mathieu Desnoyers [Mon, 24 Aug 2015 19:03:35 +0000 (15:03 -0400)]
Fix: add missing rcu_barrier at end of sessiond main
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Mathieu Desnoyers [Mon, 24 Aug 2015 19:03:20 +0000 (15:03 -0400)]
Fix: add missing rcu_barrier at end of consumer main
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Mathieu Desnoyers [Mon, 24 Aug 2015 18:41:42 +0000 (14:41 -0400)]
Fix: app cmd leak on sessiond exit
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Mathieu Desnoyers [Sun, 23 Aug 2015 19:52:26 +0000 (12:52 -0700)]
Fix: relayd live don't send incomplete stream list
Sample the "closed" flag with session lock held.
Also, if the session connection is closed while we attach to it, reply
with HUP, because there are risks of attaching to an incomplete session
(e.g. the metadata stream could be already closed).
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Mathieu Desnoyers [Sun, 23 Aug 2015 15:58:50 +0000 (08:58 -0700)]
Fix: consumer timer misses RCU thread registration
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Mathieu Desnoyers [Sun, 23 Aug 2015 05:01:36 +0000 (22:01 -0700)]
Fix: sessiond consumer thread should register as RCU thread
Fixes RCU race where objects are accessed by this thread under RCU
read-side lock after free. Since this thread is not a registered RCU
reader, the read-side lock has no effect.
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Mathieu Desnoyers [Sun, 23 Aug 2015 03:14:44 +0000 (20:14 -0700)]
Fix: don't chain RCU free
We only do a single rcu_barrier() on teardown of sessiond. Therefore, if
we chain call_rcu, they may not all be executed before exit.
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Mathieu Desnoyers [Sun, 23 Aug 2015 02:58:09 +0000 (19:58 -0700)]
Fix: free metadata cache after grace period in consumer
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Mathieu Desnoyers [Wed, 19 Aug 2015 21:44:59 +0000 (14:44 -0700)]
Fix: sessiond vs consumerd push/get metadata deadlock
We need to unlock the registry while we push metadata to break a
circular dependency between the consumerd metadata lock and the sessiond
registry lock. Indeed, pushing metadata to the consumerd awaits that it
gets pushed all the way to relayd, but doing so requires grabbing the
metadata lock. If a concurrent metadata request is being performed by
consumerd, this can try to grab the registry lock on the sessiond while
holding the metadata lock on the consumer daemon. Those push and pull
schemes are performed on two different bidirectional communication
sockets.
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Mathieu Desnoyers [Wed, 19 Aug 2015 21:13:48 +0000 (14:13 -0700)]
Fix: streamline ret/errno of run_as()
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Mathieu Desnoyers [Thu, 3 Sep 2015 03:01:21 +0000 (23:01 -0400)]
Fix: Double unlock on error path
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Mathieu Desnoyers [Thu, 3 Sep 2015 02:59:20 +0000 (22:59 -0400)]
Data pending comment clarification in session daemon
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Mathieu Desnoyers [Thu, 3 Sep 2015 02:57:40 +0000 (22:57 -0400)]
Fix: Relay daemon ownership and reference counting
The ownership and reference counting of the relay daemon is unclear and
buggy in many ways. It is the cause of memory corruptions, double-free,
leaks, segmentation faults, observed in various conditions.
Fix this situation by introducing a clear ownership and reference
counting scheme for this daemon.
See doc/relayd-architecture.txt for details.
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jérémie Galarneau [Thu, 3 Sep 2015 19:09:00 +0000 (15:09 -0400)]
Accept uid and gid parameters in utils_mkdir()/utils_mkdir_recursive()
utils_mkdir* utils may now be used in immediate or "run_as" mode.
This is done since some of the code shared between daemons calls
run_as directly, which doesn't support negative uid/gid (which we use
to mean "run as current user").
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Mathieu Desnoyers [Wed, 19 Aug 2015 07:29:52 +0000 (00:29 -0700)]
Fix: reference counting of consumer output
The UST app session has a reference on the consumer output object, but
it belongs to the UST session. Implement a refcounting scheme to ensure
it is not freed before all users are done using it.
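The refcounting idea can be sketched as below. This is a deliberately
simplified illustration: the names are hypothetical, the counter is not
atomic (real code shared between threads would need atomics or a lock),
and live_outputs exists only so the example can observe when the object
is freed.

```c
#include <assert.h>
#include <stdlib.h>

int live_outputs;	/* Instrumentation for this sketch only. */

struct output_sketch {
	int refcount;
};

/* Create an object holding one reference, owned by the caller. */
struct output_sketch *output_create(void)
{
	struct output_sketch *out = calloc(1, sizeof(*out));

	if (!out) {
		return NULL;
	}
	out->refcount = 1;
	live_outputs++;
	return out;
}

/* Take an additional reference, e.g. for an app session. */
void output_get(struct output_sketch *out)
{
	assert(out && out->refcount > 0);
	out->refcount++;
}

/* Drop a reference; the object is freed when the last user is done. */
void output_put(struct output_sketch *out)
{
	if (!out) {
		return;
	}
	assert(out->refcount > 0);
	if (--out->refcount == 0) {
		live_outputs--;
		free(out);
	}
}
```

In this sketch, the owner dropping its reference first does not free the
object as long as another user still holds one.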
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Mathieu Desnoyers [Tue, 18 Aug 2015 01:47:53 +0000 (18:47 -0700)]
Fix: sessiond add missing socket close
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Mathieu Desnoyers [Sun, 16 Aug 2015 21:56:57 +0000 (17:56 -0400)]
Fix: sessiond should not error on channel creation vs app exit
We should not report an error when creating a channel if the application
is exiting concurrently.
Also, remove an inappropriate assert() in ust_app_create_event_glb: it
is possible to have a channel lookup fail if channel/event creation
occurs concurrently with an application exit.
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Mathieu Desnoyers [Sun, 16 Aug 2015 21:10:22 +0000 (17:10 -0400)]
Fix: sessiond ust-app session teardown race
Add a deleted flag within the ust app session which is raised (with ust
app session lock held) at delete, and checked within each RCU traversal,
again with ust app session lock held.
This takes care of races between teardown of an application (unregister)
and execution of commands which are accessing the app session
concurrently.
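A minimal sketch of the deleted-flag scheme, assuming a pthread mutex as
the session lock and leaving the RCU traversal out entirely. The struct
and function names are invented for the example.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical, simplified app session. */
struct app_session_sketch {
	pthread_mutex_t lock;
	bool deleted;
	int nr_commands;
};

void session_init(struct app_session_sketch *s)
{
	assert(s);
	pthread_mutex_init(&s->lock, NULL);
	s->deleted = false;
	s->nr_commands = 0;
}

/* Teardown side: raise the flag with the session lock held. */
void session_delete(struct app_session_sketch *s)
{
	assert(s);
	pthread_mutex_lock(&s->lock);
	s->deleted = true;
	pthread_mutex_unlock(&s->lock);
}

/*
 * Command side: check the flag with the same lock held.
 * Returns 0 if the command ran, -1 if the session is being torn down.
 */
int session_run_command(struct app_session_sketch *s)
{
	int ret = 0;

	assert(s);
	pthread_mutex_lock(&s->lock);
	if (s->deleted) {
		ret = -1;
		goto end;
	}
	s->nr_commands++;
end:
	pthread_mutex_unlock(&s->lock);
	return ret;
}
```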
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jérémie Galarneau [Tue, 1 Sep 2015 15:58:38 +0000 (11:58 -0400)]
Only display agent loglevel if the loglevel type is not ALL
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jérémie Galarneau [Mon, 31 Aug 2015 22:53:51 +0000 (18:53 -0400)]
Initialize default log level of events on load
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jérémie Galarneau [Mon, 31 Aug 2015 22:32:16 +0000 (18:32 -0400)]
Don't assume that Log4j and JUL share the same log level mappings
We explicitly set the log level of Log4j events to
LTTNG_LOGLEVEL_LOG4J_ALL instead of relying on the fact that
LTTNG_LOGLEVEL_LOG4J_ALL and LTTNG_LOGLEVEL_JUL_ALL are mapped to the
same value.
The resulting additional branch does not seem to incur a significant
performance penalty and, as such, is deemed acceptable.
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jérémie Galarneau [Mon, 31 Aug 2015 20:00:22 +0000 (16:00 -0400)]
Allow the creation of JUL, Log4j and Python channels
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jérémie Galarneau [Sun, 30 Aug 2015 22:55:52 +0000 (18:55 -0400)]
Fix: Save tracker as part of UST and Kernel domains only
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jérémie Galarneau [Sun, 30 Aug 2015 22:50:39 +0000 (18:50 -0400)]
Fix: Memory leak of agent
agent_destroy() has a comment which indicates that it does _not_
destroy the pointer passed to it and it seems that agents are
never released under any code path whatsoever.
There does not seem to be an instance where an agent is allocated on
the stack.
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jérémie Galarneau [Sun, 30 Aug 2015 22:32:47 +0000 (18:32 -0400)]
Fix: Memory leak of agent event internals
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jérémie Galarneau [Sun, 30 Aug 2015 21:44:41 +0000 (17:44 -0400)]
Save filter expression as part of agent events and save them
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jérémie Galarneau [Sun, 30 Aug 2015 21:43:45 +0000 (17:43 -0400)]
Fix: UTF-8 characters may be stored on up to 4 bytes
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jérémie Galarneau [Sun, 30 Aug 2015 21:39:58 +0000 (17:39 -0400)]
Remove unneeded hash table existence check in agent_destroy
This function can never be called if the agent failed to initialize
its hash table. If such a failure occurs, the agent constructor will
return NULL and its caller should handle the error.
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jérémie Galarneau [Sun, 30 Aug 2015 21:39:23 +0000 (17:39 -0400)]
Remove unnecessary RCU read lock
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jérémie Galarneau [Sun, 30 Aug 2015 21:38:25 +0000 (17:38 -0400)]
Use type directly in sizeof instead of a dereferenced pointer
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jérémie Galarneau [Sun, 30 Aug 2015 21:37:41 +0000 (17:37 -0400)]
Prevent the addition of UST events to agent channels
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jérémie Galarneau [Fri, 28 Aug 2015 18:53:26 +0000 (14:53 -0400)]
Don't save log level in session configuration when unneeded
Saving the log level of events in session configurations when "ALL" log
levels are enabled may confuse both users and programs working with
session configurations.
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jérémie Galarneau [Thu, 27 Aug 2015 21:30:37 +0000 (17:30 -0400)]
Remove unneeded RCU lock
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jérémie Galarneau [Thu, 27 Aug 2015 21:30:21 +0000 (17:30 -0400)]
Remove unneeded RCU lock
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jérémie Galarneau [Thu, 27 Aug 2015 21:26:58 +0000 (17:26 -0400)]
Fix: Propagate filter status of kernel events to client
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jérémie Galarneau [Thu, 27 Aug 2015 20:53:24 +0000 (16:53 -0400)]
Fix: Save kernel event filter when saving session configuration
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jérémie Galarneau [Thu, 27 Aug 2015 16:55:04 +0000 (12:55 -0400)]
Docs: there is no need to SHOUT in comments
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jérémie Galarneau [Thu, 27 Aug 2015 15:52:34 +0000 (11:52 -0400)]
Fix: Mention Python as part of enable-event's usage()
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jérémie Galarneau [Wed, 26 Aug 2015 16:05:03 +0000 (12:05 -0400)]
Grammar fix in comment
give -> given
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>