From ce9dbd47eea9fe023691518ef46b58c03b89c236 Mon Sep 17 00:00:00 2001 From: =?utf8?q?J=C3=A9r=C3=A9mie=20Galarneau?= Date: Tue, 14 Dec 2021 14:52:24 -0500 Subject: [PATCH] Fix: relayd comm: improperly packed rotate streams command header MIME-Version: 1.0 Content-Type: text/plain; charset=utf8 Content-Transfer-Encoding: 8bit Observed issue ============== The implicit rotation of a session performed during its destruction fails on LTTng 2.12 (and upwards) with the following errors: Error: Relayd rotate streams replied error 152 Error: Relayd rotate stream failed. Cleaning up relayd 2 Error: Rotate channel failed Failed to find relay daemon socket: relayd_id = 2 Error: Failed to perform a quiet rotation as part of the destruction of session "my_session": Rotation failure on consumer Cause ===== Error 152 matches the LTTNG_ERR_INVALID_PROTOCOL error, which implies that the consumer daemon sent an unexpected command to the relay daemon. It was determined that the RELAYD_ROTATE_STREAMS command header is not properly packed since the LTTNG_PACKED annotation was omitted from its `new_chunk_id` optional field. The documentation of LTTNG_OPTIONAL_COMM duly indicates that this is required. Without the use of LTTNG_PACKED, various lengths of padding (3 or 7 bytes) are inserted between new_chunk_id's `is_set` and `value` field to align `value`, which results in an incorrect interpretation of the command's arguments. The relay daemon catches the protocol error when it is built in a configuration that inserts 7 bytes of padding, while the consumer only inserts three. Solution ======== The solution proposed here is not perfect, see "Known drawbacks". First, if we were to annotate the field, patched consumer daemons would send unintelligible command headers to unpatched relay daemons. Leaving it as is is the least of all evils, see "Known drawbacks" for more details. From the relay daemon end, we can easily anticipate the padding problem by reading the `stream_count` field and use it to determine the expected size of the payload. The difference between the actual size of the payload and the expected size allows us to determine the padding size and use the appropriate declaration of the structure to interpret the command's arguments. Known drawbacks =============== While this fix causes the relay daemon to handle all improperly packed command headers received from an unpatched consumer daemon, the reverse is not completely true. The following tables show which cases are known to work and which are known to be broken when patched and unpatched versions of the relay and consumer daemons are mixed, with the various alignment constraints. Note that here: - 4 byte alignement implies "daemon running on an architecture on which uint64_t is aligned on an 4-byte boundary" (e.g. x86), - 8 byte alignement implies "daemon running on an architecture on which uint64_t is aligned on an 8-byte boundary" (e.g. x86-64). Scenario 1 - Unpatched relay daemon and unpatched consumer daemon ----------------------------------------------------------------------------------- | Architecture alignment | 4 byte alignement consumerd | 8 byte alignment consumerd | |------------------------|-----------------------------|----------------------------| | 4 byte alignment relay | Works | Fails | | 8 byte alignment relay | Fails | Works | ----------------------------------------------------------------------------------- Scenario 2 - Patched relay daemon and unpatched consumer daemon ----------------------------------------------------------------------------------- | Architecture alignment | 4 byte alignement consumerd | 8 byte alignment consumerd | |------------------------|-----------------------------|----------------------------| | 4 byte alignment relay | Works | Works | | 8 byte alignment relay | Works | Works | ----------------------------------------------------------------------------------- Scenario 3 - Unpatched relay daemon and patched consumer daemon ----------------------------------------------------------------------------------- | Architecture alignment | 4 byte alignement consumerd | 8 byte alignment consumerd | |------------------------|-----------------------------|----------------------------| | 4 byte alignment relay | Works | Fails | | 8 byte alignment relay | Fails | Works | ----------------------------------------------------------------------------------- Scenario 4 - Patched relay daemon and patched consumer daemon ----------------------------------------------------------------------------------- | Architecture alignment | 4 byte alignement consumerd | 8 byte alignment consumerd | |------------------------|-----------------------------|----------------------------| | 4 byte alignment relay | Works | Works | | 8 byte alignment relay | Works | Works | ----------------------------------------------------------------------------------- Note that Scenarios 1 and 3 are the same since this fix doesn't change the behaviour of the consumer daemon. Also note that packing the `new_chunk_id` field would break the two working cases of scenario 3 which are, in all likelyhood, the more common cases. A new command using a properly packed version of the command's header could be implemented in future versions, but this presents no benefit as part of this fix. Signed-off-by: Jérémie Galarneau Change-Id: I3b822a9766ea5be51996a771a0d4a90efe29ce0b --- src/bin/lttng-relayd/main.cpp | 141 ++++++++++++++++++++++++++---- src/common/sessiond-comm/relayd.h | 53 +++++++++++ 2 files changed, 176 insertions(+), 18 deletions(-) diff --git a/src/bin/lttng-relayd/main.cpp b/src/bin/lttng-relayd/main.cpp index 07d79adee..adbf3f3b4 100644 --- a/src/bin/lttng-relayd/main.cpp +++ b/src/bin/lttng-relayd/main.cpp @@ -2513,6 +2513,125 @@ end_no_session: return ret; } +static ssize_t relay_unpack_rotate_streams_header( + const struct lttng_buffer_view *payload, + struct lttcomm_relayd_rotate_streams *_rotate_streams) +{ + struct lttcomm_relayd_rotate_streams rotate_streams; + /* + * Set to the smallest version (packed) of `lttcomm_relayd_rotate_streams`. + * This is the smallest version of this structure, but it can be larger; + * this variable is updated once the proper size of the structure is known. + * + * See comment at the declaration of this structure for more information. + */ + ssize_t header_len = sizeof(struct lttcomm_relayd_rotate_streams_packed); + size_t expected_payload_size_no_padding, + expected_payload_size_3_bytes_padding, + expected_payload_size_7_bytes_padding; + + if (payload->size < header_len) { + ERR("Unexpected payload size in \"relay_rotate_session_stream\": expected >= %zu bytes, got %zu bytes", + header_len, payload->size); + goto error; + } + + /* + * Some versions incorrectly omitted the LTTNG_PACKED annotation on the + * `new_chunk_id` optional field of struct lttcomm_relayd_rotate_streams. + * + * We start by "unpacking" `stream_count` to figure out the padding length + * emited by our peer. + */ + memcpy(&rotate_streams.stream_count, payload->data, + sizeof(rotate_streams.stream_count)); + rotate_streams = (typeof(rotate_streams)) { + .stream_count = be32toh(rotate_streams.stream_count), + }; + + /* + * Payload size expected given the possible padding lengths in + * `struct lttcomm_relayd_rotate_streams`. + */ + expected_payload_size_no_padding = (rotate_streams.stream_count * + sizeof(*rotate_streams.rotation_positions)) + + sizeof(lttcomm_relayd_rotate_streams_packed); + expected_payload_size_3_bytes_padding = (rotate_streams.stream_count * + sizeof(*rotate_streams.rotation_positions)) + + sizeof(lttcomm_relayd_rotate_streams_3_bytes_padding); + expected_payload_size_7_bytes_padding = (rotate_streams.stream_count * + sizeof(*rotate_streams.rotation_positions)) + + sizeof(lttcomm_relayd_rotate_streams_7_bytes_padding); + + if (payload->size == expected_payload_size_no_padding) { + struct lttcomm_relayd_rotate_streams_packed packed_rotate_streams; + + /* + * This handles cases where someone might build with + * -fpack-struct or any other toolchain that wouldn't produce + * padding to align `value`. + */ + DBG("Received `struct lttcomm_relayd_rotate_streams` with no padding"); + + header_len = sizeof(packed_rotate_streams); + memcpy(&packed_rotate_streams, payload->data, header_len); + + /* Unpack the packed structure to the natively-packed version. */ + *_rotate_streams = (typeof(*_rotate_streams)) { + .stream_count = be32toh(packed_rotate_streams.stream_count), + .new_chunk_id = (typeof(_rotate_streams->new_chunk_id)) { + .is_set = !!packed_rotate_streams.new_chunk_id.is_set, + .value = be64toh(packed_rotate_streams.new_chunk_id.value), + } + }; + } else if (payload->size == expected_payload_size_3_bytes_padding) { + struct lttcomm_relayd_rotate_streams_3_bytes_padding padded_rotate_streams; + + DBG("Received `struct lttcomm_relayd_rotate_streams` with 3 bytes of padding (4-byte aligned peer)"); + + header_len = sizeof(padded_rotate_streams); + memcpy(&padded_rotate_streams, payload->data, header_len); + + /* Unpack the 3-byte padded structure to the natively-packed version. */ + *_rotate_streams = (typeof(*_rotate_streams)) { + .stream_count = be32toh(padded_rotate_streams.stream_count), + .new_chunk_id = (typeof(_rotate_streams->new_chunk_id)) { + .is_set = !!padded_rotate_streams.new_chunk_id.is_set, + .value = be64toh(padded_rotate_streams.new_chunk_id.value), + } + }; + } else if (payload->size == expected_payload_size_7_bytes_padding) { + struct lttcomm_relayd_rotate_streams_7_bytes_padding padded_rotate_streams; + + DBG("Received `struct lttcomm_relayd_rotate_streams` with 7 bytes of padding (8-byte aligned peer)"); + + header_len = sizeof(padded_rotate_streams); + memcpy(&padded_rotate_streams, payload->data, header_len); + + /* Unpack the 7-byte padded structure to the natively-packed version. */ + *_rotate_streams = (typeof(*_rotate_streams)) { + .stream_count = be32toh(padded_rotate_streams.stream_count), + .new_chunk_id = (typeof(_rotate_streams->new_chunk_id)) { + .is_set = !!padded_rotate_streams.new_chunk_id.is_set, + .value = be64toh(padded_rotate_streams.new_chunk_id.value), + } + }; + + header_len = sizeof(padded_rotate_streams); + } else { + ERR("Unexpected payload size in \"relay_rotate_session_stream\": expected %zu, %zu or %zu bytes, got %zu bytes", + expected_payload_size_no_padding, + expected_payload_size_3_bytes_padding, + expected_payload_size_7_bytes_padding, + payload->size); + goto error; + } + + return header_len; +error: + return -1; +} + /* * relay_rotate_session_stream: rotate a stream to a new tracefile for the * session rotation feature (not the tracefile rotation feature). @@ -2530,11 +2649,11 @@ static int relay_rotate_session_streams( struct lttcomm_relayd_rotate_streams rotate_streams; struct lttcomm_relayd_generic_reply reply = {}; struct relay_stream *stream = NULL; - const size_t header_len = sizeof(struct lttcomm_relayd_rotate_streams); struct lttng_trace_chunk *next_trace_chunk = NULL; struct lttng_buffer_view stream_positions; char chunk_id_buf[MAX_INT_DEC_LEN(uint64_t)]; const char *chunk_id_str = "none"; + ssize_t header_len; if (!session || !conn->version_check_done) { ERR("Trying to rotate a stream before version check"); @@ -2548,24 +2667,12 @@ static int relay_rotate_session_streams( goto end_no_reply; } - if (payload->size < header_len) { - ERR("Unexpected payload size in \"relay_rotate_session_stream\": expected >= %zu bytes, got %zu bytes", - header_len, payload->size); + header_len = relay_unpack_rotate_streams_header(payload, &rotate_streams); + if (header_len < 0) { ret = -1; goto end_no_reply; } - memcpy(&rotate_streams, payload->data, header_len); - - /* Convert header to host endianness. */ - rotate_streams = (typeof(rotate_streams)) { - .stream_count = be32toh(rotate_streams.stream_count), - .new_chunk_id = (typeof(rotate_streams.new_chunk_id)) { - .is_set = !!rotate_streams.new_chunk_id.is_set, - .value = be64toh(rotate_streams.new_chunk_id.value), - } - }; - if (rotate_streams.new_chunk_id.is_set) { /* * Retrieve the trace chunk the stream must transition to. As @@ -2603,7 +2710,7 @@ static int relay_rotate_session_streams( chunk_id_str); stream_positions = lttng_buffer_view_from_view(payload, - sizeof(rotate_streams), -1); + header_len, -1); if (!stream_positions.data || stream_positions.size < (rotate_streams.stream_count * @@ -2662,8 +2769,6 @@ end_no_reply: return ret; } - - /* * relay_create_trace_chunk: create a new trace chunk */ diff --git a/src/common/sessiond-comm/relayd.h b/src/common/sessiond-comm/relayd.h index fe7d2936d..36a91070c 100644 --- a/src/common/sessiond-comm/relayd.h +++ b/src/common/sessiond-comm/relayd.h @@ -239,17 +239,70 @@ struct lttcomm_relayd_stream_rotation_position { uint64_t rotate_at_seq_num; } LTTNG_PACKED; +/* + * For certain releases, the LTTNG_PACKED annotation was missing on the + * `new_chunk_id` field which causes padding to be added between the + * "optional" structure's `is_set` and `value` fields. + * + * Three alignment cases are handled: + * - `value` is aligned to the next byte boundary after `is_set` + * no padding is produced, see + * `struct lttcomm_relayd_rotate_streams_packed`, + * - `value` is aligned to the next 4-byte boundary after `is_set` + * (e.g. x86), 3 bytes of padding are produced, see + * `struct lttcomm_relayd_rotate_streams_3_bytes_padding`, + * - `value` is aligned to the next 8-byte boundary after `is_set` + * (e.g. x86-64), 7 bytes of padding are produced, see + * `struct lttcomm_relayd_rotate_streams_7_bytes_padding`. + * + * Note that since this structure's advertised size is used to determine + * the size of the padding it includes, it can't be extended with new + * optional fields. A new command would be needed. + */ struct lttcomm_relayd_rotate_streams { uint32_t stream_count; /* * Streams can be rotated outside of a chunk but not be parented to * a new chunk. + * + * Improperly packed, but left as-is for backwards compatibility + * with unpatched relay daemons. */ LTTNG_OPTIONAL_COMM(uint64_t) new_chunk_id; /* `stream_count` positions follow. */ struct lttcomm_relayd_stream_rotation_position rotation_positions[]; } LTTNG_PACKED; +struct lttcomm_relayd_rotate_streams_packed { + uint32_t stream_count; + LTTNG_OPTIONAL_COMM(uint64_t) LTTNG_PACKED new_chunk_id; + struct lttcomm_relayd_stream_rotation_position rotation_positions[]; +} LTTNG_PACKED; + +struct lttcomm_relayd_rotate_streams_3_bytes_padding { + uint32_t stream_count; + struct { + union { + uint8_t is_set; + uint32_t padding; + }; + uint64_t value; + } LTTNG_PACKED new_chunk_id; + struct lttcomm_relayd_stream_rotation_position rotation_positions[]; +} LTTNG_PACKED; + +struct lttcomm_relayd_rotate_streams_7_bytes_padding { + uint32_t stream_count; + struct { + union { + uint8_t is_set; + uint64_t padding; + }; + uint64_t value; + } LTTNG_PACKED new_chunk_id; + struct lttcomm_relayd_stream_rotation_position rotation_positions[]; +} LTTNG_PACKED; + struct lttcomm_relayd_create_trace_chunk { uint64_t chunk_id; /* Seconds since EPOCH. */ -- 2.34.1