RFC - SIDE ABI

[ This document is under heavy construction. Please beware of the
  potholes as you wander through it. ]

* Introduction

The purpose of the SIDE ABI is to allow a kernel tracer and many
user-space tracers to attach to static and dynamic instrumentation of
user-space applications.

The SIDE ABI expresses the instrumentation description as data (no
generated code). Instrumentation arguments are passed on the stack as an
array of typed items, along with a reference to the instrumentation
description.
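
As an illustration of the paragraph above, the following C sketch shows
an event description expressed as constant data, and a call site that
builds an array of typed items on its stack before handing both to a
tracer callback. All identifiers (sketch_*, request_event,
trace_request, count_args_cb) are hypothetical and chosen for this
example; they are not the libside API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical names throughout: this is not the libside API. */
enum sketch_type { SKETCH_U32, SKETCH_S64, SKETCH_STRING };

/* Description of one field: constant data, no generated code. */
struct sketch_field_desc {
	const char *name;
	enum sketch_type type;
};

/* Description of one event and its fields. */
struct sketch_event_desc {
	const char *provider;
	const char *name;
	const struct sketch_field_desc *fields;
	size_t nr_fields;
};

/* One typed item of the argument array. */
struct sketch_arg {
	enum sketch_type type;
	union {
		uint32_t u32;
		int64_t s64;
		const char *string;
	} u;
};

/* Tracer callback: receives the description and the typed items. */
typedef void (*sketch_tracer_cb)(const struct sketch_event_desc *desc,
		const struct sketch_arg *args, size_t nr_args, void *priv);

static const struct sketch_field_desc request_fields[] = {
	{ "id", SKETCH_U32 },
	{ "latency_ns", SKETCH_S64 },
	{ "path", SKETCH_STRING },
};

static const struct sketch_event_desc request_event = {
	.provider = "myapp",
	.name = "request",
	.fields = request_fields,
	.nr_fields = sizeof(request_fields) / sizeof(request_fields[0]),
};

/* Call site: the typed items live on the caller's stack. */
static void trace_request(sketch_tracer_cb cb, void *priv,
		uint32_t id, int64_t latency_ns, const char *path)
{
	const struct sketch_arg args[] = {
		{ .type = SKETCH_U32, .u.u32 = id },
		{ .type = SKETCH_S64, .u.s64 = latency_ns },
		{ .type = SKETCH_STRING, .u.string = path },
	};

	assert(cb != NULL);
	cb(&request_event, args, sizeof(args) / sizeof(args[0]), priv);
}

/* Example tracer: records how many items it received in *priv. */
static void count_args_cb(const struct sketch_event_desc *desc,
		const struct sketch_arg *args, size_t nr_args, void *priv)
{
	(void) desc;
	(void) args;
	*(size_t *) priv = nr_args;
}
```

In this sketch nothing is generated per event: the tracer callback only
needs the description and the item array to interpret the payload.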

The following ABIs are introduced to let applications declare their
instrumentation and insert instrumentation calls:

- an event description ABI,
- a type description ABI,
- an event and type attribute ABI, which allows associating key-value
  tuples with events and types,
- an ABI defining how applications provide arguments to instrumentation
  calls.

The combination of the type description and type argument ABIs is later
referred to as the SIDE type system.

The ABI exposed to kernel and user-space tracers allows them to list
and connect to the instrumentation, and conditionally enables
instrumentation when at least one tracer is using it.
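
One way to picture the conditional enabling described above is a
per-event counter of attached tracers that the call site tests before
doing any work. The sketch below assumes C11 atomics; the sketch_* names
and the chosen memory orderings are illustrative assumptions, not the
libside implementation. It covers only the enabled/disabled test, not
the synchronization of tracer callback lifetime.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical per-event state: counts the tracers using the event. */
struct sketch_event_state {
	atomic_uint nr_tracers;
};

static void sketch_event_init(struct sketch_event_state *ev)
{
	atomic_init(&ev->nr_tracers, 0);
}

static void sketch_tracer_attach(struct sketch_event_state *ev)
{
	atomic_fetch_add_explicit(&ev->nr_tracers, 1, memory_order_release);
}

static void sketch_tracer_detach(struct sketch_event_state *ev)
{
	unsigned int old;

	old = atomic_fetch_sub_explicit(&ev->nr_tracers, 1,
			memory_order_release);
	assert(old > 0);	/* detach without a matching attach */
	(void) old;		/* unused when NDEBUG is defined */
}

static bool sketch_event_enabled(struct sketch_event_state *ev)
{
	return atomic_load_explicit(&ev->nr_tracers,
			memory_order_relaxed) != 0;
}

/* Call site: returns true if the (stand-in) tracer call was made. */
static bool sketch_emit(struct sketch_event_state *ev,
		unsigned int *nr_emitted)
{
	if (!sketch_event_enabled(ev))
		return false;	/* disabled: skip argument setup */
	(*nr_emitted)++;	/* stand-in for calling the tracers */
	return true;
}
```

With no tracer attached, the call site in this sketch costs one load and
one branch; the argument array would only be built past that test.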

The type description and type argument ABIs include support for
statically known types and dynamic types. Nested structures, arrays, and
variable-length arrays are supported.

The libside C API is a reference implementation of the SIDE ABI for
instrumentation of C/C++ applications by the Linux kernel through the
User Events ABI and by user-space tracers following the default calling
convention (System V ELF ABI on Linux, MS ABI on Windows).

A set of macros is provided with the libside C API for convenience of
C/C++ application instrumentation.


* Genesis

The SIDE ABI and libside library learn from user feedback about
experience with LTTng-UST and Linux kernel tracepoints, and therefore
introduce significant changes (and vast simplifications) to the way
instrumentation is done compared to LTTng-UST and Linux kernel
tracepoints.

- Linux kernel User Events ABI
  - Exposes a stable ABI allowing applications to register their event
    names/field types to the kernel,
  - Can be expected to have a large effect on application instrumentation,
  - My concerns:
    - Should be co-designed with a userspace instrumentation API/ABI
      rather than only focusing on the kernel ABI,
    - Should allow purely userspace tracers to use the same
      instrumentation as userspace tracers implemented within the Linux
      kernel,
    - Tracers can target their specific use-cases, but infrastructure
      should be shared,
    - Limit fragmentation of the instrumentation ecosystem.

- Improvements over tracepoints:
  - Improve compiler error reporting vs tracepoints,
  - API uses standard header inclusion practices,
  - Share ABI across runtimes (no need to reimplement tracepoints for
    each language, or to use string-only payloads).

- Improvements over SDT: allow expressing additional event semantics
  (e.g. user attributes, versioning, nested and compound data types)
  - libside has less impact on control flow when disabled (no stack setup),
  - The SDT ABI is focused on architecture calling conventions; the
    libside ABI is easier to use from runtime environments which have an
    ABI different from the native architecture (Go, Rust, Python, Java).
    The libside instrumentation ABI calls a small fixed set of functions.

- Comparison with ETW
  - similar to libside in terms of array of arguments,
  - does not support pre-registration of events (static typing),
  - type information received at runtime from the instrumentation
    callsite.


* Desiderata

- Common instrumentation for kernel and purely userspace tracers,
  - Instrumentation is self-described,
  - Support compound and nested types,
  - Support pre-registration of events,
  - Do not rely on compiled event-specific code,
  - Independent from ELF,
  - Simple ABI for instrumented code, kernel, and user-space tracers,
  - Support concurrent tracers,
  - Expose API to allow dynamic instrumentation libraries to register
    their events/payloads.

- Support statically typed instrumentation

- Support dynamically typed instrumentation
  - Natively cover dynamically-typed languages
  - The support for events with dynamic fields allows lessening the number
    of statically declared events in situations where an application
    possesses seldom-used events with a large variety of parameter types.
  - The support for mixed static and dynamic event fields allows
    implementation of post-processing string formatting along with a
    variadic payload, while keeping trace data in a structured format.

- Performance considerations for userspace tracers.
  - Maintain performance characteristics comparable to existing
    userspace tracers.
  - Low overhead, good scalability when used by userspace tracers.

- Allows tracing user-space through a kernel tracer. Even though it is
  an approach that adds more overhead, it has the benefit of not
  requiring agent threads to be deployed into applications, which is
  useful to trace locked-down processes.
122
123- Instrumentation registration APIs
124 - Instrumentation can be generated at runtime
125 - dynamic patching,
126 - JIT
127 - Instrumentation can be declared statically (static instrumentation)
128 - Instrumentation can be enabled dynamically.
129 - Very low overhead when not in use.
130
131- libside must be extensible in the future.
132 - Extension scheme should allow adding new types in the future without
133 requiring complex logic to future-proof tracers.
134 - Exposed types are invariant,
135 - libside ABI and API can be extended by adding new types.
136
137- the side ABI should allow multiple instances and versions within
138 a process (e.g. libside for C/C++, Java side ABI, Python side ABI...).
139
140- Both event description and payload are data (no generated text).
141 - It allows tracers to directly interpret the event payload from their
142 description, removing the need for code generation. This lessens the
143 instruction cache pollution compared to code generation approaches.
144 - Tracer interpreter for filtering and field capture can directly use
145 the instrumentation data, without need for setting up a structured
146 argument layout on the stack within the tracer.

- Validation of argument vs event description coherence.

- Passing arguments to events should:
  - conveniently express the application data structures expected as
    instrumentation input,
  - be flexible,
  - be efficient,
  - if these goals cannot all be met at once, specialize types for each
    purpose.

- Allow tracers to passively collect application state transitions.

- Allow tracers to actively sample the current state of an application.

- Error messages generated when misusing the API should be easy to
  comprehend and resolve.

- Allow expressing additional custom semantics augmenting events and
  types.
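
Several of the points above (self-described instrumentation, compound
and nested types, description and payload both expressed as data) can be
pictured with a small interpreter that walks a type description
alongside the matching argument values. The sketch below is only an
illustration under stated assumptions: the sk_* names, the value layout,
and the text rendering are hypothetical and not part of libside.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdarg.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical names throughout: this is not the libside API. */
enum sk_kind { SK_U32, SK_STRING, SK_STRUCT, SK_ARRAY };

struct sk_type;

struct sk_field {
	const char *name;
	const struct sk_type *type;
};

/* Type description: plain data, possibly nested. */
struct sk_type {
	enum sk_kind kind;
	const struct sk_field *fields;	/* SK_STRUCT only */
	size_t nr_fields;		/* SK_STRUCT only */
	const struct sk_type *elem;	/* SK_ARRAY only */
};

/* Argument value matching a type description. */
struct sk_value {
	enum sk_kind kind;
	union {
		uint32_t u32;
		const char *string;
		struct {
			const struct sk_value *items;
			size_t len;
		} seq;			/* struct fields or array elements */
	} u;
};

struct sk_out {
	char *buf;
	size_t len;
	size_t pos;
};

/* Append formatted text; output past the buffer end is dropped. */
static void sk_append(struct sk_out *out, const char *fmt, ...)
{
	va_list ap;
	int ret;

	if (out->pos >= out->len)
		return;
	va_start(ap, fmt);
	ret = vsnprintf(out->buf + out->pos, out->len - out->pos, fmt, ap);
	va_end(ap);
	if (ret > 0)
		out->pos += (size_t) ret;
}

/* Render a value by interpreting its description. Returns 0 or -1. */
static int sk_render(const struct sk_type *t, const struct sk_value *v,
		struct sk_out *out)
{
	size_t i;

	assert(t && v && out);
	if (t->kind != v->kind)
		return -1;	/* value does not match description */
	switch (t->kind) {
	case SK_U32:
		sk_append(out, "%" PRIu32, v->u.u32);
		return 0;
	case SK_STRING:
		sk_append(out, "\"%s\"", v->u.string);
		return 0;
	case SK_STRUCT:
		if (v->u.seq.len != t->nr_fields)
			return -1;
		sk_append(out, "{ ");
		for (i = 0; i < t->nr_fields; i++) {
			sk_append(out, "%s%s: ", i ? ", " : "", t->fields[i].name);
			if (sk_render(t->fields[i].type, &v->u.seq.items[i], out))
				return -1;
		}
		sk_append(out, " }");
		return 0;
	case SK_ARRAY:
		sk_append(out, "[ ");
		for (i = 0; i < v->u.seq.len; i++) {
			sk_append(out, "%s", i ? ", " : "");
			if (sk_render(t->elem, &v->u.seq.items[i], out))
				return -1;
		}
		sk_append(out, " ]");
		return 0;
	}
	return -1;
}
```

For a structure with a string field "name" holding "a" and an array
field "sizes" holding 1 and 2, this sketch renders
{ name: "a", sizes: [ 1, 2 ] }. The same walk could feed a filter or a
field capture instead of a text buffer.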


* Design / Architecture

- Compiler error messages are easy to understand because the API is a
  simple header file without any repeated inclusion tricks.

- Variadic events.

- Instrumentation API/ABI:
  - Type system,
    - Type visitor callbacks
      - (perfetto)
    - Stack-copy types
    - Data-gathering types
    - Dynamic types.
  - Helper macros for C/C++,
  - Express instrumentation description as data,
  - Instrumentation arguments are passed on the stack as a data array
    (similar to iovec) along with a reference to instrumentation
    description,
  - Instrumentation is conditionally enabled when at least one tracer is
    registered to it.

- Tracer-agnostic API/ABI:
  - Available events notifications,
  - Conditionally enabling instrumentation,
  - Synchronize registered user-space tracer callbacks with RCU,
  - Co-designed to interact with User Events.

- Application state dump
  - How are applications/libraries meant to provide state information?
  - How are tracers meant to interact with state dump?
    - statedump mode polling
    - statedump mode agent thread

- RCU to synchronize userspace tracers registration vs invocation

- How are tracers meant to interact with libside?

- How is C/C++ language instrumentation meant to be used?

- How are dynamic instrumentation facilities meant to interact with
  libside?

- How is a kernel tracer meant to interact with libside?

- How is gdb (ptrace) meant to interact with libside?

- Validation that instrumentation arguments match event description
  fields cannot be done by the compiler; it requires either:
  - a run-time check,
  - a static checker (only for static instrumentation).

- Event attributes.

- Type attributes.
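
The run-time check option listed above (validating that instrumentation
arguments match the event description fields) could look like the
following minimal sketch. It only compares the field count and a flat
type tag per field; the chk_* names are hypothetical, and nested or
dynamic types are deliberately left out.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical names throughout: this is not the libside API. */
enum chk_type { CHK_U32, CHK_S64, CHK_STRING };

struct chk_field {
	const char *name;
	enum chk_type type;
};

struct chk_event {
	const char *name;
	const struct chk_field *fields;
	size_t nr_fields;
};

struct chk_arg {
	enum chk_type type;
};

enum chk_result { CHK_OK, CHK_BAD_COUNT, CHK_BAD_TYPE };

/*
 * Compare the argument array against the event description.
 * On CHK_BAD_TYPE, *bad_index is set to the first mismatching field.
 */
static enum chk_result chk_validate(const struct chk_event *ev,
		const struct chk_arg *args, size_t nr_args, size_t *bad_index)
{
	size_t i;

	assert(ev != NULL && bad_index != NULL);
	if (nr_args != ev->nr_fields)
		return CHK_BAD_COUNT;
	for (i = 0; i < nr_args; i++) {
		if (args[i].type != ev->fields[i].type) {
			*bad_index = i;
			return CHK_BAD_TYPE;
		}
	}
	return CHK_OK;
}
```

Such a check costs one comparison per field, so it may be better suited
to a debug or validation mode than to the tracing fast path.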