From: Mathieu Desnoyers
Date: Fri, 5 Jun 2020 18:33:37 +0000 (-0400)
Subject: Update man page based on Michael Kerrisk's comments
X-Git-Url: http://drtracing.org/?a=commitdiff_plain;h=da5633b496c50a2c95b70a8d75ad21f261c79707;p=librseq.git

Update man page based on Michael Kerrisk's comments

Signed-off-by: Mathieu Desnoyers
---

diff --git a/doc/man/rseq.2 b/doc/man/rseq.2
index 0ac1a28..edec132 100644
--- a/doc/man/rseq.2
+++ b/doc/man/rseq.2
@@ -32,6 +32,32 @@ rseq \- Restartable sequences and cpu number cache
 .BI "int rseq(struct rseq * " rseq ", uint32_t " rseq_len ", int " flags ", uint32_t " sig ");
 .sp
 .SH DESCRIPTION
+
+A restartable sequence is a sequence of instructions guaranteed to be executed
+atomically with respect to other threads and signal handlers on the current
+CPU. If its execution does not complete atomically, the kernel changes the
+execution flow by jumping to an abort handler defined by user-space for that
+restartable sequence.
+
+Using restartable sequences requires to register a
+.BR __rseq_abi
+thread-local storage data structure (struct rseq) through the
+.BR rseq ()
+system call. Only one
+.BR __rseq_abi
+can be registered per thread, so user-space libraries and applications must
+follow a user-space ABI defining how to share this resource. The ABI defining
+how to share this resource between applications and libraries is defined by the
+C library.
+
+The
+.BR __rseq_abi
+contains a
+.I rseq_cs
+field which points to the currently executing critical section. For each
+thread, a single rseq critical section can run at any given point. Each
+critical section need to be implemented in assembly.
+
 The
 .BR rseq ()
 ABI accelerates user-space operations on per-cpu data by defining a
@@ -41,7 +67,10 @@ It allows user-space to perform update operations on per-cpu data
 without requiring heavy-weight atomic operations.
 
 The term CPU used in this documentation refers to a hardware execution
-context.
+context. For instance, each CPU number returned by
+.BR sched_getcpu ()
+is a CPU. The current CPU means to the CPU on which the registered thread is
+running.
 
 Restartable sequences are atomic with respect to preemption (making it
 atomic with respect to other threads running on the same CPU), as well
@@ -49,11 +78,11 @@ as signal delivery (user-space execution contexts nested over the same
 thread). They either complete atomically with respect to preemption on
 the current CPU and signal delivery, or they are aborted.
 
-It is suited for update operations on per-cpu data.
+Restartable sequences are suited for update operations on per-cpu data.
 
-It can be used on data structures shared between threads within a
-process, and on data structures shared between threads across different
-processes.
+Restartable sequences can be used on data structures shared between threads
+within a process, and on data structures shared between threads across
+different processes.
 
 .PP
 Some examples of operations that can be accelerated or improved
@@ -93,18 +122,31 @@ is as follows:
 This structure is aligned on 32-byte boundary.
 .TP
 .B Structure size
-This structure is extensible. Its size is passed as parameter to the
+This structure is fixed-size (32 bytes). Its size is passed as parameter to the
 rseq system call.
+.PP
+.in +8n
+.EX
+struct rseq {
+    __u32 cpu_id_start;
+    __u32 cpu_id;
+    union {
+        /* Edited out for conciseness. [...] */
+    } rseq_cs;
+    __u32 flags;
+} __attribute__((aligned(32)));
+.EE
 .TP
 .B Fields
 .TP
 .in +4n
 .I cpu_id_start
-Optimistic cache of the CPU number on which the current thread is
+Optimistic cache of the CPU number on which the registered thread is
 running. Its value is guaranteed to always be a possible CPU number,
-even when rseq is not initialized. The value it contains should always
-be confirmed by reading the cpu_id field.
+even when rseq is not registered. Its value should always be confirmed by
+reading the cpu_id field before user-space performs any side-effect (e.g.
+storing to memory).
 
 This field is an optimistic cache in the sense that it is always
 guaranteed to hold a valid CPU number in the range [ 0 ..
@@ -113,6 +155,9 @@ used as an offset in per-cpu data structures without having to check
 whether its value is within the valid bounds compared to the number of
 possible CPUs in the system.
 
+Initialized by user-space to a possible CPU number (e.g., 0), updated
+by the kernel for threads registered with rseq.
+
 For user-space applications executed on a kernel without rseq support,
 the cpu_id_start field stays initialized at 0, which is indeed a valid
 CPU number. It is therefore valid to use it as an offset in per-cpu data
@@ -121,36 +166,53 @@ number by comparing it with the cpu_id field within the rseq critical
 section. If the kernel does not provide rseq support, that cpu_id field
 stays initialized at -1, so the comparison always fails, as intended.
 
-It is then up to user-space to use a fall-back mechanism, considering
-that rseq is not available.
+It is up to user-space to implement a fall-back mechanism for scenarios where
+rseq is not available.
 .in
 .TP
 .in +4n
 .I cpu_id
-Cache of the CPU number on which the current thread is running.
--1 if uninitialized.
+Cache of the CPU number on which the registered thread is running. Initialized
+by user-space to -1, updated by the kernel for threads registered with rseq.
 .in
 .TP
 .in +4n
 .I rseq_cs
 The rseq_cs field is a pointer to a struct rseq_cs. Is is NULL when no
-rseq assembly block critical section is active for the current thread.
+rseq assembly block critical section is active for the registered thread.
 Setting it to point to a critical section descriptor (struct rseq_cs)
 marks the beginning of the critical section.
+
+Initialized by user-space to NULL.
+
+Updated by user-space, which sets the address of the currently
+active rseq_cs at the beginning of assembly instruction sequence
+block, and set to NULL by the kernel when it restarts an assembly
+instruction sequence block, as well as when the kernel detects that
+it is preempting or delivering a signal outside of the range
+targeted by the rseq_cs. Also needs to be set to NULL by user-space
+before reclaiming memory that contains the targeted struct rseq_cs.
+
+Read and set by the kernel.
 .in
 .TP
 .in +4n
 .I flags
-Flags indicating the restart behavior for the current thread. This is
-mainly used for debugging purposes. Can be either:
+Flags indicating the restart behavior for the registered thread. This is
+mainly used for debugging purposes. Can be a combination of:
 .IP \[bu]
-RSEQ_CS_FLAG_NO_RESTART_ON_PREEMPT
+RSEQ_CS_FLAG_NO_RESTART_ON_PREEMPT: Inhibit instruction sequence block restart
+on preemption for this thread.
 .IP \[bu]
-RSEQ_CS_FLAG_NO_RESTART_ON_SIGNAL
+RSEQ_CS_FLAG_NO_RESTART_ON_SIGNAL: Inhibit instruction sequence block restart
+on signal delivery for this thread.
 .IP \[bu]
-RSEQ_CS_FLAG_NO_RESTART_ON_MIGRATE
+RSEQ_CS_FLAG_NO_RESTART_ON_MIGRATE: Inhibit instruction sequence block restart
+on migration for this thread.
 .in
+Initialized by user-space, used by the kernel.
+
 .PP
 The layout of
 .B struct rseq_cs
@@ -161,25 +223,39 @@ This structure is aligned on 32-byte boundary.
 .TP
 .B Structure size
 This structure has a fixed size of 32 bytes.
+.PP
+.in +8n
+.EX
+struct rseq_cs {
+    __u32 version;
+    __u32 flags;
+    __u64 start_ip;
+    __u64 post_commit_offset;
+    __u64 abort_ip;
+} __attribute__((aligned(32)));
+.EE
 .TP
 .B Fields
 .TP
 .in +4n
 .I version
-Version of this structure.
+Version of this structure. Should be initialized to 0.
 .in
 .TP
 .in +4n
 .I flags
-Flags indicating the restart behavior of this structure. Can be
-a combination of:
+Flags indicating the restart behavior of this structure. Can be a combination
+of:
 .IP \[bu]
-RSEQ_CS_FLAG_NO_RESTART_ON_PREEMPT
+RSEQ_CS_FLAG_NO_RESTART_ON_PREEMPT: Inhibit instruction sequence block restart
+on preemption for this critical section.
 .IP \[bu]
-RSEQ_CS_FLAG_NO_RESTART_ON_SIGNAL
+RSEQ_CS_FLAG_NO_RESTART_ON_SIGNAL: Inhibit instruction sequence block restart
+on signal delivery for this critical section.
 .IP \[bu]
-RSEQ_CS_FLAG_NO_RESTART_ON_MIGRATE
+RSEQ_CS_FLAG_NO_RESTART_ON_MIGRATE: Inhibit instruction sequence block restart
+on migration for this critical section.
 .TP
 .in +4n
 .I start_ip
@@ -234,20 +310,17 @@ structure.
 No more than one rseq structure address can be registered per thread at
 a given time.
 .PP
-Memory of a registered rseq object must not be freed before the thread
-exits. Reclaim of rseq object's memory must only be done after either an
-explicit rseq unregistration is performed or after the thread exits. Keep
-in mind that the implementation of the Thread-Local Storage (C language
-__thread) lifetime does not guarantee existence of the TLS area up until
-the thread exits.
+Reclaim of rseq object's memory must only be done after either an
+explicit rseq unregistration is performed or after the thread exits.
 .PP
 In a typical usage scenario, the thread registering the rseq
 structure will be performing loads and stores from/to that structure. It
 is however also allowed to read that structure from other threads.
 The rseq field updates performed by the kernel provide relaxed atomicity
-semantics, which guarantee that other threads performing relaxed atomic
-reads of the cpu number cache will always observe a consistent value.
+semantics (atomic store, without memory ordering), which guarantee that other
+threads performing relaxed atomic reads (atomic load, without memory ordering)
+of the cpu number cache will always observe a consistent value.
 .SH RETURN VALUE
 A return value of 0 indicates success. On error, \-1 is returned, and
@@ -263,7 +336,7 @@ contains an invalid value, or
 .I rseq
 contains an address which is not appropriately aligned, or
 .I rseq_len
-contains a size that does not match the size received on registration.
+contains an incorrect size.
 .TP
 .B ENOSYS
 The