Using RCU to Protect Dynamic NMI Handlers


Although RCU is usually used to protect read-mostly data structures,
it is possible to use RCU to provide dynamic non-maskable interrupt
handlers, as well as dynamic irq handlers.  This document describes
how to do this, drawing loosely from Zwane Mwaikambo's NMI-timer
work in "arch/x86/oprofile/nmi_timer_int.c" and in
"arch/x86/kernel/traps.c".

The relevant pieces of code are listed below, each followed by a
brief explanation.

	static int dummy_nmi_callback(struct pt_regs *regs, int cpu)
	{
		return 0;
	}

The dummy_nmi_callback() function is a "dummy" NMI handler that does
nothing, but returns zero, thus saying that it did nothing, allowing
the NMI handler to take the default machine-specific action.

	static nmi_callback_t nmi_callback = dummy_nmi_callback;

This nmi_callback variable is a global function pointer to the current
NMI handler.

	void do_nmi(struct pt_regs * regs, long error_code)
	{
		int cpu;

		nmi_enter();

		cpu = smp_processor_id();
		++nmi_count(cpu);

		if (!rcu_dereference_sched(nmi_callback)(regs, cpu))
			default_do_nmi(regs);

		nmi_exit();
	}

The do_nmi() function processes each NMI.  It first disables preemption
in the same way that a hardware irq would, then increments the per-CPU
count of NMIs.  It then invokes the NMI handler stored in the nmi_callback
function pointer.  If this handler returns zero, do_nmi() invokes the
default_do_nmi() function to handle a machine-specific NMI.  Finally,
preemption is restored.
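
The dispatch logic above can be tried outside the kernel.  What follows
is only a user-space sketch, not the kernel code: struct fake_regs,
do_fake_nmi(), claiming_callback(), default_actions, and dispatch_demo()
are hypothetical names made up for illustration, and a C11 acquire load
stands in for rcu_dereference_sched().  There is no nmi_enter() or
nmi_exit() equivalent here, so it shows only the "zero means take the
default action" convention.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical stand-in for the kernel's struct pt_regs. */
struct fake_regs {
	unsigned long ip;
};

typedef int (*nmi_callback_t)(struct fake_regs *regs, int cpu);

/* Counts how often the default (machine-specific) path was taken. */
static int default_actions;

/* Does nothing and returns zero, so the caller takes the default path. */
static int dummy_nmi_callback(struct fake_regs *regs, int cpu)
{
	(void)regs;
	(void)cpu;
	return 0;
}

/* Hypothetical handler that claims every NMI by returning nonzero. */
static int claiming_callback(struct fake_regs *regs, int cpu)
{
	(void)regs;
	(void)cpu;
	return 1;
}

static _Atomic(nmi_callback_t) nmi_callback = dummy_nmi_callback;

/* User-space analogue of do_nmi(): load the pointer once, then call it. */
static void do_fake_nmi(struct fake_regs *regs, int cpu)
{
	nmi_callback_t cb;

	cb = atomic_load_explicit(&nmi_callback, memory_order_acquire);
	if (!cb(regs, cpu))
		default_actions++;	/* stands in for default_do_nmi() */
}

/* Returns the number of times the default path ran. */
int dispatch_demo(void)
{
	struct fake_regs regs = { 0 };

	do_fake_nmi(&regs, 0);		/* dummy installed: default path runs */
	assert(default_actions == 1);

	atomic_store_explicit(&nmi_callback, claiming_callback,
			      memory_order_release);
	do_fake_nmi(&regs, 0);		/* handler claims it: default skipped */
	return default_actions;
}
```

Loading the pointer into a local before calling through it mirrors the
single rcu_dereference_sched() in do_nmi(): the handler that is tested
and the handler that is called are guaranteed to be the same one.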

Strictly speaking, rcu_dereference_sched() is not needed here, since
this code runs only on i386, which does not reorder dependent loads.
In practice, however, it is a good documentation aid, particularly
for anyone attempting to do something similar on Alpha or on systems
with aggressive optimizing compilers.

Quick Quiz:  Why might the rcu_dereference_sched() be necessary on Alpha,
	     given that the code referenced by the pointer is read-only?


Back to the discussion of NMI and RCU...

	void set_nmi_callback(nmi_callback_t callback)
	{
		rcu_assign_pointer(nmi_callback, callback);
	}

The set_nmi_callback() function registers an NMI handler.  Note that any
data that is to be used by the callback must be initialized -before-
the call to set_nmi_callback().  On architectures that do not order
writes, the rcu_assign_pointer() ensures that the NMI handler sees the
initialized values.
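
The initialize-then-publish rule can be sketched in user space with C11
atomics.  This is an illustration under assumptions, not the kernel
implementation: a release store plays the role of rcu_assign_pointer()
and an acquire load plays the role of rcu_dereference_sched() (acquire
is stronger than the dependency ordering the kernel primitive relies
on).  The names handler_t, my_nmi_data, threshold_handler(), dispatch(),
and publish_demo() are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>

typedef int (*handler_t)(int cpu);

/* Hypothetical handler state; plain, non-atomic data. */
static struct {
	int threshold;
} my_nmi_data;

static int dummy_handler(int cpu)
{
	(void)cpu;
	return 0;
}

/* Reads data that must have been initialized before publication. */
static int threshold_handler(int cpu)
{
	return cpu < my_nmi_data.threshold;
}

static _Atomic(handler_t) current_handler = dummy_handler;

/* Analogue of set_nmi_callback(): the release store orders all earlier
 * writes (the data initialization) before the pointer becomes visible. */
static void set_handler(handler_t h)
{
	atomic_store_explicit(&current_handler, h, memory_order_release);
}

/* Analogue of the read side: the acquire load pairs with the release
 * store, so the handler cannot observe pre-initialization data. */
static int dispatch(int cpu)
{
	handler_t h;

	h = atomic_load_explicit(&current_handler, memory_order_acquire);
	return h(cpu);
}

int publish_demo(void)
{
	my_nmi_data.threshold = 4;	/* 1. initialize the data ...       */
	set_handler(threshold_handler);	/* 2. ... and only then publish it. */

	assert(dispatch(2) == 1);	/* 2 < 4 */
	return dispatch(7);		/* 7 < 4 is false */
}
```

Swapping steps 1 and 2 in publish_demo() would reintroduce the bug this
paragraph warns about: a concurrent reader could run threshold_handler()
against a threshold that has not yet been written.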

	void unset_nmi_callback(void)
	{
		rcu_assign_pointer(nmi_callback, dummy_nmi_callback);
	}

This function unregisters an NMI handler, restoring the original
dummy_nmi_callback().  However, there may well be an NMI handler
currently executing on some other CPU.  We therefore cannot free
up any data structures used by the old NMI handler until execution
of it completes on all other CPUs.

One way to accomplish this is via synchronize_sched(), perhaps as
follows:

	unset_nmi_callback();
	synchronize_sched();
	kfree(my_nmi_data);

This works because synchronize_sched() blocks until all CPUs complete
any preemption-disabled segments of code that they were executing.
Since NMI handlers disable preemption, synchronize_sched() is guaranteed
not to return until all ongoing NMI handlers exit.  It is therefore safe
to free up the handler's data as soon as synchronize_sched() returns.

Important note: for this to work, the architecture in question must
invoke nmi_enter() and nmi_exit() on NMI entry and exit, respectively.
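
The unregister, wait, then free sequence can be mimicked in user space.
The sketch below is a deliberately crude model under stated assumptions:
a single in-flight counter stands in for the bookkeeping done by
nmi_enter() and nmi_exit(), and a spin loop named wait_for_handlers()
stands in for synchronize_sched().  Real RCU does not work this way
internally, and every name here (fake_nmi(), handlers_in_flight,
counting_handler(), teardown_demo()) is hypothetical.  The demo itself
is single-threaded, so it shows the ordering of the steps rather than
a real race.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdlib.h>

typedef int (*handler_t)(int cpu);

struct my_nmi_data {
	int hits;
};
static struct my_nmi_data *my_nmi_data;	/* hypothetical handler state */

static int dummy_handler(int cpu)
{
	(void)cpu;
	return 0;
}

static int counting_handler(int cpu)
{
	(void)cpu;
	my_nmi_data->hits++;	/* would be a use-after-free if freed early */
	return 1;
}

static _Atomic(handler_t) current_handler = dummy_handler;
static atomic_int handlers_in_flight;	/* models nmi_enter()/nmi_exit() */

static int fake_nmi(int cpu)
{
	handler_t h;
	int handled;

	atomic_fetch_add(&handlers_in_flight, 1);	/* like nmi_enter() */
	h = atomic_load(&current_handler);
	handled = h(cpu);
	atomic_fetch_sub(&handlers_in_flight, 1);	/* like nmi_exit() */
	return handled;
}

/* Crude stand-in for synchronize_sched(): wait until no handler that
 * might still hold the old pointer is running. */
static void wait_for_handlers(void)
{
	while (atomic_load(&handlers_in_flight) != 0)
		;	/* spin */
}

int teardown_demo(void)
{
	int handled;

	my_nmi_data = calloc(1, sizeof(*my_nmi_data));
	if (!my_nmi_data)
		return -1;
	atomic_store(&current_handler, counting_handler);
	handled = fake_nmi(0);				/* handler runs: 1 */

	atomic_store(&current_handler, dummy_handler);	/* unset */
	wait_for_handlers();				/* wait        */
	free(my_nmi_data);				/* then free   */
	my_nmi_data = NULL;

	return handled + fake_nmi(0);	/* dummy now installed: adds 0 */
}
```

If fake_nmi() skipped the counter updates, wait_for_handlers() would
have nothing to wait on and the free could race with a running handler,
which is the user-space analogue of an architecture that fails to call
nmi_enter() and nmi_exit().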


Answer to Quick Quiz

	Why might the rcu_dereference_sched() be necessary on Alpha, given
	that the code referenced by the pointer is read-only?

	Answer: The caller of set_nmi_callback() might well have
		initialized some data that is to be used by the new NMI
		handler.  In this case, the rcu_dereference_sched() would
		be needed, because otherwise a CPU that received an NMI
		just after the new handler was set might see the pointer
		to the new NMI handler, but the old, not-yet-initialized
		version of the handler's data.

		This same sad story can happen on other CPUs when using
		a compiler with aggressive pointer-value speculation
		optimizations.

		More important, the rcu_dereference_sched() makes it
		clear to someone reading the code that the pointer is
		being protected by RCU-sched.
	120	being protected by RCU-sched.