Early context tracking patch set: fixing perf & ftrace losing events

Some time ago, while using perf to check the automaton model, I noticed that perf was losing events. The same was reproducible with ftrace.

Steve pointed to a problem in how the recursion control identifies the execution context.

Currently, the recursion control uses the preempt_counter to identify the current context. The NMI, hard IRQ, and soft IRQ counters are set in the preempt_counter by the irq_enter()/irq_exit() functions.

In a trace, they are set like this:

 0)   ==========> |
 0)               |  do_IRQ() {		/* First C function */
 0)               |    irq_enter() {
 0)               |      		/* set the IRQ context. */
 0)   1.081 us    |    }
 0)               |    handle_irq() {
 0)               |     		/* IRQ handling code */
 0) + 10.290 us   |    }
 0)               |    irq_exit() {
 0)               |      		/* unset the IRQ context. */
 0)   6.657 us    |    }
 0) + 18.995 us   |  }
 0)   <========== |

As one can see, functions (and events) that run before the context is set in the preempt_counter, or after it is unset, are attributed to the wrong context: they still look like part of the interrupted context. The recursion control then misinterprets them as recursion, and the events are dropped.
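To make the failure mode concrete, here is a minimal user-space sketch of the idea. It is not the kernel code: the names (struct cpu_state, trace_lock(), record_event(), and so on) are hypothetical, and the preempt counter is reduced to three flags. It assumes the recursion control keeps one bit per context and drops any event that arrives while the bit for the detected context is already set.

```c
#include <assert.h>
#include <stdbool.h>

/* Execution contexts, one recursion bit each (hypothetical names). */
enum ctx { CTX_NORMAL, CTX_SOFTIRQ, CTX_IRQ, CTX_NMI };

/* Simplified stand-in for the per-CPU state used by the recursion control. */
struct cpu_state {
	bool in_softirq, in_irq, in_nmi; /* models the context fields of the counter */
	unsigned int recursion_bits;     /* one bit per context that is tracing */
	unsigned int recorded;           /* events that were written */
	unsigned int dropped;            /* events rejected as "recursion" */
};

/* Identify the context the way the counter-based check would. */
static enum ctx current_ctx(const struct cpu_state *cpu)
{
	if (cpu->in_nmi)
		return CTX_NMI;
	if (cpu->in_irq)
		return CTX_IRQ;
	if (cpu->in_softirq)
		return CTX_SOFTIRQ;
	return CTX_NORMAL;
}

/* Take the recursion bit for the detected context, or report recursion. */
static bool trace_lock(struct cpu_state *cpu, enum ctx *ctx)
{
	*ctx = current_ctx(cpu);
	if (cpu->recursion_bits & (1u << *ctx)) {
		cpu->dropped++;
		return false;
	}
	cpu->recursion_bits |= 1u << *ctx;
	return true;
}

static void trace_unlock(struct cpu_state *cpu, enum ctx ctx)
{
	assert(cpu->recursion_bits & (1u << ctx));
	cpu->recursion_bits &= ~(1u << ctx);
}

/* Record one event if the recursion control allows it. */
static bool record_event(struct cpu_state *cpu)
{
	enum ctx ctx;

	if (!trace_lock(cpu, &ctx))
		return false;
	cpu->recorded++;
	trace_unlock(cpu, ctx);
	return true;
}

/* An IRQ arrives while the task is in the middle of recording an event. */
static void irq_arrives_while_tracing(struct cpu_state *cpu)
{
	enum ctx task_ctx;
	bool locked = trace_lock(cpu, &task_ctx);

	assert(locked);
	cpu->recorded++;        /* the task's own event */

	record_event(cpu);      /* do_IRQ() entry, before irq_enter(): dropped */
	cpu->in_irq = true;     /* irq_enter() sets the IRQ context */
	record_event(cpu);      /* handle_irq(): recorded */
	cpu->in_irq = false;    /* irq_exit() unsets the IRQ context */
	record_event(cpu);      /* after irq_exit(): dropped */

	trace_unlock(cpu, task_ctx);
}
```

Calling irq_arrives_while_tracing() on a zero-initialized struct cpu_state leaves two events recorded and two dropped: the two events that ran outside the irq_enter()/irq_exit() window were mistaken for recursion in the task context.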

To resolve this problem, the IRQ/NMI context needs to be set before the first C function executes, and unset only after it returns. By doing so, and by using this information to identify the context in the trace recursion protection, no more events are lost.

A possible solution is to use a per-CPU variable that is set at the NMI/IRQ entry point, before calling the C handler, and unset after the handler returns.
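As a rough illustration of that idea, here is a user-space sketch, not the patch itself. All names (struct early_cpu, early_irq_entry(), and so on) are hypothetical, a plain struct field stands in for the per-CPU variable, and a C function stands in for the low-level entry code. Saving and restoring the previous value, rather than simply clearing it, is my assumption to keep nested contexts consistent.

```c
#include <assert.h>
#include <stdbool.h>

/* Execution contexts, one recursion bit each (hypothetical names). */
enum early_ctx { EARLY_NORMAL, EARLY_SOFTIRQ, EARLY_IRQ, EARLY_NMI };

struct early_cpu {
	enum early_ctx ctx;          /* stand-in for the per-CPU context variable */
	unsigned int recursion_bits; /* one bit per context that is tracing */
	unsigned int recorded;       /* events that were written */
	unsigned int dropped;        /* events rejected as "recursion" */
};

/* Record one event, using the early context to pick the recursion bit. */
static bool early_record_event(struct early_cpu *cpu)
{
	unsigned int bit = 1u << cpu->ctx;

	if (cpu->recursion_bits & bit) {
		cpu->dropped++;
		return false;
	}
	cpu->recursion_bits |= bit;
	cpu->recorded++;
	cpu->recursion_bits &= ~bit;
	return true;
}

/* Models the first C function: every event in it is already in IRQ context. */
static void early_do_irq(struct early_cpu *cpu)
{
	early_record_event(cpu); /* do_IRQ() entry, before irq_enter() */
	early_record_event(cpu); /* handle_irq() */
	early_record_event(cpu); /* after irq_exit() */
}

/* Models the entry point: set the context before the C handler, unset after. */
static void early_irq_entry(struct early_cpu *cpu)
{
	enum early_ctx saved = cpu->ctx;

	cpu->ctx = EARLY_IRQ;
	early_do_irq(cpu);
	cpu->ctx = saved;
}

/* An IRQ arrives while the task is in the middle of recording an event. */
static void early_irq_while_tracing(struct early_cpu *cpu)
{
	unsigned int task_bit = 1u << cpu->ctx;

	assert(!(cpu->recursion_bits & task_bit));
	cpu->recursion_bits |= task_bit;
	cpu->recorded++;             /* the task's own event */

	early_irq_entry(cpu);        /* the interrupt fires here */

	cpu->recursion_bits &= ~task_bit;
}
```

Calling early_irq_while_tracing() on a zero-initialized struct early_cpu records all four events and drops none, because every function called from the C handler already sees the IRQ context.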

This possible solution is presented in this patch series as a proof of concept, for x86_64. Let’s see what kind of comments we will receive!