deliverable/lttng-modules.git
Mathieu Desnoyers [Tue, 2 Jan 2018 16:07:05 +0000 (11:07 -0500)] 
Update: kvm instrumentation for 3.16.52 and 3.2.97

Starting from 3.16.52 and 3.2.97, the 3.16 and 3.2 stable kernel
branches backport a kvm instrumentation change introduced in 4.15 which
affects the prototype of the kvm_mmio event.
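A minimal sketch of how such a version-range check can be expressed,
assuming the usual (major << 16) + (minor << 8) + patch encoding of kernel
version codes. The helper names below are hypothetical and only cover the
three ranges named in this commit message; they are not the macros used by
lttng-modules.

```c
#include <stdbool.h>

/* Same arithmetic as the kernel's KERNEL_VERSION() encoding. */
#define SKETCH_KERNEL_VERSION(a, b, c) (((a) << 16) + ((b) << 8) + (c))

/* Hypothetical helper: true when low <= code < high. */
static bool sketch_kernel_range(int code, int low, int high)
{
	return code >= low && code < high;
}

/* True for the kernels this commit message says carry the changed event. */
static bool kvm_mmio_has_new_prototype(int code)
{
	return code >= SKETCH_KERNEL_VERSION(4, 15, 0)
		|| sketch_kernel_range(code,
			SKETCH_KERNEL_VERSION(3, 16, 52),
			SKETCH_KERNEL_VERSION(3, 17, 0))
		|| sketch_kernel_range(code,
			SKETCH_KERNEL_VERSION(3, 2, 97),
			SKETCH_KERNEL_VERSION(3, 3, 0));
}
```

Bounding each stable branch with the next minor release keeps a backport
from accidentally matching unrelated kernels in between.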

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Mathieu Desnoyers [Wed, 27 Dec 2017 14:07:30 +0000 (09:07 -0500)] 
Fix: kvm instrumentation for 4.15

Incorrect version range.

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Mathieu Desnoyers [Tue, 26 Dec 2017 14:47:36 +0000 (09:47 -0500)] 
Update sock instrumentation for 4.15

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Mathieu Desnoyers [Tue, 26 Dec 2017 14:47:22 +0000 (09:47 -0500)] 
Update kvm instrumentation for 4.15

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Michael Jeanson [Tue, 19 Dec 2017 20:06:42 +0000 (15:06 -0500)] 
Fix: ACCESS_ONCE() removed in kernel 4.15

The ACCESS_ONCE() macro was removed in kernel 4.15 and should be
replaced by READ_ONCE and WRITE_ONCE which were introduced in kernel
3.19.

This commit replaces all calls to ACCESS_ONCE() with the appropriate
READ_ONCE or WRITE_ONCE and adds compatibility macros for kernels that
lack them.
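As a rough user-space illustration of what such compatibility macros can
look like, the sketch below rebuilds READ_ONCE/WRITE_ONCE on top of an
ACCESS_ONCE-style volatile cast. SKETCH_ACCESS_ONCE and the two small
accessor functions are hypothetical names, __typeof__ assumes GCC or Clang,
and the real kernel macros are more elaborate than this.

```c
/* Old-style accessor: a volatile cast usable as both lvalue and rvalue. */
#define SKETCH_ACCESS_ONCE(x) (*(volatile __typeof__(x) *)&(x))

/* Compatibility definitions for an environment that lacks the new macros. */
#ifndef READ_ONCE
#define READ_ONCE(x) SKETCH_ACCESS_ONCE(x)
#endif
#ifndef WRITE_ONCE
#define WRITE_ONCE(x, val) ((void)(SKETCH_ACCESS_ONCE(x) = (val)))
#endif

static int consumer_ready;

/* Former "ACCESS_ONCE(consumer_ready) = v" becomes a WRITE_ONCE(). */
static void publish_ready(int v)
{
	WRITE_ONCE(consumer_ready, v);
}

/* Former "ACCESS_ONCE(consumer_ready)" read becomes a READ_ONCE(). */
static int poll_ready(void)
{
	return READ_ONCE(consumer_ready);
}
```

Call sites then use only the two new macro names, regardless of which
kernel provides the underlying definition.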

See this upstream commit:

  commit b03a0fe0c5e4b46dcd400d27395b124499554a71
  Author: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
  Date:   Mon Oct 23 14:07:25 2017 -0700

    locking/atomics, mm: Convert ACCESS_ONCE() to READ_ONCE()/WRITE_ONCE()

    For several reasons, it is desirable to use {READ,WRITE}_ONCE() in
    preference to ACCESS_ONCE(), and new code is expected to use one of the
    former. So far, there's been no reason to change most existing uses of
    ACCESS_ONCE(), as these aren't currently harmful.

    However, for some features it is necessary to instrument reads and
    writes separately, which is not possible with ACCESS_ONCE(). This
    distinction is critical to correct operation.

    It's possible to transform the bulk of kernel code using the Coccinelle
    script below. However, this doesn't handle comments, leaving references
    to ACCESS_ONCE() instances which have been removed. As a preparatory
    step, this patch converts the mm code and comments to use
    {READ,WRITE}_ONCE() consistently.

    ----
    virtual patch

    @ depends on patch @
    expression E1, E2;
    @@

    - ACCESS_ONCE(E1) = E2
    + WRITE_ONCE(E1, E2)

    @ depends on patch @
    expression E;
    @@

    - ACCESS_ONCE(E)
    + READ_ONCE(E)
    ----

Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Michael Jeanson [Mon, 18 Dec 2017 19:35:55 +0000 (14:35 -0500)] 
Fix: sched instrumentation on stable RT kernels

Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Michael Jeanson [Wed, 29 Nov 2017 22:03:21 +0000 (17:03 -0500)] 
timer API transition for kernel 4.15

The timer API changes starting from kernel 4.15.0.

There's an interesting LWN article on this subject:

  https://lwn.net/Articles/735887/

Check these upstream commits for more details:

  commit 686fef928bba6be13cabe639f154af7d72b63120
  Author: Kees Cook <keescook@chromium.org>
  Date:   Thu Sep 28 06:38:17 2017 -0700

    timer: Prepare to change timer callback argument type

    Modern kernel callback systems pass the structure associated with a
    given callback to the callback function. The timer callback remains one
    of the legacy cases where an arbitrary unsigned long argument continues
    to be passed as the callback argument. This has several problems:

    - This bloats the timer_list structure with a normally redundant
      .data field.

    - No type checking is being performed, forcing callbacks to do
      explicit type casts of the unsigned long argument into the object
      that was passed, rather than using container_of(), as done in most
      of the other callback infrastructure.

    - Neighboring buffer overflows can overwrite both the .function and
      the .data field, providing attackers with a way to elevate from a buffer
      overflow into a simplistic ROP-like mechanism that allows calling
      arbitrary functions with a controlled first argument.

    - For future Control Flow Integrity work, this creates a unique function
      prototype for timer callbacks, instead of allowing them to continue to
      be clustered with other void functions that take a single unsigned long
      argument.

    This adds a new timer initialization API, which will ultimately replace
    the existing setup_timer(), setup_{deferrable,pinned,etc}_timer() family,
    named timer_setup() (to mirror hrtimer_setup(), making instances of its
    use much easier to grep for).

    In order to support the migration of existing timers into the new
    callback arguments, timer_setup() casts its arguments to the existing
    legacy types, and explicitly passes the timer pointer as the legacy
    data argument. Once all setup_*timer() callers have been replaced with
    timer_setup(), the casts can be removed, and the data argument can be
    dropped with the timer expiration code changed to just pass the timer
    to the callback directly.

    Since the regular pattern of using container_of() during local variable
    declaration repeats the need for the variable type declaration
    to be included, this adds a helper modeled after other from_*()
    helpers that wrap container_of(), named from_timer(). This helper uses
    typeof(*variable), removing the type redundancy and minimizing the need
    for line wraps in forthcoming conversions from "unsigned long data" to
    "struct timer_list *" in the timer callbacks:

    -void callback(unsigned long data)
    +void callback(struct timer_list *t)
    {
    -   struct some_data_structure *local = (struct some_data_structure *)data;
    +   struct some_data_structure *local = from_timer(local, t, timer);

    Finally, in order to support the handful of timer users that perform
    open-coded assignments of the .function (and .data) fields, provide
    cast macros (TIMER_FUNC_TYPE and TIMER_DATA_TYPE) that can be used
    temporarily. Once conversion has been completed, these can be globally
    trivially removed.

    ...

  commit e99e88a9d2b067465adaa9c111ada99a041bef9a
  Author: Kees Cook <keescook@chromium.org>
  Date:   Mon Oct 16 14:43:17 2017 -0700

    treewide: setup_timer() -> timer_setup()

    This converts all remaining cases of the old setup_timer() API into using
    timer_setup(), where the callback argument is the structure already
    holding the struct timer_list. These should have no behavioral changes,
    since they just change which pointer is passed into the callback with
    the same available pointers after conversion. It handles the following
    examples, in addition to some other variations.

    ...

  commit 185981d54a60ae90942c6ba9006b250f3348cef2
  Author: Kees Cook <keescook@chromium.org>
  Date:   Wed Oct 4 16:26:58 2017 -0700

    timer: Remove init_timer_pinned() in favor of timer_setup()

    This refactors the only users of init_timer_pinned() to use
    the new timer_setup() and from_timer(). Drops the definition of
    init_timer_pinned().

    ...
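The callback conversion described in the first upstream commit above can be
sketched in user space as follows. This is only an illustration under
stated assumptions: struct timer_list, timer_setup() and from_timer() are
simplified stand-ins that mimic the kernel names, struct watchdog is a
hypothetical embedding structure, and __typeof__ assumes GCC or Clang.

```c
#include <stddef.h>

/* Simplified stand-in for the kernel structure: no .data field anymore. */
struct timer_list {
	void (*function)(struct timer_list *);
	unsigned long expires;
};

#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Recover the embedding object from the timer pointer, as from_timer() does. */
#define from_timer(var, timer_ptr, field) \
	container_of(timer_ptr, __typeof__(*var), field)

static void timer_setup(struct timer_list *t,
			void (*fn)(struct timer_list *), unsigned int flags)
{
	(void)flags;		/* flags are ignored in this sketch */
	t->function = fn;
}

/* Hypothetical object that embeds its timer. */
struct watchdog {
	int fired;
	struct timer_list timer;
};

/* New-style callback: receives the timer, not an unsigned long cookie. */
static void watchdog_expired(struct timer_list *t)
{
	struct watchdog *wd = from_timer(wd, t, timer);

	wd->fired++;
}
```

Because the callback derives its context from the timer's position inside
the enclosing structure, no untyped cast of a data argument is needed.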

Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Mathieu Desnoyers [Wed, 13 Dec 2017 18:40:42 +0000 (13:40 -0500)] 
Fix: Don't nest get online cpus

Since the cpu hotplug refactoring in the Linux kernel, CPU hotplug
"online cpus" read lock cannot be nested anymore.

Fix this by disabling preemption around the section instead.

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Mathieu Desnoyers [Fri, 8 Dec 2017 19:17:21 +0000 (14:17 -0500)] 
Fix: lttng_channel_syscall_mask() bool use in bitfield

gcc 7 warns about using ~ on a bool. Pass a char as input type instead.
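For context, a small sketch of why compilers flag this pattern: applying ~
to a bool first promotes it to int, so ~true is -2 and ~false is -1, both
of which are non-zero. The two helper functions are hypothetical and are
not taken from lttng-modules; the explicit cast only spells out the
promotion the compiler performs implicitly.

```c
#include <stdbool.h>

/* Value of "~flag" once the bool has been promoted to int. */
static int complement_of_bool(bool flag)
{
	return ~(int)flag;
}

/* Complement of a char-typed 0/1 value, masked back down to one bit. */
static int complement_low_bit(char value)
{
	return ~value & 1;
}
```

With a char (or any integer type) as the input, the complement is an
ordinary bitwise operation and the result can be masked as intended.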

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Michael Jeanson [Tue, 28 Nov 2017 21:02:45 +0000 (16:02 -0500)] 
Fix: update kmem instrumentation for kernel 4.15

See upstream commit:

  commit 2d4894b5d2ae0fe1725ea7abd57b33bfbbe45492
  Author: Mel Gorman <mgorman@techsingularity.net>
  Date:   Wed Nov 15 17:37:59 2017 -0800

    mm: remove cold parameter from free_hot_cold_page*

Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Mathieu Desnoyers [Tue, 7 Nov 2017 21:44:36 +0000 (16:44 -0500)] 
Fix: lttng_kvmalloc helper NULL pointer OOPS

The static function __vmalloc_node is not visible by KALLSYMS_ALL on at
least some kernels, which leads to a call to a NULL function when trying
to perform allocation of lttng buffer memory under memory fragmentation
conditions (kmalloc_node failure).

Use __vmalloc_node_range instead, and check that the returned pointer
is non-NULL to ensure this type of failure does not happen in any
condition.

Fall back to __vmalloc(), even though it is not NUMA-aware, in case
we fail to find __vmalloc_node_range, and print an explicit warning
to the user console about the need to enable KALLSYMS_ALL.
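The general shape of that fix (check the looked-up function pointer before
calling it, warn, and fall back) can be sketched in user space as below.
All names are hypothetical stand-ins: fake_node_alloc plays the role of a
NUMA-aware allocator found by symbol lookup, and plain_alloc plays the role
of the always-available, non-NUMA-aware fallback.

```c
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>

typedef void *(*node_alloc_fn)(size_t size, int node);

/* Stand-in for a NUMA-aware allocator resolved at run time. */
static void *fake_node_alloc(size_t size, int node)
{
	(void)node;
	return malloc(size);
}

/* Stand-in for the fallback allocator that is always available. */
static void *plain_alloc(size_t size)
{
	return malloc(size);
}

/* Never call through a NULL pointer: warn and use the fallback instead. */
static void *alloc_on_node(node_alloc_fn looked_up, size_t size, int node)
{
	if (looked_up)
		return looked_up(size, node);
	fprintf(stderr, "warning: node-aware allocator not found, "
			"falling back to plain allocation\n");
	return plain_alloc(size);
}
```

Passing NULL as the first argument models the case where the symbol lookup
failed; the caller still gets memory, only without node placement.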

This affects kernels < 4.12. Later kernels provide kvmalloc(), which
we use.

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Michael Jeanson [Wed, 1 Nov 2017 19:55:58 +0000 (15:55 -0400)] 
Update version to 2.11.0-pre

Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Mathieu Desnoyers [Tue, 31 Oct 2017 22:23:59 +0000 (18:23 -0400)] 
Fix: lttng-logger get_user_pages_fast error handling

Comparing a signed return value against an unsigned nr_pages performs
the comparison as "unsigned", and therefore mistakenly considers
get_user_pages_fast() errors as success.

By passing an invalid pointer to write() to the /proc/lttng-logger
interface, unprivileged user-space processes can trigger a kernel OOPS.
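The comparison pitfall can be reproduced in plain C. In this sketch the
function names are hypothetical, ret stands for a signed return value such
as a negative errno, and the explicit cast in the first function spells out
the conversion the compiler otherwise applies implicitly.

```c
#include <stdbool.h>

/*
 * Buggy check: the signed value is converted to unsigned long before the
 * comparison, so a negative error code becomes a huge positive number and
 * is never "less than" nr_pages.
 */
static bool pin_failed_buggy(int ret, unsigned long nr_pages)
{
	return (unsigned long)ret < nr_pages;
}

/* Fixed check: test for a negative value before the unsigned comparison. */
static bool pin_failed_fixed(int ret, unsigned long nr_pages)
{
	return ret < 0 || (unsigned long)ret < nr_pages;
}
```

With ret = -14 and nr_pages = 1, the buggy check reports success while the
fixed check reports failure.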

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Mathieu Desnoyers [Thu, 5 Oct 2017 18:52:15 +0000 (14:52 -0400)] 
Fix: update block instrumentation for 4.14 kernel

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Mathieu Desnoyers [Thu, 5 Oct 2017 18:45:43 +0000 (14:45 -0400)] 
Revert "Fix: update block instrumentation for kernel 4.14"

This reverts commit 49447902967115fe5a07ee7a1df3d17fbf4b1ab8.

It introduces a NULL pointer dereference:

[ 37.862398] BUG: unable to handle kernel NULL pointer dereference at 0000000000000008
[ 37.864108] IP: [<ffffffffa01c41b7>] __event_probe__block_get_rq+0x127/0x4b0 [lttng_probe_block]
[ 37.864108] PGD 7a402067 PUD 7a4c7067 PMD 0
[ 37.864108] Oops: 0000 [#1] SMP
[ 37.864108] Modules linked in: lttng_probe_x86_exceptions(OE) lttng_probe_x86_irq_vectors(OE) lttng_probe_writeback(OE) lttng_probe_workqueue(OE) lttng_probe_vmscan(OE) lttng_probe_udp(OE) lttng_probe_timer(OE) lttng_probe_sunrpc(OE) lttng_probe_statedump(OE) lttng_probe_sock(OE) lttng_probe_skb(OE) lttng_probe_signal(OE) lttng_probe_scsi(OE) lttng_probe_sched(OE) lttng_probe_regulator(OE) lttng_probe_regmap(OE) lttng_probe_rcu(OE) lttng_probe_random(OE) lttng_probe_printk(OE) lttng_probe_power(OE) lttng_probe_net(OE) lttng_probe_napi(OE) lttng_probe_module(OE) lttng_probe_kvm_x86_mmu(OE) lttng_probe_kvm_x86(OE) lttng_probe_kvm(OE) lttng_probe_kmem(OE) lttng_probe_jbd2(OE) lttng_probe_irq(OE) lttng_probe_i2c(OE) lttng_probe_gpio(OE) lttng_probe_ext4(OE) lttng_probe_compaction(OE) lttng_probe_btrfs(OE) lttng_probe_block(OE) lttng_ring_buffer_metadata_mmap_client(OE) lttng_ring_buffer_client_mmap_overwrite(OE) lttng_ring_buffer_client_mmap_discard(OE) lttng_ring_buffer_metadata_client(OE) lttng_ring_buffer_client_overwrite(OE) lttng_ring_buffer_client_discard(OE) lttng_tracer(OE) lttng_statedump(OE) lttng_ftrace(OE) lttng_kprobes(OE) lttng_clock(OE) lttng_lib_ring_buffer(OE) lttng_kretprobes(OE)
[ 37.864108] CPU: 1 PID: 6 Comm: kworker/u4:0 Tainted: G OE 4.4.90 #1
[ 37.864108] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Ubuntu-1.8.2-1ubuntu1 04/01/2014
[ 37.864108] Workqueue: events_freezable_power_ disk_events_workfn
[ 37.864108] task: ffff88007c861bc0 ti: ffff88007c868000 task.ti: ffff88007c868000
[ 37.864108] RIP: 0010:[<ffffffffa01c41b7>] [<ffffffffa01c41b7>] __event_probe__block_get_rq+0x127/0x4b0 [lttng_probe_block]
[ 37.864108] RSP: 0018:ffff88007c86ba98 EFLAGS: 00010246
[ 37.864108] RAX: 0000000000000000 RBX: ffff880073683348 RCX: ffff8800747d0000
[ 37.864108] RDX: 00000008d0c5bde9 RSI: 00000000000009f2 RDI: 0000000000400000
[ 37.864108] RBP: ffff88007c86bba8 R08: 00000000001789ed R09: 0000000000100000
[ 37.864108] R10: ffffe8ffffd02460 R11: 0000000000000000 R12: 0000000000000000
[ 37.864108] R13: 0000000000017fe0 R14: ffff88007363c6e8 R15: ffff88007bef83c0
[ 37.864108] FS: 0000000000000000(0000) GS:ffff88007fd00000(0000) knlGS:0000000000000000
[ 37.864108] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[ 37.864108] CR2: 0000000000000008 CR3: 000000007a4d0000 CR4: 00000000000006e0
[ 37.864108] Stack:
[ 37.864108] 0000000000000000 ffffffff8115a46b ffff88007c86bbe8 ffff88007bc67e30
[ 37.864108] ffff880073683348 00000000ffffff01 ffff88007a7a1000 ffff88007c86bab8
[ 37.864108] 0000000000000028 0000000100000001 ffffe8ffffd02460 0000000000000035
[ 37.864108] Call Trace:
[ 37.864108] [<ffffffff8115a46b>] ? ktime_get_mono_fast_ns+0x4b/0x90
[ 37.864108] [<ffffffff81532849>] ? alloc_request_struct+0x19/0x20
[ 37.864108] [<ffffffff811e8d8f>] ? mempool_alloc+0x5f/0x150
[ 37.864108] [<ffffffffa021815c>] ? __event_probe__kmem_alloc+0x1dc/0x2c0 [lttng_probe_kmem]
[ 37.864108] [<ffffffff810ad85e>] ? kvm_clock_read+0x1e/0x20
[ 37.864108] [<ffffffff81535f4f>] get_request+0x4af/0x760
[ 37.864108] [<ffffffff8112c270>] ? wake_atomic_t_function+0x60/0x60
[ 37.864108] [<ffffffff81536283>] blk_get_request+0x83/0xe0
[ 37.864108] [<ffffffff81773b5d>] scsi_execute+0x3d/0x1d0
[ 37.864108] [<ffffffff817758fe>] scsi_execute_req_flags+0x8e/0xf0
[ 37.864108] [<ffffffff81788f4d>] sr_check_events+0x8d/0x2a0
[ 37.864108] [<ffffffff81547590>] ? disk_check_events+0x130/0x130
[ 37.864108] [<ffffffff8181b618>] cdrom_check_events+0x18/0x30
[ 37.864108] [<ffffffff8178935a>] sr_block_check_events+0x2a/0x30
[ 37.864108] [<ffffffff815474b1>] disk_check_events+0x51/0x130
[ 37.864108] [<ffffffff815475a6>] disk_events_workfn+0x16/0x20
[ 37.864108] [<ffffffff81102b85>] process_one_work+0x165/0x480
[ 37.864108] [<ffffffff81102eeb>] worker_thread+0x4b/0x4c0
[ 37.864108] [<ffffffff81102ea0>] ? process_one_work+0x480/0x480
[ 37.864108] [<ffffffff81108d86>] kthread+0xd6/0xf0
[ 37.864108] [<ffffffff81108cb0>] ? kthread_create_on_node+0x180/0x180
[ 37.864108] [<ffffffff81aa690f>] ret_from_fork+0x3f/0x70
[ 37.864108] [<ffffffff81108cb0>] ? kthread_create_on_node+0x180/0x180
[ 37.864108] Code: 00 00 00 00 48 89 85 20 ff ff ff 48 8d 85 10 ff ff ff 8b 73 04 48 89 85 28 ff ff ff 49 8b 47 48 ff 50 28 85 c0 0f 88 5d 01 00 00 <49> 8b 44 24 08 48 85 c0 0f 84 3d 03 00 00 8b 00 89 85 08 ff ff
[ 37.864108] RIP [<ffffffffa01c41b7>] __event_probe__block_get_rq+0x127/0x4b0 [lttng_probe_block]
[ 37.864108] RSP <ffff88007c86ba98>
[ 37.864108] CR2: 0000000000000008

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Michael Jeanson [Fri, 29 Sep 2017 20:40:36 +0000 (16:40 -0400)] 
Fix: version check error in btrfs instrumentation

Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Michael Jeanson [Wed, 20 Sep 2017 16:12:41 +0000 (12:12 -0400)] 
Fix: update btrfs instrumentation for kernel 4.14

See upstream commit:

  Author: Jeff Mahoney <jeffm@suse.com>
  Date:   Wed Jun 28 21:56:54 2017 -0600

    btrfs: constify tracepoint arguments

    Tracepoint arguments are all read-only.  If we mark the arguments
    as const, we're able to keep or convert those arguments to const
    where appropriate.

Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Michael Jeanson [Wed, 20 Sep 2017 16:12:40 +0000 (12:12 -0400)] 
Fix: update writeback instrumentation for kernel 4.14

See upstream commits:

  commit 11fb998986a72aa7e997d96d63d52582a01228c5
  Author: Mel Gorman <mgorman@techsingularity.net>
  Date:   Thu Jul 28 15:46:20 2016 -0700

    mm: move most file-based accounting to the node

    There are now a number of accounting oddities such as mapped file pages
    being accounted for on the node while the total number of file pages are
    accounted on the zone.  This can be coped with to some extent but it's
    confusing so this patch moves the relevant file-based accounting.  Due to
    throttling logic in the page allocator for reliable OOM detection, it is
    still necessary to track dirty and writeback pages on a per-zone basis.

  commit c4a25635b60d08853a3e4eaae3ab34419a36cfa2
  Author: Mel Gorman <mgorman@techsingularity.net>
  Date:   Thu Jul 28 15:46:23 2016 -0700

    mm: move vmscan writes and file write accounting to the node

    As reclaim is now node-based, it follows that page write activity due to
    page reclaim should also be accounted for on the node.  For consistency,
    also account page writes and page dirtying on a per-node basis.

    After this patch, there are a few remaining zone counters that may appear
    strange but are fine.  NUMA stats are still per-zone as this is a
    user-space interface that tools consume.  NR_MLOCK, NR_SLAB_*,
    NR_PAGETABLE, NR_KERNEL_STACK and NR_BOUNCE are all allocations that
    potentially pin low memory and cannot trivially be reclaimed on demand.
    This information is still useful for debugging a page allocation failure
    warning.

Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Michael Jeanson [Wed, 20 Sep 2017 16:12:39 +0000 (12:12 -0400)] 
Fix: update block instrumentation for kernel 4.14

See upstream commit:

  commit 74d46992e0d9dee7f1f376de0d56d31614c8a17a
  Author: Christoph Hellwig <hch@lst.de>
  Date:   Wed Aug 23 19:10:32 2017 +0200

    block: replace bi_bdev with a gendisk pointer and partitions index

    This way we don't need a block_device structure to submit I/O.  The
    block_device has different life time rules from the gendisk and
    request_queue and is usually only available when the block device node
    is open.  Other callers need to explicitly create one (e.g. the lightnvm
    passthrough code, or the new nvme multipathing code).

    For the actual I/O path all that we need is the gendisk, which exists
    once per block device.  But given that the block layer also does
    partition remapping we additionally need a partition index, which is
    used for said remapping in generic_make_request.

    Note that all the block drivers generally want request_queue or
    sometimes the gendisk, so this removes a layer of indirection all
    over the stack.

Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Michael Jeanson [Tue, 26 Sep 2017 18:16:47 +0000 (14:16 -0400)] 
Fix: vmalloc wrapper on kernel < 2.6.38

Ensure that all probes end up including the vmalloc wrapper through the
lttng-tracer.h header so the trace_*() static inlines are generated
through inclusion of include/trace/events/kmem.h before we define
CREATE_TRACE_POINTS.

Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Michael Jeanson [Tue, 26 Sep 2017 17:46:30 +0000 (13:46 -0400)] 
Fix: vmalloc wrapper on kernel >= 4.12

Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Michael Jeanson [Mon, 25 Sep 2017 14:56:20 +0000 (10:56 -0400)] 
Add kmalloc failover to vmalloc

This patch is based on the kvmalloc helpers introduced in kernel 4.12.

It will gracefully fail over memory allocations of more than one page to
vmalloc for systems under high memory pressure or fragmentation.
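A user-space sketch of that failover logic, under stated assumptions:
sketch_kvmalloc, failing_alloc and SKETCH_PAGE_SIZE are hypothetical names,
the two allocator arguments stand in for the physically contiguous and the
virtually contiguous allocator, and the page size is fixed at 4096 for
illustration only.

```c
#include <stddef.h>
#include <stdlib.h>

#define SKETCH_PAGE_SIZE 4096u	/* assumed page size for this sketch */

typedef void *(*alloc_fn)(size_t size);

/* Stand-in for a contiguous allocator that fails under fragmentation. */
static void *failing_alloc(size_t size)
{
	(void)size;
	return NULL;
}

/*
 * Try the contiguous allocator first. Only requests larger than one page
 * fail over to the second allocator; small requests report the failure.
 */
static void *sketch_kvmalloc(size_t size, alloc_fn contiguous, alloc_fn virt)
{
	void *p = contiguous(size);

	if (p || size <= SKETCH_PAGE_SIZE)
		return p;
	return virt(size);
}
```

Restricting the failover to multi-page requests reflects the wording above:
a single page cannot be made easier to allocate by switching allocators.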

See Linux kernel commit:
  commit a7c3e901a46ff54c016d040847eda598a9e3e653
  Author: Michal Hocko <mhocko@suse.com>
  Date:   Mon May 8 15:57:09 2017 -0700

    mm: introduce kv[mz]alloc helpers

    Patch series "kvmalloc", v5.

    There are many open coded kmalloc with vmalloc fallback instances in the
    tree.  Most of them are not careful enough or simply do not care about
    the underlying semantic of the kmalloc/page allocator which means that
    a) some vmalloc fallbacks are basically unreachable because the kmalloc
    part will keep retrying until it succeeds b) the page allocator can
    invoke really disruptive steps like the OOM killer to move forward
    which doesn't sound appropriate when we consider that the vmalloc
    fallback is available.

    As it can be seen implementing kvmalloc requires quite an intimate
    knowledge of the page allocator and the memory reclaim internals which
    strongly suggests that a helper should be implemented in the memory
    subsystem proper.

    Most callers, I could find, have been converted to use the helper
    instead.  This is patch 6.  There are some more relying on __GFP_REPEAT
    in the networking stack which I have converted as well and Eric Dumazet
    was not opposed [2] to convert them as well.

    [1] http://lkml.kernel.org/r/20170130094940.13546-1-mhocko@kernel.org
    [2] http://lkml.kernel.org/r/1485273626.16328.301.camel@edumazet-glaptop3.roam.corp.google.com

    This patch (of 9):

    Using kmalloc with the vmalloc fallback for larger allocations is a
    common pattern in the kernel code.  Yet we do not have any common helper
    for that and so users have invented their own helpers.  Some of them are
    really creative when doing so.  Let's just add kv[mz]alloc and make sure
    it is implemented properly.  This implementation makes sure to not make
    a large memory pressure for > PAGE_SIZE requests (__GFP_NORETRY) and also
    to not warn about allocation failures.  This also rules out the OOM
    killer as the vmalloc is a more appropriate fallback than a disruptive
    user visible action.

Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Mathieu Desnoyers [Tue, 19 Sep 2017 16:16:58 +0000 (12:16 -0400)] 
Fix: mmap: caches aliased on virtual addresses

Some architectures (e.g. implementations of arm64) implement their
caches based on the virtual addresses (rather than physical address).
It has the upside of making the cache access faster (no TLB lookup
required to access the cache line), but the downside of requiring
virtual mappings (e.g. kernel vs user-space) to be aligned on the number
of bits used for cache aliasing.
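The alignment requirement can be illustrated with a short check, assuming a
virtually indexed cache whose index uses the address bits below a
power-of-two "alias span". That span is a hypothetical parameter here (the
real value is architecture specific), and the function name is made up for
this sketch.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Two virtual mappings of the same physical memory index the same cache
 * lines only if they agree on every address bit below the alias span.
 * alias_span must be a power of two.
 */
static bool mappings_can_alias(uintptr_t va_a, uintptr_t va_b,
			       uintptr_t alias_span)
{
	return ((va_a ^ va_b) & (alias_span - 1)) != 0;
}
```

For example, with a 16 kB span, mappings at 0x10000 and 0x7f0000 agree on
the low 14 bits, whereas 0x10000 and 0x7f1000 do not.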

Perform dcache flushing for the entire sub-buffer in the get_subbuf
operation on those architectures, thus ensuring we don't end up with
cache aliasing issues.

An alternative approach we could eventually take would be to create a
kernel mapping for the ring buffer that is aligned with the user-space
mapping.

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Michael Jeanson [Mon, 21 Aug 2017 18:47:08 +0000 (14:47 -0400)] 
Fix: update ext4 instrumentation for kernel 4.13

See this upstream commit:

  commit a627b0a7c15ee4d2c87a86d5be5c8167382e8d0d
  Author: Eric Whitney <enwlinux@gmail.com>
  Date:   Sun Jul 30 22:30:11 2017 -0400

      ext4: remove unused metadata accounting variables

      Two variables in ext4_inode_info, i_reserved_meta_blocks and
      i_allocated_meta_blocks, are unused.  Removing them saves a little
      memory per in-memory inode and cleans up clutter in several tracepoints.
      Adjust tracepoint output from ext4_alloc_da_blocks() for consistency
      and fix a typo and whitespace near these changes.

Signed-off-by: Eric Whitney <enwlinux@gmail.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Mathieu Desnoyers [Fri, 21 Jul 2017 12:22:04 +0000 (08:22 -0400)] 
Fix: Sleeping function called from invalid context

It affects system call instrumentation for accept, accept4 and connect,
only on the x86-64 architecture.

We need to use the LTTng accessing functions to touch user-space memory,
which take care of disabling the page fault handler, so we don't preempt
while in preempt-off context (tracepoints disable preemption).

Fixes #1111

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Michael Jeanson [Mon, 10 Jul 2017 22:13:11 +0000 (18:13 -0400)] 
Fix: sched for v4.11.5-rt1

Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Michael Jeanson [Fri, 23 Jun 2017 18:36:19 +0000 (14:36 -0400)] 
Make vim users life easier

Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Michael Jeanson [Fri, 23 Jun 2017 18:29:43 +0000 (14:29 -0400)] 
Rename Makefile.ABI.workarounds to Kbuild.common

This file is now used for code which is common to all Kbuild files.

Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Michael Jeanson [Fri, 23 Jun 2017 18:29:42 +0000 (14:29 -0400)] 
Fix: handle missing ftrace header on v4.12

Properly handle the case where we build against the distro headers of a
kernel >= 4.12 and ftrace is enabled but the private header is
unavailable.

Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Mathieu Desnoyers [Thu, 1 Jun 2017 18:24:11 +0000 (14:24 -0400)] 
Fix: pid tracker should track "pgid"

The "pid" notion exposed by LTTng translates to the "tgid" (thread group
ID) notion in the Linux kernel. Therefore using "current->pid" as argument
to the PID tracker actually ends up behaving as a "tid" tracker, which
matches neither the intent nor the user-space tracer behavior.
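The difference can be shown with a simplified task model. This sketch
assumes the process-wide identifier is the kernel's thread group ID (a
tgid field) and that the pid field identifies a single thread; the
structure and function names are hypothetical.

```c
#include <stdbool.h>

/* Simplified stand-in for the two identifiers carried by a kernel task. */
struct sketch_task {
	int pid;	/* per-thread identifier ("tid" from user space) */
	int tgid;	/* identifier shared by all threads of a process */
};

/* Matching on pid tracks one thread only. */
static bool tracked_by_thread_id(const struct sketch_task *t, int tracked)
{
	return t->pid == tracked;
}

/* Matching on tgid tracks every thread of the process. */
static bool tracked_by_process_id(const struct sketch_task *t, int tracked)
{
	return t->tgid == tracked;
}
```

A worker thread with pid 1001 in process 1000 is missed by the first check
and caught by the second when the user asks to track process 1000.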

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Mathieu Desnoyers [Thu, 1 Jun 2017 18:18:33 +0000 (14:18 -0400)] 
Cleanup: typo in lttng pid tracker

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Francis Deslauriers [Tue, 30 May 2017 13:36:31 +0000 (09:36 -0400)] 
Fix: Build ftrace probe on kernels prior to 4.12

Signed-off-by: Francis Deslauriers <francis.deslauriers@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Michael Jeanson [Thu, 25 May 2017 20:56:52 +0000 (16:56 -0400)] 
Fix: update ftrace probe for kernel 4.12

Follow changes introduced by Linux upstream commits:
  ec19b85913486993d7d6f747beed1a711afd47d8
  bca6c8d0480a8aa5c86f8f416db96c71f6b79e29
  b5f081b563a6cdcb85a543df8c851951a8978275
  6e4443199e5354255e8a4c1e8e5cfc8ef064c3ce

Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Michael Jeanson [Thu, 25 May 2017 20:56:51 +0000 (16:56 -0400)] 
Fix: update block instrumentation for kernel 4.12

Follow changes introduced by Linux upstream commits:
  48b77ad6084481ef9330a5d2bee289966da0975b
  cee4b7ce3f9161c88f7255a3d73c1c4d5bbabea7
  caf7df12272118e0274c8353bcfeaf60c7743a47

Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Mathieu Desnoyers [Sat, 27 May 2017 11:28:03 +0000 (13:28 +0200)] 
Calculate context length outside of retry loop

Allow context length calculation to have side-effects (e.g. page faults)
which trigger event tracing by moving the calculation outside of the
buffer space reservation retry loop.

This also paves the way for dynamically sized contexts, which
would be expected to put their size on the internal stack. Note that the
context length calculation is performed *after* the event payload field
length calculation, so the stack needs to be used accordingly.

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
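
The idea of computing a length once, before a reservation retry loop, can be sketched in user-space C as follows. This is only an illustration under stated assumptions: `buf_sketch`, `ctx_len_fn`, `counting_ctx_len()` and `reserve_with_ctx_sketch()` are hypothetical names, and the compare-and-swap loop is a simplified stand-in for the ring buffer reservation, not the lttng-modules code.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

struct buf_sketch {
	atomic_size_t offset;	/* Next free byte. */
	size_t size;		/* Total capacity in bytes. */
};

/* Hypothetical callback type: may have side effects, so call it once. */
typedef size_t (*ctx_len_fn)(void *priv);

/* Example callback: counts its invocations and reports a 4-byte context. */
size_t counting_ctx_len(void *priv)
{
	(*(int *) priv)++;
	return 4;
}

/*
 * Reserve payload_len plus the context length. Returns the start offset
 * of the reservation, or (size_t)-1 if there is not enough space left.
 */
size_t reserve_with_ctx_sketch(struct buf_sketch *buf, size_t payload_len,
		ctx_len_fn ctx_len, void *priv)
{
	size_t total, old;

	assert(buf && ctx_len);
	/* Computed before the loop: a retry must not re-run the callback. */
	total = payload_len + ctx_len(priv);
	old = atomic_load(&buf->offset);
	do {
		if (total > buf->size - old)
			return (size_t) -1;
	} while (!atomic_compare_exchange_weak(&buf->offset, &old, old + total));
	return old;
}
```

However many times the compare-and-swap is retried under contention, the callback runs exactly once per reservation attempt.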
7 years agoFix: Add support for 4.9.27-rt18 kernel
Michael Jeanson [Wed, 24 May 2017 15:19:50 +0000 (11:19 -0400)] 
Fix: Add support for 4.9.27-rt18 kernel

Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
7 years agoFix: update btrfs instrumentation for kernel 4.12
Michael Jeanson [Tue, 23 May 2017 19:46:41 +0000 (15:46 -0400)] 
Fix: update btrfs instrumentation for kernel 4.12

See upstream commit 490b54d6fb75f6ffd0471ec58bb38a992e2b40cd

Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
7 years agoFix: update ringbuffer for kernel 4.12
Michael Jeanson [Tue, 23 May 2017 19:45:47 +0000 (15:45 -0400)] 
Fix: update ringbuffer for kernel 4.12

flags removed from splice_pipe_desc in 4.12.

See upstream commit f81dc7d7d5a2528f98f26a0b9406e822d0b35011

Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
7 years agoFix: update sched instrumentation for kernel 4.12
Michael Jeanson [Tue, 23 May 2017 19:45:18 +0000 (15:45 -0400)] 
Fix: update sched instrumentation for kernel 4.12

See upstream commit b91473ff6e979c0028f02f90e40c844959c736d8

Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
7 years agoFix: ext3 was completely removed from the kernel in v4.3
Michael Jeanson [Tue, 23 May 2017 19:43:25 +0000 (15:43 -0400)] 
Fix: ext3 was completely removed from the kernel in v4.3

Don't display the warning about missing ext3 headers on kernels >= 4.3

See upstream commit e31fb9e00543e5d3c5b686747d3c862bc09b59f3

Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
7 years agoFix: NULL pointer dereference of THIS_MODULE with built-in modules
Francis Deslauriers [Wed, 17 May 2017 21:09:12 +0000 (17:09 -0400)] 
Fix: NULL pointer dereference of THIS_MODULE with built-in modules

THIS_MODULE is defined to 0 when a module is built into the kernel [1].
This caused a NULL pointer dereference when booting a kernel with
lttng-modules built in.
To fix this issue, add an #if guard around the wrapper_lttng_fixup_sig
function that checks whether the MODULE macro is defined, confirming that
this piece of code will end up in a module and not in the kernel itself.

[1]: linux/include/linux/export.h:32
Fixes: #1107
Signed-off-by: Francis Deslauriers <francis.deslauriers@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
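
The guard described in this commit message can be illustrated with a small user-space C sketch. `module_sketch` and `fixup_sig_sketch()` are hypothetical names chosen for illustration; the only assumption carried over from the commit is that the MODULE macro is defined solely when code is compiled as a loadable module.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's module descriptor. */
struct module_sketch {
	int sig_ok;
};

/* Returns 1 if the fixup ran, 0 if it was compiled out. */
int fixup_sig_sketch(struct module_sketch *mod)
{
#ifdef MODULE
	/* Loadable-module build: the descriptor pointer is valid. */
	assert(mod != NULL);
	mod->sig_ok = 1;
	return 1;
#else
	/* Built-in build: the pointer would be NULL, so never touch it. */
	(void) mod;
	return 0;
#endif
}
```

Compiled without `-DMODULE`, the function never dereferences its argument, which mirrors the built-in case where the pointer is NULL.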
7 years agoFix: add "flush empty" ioctl for stream intersection
Mathieu Desnoyers [Thu, 11 May 2017 20:50:50 +0000 (16:50 -0400)] 
Fix: add "flush empty" ioctl for stream intersection

Changing the behavior of the "snapshot" lttng command to implicitly do a
buffer "flush" (even when current packet is empty) had unwanted
side-effects: for instance, the snapshot ABI is used by the live timer
to grab the buffer positions, and we don't want to generate useless
empty packets in that scenario.

Therefore, add the "flush empty" behavior as a new ioctl to the ring
buffer. This allows lttng-tools to perform buffer flush (even for empty
packets) when it needs to. Given that this new ioctl is added within
stable branches as well, lttng-tools always needs to handle "-ENOSYS"
gracefully.

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
7 years agoRevert "Fix: flush empty packets on snapshot channel"
Mathieu Desnoyers [Thu, 11 May 2017 20:42:46 +0000 (16:42 -0400)] 
Revert "Fix: flush empty packets on snapshot channel"

This reverts commit dc5cd5702b74d72f0db0141c6d888a1d820aed9c.

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
7 years agoRevert "Fix: don't perform extra flush on metadata channel"
Mathieu Desnoyers [Thu, 11 May 2017 20:42:34 +0000 (16:42 -0400)] 
Revert "Fix: don't perform extra flush on metadata channel"

This reverts commit 7cf44d034bdda1896f6b0c6374c90c06d45ee4fd.

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
7 years agoVersion 2.10.0-rc1
Mathieu Desnoyers [Sat, 6 May 2017 01:04:21 +0000 (21:04 -0400)] 
Version 2.10.0-rc1

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
7 years agoFix: remove CONFIG_KALLSYMS_ALL warning on clean
Michael Jeanson [Fri, 5 May 2017 16:08:07 +0000 (12:08 -0400)] 
Fix: remove CONFIG_KALLSYMS_ALL warning on clean

Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
7 years agoAdd RING_BUFFER_SNAPSHOT_SAMPLE_POSITIONS command
Jérémie Galarneau [Thu, 4 May 2017 21:25:21 +0000 (17:25 -0400)] 
Add RING_BUFFER_SNAPSHOT_SAMPLE_POSITIONS command

There is no need to bump the LTTNG_MODULES_ABI_MINOR_VERSION
since the multiple wildcard feature introduced as part of the 2.10
release already bumps it from 2 to 3.

Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
7 years agoFix: Always build vmscan probe
Michael Jeanson [Thu, 20 Apr 2017 19:23:25 +0000 (15:23 -0400)] 
Fix: Always build vmscan probe

The mm/vmscan.c compile unit is an obj-y, even on an old 2.6.36 kernel,
so always build the vmscan probe regardless of kernel configuration.

Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
7 years agoCleanup: formatting in strutils_star_glob_match explanation
Francis Deslauriers [Fri, 17 Mar 2017 21:06:00 +0000 (17:06 -0400)] 
Cleanup: formatting in strutils_star_glob_match explanation

Replace tabs with spaces in example scenario.

Signed-off-by: Francis Deslauriers <francis.deslauriers@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
7 years agoFix: introduce LTTNG_SIZE_MAX for older kernels
Mathieu Desnoyers [Sat, 11 Mar 2017 18:46:10 +0000 (13:46 -0500)] 
Fix: introduce LTTNG_SIZE_MAX for older kernels

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
7 years agoUse SIZE_MAX instead of -1ULL for size_t parameter
Mathieu Desnoyers [Sat, 11 Mar 2017 13:44:42 +0000 (08:44 -0500)] 
Use SIZE_MAX instead of -1ULL for size_t parameter

strutils_star_glob_match() receives a size_t. Passing -1ULL truncates
the value implicitly on systems where size_t is 32-bit. It is cleaner to
use SIZE_MAX.

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
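
To illustrate the implicit truncation this commit message describes, here is a minimal user-space C sketch. The `size32_t` typedef and the `take_len32()` / `take_len()` helpers are hypothetical stand-ins, not lttng-modules code; `size32_t` models a platform where size_t is 32-bit.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* SIZE_MAX is by definition the largest value a size_t can hold. */
static_assert(SIZE_MAX == (size_t) -1, "SIZE_MAX is the maximum size_t");

/* Hypothetical stand-in for size_t on a 32-bit system. */
typedef uint32_t size32_t;

/* Returns the length as a callee with a 32-bit size_t would see it. */
size32_t take_len32(size32_t len)
{
	return len;
}

/* Same, with the platform's real size_t. */
size_t take_len(size_t len)
{
	return len;
}
```

Passing `-1ULL` to `take_len32()` yields the 32-bit maximum only because the 64-bit value is silently reduced modulo 2^32; passing `SIZE_MAX` to `take_len()` states the intent without relying on that conversion.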
7 years agofilter: use SIZE_MAX for size_t
Mathieu Desnoyers [Sat, 11 Mar 2017 13:39:22 +0000 (08:39 -0500)] 
filter: use SIZE_MAX for size_t

The backing type is a size_t, so use SIZE_MAX to represent infinity.

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
7 years agoFix: out of bound array access in filter code
Mathieu Desnoyers [Fri, 10 Mar 2017 21:51:17 +0000 (16:51 -0500)] 
Fix: out of bound array access in filter code

Fix ported from lttng-ust, initially found by Coverity.

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
7 years agoAdd support for star globbing patterns in event names
Philippe Proulx [Sun, 19 Feb 2017 01:01:34 +0000 (20:01 -0500)] 
Add support for star globbing patterns in event names

This patch adds support for full star-only globbing patterns used in
the event names (enabler names).

strutils_star_glob_match() is always used to perform the match when
the enabler is LTTNG_ENABLER_STAR_GLOB. This enabler is set when it is
detected that its name contains at least one non-escaped star with
strutils_is_star_glob_pattern().

The match is performed by strutils_star_glob_match(), the same function
that the filter interpreter uses.

Signed-off-by: Philippe Proulx <eeppeliteloop@gmail.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
7 years agoFiltering: add support for star-only globbing patterns
Philippe Proulx [Sun, 19 Feb 2017 01:04:11 +0000 (20:04 -0500)] 
Filtering: add support for star-only globbing patterns

This patch adds the support for "full" star-only globbing patterns to be
used in filter literal strings. A star-only globbing pattern is a
globbing pattern with the star (`*`) being the only special character.
This means `?` and character sets (`[abc-k]`) are not supported here. We
cannot support them without a strategy to differentiate the globbing
pattern because `?` and `[` are not special characters in filter literal
strings right now. The eventual strategy to support them would probably
look like this:

    filename =* "?sys*.[ch]"

The filter bytecode generator in LTTng-tools's session daemon creates
the new FILTER_OP_LOAD_STAR_GLOB_STRING operation when the interpreter
should load a star globbing pattern literal string. Even if both
"plain", or legacy strings and star globbing pattern strings are literal
strings, they do not represent the same thing, that is, the == and !=
operators act differently.

The validation process checks that:

1. There's no binary operator between two
   FILTER_OP_LOAD_STAR_GLOB_STRING operations. It is illegal to compare
   two star globbing patterns, as this is not trivial to implement, and
   completely useless as far as I know.

2. Only the == and != binary operators are allowed between a
   star globbing pattern and a string.

For the special case of star globbing patterns with a star at the end
only, the current behaviour is not changed to preserve a maximum of
backward compatibility. This is also why the ABI version is changed from
2.2 to 2.3, not to 3.0.

== or != operations between REG_STRING and REG_STAR_GLOB_STRING
registers are specialized to FILTER_OP_EQ_STAR_GLOB_STRING and
FILTER_OP_NE_STAR_GLOB_STRING. Which side is the actual globbing pattern
(the one with the REG_STAR_GLOB_STRING type) is checked at execution
time. The strutils_star_glob_match() function is used to perform the
match operation. See the implementation for more details.

Signed-off-by: Philippe Proulx <eeppeliteloop@gmail.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
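
As a rough illustration of what star-only globbing means, the following user-space C sketch matches a candidate string against a pattern in which `*` is the only special character. It is not the strutils_star_glob_match() implementation: the name `star_glob_match_sketch()` is hypothetical, both strings are assumed to be NUL-terminated, and escaped stars are deliberately not handled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Returns true if `candidate` matches `pattern`, where `*` matches any
 * sequence of characters (including none) and every other character
 * matches only itself.
 */
bool star_glob_match_sketch(const char *pattern, const char *candidate)
{
	const char *retry_p = NULL;	/* Pattern position after the last star. */
	const char *retry_c = NULL;	/* Candidate position to retry from. */

	assert(pattern && candidate);
	while (*candidate) {
		if (*pattern == '*') {
			/* Remember where to resume if a later literal mismatches. */
			retry_p = ++pattern;
			retry_c = candidate;
		} else if (*pattern == *candidate) {
			pattern++;
			candidate++;
		} else if (retry_p) {
			/* Let the last star absorb one more character. */
			pattern = retry_p;
			candidate = ++retry_c;
		} else {
			return false;
		}
	}
	/* Only trailing stars may remain in the pattern. */
	while (*pattern == '*')
		pattern++;
	return *pattern == '\0';
}
```

With this sketch, a pattern such as `*sys*.c` matches `my_sys_file.c`, while `sched_*` does not match `irq_handler`.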
7 years agoAdd string utilities
Philippe Proulx [Sun, 19 Feb 2017 01:00:33 +0000 (20:00 -0500)] 
Add string utilities

The new lttng-string-utils.c file has a few utility functions to
manipulate and check strings. See lttng-string-utils.c for more details.

Signed-off-by: Philippe Proulx <eeppeliteloop@gmail.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
7 years agolttng-abi.c: cleanup whitespaces
Philippe Proulx [Sun, 19 Feb 2017 01:05:13 +0000 (20:05 -0500)] 
lttng-abi.c: cleanup whitespaces

Signed-off-by: Philippe Proulx <eeppeliteloop@gmail.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
7 years agoFix: use of uninitialized ret value in lttng_abi_open_metadata_stream
Francis Deslauriers [Wed, 8 Mar 2017 19:32:31 +0000 (14:32 -0500)] 
Fix: use of uninitialized ret value in lttng_abi_open_metadata_stream

Fixes the following compiler warning:

lttng-abi.c: In function ‘lttng_metadata_ioctl’:
lttng-abi.c:971:6: warning: ‘ret’ may be used uninitialized in this function [-Wmaybe-uninitialized]
  int ret;
      ^

Signed-off-by: Francis Deslauriers <francis.deslauriers@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
7 years agoFix: kref changes for kernel 4.11
Francis Deslauriers [Wed, 8 Mar 2017 04:37:30 +0000 (23:37 -0500)] 
Fix: kref changes for kernel 4.11

The underlying type of `struct kref` changed in kernel 4.11 from an
atomic_t to a refcount_t. This change was introduced in kernel
commit 10383ae. This commit also added built-in overflow checks to
`kref_get()`, so we use it.

Signed-off-by: Francis Deslauriers <francis.deslauriers@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
7 years agoFix: atomic_add_unless() returns true/false rather than prior value
Francis Deslauriers [Wed, 8 Mar 2017 16:50:38 +0000 (11:50 -0500)] 
Fix: atomic_add_unless() returns true/false rather than prior value

The previous implementation assumed that `atomic_add_unless` returned
the prior value of the atomic counter, when in fact it returns whether the
addition was performed (true) or not (false).
Since `atomic_add_unless` cannot return INT_MAX, `lttng_kref_get`
always reported that the call was successful.

This issue had a low likelihood of being triggered since the two refcounts
of the counters used with this call are both bounded by the maximum
number of file descriptors on the system.

Signed-off-by: Francis Deslauriers <francis.deslauriers@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
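
The difference between the two interpretations of the return value can be shown with a user-space emulation built on C11 atomics. This is a sketch under assumptions: `add_unless_sketch()` only mimics the boolean semantics described in this commit message, and `kref_get_buggy()` / `kref_get_fixed()` are hypothetical names, not the lttng-modules wrappers.

```c
#include <assert.h>
#include <limits.h>
#include <stdatomic.h>
#include <stdbool.h>

/*
 * Add `a` to `*v` unless `*v` equals `u`.
 * Returns true if the addition was performed, false otherwise.
 */
bool add_unless_sketch(atomic_int *v, int a, int u)
{
	int old;

	assert(v);
	old = atomic_load(v);
	do {
		if (old == u)
			return false;
	} while (!atomic_compare_exchange_weak(v, &old, old + a));
	return true;
}

/* Buggy pattern: treats the boolean result as the prior counter value. */
bool kref_get_buggy(atomic_int *refcount)
{
	/* The result is 0 or 1, never INT_MAX, so this is always true. */
	return add_unless_sketch(refcount, 1, INT_MAX) != INT_MAX;
}

/* Fixed pattern: uses the boolean result directly. */
bool kref_get_fixed(atomic_int *refcount)
{
	return add_unless_sketch(refcount, 1, INT_MAX);
}
```

With a counter already saturated at INT_MAX, the buggy variant still reports success, while the fixed variant reports failure and leaves the counter untouched.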
7 years agoFix: timers cputime_t arguments replaced by ull in kernel 4.11
Francis Deslauriers [Tue, 7 Mar 2017 16:21:59 +0000 (11:21 -0500)] 
Fix: timers cputime_t arguments replaced by ull in kernel 4.11

cputime_t was changed to ull in the kernel commit: 858cf3a

Signed-off-by: Francis Deslauriers <francis.deslauriers@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
7 years agoFix: update scsi instrumentation for kernel 4.11
Francis Deslauriers [Tue, 7 Mar 2017 16:16:47 +0000 (11:16 -0500)] 
Fix: update scsi instrumentation for kernel 4.11

Signed-off-by: Francis Deslauriers <francis.deslauriers@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
7 years agoFix: changes to the vm_op fault cb prototype in libringbuffer
Francis Deslauriers [Tue, 7 Mar 2017 15:35:21 +0000 (10:35 -0500)] 
Fix: changes to the vm_op fault cb prototype in libringbuffer

Signed-off-by: Francis Deslauriers <francis.deslauriers@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
7 years agoFix: update btrfs instrumentation for kernel 4.11
Francis Deslauriers [Tue, 7 Mar 2017 15:14:19 +0000 (10:14 -0500)] 
Fix: update btrfs instrumentation for kernel 4.11

Signed-off-by: Francis Deslauriers <francis.deslauriers@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
7 years agoFix: update mm_vmscan instrumentation for kernel 4.11
Francis Deslauriers [Tue, 7 Mar 2017 14:48:08 +0000 (09:48 -0500)] 
Fix: update mm_vmscan instrumentation for kernel 4.11

Signed-off-by: Francis Deslauriers <francis.deslauriers@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
7 years agoFix: section mismatch warning caused by __exit annotation
Francis Deslauriers [Tue, 7 Mar 2017 14:12:31 +0000 (09:12 -0500)] 
Fix: section mismatch warning caused by __exit annotation

lttng_logger_exit is used in a non-exit function so it should not be
annotated with `__exit`.

Signed-off-by: Francis Deslauriers <francis.deslauriers@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
7 years agosocketpair: extend syscall socketpair tracing information
Jan Willeke [Thu, 16 Feb 2017 13:42:51 +0000 (14:42 +0100)] 
socketpair: extend syscall socketpair tracing information

Decode the socketpair vector pointer into two file descriptors.
This exposes the connected file descriptors to analyses.

As socketpair is a sub syscall of socketcall in x86_32,
the socketpair override must be disabled for x86_32 and x86_compatmode

Signed-off-by: Jan Willeke <jan.willeke@harman.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
7 years agoRemove events/mainline unused headers
Mathieu Desnoyers [Sat, 25 Feb 2017 08:34:10 +0000 (09:34 +0100)] 
Remove events/mainline unused headers

We can actually diff from Linux kernel headers directly instead of
keeping stale unused copies of those headers.

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
7 years agoupdate event README
Mathieu Desnoyers [Sat, 25 Feb 2017 08:33:42 +0000 (09:33 +0100)] 
update event README

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
7 years agoFix: nmi-safe clock on 32-bit systems
Mathieu Desnoyers [Fri, 10 Feb 2017 01:46:44 +0000 (20:46 -0500)] 
Fix: nmi-safe clock on 32-bit systems

On 32-bit systems, the algorithm within lttng-modules that ensures the
nmi-safe clock increases monotonically on a CPU assumes at least one
clock read per 32-bit LSB overflow period, which is not guaranteed. It
also has an issue on the first clock reads after module load, because
the initial value for the last LSB is 0. It can cause the time to stay
stuck at the same value for a few seconds at the beginning of the trace,
which is unfortunate for the first trace after module load, because this
is where the offset between realtime and trace_clock is sampled, which
prevents correlation of kernel and user-space traces for that session.

It only affects 32-bit systems with kernels >= 3.17.

Fix this by using the non-nmi-safe clock source on 32-bit systems.

While we are there, remove an implementation-defined C99 behavior
regarding casting u64 to long by using unsigned arithmetic instead:

turn:
  if (((long) now - (long) last) < 0)
into:
  if (U64_MAX / 2 < now - last)

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
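
The unsigned comparison quoted in this commit message can be tried out in user space with the sketch below. `clock_went_backwards()` is a hypothetical helper name, and `UINT64_MAX` from `<stdint.h>` stands in for the kernel's `U64_MAX`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

static_assert(UINT64_MAX / 2 == (uint64_t) INT64_MAX,
	"half of the unsigned range is the signed maximum");

/*
 * Returns true when `now` is "before" `last` in modular 64-bit
 * arithmetic, i.e. when the unsigned difference lands in the upper
 * half of the range. Unsigned subtraction wraps, so no
 * implementation-defined conversion to a signed type is involved.
 */
bool clock_went_backwards(uint64_t now, uint64_t last)
{
	return UINT64_MAX / 2 < now - last;
}
```

A small forward step across the 64-bit wrap (for example from `UINT64_MAX - 5` to `5`) is still treated as forward progress, because the wrapped difference is small.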
7 years agoFix: only include linux/cpuhotplug.h for kernels >= 4.10
Mathieu Desnoyers [Mon, 23 Jan 2017 20:16:22 +0000 (15:16 -0500)] 
Fix: only include linux/cpuhotplug.h for kernels >= 4.10

Kernels up to at least 4.4 did not have this header file.

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
7 years agoFix: 4.10 hotplug adaptation backward compat
Mathieu Desnoyers [Mon, 23 Jan 2017 17:34:07 +0000 (12:34 -0500)] 
Fix: 4.10 hotplug adaptation backward compat

                 from /home/compudj/git/lttng-modules/lttng-context-perf-counters.c:23:
/home/compudj/git/lttng-modules/lttng-context-perf-counters.c: In function ‘lttng_add_perf_counter_to_ctx’:
/home/compudj/git/lttng-modules/lttng-context-perf-counters.c:353:22: error: ‘cpu’ undeclared (first use in this function)
  for_each_online_cpu(cpu) {
                      ^
./include/linux/cpumask.h:223:8: note: in definition of macro ‘for_each_cpu’
  for ((cpu) = -1;    \
        ^
/home/compudj/git/lttng-modules/lttng-context-perf-counters.c:353:2: note: in expansion of macro ‘for_each_online_cpu’
  for_each_online_cpu(cpu) {
  ^
/home/compudj/git/lttng-modules/lttng-context-perf-counters.c:353:22: note: each undeclared identifier is reported only once for each function it appears in
  for_each_online_cpu(cpu) {
                      ^
./include/linux/cpumask.h:223:8: note: in definition of macro ‘for_each_cpu’
  for ((cpu) = -1;    \
        ^
/home/compudj/git/lttng-modules/lttng-context-perf-counters.c:353:2: note: in expansion of macro ‘for_each_online_cpu’
  for_each_online_cpu(cpu) {
  ^
./include/linux/cpumask.h:224:38: warning: left-hand operand of comma expression has no effect [-Wunused-value]
   (cpu) = cpumask_next((cpu), (mask)), \
                                      ^
./include/linux/cpumask.h:717:36: note: in expansion of macro ‘for_each_cpu’
 #define for_each_online_cpu(cpu)   for_each_cpu((cpu), cpu_online_mask)
                                    ^
/home/compudj/git/lttng-modules/lttng-context-perf-counters.c:353:2: note: in expansion of macro ‘for_each_online_cpu’
  for_each_online_cpu(cpu) {
  ^
scripts/Makefile.build:289: recipe for target '/home/compudj/git/lttng-modules/lttng-context-perf-counters.o' failed
make[2]: *** [/home/compudj/git/lttng-modules/lttng-context-perf-counters.o] Error 1
make[2]: *** Waiting for unfinished jobs....

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
7 years agoFix: 4.10 btrfs instrumentation update backward compat
Mathieu Desnoyers [Mon, 23 Jan 2017 17:32:17 +0000 (12:32 -0500)] 
Fix: 4.10 btrfs instrumentation update backward compat

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
7 years agoUpdate btrfs instrumentation for 4.10 kernel
Mathieu Desnoyers [Mon, 23 Jan 2017 17:18:35 +0000 (12:18 -0500)] 
Update btrfs instrumentation for 4.10 kernel

Based on commit 92a1bf76 "Btrfs: add 'inode' for extent map tracepoint"
in the upstream Linux kernel.

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
7 years agoAdapt lttng-modules to Linux 4.10 cpu hotplug state machine
Mathieu Desnoyers [Tue, 10 Jan 2017 16:19:51 +0000 (11:19 -0500)] 
Adapt lttng-modules to Linux 4.10 cpu hotplug state machine

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
7 years agobtrfs instrumentation: update to 4.10 kernel
Mathieu Desnoyers [Tue, 10 Jan 2017 16:41:11 +0000 (11:41 -0500)] 
btrfs instrumentation: update to 4.10 kernel

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
7 years agotimer instrumentation: adapt to ktime_t without union
Mathieu Desnoyers [Tue, 10 Jan 2017 16:29:49 +0000 (11:29 -0500)] 
timer instrumentation: adapt to ktime_t without union

Introduced in Linux upstream in 4.10.

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
7 years agoAdd load/unload messages to kernel log
Michael Jeanson [Wed, 21 Dec 2016 22:55:26 +0000 (17:55 -0500)] 
Add load/unload messages to kernel log

Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
7 years agoUpdate version to 2.10.0-pre
Michael Jeanson [Wed, 21 Dec 2016 22:47:18 +0000 (17:47 -0500)] 
Update version to 2.10.0-pre

Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
7 years agoFix: asoc instrumentation for RHEL 7.3
Michael Jeanson [Wed, 7 Dec 2016 19:17:33 +0000 (14:17 -0500)] 
Fix: asoc instrumentation for RHEL 7.3

Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
7 years agoFix: SCSI instrumentation for SLES12 SP2
Michael Jeanson [Wed, 7 Dec 2016 16:09:31 +0000 (11:09 -0500)] 
Fix: SCSI instrumentation for SLES12 SP2

Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
7 years agoAdd SUSE Linux Enterprise kernel version tests
Michael Jeanson [Wed, 7 Dec 2016 16:09:30 +0000 (11:09 -0500)] 
Add SUSE Linux Enterprise kernel version tests

Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
7 years agoFilter code relicensing to MIT license
Mathieu Desnoyers [Mon, 28 Nov 2016 17:39:48 +0000 (12:39 -0500)] 
Filter code relicensing to MIT license

Relicense the filtering code to MIT license.

I am the principal author of this code. Julien Desfossez gave the
approval for his modifications.

Acked-by: Julien Desfossez <jdesfossez@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
7 years agoAdd task cpu in process statedump
Mathieu Desnoyers [Thu, 24 Nov 2016 01:43:49 +0000 (20:43 -0500)] 
Add task cpu in process statedump

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
7 years agoPerformance: add missing unlikely in reserve
Mathieu Desnoyers [Mon, 21 Nov 2016 21:08:22 +0000 (16:08 -0500)] 
Performance: add missing unlikely in reserve

Add missing branch prediction hints within lttng_event_reserve().

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
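
For readers unfamiliar with branch prediction hints, here is a minimal user-space sketch of the pattern. It assumes a GCC- or Clang-compatible compiler providing `__builtin_expect`; `sketch_unlikely()` and `reserve_hint_sketch()` are hypothetical names and do not reproduce lttng_event_reserve().

```c
#include <assert.h>
#include <stddef.h>

/* Hint similar in spirit to the kernel's unlikely(). */
#define sketch_unlikely(x)	__builtin_expect(!!(x), 0)

/* Returns 0 on success, -1 when the request does not fit. */
int reserve_hint_sketch(size_t requested, size_t available)
{
	/* Zero-length reservations are treated as a caller bug here. */
	assert(requested > 0);
	/* The failure path is marked as rarely taken. */
	if (sketch_unlikely(requested > available))
		return -1;
	return 0;
}
```

The hint does not change the result of the function; it only tells the compiler which path to lay out as the fall-through case.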
8 years agoFix: preemptible and migratable context error handling
Mathieu Desnoyers [Mon, 24 Oct 2016 17:27:01 +0000 (13:27 -0400)] 
Fix: preemptible and migratable context error handling

When built against preempt-rt and preempt kernels, the "return 0" case
means success, but lttng-modules incorrectly prints an error in the
kernel log.

Given that we handle the -ENOSYS error in lttng_context_init, there is
no need to keep the ifdefs in that function.

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
8 years agoFix: bump stable kernel version ranges for clock work-around
Mathieu Desnoyers [Thu, 13 Oct 2016 13:50:21 +0000 (15:50 +0200)] 
Fix: bump stable kernel version ranges for clock work-around

Linux commit 27727df240c7 ("Avoid taking lock in NMI path with
CONFIG_DEBUG_TIMEKEEPING"), changed the logic to open-code
the timekeeping_get_ns() function, but forgot to include
the unit conversion from cycles to nanoseconds, breaking the
function's output, which impacts LTTng.

We expected Linux commit 58bfea9532 "timekeeping: Fix
__ktime_get_fast_ns() regression" to make its way into stable
kernels promptly, but it appears new stable kernel releases were
done before the fix was cherry-picked from the master branch.

We therefore need to bump the version ranges for the work-around
in lttng-modules.

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
CC: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
CC: John Stultz <john.stultz@linaro.org>
8 years agoVersion 2.9.0-rc1
Mathieu Desnoyers [Fri, 7 Oct 2016 19:19:52 +0000 (15:19 -0400)] 
Version 2.9.0-rc1

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
8 years agoFix: i2c: support kernels < 3.15
Mathieu Desnoyers [Fri, 7 Oct 2016 14:55:16 +0000 (10:55 -0400)] 
Fix: i2c: support kernels < 3.15

i2c instrumentation has only been added in kernel 3.15.

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
8 years agoFix: show warning for broken clock work-around
Mathieu Desnoyers [Thu, 6 Oct 2016 11:45:35 +0000 (07:45 -0400)] 
Fix: show warning for broken clock work-around

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
8 years agoBump minor ABI version
Mathieu Desnoyers [Wed, 5 Oct 2016 16:47:58 +0000 (12:47 -0400)] 
Bump minor ABI version

Command added: LTTNG_KERNEL_SESSION_STATEDUMP

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
8 years agoFix: work-around upstream Linux timekeeping bug
Mathieu Desnoyers [Wed, 5 Oct 2016 11:20:32 +0000 (07:20 -0400)] 
Fix: work-around upstream Linux timekeeping bug

Linux commit 27727df240c7 ("Avoid taking lock in NMI path with
CONFIG_DEBUG_TIMEKEEPING"), changed the logic to open-code
the timekeeping_get_ns() function, but forgot to include
the unit conversion from cycles to nanoseconds, breaking the
function's output, which impacts LTTng.

The following kernel versions are affected: 4.8, 4.7.4+, 4.4.20+,
4.1.32+

We expect that the upstream fix will reach the master and stable
branches promptly, before the next releases, so we use 4.8.1, 4.7.7,
4.4.24, and 4.1.34 as upper bounds (exclusive).

Fall back to the non-NMI-safe trace clock for those kernel versions.
We simply discard events from NMI context with an in_nmi() check,
as we did before Linux 3.17.

Link: http://lkml.kernel.org/r/1475636148-26539-1-git-send-email-john.stultz@linaro.org
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
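
The affected version ranges listed in this commit message (lower bound inclusive, upper bound exclusive) can be expressed as a predicate. The sketch below is a user-space illustration only: `SKETCH_VERSION()`, `in_range()` and `needs_clock_workaround()` are hypothetical names, and the packing simply follows the classic major/minor/patch byte layout.

```c
#include <assert.h>
#include <stdbool.h>

/* Pack a version number as major.minor.patch, one byte for minor and patch. */
#define SKETCH_VERSION(a, b, c)	(((a) << 16) + ((b) << 8) + (c))

/* True if low <= v < high. */
static bool in_range(unsigned int v, unsigned int low, unsigned int high)
{
	return v >= low && v < high;
}

/* True if kernel version a.b.c falls in one of the affected ranges. */
bool needs_clock_workaround(unsigned int a, unsigned int b, unsigned int c)
{
	unsigned int v;

	/* The packing above assumes minor and patch fit in 8 bits. */
	assert(b < 256 && c < 256);
	v = SKETCH_VERSION(a, b, c);
	return in_range(v, SKETCH_VERSION(4, 8, 0), SKETCH_VERSION(4, 8, 1))
		|| in_range(v, SKETCH_VERSION(4, 7, 4), SKETCH_VERSION(4, 7, 7))
		|| in_range(v, SKETCH_VERSION(4, 4, 20), SKETCH_VERSION(4, 4, 24))
		|| in_range(v, SKETCH_VERSION(4, 1, 32), SKETCH_VERSION(4, 1, 34));
}
```

For instance, 4.7.4 and 4.4.23 fall inside an affected range, while 4.7.7 and 4.8.1 are the first versions assumed to carry the fix.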
8 years agoAdd support for i2c tracepoints
Simon Marchi [Tue, 4 Oct 2016 21:07:05 +0000 (17:07 -0400)] 
Add support for i2c tracepoints

This patch teaches lttng-modules about the i2c tracepoints in the Linux
kernel.

It contains the following tracepoints:

  * i2c_write
  * i2c_read
  * i2c_reply
  * i2c_result

I translated the fields and assignments from the kernel's
include/trace/events/i2c.h as well as I could.  I also tried building
this module against a kernel without CONFIG_I2C, and it built fine (the
required types are unconditionally defined).  So I don't think any "#if
CONFIG_I2C" or similar are required.

A module parameter (extract_sensitive_payload) controls the extraction
of possibly sensitive data from events.

[ With edit by Mathieu Desnoyers. ]

Signed-off-by: Simon Marchi <simon.marchi@ericsson.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
8 years agoCleanup: makefile version checks with single "ge"
Mathieu Desnoyers [Mon, 3 Oct 2016 21:35:27 +0000 (17:35 -0400)] 
Cleanup: makefile version checks with single "ge"

Version checks in makefiles should always be a disjunctive normal form
where the conjunctions consist of one or more "equals" comparisons and
at most a single greater-or-equal comparison.

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
8 years agoPerformance: special-case NULL in lttng_strlen_user_inatomic
Mathieu Desnoyers [Mon, 26 Sep 2016 17:37:50 +0000 (13:37 -0400)] 
Performance: special-case NULL in lttng_strlen_user_inatomic

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
8 years agoFix: lttng_inline_memcpy does not take a __user argument
Mathieu Desnoyers [Sun, 25 Sep 2016 16:30:00 +0000 (12:30 -0400)] 
Fix: lttng_inline_memcpy does not take a __user argument

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
8 years agoPerformance: implement lttng_inline_memcpy
Mathieu Desnoyers [Sun, 25 Sep 2016 16:27:01 +0000 (12:27 -0400)] 
Performance: implement lttng_inline_memcpy

Because all length parameters received for serializing data coming from
applications go through a callback, they are never constant, and it
hurts performance to perform a call to memcpy each time.

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
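
One way to avoid an out-of-line memcpy call for variable lengths is to special-case the most common small sizes. The sketch below shows that general technique in portable user-space C; `inline_memcpy_sketch()` is a hypothetical name, the choice of 1, 2, 4 and 8 bytes as fast-path sizes is an assumption, and the code should not be read as the actual lttng_inline_memcpy.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Copy `len` bytes from `src` to `dest`. For the small sizes listed,
 * memcpy() is called with a compile-time constant length, which
 * compilers can typically expand into a single load/store pair.
 * Other lengths fall back to a regular memcpy() call.
 */
static inline void inline_memcpy_sketch(void *dest, const void *src, size_t len)
{
	assert(dest != NULL && src != NULL);
	switch (len) {
	case 1:
		memcpy(dest, src, 1);
		break;
	case 2:
		memcpy(dest, src, 2);
		break;
	case 4:
		memcpy(dest, src, 4);
		break;
	case 8:
		memcpy(dest, src, 8);
		break;
	default:
		memcpy(dest, src, len);
		break;
	}
}
```

Whether this is a net win depends on the distribution of lengths seen at run time, so it is worth measuring before adopting the pattern.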
8 years agoPerformance: cache the backend pages pointer in context
Mathieu Desnoyers [Sun, 25 Sep 2016 16:02:25 +0000 (12:02 -0400)] 
Performance: cache the backend pages pointer in context

Getting the backend pages pointer requires pointer chasing through the
ring buffer backend tables. Cache the current value so it can be re-used
for all backend write operations writing fields for the same event.

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
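
The caching idea can be sketched in user-space C as follows. All names (`pages_sketch`, `backend_sketch`, `ctx_sketch`, `ctx_get_pages()`) are hypothetical and the data layout is greatly simplified compared to the real ring buffer backend; the `lookups` counter exists only to make the effect of the cache observable.

```c
#include <assert.h>
#include <stddef.h>

struct pages_sketch {
	char data[64];
};

struct backend_sketch {
	struct pages_sketch **tables;	/* Indirection tables to chase. */
	size_t current;			/* Index of the current table entry. */
	unsigned int lookups;		/* Number of slow lookups performed. */
};

struct ctx_sketch {
	struct backend_sketch *backend;
	struct pages_sketch *cached_pages;	/* NULL until first lookup. */
};

/* Slow path: chase pointers through the backend tables. */
static struct pages_sketch *lookup_pages(struct backend_sketch *backend)
{
	backend->lookups++;
	return backend->tables[backend->current];
}

/* Fast path: reuse the pointer cached in the per-event context. */
struct pages_sketch *ctx_get_pages(struct ctx_sketch *ctx)
{
	assert(ctx && ctx->backend);
	if (!ctx->cached_pages)
		ctx->cached_pages = lookup_pages(ctx->backend);
	return ctx->cached_pages;
}
```

Writing several fields for the same event then costs a single table lookup, provided the context is reset before it is reused for another event.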
8 years agoCleanup: libringbuffer: remove duplicate pointer chasing in slow paths
Mathieu Desnoyers [Sun, 25 Sep 2016 15:11:48 +0000 (11:11 -0400)] 
Cleanup: libringbuffer: remove duplicate pointer chasing in slow paths

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
8 years agoPerformance: Only dereference commit index once
Mathieu Desnoyers [Sun, 25 Sep 2016 15:06:10 +0000 (11:06 -0400)] 
Performance: Only dereference commit index once

The commit fast path should not dereference the commit counter index
repeatedly for performance reasons.

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>