Sleeping Function Called From Invalid Context Bug

A driver that had passed every test on the bench started printing a long warning in dmesg on a customer board, but only sometimes. The message began with “sleeping function called from invalid context”. The driver did not crash, the data still moved, and the same build was clean on our development board. This post walks through how we read that message, found the real cause, and fixed it. The bug is common in driver code, and the steps used to root-cause it apply to most reports of this kind.

One note on the commands below. The driver is cross-compiled on a host workstation, but the warning is produced on the target board, so the verification commands are run on the target itself, over its serial console or an SSH session. The shell prompt shown is therefore the engineer’s login on the board, not on the host.

The symptom

The board ran a sensor configuration tool that issued an ioctl. Our driver binds to the hardware as a platform driver on the platform bus, and in its probe() it also registers a character device. The ioctl arrives through that character device’s unlocked_ioctl handler, and that handler is where the configuration request is processed. Every few runs, the kernel log filled with a backtrace like the one below. File and line numbers, and the exact symbol offsets, vary between kernel versions, so the values shown here are illustrative; the structure of the message is what matters.

[  742.118233] BUG: sleeping function called from invalid context at include/linux/sched/mm.h:333
[  742.118241] in_atomic(): 1, irqs_disabled(): 1, non_block: 0, pid: 1423, name: sensorcfg
[  742.118245] preempt_count: 1, expected: 0
[  742.118248] CPU: 1 PID: 1423 Comm: sensorcfg Not tainted 6.6.0 #1
[  742.118253] Call trace:
[  742.118255]  dump_backtrace+0x98/0xf0
[  742.118257]  show_stack+0x18/0x28
[  742.118259]  dump_stack_lvl+0x60/0x80
[  742.118261]  __might_resched+0x190/0x2a0
[  742.118263]  kmalloc_trace+0x4c/0x110
[  742.118266]  sensor_set_config+0x44/0x120
[  742.118269]  sensor_ioctl+0x88/0x140
[  742.118271]  vfs_ioctl+0x20/0x48
[  742.118273]  __arm64_sys_ioctl+0xb0/0x100
[  742.118275]  invoke_syscall+0x48/0x114
[  742.118277]  el0_svc_common+0x44/0xf4
[  742.118279]  do_el0_svc+0x24/0x38
[  742.118281]  el0_svc+0x34/0xb8

The kernel did not panic because this is a warning, not a fatal error. It printed the report and continued. That is exactly why the bug had survived our test runs: nothing failed loudly, and the warning appears only when the code path runs on a kernel with the debug check enabled.

Reading the message: sleeping function called from invalid context

Every field in the header points to the cause. Read it from the top:

The reported file and line sit inside the memory allocator’s sleep check, reached through __might_resched. That tells you a function that is allowed to sleep was called.
in_atomic(): 1 means preempt_count is non-zero. The CPU was in an atomic section where the scheduler must not switch tasks.
irqs_disabled(): 1 means local interrupts were off as well.
preempt_count: 1, expected: 0 states the count plainly. Something raised it to 1 and had not lowered it before the allocation ran.
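To make the preempt_count field concrete, here is a minimal userspace sketch that splits the value into its sub-fields. The masks follow the layout described in include/linux/preempt.h on recent kernels; treat them as an assumption and check them against the tree you build. The struct and function names are hypothetical and exist only for this illustration.

```c
#include <stdio.h>

/* Assumed layout, per include/linux/preempt.h on recent kernels. */
#define SKETCH_PREEMPT_MASK 0x000000ffu   /* preempt-disable nesting depth */
#define SKETCH_SOFTIRQ_MASK 0x0000ff00u   /* softirq / bh-disable bits */
#define SKETCH_HARDIRQ_MASK 0x000f0000u   /* hardirq nesting */

/* Hypothetical helper type, not a kernel structure. */
struct preempt_fields {
        unsigned int preempt_depth;
        unsigned int softirq;
        unsigned int hardirq;
};

/* Split a raw preempt_count value into its three main fields. */
static struct preempt_fields decode_preempt_count(unsigned int count)
{
        struct preempt_fields f = {
                .preempt_depth = count & SKETCH_PREEMPT_MASK,
                .softirq = (count & SKETCH_SOFTIRQ_MASK) >> 8,
                .hardirq = (count & SKETCH_HARDIRQ_MASK) >> 16,
        };

        return f;
}

/* Print the decoded fields on one line. */
static void print_preempt_fields(unsigned int count)
{
        struct preempt_fields f = decode_preempt_count(count);

        printf("preempt depth: %u, softirq field: %u, hardirq field: %u\n",
               f.preempt_depth, f.softirq, f.hardirq);
}
```

Decoding the value 1 from the report gives a preempt depth of 1 with both interrupt fields at zero. That is consistent with process-context code that took one spinlock, rather than code running inside an interrupt handler.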

The call trace lists the innermost frame first and the syscall entry last, so read it from the bottom up: the ioctl syscall (__arm64_sys_ioctl) reached vfs_ioctl, which called the driver’s sensor_ioctl handler, which called sensor_set_config, which called the allocator, shown here as kmalloc_trace. The frames above kmalloc_trace belong to the sleep check and to the stack dump itself. So a memory allocation that may sleep ran while the code was in atomic context. The next question is what put the code in atomic context.

Reproducing it: the offending code

The configuration handler allocated a node and added it to a list that an interrupt handler also touched, so the list was protected by a spinlock. The code took the lock and then allocated inside the locked region:

static int sensor_set_config(struct sensor_dev *dev,
                             const struct sensor_cfg *cfg)
{
        struct sensor_cfg *copy;
        unsigned long flags;

        spin_lock_irqsave(&dev->lock, flags);

        copy = kmalloc(sizeof(*copy), GFP_KERNEL);   /* this may sleep */
        if (!copy) {
                spin_unlock_irqrestore(&dev->lock, flags);
                return -ENOMEM;
        }
        *copy = *cfg;
        list_add(&copy->node, &dev->cfg_list);

        spin_unlock_irqrestore(&dev->lock, flags);
        return 0;
}

Two facts combine here. First, spin_lock_irqsave raises preempt_count and disables local interrupts. That is the atomic context the message reported, and it is the correct lock to use because the same list is touched from an interrupt handler. Second, kmalloc with GFP_KERNEL is allowed to block while the kernel reclaims memory, so it calls might_sleep() internally. Calling a function that may sleep while holding a spinlock is the exact pattern the check is designed to catch.

Root cause: atomic context and might_sleep()

The warning comes from might_sleep(), which expands to a real check only when CONFIG_DEBUG_ATOMIC_SLEEP is set in the kernel configuration. Our development board ran a kernel without that option, so the same code was silent there. The customer board ran a debug kernel, which is why the message appeared only on that build.

On the target board, confirm the option on the running kernel:

raghu@techveda.org:~$ zcat /proc/config.gz | grep DEBUG_ATOMIC_SLEEP
CONFIG_DEBUG_ATOMIC_SLEEP=y

The silence on the other board did not mean the code was safe. A spinlock busy-waits and disables preemption while it is held. If the holder sleeps, the lock stays held for an unbounded time: another CPU that tries to take the same lock spins and wastes the processor, and in the worst case the system deadlocks. Sleeping while holding a spinlock is therefore unsafe regardless of whether the debug check happens to print the warning. The message was reporting a real defect that was simply quiet without the debug option.
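The interaction between the lock, the counter, and the debug option can be modelled in a few lines of userspace C. This is a toy model, not the kernel’s implementation: every toy_ name is hypothetical, the counter stands in for preempt_count, and a runtime flag stands in for the compile-time CONFIG_DEBUG_ATOMIC_SLEEP option.

```c
#include <stdbool.h>
#include <stdio.h>

/* Toy stand-ins for per-CPU kernel state. Not the kernel's implementation. */
static unsigned int toy_preempt_count;
static bool toy_irqs_off;
static bool toy_debug_atomic_sleep;     /* models CONFIG_DEBUG_ATOMIC_SLEEP */
static unsigned int toy_reports;

static void toy_spin_lock_irqsave(void)
{
        toy_irqs_off = true;
        toy_preempt_count++;
}

static void toy_spin_unlock_irqrestore(void)
{
        toy_preempt_count--;
        toy_irqs_off = false;
}

/* Reports only when the debug switch is on, as might_sleep() does. */
static void toy_might_sleep(const char *caller)
{
        if (!toy_debug_atomic_sleep)
                return;
        if (toy_preempt_count != 0 || toy_irqs_off) {
                toy_reports++;
                fprintf(stderr,
                        "toy: sleeping function called from invalid context (%s), preempt_count: %u\n",
                        caller, toy_preempt_count);
        }
}

/* Models kmalloc(..., GFP_KERNEL): it may block, so it runs the check. */
static void toy_alloc_may_sleep(void)
{
        toy_might_sleep("toy_alloc_may_sleep");
}

/* Same ordering as the offending code: allocate while the lock is held. */
static void toy_set_config_buggy(void)
{
        toy_spin_lock_irqsave();
        toy_alloc_may_sleep();
        toy_spin_unlock_irqrestore();
}

/* Allocate first, then take the lock only for the list update. */
static void toy_set_config_fixed(void)
{
        toy_alloc_may_sleep();
        toy_spin_lock_irqsave();
        toy_spin_unlock_irqrestore();
}

/* Run one ordering with the debug switch on or off; return the report count. */
static unsigned int toy_run(void (*set_config)(void), bool debug_enabled)
{
        toy_preempt_count = 0;
        toy_irqs_off = false;
        toy_reports = 0;
        toy_debug_atomic_sleep = debug_enabled;
        set_config();
        return toy_reports;
}
```

With the switch on, toy_run(toy_set_config_buggy, true) counts one report. With it off, the same buggy ordering counts none, which mirrors the silent development board: the defect is still there, only the reporting is gone.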

The fix

The allocation does not need the lock. Only the list update does. The correct change is to allocate first, then take the spinlock for the short list operation:

static int sensor_set_config(struct sensor_dev *dev,
                             const struct sensor_cfg *cfg)
{
        struct sensor_cfg *copy;
        unsigned long flags;

        copy = kmalloc(sizeof(*copy), GFP_KERNEL);   /* sleeping is fine here */
        if (!copy)
                return -ENOMEM;
        *copy = *cfg;

        spin_lock_irqsave(&dev->lock, flags);
        list_add(&copy->node, &dev->cfg_list);
        spin_unlock_irqrestore(&dev->lock, flags);
        return 0;
}

The lock is now held only for the list insertion, which does not sleep. There are two other options worth knowing, though they did not fit this case. If an allocation truly must happen inside a spinlock or an interrupt handler, pass GFP_ATOMIC instead of GFP_KERNEL; that flag tells the allocator not to sleep, at the cost of a higher chance of failure under memory pressure. If the data is only ever touched from process context and never from an interrupt handler, a mutex can replace the spinlock, because code that holds a mutex is allowed to sleep. Allocating outside the lock, as done here, is the cleanest choice whenever it is possible.
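For comparison, the GFP_ATOMIC and mutex alternatives might look roughly like the sketch below. This is kernel code that builds only inside a kernel tree, and it is not what we shipped. The structure layouts, the cfg_mutex field, the sample_rate_hz field, and both function names are assumptions made for illustration, since the real definitions are not shown in this post.

```c
#include <linux/errno.h>
#include <linux/list.h>
#include <linux/lockdep.h>
#include <linux/mutex.h>
#include <linux/slab.h>
#include <linux/spinlock.h>
#include <linux/types.h>

/* Hypothetical layouts: the real structures are not shown in this post. */
struct sensor_cfg {
        struct list_head node;
        u32 sample_rate_hz;             /* illustrative payload */
};

struct sensor_dev {
        spinlock_t lock;                /* guards cfg_list when an IRQ handler shares it */
        struct mutex cfg_mutex;         /* hypothetical; used only by variant B */
        struct list_head cfg_list;
};

/*
 * Variant A: the caller already holds dev->lock, so the allocation
 * cannot be moved out of atomic context. GFP_ATOMIC does not sleep,
 * but it fails more readily under memory pressure.
 */
static int sensor_add_config_locked(struct sensor_dev *dev,
                                    const struct sensor_cfg *cfg)
{
        struct sensor_cfg *copy;

        lockdep_assert_held(&dev->lock);

        copy = kmalloc(sizeof(*copy), GFP_ATOMIC);
        if (!copy)
                return -ENOMEM;
        *copy = *cfg;
        list_add(&copy->node, &dev->cfg_list);
        return 0;
}

/*
 * Variant B: valid only if cfg_list is never touched from interrupt
 * context. mutex_lock() may itself sleep, so this must run in process
 * context.
 */
static int sensor_set_config_mutex(struct sensor_dev *dev,
                                   const struct sensor_cfg *cfg)
{
        struct sensor_cfg *copy;

        mutex_lock(&dev->cfg_mutex);
        copy = kmalloc(sizeof(*copy), GFP_KERNEL);   /* allowed under a mutex */
        if (!copy) {
                mutex_unlock(&dev->cfg_mutex);
                return -ENOMEM;
        }
        *copy = *cfg;
        list_add(&copy->node, &dev->cfg_list);
        mutex_unlock(&dev->cfg_mutex);
        return 0;
}
```

Variant A has to handle allocation failure as a routine outcome, not a rare one. Variant B is ruled out for our driver because the interrupt handler also walks the list, and an interrupt handler cannot take a mutex.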

Confirming the fix

We rebuilt the driver on the host, installed the new module on the board, and watched the board’s kernel log while running the configuration tool in a loop on the debug kernel:

raghu@techveda.org:~$ dmesg -w

No further “invalid context” reports appeared across several hundred runs. Keeping CONFIG_DEBUG_ATOMIC_SLEEP enabled on at least one test build is the practical lesson here. It turns a silent, timing-dependent hazard into a clear message with a backtrace, which is far easier to fix before the product ships. Understanding which contexts allow sleeping and which do not is one of the core skills of kernel driver development.

Key takeaways

The header decodes the bug: in_atomic() and preempt_count tell you the code was in atomic context, and the call stack names the function that tried to sleep.
Holding a spinlock raises preempt_count and disables preemption, so any function that may sleep, including kmalloc(GFP_KERNEL), mutex_lock, and copy_from_user, must not be called while the lock is held.
The check only fires when CONFIG_DEBUG_ATOMIC_SLEEP is set, so run at least one test kernel with it enabled.
Prefer allocating before taking the spinlock; use GFP_ATOMIC only when the allocation cannot be moved out of atomic context.
