Summary

The NPU device’s kernel driver implements a custom mmap handler that exposes trusted kernel data to user space. These exposed structures contain sensitive data, including kernel pointers, which can be controlled by a user process. The kernel inherently trusts the content of these structures and dereferences the pointers for reading and writing in various places. This provides a very convenient arbitrary kernel read-write primitive that an attacker can abuse to compromise the integrity of the kernel, achieve kernel code execution, and gain elevated privileges.

The mmap handler is exposed through the /dev/davinci0 character device. Due to the applied SELinux policy, access to this device is restricted to the hiaiserver system process. Because of this restriction, a practical attack would need to compromise the hiaiserver process first.

The /dev/davinci0 device implements a custom mmap interface in drivers/hisi/npu/device/npu_devinit.c:211 devdrv_npu_map. The implementation allows three different regions to be mapped, based on the page offset provided to the syscall:

  • MAP_L2_BUFF: simple repetitive mapping of a pad page
  • MAP_CONTIGUOUS_MEM: physically contiguous memory region
  • MAP_INFO_SQ_CQ_MEM: shared mapping between kernel and user space containing the metadata and structures used by the kernel driver
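
The offset-based selection can be sketched as follows. This is a minimal user-space model, not the driver's code: the numeric values of the map types, the 4 KiB page size, and the helper names mmap_offset_for and decode_map_type are all assumptions made for illustration. The only general fact relied upon is that the byte offset passed to mmap reaches the handler as a page offset (vma->vm_pgoff), that is, the byte offset shifted right by the page shift.

```c
#include <stdint.h>

/* Hypothetical encoding: the advisory names these map types but not their values. */
enum npu_map_type {
    MAP_L2_BUFF = 1,
    MAP_CONTIGUOUS_MEM = 2,
    MAP_INFO_SQ_CQ_MEM = 3
};

#define SKETCH_PAGE_SHIFT 12u  /* assumes 4 KiB pages */

/* Byte offset a caller would pass as the last argument of mmap(2). */
static uint64_t mmap_offset_for(enum npu_map_type type)
{
    return (uint64_t)type << SKETCH_PAGE_SHIFT;
}

/* Models the handler's view: it receives the page offset and picks a region.
 * Returns the selected map type, or -1 if the offset selects nothing. */
static int decode_map_type(uint64_t pgoff)
{
    switch (pgoff) {
    case MAP_L2_BUFF:
    case MAP_CONTIGUOUS_MEM:
    case MAP_INFO_SQ_CQ_MEM:
        return (int)pgoff;
    default:
        return -1;
    }
}
```

Under these assumptions, a caller asking for the shared structures would pass mmap_offset_for(MAP_INFO_SQ_CQ_MEM) as the offset, and the handler would recover MAP_INFO_SQ_CQ_MEM from the resulting page offset.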

The last mapping is the offending code that exposes the internal data structures for reading and writing. The actual mapping is implemented in drivers/hisi/npu/facility/memory/npu_shm.c:962 devdrv_info_sq_cq_mmap. As the comment in the code explains, the mapped region consists of four main parts, each taking up 32 MB of virtual address space:

|___SQ(32MB)___|____INFO(32MB)_____|__DOORBELL(32MB)___|___CQ(32MB)___|(32M vitural address space respectively)
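
To make the layout concrete, the following sketch computes where each region named in the driver comment starts inside the user-space mapping. It assumes the regions follow each other in the order shown, each exactly 32 MB apart; the enum and the region_start helper are hypothetical names introduced for this example.

```c
#include <stdint.h>

/* 32 MB per region, as stated in the driver comment. */
#define VM_BLOCK_SIZE ((uintptr_t)32 * 1024 * 1024)

/* Region order taken from the comment: SQ, INFO, DOORBELL, CQ. */
enum shm_region {
    REGION_SQ = 0,
    REGION_INFO = 1,
    REGION_DOORBELL = 2,
    REGION_CQ = 3
};

/* Start address of a region, given the base address returned by mmap. */
static uintptr_t region_start(uintptr_t map_base, enum shm_region region)
{
    return map_base + (uintptr_t)region * VM_BLOCK_SIZE;
}
```

With a mapping at base 0x40000000, for example, the INFO region would begin at 0x42000000 under these assumptions.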

Of these regions, only the “INFO” mapping contains initialized data. It is mapped the following way:

/* remap info pfn range for user space */
phy_addr = g_shm_desc[dev_id][DEVDRV_INFO_MEM].phy_addr;
size = g_shm_desc[dev_id][DEVDRV_INFO_MEM].size;
start_addr += DEVDRV_VM_BLOCK_OFFSET;  // gap to reduce memory override probability
COND_RETURN_ERROR(size <= 0, -ENOMEM, "npu dev %d illegal info mem size = %lu\n", dev_id, size);
// ...
err = remap_pfn_range(vma, start_addr, phy_addr >> PAGE_SHIFT, size, vma->vm_page_prot);

The content of the shared g_shm_desc[dev_id][DEVDRV_INFO_MEM] region is initialized on the following call path:

  • devdrv_manager_init
  • devdrv_devinit
  • devdrv_drv_register
  • devdrv_resource_list_init: finally calls the respective devdrv_*_list_init functions that initialize the various structures in the shared memory

After the initialization is complete, the mappable “info” memory contains the sensitive kernel data. The region has the following layout on our test device:

[64 * devdrv_ts_sq_info ][1 * devdrv_ts_cq_info][ 2 * 64 * devdrv_stream_info ][ 24 * devdrv_hwts_sq_info ][ 20 * devdrv_hwts_cq_info][64 * devdrv_model_desc_info]
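
Locating a given array inside this layout amounts to a running sum over the preceding sections. The sketch below shows that arithmetic only. It assumes the sections are packed back to back without padding, and it deliberately leaves the element sizes as inputs, because they depend on the kernel build of the target device. The struct and function names are hypothetical.

```c
#include <stddef.h>

/* One section of the shared "info" memory. */
struct info_section {
    const char *name;  /* structure type stored in the section */
    size_t count;      /* number of elements, as listed in the layout */
    size_t elem_size;  /* size of one element; must come from the target kernel */
};

/* Byte offset of section `index` from the start of the info memory,
 * assuming sections are packed back to back with no padding. */
static size_t info_section_offset(const struct info_section *sections, size_t index)
{
    size_t offset = 0;
    for (size_t i = 0; i < index; i++)
        offset += sections[i].count * sections[i].elem_size;
    return offset;
}
```

Given a table filled with the counts above (64, 1, 2 * 64, and so on) and element sizes taken from matching kernel headers, info_section_offset(sections, 2) would give the start of the devdrv_stream_info array.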

Please note that the exact number and type of these structures might vary across different SoCs and devices. Furthermore, the hwts and model-related structures are not initialized unless the respective features are enabled in the device tree of the device. Regardless of the configuration, the devdrv_ts_sq_info, devdrv_ts_cq_info and devdrv_stream_info kernel structures should always be present and initialized. All of these contain trusted data and pointers to further complex kernel structures that a malicious user space process can hijack if it has access to the character device.

There are numerous places in the NPU device’s kernel driver where the exposed pointers are accessed for either reading or writing. To highlight the scope of this problem, we compiled a non-exhaustive list of functions that can be triggered by different ioctls on the same device and operate on potentially attacker-controlled pointers.

  • drivers/hisi/npu/facility/id_allocator/npu_calc_sq.c
    • devdrv_dec_sq_ref_by_stream
    • devdrv_inc_sq_ref_by_stream
    • devdrv_free_sq_id
    • devdrv_alloc_sq_id
    • devdrv_is_sq_ref_by_no_stream
    • devdrv_free_sq_mem
    • devdrv_get_sq_phy_addr
    • devdrv_alloc_sq_mem
  • drivers/hisi/npu/facility/id_allocator/npu_calc_cq.c
    • devdrv_clr_cq_info
    • devdrv_get_cq_phy_addr
    • devdrv_free_cq_mem
    • devdrv_alloc_cq_mem
    • devdrv_free_cq_id
    • devdrv_alloc_cq_id
  • drivers/hisi/npu/device/core/npu_proc_ctx.c
    • devdrv_proc_alloc_cq
    • devdrv_proc_get_cq_id
    • devdrv_proc_clr_sqcq_info
    • devdrv_proc_free_single_cq
    • devdrv_proc_free_cq
    • __devdrv_get_report_phase_from_cq_info
    • devdrv_find_cq_index
  • drivers/hisi/npu/device/stream/npu_sink_stream.c
    • devdrv_free_sink_stream_id
  • drivers/hisi/npu/device/stream/npu_stream.c
    • devdrv_free_stream_id
  • drivers/hisi/npu/device/core/npu_ioctl_services.c
    • devdrv_ioctl_get_occupy_stream_id
  • drivers/hisi/npu/device/core/npu_recycle.c
    • devdrv_recycle_stream_list
  • drivers/hisi/npu/device/core/npu_proc_ctx.c
    • devdrv_proc_free_stream
    • devdrv_proc_alloc_stream
    • devdrv_proc_clr_sqcq_info
    • devdrv_proc_send_alloc_stream_mailbox

These functions provide arbitrary increment, decrement, null write, null check, list unlink, fixed value and pointer write primitives. Furthermore, the pointers can be redirected to the mapped shared memory. This way an attacker can gain control of further complex kernel structures that can extend the list of primitives.
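
The simplest of these primitives, the arbitrary increment, can be modeled in a few lines of ordinary user-space C. This is a toy illustration of the bug class, not driver code: the structure, its field, and the function names are hypothetical, and a local variable stands in for the memory shared between kernel and user space.

```c
#include <stdint.h>

/* Toy stand-in for a descriptor kept in memory that user space can also write.
 * The field name is hypothetical and does not mirror the driver's layout. */
struct toy_shared_info {
    uint32_t *ref_count;  /* pointer the privileged side dereferences blindly */
};

/* Models a privileged helper that bumps a counter through the shared pointer
 * without validating where it points. */
static void toy_inc_ref(struct toy_shared_info *info)
{
    (*info->ref_count)++;
}

/* Returns the final value of a variable the helper was never meant to touch. */
static uint32_t toy_redirect_demo(void)
{
    uint32_t legit_counter = 0;
    uint32_t unrelated = 41;
    struct toy_shared_info info = { .ref_count = &legit_counter };

    toy_inc_ref(&info);           /* intended use: legit_counter becomes 1 */
    info.ref_count = &unrelated;  /* "user space" rewrites the shared pointer */
    toy_inc_ref(&info);           /* the same helper now increments unrelated */
    return unrelated;
}
```

The helper itself never changes between the two calls. Only the pointer stored in the shared descriptor does, which is why trusting such a descriptor turns each accessor into a primitive.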

For a detailed description of how this vulnerability can be exploited, please refer to our blog post about it.

Affected Devices (Verified)

  • Kirin 990
    • Huawei Mate 30 Pro (LIO)
    • Huawei P40 Pro (ELS)
    • Huawei P40 (ANA)

Fix

Huawei OTA images released after February 2021 contain the fix for the vulnerability.

Timeline

  • 2020.10.30. Bug reported to Huawei PSIRT
  • 2020.11.25. Huawei PSIRT confirms vulnerability, confirms fix plans
  • 2021.01.31. OTA distribution of the fix, mitigating the vulnerability, starts
  • 2021.06.09. Huawei assigns CVEs
  • 2021.06.30. Huawei releases security bulletin