Summary
The NPU device’s kernel driver implements a custom mmap handler that exposes trusted kernel data to user space. The exposed structures contain sensitive data, including kernel pointers, which a user process can then control. The kernel inherently trusts the content of these structures and accesses the pointers for reading and writing in various places. This provides a very convenient arbitrary kernel read-write primitive that an attacker can abuse to compromise the integrity of the kernel, achieve kernel code execution, and gain elevated privileges.
The mmap handler is exposed through the /dev/davinci0 character device.
Due to the applied SELinux policy, access to this device is restricted to the hiaiserver system process.
Because of these limitations, a practical attack would need to target hiaiserver first.
The /dev/davinci0 device implements a custom mmap interface in drivers/hisi/npu/device/npu_devinit.c:211 devdrv_npu_map.
The implementation allows three different regions to be mapped, based on the page offset provided to the syscall:
- MAP_L2_BUFF: simple repetitive mapping of a pad page
- MAP_CONTIGUOUS_MEM: physically contiguous memory region
- MAP_INFO_SQ_CQ_MEM: shared mapping between kernel and user space containing the metadata and structures used by the kernel driver
The last mapping is the offending code that exposes the internal data structures for reading and writing.
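As a rough illustration of the dispatch described above, the following user-space sketch selects a region from the page offset. The numeric selector values and the function name npu_map_region_name are hypothetical, and matching the offset exactly is an assumption; the real constants and the exact comparison live in the driver sources and are not reproduced in this write-up.

```c
#include <stddef.h>

/* Hypothetical selector values: the real constants are defined in the
 * driver headers and are not given in this advisory. */
enum npu_map_type {
    MAP_L2_BUFF = 1,
    MAP_CONTIGUOUS_MEM = 2,
    MAP_INFO_SQ_CQ_MEM = 3,
};

/* Returns a short label for the region selected by the mmap page offset,
 * or NULL when the offset does not name a known region. */
const char *npu_map_region_name(unsigned long vm_pgoff)
{
    switch (vm_pgoff) {
    case MAP_L2_BUFF:
        return "l2 buffer (pad page)";
    case MAP_CONTIGUOUS_MEM:
        return "contiguous memory";
    case MAP_INFO_SQ_CQ_MEM:
        return "info/sq/cq shared memory";
    default:
        return NULL;
    }
}
```

Only the third selector leads to the shared metadata mapping discussed in the rest of this advisory.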
The actual mapping is implemented in drivers/hisi/npu/facility/memory/npu_shm.c:962 devdrv_info_sq_cq_mmap.
As the comment in the code explains, the mapped region consists of 5 main parts, each taking up 32 MB of virtual address space:
|___SQ(32MB)___|____INFO(32MB)_____|__DOORBELL(32MB)___|___CQ(32MB)___|(32M vitural address space respectively)
Out of the 5 regions only the “INFO” mapping contains initialized data. It is mapped the following way:
/* remap info pfn range for user space */
phy_addr = g_shm_desc[dev_id][DEVDRV_INFO_MEM].phy_addr;
size = g_shm_desc[dev_id][DEVDRV_INFO_MEM].size;
start_addr += DEVDRV_VM_BLOCK_OFFSET; // gap to reduce memory override probability
COND_RETURN_ERROR(size <= 0, -ENOMEM, "npu dev %d illegal info mem size = %lu\n", dev_id, size);
// ...
err = remap_pfn_range(vma, start_addr, phy_addr >> PAGE_SHIFT, size, vma->vm_page_prot);
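The address arithmetic in the excerpt above can be sketched in user-space C as follows. This is a minimal sketch, assuming that DEVDRV_VM_BLOCK_OFFSET equals the 32 MB block size named in the driver comment and that the blocks appear in the order shown there; the enum and the function block_start are hypothetical names introduced only for illustration.

```c
#include <stdint.h>

/* Assumed to match DEVDRV_VM_BLOCK_OFFSET: one 32 MB block per region. */
#define VM_BLOCK_OFFSET (UINT64_C(32) * 1024 * 1024)

/* Block order as it appears in the quoted driver comment. */
enum shm_block {
    BLOCK_SQ = 0,
    BLOCK_INFO = 1,
    BLOCK_DOORBELL = 2,
    BLOCK_CQ = 3,
};

/* Returns the user virtual address at which a block starts, given the
 * start address of the whole mapping (vma->vm_start in the driver). */
uint64_t block_start(uint64_t vm_start, enum shm_block block)
{
    return vm_start + (uint64_t)block * VM_BLOCK_OFFSET;
}
```

Under these assumptions, a process that maps the region at some base address would find the "INFO" block exactly 32 MB above that base, which matches the single `start_addr += DEVDRV_VM_BLOCK_OFFSET` step in the excerpt.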
The content of the shared g_shm_desc[dev_id][DEVDRV_INFO_MEM] region is initialized on the following call path:
- devdrv_manager_init
- devdrv_devinit
- devdrv_drv_register
- devdrv_resource_list_init: finally calls the respective devdrv_*_list_init functions that initialize the various structures in the shared memory
After the initialization is complete the mappable “info” memory contains the sensitive kernel data. The region has the following layout on our test device:
[64 * devdrv_ts_sq_info ][1 * devdrv_ts_cq_info][ 2 * 64 * devdrv_stream_info ][ 24 * devdrv_hwts_sq_info ][ 20 * devdrv_hwts_cq_info][64 * devdrv_model_desc_info]
Please note that the exact number and type of these structures might vary across different SoCs and devices.
Furthermore, the hwts and model related structures are not initialized unless the respective features are enabled in the device tree of the device.
Regardless of the configuration, the devdrv_ts_sq_info, devdrv_ts_cq_info, and devdrv_stream_info kernel structures should always be present and initialized.
All of these contain trusted data and pointers to further complex kernel structures that can be hijacked by a malicious user space process, if it has access to the character device.
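To locate a given structure inside the mapped "info" region, one needs the byte offset of each array in the layout shown above. The sketch below computes those offsets from element counts and sizes. It assumes the arrays are packed back to back without padding, and it takes the element sizes as input because the real structure sizes are not stated in this write-up; struct info_section and info_layout are hypothetical names.

```c
#include <stddef.h>

/* One array of identical structures inside the "info" region. */
struct info_section {
    const char *name;   /* e.g. "devdrv_ts_sq_info" */
    size_t count;       /* number of elements, e.g. 64 */
    size_t elem_size;   /* sizeof the structure; must be supplied by the caller */
};

/* Writes the byte offset of each section into offsets[] and returns the
 * total size of the layout. Assumes no padding between sections. */
size_t info_layout(const struct info_section *sections, size_t n,
                   size_t *offsets)
{
    size_t off = 0;

    for (size_t i = 0; i < n; i++) {
        offsets[i] = off;
        off += sections[i].count * sections[i].elem_size;
    }
    return off;
}
```

With the counts observed on the test device (64, 1, 2 * 64, 24, 20, 64) and the element sizes taken from the target's kernel headers, the returned offsets would give the position of each array relative to the start of the "INFO" block.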
There are numerous places in the NPU device’s kernel driver where the exposed pointers are accessed for either reading or writing. To highlight the scope of this problem, we compiled a non-exhaustive list of functions that can be triggered by different ioctls on the same device and that operate on potentially attacker-controlled pointers.
drivers/hisi/npu/facility/id_allocator/npu_calc_sq.c
devdrv_dec_sq_ref_by_stream
devdrv_inc_sq_ref_by_stream
devdrv_free_sq_id
devdrv_alloc_sq_id
devdrv_is_sq_ref_by_no_stream
devdrv_free_sq_mem
devdrv_get_sq_phy_addr
devdrv_alloc_sq_mem
drivers/hisi/npu/facility/id_allocator/npu_calc_cq.c
devdrv_clr_cq_info
devdrv_get_cq_phy_addr
devdrv_free_cq_mem
devdrv_alloc_cq_mem
devdrv_free_cq_id
devdrv_alloc_cq_id
drivers/hisi/npu/device/core/npu_proc_ctx.c
devdrv_proc_alloc_cq
devdrv_proc_get_cq_id
devdrv_proc_clr_sqcq_info
devdrv_proc_free_single_cq
devdrv_proc_free_cq
__devdrv_get_report_phase_from_cq_info
devdrv_find_cq_index
drivers/hisi/npu/device/stream/npu_sink_stream.c
devdrv_free_sink_stream_id
drivers/hisi/npu/device/stream/npu_stream.c
devdrv_free_stream_id
drivers/hisi/npu/device/core/npu_ioctl_services.c
devdrv_ioctl_get_occupy_stream_id
drivers/hisi/npu/device/core/npu_recycle.c
devdrv_recycle_stream_list
drivers/hisi/npu/device/core/npu_proc_ctx.c
devdrv_proc_free_stream
devdrv_proc_alloc_stream
devdrv_proc_clr_sqcq_info
devdrv_proc_send_alloc_stream_mailbox
These functions provide arbitrary increment, decrement, null write, null check, list unlink, fixed value and pointer write primitives. Furthermore, the pointers can be redirected to the mapped shared memory. This way an attacker can gain control of further complex kernel structures that can extend the list of primitives.
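The root cause behind these primitives can be shown with a small, purely user-space simulation. It is a simplified sketch and not driver code: struct demo_stream_info, its ref_count field, and both functions are hypothetical stand-ins for a structure that lives in the shared region and for a driver routine that dereferences a pointer stored there.

```c
#include <stdint.h>

/* Hypothetical, simplified stand-in for a structure in the shared region. */
struct demo_stream_info {
    uint32_t id;
    uint32_t *ref_count;  /* "kernel" pointer stored in user-writable memory */
};

/* Stand-in for a driver routine that trusts the pointer it finds in the
 * shared structure and increments whatever it points to. */
void demo_inc_ref(struct demo_stream_info *shared)
{
    (*shared->ref_count)++;
}

/* Simulates the attack and returns the value of the unrelated variable
 * after the "driver" has performed its increment. */
uint32_t demo_redirected_increment(void)
{
    uint32_t legit_counter = 0;
    uint32_t victim = 41;  /* unrelated data the driver never meant to touch */
    struct demo_stream_info shared = { .id = 0, .ref_count = &legit_counter };

    shared.ref_count = &victim;  /* attacker rewrites the pointer via the mapping */
    demo_inc_ref(&shared);       /* the increment now lands on the victim */
    return victim;
}
```

Because the structure itself is writable from user space, no memory corruption bug is needed: the increment is redirected simply by storing a different address in the shared field, which is why each function in the list above translates into a primitive of its own.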
For accurate details about the exploitation of this vulnerability please refer to our blog post about it.
Affected Devices (Verified)
- Kirin 990
- Huawei Mate 30 Pro (LIO)
- Huawei P40 Pro (ELS)
- Huawei P40 (ANA)
Fix
Huawei OTA images released after February 2021 contain the fix for the vulnerability.
Timeline
- 2020.10.30. Bug reported to Huawei PSIRT
- 2020.11.25. Huawei PSIRT confirms vulnerability, confirms fix plans
- 2021.01.31. OTA distribution of the fix, mitigating the vulnerability, starts
- 2021.06.09. Huawei assigns CVEs
- 2021.06.30. Huawei releases security bulletin