Summary

During the regular boot sequence, Huawei’s BootROM initializes the UFS hardware and the crypto engine in order to load and verify the next-stage bootloader image from flash. However, when run in download mode, which may be used for factory flashing and repair purposes, a connected host can communicate with the BootROM over a serial communication channel carried over USB.

The communication is based on a slightly modified version of the XMODEM protocol. The first frame to be sent must be the head chunk, which defines the destination address and the size of the file to be downloaded via the data chunks that follow.

The XMODEM data chunks carry the data of the file to be downloaded in segments. Each data chunk must carry exactly 1024 bytes of data, except for the last one, which may carry fewer than 1024 bytes depending on the file size. As the snippet below shows, the received data is immediately copied to its designated location via memcpy, where the destination address is computed as file_download_addr + 1024 * latest_seen_seq.

if (seq == (next_seq & 0xff)) {
  if (next_seq == xmodem->total_frame_count - 1) {
    chunk_size = xmodem->file_download_length - 0x400*xmodem->latest_seen_seq;
    computed_length = chunk_size + 5;
  }
  else {
    computed_length = 0x405;
    chunk_size = 0x400;
  }
  if (curr_len == computed_length) {
    memcpy(
        xmodem->file_download_addr + 0x400*xmodem->latest_seen_seq,
        xmodem->msg,
        chunk_size);
    xmodem->buf_len -= 5;
    xmodem->latest_seen_seq++;
    xmodem->next_seq++;
    (...) // send ACK
  }
}

The latest_seen_seq field is a counter that is incremented with each accepted data chunk. It is used to enforce the in-order delivery of the chunks, so that missing or duplicated chunks are detected.
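
The arithmetic of the snippet can be restated as a few small helpers. This is only a sketch: the function names and the is_last flag are hypothetical, and is_last stands in for the comparison against total_frame_count, whose computation is not shown here.

```c
#include <stdint.h>

#define CHUNK_PAYLOAD  0x400u  /* payload bytes in a full data chunk */
#define CHUNK_OVERHEAD 5u      /* framing bytes included in computed_length */

/* The sequence number on the wire is one byte wide, so it is compared
 * against the low byte of the handler's own counter. */
int seq_matches(uint8_t seq, uint32_t next_seq)
{
    return seq == (next_seq & 0xffu);
}

/* Destination of the next chunk after latest_seen_seq accepted chunks. */
uint32_t chunk_dest(uint32_t file_download_addr, uint32_t latest_seen_seq)
{
    return file_download_addr + CHUNK_PAYLOAD * latest_seen_seq;
}

/* Payload size the handler expects for the next chunk.
 * is_last is a hypothetical flag replacing the total_frame_count check. */
uint32_t chunk_payload(uint32_t file_download_length,
                       uint32_t latest_seen_seq, int is_last)
{
    if (is_last)
        return file_download_length - CHUNK_PAYLOAD * latest_seen_seq;
    return CHUNK_PAYLOAD;
}

/* Total frame length that must match curr_len for the chunk to be copied. */
uint32_t expected_frame_len(uint32_t payload)
{
    return payload + CHUNK_OVERHEAD;
}
```

For example, a 2500-byte file downloaded to 0x22000 would be copied to 0x22000, 0x22400 and 0x22800, with the last chunk carrying 452 bytes in a 457-byte frame.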

The file_download_addr is filled from the head chunk message. A head chunk has the following schema:

  cmd | seq | ~seq | type  | file length | file address | checksum
 0xfe |  0  | 0xff | {1,2} |  (4 bytes)  |  (4 bytes)   | (2 bytes) 
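
A host-side tool could serialize such a frame roughly as follows. This is a minimal sketch under two assumptions that the schema above does not state: multi-byte fields are sent big-endian, and the 2-byte checksum is the CRC-16 of the standard XMODEM-CRC variant (polynomial 0x1021, initial value 0) computed over the first 12 bytes. The function names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

#define HEAD_CHUNK_LEN 14  /* 1 + 1 + 1 + 1 + 4 + 4 + 2 bytes */

/* CRC-16/XMODEM. ASSUMPTION: the checksum algorithm is not given in the
 * schema; this is the one used by the standard XMODEM-CRC protocol. */
uint16_t crc16_xmodem(const uint8_t *data, size_t len)
{
    uint16_t crc = 0;
    for (size_t i = 0; i < len; i++) {
        crc ^= (uint16_t)(data[i] << 8);
        for (int bit = 0; bit < 8; bit++) {
            if (crc & 0x8000)
                crc = (uint16_t)((crc << 1) ^ 0x1021);
            else
                crc = (uint16_t)(crc << 1);
        }
    }
    return crc;
}

/* ASSUMPTION: 32-bit fields are transmitted most significant byte first. */
void put_be32(uint8_t *p, uint32_t v)
{
    p[0] = (uint8_t)(v >> 24);
    p[1] = (uint8_t)(v >> 16);
    p[2] = (uint8_t)(v >> 8);
    p[3] = (uint8_t)v;
}

/* Fill out[] with a head chunk for the given type, length and address. */
void build_head_chunk(uint8_t out[HEAD_CHUNK_LEN], uint8_t file_type,
                      uint32_t length, uint32_t address)
{
    out[0] = 0xfe;            /* cmd: head chunk */
    out[1] = 0x00;            /* seq */
    out[2] = 0xff;            /* ~seq */
    out[3] = file_type;       /* 1 or 2 */
    put_be32(&out[4], length);
    put_be32(&out[8], address);
    uint16_t crc = crc16_xmodem(out, 12);
    out[12] = (uint8_t)(crc >> 8);
    out[13] = (uint8_t)(crc & 0xff);
}
```

The 14-byte total matches the msg_len check in the parsing function.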

Below is the pseudocode for the head chunk parsing function:

if (cmd == 0xfe) { // head command 
  // message chunk sanity checks 
  if ( 
    (seq==0) && (msg_len==14) && (file_type-1 & 0xff) < 2 
  ) { 
    (...) // extract length and address from the message 
    xmodem->file_download_length = length; 
    xmodem->file_download_addr   = address; 
   
    if (address == 0x22000) { // limit download address 
      // reset state 
      xmodem->total_received = 0; 
      xmodem->latest_seen_seq = 0; 
      xmodem->next_seq = 1; 
      (...) // calculate total_frame_count from the size 
      send_usb_response(xmodem, 0xaa); // ACK 
      return; 
    } 
    send_usb_response(xmodem, 0x07); // address error 
    return; 
  } 
  send_usb_response(xmodem, 0x55); // NACK 
  return; 
}
if (xmodem->next_seq == 0) { 
  (...) // ignore any other commands while next_seq==0 
  return; 
}

At startup, the next_seq field is initialized to 0, so the XMODEM protocol handler refuses to parse anything except a head chunk. The address variable (extracted from the “file address” field of the head chunk) is unconditionally assigned to the file_download_addr field of the xmodem struct. The actual address verification happens only after the assignment, by checking that the address is 0x22000 (the legitimate location of the xloader firmware). The head chunk sets next_seq to 1 only when the address equals 0x22000.

The code does not restore or clear the file_download_addr field when the address check fails. In addition, because head chunks are always processed regardless of the state of next_seq, it is possible to send additional head chunks at any time.
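
One way to avoid both problems is to validate the address before any state is committed, and to drop back to the initial state on failure. The following is a hypothetical sketch of that ordering, not Huawei's actual fix; the struct is a reduced stand-in for the real xmodem struct and all names are illustrative.

```c
#include <stdint.h>

#define FIXED_XLOADER_ADDR 0x22000u

enum { FIXED_RESP_ACK = 0xaa, FIXED_RESP_ADDR_ERR = 0x07 };

/* Reduced, hypothetical stand-in for the BootROM's xmodem struct. */
struct xmodem_fixed {
    uint32_t file_download_addr;
    uint32_t file_download_length;
    uint32_t latest_seen_seq;
    uint32_t next_seq;
};

/* Validate first, commit second. On a bad address nothing is stored and
 * next_seq is reset, so data chunks are ignored until a valid head chunk. */
int head_chunk_commit(struct xmodem_fixed *x, uint32_t length, uint32_t address)
{
    if (address != FIXED_XLOADER_ADDR) {
        x->next_seq = 0;              /* back to "head chunk only" state */
        return FIXED_RESP_ADDR_ERR;
    }
    x->file_download_length = length;
    x->file_download_addr   = address;
    x->latest_seen_seq      = 0;
    x->next_seq             = 1;
    return FIXED_RESP_ACK;
}
```

With this ordering, a rejected head chunk can neither change the download address nor leave data chunk processing enabled.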

As a result, a fully controlled write of up to 1024 bytes can be achieved in the following way:

  1. First, a valid head chunk is sent with an address of 0x22000. This initializes next_seq to 1, thus allowing the processing of data chunks.
  2. Next, a second head chunk is sent, but this time the address is the desired target of the arbitrary write. Even though the address check does not pass, the file_download_addr field is updated with the arbitrary address and next_seq remains 1, as the previous head chunk has already set it.
  3. Finally, a data chunk carrying up to 1024 bytes is sent. It is processed because next_seq is not zero, and its content is memcpy-ed to the arbitrary address defined by the second head chunk.
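
The three steps can be reproduced against a small host-side model of the handler. This is a simplified simulation, not the BootROM code: device memory is a plain byte array, the frame length, sequence number and checksum checks are left out, the bounds guard exists only to keep the simulation memory-safe, and all names are hypothetical.

```c
#include <stdint.h>
#include <string.h>

#define SIM_MEM_SIZE     0x40000u  /* size of the simulated memory */
#define SIM_XLOADER_ADDR 0x22000u

struct xmodem_model {
    uint8_t  mem[SIM_MEM_SIZE];    /* stand-in for device memory */
    uint32_t file_download_addr;
    uint32_t file_download_length;
    uint32_t latest_seen_seq;
    uint32_t next_seq;             /* 0 means "accept head chunks only" */
};

/* Mirrors the head chunk logic: the address is stored before it is checked. */
int model_head(struct xmodem_model *x, uint32_t length, uint32_t address)
{
    x->file_download_length = length;
    x->file_download_addr   = address;   /* assigned unconditionally */
    if (address == SIM_XLOADER_ADDR) {
        x->latest_seen_seq = 0;
        x->next_seq = 1;
        return 0xaa;                     /* ACK */
    }
    return 0x07;                         /* address error, state not rolled back */
}

/* Mirrors the data chunk copy. Returns -1 when the chunk is ignored. */
int model_data(struct xmodem_model *x, const uint8_t *payload, uint32_t len)
{
    if (x->next_seq == 0)
        return -1;                       /* no valid head chunk seen yet */
    uint32_t dst = x->file_download_addr + 0x400u * x->latest_seen_seq;
    if (len > 0x400u || dst > SIM_MEM_SIZE - len)
        return -1;                       /* simulation-only bounds guard */
    memcpy(&x->mem[dst], payload, len);
    x->latest_seen_seq++;
    x->next_seq++;
    return 0xaa;
}
```

Calling model_head with 0x22000, then model_head with another in-range address such as 0x30000 (which returns 0x07), and finally model_data with a 1024-byte buffer leaves the buffer at 0x30000 in the simulated memory rather than at 0x22000.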

Because there are no mitigations (such as ASLR or stack cookies) that would add entropy to the stack frames, and because the arbitrary write is precise, the control flow can be hijacked deterministically by overwriting a saved return address on the stack. Payload injection, and thus full code execution, is also possible, as the whole working memory region is readable, writable and executable.

For more details, please see our BlackHat presentation and the accompanying whitepaper.

Affected Devices (Verified)

  • Kirin 990

    • Huawei Mate 30 Pro (LIO)
    • Huawei P40 Pro (ELS)
    • Huawei P40 (ANA)
  • Kirin 980

    • Huawei Nova 5T (YAL)

Fix

The Huawei security update of June 2021 fixes this vulnerability.

Timeline

  • 2020.04.21. Bug reported to Huawei PSIRT
  • 2020.05.18. Huawei PSIRT requests more time to complete analysis
  • 2020.06.09. Huawei PSIRT completes analysis, confirms the issue and the fix plan. In follow-up meetings over the next days, Taszk agrees to a one-year public disclosure embargo.
  • 2021.06.15. Huawei PSIRT assigns CVE-2021-22434, confirms BootROM fix for Kirin 990 via OTA
  • 2021.06.29. OTA distribution of the fix mitigating the vulnerability for Kirin 990 chipset based devices starts
  • 2021.07.08. Huawei confirms that the security bulletin for the issue has been released