Information Security News

Rizin + Unicorn: How nightMARE Builds a High-Performance Malware Analysis Pipeline

Abstract

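This report presents an in-depth technical analysis of nightMARE, a Python-based library designed for scalable malware research and configuration extraction. Developed by Elastic Security Labs, nightMARE builds on reverse engineering frameworks such as Rizin and emulation engines such as Unicorn to streamline static and dynamic analysis of malicious binaries. The report covers nightMARE's architecture, its key modules (analysis, core, malware), and its use in extracting the configuration of complex malware families such as LUMMA Stealer. Code examples and an architecture diagram illustrate the library's capabilities in disassembly, pattern matching, memory access, and function emulation, and show how it helps automate complex malware analysis tasks.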

1. Introduction

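Malware keeps evolving, and analyzing it and extracting threat intelligence requires precise, efficient tooling. Traditional manual reverse engineering is time-consuming and resource-intensive. To address these challenges, the nightMARE library was developed as a robust Python-based framework that improves the scalability and effectiveness of malware analysis [1]. This report examines nightMARE's technical foundations, focusing on its design principles, core features, and practical use in dissecting complex malware behavior.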

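nightMARE aims to consolidate the basic functionality needed for reverse engineering and malware analysis into a single library, reducing code duplication and improving maintainability. Its primary use is building configuration extractors for common malware families. The library's architecture, and in particular its integration with powerful third-party tools, makes it a useful component of automated malware analysis pipelines.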

2. nightMARE Architecture and Core Components

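nightMARE is structured as a modular, extensible platform for malware analysis. Its design leans on existing reverse engineering frameworks to provide both static and dynamic analysis capabilities. The project is organized into three main modules: analysis, core, and malware [1].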

2.1. Integration with Rizin

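The foundation of nightMARE's static analysis capability is its integration with Rizin, an open-source reverse engineering framework forked from the Radare2 project [1]. Rizin's speed, modularity, and broad feature set make it a well-suited backend for binary analysis. nightMARE interfaces with Rizin through the rz-pipe module, which lets Python scripts send commands to a Rizin instance and retrieve the results. This integration hides much of Rizin's complexity, so researchers can use its features without deep familiarity with its command-line interface.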

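The reversing module in nightMARE's analysis component wraps Rizin's functionality. This abstraction provides a simplified API for common reverse engineering tasks, including disassembly, pattern matching, and memory access. Key functions exposed by the Rizin class include: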

# Disassembling
def disassemble(self, offset: int, size: int) -> list[dict[str, typing.Any]]
def disassemble_previous_instruction(self, offset: int) -> dict[str, typing.Any]
def disassemble_next_instruction(self, offset: int) -> dict[str, typing.Any]
# Pattern matching
def find_pattern(
    self,
    pattern: str,
    pattern_type: Rizin.PatternType) -> list[dict[str, typing.Any]]
def find_first_pattern(
    self,
    patterns: list[str],
    pattern_type: Rizin.PatternType) -> int
# Reading bytes
def get_data(self, offset: int, size: int | None = None) -> bytes
def get_string(self, offset: int) -> bytes
# Reading words
def get_u8(self, offset: int) -> int
...
def get_u64(self, offset: int) -> int
# All strings, functions
def get_strings(self) -> list[dict[str, typing.Any]]
def get_functions(self) -> list[dict[str, typing.Any]]
# Xrefs
def get_xrefs_from(self, offset: int) -> list
def get_xrefs_to(self, offset: int) -> list[int]

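These functions make it possible to interact with binary code programmatically and to automate analysis workflows such as identifying specific instruction sequences, extracting embedded data, and tracing code execution paths through cross-references.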
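To make the idea of wildcard hex patterns concrete, here is a minimal, standalone sketch of how a pattern with `?` nibble wildcards can be matched against raw bytes. It does not use Rizin and is not how nightMARE or Rizin implement `find_pattern`; the function names `hex_pattern_to_regex` and `find_pattern_offsets` are hypothetical and exist only for this illustration.

```python
import re


def hex_pattern_to_regex(pattern: str) -> re.Pattern:
    """Compile a hex pattern with '?' nibble wildcards into a bytes regex."""
    if len(pattern) % 2:
        raise ValueError("pattern must contain an even number of nibbles")
    parts = []
    for i in range(0, len(pattern), 2):
        hi, lo = pattern[i], pattern[i + 1]
        if hi == "?" and lo == "?":
            parts.append(b".")  # fully wildcarded byte
        elif lo == "?":
            # High nibble fixed: 16 candidate byte values
            values = [(int(hi, 16) << 4) | n for n in range(16)]
            parts.append(b"(?:" + b"|".join(re.escape(bytes([v])) for v in values) + b")")
        elif hi == "?":
            # Low nibble fixed: 16 candidate byte values
            values = [(n << 4) | int(lo, 16) for n in range(16)]
            parts.append(b"(?:" + b"|".join(re.escape(bytes([v])) for v in values) + b")")
        else:
            parts.append(re.escape(bytes([int(hi + lo, 16)])))
    # DOTALL so that '.' also matches the newline byte 0x0a
    return re.compile(b"".join(parts), re.DOTALL)


def find_pattern_offsets(data: bytes, pattern: str) -> list[int]:
    """Return the offset of every non-overlapping match of pattern in data."""
    return [m.start() for m in hex_pattern_to_regex(pattern).finditer(data)]


if __name__ == "__main__":
    sample = bytes.fromhex("90b838a24400b9010203")
    print(find_pattern_offsets(sample, "b838?24400"))
```

This sketch returns file offsets into a flat buffer, whereas the extractors in section 3 read an "address" field from Rizin's results; it uses the same `?` notation as the LUMMA patterns shown there.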

2.2. Emulation Module

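For dynamic analysis and behavior emulation, nightMARE includes an emulation module built on the Unicorn engine [1]. The module provides lightweight PE emulation, primarily through the WindowsEmulator class. Unlike full-featured emulators, nightMARE's emulator focuses on executing code snippets or short function sequences, prioritizing simplicity and speed over complete operating system emulation.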

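The WindowsEmulator class provides abstractions for loading PE files, manipulating the stack, and allocating memory. It also exposes the underlying Unicorn engine directly for advanced use cases. Key methods include: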

# Load PE and its stack
def load_pe(self, pe: bytes, stack_size: int) -> None
# Manipulate stack
def push(self, x: int) -> None
def pop(self) -> int
# Simple memory management mechanisms
def allocate_memory(self, size: int) -> int
def free_memory(self, address: int, size: int) -> None
# Direct ip and sp manipulation
@property
def ip(self) -> int
@property
def sp(self) -> int
# Emulate call and ret
def do_call(self, address: int, return_address: int) -> None
def do_return(self, cleaning_size: int = 0) -> None
# Direct unicorn access
@property
def unicorn(self) -> unicorn.Uc

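The emulator also supports two kinds of hooks: generic Unicorn hooks and Import Address Table (IAT) hooks. These let researchers intercept and modify the behavior of specific API calls during emulation, which helps in understanding what a piece of malware does without actually running it.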

# Set unicorn hooks, however the WindowsEmulator instance get passed to the callback instead of unicorn
def set_hook(self, hook_type: int, hook: typing.Callable) -> int:
# Set hook on import call
def enable_iat_hooking(self) -> None:
def set_iat_hook(
        self,
        function_name: bytes,
        hook: typing.Callable[[WindowsEmulator, tuple, dict[str, typing.Any]], None],
) -> None:

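A practical demonstration hooks the Sleep function in a Windows binary (for example, DismHost.exe). This lets an analyst observe or modify the sleep duration; delaying execution with Sleep is a common anti-analysis technique in malware [1].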

# coding: utf-8
import pathlib

from nightMARE.analysis import emulation


def sleep_hook(emu: emulation.WindowsEmulator, *args) -> None:
    print(
        "Sleep({} ms)".format(
            emu.unicorn.reg_read(emulation.unicorn.x86_const.UC_X86_REG_RCX)
        ),
    )
    emu.do_return()


def main() -> None:
    path = pathlib.Path(r"C:\Windows\System32\Dism\DismHost.exe")
    emu = emulation.WindowsEmulator(False)
    emu.load_pe(path.read_bytes(), 0x10000)
    emu.enable_iat_hooking()
    emu.set_iat_hook("KERNEL32.dll!Sleep", sleep_hook)
    emu.unicorn.emu_start(0x140006404, 0x140006412)


if __name__ == "__main__":
    main()

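The do_return function is essential for keeping execution flow correct after a hook: it lets emulation resume at the instruction following the intercepted call [1].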
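The call and return bookkeeping described above can be illustrated with a toy stack model. This is a sketch only: `ToyStackMachine` is a hypothetical class written for this example, it models a 32-bit little-endian stack, and it is not nightMARE's WindowsEmulator implementation. The method names mirror the signatures listed earlier so the correspondence is easy to follow.

```python
import struct


class ToyStackMachine:
    """Hypothetical 32-bit stack model; not nightMARE's WindowsEmulator."""

    WORD = 4

    def __init__(self, stack_base: int, stack_size: int) -> None:
        self.stack_base = stack_base
        self.memory = bytearray(stack_size)
        self.sp = stack_base + stack_size  # the stack grows downward
        self.ip = 0

    def push(self, value: int) -> None:
        self.sp -= self.WORD
        offset = self.sp - self.stack_base
        self.memory[offset : offset + self.WORD] = struct.pack("<I", value)

    def pop(self) -> int:
        offset = self.sp - self.stack_base
        (value,) = struct.unpack_from("<I", self.memory, offset)
        self.sp += self.WORD
        return value

    def do_call(self, address: int, return_address: int) -> None:
        # A call pushes the return address, then jumps to the target
        self.push(return_address)
        self.ip = address

    def do_return(self, cleaning_size: int = 0) -> None:
        # A return pops the return address into ip; with a callee-cleanup
        # convention it also discards cleaning_size bytes of arguments
        self.ip = self.pop()
        self.sp += cleaning_size


if __name__ == "__main__":
    vm = ToyStackMachine(stack_base=0x1000, stack_size=0x100)
    vm.push(1000)                   # one argument, e.g. a sleep duration
    vm.do_call(0x401000, 0x400010)  # enter the hooked import
    vm.do_return(cleaning_size=4)   # what a hook does when it is finished
    print(hex(vm.ip), hex(vm.sp))
```

If a hook returned without restoring ip and sp this way, emulation would continue inside the import stub rather than at the caller's next instruction.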

2.3. Malware Module

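The malware module in nightMARE contains algorithm implementations specific to various malware families. These cover configuration extraction, cryptographic functions, and sample unpacking routines, and they serve as practical examples of how to use the analysis module for real-world malware analysis [1]. The article lists several supported malware families, including Blister, Ghostpulse, LUMMA, Netwire, and Redlinestealer [1].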

3. Case Study: LUMMA Stealer Configuration Extraction

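LUMMA Stealer, also known as LUMMAC2, is a widespread information-stealing malware characterized by control flow obfuscation and data encryption, which complicate both static and dynamic analysis [1]. nightMARE provides a robust way to extract its configuration. The process breaks down into several technical steps.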

3.1. Initializing the ChaCha20 Context

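LUMMA Stealer uses ChaCha20 for encryption. The first step in configuration extraction is to identify and collect the ChaCha20 key and nonce used by the malware. This is done by pattern matching the specific instruction sequence in the binary that initializes the ChaCha20 context [1].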

from nightMARE.analysis import reversing

# Sizes are not given in the listing; these values are inferred from the
# slices used in build_crypto_context below
CHACHA20_KEY_SIZE = 0x20
CHACHA20_NONCE_SIZE = 0x8
CHACHA20_CTX_SIZE = 0x40
CRYPTO_SETUP_PATTERN = "b838?24400b???????00b???0???0096f3a5"


def get_decryption_key_and_nonce(binary: bytes) -> tuple[bytes, bytes]:
    # Load the binary in Rizin
    rz = reversing.Rizin.load(binary)
    # Find the virtual address of the pattern
    if not (x := rz.find_pattern(CRYPTO_SETUP_PATTERN, reversing.Rizin.PatternType.HEX_PATTERN)):
        raise RuntimeError("Failed to find crypto setup pattern virtual address")
    # Extract the key and nonce address from the instruction second operand
    crypto_setup_va = x[0]["address"]
    key_and_nonce_address = rz.disassemble(crypto_setup_va, 1)[0]["opex"]["operands"][
        1
    ]["value"]
    # Return the key and nonce data (the nonce is stored right after the key)
    return rz.get_data(key_and_nonce_address, CHACHA20_KEY_SIZE), rz.get_data(
        key_and_nonce_address + CHACHA20_KEY_SIZE, CHACHA20_NONCE_SIZE
    )


def build_crypto_context(key: bytes, nonce: bytes, initial_counter: int) -> bytes:
    # 0x40-byte context: key at 0x10, counter at 0x30, nonce at 0x38
    crypto_context = bytearray(0x40)
    crypto_context[0x10:0x30] = key
    crypto_context[0x30] = initial_counter
    crypto_context[0x38:0x40] = nonce
    return bytes(crypto_context)

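The get_decryption_key_and_nonce function uses Rizin's pattern matching to locate the relevant instruction, takes the address of the key and nonce from that instruction's second operand, and reads both values from that address. The build_crypto_context function then builds the ChaCha20 context structure.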
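The context layout can be checked with a small round-trip sketch. It reproduces the offsets used by build_crypto_context (key at 0x10, counter at 0x30, nonce at 0x38) and adds a hypothetical `parse_crypto_context` helper that is not part of the article. Two assumptions are made here: the key and nonce sizes are inferred from the slice bounds, and the counter is treated as a little-endian 32-bit word, which produces the same bytes as the single-byte write for small values such as 2.

```python
import struct

CHACHA20_KEY_SIZE = 0x20   # assumed from the 0x10:0x30 slice
CHACHA20_NONCE_SIZE = 0x8  # assumed from the 0x38:0x40 slice
CHACHA20_CTX_SIZE = 0x40


def build_crypto_context(key: bytes, nonce: bytes, initial_counter: int) -> bytes:
    """Same layout as the listing above, with explicit size checks."""
    if len(key) != CHACHA20_KEY_SIZE or len(nonce) != CHACHA20_NONCE_SIZE:
        raise ValueError("unexpected key or nonce size")
    ctx = bytearray(CHACHA20_CTX_SIZE)
    ctx[0x10:0x30] = key
    struct.pack_into("<I", ctx, 0x30, initial_counter)
    ctx[0x38:0x40] = nonce
    return bytes(ctx)


def parse_crypto_context(ctx: bytes) -> tuple[bytes, int, bytes]:
    """Read (key, counter, nonce) back out of a 0x40-byte context."""
    if len(ctx) != CHACHA20_CTX_SIZE:
        raise ValueError("unexpected context size")
    (counter,) = struct.unpack_from("<I", ctx, 0x30)
    return ctx[0x10:0x30], counter, ctx[0x38:0x40]


if __name__ == "__main__":
    context = build_crypto_context(bytes(range(32)), b"\xaa" * 8, 2)
    print(context.hex())
```

The size checks matter because assigning a slice of a different length to a bytearray silently resizes it, which would shift every later field. The ordering of key, counter, and nonce follows the usual ChaCha20 state arrangement with a 64-bit nonce, while the first 0x10 bytes, where the standard constants would sit, are left zeroed by the listing.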

3.2. Locating the Decryption Function

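The LUMMA decryption function can usually be identified by its proximity to the WinHTTP imports. A hex pattern derived from the function's first bytes is used to locate its address in the binary [1].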

DECRYPTION_FUNCTION_PATTERN = "5553575681ec1?0100008b??243?01000085??0f84??080000"
def get_decryption_function_address(binary: bytes) -> int:
    if x := reversing.Rizin.load(binary).find_pattern(
        DECRYPTION_FUNCTION_PATTERN, reversing.Rizin.PatternType.HEX_PATTERN
    ):
        return x[0]["address"]
    raise RuntimeError("Failed to find decryption function address")

This function uses Rizin's pattern matching to pinpoint the virtual address of the decryption routine.

3.3. Locating the Base Address of the Encrypted Domains

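Following cross-references (xrefs) from the identified decryption function helps locate where the encrypted domains are passed in for decryption. This approach works because, unlike other LUMMA functions, the decryption function is not called through obfuscated indirection [1].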

C2_LIST_MAX_LENGTH = 0xFF
C2_SIZE = 0x80
C2_DECRYPTION_BRANCH_PATTERN = "8d8?e0?244008d7424??ff3?565?68????4500e8????ffff"
def get_encrypted_c2_list(binary: bytes) -> list[bytes]:
    rz = reversing.Rizin.load(binary)
    address = get_encrypted_c2_list_address(binary)
    encrypted_c2 = []
    for ea in range(address, address + (C2_LIST_MAX_LENGTH * C2_SIZE), C2_SIZE):
        encrypted_c2.append(rz.get_data(ea, C2_SIZE))
    return encrypted_c2
def get_encrypted_c2_list_address(binary: bytes) -> int:
    rz = reversing.Rizin.load(binary)
    if not len(
        x := rz.find_pattern(
            C2_DECRYPTION_BRANCH_PATTERN, reversing.Rizin.PatternType.HEX_PATTERN
        )
    ):
        raise RuntimeError("Failed to find c2 decryption pattern")
    c2_decryption_va = x[0]["address"]
    return rz.disassemble(c2_decryption_va, 1)[0]["opex"]["operands"][1]["disp"]

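The get_encrypted_c2_list_address function identifies the base address of the encrypted Command and Control (C2) domain list by analyzing the instruction that leads to the decryption call. get_encrypted_c2_list then extracts the raw encrypted C2 entries.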
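The extraction loop amounts to reading fixed-size records from a known base. The sketch below shows the same iteration on a flat bytes object so it can run without Rizin; `split_fixed_records` is a hypothetical helper, and it uses plain buffer offsets rather than virtual addresses. How `rz.get_data` behaves when a read runs past mapped data is not stated in the article, so the early stop on a short record is a choice made for this example only.

```python
C2_LIST_MAX_LENGTH = 0xFF
C2_SIZE = 0x80


def split_fixed_records(
    blob: bytes,
    start: int,
    record_size: int = C2_SIZE,
    max_records: int = C2_LIST_MAX_LENGTH,
) -> list[bytes]:
    """Slice up to max_records complete records of record_size bytes from blob."""
    records = []
    for offset in range(start, start + max_records * record_size, record_size):
        record = blob[offset : offset + record_size]
        if len(record) < record_size:
            break  # ran past the end of the data; drop the partial record
        records.append(record)
    return records


if __name__ == "__main__":
    image = bytes(0x10) + b"A" * C2_SIZE + b"B" * C2_SIZE + b"C" * 0x10
    print(len(split_fixed_records(image, 0x10)))
```

Reading the maximum of 0xFF slots up front is harmless in the original flow because the decryption step stops at the first entry that does not decrypt to a printable string.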

3.4. Decrypting the Domains Using Emulation

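Rather than reimplementing LUMMA's customized ChaCha20 decryption logic, nightMARE uses its emulation module to directly call the malware's own decryption function inside the binary. This approach works well for complex cryptographic routines [1].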

# We need the right initial value, before decrypting the domain
# the function is already called once so 0 -> 2
CHACHA20_INITIAL_COUNTER = 2
def decrypt_c2_list(
    binary: bytes, encrypted_c2_list: list[bytes], key: bytes, nonce: bytes
) -> list[bytes]:
    # Get the decryption function address (step 2)
    decryption_function_address = get_decryption_function_address(binary)
    # Load the emulator, True = 32bits
    emu = emulation.WindowsEmulator(True)
 
    # Load the PE in the emulator with a stack of 0x10000 bytes
    emu.load_pe(binary, 0x10000)
    
    # Allocate the chacha context
    chacha_ctx_address = emu.allocate_memory(CHACHA20_CTX_SIZE)
    
    # Write at the chacha context address the crypto context
    emu.unicorn.mem_write(
        chacha_ctx_address,
        build_crypto_context(
            key,
            nonce,
            CHACHA20_INITIAL_COUNTER, 
        ),
    )
    decrypted_c2_list = []
    for encrypted_c2 in encrypted_c2_list:
        # Allocate buffers
        encrypted_buffer_address = emu.allocate_memory(C2_SIZE)
        decrypted_buffer_address = emu.allocate_memory(C2_SIZE)
        
        # Write encrypted c2 to buffer
        emu.unicorn.mem_write(encrypted_buffer_address, encrypted_c2)
        # Push arguments
        emu.push(C2_SIZE)
        emu.push(decrypted_buffer_address)
        emu.push(encrypted_buffer_address)
        emu.push(chacha_ctx_address)
 
        # Emulate a call
        emu.do_call(decryption_function_address, emu.image_base)
        # Fire!
        emu.unicorn.emu_start(decryption_function_address, emu.image_base)
        # Read result from decrypted buffer
        decrypted_c2 = bytes(
            emu.unicorn.mem_read(decrypted_buffer_address, C2_SIZE)
        ).split(b"\x00")[0]
        # If result isn't printable we stop, no more domain
        if not bytes_re.PRINTABLE_STRING_REGEX.match(decrypted_c2):
            break
        # Add result to the list
        decrypted_c2_list.append(b"https://" + decrypted_c2)
        # Clean up the args
        emu.pop()
        emu.pop()
        emu.pop()
        emu.pop()
        # Free buffers
        emu.free_memory(encrypted_buffer_address, C2_SIZE)
        emu.free_memory(decrypted_buffer_address, C2_SIZE)
        # Repeat for the next one ...
    return decrypted_c2_list

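This function orchestrates the whole emulation process: it loads the PE into the emulator, allocates memory for the ChaCha20 context and the buffers, writes the encrypted data, pushes the arguments onto the emulated stack, calls the malware's decryption function, and finally reads the decrypted C2 domain from emulated memory. This approach considerably simplifies extraction from heavily obfuscated malware.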
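The post-processing at the end of each loop iteration (truncate at the first NUL byte, stop at the first non-printable result, prefix the scheme) can be isolated into a small sketch that runs without an emulator. The regular expression below is a hypothetical stand-in for `bytes_re.PRINTABLE_STRING_REGEX`, whose actual definition is not shown in the article and may differ, and `collect_domains` is a name invented for this example.

```python
import re

# Hypothetical stand-in for bytes_re.PRINTABLE_STRING_REGEX; the real
# pattern in nightMARE may differ
PRINTABLE_STRING_REGEX = re.compile(rb"[\x20-\x7e]+")


def collect_domains(decrypted_buffers: list[bytes]) -> list[bytes]:
    """Keep NUL-terminated printable strings until the first invalid entry."""
    domains = []
    for buffer in decrypted_buffers:
        candidate = buffer.split(b"\x00")[0]
        if not PRINTABLE_STRING_REGEX.fullmatch(candidate):
            break  # the first non-printable result marks the end of the list
        domains.append(b"https://" + candidate)
    return domains


if __name__ == "__main__":
    buffers = [b"example.com\x00\x00junk", b"c2.example.net\x00", b"\xff\xfe\x00"]
    print(collect_domains(buffers))
```

Stopping at the first invalid entry matters because the extractor reads the maximum number of slots, so the unused slots after the real domains decrypt to noise.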

4. nightMARE Project Structure Diagram

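To further illustrate nightMARE's modular design, the following diagram depicts its high-level project structure and the relationships among its core modules [1].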

graph TD
    A[nightMARE Library] --> B(Analysis Module)
    A --> C(Core Module)
    A --> D(Malware Module)
    B --> B1(Reversing Module)
    B --> B2(Emulation Module)
    B1 --> R(Rizin Framework)
    B2 --> U(Unicorn Engine)
    C --> C1(Bitwise Operations)
    C --> C2(Integer Casting)
    C --> C3(Regex for Config Extraction)
    D --> D1(Malware Family Algorithms)
    D1 --> D1a(LUMMA Stealer)
    D1 --> D1b(Redlinestealer)
    D1 --> D1c(Other Families)
    D --> D2(Crypto Functions)
    D --> D3(Unpacking Routines)
    style A fill:#f9f,stroke:#333,stroke-width:2px
    style B fill:#bbf,stroke:#333,stroke-width:2px
    style C fill:#bbf,stroke:#333,stroke-width:2px
    style D fill:#bbf,stroke:#333,stroke-width:2px

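The diagram highlights how the Analysis Module, powered by Rizin and Unicorn, provides the foundational capabilities on which the Malware Module implements detection and extraction logic for specific threats. The Core Module provides general-purpose utilities used throughout the library.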

5. Conclusion

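The nightMARE library is a significant step forward for automated malware analysis. By integrating open-source reverse engineering tools such as Rizin and emulation engines such as Unicorn, it gives researchers a flexible and efficient platform for developing sophisticated malware configuration extractors. Its modular design and abstraction layers simplify complex tasks such as binary disassembly, memory manipulation, and function hooking. The detailed LUMMA Stealer case study demonstrates nightMARE's usefulness in dissecting heavily obfuscated malware and extracting key intelligence indicators, improving the overall effectiveness of threat analysis operations.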

References

[1] NightMARE on 0xelm Street, a guided tour. Elastic Security Labs.