Reversing data.wak - noita-player/noitadocs GitHub Wiki

not interested in reverse engineering? click here to skip to the tool release.

header

invalid argument
it's not a stone tablet
you can change it

I still think secrets.png was poorly thought out!

greets research server, feet crew discord

here we go.

time-travel debugging

install "windbg preview" from the microsoft store or download+install its appx to get this tool.

tttracer -dumpFull ./path/to/noita.exe and kill the game when you see the menu being drawn. on a machine with fast NVMe IO, this will take approx 5 minutes. let it sit.

we now have a trace of the game that contains, somewhere, assets being loaded+decrypted. the extremely useful bit here is being able to debug forward and backward. however, we can't change the flow of execution.

data.wak

pretty clear that this 25MB file contains game assets. pop it into binvis.io and check the entropy:

that's high entropy throughout the file. we'll need to reverse how it's encrypted/compressed before we can do anything.
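if you'd rather not upload the file anywhere, the same check can be roughed out locally. this is a minimal sketch, not what binvis.io does internally: it computes Shannon entropy per chunk, and the 4096-byte chunk size and the function names are arbitrary choices made for illustration.

```python
import math
from collections import Counter

def shannon_entropy(data: bytes) -> float:
    """Bits of entropy per byte: 0.0 for constant data, 8.0 for uniform bytes."""
    if not data:
        return 0.0
    total = len(data)
    return sum(-(n / total) * math.log2(n / total) for n in Counter(data).values())

def chunk_entropy(data: bytes, chunk_size: int = 4096):
    """Yield (offset, entropy) per chunk so low-entropy regions stand out."""
    for off in range(0, len(data), chunk_size):
        yield off, shannon_entropy(data[off:off + chunk_size])
```

feed it the contents of data.wak (read in binary mode) and look for chunks well below 8 bits/byte. if there are none, the whole file is compressed or encrypted.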

aes encryption

so, we have a goal. let's pop it into radare3.

findcrypt2 results:

[*] loading crypto constants
[*] searching for crypto constants in .text
0x004BCB38: found sparse constants for MD5
[*] searching for crypto constants in .rdata
0x01495D18: found const array Rijndael_sbox (used in AES)
0x01495E18: found const array Rijndael_inv_sbox (used in AES)
[*] searching for crypto constants in .data
0x0171EC60: found const array zinflate_lengthStarts (used in zlib)
0x0171ECE8: found const array zinflate_lengthExtraBits (used in zlib)
0x0171ED68: found const array zinflate_distanceStarts (used in zlib)
0x0171EDE8: found const array zinflate_distanceExtraBits (used in zlib)
[*] searching for crypto constants in .idata
[*] searching for crypto constants in .tls
[*] searching for crypto constants in immediate operand
[*] finished

it's correct about the AES sboxes being used. we breakpoint on access to find the first use, where AES is initialized the first time.

ba r 4 0x01495E18 ".echo aes_inv accessed"
ba r 4 0x01495D18 ".echo aes_enc accessed"

logically, this breakpoint will drop us into a callstack that is involved in asset en/decryption.

inside aes_init

below is the stack when aes_enc is first accessed. we're somewhere inside an AES library - observe the RetAddr values. 0x4CXXXX/0xFCXXXX is very far from 0x10XXXXX, so that locality can be used to guess which stack frames belong to different libraries.

0:000> kn 5
 # ChildEBP RetAddr  
WARNING: Stack unwind information not available. Following frames may be wrong.
00 0019ed10 004c0640 noita!SDL_main+0x5a71e <-- looks like AES init
01 0019ed24 00fc5720 noita!SDL_main+0x5a2d1 <-- pump the RNG and init
02 0019ed64 010d6190 noita!SDL_main+0xb5f3b1 <-- after init, does AES crypt
03 0019ed74 010d624b noita!SDL_main+0xc6fe21 <-- very interesting.
04 0019ef8c 00fbd2a3 noita!SDL_main+0xc6fedc

I should clarify more about frame 02. some pseudocode:

void __cdecl aes_initone_2(char *a1, int a2, int a3)
{
  j_rand_aes_init(a1, 1);
  j_aes_encrypt_a3_blocks((int)(a1 + 16), a2, a3);
}

the constant passed to j_rand_aes_init (1 here) is used to derive the AES IV. if we search for other callers of j_rand_aes_init, we can see that the game passes either a dynamic value, 1, or 0x7FFFFFFE. with a little reversing, we see they use it as the seed for their rand engine and pull four 32-bit integers from it as the IV.
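the shape of that derivation (seed in, 16 bytes out) can be sketched as follows. big caveat: the text above doesn't say which generator the game uses, so the Park-Miller "minimal standard" LCG here is only a stand-in, and packing the four values little-endian is also an assumption. don't expect these bytes to match the real IV.

```python
import struct

M = 2147483647  # 2^31 - 1, modulus of the stand-in generator

def lcg_stream(seed: int):
    """Park-Miller LCG, used purely as a placeholder for the game's rand engine."""
    state = seed % M or 1          # the generator must not start at 0
    while True:
        state = (state * 16807) % M
        yield state

def derive_16_bytes(seed: int) -> bytes:
    """Pull four 32-bit integers from the seeded stream and pack them as 16 bytes."""
    gen = lcg_stream(seed)
    words = [next(gen) for _ in range(4)]
    return struct.pack("<4I", *words)   # little-endian packing is an assumption
```

the point is only that the IV is a pure function of the seed: the same seed always gives the same 16 bytes, which is why knowing the seed is enough.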

here's some annotated pseudocode for frame 03. both initone functions are identical:

  a2 = 123;
  j_get_16_bytes_random(&aesone_obj, 0x165EC8F);
  j_get_16_bytes_random(&aestwo_obj, 0x165EC8F);
  j_aes_initone_duplicate(&aesone_obj, &a2, 16); 
  j_aes_initone(&aestwo_obj, &a2, 16); // @ 0x10d624b (frame 3 return addr)

putting it together

we set some breakpoints around the functions above to log:

# they AES encrypt the integer 123 (0x7b) when constructing WizardPak for whatever reason
bytes_data   = h2b("7b 00 00 00 00 00 00 00-00 00 00 00 00 00 00 00")
# odd 16 byte value that came from the PRNG in frame 3
bytes_key    = h2b("c3 d2 ba e7 c3 f3 62 9a-17 53 71 d6 b1 f5 05 aa")
# from a breakpoint between PRNG loop and AES init
bytes_iv     = h2b("d2 97 e4 d6 e9 46 ab b9-ed 46 bc 9b 2e 3e d4 e5")
# just looking at what memory changed after the call, we see this random-looking 16 bytes
bytes_result = h2b("a5 58 e1 18 56 10 cf 4d-d2 49 01 bf a9 3e f2 74")

running some quick shitty python, we find the AES mode (OFB and CTR are the same if your data is a single block):
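a check along these lines can be sketched with no dependencies. this is not the original script: it's a small pure-python AES-128 block encrypt plus a comparison of the logged values against three candidates. for a single block, CTR, OFB and CFB all reduce to data XOR E(key, iv), which is why they can't be told apart here. it assumes the CTR initial counter block is the IV used as-is.

```python
def h2b(s):
    """Parse windbg-style hex ('7b 00-00 ...') into bytes."""
    return bytes.fromhex(s.replace("-", " "))

def xtime(a):
    """Multiply by 2 in GF(2^8) with the AES polynomial 0x11B."""
    a <<= 1
    return a ^ 0x11B if a & 0x100 else a

def gmul(a, b):
    """General GF(2^8) multiply, only needed to build the S-box."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        a, b = xtime(a), b >> 1
    return r

def build_sbox():
    sbox = []
    for x in range(256):
        inv = 1
        for _ in range(254):            # x^254 is the inverse of x (0 stays 0)
            inv = gmul(inv, x)
        s = inv
        for shift in range(1, 5):       # affine step: xor of four rotations
            s ^= ((inv << shift) | (inv >> (8 - shift))) & 0xFF
        sbox.append(s ^ 0x63)
    return sbox

SBOX = build_sbox()

def expand_key(key):
    """AES-128 key schedule: returns 11 round keys of 16 bytes each."""
    w = [list(key[i:i + 4]) for i in range(0, 16, 4)]
    rcon = 1
    for i in range(4, 44):
        t = list(w[i - 1])
        if i % 4 == 0:
            t = [SBOX[b] for b in t[1:] + t[:1]]
            t[0] ^= rcon
            rcon = xtime(rcon)
        w.append([a ^ b for a, b in zip(w[i - 4], t)])
    return [sum(w[r * 4:r * 4 + 4], []) for r in range(11)]

def mix_column(c):
    """MixColumns on one column: out[i] = 2*c[i] ^ 3*c[i+1] ^ c[i+2] ^ c[i+3]."""
    return [xtime(c[i]) ^ xtime(c[(i + 1) % 4]) ^ c[(i + 1) % 4]
            ^ c[(i + 2) % 4] ^ c[(i + 3) % 4] for i in range(4)]

def aes128_encrypt_block(key, block):
    rks = expand_key(key)
    s = [b ^ k for b, k in zip(block, rks[0])]
    for rnd in range(1, 11):
        s = [SBOX[b] for b in s]                              # SubBytes
        s = [s[(i + 4 * (i % 4)) % 16] for i in range(16)]    # ShiftRows
        if rnd != 10:                                         # no MixColumns in last round
            s = sum((mix_column(s[c:c + 4]) for c in range(0, 16, 4)), [])
        s = [b ^ k for b, k in zip(s, rks[rnd])]              # AddRoundKey
    return bytes(s)

def xor(a, b):
    return bytes(x ^ y for x, y in zip(a, b))

def matching_modes(key, iv, data, result):
    """Return the names of the candidate modes that reproduce `result`."""
    candidates = {
        "ECB": aes128_encrypt_block(key, data),
        "CBC": aes128_encrypt_block(key, xor(data, iv)),
        "CTR/OFB/CFB": xor(data, aes128_encrypt_block(key, iv)),
    }
    return [name for name, out in candidates.items() if out == result]

bytes_data   = h2b("7b 00 00 00 00 00 00 00-00 00 00 00 00 00 00 00")
bytes_key    = h2b("c3 d2 ba e7 c3 f3 62 9a-17 53 71 d6 b1 f5 05 aa")
bytes_iv     = h2b("d2 97 e4 d6 e9 46 ab b9-ed 46 bc 9b 2e 3e d4 e5")
bytes_result = h2b("a5 58 e1 18 56 10 cf 4d-d2 49 01 bf a9 3e f2 74")
print(matching_modes(bytes_key, bytes_iv, bytes_data, bytes_result))
```

an empty list would mean none of the three guesses fits (different key size, a tweaked counter block, etc.), in which case the logged values need a second look.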

finding WAK parsing

so we know how their encryption scheme works, and how they generate the key (they pull 16 bytes from a PRNG with a specific seed). looking at xrefs to that method, there are only a few that are called during WizardPak construction.

this is becoming rambling, so I'm skipping to findings. apologies if you're new to reversing trying to follow along. it decrypts the 16 byte header first:

decrypted first 16 bytes of data.wak
29a4a020  00 00 00 00 0e 23 00 00-28 2c 07 00 00 00 00 00  .....#..(,......

this represents a struct of

struct wak_header {
    uint32_t unk0;
    uint32_t unk1;
    uint32_t file_list_size; // in bytes
    uint32_t unk2;
}
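as a quick sanity check, the 16 decrypted bytes from the dump above can be unpacked into that struct. a minimal sketch: the fields are read as little-endian uint32 (x86), and the WakHeader and parse_wak_header names are made up for this example.

```python
import struct
from typing import NamedTuple

class WakHeader(NamedTuple):
    unk0: int
    unk1: int
    file_list_size: int   # in bytes, per the struct above
    unk2: int

def parse_wak_header(raw: bytes) -> WakHeader:
    # "<4I" = four little-endian unsigned 32-bit integers
    return WakHeader(*struct.unpack("<4I", raw[:16]))

# the decrypted header bytes from the windbg dump above
header = parse_wak_header(bytes.fromhex("00000000 0e230000 282c0700 00000000"))
print(header)
```

for this dump, file_list_size comes out as 0x72C28 and unk1 as 0x230E.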

Noita will then decrypt the file list, which is the below structure pseudocode repeated for file_list_size bytes.

struct file_list_entry {
    uint32_t offset; // position in data.wak
    uint32_t size;   // length
    uint32_t path_len;
    char path[path_len]; 
}
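walking that variable-length table can be sketched like so. it assumes the buffer has already been decrypted, that paths are not NUL-terminated, and that there's no padding between entries; none of that is confirmed above beyond the struct layout. parse_file_list is a name invented for this example.

```python
import struct

def parse_file_list(buf: bytes):
    """Parse a decrypted file list into (offset, size, path) tuples."""
    entries = []
    pos = 0
    while pos + 12 <= len(buf):
        # three little-endian uint32 fields: offset, size, path_len
        offset, size, path_len = struct.unpack_from("<3I", buf, pos)
        pos += 12
        path = buf[pos:pos + path_len].decode("ascii", errors="replace")
        pos += path_len
        entries.append((offset, size, path))
    return entries
```

the position of each tuple in the returned list is the entry's index in the table.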

the file list is then iterated to create what looks like a linked list of objects containing the above data. so, these values are shoved into a new allocation (if you wanted to follow them to see where they're accessed).

so, find a file that the game has to load, like file: data/materials.xml offset: 78C62, and set a break-on-access on the address where its offset is stored. you'll notice it's accessed within a block that takes a lock on the datawak data. because we already reversed the AES stuff, it's pretty clear what is going on:

// .text:00F6D6C7
  if ( cur_obj != *endlist )
  {
    v10 = Mtx_lock(datawak_object + 248);
    if ( v10 )
      std::_Throw_C_error(v10);
    v21 = 1;
    file_data_location = *(_DWORD *)datawak_object + cur_obj->offset_from_table;
    unk0 = (fileentry_mem *)(*(_DWORD *)datawak_object + cur_obj->offset_from_table);
    if ( LOBYTE(cur_obj_0->field_34) )
    {
      j_aes_for_loading(datawak_object + 32, file_data_location, cur_obj_0->size_from_table, cur_obj_0->index_in_table);
      file_data_location = (int)unk0;
      LOBYTE(cur_obj_0->field_34) = 0;
    }
    a2->field_0 = file_data_location;
    a2->field_4 = cur_obj_0->size_from_table;
    v21 = -1;
    v12 = Mtx_unlock(datawak_object + 248);
    if ( v12 )
      std::_Throw_C_error(v12);
  }

that's right! the index of the file in the file-list table is the seed for their rand-AES thing. now we can decrypt all the files in the WAK.
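put together, extraction is just a loop over the table. a rough sketch, not wakman's actual code: derive_iv and crypt are hypothetical callables standing in for the game's seeded-PRNG IV derivation and its AES routine, and whether the index starts at 0 is an assumption.

```python
def extract_all(wak: bytes, entries, key: bytes, derive_iv, crypt):
    """Decrypt every file in the WAK.

    entries   -- list of (offset, size, path) tuples from the file list
    derive_iv -- hypothetical: maps a table index to a 16-byte IV
    crypt     -- hypothetical: crypt(key, iv, blob) -> decrypted bytes
    """
    files = {}
    for index, (offset, size, path) in enumerate(entries):
        iv = derive_iv(index)                      # table index seeds the IV
        files[path] = crypt(key, iv, wak[offset:offset + size])
    return files
```

keeping the two crypto pieces as parameters makes it easy to swap in real implementations once the PRNG and AES mode are pinned down.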

data
│   credits.txt
│   genome_relations.csv
│   icon.bmp
│   magic_numbers.xml
│   magic_numbers_disable_debug.xml
│   materials.xml
│   secrets_secrets_secrets.png
│   
├───biome
│   │   boss_arena.xml
│   │   boss_limbs_arena.xml
│   │   boss_victoryroom.xml
│   │   bridge.xml
...
│   └───orbrooms
│           orbroom_00.xml
│           orbroom_01.xml
│           orbroom_02.xml
│           orbroom_03.xml
│           orbroom_04.xml
│           orbroom_05.xml
│           orbroom_06.xml
│           orbroom_07.xml
│           orbroom_08.xml
│           orbroom_09.xml
│           orbroom_10.xml
│           orbroom_11.xml
│           
├───biome_impl
│   │   acidtank.png
│   │   acidtank_2.png
...

│   ├───caves
│   │   │   brush_03.png
│   │   │   brush_04.png
...

│   │   └───generated
│   │           cave_0.png
...

re-wak-ing

noita.exe -wizard_pak will stick the data directory into a new wak (overwriting your current one, careful). easy!

wakman.exe --noita path_to_noita.exe --pak path_to_folder_containing_icon.bmp will do this for you

windbg appendix

dumping ground for misc useful windbg commands

ba r 4 0x01495E18 ".echo aes_inv accessed" <-- set a hwbp on read/write
ba r 4 0x01495D18 ".echo aes_enc accessed"

- logging key/IV-seed for dynamic AES calls
bp 00F6D710 "dps @esp L3; .printf \"arg1: %p - data:\", poi(@esp); .echo \"\n\"; db poi(@esp) L20; .printf \"arg2: %p - data:\", poi(@esp+4); .echo \"\n\"; db poi(@esp+4) L20;"

- useful for pausing when you need to set a break-on-access on a certain WAK file object they construct.
bp 010d71d0 ".printf \"file: %ma offset: %X\n\", (@ebp-110h), poi(@esi+28h)"

utils appendix

findcrypt2

poro.sig - a FLIRT sig set for their "run_poro" core engine built with vc12+release+/MT