Container Formats - OpenKotOR/PyKotor GitHub Wiki
The KotOR engine organizes resources across several container families, but the evidence for them is not all of the same kind. The runtime KEY/BIF relationship is directly anchored in recovered resource-manager routines from KotOR I, KotOR II, and Aurora nwmain.exe, while the detailed on-disk layouts for KEY, BIF, ERF/MOD/SAV, and RIM are described here from PyKotor's parser implementations and data models. [PyKotor key_data.py, PyKotor bif_data.py, PyKotor erf_data.py, PyKotor rim_data.py] CExoResMan::Exists @ (/K1/k1_win_gog_swkotor.exe @ 0x00408bc0, /TSL/k2_win_gog_aspyr_swkotor2.exe @ 0x0061b830, /Other BioWare Engines/Aurora/nwmain.exe @ 0x14018f590) CExoKeyTable::RebuildTable @ (/K1/k1_win_gog_swkotor.exe @ 0x00410260, /TSL/k2_win_gog_aspyr_swkotor2.exe @ 0x006304a0, /Other BioWare Engines/Aurora/nwmain.exe @ 0x14018ccf0)
Operationally, xoreos-tools exposes the same family split: unkeybif treats KEY and BIF as a linked index-plus-payload unit, while unerf and unrim treat ERF/MOD/SAV and RIM as separate archive families with their own archive logic. [Running xoreos-tools, Unkeybif, Unerf, Unrim]
- KEY — Resource Key Index
- BIF — Binary Resource Archive
- ERF — Encapsulated Resource File
- RIM — Resource Image
The KEY file is the master index for the shipped archive layer. In PyKotor's data model, KEY owns a file table of BifEntry rows plus a resource table of KeyEntry rows, and each KeyEntry stores a (ResRef, ResourceType, resource_id) triple whose resource_id splits into a top-12-bit BIF index and a bottom-20-bit resource index. [PyKotor KEY format overview, PyKotor BifEntry, PyKotor KeyEntry]
The runtime side of that relationship is visible in three binaries even when symbol recovery differs. K1 routes CExoResMan::Exists through a named GetKeyEntry, TSL routes the same public probe through a less descriptive helper, and Aurora exposes an override-first step and then a named FindKey table walk. All three shapes still implement the same idea: the fallback archive layer is table-driven, not filename-driven. CExoResMan::Exists @ (/K1/k1_win_gog_swkotor.exe @ 0x00408bc0, /TSL/k2_win_gog_aspyr_swkotor2.exe @ 0x0061b830, /Other BioWare Engines/Aurora/nwmain.exe @ 0x14018f590)
xoreos's unkeybif documentation describes the same division explicitly from the tooling side: the KEY holds filenames and type IDs, the BIF holds file data, one KEY can control several BIFs, listing needs only the KEY, and full extraction needs the referenced BIFs as well [Unkeybif].
Mods do not normally edit the KEY file. Instead, they place content in the override folder or inside a module .mod capsule so the engine finds it before falling back to the KEY/BIF baseline. See Concepts — Resource Resolution Order and Mod Creation Best Practices — File Priority.
- KEY — Resource Key Index
KEY files map resource names (ResRefs) and types to specific locations within BIF containers. KotOR uses chitin.key as the main KEY file which references shipped BIF files, and PyKotor's KeyEntry.bif_index / KeyEntry.res_index properties decode the composite resource_id exactly that way. [PyKotor KeyEntry.bif_index, PyKotor KeyEntry.res_index]
Mods do not normally edit the KEY. The runtime probe in all three binaries checks higher-priority layers before relying on the keyed archive fallback, so shipping override files or module capsules is the normal way to shadow KEY-indexed resources. CExoResMan::Exists @ (/K1/k1_win_gog_swkotor.exe @ 0x00408bc0, /TSL/k2_win_gog_aspyr_swkotor2.exe @ 0x0061b830, /Other BioWare Engines/Aurora/nwmain.exe @ 0x14018f590)
See:
The KEY file, specifically chitin.key in KotOR, serves as the master index for the shipped archive system:
- Resource Lookup: Maps ResRef + Resource Type (hex IDs and labels) -> BIF location
- BIF Registration: Tracks all BIF files and their install paths
- Resource Naming: Provides the filename (ResRef) missing from BIF files
- Drive Mapping: Historical feature indicating which media held each BIF:
The current three-binary evidence directly supports KEY/BIF as a fallback table layer. It does not require mods to rewrite KEY, because higher-priority layers are checked before the keyed archive probe returns success. CExoResMan::Exists @ (/K1/k1_win_gog_swkotor.exe @ 0x00408bc0, /TSL/k2_win_gog_aspyr_swkotor2.exe @ 0x0061b830, /Other BioWare Engines/Aurora/nwmain.exe @ 0x14018f590)
The KEY indexes BIF entries only. Higher-priority sources can shadow KEY-indexed assets without editing the KEY; using override or MOD for that is the normal modding practice exposed both by PyKotor's explicit search layers and by the tri-binary resource-manager probes above. [PyKotor SearchLocation] CExoResMan::Exists @ (/K1/k1_win_gog_swkotor.exe @ 0x00408bc0, /TSL/k2_win_gog_aspyr_swkotor2.exe @ 0x0061b830, /Other BioWare Engines/Aurora/nwmain.exe @ 0x14018f590)
PyKotor's KEY implementation lives in Libraries/PyKotor/src/pykotor/resource/formats/key/.
Cross-reference implementations (line anchors are against master and may drift):
-
PyKotor
- Layout spec:
key_data.pyL1–L55 - Binary I/O:
KEYBinaryReader.load - Binary I/O:
KEYBinaryWriter(io_key.py) - Data model:
BifEntry - Data model:
KeyEntry - Data model:
KEY
- Layout spec:
-
keyreader.cppKeyReader::loadloadFiles/readFileEntryloadKeys/readKeyEntry- Notes: reone lowercases ResRefs and splits
resource_idinto BIF index and resource index. It reads the 8-byte"KEY V1 "signature as one block, then the four table counts/offsets; it does not surface build year/day in this reader (those bytes are still present in KotORchitin.keyafter the offsets).
-
- Aurora KEY:
src/aurora/keyfile.cpp(shared across Aurora-family games) - xoreos-tools (tooling mirror)
- Aurora KEY:
-
-
KEYObject.tsloadFile(header, BIF table, filename resolution, key entries) - Resource id helpers:
getBIFIndex/getBIFResourceIndex(note:getBIFResourceIndexmasks with0x3FFFin this file — the on-disk encoding uses 20 low bits; preferresId & 0xFFFFFto match PyKotor/reone and the table below)
-
-
-
KEYBinaryStructure.csL17–L114 (FileRoot,FileHeader,BIFInfo,KeywithIndexIntoFileTable/IndexIntoResourceTableproperties)
-
-
Andastra / sotor (Rust):
- local parser:
vendor/sotor/core/src/formats/key/read.rs - local model/tests:
vendor/sotor/core/src/formats/key/mod.rs - notes: accepts
KEY V1andKEY V1.1, then decodesresource_idwith the documented 12-bit BIF index / 20-bit resource index split
- local parser:
The file header is 64 bytes in size:
| Name | type | offset | size | Description |
|---|---|---|---|---|
| File Type | Char |
0 (0x00) |
4 |
Always "KEY " (space-padded) |
| File Version | Char |
4 (0x04) |
4 |
"V1 " or "V1.1"
|
| BIF count | UInt32 |
8 (0x08) |
4 |
Number of BIF files referenced |
| KEY count | UInt32 |
12 (0x0C) |
4 |
Number of Resource Entries |
| Offset to File Table | UInt32 |
16 (0x10) |
4 |
Offset to BIF File Entries* Array |
| Offset to KEY Table | UInt32 |
20 (0x14) |
4 |
Offset to Resource Entries Array |
| Build Year | UInt32 |
24 (0x18) |
4 |
Build Year (years since 1900) |
| Build Day | UInt32 |
28 (0x1C) |
4 |
Build Day (days since Jan 1) |
| Reserved | Byte |
32 (0x20) |
32 |
Padding (usually zeros) |
Note on Header Variations: xoreos-docs: specs/torlack/key.html (Tim Smith/Torlack's reverse-engineered documentation) shows the header ending at offset 0x0040 with unknown values at offset 0x0018. The structure shown here (with Build Year/Build Day and Reserved fields) matches the actual KotOR KEY File Format.
The same overall KEY layout is also visible in Kotor.NET's binary structure reader (KEYBinaryStructure.cs L17-L114) and the Torlack/xoreos historical notes (specs/torlack/key.html); the local Andastra vendor/sotor/core Rust reader now follows the same header shape and explicitly accepts both V1 and V1.1.
Each file entry is 12 bytes:
| Name | Type | Offset | Size | Description |
|---|---|---|---|---|
| File Size | UInt32 |
0 (0x00) |
4 |
Size of BIF file on disk |
| Filename Offset | UInt32 |
4 (0x04) |
4 |
Offset into filename table |
| Filename Length | UInt16 |
8 (0x08) |
2 |
Length of filename in bytes |
| Drives | UInt16 |
10 (0x0A) |
2 |
Drive flags (0x0001=HD0, 0x0002=CD1, etc.) |
Drive flags are a legacy feature from the multi-CD distribution era:
| Flag Value | Meaning | Description |
|---|---|---|
0x0001 |
HD (Hard Drive) |
BIF is installed on the hard drive |
0x0002 |
CD1 (CD 1) |
BIF is on the first game disc |
0x0004 |
CD2 (CD 2) |
BIF is on the second game disc |
0x0008 |
CD3 (CD 3) |
BIF is on the third game disc |
0x0010 |
CD4 (CD 4) |
BIF is on the fourth game disc |
In contemporary distributions:
-
All BIF files use
0x0001(HDFlag) since everything is installed locally -
The engine doesn't prompt for disc swapping
-
Multiple flags can be combined (bitwise OR) if a BIF could be on multiple sources (e.g.
0x0001 | 0x0002(HD Flag | CD1 Flag) for a BIF that is on both the hard drive and the first game disc) -
Mod tools typically set this to
0x0001(HDFlag) for all files
The drive system was originally designed so the engine could:
- Prompt users to insert specific CDs when needed resources weren't on the hard drive
- Optimize installation by allowing users to choose what to install vs. run from CD
- Support partial installs to save disk space (common in the early 2000s)
That filename-table and drive-flag record layout is directly reflected in reone's KEY reader (keyreader.cpp loadFiles / readFileEntry L41-L66).
The Filename Table contains null-terminated strings:
| Name | Type | Description |
|---|---|---|
| Filenames | char[] |
null-terminated BIF Filenames (e.g., data/[models](MDL-MDX-File-Format).bif) |
Each KEY entry is 22 (0x16) bytes in size:
| Name | Type | Offset | Size | Description |
|---|---|---|---|---|
| ResRef | char |
0 (0x00) |
16 |
Resource Filename (null-padded, max 16 characters) |
| Resource Type | UInt16 |
16 (0x10) |
2 |
Numeric resource type ID (wiki table; PyKotor ResourceType) |
| Resource ID | UInt32 |
18 (0x12) |
4 |
Encoded Resource Location (see Resource ID Encoding) (e.g. 0x00000005 for the 5th Resource in the 1st BIF) |
The KEY entry structure must use byte or word alignment (1-byte or 2-byte packing). If the structure is packed with 4-byte or 8-byte alignment, the UInt32 at offset 0x12 (18) will be incorrectly placed at offset 0x14 (20), causing incorrect Resource ID decoding.
On non-Intel platforms, this alignment requirement may cause alignment faults unless the compiler provides an "unaligned" type or special care is taken when accessing the UInt32 field. The structure should be explicitly packed to ensure the UInt32 starts at offset 18 (0x12) rather than being aligned to a 4-[byte](https://en.wikipedia.org/wiki/Byte) or 8-[byte](https://en.wikipedia.org/wiki/Byte) boundary.
That byte-level layout matches reone's loadKeys / readKeyEntry, which reads a 16-byte ResRef, a uint16 type, and the unaligned uint32 resource id before splitting it with >> 20 and & 0xFFFFF.
The Resource ID field encodes both the BIF index and resource index within that BIF:
-
bits
31-20: BIF Index (top 12 bits) - index into file table -
bits
19-0: Resource Index (bottom 20 bits) - index within the BIF file
bif_index = (resource_id >> 20) & 0xFFF # top 12 bits
resource_index = resource_id & 0xFFFFF # bottom 20 bitsresource_id = (bif_index << 20) | resource_index- Maximum BIF Files:
4,096(12-bit BIF index) - Maximum Resources per BIF File:
1,048,576(20-bit resource index)
These limits are more than sufficient for KotOR, which typically has:
-
~50-100BIF Files in a full installation -
~100-10,000Resources per BIF File (largest BIF Files are texture packs)
Given Resource ID 0x00123456:
Binary: 0000 0000 0001 0010 0011 0100 0101 0110
|---- 12 bits -----|------ 20 bits ------|
BIF Index: 0x001 (BIF Index: `1`)
Resource Index: 0x23456 (Resource Index: `144,470` within that [BIF](Container-Formats#bif) file)
The encoding allows a single 32-bit integer to precisely locate any resource in the entire BIF system (e.g. 0x00123456 for the 5th resource in the 1st BIF file).
This split is implemented directly in reone's readKeyEntry as bifIdx = resId >> 20 and resIdx = resId & 0xfffff, and Torlack's historical key.html documents the same packed-ID interpretation with worked examples.
PyKotor's KEY coverage is split between KEYBinaryReader.load, KEYBinaryWriter, and the structural notes plus data models in key_data.py; the broader cross-tool comparison remains summarized under File structure overview.
- BIF File Format - Container format indexed by KEY
- ERF File Format - Self-contained containers (MOD/SAV/ERF) and resolution order
- RIM File Format - Stock module archives (resource image)
- RIM versus ERF
- GFF File Format - Common content type resolved via KEY/BIF
- Bioware Aurora KeyBIF - Official BioWare KEY/BIF specification
- Concepts - Resource resolution order, override folder
- Resource formats and resolution - Hex resource type IDs (SSOT table)
- Community sources and archives -- DeadlyStream, forums for override and resolution troubleshooting
BIF files hold the bulk of the game's shipped read-only resources. PyKotor's BIF model stores per-resource resname_key_index, type, size, and offset, explicitly treating the filename side of the relationship as something recovered through the companion KEY file rather than from the BIF alone. [PyKotor BIF overview, PyKotor BIFResource]
That KEY/BIF split is also what the recovered runtime fallback paths imply. The existence probe and KEY-table rebuild routines work in terms of keyed table entries and table maintenance, not in terms of standalone filename-bearing BIF records, which is why the engine can perform a keyed lookup and then seek directly into the chosen archive. CExoResMan::Exists @ (/K1/k1_win_gog_swkotor.exe @ 0x00408bc0, /TSL/k2_win_gog_aspyr_swkotor2.exe @ 0x0061b830, /Other BioWare Engines/Aurora/nwmain.exe @ 0x14018f590) CExoKeyTable::RebuildTable @ (/K1/k1_win_gog_swkotor.exe @ 0x00410260, /TSL/k2_win_gog_aspyr_swkotor2.exe @ 0x006304a0, /Other BioWare Engines/Aurora/nwmain.exe @ 0x14018ccf0)
- BIF — BioWare Index File
BIF files work in tandem with KEY files which provide the filename-to-resource mappings. BIF files contain only resource IDs, types, and data - the actual filenames (ResRefs) are stored in the KEY file. BIF files are containers that store the bulk of game resources.
BIF containers are the primary storage mechanism for game assets. The game organizes resources into multiple BIF files by category:
-
data/models.bif: 3D model files (MDL/MDX) -
data/textures_*.bif: texture data — companion files include TPC and TXI — split across multiple containers -
data/sounds.bif: Audio files (WAV) -
data/2da.bif: Game data tables (2DA files) -
data/scripts.bif: Compiled scripts (NCS) -
data/dialogs.bif: Conversation files (DLG) -
data/lips.bif: LIP-sync animation data (LIP) - Additional platform-specific BIFs (e.g.,
dataxbox/,data_mac/)
The modular structure allows for efficient loading and potential platform-specific optimizations. Resources in BIF files are read-only at runtime; mods override them via the override/ directory or custom MOD or ERF files. The engine loads from BIF only when the resource is not found in override, loaded MOD, or save (see resource resolution order); the KEY file supplies the mapping from ResRef to the correct BIF and offset.
PyKotor's BIF implementation lives in Libraries/PyKotor/src/pykotor/resource/formats/bif/.
Cross-reference implementations (line anchors are against master and may drift):
-
PyKotor:
- layout (header + variable entries + BZF note):
bif_data.pymodule docstring L1–L40 -
BIFType/BIFResource/BIF:bif_data.pyL60–L569 - binary I/O:
BIFBinaryReader(loadL83–L89, signature L91–L109, header L111–L120, resource table L122–L155, payload L157–L179) BIFBinaryWriter- raw LZMA helper for BZF:
_decompress_bzf_payloadL20–L48
- layout (header + variable entries + BZF note):
-
-
bifreader.cppBifReader::loadL27–L31 (expects an 8-byte signature:BIFFV1plus one trailing ASCII space) loadHeaderL34–L41loadResourcesL43–L50readResourceEntryL52–L67
-
-
xoreos —
biffile.cpp(Aurora-family BIF: fixed vs variable resource handling, KEY merge helpers) -
xoreos-tools —
biffile.cpp(CLI / tooling) -
-
BIFObject.tsreadFromDiskL84–L115 — reads 20-byte header and 16-byte variable rows -
getResourceBufferL164–L177.
-
-
Kotor.NET:
BIFBinaryStructure.csFileRoot/FileHeader/VariableResourceL16–L65 — header skips the fixed-resourceuint32withBaseStream.Position += 4before readingOffsetToResources(same on-disk layout as PyKotor/reone). -
Andastra / sotor (Rust):
- local parser:
vendor/sotor/core/src/formats/bif/read.rs - local model/tests:
vendor/sotor/core/src/formats/bif/mod.rs - notes: accepts
BIFF V1andBIFF V1.1, validates that the fixed-resource count is0, and reads requested 16-byte variable-resource rows by absolute payload offset
- local parser:
-
bioware-kaitai-formats — Kaitai Struct specs for BIF and related BioWare containers.
The file header is 20 bytes in size:
| Name | type | offset | size | Description |
|---|---|---|---|---|
| File Type | char | 0 (0x00) | 4 |
"BIFF" for BIF, "BZF " for compressed BIF |
| File Version | char | 4 (0x04) | 4 |
"V1 " for BIF, "V1.0" for BZF |
| Variable Resource count | UInt32 | 8 (0x08) | 4 | Number of variable-size resources |
| Fixed Resource count | UInt32 | 12 (0x0C) | 4 | Fixed-resource count field |
| Offset to Variable Resource Table | UInt32 | 16 (0x10) | 4 | offset to variable resource entries |
The fixed-resource field is a legacy holdover from older Aurora-family descriptions. In the KotOR-oriented implementations cited here, current readers expect this field to remain 0 and read all payloads from the variable-resource table instead. xoreos-docs Torlack bif.html still labels the field at offset 0x000C as unknown; PyKotor rejects fixed_res_count > 0, and the local Andastra vendor/sotor/core Rust reader enforces the same constraint.
References:
-
xoreos
biffile.cppL64–L67 (fixed count must be 0) - reone
loadHeaderL34–L41 - Kotor.NET
FileHeaderL35–L47 -
Torlack
bif.html.
Each variable resource entry is 16 bytes:
| Name | type | offset | size | Description |
|---|---|---|---|---|
| Resource ID | UInt32 |
0 (0x00) | 4 | Resource ID (matches KEY file entry, encodes BIF index and resource index) |
| Offset | UInt32 |
4 (0x04) | 4 | Offset to resource data in file (absolute file offset) |
| File Size | UInt32 |
8 (0x08) | 4 | Uncompressed size of resource data (bytes) |
| Resource type | UInt32 |
12 (0x0C) | 4 | Resource type identifier (hex IDs and labels: Resource Type Identifiers; PyKotor: ResourceType enum) |
Entries are read sequentially from the variable resource table. The table is located at the offset specified in the file header. Each entry is exactly 16 bytes, allowing efficient sequential reading.
References:
- reone
readResourceEntryL52–L67 - Kotor.NET
VariableResourceL49–L64 - xoreos
biffile.cppL84–L96 - PyKotor table loop
io_bif.pyL122–L141.
Resource Data is stored at the offsets specified in the resource table:
| Name | type | Description |
|---|---|---|
| Resource Data | byte[] | Raw binary data for each resource |
Resources are stored sequentially but not necessarily contiguously; each row supplies its own absolute file offset and payload size, and the stored bytes remain in their native format without an additional BIF wrapper.
The resource access flow is:
The engine reads resources through the following process:
- KEY Lookup: Look up the ResRef (and optionally ResourceType) in the KEY file to get the Resource ID
- ID Decoding: Extract the BIF index (upper 12 bits) and resource index (lower 20 bits) from the Resource ID
- BIF Selection: Use the BIF index to identify which BIF file contains the resource
- Table Access: Read the BIF file header to find the offset to the variable resource table
- Entry Lookup: Find the resource entry at the specified index in the variable resource table
- Data Reading: Seek to the offset specified in the entry and read the number of bytes specified by the file size
That lookup flow matches xoreos's biffile.cpp, which reads each variable-resource row and merges the result with KEY metadata, reone's loadResources, and Torlack's bif.html description of the row semantics.
The Resource ID in the BIF file's variable resource table must match the Resource ID stored in the KEY file. The Resource ID is a 32-bit value that encodes two pieces of information:
- Lower 20 bits (bits 0-19): Resource index within the BIF file (0-based index into the variable resource table)
- Upper 12 bits (bits 20-31): BIF index in the KEY file's BIF table (identifies which BIF file contains this resource)
For example, a Resource ID of 0x00400029 decodes as:
- Resource index:
0x29(41st resource in the BIF) - BIF index:
0x004(4th BIF file in the KEY's BIF table)
References:
- KEY File Format
-
Torlack
key.html(worked examples) -
reone
readResourceEntryL54–L56 (readsidfield from each 16-byte row).
BZF Compression
BZF files are LZMA-compressed BIF files used primarily in iOS (and maybe Android) ports of KotOR. The BZF header contains "BZF " + "V1.0", followed by LZMA-compressed BIF data. Decompression reveals a standard BIF structure.
The BZF format wraps a complete BIF file in LZMA compression:
-
BZF header (8 bytes):
"BZF "+"V1.0"signature - LZMA Stream: Compressed BIF file data using LZMA algorithm
- Decompressed Result: Standard BIF file structure (as described above)
The entire BIF file after the 8-byte BZF header is wrapped in an LZMA stream; decompression yields an ordinary BIF payload that can then be parsed through the same header and resource-table logic described above. In practical terms, the wrapper is storage-oriented rather than schema-oriented: tools should decompress first and then interpret the resulting bytes as a standard BIF.
References:
- PyKotor BZF wrapper layout —
bif_data.pyL35–L39 - PyKotor BZF decompression entry —
io_bif.pyL162–L169 - PyKotor LZMA payload decode —
_decompress_bzf_payloadL20–L48 - PyKotor version string (
BIFBinaryReaderacceptsV1+ two ASCII spaces orV1.1in the 8-byte signature for both BIF and BZF) —io_bif.pyL107–L109 - Some distributions describe mobile BZF as
V1.0in prose—verify against real headers if a file fails to load. - xoreos
biffile.hL56–L60 - reone
BifReader::loadL27–L31
KEY File Relationship
BIF files require a KEY file to map resource IDs to filenames (ResRefs). The KEY file contains:
- BIF file entries (filename, size, location)
- KEY entries mapping ResRef + ResourceType (see Resource Type Identifiers) to Resource ID
The Resource ID in the BIF file matches the Resource ID in the KEY File's KEY entries.
See KEY File Format for the matching KEY-side lookup and BIF index encoding.
PyKotor's BIF implementation is centered in io_bif.py for binary read and write paths and bif_data.py for the in-memory model.
- Resource formats and resolution -- Hex resource type IDs
- Container-Formats#key -- BIF indexing and resource resolution
- Container-Formats#erf -- ERF/MOD containers
- GFF-File-Format -- GFF resources in BIF
- RIM File Format — Resource image module archives (contrast with ERF)
- Bioware-Aurora-KeyBIF -- Aurora KEY/BIF specification
ERF is a self-contained resource archive: every entry carries its own ResRef and type, so no external KEY file is needed. The engine uses ERF (and its signature variants MOD, SAV) for module content, saved games, override packs, and any other bundle that must travel as a single file. The header includes an optional localized-string list — primarily used in SAV files to store the save-game description — followed by a key list pointing into a contiguous data block. Shipped modules also use RIM archives, which solve the same naming problem with a simpler binary layout; see RIM versus ERF and the dedicated RIM section.
- ERF — Encapsulated Resource File
ERF files are self-contained containers that store both resource names (ResRefs) and data in the same file. Unlike BIF files which require a KEY file for filename lookups, ERF files include ResRef information directly in the container. When the engine resolves a resource request, it can service from encapsulated containers (MOD/ERF and stock RIM module archives) before falling back to KEY/BIF; see resource resolution order.
For mod developers, the practical rule is simple: build module-scoped changes as .mod capsules and keep global replacements in override. Vanilla modules ship as .rim / _s.rim (RIM), and a .mod in modules/ overrides the same module’s RIM set when present. See Installing Mods with HoloPatcher and HoloPatcher README for Mod Developers.
ERF containers commonly hold GFF, 2DA, TPC, NCS, and other resource types; the main alternative storage families are KEY and BIF.
See:
PyKotor's ERF implementation lives in Libraries/PyKotor/src/pykotor/resource/formats/erf/, with the layout table in erf_data.py, the read path in ERFBinaryReader.load, and the write path in ERFBinaryWriter.write.
Other engines and tools cover the same container family in parallel: reone's erfreader.cpp and erfreader.h parse ERF V1.0 and MOD V1.0 but intentionally skip localized strings and do not expose explicit SAV fourcc handling; KotOR.js's ERFObject.ts covers header, localized strings, keys, resource records, and serialization; Kotor.NET's ERFBinaryStructure.cs reads the core structures but skips the description StrRef and reserved tail; the local Andastra vendor/sotor/core Rust implementation covers the same header, localized-string, key-list, and resource-list read path in src/formats/erf/read.rs with matching in-memory structures in src/formats/erf/mod.rs; xoreos and xoreos-tools both use erffile.cpp style Aurora readers; KotOR-Unity ships its own ERFObject.cs; and the bioware-kaitai-formats project provides declarative ERF specs for code generation. More repo cross-links are cataloged on Home — Cross-reference: other tools and engines.
RIM (resource image) files are the stock module archives under modules/. They are encapsulated containers like ERF/MOD (ResRef + type + bytes in one file) but not the same on-disk structure as this document’s 160-byte ERF header. Tools that only parse ERF must convert or use a RIM-aware reader.
| Feature | RIM (RIM File Format) | ERF / MOD / SAV (this page) |
|---|---|---|
| Signature |
RIM + V1.0
|
ERF / MOD / SAV + V1.0
|
| Header size | 120 bytes | 160 bytes |
| Localized strings, build date, description StrRef | Absent | Present (strings optional) |
| Index layout | One 32-byte record per resource (includes UInt32 type, offset, size) | 24-byte KEY-style list + separate 8-byte offset/size list |
| MOD-only gap | N/A | Optional 8-byte × entry_count zero block between key list and resource list (quirk) |
Both families are loaded as module-side capsules ahead of KEY and BIF for resources they contain; see resource resolution order. PyKotor can turn a loaded RIM into an in-memory ERF via RIM.to_erf() for writing MOD/ERF output.
Field sizes, implicit offset 0 -> table at byte 120, and padding behavior are specified on RIM File Format.
Everything in this section applies to ERF, MOD, and SAV signatures only. RIM uses a shorter header and a different index record layout; do not treat these tables as the RIM specification.
The file header is 160 bytes in size:
| Name | Type | Offset | Size | Description |
|---|---|---|---|---|
| File Type | char | 0 (0x00) | 4 |
"ERF ", "MOD ", "SAV "
|
| File Version | char | 4 (0x04) | 4 | Always "V1.0"
|
| Language count | UInt32 | 8 (0x08) | 4 | Number of localized string entries |
| Localized string size | UInt32 | 12 (0x0C) | 4 | Total size of localized string data in bytes |
| Entry count | UInt32 | 16 (0x10) | 4 | Number of resources in the container |
| Offset to Localized string List | UInt32 | 20 (0x14) | 4 | Offset to localized string entries |
| Offset to KEY List | UInt32 | 24 (0x18) | 4 | Offset to KEY entries array |
| Offset to Resource List | UInt32 | 28 (0x1C) | 4 | Offset to resource entries array |
| Build Year | UInt32 | 32 (0x20) | 4 | Build year (years since 1900) |
| Build Day | UInt32 | 36 (0x24) | 4 | Build day (days since Jan 1) |
| Description StrRef | UInt32 | 40 (0x28) | 4 | TLK string reference for description |
| Reserved | byte | 44 (0x2C) | 116 | Padding (usually zeros) |
The Build Year and Build Day fields timestamp when the ERF file was created:
- Build Year: Years since 1900 (e.g., 103 = year 2003)
- Build Day: Day of year (1-365/366, with January 1 = day 1)
These timestamps are primarily informational and used by development tools to track module versions. The game engine doesn't rely on them for functionality.
For example:
Build Year: 103 --> 1900 + 103 = 2003
Build Day: 247 --> September 4th (the 247th day of 2003)
Most mod tools either zero out these fields or set them to the current date when creating/modifying ERF files.
The Description StrRef field (offset 0x28) varies by file family and observed build conventions:
-
MOD files:
0xFFFFFFFF(-1) is the standard for BioWare modules.-
Exception: TSL LIPS files consistently use
0xCDCDCDCD(Debug Fill). -
Exception: Some KOTOR 1 modules (e.g.
unk_m41series) use0.
-
Exception: TSL LIPS files consistently use
-
SAV files:
0(typically no description) -
NWM files:
-1(Neverwinter Nights module format, NOT used in KotOR) -
ERF files: Unpredictable (may contain valid StrRef or
-1)
The engine determines save-game handling from file context rather than from a dedicated ERF-header flag. That matches PyKotor's full 160-byte header read and layout table, KotOR.js's ERFObject.parseHeader, Kotor.NET's partial header reader, and the historical Torlack RE notes in xoreos-docs (io_erf.py L70-L96, erf_data.py L19-L36, ERFObject.ts L69-L85, ERFBinaryStructure.cs L86-L97, xoreos-docs specs/torlack/mod.html).
Localized strings provide descriptions in multiple languages:
| Name | type | size | Description |
|---|---|---|---|
| Language ID | UInt32 | 4 | Language identifier (Concepts) |
| string size | UInt32 | 4 | Length of string in bytes |
| string data | char[] | N |
windows-1252 encoded text |
ERF localized strings provide multi-language descriptions for the container itself. Most ERF-family files have zero localized strings, while some MOD-family files include localized module names or descriptions for UI-facing contexts. Use the shared Aurora numeric enum and encodings on Concepts for language ids and legacy code pages.
At the evidence level documented here, the resource-loader path does not rely on these strings for encapsulated-content indexing. They remain metadata fields, with the Description StrRef offering an alternative TLK-backed description mechanism in the header.
PyKotor's localized-string block reader and writer and KotOR.js's parseStructures confirm that this list is a simple language id + byte length + text sequence, while reone's ErfReader omits localized-string parsing entirely, so PyKotor and KotOR.js are the better implementation references for this subsection (io_erf.py L122-L143, io_erf.py L235-L240, ERFObject.ts L91-L97).
KEY List
Each KEY entry is 24 bytes and maps ResRefs to resource indices:
| Name | type | offset | size | Description |
|---|---|---|---|---|
| ResRef | char | 0 (0x00) | 16 | Resource filename (null-padded, max 16 chars) |
| Resource ID | UInt32 | 16 (0x10) | 4 | index into resource list |
| Resource type | uint16 | 20 (0x14) | 2 | Resource type ID (table) |
| Unused | uint16 | 22 (0x16) | 2 | Padding |
Resource names are padded with NULL bytes to 16 characters, but are not necessarily null-terminated. If a resource name is exactly 16 characters long, no null terminator exists. Resource names can be mixed case, though most are lowercase in practice.
PyKotor's reader and writer, KotOR.js's key loop, Kotor.NET's KeyEntry reader, reone's readKeyEntry, and Torlack's padding notes all agree on the 24-byte layout; the main behavioral difference is casing, because PyKotor preserves the stored ResRef while reone lowercases it during import (io_erf.py L148-L155, io_erf.py L242-L246, ERFObject.ts L101-L108, ERFBinaryStructure.cs L128-L134, erfreader.cpp L62-L71, xoreos-docs specs/torlack/mod.html).
Each resource entry is 8 bytes:
| Name | type | offset | size | Description |
|---|---|---|---|---|
| offset to data | UInt32 | 0 (0x00) | 4 | offset to resource data in file |
| Resource size | UInt32 | 4 (0x04) | 4 | size of resource data in bytes |
PyKotor, KotOR.js, Kotor.NET, and reone all implement the resource list as a compact (offset, size) array and then use those pairs to seek into the payload region, with PyKotor also mirroring the layout on write (io_erf.py L159-L162, io_erf.py L248-L252, ERFObject.ts L112-L116, ERFBinaryStructure.cs L157-L161, erfreader.cpp L84-L92).
Resource data is stored at the offsets specified in the resource list:
| Name | type | Description |
|---|---|---|
| Resource data | byte[] | Raw binary data for each resource |
Note: For MOD and NWM files only, there exists an unusual block of data between the resource structures (KEY List) and the position structures (Resource List). This block is 8 bytes per resource and appears to be all NULL bytes in practice. This data block is not referenced by any offset in the ERF file header, which is uncharacteristic of BioWare's file format design.
The best published description of this odd padding region is still Torlack's historical Strange Blank data note preserved in xoreos-docs, which is why most modern implementations mention it but do not assign it semantic meaning (xoreos-docs specs/torlack/mod.html).
ERF files come in several variants based on file type:
| file type | Extension | Description |
|---|---|---|
| ERF | .erf |
Generic encapsulated resource file |
| MOD | .mod |
Module file (contains area resources) |
| SAV | .sav |
Save game file (contains saved game state) |
The on-disk 160-byte layout is the same family for shipped KotOR capsules; the first four bytes are not always a distinct SAV type code—PyKotor treats many .sav files as the MOD / V1.0 header pair for typing purposes (ERFType.from_extension). ERF (and SAV when present) are still valid Aurora signatures when they appear.
MOD files package all resources needed for a game module (level/area):
- Area layouts (
.are,.git) - Module information (
.ifo) - Dialogs and scripts (
.dlg,.ncs) - Module-specific 2DA overrides
- Character templates (
.utc,.utp,.utd) - Waypoints and triggers (
.utw,.utt)
The game loads MOD files from the modules/ directory. When entering a module, the engine mounts the MOD container and prioritizes its resources over BIF files but below the override/ folder.
SAV files store complete game state:
- Party member data (inventory, stats, equipped items)
- Module state (spawned creatures, opened containers)
- Global variables and plot flags
- Area layouts with modifications
- Quick bar configurations
- Portrait images
Save files preserve the state of all modified resources. When a placeable is looted or a door opened, the updated .git resource is stored in the SAV file.
.rim files are not ERF binaries: they use a 120-byte header and 32-byte resource records. Compare:
They are the usual shipped module capsules under modules/; MOD overrides the same module when both exist.
Generic ERF files serve miscellaneous purposes:
- texture packs
- Audio replacement packs
- Campaign-specific resources
- Developer test containers
reone's ErfReader::checkSignature shows the usual practical acceptance rule used by tooling: treat ERF V1.0 and MOD V1.0 as the expected shipped signatures once the stream is open, then dispatch the rest of the parse from the shared container layout (erfreader.cpp L42-L52).
Reverse engineering of the game engine (specifically CExoKeyTable::AddEncapsulatedContents at 0x0040f3c0 in swkotor.exe) reveals how the engine actually parses these files.
The engine's resource manager reads the 160-byte header but only a subset of fields are required for encapsulated-content indexing. The currently documented reverse-engineering evidence supports direct use of:
- file type and Version (Verified against expected values)
- Entry Count (Used to allocate memory for the key table)
- offset to KEY List (Used to seek to the key data)
The remaining header fields are metadata-oriented in this loader path:
-
Language countandLocalized string size offset to Localized string List-
Build YearandBuild Day Description [StrRef](Audio-and-Localization-Formats#string-references-strref)
The current reverse-engineering summary here is that save-game handling is contextual rather than driven by a dedicated ERF-header discriminator. File location and resource-system context matter more than the Description StrRef, so setting that field to 0 in a MOD file does not by itself make the file behave like a save capsule.
In Knights of the Old Republic II: The Sith Lords, MOD files related to lip-syncing (lips_*.mod) consistently use 0xCDCDCDCD for the Description StrRef. This value (-842150451) is a common Microsoft C++ debug fill pattern, which suggests those files were produced with a debug-oriented toolchain.
| Component | PyKotor (line anchors) |
|---|---|
| Layout / types (docstring) | erf_data.py L19–L54 |
ERFType |
erf_data.py L107–L137 |
ERF archive |
erf_data.py L140–L253 |
| Binary read | ERFBinaryReader.load L51–L169 |
| Binary write | ERFBinaryWriter.write L186–L256 |
See also Cross-reference at the top of this page for KotOR.js, Kotor.NET, reone, and xoreos.
- Concepts - Language ID enum and encodings (ERF localized strings, TLK, GFF)
- BIF File Format - Container format used with KEY files
- KEY File Format - Index for BIF containers and resource resolution
- GFF File Format - Common content type stored in ERF containers
- RIM File Format — Resource image containers (distinct binary layout
- comparison)
RIM is the lightweight module archive shipped with the base game. Like ERF, each entry carries a ResRef and type, but the binary layout is intentionally simpler: no localized-string list, no compression, just a flat resource table and contiguous data. The engine loads RIM files from modules/, and a .mod in the same directory shadows the corresponding .rim pair when present. Typical capsule contents include:
- Role in the game and modding
- File structure overview
- Binary format
- Module file sets (rim, _s, dlg)
- Relationship to ERF and MOD
- Cross-reference: implementations
Vanilla modules are usually split across one or more .rim files (for example a main archive and a _s supplementary archive). The engine resolves resources using the same high-level rules as for MOD/ERF: module-scoped containers participate in the pipeline described in resource resolution order. A module_name.mod file in modules/ typically shadows the corresponding .rim pair when present. For MOD versus RIM priority and practical editing notes, see:
RIM is appropriate to read and edit when you are working directly with shipped module archives or building tools that must round-trip vanilla layout. Many modders ship changes as .mod (ERF variant) instead so the game loads a single encapsulated file without replacing base RIMs.
A RIM file is a self-contained archive: each stored resource has a ResRef, a resource type id, and raw payload bytes. There is no separate KEY file and no localized description block like ERF's optional string list.
At a high level:
- Fixed 120-byte header (
RIM+V1.0+ counts and offsets). - Resource table: one 32-byte record per resource (ResRef, type, index, offset, size).
- Resource blobs concatenated after the table, with alignment and trailing padding as used by vanilla-style writers.
Unlike ERF, RIM does not split “key” and “resource list” into two parallel arrays of different record sizes: offset and size live in the same 32-byte entry as the name and type.
Offsets are from the start of the file unless noted.
The header is 120 bytes.
| Name | Type | Offset | Size | Description |
|---|---|---|---|---|
| File type | char[4] | 0 | 4 | Always RIM (space-padded fourCC). |
| File version | char[4] | 4 | 4 | Always V1.0. |
| Unknown | UInt32 | 8 | 4 | Observed as 0 in vanilla-style files; treat as reserved. |
| Entry count | UInt32 | 12 | 4 | Number of resources in the archive. |
| Offset to resource table | UInt32 | 16 | 4 | Byte offset to the first resource entry. If 0, tools treat this as 120 (table immediately after the header). |
| Offset to resources / reserved | UInt32 | 20 | 4 | PyKotor’s writer records 0 here (“implicit” layout: data follows the table and alignment padding). Older descriptions sometimes treat parts of the tail of the header as opaque padding; readers should not depend on undocumented flags without verification. |
| Reserved | byte[] | 24 | 96 | Padding (typically zeros) to complete 120 bytes. |
If the offset at 0x10 is 0, compliant readers assume the resource table starts at byte 120.
Each entry is 32 bytes. Field order matches the reader in PyKotor’s RIMBinaryReader (ResRef, then type, then id, then offset, then size).
| Name | Type | Offset | Size | Description |
|---|---|---|---|---|
| ResRef | char[16] | 0 | 16 | Resource name, null-padded to 16 bytes; not necessarily null-terminated if length is 16. |
| Resource type | UInt32 | 16 | 4 | Numeric resource type id (same family of ids as elsewhere in KotOR; see Resource Type Identifiers). |
| Resource id | UInt32 | 20 | 4 | Index or slot id; vanilla writers typically use 0 .. n−1 in order. |
| Offset to data | UInt32 | 24 | 4 | File offset to the first byte of this resource’s payload. |
| Resource size | UInt32 | 28 | 4 | Length of the payload in bytes. |
In ERF, the parallel “key” record uses a UInt16 resource type plus 2 bytes of padding, and offsets/sizes live in a separate 8-byte-per-entry resource list. RIM folds metadata and locating information into this single 32-byte record.
Payloads are stored at the offsets given in the table. PyKotor’s RIMBinaryWriter reproduces a vanilla-style layout:
- After the last resource entry, the writer may insert padding so the start of the first blob aligns to a 16-byte boundary.
- After each resource’s bytes, the writer appends 0–3 padding bytes so the stream is 4-byte aligned, then writes 16 additional zero bytes before the next resource (including after the last resource, so the on-disk file size reflects that trailing region).
Other tools may emit slightly different padding; parsers should trust only the offset/size fields in the table for the extent of each resource.
A single logical module often spans multiple archives:
| Pattern | Typical role |
|---|---|
module.rim |
Main module payload (areas, templates, scripts, textures, etc.). |
module_s.rim |
Supplementary resources for the same module (split for size or organization). |
module_dlg.erf |
(TSL especially) Dialog and related resources in a separate ERF capsule. |
Exact splits vary by module. Kit and extraction tooling that scans modules/ treats .mod as highest priority, then .rim / _s.rim, then other ERFs; see Kit structure documentation.
RIM and ERF solve the same problem—named, typed resources in one file—but they are not bit-identical layouts.
| Topic | RIM | ERF / MOD / SAV |
|---|---|---|
| Header Size | 120 bytes | 160 bytes |
| File type Tag | RIM |
ERF, MOD, SAV
|
| Localized Description Strings | No | Optional block (language id + CP1252 text) |
| Build year/day, Description StrRef | No | Present in ERF header |
| Per-resource Metadata | One 32-byte entry (includes offset + size) | 24-byte key + 8-byte resource entry |
| Resource Type in Entry | UInt32 |
UInt16 + 2 padding bytes |
| MOD “blank block” Quirk | No | Documented between key list and resource list for MOD/NWM in ERF File Format |
For a side-by-side narrative aimed at ERF readers, see RIM versus ERF on the ERF page.
PyKotor exposes RIM.to_erf() to build an in-memory ERF with the same resources, which can then be serialized as MOD/ERF for tools that only speak the ERF layout.
Cross-reference implementations (line anchors are against master and may drift):
-
PyKotor:
- on-disk layout (120-byte header, 32-byte keys):
rim_data.pymodule docstring L1–L45 -
RIMResource/RIM:rim_data.pyL59–L173 - Kaitai load:
io_rim.py_load_rim_from_kaitaiL22–L35 - legacy reader (implicit table offset 120, row order ResRef, UInt32 type, id, offset, size):
_load_rim_legacyL38–L61 _read_rim_entriesL64–L88-
RIMBinaryReader.loadL120–L127 - vanilla-style writer:
RIMBinaryWriter.writeL142–L198
- on-disk layout (120-byte header, 32-byte keys):
-
Kotor.NET:
RIMBinaryStructure.csFileRoot/FileHeader/ResourceEntryL16–L116 — reads the 20-byte logical header then seeksOffsetToResources(vanilla 120). -
rimreader.cppRimReader::loadL27–L35-
readResourceL47–L58 —UInt16type plusskipBytes(6); not the same 32-byte KotOR row as PyKotor/Kotor.NET
-
-
RIMObject.ts— usesUInt16+UInt16after ResRef -
RIM_HEADER_LENGTH = 160(L9) -
34bytes × row count forrimDataOffset(L84–L95) - retail KotOR RIMs follow 120 / 32 as on this page
-
-
Andastra / sotor (Rust):
- local parser:
vendor/sotor/core/src/formats/rim/read.rs - local model/tests:
vendor/sotor/core/src/formats/rim/mod.rs - notes: uses the retail KotOR
120-byte header /32-byte entry layout, including the implicit0 -> 120resource-table offset fallback documented on this page
- local parser:
Community tools and installers (TSLPatcher, HoloPatcher) support inserting or patching files inside RIM capsules as well as ERF/MOD. See:
The KotOR RIM container stores module resources without compression. For ERF-specific fields and engine behavior around MOD/SAV, see ERF File Format.
-
ERF File Format — Encapsulated resource format (MOD, SAV, generic ERF). Compare RIM layout under:
-
Concepts — Resource resolution, override, MOD versus RIM priority
-
KEY File Format — Index format for BIF storage (contrast with self-contained RIM)
-
BIF File Format — Vanilla bulk storage with KEY
-
Holocron Toolset module resources — Practical module/RIM editing notes