Working with Binary Files - X-Hax/sa_tools GitHub Wiki

Binary Files

In Sonic Adventure 1/2 and their ports, most assets are compiled into the game's executable or other binary files. The formats of these files differ depending on the platform and the version of the game. SA Tools can rip ("split") the assets from binary files automatically for several supported titles. See the SA Tools Hub to get started with creating projects that split content out of the currently supported titles and versions. You can also work with binary files directly.

To open a level, a model or an animation stored in a binary file, you need to know several things, which are explained below.

Compression

If the file has a PRS extension, it is compressed. You can still open the file directly in the tools, but if you want to inspect it in a hex editor you will need to decompress it first. Some REL files in SADX Gamecube are also compressed. You can tell that a REL file is compressed by the presence of the ASCII string SaCompGC near the beginning of the file.

You can use ArchiveTool from the command-line tools to decompress PRS and REL files. If you use SAMDL to open models from these files, they will be decompressed automatically.
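As a rough illustration of the two checks described above, the following Python sketch tests for a PRS extension and looks for the SaCompGC marker. The function names and the 64-byte scan window are assumptions made for this example (the text only says the string is "near the beginning"). It is not how SA Tools detects compression, and it does not decompress anything.

```python
SACOMP_MAGIC = b"SaCompGC"  # ASCII marker found in compressed SADX Gamecube REL files


def has_prs_extension(filename: str) -> bool:
    """Return True if the file name ends in .prs (case-insensitive)."""
    return filename.lower().endswith(".prs")


def looks_like_compressed_rel(data: bytes, scan_window: int = 0x40) -> bool:
    """Return True if the marker lies entirely within the first scan_window bytes.

    scan_window is a guess for this sketch, not a documented value.
    """
    return SACOMP_MAGIC in data[:scan_window]
```

A file that passes either check would then be decompressed with ArchiveTool before inspecting it in a hex editor.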

Address and Key

The tools need the address (offset) of the level or model in the binary file to be able to open it. In addition, a binary key is needed. The binary key is the address where the binary file is loaded in memory in the actual game. Some of the keys are listed below:

File (SA1 DC)             Key
Level (STG) files         C900000
ADV0100.BIN               C920000
ADV0130.BIN               C920000
AL_GARDEN00.BIN           CB80000
AL_GARDEN01.BIN           CB80000
AL_GARDEN02.BIN           CB80000
Cutscene (EV_) files      CB80000
AL_RACE.BIN               CB80000
ADVERTISE.BIN             8C900000
MOVIE.BIN                 8CEB0000
TIKAL_PROG.BIN            CB00000
S_SBMOT.BIN               CB08000
Other MOT files           CC00000
Other files               C900000

File (SA2 DC)             Key
SA2 STG files             8C500000
Event (EV_) files         C600000

Other Files               Key
sonic.exe (SADX)          400000
DLL files                 10000000
1ST_READ.BIN (Dreamcast)  8C010000
REL files (Gamecube)      C900000
SonicApp.exe (X360)       82000000

SA Tools use a fixed key of 0xC900000 for fixing REL pointers. Use this key for any REL file from any game.
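Because the key is the address at which the file is loaded in memory, a pointer stored inside the file can be turned into a file offset by subtracting the key. The sketch below shows that arithmetic. The function names are made up for this example, and the sample pointer value is hypothetical.

```python
def pointer_to_offset(pointer: int, key: int) -> int:
    """Convert an in-memory pointer to an offset within the binary file."""
    offset = pointer - key
    if offset < 0:
        raise ValueError(f"pointer 0x{pointer:X} is below key 0x{key:X}")
    return offset


def offset_to_pointer(offset: int, key: int) -> int:
    """Convert a file offset back to the address the game would use."""
    return offset + key


SONIC_EXE_KEY = 0x400000  # key for sonic.exe (SADX) from the table above

# Hypothetical pointer value, used only to show the arithmetic.
example_offset = pointer_to_offset(0x401000, SONIC_EXE_KEY)
```

With the sonic.exe key, the hypothetical pointer 0x401000 corresponds to file offset 0x1000. The same subtraction applies to any other key in the table.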

Start offset

In SADX X360, the actual data starts at 0xC800 rather than 0x0. The "Start Offset" field in SAMDL can be used to specify where the data starts without changing the key.
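One way to picture the start offset is as a number of leading bytes to skip before the usual key arithmetic is applied. The sketch below assumes exactly that behavior. This is a guess at the effect of the "Start Offset" field, not a description of SAMDL's code, and the helper names are hypothetical.

```python
X360_START_OFFSET = 0xC800   # where the data starts in SADX X360, per the text above
X360_KEY = 0x82000000        # key for SonicApp.exe (X360) from the key table


def skip_start_offset(data: bytes, start_offset: int) -> bytes:
    """Drop the leading bytes so that position 0 is where the real data begins."""
    if start_offset > len(data):
        raise ValueError("start offset is past the end of the file")
    return data[start_offset:]


def pointer_to_raw_file_offset(pointer: int, key: int, start_offset: int) -> int:
    """Position of a pointer's target in the untrimmed file (assumed model)."""
    return (pointer - key) + start_offset
```

Under this assumption, an address equal to the key itself would point at byte 0xC800 of the original file, which is why the key does not need to change.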

Byte order

All Dreamcast and most PC files are Little Endian. Gamecube, X360 and PS3 files are Big Endian. Some files in SA2 PC are Big Endian.
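The practical effect of byte order is that the same four bytes decode to different values. A minimal sketch using Python's standard struct module follows; the helper name read_u32 is made up for this example.

```python
import struct


def read_u32(data: bytes, offset: int, big_endian: bool) -> int:
    """Read an unsigned 32-bit integer at offset in the requested byte order."""
    fmt = ">I" if big_endian else "<I"
    return struct.unpack_from(fmt, data, offset)[0]


raw = bytes([0x00, 0x00, 0x90, 0x0C])
as_little = read_u32(raw, 0, big_endian=False)  # 0x0C900000
as_big = read_u32(raw, 0, big_endian=True)      # 0x0000900C
```

Reading a Dreamcast file as Big Endian (or a Gamecube file as Little Endian) therefore yields pointers that fall far outside the range expected for the key.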

"Reversed" byte order for colors and UVs

  • In SADX Gamecube basic models, material and vertex colors are stored as RGBA as opposed to the regular Big Endian order of ARGB or the regular Little Endian order of BGRA.
  • In SADX Gamecube basic models, UVs are stored in the Little Endian order: VU.
  • Some chunk models in SA2B Gamecube and SA2 PC (such as Chao Garden trees) store vertex colors as RGBA (SA2B) or ARGB (SA2 PC).

Use the "Reverse" checkbox in SAMDL when you want to open such models.
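To make the difference between the color layouts concrete, the sketch below labels four stored bytes according to a given channel order. The decode_color helper is hypothetical and only illustrates the byte orders named above; it does not reproduce what the "Reverse" checkbox does internally.

```python
def decode_color(raw: bytes, order: str) -> dict:
    """Map four stored bytes to channels, where order names them as stored.

    order is a permutation of the letters A, R, G, B, for example
    "ARGB" (regular Big Endian), "BGRA" (regular Little Endian) or "RGBA".
    """
    if len(raw) != 4:
        raise ValueError("a color is exactly four bytes")
    if sorted(order) != sorted("ARGB"):
        raise ValueError("order must be a permutation of ARGB")
    return dict(zip(order, raw))


stored = bytes([0xFF, 0x80, 0x40, 0x20])
as_rgba = decode_color(stored, "RGBA")  # R=0xFF, G=0x80, B=0x40, A=0x20
as_argb = decode_color(stored, "ARGB")  # A=0xFF, R=0x80, G=0x40, B=0x20
```

Reading RGBA data as ARGB shifts every channel by one position, which is why such models show wrong colors or transparency until the correct order is selected.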

Level and Model Formats

There are several level and model formats used in these games. Some games use more than one model format. Here is a brief overview:

Basic format (SA1 format)

This is the Ninja Basic format described in the KATANA SDK. It's used in the following games:

  • SA1 Dreamcast
  • SADX Gamecube, including the prototypes (colors are stored as RGBA and UVs are stored as VU)
  • SA2 Dreamcast (collision only)
  • SA2B (collision only)
  • SA2 PC (collision only)

Basic+ format (also known as "BasicDX" or "SADX" format)

This is a variation of the Basic format that adds several fields to the meshset and model structures. It's used in the following games:

  • SADX PC (2004)
  • SADX PC (Steam)
  • SADX X360
  • SADX PS3

Although this format is also known as "BasicDX", the Gamecube version of SADX uses the regular Basic format instead. The "DX" format only applies to the 2004 PC port and its derivatives.

Chunk format

This is another model format from the KATANA SDK. It's used in the following games:

  • SA2 Dreamcast
  • SA2B
  • SA2 PC
  • SADX (Cream cameo and Chao related models only)

Ginja format (also known as "GC" or "SA2B" format)

This is a proprietary format used in Billy Hatcher and some other Gamecube titles. It's used in the following games:

  • SA2B
  • SA2 PC

Although this format is sometimes called "GC", SADX Gamecube doesn't use it. SA2B and SA2 PC use both Chunk and Ginja models, as well as Basic models for level collision. In SA2B and SA2 PC, a level can be either in the old "SA2" format, which doesn't use Ginja models, or in the new "SA2B" format, which uses both Chunk and Ginja models.

Data structures

Model (Attach)

This structure contains the model radius for clipping as well as pointers to vertex and material data. The exact layout of the structure is different depending on the model format. See NJS_MODEL for Basic, NJS_MODEL_SADX for Basic+, NJS_CNK_MODEL for Chunk.

Object

This is NJS_OBJECT from Katana SDK. This structure contains a pointer to the Model (Attach) structure, as well as object flags and position, rotation and scale data. Objects support a node-based hierarchy so they can have "child" and "sibling" objects.
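The following Python sketch reads one object structure from a byte buffer. The field order and sizes (32-bit flags, 32-bit model pointer, three floats for position, three 32-bit integer angles for rotation, three floats for scale, then 32-bit child and sibling pointers, 0x34 bytes in total) follow the layout commonly given for NJS_OBJECT, but this page does not spell it out, so treat the layout as an assumption. The class and function names are made up for this example.

```python
import struct
from typing import NamedTuple, Tuple

OBJECT_SIZE = 0x34  # assumed size: 4 + 4 + 12 + 12 + 12 + 4 + 4 bytes


class NinjaObject(NamedTuple):
    flags: int
    model_ptr: int                       # pointer to the Model (Attach), 0 if none
    position: Tuple[float, float, float]
    rotation: Tuple[int, int, int]       # raw integer angle values
    scale: Tuple[float, float, float]
    child_ptr: int                       # 0 if there is no child
    sibling_ptr: int                     # 0 if there is no sibling


def read_object(data: bytes, offset: int, big_endian: bool = False) -> NinjaObject:
    """Unpack one object at offset, using the assumed layout described above."""
    endian = ">" if big_endian else "<"
    f = struct.unpack_from(endian + "II3f3i3fII", data, offset)
    return NinjaObject(f[0], f[1], f[2:5], f[5:8], f[8:11], f[11], f[12])
```

The pointers returned here are still memory addresses. Subtracting the binary key gives the file offsets of the model, child and sibling structures.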

Motion

This is NJS_MOTION from Katana SDK. It contains node animation data.

Action

This is NJS_ACTION from Katana SDK. It consists of a pointer to an NJS_OBJECT and another pointer to an NJS_MOTION.
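Since an action is just a pair of pointers, reading one is a small exercise in applying the byte order and the key together. The sketch below assumes two consecutive 32-bit pointers, object first and motion second; the read_action name is hypothetical.

```python
import struct
from typing import Tuple


def read_action(data: bytes, offset: int, key: int,
                big_endian: bool = False) -> Tuple[int, int]:
    """Return the file offsets of the object and the motion an action refers to.

    Assumes the action is two consecutive 32-bit pointers: object, then motion.
    """
    endian = ">" if big_endian else "<"
    object_ptr, motion_ptr = struct.unpack_from(endian + "II", data, offset)
    return object_ptr - key, motion_ptr - key
```

The two returned offsets can then be passed to whatever reads the object and motion structures from the same file.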

Finding assets in binary files when the address is unknown

If you know the binary key but don't know the address of the asset you're looking for, you can use the Scanner in Data Toolbox to scan the binary file and extract assets from it.