Decompiled THK Specification - Ezekial711/MonsterHunterWorldModding GitHub Wiki

Basic Language Features

F-Node Naming and Aliasing

F-Nodes decompile from nodes to relatively formulaic names; however, it's possible both to rename F-Nodes freely and to give them aliases. F-Node definition syntax follows a quasi-Pythonic format:

Def F-NodeName

EndF

The EndF directive also has aliases: EndFunction and EndDef. F-Nodes can be given simultaneous alternate names through aliasing.

Def F-NodeName0 & F-NodeName1 & ... & F-NodeNameN

EndF

Aliasing allows renaming an F-Node without having to refactor all references to it in separate files.

When decompiling through a thkl, the decompiler can replace all F-Node indices with F-Node names. When some thks are missing, or when decompiling at thk level, the decompiler still converts node indices into F-Node names; however, references to non-decompiled F-Nodes are kept in direct index format, and F-Nodes are assigned a coercion index.

A coercion index next to an F-Node name tells the compiler that the F-Node must compile to a specific node index. Coerced F-Nodes are compiled into nodes first and slotted into the indices they indicate. If the file does not have enough nodes to fill all of the empty spots below those indices, the compiler generates empty nodes to guarantee the indices are met.
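The slotting rule above can be sketched in Python. This is a minimal illustration, not the compiler's actual code: the function name, the `EMPTY` placeholder, and the choice to give uncoerced F-Nodes the lowest free indices in source order are all assumptions made for the example.

```python
from typing import Optional

EMPTY = "<empty node>"  # hypothetical placeholder for a compiler-generated empty node


def assign_node_indices(fnodes: list[tuple[str, Optional[int]]]) -> list[str]:
    """Place coerced F-Nodes at their indices, then fill the gaps.

    `fnodes` holds (name, coercion_index_or_None) pairs in source order.
    """
    slots: dict[int, str] = {}
    # Coerced F-Nodes go first, at exactly the index they request.
    for name, index in fnodes:
        if index is not None:
            if index in slots:
                raise ValueError(f"index {index} claimed by {slots[index]} and {name}")
            slots[index] = name
    # Assumption: uncoerced F-Nodes take the lowest free indices, in source order.
    free = 0
    for name, index in fnodes:
        if index is None:
            while free in slots:
                free += 1
            slots[free] = name
    # Every index still unclaimed below the highest one becomes an empty node.
    size = max(slots) + 1 if slots else 0
    return [slots.get(i, EMPTY) for i in range(size)]
```

With one F-Node coerced to index 3 and two uncoerced ones, the sketch yields four nodes, one of which is an empty filler.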

Coercion is performed through the "meta" operator which is the character @. Throughout the language @ is used to indicate meta-syntactical elements that are required for the full editability of thk that would otherwise impair the legibility of most of the format's code.

An example of a coerced and aliased F-Node:

Def Function_52 & SummonEcliptic & GlobalEclipticCallBeforeDeath @ 126
	...
EndF

This F-Node can be called through any of its 3 names or through a direct index call to 126 (more on this further ahead).
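To make the header format concrete, here is a small Python sketch that splits a Def line into its names and optional coercion index. The function name is hypothetical, and the sketch assumes the coercion index is written in decimal, as in the example above; the spec does not say whether other bases are accepted.

```python
import re
from typing import Optional

# "Def", one or more names separated by "&", then an optional "@ index".
DEF_HEADER = re.compile(r"\s*Def\s+(.+?)(?:\s*@\s*(\d+))?\s*")


def parse_def_header(line: str) -> tuple[list[str], Optional[int]]:
    """Split a Def header into its list of names and its coercion index (or None)."""
    match = DEF_HEADER.fullmatch(line)
    if match is None:
        raise ValueError(f"not a Def header: {line!r}")
    names = [name.strip() for name in match.group(1).split("&")]
    if not all(names):
        raise ValueError(f"empty alias in: {line!r}")
    # Assumption: the index is a decimal integer literal.
    index = int(match.group(2)) if match.group(2) is not None else None
    return names, index
```

Applied to the header above, the sketch returns the three names and the index 126; a header without @ returns None for the index.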

F-Segment Format

The language attempts to preserve the principle of 1 segment = 1 line of code. This principle is held at the low and mid level of the language design. For the most part segments are heavily syntactically sugared, with function calls dressed through the F-Extensions-Y system and all irrelevant variables nulled from the context. The most general syntax has the form:

FlowControl FunctionType -> Action => Directive @ Additional_Parameter : Value, Additional_Parameter : Value, ... \\ Comments

Every element is optional. An empty line will not compile into a segment; it will simply be omitted.
The syntax attempts to mirror (except in one specific regard) how the game reads thks.
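As a rough illustration of how the general form breaks down, the following Python sketch splits one segment line on its operators. It is a naive splitter under stated assumptions: the function name and the returned keys are hypothetical, parameters are assumed to use the `Name : Value` form shown above, and no attempt is made to handle operators nested inside brackets or parentheses.

```python
def split_segment(line: str) -> dict[str, object]:
    """Split one segment line into its optional parts (all may be empty)."""
    code, _, comment = line.partition("\\\\")   # two backslashes start a comment
    code, _, meta = code.partition("@")         # "@" starts the additional parameters
    code, _, directive = code.partition("=>")   # "=>" introduces the directive/call
    head, _, action = code.partition("->")      # "->" introduces the action
    params: dict[str, str] = {}
    for pair in (p.strip() for p in meta.split(",")):
        if pair:
            key, _, value = pair.partition(":")
            params[key.strip()] = value.strip()
    return {
        "head": head.strip(),            # flow control and function/condition
        "action": action.strip(),
        "directive": directive.strip(),
        "params": params,
        "comment": comment.strip(),
    }
```

Because every part is optional, an empty line simply produces empty fields, which matches the rule that it does not become a segment.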

It's possible to continue on the next line using the line continuation character \:

FlowControl FunctionType -> Action \
						 => Directive \
						 @ Additional_Parameter : Value, \
							Additional_Parameter : Value, \
							... \\ Comments

This code is identical to the one specified before.
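One way a front end could fold continued lines back into single logical lines is sketched below in Python. The function name is hypothetical, and the sketch assumes that a line carrying a comment is never itself continued, since the spec does not define that case.

```python
COMMENT = "\\\\"      # two backslashes start a comment
CONTINUATION = "\\"   # one trailing backslash continues the line


def join_continuations(lines: list[str]) -> list[str]:
    """Merge continued physical lines into single logical lines."""
    logical: list[str] = []
    pending = ""
    for raw in lines:
        code, marker, _ = raw.partition(COMMENT)
        stripped = code.rstrip()
        # Assumption: a comment ends the logical line, so only
        # comment-free lines can end in a continuation character.
        if not marker and stripped.endswith(CONTINUATION):
            pending += stripped[:-1].strip() + " "
            continue
        logical.append(pending + raw.strip())
        pending = ""
    if pending:  # the input ended on a dangling continuation
        logical.append(pending.rstrip())
    return logical
```

The sketch strips indentation from every piece, so a continued segment comes out as one flat line in the general form.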

Flow Control

Flow Control statements come in 3 varieties:

Node Level Flow Control of the form:

return
reset
repeat

return goes back to the F-Node that called the current F-Node. reset goes to the main F-Node of the thk that called the current F-Node. repeat goes to the first segment of the current F-Node.

Conditional Branching Flow Control of the form:

If condition -> action => call
	code1
Elif condition -> action => call
	code2
Else 
	code3
EndIf

The Elif and Else blocks are optional, as are the condition -> action => call elements. Alternatively:

If condition -> action => call
	code1
Elif condition -> action => call
	code2
Else 
	code3
endwith  condition -> action => call

which is functionally equivalent to

If condition -> action => call
	code1
Elif condition -> action => call
	code2
Else 
	code3
	condition -> action => call
EndIf

This form exists solely to preserve the decompilation->compilation identity, because of Capcom design decisions. This is the syntactical exception alluded to earlier.
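The equivalence between the two forms can be expressed as a simple rewrite. The Python sketch below is only an illustration of that equivalence, not something the compiler is documented to do: the function name is hypothetical, and matching the endwith keyword case-insensitively is an assumption.

```python
def desugar_endwith(lines: list[str]) -> list[str]:
    """Rewrite `endwith <segment>` as an indented segment followed by EndIf."""
    out: list[str] = []
    for line in lines:
        keyword, _, rest = line.strip().partition(" ")
        # Assumption: the keyword is matched case-insensitively.
        if keyword.lower() == "endwith":
            indent = line[: len(line) - len(line.lstrip())]
            if rest.strip():
                # The trailing segment moves into the last branch body.
                out.append(f"{indent}\t{rest.strip()}")
            out.append(f"{indent}EndIf")
        else:
            out.append(line)
    return out
```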

Inline conditionals

As is well known, the following:

self.enrage_time_left().leq(22) -> generic.take_off => Global.Function_52

acts as an inline conditional where if the condition is satisfied the action is called and then flow jumps to the global thk function named.

The above rule showcases a few of the intended language features:

F-Extensions-Y

F-Extensions-Y provide a prettified way of calling the check functions. The compiler-decompiler provides a default set that can be dynamically extended.

self.enrage_time_left().leq(parameter2)

A non-prettified version is also legal and is written thus:

self.function_2B(0,22)

F-Extensions-Y are covered in their own spec document.

Action Syntax

Actions can be referred to in a multitude of ways, intended to allow flexible use cases. Actions can be tied to Monster Action Namespaces, tied to variable names, or referred to absolutely.

Monster Action Namespaces allow importing libraries of action names tied to each monster. Furthermore, a generic library is also provided that performs more sophisticated mapping over action names, allowing one to call "generic actions" which are mapped at compile time, based on the thkl and thk properties, to concrete action indices for the target monster. This allows portability of action calls.

Monster Action Namespaces are called through their qualified import namespace name and the action name:

Importing the special generic action library

importActions generic
...
generic.take_off

Importing a specific monster's namespace (requires qualification through as, because of the characters in the name)

importActions Safi'Jiiva as safi
...
safi.take_off

Declaring that the thk belongs to a monster and then referring to the monster namespace. If the thk has no monster declaration, the compiler falls back to the thkl monster declaration.

monster Safi'jiiva
...
monster.take_off

We can bind an action to a variable and then use said variable.

attack_var = safi.take_off
...
self.above_area() -> attack_var => Function_31

Library files can leave unbound variables or function calls by prefacing them with the local keyword, which are then populated by those using the library. For example:

Library File

Def lib_func
	self.above_area() -> local.attack_var => local.Function_31
EndF

Implementation File 1

importLibrary Lib
attack_var = safi.take_off

Def test
	=> Lib.lib_func
EndF

Implementation File 2

importLibrary Lib
attack_var = safi.land

Def test
	=> Lib.lib_func
EndF

These will compile to

Implementation File 1

Def test
	=> library_import
EndF

Def library_import
	self.above_area() -> safi.take_off => Function_31
EndF

and Implementation File 2

Def test
	=> library_import
EndF

Def library_import
	self.above_area() -> safi.land => Function_31
EndF

When the file using the library invokes library functions, the variable names resolve to their local bindings. This enables reverse dependency injection and treating library files as frameworks. Take note that the Global thk has no such feature. Because it's statically compiled into a thk, all of its bindings must be resolved in some shape or form, and thus it cannot be treated like a generic library. Its function space is, however, imported with the same syntax, importLibrary Global; this resolves to external calls, not to mirrored local copies of the library F-Nodes.

Libraries imported at thkl level are available to all thks that the thkl calls. Calls to the library from thks are resolved as calls to the global thk, and all of the necessary library code is mirrored into the global thk.
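The local binding step from the library example can be sketched as a text substitution. This Python snippet is an illustration only: the function name is hypothetical, and it assumes that a local name with no variable binding refers to something the importing file defines under the same name (such as an F-Node), so only the local. prefix is dropped, as in the compiled output shown earlier.

```python
import re

LOCAL_NAME = re.compile(r"\blocal\.([A-Za-z_]\w*)")


def resolve_locals(library_line: str, bindings: dict[str, str]) -> str:
    """Replace each `local.name` with the importing file's binding for `name`."""
    # Unbound names keep their bare name (assumption, see lead-in).
    return LOCAL_NAME.sub(lambda m: bindings.get(m.group(1), m.group(1)), library_line)
```

Running the library segment through this function with each implementation file's bindings reproduces the two compiled segments from the example.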

It's also possible to refer to a monster's action by raw value:

ACTION#2B

There's one more way of doing this; however, it is, again, heavily disincentivized because the compiler treats this syntax separately and cannot reason about it:

@action_id = 0x2B

The meta-syntactical operator allows manually setting ANY field. In the case that the left side and the right side bind to the same fields the right-side will be favoured.

This is a feature that exists only to allow all hex edits to also be doable through code edits. It's heavily recommended NOT to touch fields that have formal syntax. The compiler WILL warn you about variable use collisions. This particular usage might also become deprecated over time (for values that have regular syntactical access).
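The precedence rule for the meta operator can be pictured as a dictionary merge that warns on collisions. The Python sketch below is hypothetical throughout (function name, field representation, warning text); it only illustrates "the right side wins, and the compiler warns".

```python
import warnings


def merge_fields(syntax_fields: dict[str, int], meta_fields: dict[str, int]) -> dict[str, int]:
    """Combine fields set by formal syntax with fields set through @."""
    merged = dict(syntax_fields)
    for name, value in meta_fields.items():
        if name in merged and merged[name] != value:
            # Both sides bind the same field: warn, then favour the @ value.
            warnings.warn(f"field {name!r} set by both formal syntax and @; using the @ value")
        merged[name] = value
    return merged
```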

Calling F-Nodes

Alias/Named Calls

It's recommended to invoke F-Nodes through

=> FNodeAlias

This makes it clear that it's an F-Node jump.

It's still under consideration whether the compiler will allow calls to F-Nodes without the leading symbol when there are no other operators on the left:

FNodeAlias

Direct Index Calls

Direct index calls are only generated when the decompiler has a non-exhaustive thkl. They are heavily disincentivized, as the compiler can't really reason about them. Referring to a direct index will generate a warning if the function is in a file in the same project and doesn't have a coerced index.

FUNCTION#3C
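A raw reference such as FUNCTION#3C or ACTION#2B could be parsed as sketched below in Python. The function names are hypothetical, and reading the digits after # as hexadecimal is an assumption, suggested by the ACTION#2B and 0x2B examples appearing side by side.

```python
import re

RAW_REFERENCE = re.compile(r"(ACTION|FUNCTION)#([0-9A-Fa-f]+)")


def parse_raw_reference(token: str) -> tuple[str, int]:
    """Return the reference kind and its numeric index."""
    match = RAW_REFERENCE.fullmatch(token.strip())
    if match is None:
        raise ValueError(f"not a raw reference: {token!r}")
    # Assumption: the index after '#' is hexadecimal.
    return match.group(1), int(match.group(2), 16)


def should_warn(target_in_project: bool, target_is_coerced: bool) -> bool:
    """Warn when a direct index targets an uncoerced F-Node in the same project."""
    return target_in_project and not target_is_coerced
```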

Randomization

Following the rules of randomization in the game, the syntax for random choice is summarized as

Chance (firstChance) -> action => call
	...
Chance (secondChance) -> action => call
	...
EndC

As with conditionals, there is also the inadvisable syntax:

Chance (firstChance) -> action => call
	...
Chance (secondChance) -> action => call
	...
EndCWith function -> action => call

which is equivalent to

Chance (firstChance) -> action => call
	...
Chance (secondChance) -> action => call
	...
	function -> action => call
EndC

EndC also has the alias EndChance.

The Action and Call formats have been described before. Chance itself is a keyword; firstChance can be a variable or an integer literal. The compiler handles determining what type of Chance node each Chance must be labelled as, as well as terminating the chance block.

Registers

Registers are possibly the most complex part of the compiler-decompiler.

Registers can be accessed in two ways:

  • Through explicit reference (the decompiler will use this even when it has access to the complete thkl, unless the setting is explicitly overridden). The explicit registers are named $A to $T.
  • Through a register variable. Register variables are declared at the thkl (consequently, they are only compiled properly when compiling from the thkl, because of register allocation).

Register operations have special syntax. Syntactically they occupy the function slot; however, they have the form:

[RegisterName Comparison/Assignment Value/Variable]

RegisterName can either be an explicit reference $A to $T, or a register variable.

Comparison includes == <= < >= > !=
Assignment includes |- and ++
When using an assignment the value to the right is ignored. |- sets the value to 0 and ++ increases the value by 1.

Value is an integer literal; Variable is any variable identifier bound to an integer literal.
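The register operation form can be checked with a short Python sketch. The function name and the identifier rules for register variables are assumptions; the operators and the "value is ignored on assignment" rule come from the text above.

```python
import re
from typing import Optional

ASSIGNMENTS = ("|-", "++")
# Longer operators are listed before their one-character prefixes.
REGISTER_OP = re.compile(
    r"\[\s*(\$[A-T]|[A-Za-z_]\w*)\s*"        # explicit register or register variable
    r"(==|<=|>=|!=|<|>|\|-|\+\+)\s*"          # comparison or assignment operator
    r"([A-Za-z_]\w*|-?\d+)?\s*\]"             # optional value or variable
)


def parse_register_op(text: str) -> tuple[str, str, Optional[str]]:
    """Return (register, operator, operand); operand is None for assignments."""
    match = REGISTER_OP.fullmatch(text.strip())
    if match is None:
        raise ValueError(f"not a register operation: {text!r}")
    register, operator, operand = match.groups()
    if operator in ASSIGNMENTS:
        operand = None  # the value to the right of an assignment is ignored
    elif operand is None:
        raise ValueError(f"comparison needs a value: {text!r}")
    return register, operator, operand
```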

A possible language extension would be enabling the full arithmetic complement (+ - * // %) and enabling registers on the right side.

Register Variables

Register Variables are declared in the thkl with the syntax

Register RegisterName

Optionally it's possible to explicitly set a register name to a specific register, which basically makes it act as a simple alias for the explicit reference.

Register RegisterName as $RegLetter

For example:

Register MyRegister as $A

Registers are allocated at thkl level. The compiler must analyze the entire project to avoid conflicting register usage across multiple modules.
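A very simplified view of thkl-level allocation is sketched below in Python. The function name and the first-free allocation order are assumptions, and the sketch does not account for registers that code uses through explicit references, which a real whole-project analysis would also have to reserve.

```python
import string
from typing import Optional

# The explicit registers $A .. $T (20 in total).
EXPLICIT_REGISTERS = [f"${letter}" for letter in string.ascii_uppercase[:20]]


def allocate_registers(declarations: list[tuple[str, Optional[str]]]) -> dict[str, str]:
    """Map register variable names to explicit registers.

    `declarations` holds (name, pinned_register_or_None) pairs, one per
    `Register Name` or `Register Name as $X` declaration.
    """
    allocation: dict[str, str] = {}
    # Pinned variables act as aliases and keep the register they name.
    for name, pinned in declarations:
        if pinned is not None:
            if pinned not in EXPLICIT_REGISTERS:
                raise ValueError(f"unknown register {pinned!r}")
            allocation[name] = pinned
    used = set(allocation.values())
    free = [reg for reg in EXPLICIT_REGISTERS if reg not in used]
    # Assumption: unpinned variables receive the first free registers in order.
    for name, pinned in declarations:
        if pinned is None:
            if not free:
                raise ValueError("out of registers")
            allocation[name] = free.pop(0)
    return allocation
```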

The Unsafe Directive

It's possible to specify the content of a segment directly in hex through the use of the unsafe directive:

unsafe 00 AC 01 BB000104 AA ...

The unsafe directive cannot be combined with anything else except line continuations and comments. Leftover bytes will be filled with 0s. The compiler cannot reason about unsafe segments during most analysis: the segment is treated as an empty line during most of the parsing process, register allocation ignores its contents, and resolutions and library calls also ignore it.

Decompilation will never produce this segment. It's preferable to use the meta-syntactic operator to specify fields explicitly. If it's necessary to create an empty segment for some reason, it's recommended to use the Useless Segment Directive, *&, instead.
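The zero-filling behaviour of unsafe can be sketched in Python as follows. The function name is hypothetical, and the segment size is passed in as a parameter because this document does not state it.

```python
def parse_unsafe(line: str, segment_size: int) -> bytes:
    """Turn an `unsafe` line into a zero-padded segment of `segment_size` bytes."""
    parts = line.strip().split(None, 1)
    if not parts or parts[0] != "unsafe":
        raise ValueError(f"not an unsafe directive: {line!r}")
    hex_text = parts[1] if len(parts) > 1 else ""
    hex_text = hex_text.partition("\\\\")[0]   # drop a trailing comment
    data = bytes.fromhex(hex_text)             # whitespace between bytes is ignored
    if len(data) > segment_size:
        raise ValueError("more bytes than fit in one segment")
    # Leftover bytes are filled with 0s.
    return data.ljust(segment_size, b"\x00")
```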
