Decompiled THK Specification - Ezekial711/MonsterHunterWorldModding GitHub Wiki
F-Nodes decompile from nodes to relatively formulaic names; however, it's possible both to freely rename F-Nodes and to give them aliases. F-Node definition syntax follows the quasi-pythonic format:
Def F-NodeName
EndF
The EndF directive also has aliases: EndFunction and EndDef. F-Nodes can be given simultaneous alternate names through aliasing.
Def F-NodeName0 & F-NodeName1 & ... & F-NodeNameN
EndF
Aliasing allows renaming an F-Node without having to refactor all references to it in separate files.
When decompiling through a thkl, the decompiler can replace all F-Node indices with F-Node names. When some thks are missing, or when decompiling at thk level, the decompiler still converts node indices into F-Node names, but references to non-decompiled F-Nodes are kept in direct index format and F-Nodes are assigned a coercion index.
A coercion index next to an F-Node name indicates to the compiler that the F-Node should compile to a specific node index. Coerced F-Nodes are compiled into nodes first and slotted into the indices they indicate. If there aren't enough nodes in the file to fill all of the empty spots required to reach those indices, the compiler generates empty nodes to guarantee the indices are met.
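The slotting behaviour described above can be sketched in Python. This is an illustration only, not the compiler's actual implementation: the function name `layout_nodes`, the `<empty>` placeholder, and the rule that uncoerced F-Nodes take the lowest free slots in declaration order are all assumptions.

```python
from typing import List, Optional, Tuple

EMPTY = "<empty>"  # hypothetical placeholder for a compiler-generated empty node


def layout_nodes(nodes: List[Tuple[str, Optional[int]]]) -> List[str]:
    """Return F-Node names in final node-index order.

    `nodes` holds (name, coercion_index) pairs in declaration order;
    coercion_index is None for F-Nodes without an `@ index` suffix.
    """
    # Coerced F-Nodes are placed first, at exactly the index they request.
    coerced = {}
    for name, index in nodes:
        if index is None:
            continue
        if index in coerced:
            raise ValueError(f"index {index} claimed by {coerced[index]} and {name}")
        coerced[index] = name

    # Assumption: uncoerced F-Nodes fill the lowest free slots, in order.
    free = iter([name for name, index in nodes if index is None])

    # Enough slots for every node and for the highest coerced index.
    size = max(len(nodes), max(coerced, default=-1) + 1)

    # Slots that nobody claims become empty nodes.
    return [coerced[slot] if slot in coerced else next(free, EMPTY)
            for slot in range(size)]
```

With two ordinary F-Nodes and one F-Node coerced to index 3, the sketch leaves index 2 to be filled by a generated empty node.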
Coercion is performed through the "meta" operator which is the character @. Throughout the language @ is used to indicate meta-syntactical elements that are required for the full editability of thk that would otherwise impair the legibility of most of the format's code.
An example of a coerced and aliased F-Node:
Def Function_52 & SummonEcliptic & GlobalEclipticCallBeforeDeath @ 126
...
EndF
This F-Node can be called through any of its 3 names or through a direct index call to 126 (more on this further ahead).
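As a rough illustration of how such a header breaks down, the following Python sketch splits a Def line into its aliases and optional coercion index. The helper name `parse_def_header` is hypothetical, and the sketch assumes that names never contain `@` and that the coercion index is written in decimal, as in the example above.

```python
import re
from typing import List, Optional, Tuple

# Assumption: aliases are separated by "&" and the coercion index is decimal.
HEADER = re.compile(r"^\s*Def\s+(?P<names>[^@]+?)\s*(?:@\s*(?P<index>\d+))?\s*$")


def parse_def_header(line: str) -> Tuple[List[str], Optional[int]]:
    """Split a Def header into (list of names, coercion index or None)."""
    match = HEADER.match(line)
    if match is None:
        raise ValueError(f"not a Def header: {line!r}")
    names = [name.strip() for name in match.group("names").split("&")]
    if not all(names):
        raise ValueError(f"empty alias in header: {line!r}")
    index = match.group("index")
    return names, (int(index) if index is not None else None)
```

Every name in the returned list refers to the same F-Node; the index is None when the header has no `@` suffix.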
The language attempts to preserve the principle of 1 segment = 1 line of code. This principle is held at the low and mid level of the language design. For the most part segments are heavily syntactically sugared, with the function calls dressed through the F-Extensions-Y system and all irrelevant variables nulled from the context. The most general syntax has the form
FlowControl FunctionType -> Action => Directive @ Additional_Parameter : Value, Additional_Parameter : Value, ... \\ Comments
Every element is optional. An empty line will not compile into a segment; it will simply be omitted.
The syntax attempts to mirror (except in one specific regard) how the game reads thks.
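To make the general form concrete, here is a minimal Python sketch that splits one segment line on the `->`, `=>`, `@` and `\\` markers. It is not the real parser: the name `split_segment` and the returned field names are made up for illustration, FlowControl and FunctionType are kept together as a single `head` field, and only `Name : Value` parameters are handled.

```python
from typing import Dict


def split_segment(line: str) -> Dict[str, object]:
    """Split a segment line into its optional parts (illustrative only)."""
    code, _, comment = line.partition("\\\\")    # two backslashes start a comment
    code, _, params_text = code.partition("@")   # meta parameters follow "@"
    code, _, directive = code.partition("=>")    # call / directive follows "=>"
    head, _, action = code.partition("->")       # action follows "->"

    params = {}
    for item in params_text.split(","):
        key, sep, value = item.partition(":")
        if sep:  # skip fragments that are not "Name : Value" pairs
            params[key.strip()] = value.strip()

    def clean(text: str):
        return text.strip() or None

    return {
        "head": clean(head),            # FlowControl and FunctionType together
        "action": clean(action),
        "directive": clean(directive),
        "params": params,
        "comment": clean(comment),
    }
```

Because every element is optional, missing parts come back as None (or an empty dict for the parameters) rather than raising an error.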
It's possible to continue on the next line using the line continuation character \:
FlowControl FunctionType -> Action \
=> Directive \
@ Additional_Parameter : Value, \
Additional_Parameter : Value, \
... \\ Comments
This code is identical to the one specified before.
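A preprocessing pass along these lines could turn physical lines into logical ones. The sketch below is an assumption about how that might work, not a description of the actual compiler: `join_continuations` is a hypothetical name, comments are stripped per physical line before the continuation check, and empty lines are dropped because they do not compile into segments.

```python
from typing import List


def join_continuations(source: str) -> List[str]:
    """Join lines ending in a single backslash and drop empty lines."""
    logical = []
    pending = ""
    for raw in source.splitlines():
        # Assumption: a double backslash starts a comment that runs to end of line.
        code = raw.split("\\\\", 1)[0].strip()
        if code.endswith("\\"):
            # Continuation: keep the text and wait for the next physical line.
            pending += code[:-1].rstrip() + " "
            continue
        line = (pending + code).strip()
        pending = ""
        if line:  # empty lines are omitted entirely
            logical.append(line)
    if pending.strip():  # file ended in the middle of a continuation
        logical.append(pending.strip())
    return logical
```

Each returned string then corresponds to one segment.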
Flow Control statements come in 3 varieties:
Node Level Flow Control of the form:
return
reset
repeat
return will go back to the F-Node that called the current F-Node.
reset will go to the main F-Node on the thk that called the current F-Node.
repeat will go to the first segment of the current F-Node.
Conditional Branching Flow Control of the form:
If condition -> action => call
code1
Elif condition -> action => call
code2
Else
code3
EndIf
The Elif and Else blocks are optional, as are the condition -> action => call elements. An alternative form is:
If condition -> action => call
code1
Elif condition -> action => call
code2
Else
code3
endwith condition -> action => call
which is functionally equivalent to
If condition -> action => call
code1
Elif condition -> action => call
code2
Else
code3
condition -> action => call
EndIf
This form exists solely to preserve the decompilation->compilation identity, because of Capcom design decisions. It is the syntactical exception alluded to earlier.
As is well known, the following:
self.enrage_time_left().leq(22) -> generic.take_off => Global.Function_52
acts as an inline conditional where if the condition is satisfied the action is called and then flow jumps to the global thk function named.
The above rule showcases a few of the intended language features:
F-Extensions-Y provide a prettified way of calling the check functions. The compiler-decompiler provides a default set that can be dynamically extended.
self.enrage_time_left().leq(parameter2)
A non-prettified version is also legal and writes thus:
self.function_2B(0,22)
F-Extensions-Y are covered in their own spec document.
Actions can be referred to in a multitude of ways intended to allow flexible use cases. Actions can be tied to Monster Action Namespaces, can be tied to variable names, or can be referred to absolutely.
Monster Action Namespaces allow importing libraries of action names tied to each monster. Furthermore, a generic library is also provided that performs more sophisticated mapping over action names. It allows one to call "generic actions" which are mapped at compile time, based on the thkl and thk properties, to concrete action indices for the target monster, making action calls portable.
Monster Action Namespaces are called through their qualified import namespace name and the action name:
Importing the special generic action library
importActions generic
...
generic.take_off
Importing a specific monster's namespace (requires qualification because of characters involved)
importActions Safi'Jiiva as safi
...
safi.take_off
Declaring that the thk belongs to a monster and then referring to the monster namespace. If the thk has no monster declaration, the compiler falls back to the thkl monster declaration
monster Safi'jiiva
...
monster.take_off
We can bind an action to a variable and then use said variable.
attack_var = safi.take_off
...
self.above_area() -> attack_var => Function_31
Library files can leave unbound variables or function calls by prefacing them with the local keyword, which are then populated by those using the library. For example:
Library File
Def lib_func
self.above_area() -> local.attack_var => local.Function_31
EndF
Implementation File 1
importLibrary Lib
attack_var = safi.take_off
Def test
=> Lib.lib_func
EndF
Implementation File 2
importLibrary Lib
attack_var = safi.land
Def test
=> Lib.lib_func
EndF
Will compile to
Implementation File 1
Def test
=> library_import
EndF
Def library_import
self.above_area() -> safi.take_off => Function_31
EndF
and Implementation File 2
Def test
=> library_import
EndF
Def library_import
self.above_area() -> safi.land => Function_31
EndF
When the file using the library invokes library functions the variable names resolve to their local binding. This enables reverse dependency injection and treating library files as frameworks.
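The resolution step can be pictured with a small Python sketch. It is only a model of the behaviour shown in the example: `resolve_locals` is a hypothetical helper, and it assumes that a `local.` name bound to a variable in the importing file is replaced by that variable's value, while an unbound one resolves to the importing file's own identifier of the same name.

```python
import re
from typing import Dict

# Matches "local.<identifier>" inside a library line.
LOCAL = re.compile(r"\blocal\.([A-Za-z_]\w*)")


def resolve_locals(line: str, bindings: Dict[str, str]) -> str:
    """Replace local.<name> using the importing file's variable bindings."""
    def substitute(match):
        name = match.group(1)
        # Bound variables expand to their value; anything else is assumed
        # to be an F-Node (or other name) defined by the importing file.
        return bindings.get(name, name)

    return LOCAL.sub(substitute, line)
```

Running the same library line against two different binding tables gives the two compiled results listed above, which is what lets one library file serve several implementation files.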
Take note that the Global thk has no such feature. Because it's statically compiled into a thk, all of its bindings must be resolved in some shape or form, and thus it cannot be treated like a generic library. Its function space, however, is imported with the same syntax, importLibrary Global, but this resolves correctly to external calls, not to mirrored local copies of the library F-Nodes.
Libraries imported at thkl level are available to all thks that the thkl calls. Calls to the library from thks are resolved as calls to the global thk, and all of the necessary library code is mirrored into the global thk.
It's also possible to refer to a monster's action by raw value:
ACTION#2B
There's one more way of doing this; however, it is, again, heavily disincentivized because the compiler treats this syntax separately and cannot reason about it:
@action_id = 0x2B
The meta-syntactical operator allows manually setting ANY field. If the left side and the right side bind to the same fields, the right side will be favoured.
This is a feature that exists only to allow all hex edits to also be doable through code edits. It's heavily recommended NOT to touch fields that have formal syntax. The compiler WILL warn you about variable use collisions. This particular usage might also become deprecated over time (for values that have regular syntactical access).
It's recommended to invoke F-Nodes through
=> FNodeAlias
This makes it clear that it's an F-Node jump.
It's still under consideration whether the compiler will allow calls to F-Nodes to be performed without the leading symbol when there are no other operators on the left
FNodeAlias
Direct index calls are only generated when the decompiler has a non-exhaustive thkl. They are heavily disincentivized as the compiler can't really reason about them. Referring to a direct index will generate a warning if the function is on a file in the same project and doesn't have a coerced index.
FUNCTION#3C
Following the rules of randomization in the game, the syntax for random choice is summarized as
Chance (firstChance) -> action => call
...
Chance (secondChance) -> action => call
...
EndC
Similar to conditionals, there is the inadvisable syntax:
Chance (firstChance) -> action => call
...
Chance (secondChance) -> action => call
...
EndCWith function -> action => call
which is equivalent to
Chance (firstChance) -> action => call
...
Chance (secondChance) -> action => call
...
function -> action => call
EndC
EndC also has the alias EndChance.
Actions and Calls format has been described before. Chance itself is a keyword, firstChance can be a variable or an integer literal. The compiler handles determining what type of Chance node each Chance must be labelled as, as well as terminating the chance block.
Possibly the most complex part of the compiler-decompiler.
Registers can be accessed in two ways:
- Through explicit reference (the decompiler will use this even when it has access to the complete thkl unless the setting is explicitly overridden). The explicit registers are named $A to $T
- Through a register variable. Register variables are declared at the thkl (consequently they are only compiled properly when compiling from the thkl, because of register allocation).
Register operations have special syntax. Syntactically speaking they occupy the function slot, but they have the form:
[RegisterName Comparison/Assignment Value/Variable]
RegisterName can either be an explicit reference $A to $T, or a register variable.
Comparison includes == <= < >= > and !=. Assignment includes |- and ++. When using an assignment, the value to the right is ignored: |- sets the value to 0 and ++ increases the value by 1.
Value is an integer literal; Variable is any variable identifier bound to an integer literal.
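The bracketed form lends itself to a small pattern-based parser. The Python sketch below is illustrative only: `parse_register_op` is a hypothetical name, register variables are assumed to be ordinary identifiers, and integer literals are assumed to be decimal.

```python
import re
from typing import Optional, Tuple, Union

ASSIGNMENTS = {"|-", "++"}  # |- resets to 0, ++ increments by 1

# Two-character operators are listed before "<" and ">" so they match first.
REGISTER_OP = re.compile(
    r"\[\s*(?P<reg>\$[A-T]|[A-Za-z_]\w*)\s*"
    r"(?P<op>==|<=|>=|!=|\|-|\+\+|<|>)\s*"
    r"(?P<value>[^\]\s]*)\s*\]"
)


def parse_register_op(text: str) -> Tuple[str, str, Optional[Union[int, str]]]:
    """Parse "[Register Op Value]" into (register, operator, value)."""
    match = REGISTER_OP.fullmatch(text.strip())
    if match is None:
        raise ValueError(f"not a register operation: {text!r}")
    reg, op, value = match.group("reg", "op", "value")
    if op in ASSIGNMENTS:
        return reg, op, None  # the right-hand value is ignored for assignments
    if not value:
        raise ValueError(f"comparison needs a value: {text!r}")
    # Decimal literals become ints; anything else is kept as a variable name.
    return reg, op, (int(value) if value.lstrip("-").isdigit() else value)
```

For assignments the third element is always None, mirroring the rule that the value to the right is ignored.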
A possible language extension would be enabling the full arithmetic complement: + - * // % and enabling registers on the right side
Register Variables are declared in the thkl with the syntax
Register RegisterName
Optionally it's possible to explicitly set a register name to a specific register, which basically makes it act as a simple alias for the explicit reference.
Register RegisterName as $RegLetter
For example:
Register MyRegister as $A
Registers are allocated at thkl level. The compiler must analyze the entire project to avoid conflicting register usage across multiple modules.
It's possible to specify the content of a segment directly in hex through the use of the unsafe directive:
unsafe 00 AC 01 BB000104 AA ...
The unsafe directive cannot be combined with anything else except line continuations and comments. Leftover bytes will be filled with 0s. The compiler cannot reason about unsafe segments during most analysis: the segment is treated as an empty line during most of the parsing process, register allocation ignores its contents, and resolutions and library calls also ignore it.
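The zero-filling rule can be sketched in Python as follows. The helper `unsafe_to_bytes` is hypothetical, comments and line continuations are assumed to have been removed already, and the segment size is passed in as a parameter because this document does not state it.

```python
def unsafe_to_bytes(line: str, segment_size: int) -> bytes:
    """Convert an unsafe directive into a zero-padded segment."""
    parts = line.split(None, 1)
    if not parts or parts[0] != "unsafe":
        raise ValueError(f"not an unsafe directive: {line!r}")
    # bytes.fromhex ignores whitespace between byte pairs, so both
    # "00 AC" and "BB000104" style groupings are accepted.
    data = bytes.fromhex(parts[1]) if len(parts) > 1 else b""
    if len(data) > segment_size:
        raise ValueError("more bytes than fit in one segment")
    # Leftover bytes are filled with zeros.
    return data.ljust(segment_size, b"\x00")
```

A bare `unsafe` with no bytes yields a segment made entirely of zeros under these assumptions.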
Decompilations will never produce this segment. It's preferable to use the meta-syntactic operator to specify fields explicitly. If it's necessary to create an empty segment for some reason, it's recommended to use *&, the Useless Segment Directive, instead.