Mjs script syntax - AtomCrafty/MajiroTools Wiki

Mjs script syntax and grammar

The uncompiled .mjs script syntax looks like a fusion between C and JavaScript.

Extensions: .mjs, .mjh, .hgs, .txt

Identifiers

Prefixes and postfixes

Prefixes and postfixes are essential (and mandatory) for identifiers.

Prefix    Identifies  Storage
#         var         persistent
@         var         savefile
% &[1]    var         thread
_         var         local
$         func        -
(none)    invalid     -

Postfix     Type
(none)      int
% ![2]      float
$           string
#           intarray
%# !#[2]    floatarray
$#          stringarray

[1]: The Thread scope prefix & has been observed in place of %. Compared to the type postfixes, its existence is much more troubling, as it directly conflicts with the prefixes required by the $get_variable and $set_variable system calls. It's possible this could be considered a private modifier.

[2]: The Float type postfixes !,!# are legacy postfixes that were already being phased out by the release of Mahjong (Majiro engine's first game).

$rand!() is the only remaining system call with this postfix; $dim_create!#() and $dim_release!#() have been removed in all releases after Mahjong.

[3]: Another postfix is theorized to exist, but may not be usable in the mjs syntax: ~ as seen in the %Op_internalCase~ internal variable (used to store the switch operand).

Assuming this name isn't a collision, ~ may be intended to prevent access to the variable, or it may be an explicit declaration that a variable may contain any (primitive?) type.

At this point, with all the random postfixes and prefixes popping up, it may be safer to consider all symbols as being potential candidates for postfixes and prefixes (at least within reason).

Groups

Groups act as namespaces for identifiers. A group is defined in-script with the following line:

#group "MYNAME"

Group          Usage
@              locals/args
@GLOBAL        default
@MAJIRO_INTER  system calls
@MYNAME        user-defined

Unlike in C++, groups can always be used implicitly. It's as if every group were automatically imported with a using namespace groupname; directive in C++.

The theory behind this is that identifier names are stored under two hashes at compile time: the hash of the name with its group, and the hash of the name without it (which points to the full hashed name). If a new identifier is declared with a conflicting base name, it probably overwrites the previously stored base-name lookup with the new full hashed name.
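The lookup theory above can be modelled with a small Python sketch. This is only an illustration of the theory, not the real compiler: the SymbolTable class and its method names are hypothetical, and plain strings stand in for the hashes.

```python
class SymbolTable:
    """Toy model of the two-entry lookup theory (hypothetical, not the real compiler)."""

    def __init__(self):
        self.full = {}     # full name (with group) -> definition
        self.by_base = {}  # base name (no group)   -> full name

    def declare(self, base, group, definition):
        full_name = f"{base}@{group}"
        self.full[full_name] = definition
        # A later declaration with the same base name replaces the shortcut entry.
        self.by_base[base] = full_name
        return full_name

    def resolve(self, name):
        # Names that already carry a group are looked up directly.
        if "@" in name:
            return self.full[name]
        return self.full[self.by_base[name]]


table = SymbolTable()
table.declare("$foo", "GLOBAL", "first definition")
table.declare("$foo", "MYGROUP", "second definition")
```

Under this model, the bare name $foo resolves to the most recent declaration, while $foo@GLOBAL still reaches the earlier one.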

Identifier structure

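S   Identifier     T   Group
$   handle_float   %   @GLOBAL

Here S is the scope prefix and T is the type postfix. Joined together, the parts form the full identifier $handle_float%@GLOBAL.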
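As a rough illustration of this structure, the Python sketch below splits an identifier into scope prefix, base name, type postfix, and group. The prefix and postfix tables are copied from the tables above; the regular expression and the parse_identifier name are assumptions made for this sketch, in particular the assumption that base names consist only of word characters.

```python
import re

# Scope prefixes and type postfixes, copied from the tables above.
SCOPE_PREFIXES = {
    "#": "persistent",
    "@": "savefile",
    "%": "thread",
    "&": "thread",  # alternate thread prefix, see note [1]
    "_": "local",
    "$": "func",
}
TYPE_POSTFIXES = {
    "": "int",
    "%": "float", "!": "float",          # "!" is the legacy form, see note [2]
    "$": "string",
    "#": "intarray",
    "%#": "floatarray", "!#": "floatarray",
    "$#": "stringarray",
}

# Assumption: the base name is made of word characters only.
IDENTIFIER = re.compile(
    r"(?P<prefix>[#@%&_$])"
    r"(?P<name>\w+?)"
    r"(?P<postfix>[%!$]?#?)"
    r"(?:@(?P<group>\w*))?"
)


def parse_identifier(text):
    """Split an identifier into its parts; group is None when not written."""
    match = IDENTIFIER.fullmatch(text)
    if match is None:
        raise ValueError(f"not a valid identifier: {text!r}")
    return {
        "scope": SCOPE_PREFIXES[match["prefix"]],
        "name": match["name"],
        "type": TYPE_POSTFIXES[match["postfix"]],
        "group": match["group"],
    }


parts = parse_identifier("$handle_float%@GLOBAL")
```

For a function, the "type" entry is the return type. An identifier without a prefix is rejected, matching the "invalid" row of the prefix table.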

Comments

// line comment
/* block
comment */

Operators

Note: The following operators have never been observed in-script, but are known to exist: %, ^, ~, ++x, --x

Note: It's unclear whether ternary operators require arguments to be surrounded with ().

// Unary:
x--; x++; --x; ++x; !x; -x; ~x; //unknown: +x;

// Binary:
x+y; x-y; x*y; x/y; x%y; x&y; x|y; x^y; x<<y; x>>y;
x&&y; x||y; x==y; x!=y; x<y; x<=y; x>y; x>=y;

// Assignment:
x=y;
// Compound:
x+=y; x-=y; x*=y; x/=y; x%=y; x&=y; x|=y; x^=y; x<<=y; x>>=y;

// Ternary:
(x)?(a):(b);

Preprocessor

#include "flags.txt"
#include "includes\stdio.mjh"
#group "MYGROUP"            // All declarations without an explicit group will use this group (func $foo(void); -> func $foo@MYGROUP(void);).
#group push|pop             // Push or pop current group from stack, allowing to temporarily overwrite the current group (used in headers).
#forcecr on|off             // If you put this in, line breaks will be normal (force carriage return?).
#forcegr "$foo();", "$bar$baz$"  // Possibly defines a list of functions that will automatically trigger the "$foo()" statements. "gr" may be shorthand for "graphics".
#forcewipeonblank on|off    // The last line is \p \w automatically.
#subst "s/pattern/repl/"    // Regex replace macro, uses "\#"-style replacement and not "$#" (quotes do not end string until the '/' character).
#use_readflg on|off         // This script will track if lines have been read before.
#define MY_MACRO  0         // Comments can appear with defines.
#if MY_MACRO
//#elif   // existence unknown
#else
#endif

Keywords

Note: There also exist instructions to handle ranges in switch statements, so it's possible there's a related keyword to support this (such as range, which has been observed in the CatSystem2 engine).

if, else if, else
for, while, do while
switch, case, default, break, unbreak
//continue   // existence unknown
return, goto
void, func, var
setskip, constructor, destructor

See Special block syntaxes for an explanation of the setskip, constructor, and destructor keywords.

Note: It's unknown if it's legal to place expressions for inline branching/looping statements on a separate line.

Branching keywords

// Branching:
if(x) {
} else if(y) {
} else { }

// Inline Branching:
if(a)      x;
//else if(b) y; // unknown
//else       z; // unknown

Looping keywords

// Looping:
while(a) { }
do { } while(a);
for(i=0; i<10; i++) {
  break;  // all loops allow break to exit
}

// Inline Looping:
while(a) x;
//for(i=0; i<10; i++) y; // unknown

Switch keywords

// Switches:
switch(a) {
case -1:
  break; // exit switch
case 2:
  unbreak; // fallthrough to next case
case 3:
  x;
  // breaking is optional, no-break behavior is the same as including a break at the end of the case
default:
  // default cases supported
}

switch(s$) { // switches allow string and float arguments too
case "#1":
  switch (b) { // switch statements can be nested
  case 77:
  case 88:
  }
  break;
case "#2":
}
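The break/unbreak behavior shown above can be modelled in Python. This is a minimal sketch of the semantics as described in the comments (break is implied at the end of every case, unbreak opts into fallthrough); the run_switch helper, the DEFAULT sentinel, and the tuple layout are hypothetical names invented for this sketch.

```python
DEFAULT = object()  # sentinel standing in for the `default:` label


def run_switch(value, cases):
    """cases: list of (label, body, unbreak). Returns the bodies that would run."""
    labels = [label for label, _, _ in cases]
    if value in labels:
        start = labels.index(value)
    elif DEFAULT in labels:
        start = labels.index(DEFAULT)
    else:
        return []
    executed = []
    for _, body, unbreak in cases[start:]:
        executed.append(body)
        if not unbreak:
            break  # implicit break at the end of a case
    return executed


# Mirrors the first switch above: only `case 2` ends with unbreak.
cases = [
    (-1, "case -1", False),
    (2, "case 2", True),
    (3, "case 3", False),
    (DEFAULT, "default", False),
]
```

With these cases, a value of 2 runs "case 2" and then falls through into "case 3", while a value of 3 runs only "case 3" and never reaches the default.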

Labels/Return keywords

my_label:
  goto my_label;

return(x); // unknown if `()` are required
return;    // return for void function (in MjIL, this equates to `return(0);`)

Definition keywords

void $myfunc_noret();
func $myfunc_withret();
var _myvar, _anothervar; // inline declaration with assignment is not allowed

Special variables

__SYS__NumParams // number of arguments passed to function

The __SYS__NumParams local variable is used by functions that declare optional arguments. Because optional arguments cannot be given default values, the function must supply those values itself at runtime.

Because of this variable, function overloading can also be performed based on the number of arguments passed.
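The pattern can be sketched in Python, where len(args) plays the role of __SYS__NumParams. The function name echoes the $function_with_optionals example used elsewhere on this page, and the fallback values of 0 are placeholders chosen for this sketch, not values taken from any real script.

```python
def function_with_optionals(*args):
    # Stand-in for __SYS__NumParams: how many arguments the caller passed.
    num_params = len(args)
    if num_params == 1:
        (page,) = args
        xin, yin = 0, 0  # placeholder defaults, filled in by the function itself
    elif num_params == 3:
        page, xin, yin = args
    else:
        raise TypeError(f"expected 1 or 3 arguments, got {num_params}")
    return page, xin, yin
```

Branching on the argument count like this is also how a single function could behave as several overloads.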

Declarations

Variable declaration

All user variables must be declared ahead of time, in similar fashion to C. However, they do not need to appear at the very beginning of a function, only before their first use. Variables probably do not support declaration with assignment, which also rules out declaring a variable inside a for loop header.

var _onevar, _morevar;
var _local_var;       // value is stored in stack-frame
var %threadlocal_var; // value is stored in thread
var @savefile_var;    // value is stored in save-game
var #persistent_var;  // value is stored globally, and persists even when rebooting the game

Function declaration

Function declarations also follow similar patterns to C. The void argument seems to be mandatory when no arguments exist.

void $function_with_no_args(void) {
}
void $function_with_args(_arg1, _another_arg, _str_arg$) {
}
func $function_returns_int(void) {
}
func $function_returns_string$(void) {
}
func $function_returns_floatarray%#(void) {
}
// Optionals are defined with [], and can appear before or after a comma
func $function_with_optionals(_page[, _xin, _yin]) {
}
func $function_with_optionals2(_page, [_xin, _yin]) {
}

Function forward declaration

Note how optional arguments can be included in forward declarations.

// forward-declared function contained within same script
func $check_font_change_only(void);

// forward-declared functions from pic.mjo
func [email protected](_fn$);
void [email protected](_to, _fn$[, _xin, _yin]);

// forward-declared function from pic.mjo (within a function)
void [email protected]();
[email protected](); // called immediately afterwards

Special block syntaxes

There are 3 identified types of special blocks seen in Majiro source scripts.

constructor

The constructor acts as an initializer for the function/thread(?) it's defined in. Usage of this block is not fully understood, as its only appearance in available source scripts is at the very beginning of a function. If there are opcodes used specifically for the constructor, it's possible they are not required when the constructor runs immediately. This block has only ever been observed in a function that also contains a destructor block; however, it's likely the two can be used independently if no cleanup is required.

constructor {
  //...
}

destructor

The destructor block acts as a cleanup method for the function/thread(?) it's defined in. A destructor only becomes active once execution in the function reaches the block, after which it will be called during thread cleanup. Defining more than one destructor in a single function may be supported, but any new destructor block would replace the old one, so the old block would never execute. One opcode is dedicated to the destructor's functionality.

destructor {
  //...
}

setskip

This is a VN-related syntax. Its usage isn't fully understood, but it acts as some sort of loop and contains many types of wait operations. Multiple opcodes are used specifically for this block's functionality.

setskip {
  //...
}

Unknown syntaxes

This section describes syntaxes that are known to exist through bytecode instructions, but have never been observed in scripts.

Array syntax

Arrays can have 1, 2, or 3 dimensions. When an array has fewer than 3 dimensions, it's treated internally as if the remaining dimensions were 1. Arrays have no method for determining their lengths, so treat them much like heap-allocated arrays in C.

Out-of-bounds access wraps around for both positive and negative indices. So for an array of length N, accessing index [-1] will access element [N-1], and accessing index [N+4] will access element [4] (assuming N > 4).
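The wrapping described above matches a plain modulo on the index, at least for the two cases given. A minimal Python sketch for a one-dimensional array follows; wrap_index is a hypothetical helper, and whether the engine behaves the same way for indices further out of range is an assumption.

```python
def wrap_index(index, length):
    """Map any integer index onto 0..length-1 by wrapping around."""
    # Python's % always returns a non-negative result for a positive length.
    return index % length


N = 10
last = wrap_index(-1, N)         # same element as index N-1
wrapped = wrap_index(N + 4, N)   # same element as index 4
```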

Array index order

Arrays are theorized to use index access in a similar fashion to C jagged arrays. Generally jagged arrays will store the most significant rank first, and least significant rank last (i.e. pixels[y][x], or grid3d[z][y][x]).

Array allocation

The index-order assumption above is based on how the $dim_create#() system call orders its parameters:

func $dim_create#([[_dim3=1,] _dim2=1,] _dim1);

// these statements create identical arrays:
_myarray# = $dim_create#(3, 14);  // create int[3][14]
_myarray# = $dim_create#(1, 3, 14);  // create int[1][3][14]

Which looks like this in MjIL:

ldc.i     14 ; _dim1
ldc.i     3  ; _dim2
;ldc.i    1  ; (implied) _dim3
syscall   $dim_create# (2)  ; int[3][14]
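The way missing leading dimensions default to 1 can be sketched in Python. normalize_dims is a hypothetical helper that only mirrors the parameter list shown above; it does not model the engine's actual memory layout.

```python
def normalize_dims(*dims):
    """Pad 1 to 3 dimensions on the left with 1s, giving (dim3, dim2, dim1)."""
    if not 1 <= len(dims) <= 3:
        raise ValueError("arrays have 1, 2, or 3 dimensions")
    return (1,) * (3 - len(dims)) + tuple(dims)


two_dims = normalize_dims(3, 14)       # like $dim_create#(3, 14)
three_dims = normalize_dims(1, 3, 14)  # like $dim_create#(1, 3, 14)
```

Both calls produce the same shape, which is why the two $dim_create#() statements above create identical arrays.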

Array access

Meanwhile, the ldelem and stelem instructions consume their indices in the opposite order (which is normal for non-call instructions).

ldc.i     42 ; value to store
;ldc.i    0  ; (implied) index 3
ldc.i     1  ; index 2  _myarray#[1]
ldc.i     5  ; index 1  _myarray#[1][5]
stelem.i  dim2 loc int $_myarray# 0

So it's assumed the element access syntax looks something like this:

// these statements produce identical behavior:
_myarray#[1][5] = 42;
_myarray#[0][1][5] = 42;

// or if multidimensional array access uses commas...
_myarray#[1, 5] = 42;
_myarray#[0, 1, 5] = 42;

Switch selector

Majiro provides one interesting unknown syntax that acts like an N-ary operator. Instead of evaluating a condition to choose a true or false result, it takes an index and evaluates one of N expressions, selected at runtime.

// Standard ternary operator:
//             if?true:false
_a = (_condition)?(13):(0);

// Switch selector:
_a = (_index)?("page0", "page1", "page2", "page3", "page4");

// Arguments do not need to be constants:
_s$ = (_index)?(_sel0$, _sel1$, _sel2$, _sel3$, _sel4$);

Interestingly, this is the only operation to use the switch instruction, while real switch statements use the br.case instruction set.

line    #201
ld      loc int $_index -2
switch  @case_0, @case_1, @case_2, @case_3, @case_4

 case_0:
  ld  loc string $_sel0$ 0
  br  @switch_end
 case_1:
  ld  loc string $_sel1$ 1
  br  @switch_end
 case_2:
  ld  loc string $_sel2$ 2
  br  @switch_end
 case_3:
  ld  loc string $_sel3$ 3
  br  @switch_end
 case_4:
  ld  loc string $_sel4$ 4
  br  @switch_end

switch_end:
st.s    loc string $_s$ 5
line    #202  ; all of this happened on one line
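The listing suggests that only the selected branch is evaluated, since each case loads its own operand and then jumps to the end. A Python sketch of that behavior follows, using zero-argument callables to defer evaluation. switch_select is a hypothetical helper, and raising on an out-of-range index is an assumption of this sketch, as the engine's behavior in that case is not documented here.

```python
def switch_select(index, *options):
    """Evaluate and return only the option chosen by index.

    Each option is a zero-argument callable, so unselected options never run.
    """
    if not 0 <= index < len(options):
        # Assumption: what the engine does here is unknown, so the sketch refuses.
        raise IndexError(f"selector index {index} is out of range")
    return options[index]()


pages = ["page0", "page1", "page2", "page3", "page4"]
# Bind each page as a default argument so every lambda keeps its own value.
result = switch_select(2, *[lambda page=page: page for page in pages])
```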