rumus - EternalArbiters/oret-oretan GitHub Wiki

Sure! Here is Table 4: Comparative Summary of SLM Architectures, ready to use in a Sinta 1/Q1 journal. Feel free to adjust the wording as needed.


Table 4. Comparative Summary of Small Language Model (SLM) Architectures Evaluated in This Study

| Model | Input Format | Output Type | Loss Function | Context Support | Inference Speed | Strengths | Limitations | Primary Use Case |
|---|---|---|---|---|---|---|---|---|
| MLP Multi-Label Classifier | Numerical feature vector | Multi-label vector | BCEWithLogits | No | Fast | Simple, efficient, deterministic | No sequential/context awareness | Embedded/real-time robots |
| LSTM Single-Label Classifier | Tokenized situation text | Single-label (class) | CrossEntropy | No | Medium | Handles sequential info in input | Single action per prediction, no context | Baseline strategy prediction |
| Seq2Seq Model | Tokenized situation text | Token sequence | CrossEntropy (token) | No | Slow | Flexible, generative strategies | Slower inference, no context window | Creative/novel strategy generation |
| Contextual Seq2Seq Model | Context buffer (N texts) | Token sequence | CrossEntropy (token) | Yes (context window) | Slow | Temporal/context awareness | Higher latency, complex input | Adaptive, context-dependent strategies |
| Toolset & Evaluator Module | Model outputs (varied) | Evaluation metrics | — | — | Fast | Automated benchmarking | — | Model evaluation & comparison |
| Contextual Multi-Label Classifier | Context buffer (N texts) | Multi-label vector | BCEWithLogits | Yes (context window) | Fast | Multi-label, context-aware | Limited to known actions, thresholding | Robust, real-time context-rich robots |

Notes:

  • BCEWithLogits = Binary Cross-Entropy with Logits

  • CrossEntropy (token) = Token-wise Cross-Entropy Loss (for sequence generation)

  • “Context window” means the model considers more than just the single most recent situation

  • The toolset/evaluator does not make predictions; it runs automated evaluation (its “Output Type” and “Loss Function” cells may be filled with “—”)

Suggestions:

  • Adjust the column widths to fit your journal's template.

  • For LaTeX, I can help convert this into a tabular table if needed!

If you would like a short explanation below the table, or additional columns or notes, just ask!

Here is the narrative for 3.4. Algorithm Descriptions (including 3.4.1, 3.4.2, and 3.4.3) in Sinta 1/Q1 journal style, with scholarly paragraphs, in-depth explanation, pseudocode (in a formal narrative style), and explicit formulas.


3.4. Algorithm Descriptions

This section details the architectural components, forward computations, and training objectives of the principal models used for Sumo robot strategy prediction: the Multi-Layer Perceptron (MLP) for multi-label classification, the LSTM-based single-label classifier, and the sequence-to-sequence (Seq2Seq) generative model. Each model is described in terms of its layer structure, activation functions, inference workflow, and loss formulation.

3.4.1. Multi-Layer Perceptron (MLP) for Multi-Label Prediction

The MLP architecture serves as a baseline for multi-label classification, directly mapping input feature vectors to multiple simultaneous strategy outputs. The model comprises an input layer, one or more fully connected hidden layers with nonlinear activation (typically ReLU), and an output layer with sigmoid activation units corresponding to the set of possible actions.

Forward computation and training process can be described as follows: Given an input vector $\mathbf{x} \in \mathbb{R}^d$, where $d$ is the number of normalized features, the MLP computes hidden activations through successive affine transformations and nonlinearities. The final output layer produces a vector $\mathbf{\hat{y}} \in [0,1]^k$, where $k$ is the number of action labels. Each output node represents the probability that a particular action should be executed under the current input.

Pseudocode (Algorithm 1): MLP Multi-Label Forward Pass and Training

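Input: feature vector x, ground-truth label vector y (multi-hot)
Output: predicted probability vector y_hat

1. h_1 = ReLU(W_1 x + b_1)
2. h_2 = ReLU(W_2 h_1 + b_2)           // repeat for each additional hidden layer
...
n. h_n = ReLU(W_n h_{n-1} + b_n)
n+1. z = W_out h_n + b_out             // output logits
n+2. y_hat = sigmoid(z)                // per-action probabilities
n+3. loss = BCEWithLogitsLoss(z, y)    // takes the logits z; the sigmoid is applied inside the loss
n+4. Update all weights by backpropagation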
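The forward pass of Algorithm 1 can be sketched in plain Python as follows. This is a minimal illustration, not the study's implementation: the helper names (`affine`, `mlp_forward`) and the toy weights are assumptions made for the example, and a real model would use a tensor library and learned parameters.

```python
import math

def relu(v):
    return [max(0.0, a) for a in v]

def sigmoid(z):
    # Numerically stable sigmoid for positive and negative inputs.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def affine(W, b, x):
    # Computes W x + b for a weight matrix given as a list of rows.
    return [sum(w * xi for w, xi in zip(row, x)) + bi for row, bi in zip(W, b)]

def mlp_forward(x, hidden_layers, out_layer):
    # hidden_layers: list of (W, b) pairs; out_layer: a single (W_out, b_out) pair.
    h = x
    for W, b in hidden_layers:
        h = relu(affine(W, b, h))
    W_out, b_out = out_layer
    logits = affine(W_out, b_out, h)
    probs = [sigmoid(z) for z in logits]
    return logits, probs

# Toy (hypothetical) weights: d = 3 features, one hidden layer of 2 units, k = 2 actions.
hidden = [([[1.0, 0.0, -1.0], [0.5, 0.5, 0.5]], [0.0, -1.0])]
out = ([[1.0, -1.0], [0.0, 2.0]], [0.0, 0.0])
logits, probs = mlp_forward([1.0, 2.0, 3.0], hidden, out)
print(logits, probs)
```

Returning both the logits and the probabilities keeps the two uses separate: the logits feed the training loss, while the probabilities are what is thresholded at inference time.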

Loss Function: The Binary Cross-Entropy with Logits loss is employed:

$$ \mathcal{L}_{\text{BCE}} = -\frac{1}{k}\sum_{i=1}^{k} \left[ y_i \log(\hat{y}_i) + (1-y_i) \log(1-\hat{y}_i) \right] $$

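where $y_i \in \{0,1\}$ is the ground-truth label and $\hat{y}_i \in [0,1]$ is the predicted probability for action $i$. In implementations, BCEWithLogits fuses the sigmoid with this loss for numerical stability, so it is applied to the pre-sigmoid logits rather than to $\hat{y}_i$.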
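To make the relationship between the formula and its "with logits" form concrete, the sketch below computes the loss both ways and checks that they agree. The function names and the example values are assumptions for illustration. The stable form uses the identity $-[y\log\sigma(z) + (1-y)\log(1-\sigma(z))] = \max(z,0) - zy + \log(1 + e^{-|z|})$.

```python
import math

def bce_with_logits(logits, targets):
    # Mean over the k labels of the numerically stable per-label loss.
    total = 0.0
    for z, y in zip(logits, targets):
        total += max(z, 0.0) - z * y + math.log1p(math.exp(-abs(z)))
    return total / len(logits)

def bce_from_probs(probs, targets):
    # Direct transcription of the BCE formula; fails if a probability is exactly 0 or 1.
    total = 0.0
    for p, y in zip(probs, targets):
        total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(probs)

logits = [-2.0, 4.0]   # hypothetical pre-sigmoid outputs for k = 2 actions
targets = [0.0, 1.0]   # multi-hot ground truth
probs = [1.0 / (1.0 + math.exp(-z)) for z in logits]

loss_stable = bce_with_logits(logits, targets)
loss_direct = bce_from_probs(probs, targets)
print(loss_stable, loss_direct)
```

The stable form stays finite for very large logits, where the direct form would evaluate `log(0)`.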


3.4.2. LSTM Classifier for Single-Label Prediction

The LSTM classifier operates on tokenized textual representations of the robot's situational context, encoding temporal dependencies in the sequence of observations. The model consists of an embedding layer (mapping tokens to vectors), one or more LSTM encoder layers, and a fully connected output layer with softmax activation over all possible strategy classes.

Workflow: Given a sequence of input tokens $\mathbf{x} = (x_1, ..., x_T)$, the embedding layer maps each token to a dense vector. The LSTM processes the sequence, maintaining hidden and cell states, and the final hidden state is projected through a fully connected layer to yield a logit vector over all possible strategies.
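For reference, the recurrence that "the LSTM processes the sequence" refers to can be written out as below. This is the common textbook formulation and is offered as an assumption about the cell used; the symbols $\mathbf{e}_t$ (embedding of token $x_t$), $W_\ast$, $U_\ast$, and $\mathbf{b}_\ast$ are notation introduced here, and the actual implementation may differ (for example, stacked layers).

$$
\begin{aligned}
\mathbf{i}_t &= \sigma(W_i \mathbf{e}_t + U_i \mathbf{h}_{t-1} + \mathbf{b}_i) \\
\mathbf{f}_t &= \sigma(W_f \mathbf{e}_t + U_f \mathbf{h}_{t-1} + \mathbf{b}_f) \\
\mathbf{o}_t &= \sigma(W_o \mathbf{e}_t + U_o \mathbf{h}_{t-1} + \mathbf{b}_o) \\
\tilde{\mathbf{c}}_t &= \tanh(W_c \mathbf{e}_t + U_c \mathbf{h}_{t-1} + \mathbf{b}_c) \\
\mathbf{c}_t &= \mathbf{f}_t \odot \mathbf{c}_{t-1} + \mathbf{i}_t \odot \tilde{\mathbf{c}}_t \\
\mathbf{h}_t &= \mathbf{o}_t \odot \tanh(\mathbf{c}_t)
\end{aligned}
$$

Here $\mathbf{h}_t$ and $\mathbf{c}_t$ are the hidden and cell states, and the final hidden state $\mathbf{h}_T$ is the vector projected to the class logits.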

Pseudocode (Algorithm 2): LSTM Classifier Training

Input: token sequence x = (x1, ..., xT), ground truth class label y
Output: predicted class probabilities y_hat

1. e = Embed(x)                           // token embedding
2. h_T = LSTM(e)                          // last hidden state after sequence
3. logits = W_out h_T + b_out
4. y_hat = softmax(logits)                // predicted class probabilities
5. loss = CrossEntropyLoss(logits, y)     // takes the logits; log-softmax is applied inside the loss
6. Update all weights by backpropagation

Loss Function: The model is trained using the Cross-Entropy loss:

$$ \mathcal{L}_{\text{CE}} = - \sum_{j=1}^{C} y_j \log(\hat{y}_j) $$

where $C$ is the number of strategy classes, and $y_j$ and $\hat{y}_j$ are the true and predicted probability for class $j$.
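As a worked instance of this loss, the sketch below evaluates softmax cross-entropy directly from logits using the log-sum-exp trick. Because $y$ is one-hot, the sum over $j$ reduces to the negative log-probability of the true class. The function names and example logits are assumptions for illustration.

```python
import math

def log_softmax(logits):
    # Subtracting the maximum keeps exp() from overflowing.
    m = max(logits)
    lse = m + math.log(sum(math.exp(z - m) for z in logits))
    return [z - lse for z in logits]

def cross_entropy(logits, target_index):
    # One-hot target: only the true class contributes to the sum.
    return -log_softmax(logits)[target_index]

logits = [2.0, 0.5, -1.0]   # hypothetical scores for C = 3 strategy classes
probs = [math.exp(v) for v in log_softmax(logits)]
loss = cross_entropy(logits, 0)
print(probs, loss)
```

A model that assigns equal scores to all $C$ classes incurs a loss of $\log C$, which is a useful sanity check early in training.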


3.4.3. Sequence-to-Sequence (Seq2Seq) Model

The Seq2Seq model employs an encoder-decoder framework, typically based on LSTM layers, to enable generative, token-wise strategy prediction. The encoder receives the tokenized situational context (optionally including a context window) and encodes it into a fixed-length context vector. The decoder auto-regressively generates the output strategy sequence, one token at a time, using the encoded context as its initial state.

Training employs teacher forcing, where the decoder receives the ground-truth previous token as input at each step. Two special tokens, beginning-of-sequence (BOS) and end-of-sequence (EOS), mark the start and end of generation, respectively.
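Teacher forcing amounts to shifting the target sequence by one position: the decoder input at step $t$ is the ground-truth token from step $t-1$, with BOS in the first slot. A minimal sketch follows; the token strings `<BOS>` and `<EOS>`, the function name, and the example strategy tokens are hypothetical, and appending EOS to the target is an assumption about how the data is prepared.

```python
def teacher_forcing_pairs(target_tokens, bos="<BOS>", eos="<EOS>"):
    # Targets end with EOS so the model learns when to stop generating.
    targets = list(target_tokens) + [eos]
    # Inputs are the targets shifted right by one, starting with BOS.
    decoder_inputs = [bos] + targets[:-1]
    return decoder_inputs, targets

decoder_inputs, targets = teacher_forcing_pairs(["advance", "push"])
for step_input, step_target in zip(decoder_inputs, targets):
    print(f"input={step_input!r:12} target={step_target!r}")
```

At inference time no ground truth is available, so the decoder's own previous prediction takes the place of `step_input`.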

Flow diagram: Input tokens → Embedding → Encoder LSTM → Context vector → Decoder LSTM (auto-regressive, with teacher forcing) → Output tokens (strategy sequence)

Pseudocode (Algorithm 3): Seq2Seq Generation and Training

Input: input token sequence x = (x1, ..., xT), ground-truth output tokens y = (y1, ..., yL)
Output: predicted token sequence y_hat

Encoder:
1. e_x = Embed(x)
2. h_enc = EncoderLSTM(e_x)               // final encoder state (context vector)

Decoder (training with teacher forcing):
3. y_prev = BOS_token; s = h_enc; loss = 0
4. For t = 1 to L:
      e_y = Embed(y_prev)
      h_dec, s = DecoderLSTM(e_y, s)      // carry the decoder state from step to step
      logits = W_out h_dec + b_out
      y_hat_t = softmax(logits)
      loss += CrossEntropyLoss(logits, y_t)   // takes the logits, not y_hat_t
      y_prev = y_t                        // teacher forcing

Decoder (inference):
5. Starting from y_prev = BOS_token and s = h_enc, autoregressively generate y_hat, feeding each generated token back as y_prev, until EOS_token

Loss Function: The training objective is token-wise categorical cross-entropy:

$$ \mathcal{L}_{\text{Seq2Seq}} = - \sum_{t=1}^{L} \sum_{j=1}^{V} y_{t,j} \log(\hat{y}_{t,j}) $$

where $V$ is the vocabulary size, $L$ is the target sequence length, and $y_{t,j}$, $\hat{y}_{t,j}$ are true and predicted probabilities for token $j$ at time step $t$.
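Since each $y_t$ is one-hot over the vocabulary, the inner sum over $j$ keeps only the term for the true token, so the objective is the sum of per-step negative log-probabilities. The sketch below illustrates this; the function name and the toy logits are assumptions, and padding or length normalization, which a real training loop may apply, is not shown.

```python
import math

def sequence_cross_entropy(step_logits, target_ids):
    # step_logits: one list of V logits per time step; target_ids: true token index per step.
    total = 0.0
    for logits, j in zip(step_logits, target_ids):
        m = max(logits)
        lse = m + math.log(sum(math.exp(z - m) for z in logits))
        total += lse - logits[j]   # equals -log softmax(logits)[j]
    return total

# Toy case: L = 3 steps, V = 4 tokens, uniform logits at every step.
uniform = [[0.0, 0.0, 0.0, 0.0]] * 3
loss = sequence_cross_entropy(uniform, [1, 3, 0])
print(loss)
```

With uniform predictions the loss is $L \log V$, the value an untrained decoder should be close to.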


These algorithmic formulations provide a rigorous foundation for reproducible experimentation and comparative evaluation among the SLM variants in the context of Sumo robot strategy prediction.


If you would like a caption for the flow diagram, or a bilingual (Indonesian-English) version, just say so! Shall we continue with the contextual seq2seq or the multi-label classifier?

Sure! Here is the narrative for 3.4.4. Contextual Seq2Seq Model, 3.4.5. Contextual Multi-label Classifier, and 3.4.6. Toolset for Benchmarking and Evaluation, in Sinta 1/Q1 journal style, with scholarly narrative paragraphs, explicit formulas, and a comprehensive explanation of the architecture and evaluation:


3.4.4. Contextual Seq2Seq Model

The Contextual Sequence-to-Sequence (Seq2Seq) model extends the standard Seq2Seq framework by incorporating a buffer of recent situational contexts as its input, thereby enhancing temporal awareness and strategy adaptability. In this architecture, instead of processing a single situational descriptor, the input to the encoder is a concatenation of the N most recent situational statements, separated by special context tokens such as “[CTX]”. This context window allows the model to consider short-term history, which is crucial for making informed decisions in dynamic Sumo robot matches.
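A context buffer of this kind can be sketched with a fixed-size queue that drops the oldest situation once N entries are held. The class name, method names, and example situation strings are hypothetical; only the "[CTX]" delimiter and the N-most-recent behavior come from the description above.

```python
from collections import deque

CTX = "[CTX]"  # context delimiter token named in the text

class ContextBuffer:
    def __init__(self, n):
        # deque(maxlen=n) discards the oldest entry automatically.
        self.items = deque(maxlen=n)

    def push(self, situation):
        self.items.append(situation)

    def as_text(self):
        # Oldest situation first, most recent last.
        return f" {CTX} ".join(self.items)

buf = ContextBuffer(n=2)
for situation in ["enemy far left", "enemy close front", "near ring edge"]:
    buf.push(situation)
encoder_input = buf.as_text()
print(encoder_input)  # enemy close front [CTX] near ring edge
```

The resulting string is what would then be tokenized and passed to the encoder.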

The overall data flow begins with the assembly of the context buffer, which is tokenized and embedded before being fed into a bidirectional LSTM encoder. The encoded context representation is then passed to the LSTM decoder, which generates the output strategy sequence token by token in an auto-regressive manner, leveraging the accumulated contextual information. The decoding process employs either greedy search or top-k sampling, depending on the inference configuration.
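The two decoding modes differ only in how the next token is chosen from the decoder's logits at each step. The sketch below shows one step of each; the function names and the example logits are assumptions, and the value of k used in the study is not stated in the text.

```python
import math
import random

def greedy_pick(logits):
    # Greedy search: always take the highest-scoring token.
    return max(range(len(logits)), key=lambda j: logits[j])

def top_k_sample(logits, k, rng=random):
    # Keep the k highest-scoring tokens and sample among them
    # in proportion to their softmax weights.
    top = sorted(range(len(logits)), key=lambda j: logits[j], reverse=True)[:k]
    m = max(logits[j] for j in top)
    weights = [math.exp(logits[j] - m) for j in top]
    return rng.choices(top, weights=weights, k=1)[0]

logits = [0.1, 2.0, -1.0, 1.5]   # hypothetical scores over a 4-token vocabulary
rng = random.Random(0)           # seeded generator for repeatable sampling
print(greedy_pick(logits), top_k_sample(logits, k=2, rng=rng))
```

Greedy search is deterministic, which suits reproducible benchmarking; top-k sampling trades some determinism for more varied strategies, and with k = 1 it reduces to greedy search.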

Compared to the standard Seq2Seq model, the Contextual Seq2Seq model’s primary distinction lies in its multi-situational input structure and the explicit use of context delimiter tokens. This enhancement allows the model to generate more contextually appropriate and temporally consistent strategies, especially in scenarios involving rapid environmental changes or adversarial tactics.

The operational flow of this architecture is illustrated in Figure X, which depicts the sequence from input context buffer through embedding, encoding, and decoding, culminating in the generation of an action strategy.


3.4.5. Contextual Multi-label Classifier

The Contextual Multi-label Classifier is designed to address the challenge of simultaneously predicting multiple relevant actions, while incorporating recent situational history into the decision process. The model architecture is based on either a bidirectional LSTM or a Transformer encoder, both of which process the tokenized context buffer assembled from the N most recent situations.

In this model, the entire context window is first embedded, then processed by the encoder to capture both temporal and semantic dependencies across situations. The encoded representation is passed through one or more fully connected layers with sigmoid activation units, producing a vector of probabilities corresponding to the complete set of possible actions.

The output vector $\hat{y}$, with each element $\hat{y}_i$ in the range [0,1], represents the predicted probability of executing action $i$ given the context. A fixed threshold $\tau$ (typically 0.5) is applied to determine which actions are selected as active:

For each action i:
    If ŷ_i > τ, then action i is included in the predicted strategy;
    Otherwise, it is excluded.

Or, in formula form: $\text{Predicted actions} = \{\, i \mid \hat{y}_i > \tau \,\}$
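The thresholding rule is a one-line filter in code. In the sketch below, the action names and probabilities are made up for illustration; only the rule $\hat{y}_i > \tau$ with $\tau = 0.5$ comes from the text.

```python
def select_actions(probs, labels, tau=0.5):
    # Keep every action whose predicted probability strictly exceeds tau.
    return [name for name, p in zip(labels, probs) if p > tau]

labels = ["advance", "turn_left", "retreat", "push"]   # hypothetical action set
probs = [0.9, 0.5, 0.2, 0.51]
active = select_actions(probs, labels)
print(active)  # ['advance', 'push']
```

Because the inequality is strict, a probability exactly equal to $\tau$ is excluded, and the predicted set can be empty when no action clears the threshold.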

This approach enables efficient, parallel, and context-aware prediction of multiple strategies, with direct support for history-dependent behaviors and rapid inference—crucial for real-time robotic applications.


3.4.6. Toolset for Benchmarking and Evaluation

A comprehensive and standardized evaluation toolset is integrated to ensure fair, transparent, and reproducible benchmarking across all SLM variants. The toolset consists of automated scripts and software modules designed to calculate a suite of key performance metrics—including token-level accuracy, BLEU score (for sequence generation tasks), F1-score, precision, recall (for multi-label classification), and inference latency.

The evaluation process is executed as follows:

  • For each trained model, predictions are generated on the held-out test dataset using consistent input formatting and batch sizes.
  • The evaluator script compares predicted outputs to ground-truth labels, computes all relevant metrics, and exports the results as tables and publication-quality graphics.
  • BLEU score is calculated according to the standard formula, measuring n-gram overlap between generated and reference strategy sequences. F1, precision, and recall are computed for each action label and averaged (micro or macro) as appropriate.
  • Inference latency is measured as the average time per prediction across the entire test set, providing insights into each model’s real-time suitability.
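The difference between micro and macro averaging mentioned in the steps above can be sketched as follows. Macro averages the per-label F1 scores, while micro pools the true-positive, false-positive, and false-negative counts across labels first. The function names and toy data are assumptions, as is the convention of scoring 0.0 when a denominator is zero.

```python
def prf(tp, fp, fn):
    # Precision, recall, F1 from raw counts; 0.0 when undefined (assumed convention).
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1

def multilabel_f1(y_true, y_pred):
    k = len(y_true[0])
    counts = [[0, 0, 0] for _ in range(k)]   # [tp, fp, fn] per label
    for t_row, p_row in zip(y_true, y_pred):
        for i, (t, p) in enumerate(zip(t_row, p_row)):
            if t and p:
                counts[i][0] += 1
            elif p:
                counts[i][1] += 1
            elif t:
                counts[i][2] += 1
    macro = sum(prf(*c)[2] for c in counts) / k
    micro = prf(*[sum(c[j] for c in counts) for j in range(3)])[2]
    return micro, macro

y_true = [[1, 0], [1, 1], [0, 1]]   # toy multi-hot ground truth, k = 2 labels
y_pred = [[1, 0], [1, 0], [1, 1]]
micro, macro = multilabel_f1(y_true, y_pred)
print(micro, macro)
```

Macro averaging weights rare and frequent actions equally, whereas micro averaging is dominated by the frequent ones, so reporting both is informative when the action distribution is imbalanced.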

This automated benchmarking framework supports both batch and single-sample evaluation, and is extensible for future metric inclusion or deployment on different computational environments. The toolset ensures that all models are compared under identical conditions, thereby providing a rigorous foundation for result analysis and scientific reproducibility.
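Average per-prediction latency, as used by the evaluation process, can be measured with a monotonic high-resolution timer. The sketch below is a minimal version: `mean_latency`, the warm-up pass, and the stand-in `predict` function are assumptions for illustration, not the toolset's actual code.

```python
import time

def mean_latency(predict, samples, warmup=1):
    # Run a few untimed predictions first so one-off setup costs are excluded (assumed practice).
    for s in samples[:warmup]:
        predict(s)
    start = time.perf_counter()
    for s in samples:
        predict(s)
    elapsed = time.perf_counter() - start
    return elapsed / len(samples)   # seconds per prediction

def fake_predict(situation):
    # Stand-in for a trained model's single-sample inference call.
    return situation.upper()

latency = mean_latency(fake_predict, ["enemy left", "enemy front", "near edge"])
print(f"{latency * 1e6:.2f} microseconds per prediction")
```

Timing single-sample calls in this way reflects the real-time setting; batch throughput would be measured by timing whole batches and dividing by the batch size.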


Tips:

  • Add a flow diagram (flowchart/figure) to the contextual seq2seq section, following the template.
  • If you need an explanation of the evaluator code or the metric output format (tables/graphs), just ask!

Ready to move on to the next chapter!
