Assignment 2 - MIPT-ILab/mipt-mips GitHub Wiki

**Note: this assignment is finished.**

Introduction

In this assignment you will implement the instruction decoder and disassembler.

All requirements remain the same as in the previous task.


What is disassembler?

As you know, every instruction is just a set of bits. In MIPS, each instruction is a 4-byte word. The processor decodes an instruction according to a set of rules called its format.

Although these 4-byte encodings are compact and directly readable by the processor, they are very hard for a programmer to read. That is why people use assembly language: a language of human-readable, standardized statements. Every statement can be converted to a 4-byte instruction, which is called assembling. Conversely, every 4-byte instruction can be converted back to an assembly statement, which is called disassembling.

So, a disassembler is a program that converts an assembled program back into a human-readable form. It is extremely useful when developing any simulator, architectural or micro-architectural, functional or performance.
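
To make the assembling direction concrete, here is a minimal sketch that packs the fields of an R-type instruction into one 32-bit word. The function name encodeR is hypothetical and not part of the assignment; the field layout is the standard MIPS R-format.

```cpp
#include <cstdint>

// Pack R-type fields into one 32-bit MIPS word.
// Layout, from high to low bits: opcode:6 | rs:5 | rt:5 | rd:5 | shamt:5 | funct:6
// encodeR is an illustrative helper, not part of the required interface.
uint32_t encodeR( uint32_t rs, uint32_t rt, uint32_t rd,
                  uint32_t shamt, uint32_t funct)
{
    const uint32_t opcode = 0; // add and similar R-type instructions use opcode 0
    return ( opcode << 26)
         | ( ( rs    & 0x1F) << 21)
         | ( ( rt    & 0x1F) << 16)
         | ( ( rd    & 0x1F) << 11)
         | ( ( shamt & 0x1F) << 6)
         |   ( funct & 0x3F);
}
```

For example, add $t0, $t1, $t2 uses rd = 8, rs = 9, rt = 10 and funct = 0x20, so encodeR( 9, 10, 8, 0, 0x20) yields 0x012A4020. A disassembler performs the reverse step.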


Task

Update your repo with the latest materials (the reference implementation of func_memory) available in our main repo. Create a new branch task_2 and add three new files:

    func_sim/func_instr/func_instr.h
    func_sim/func_instr/func_instr.cpp
    func_sim/func_instr/disasm.cpp
Note: You have to update your repo to use the "master" version of func_memory.

Coverage

The disassembler should support all instructions listed on this page, excluding pseudo-instructions, and all registers listed on the MIPS register page.

FuncInstr class

The FuncInstr class is the basic abstraction of an instruction in our future simulator.

Interface

Your implementation should provide at least the following interface:

class FuncInstr
{
    // ...
    public:
        FuncInstr( uint32 bytes);
        std::string Dump( std::string indent = " ") const;
};

std::ostream& operator<<( std::ostream& out, const FuncInstr& instr);

You are free to create your own internal variables and methods.

Suggested internal variables

You need to specify:

  • Format type (R, I or J)
  • Register addresses
  • Type of instruction

The easiest way to support these options is C enumerations.

class FuncInstr
{
    // ...
    enum Format
    {
        FORMAT_R,
        FORMAT_I,
        FORMAT_J
    } format;
    // ...
};
Constructor

The constructor takes the bytes argument and initializes the internal variables by parsing these bytes according to the MIPS instruction format.

Please avoid a long constructor. We recommend writing a few helper functions.

FuncInstr::FuncInstr( uint32 bytes)
{
    this->initFormat(bytes);
    switch (this->format)
    {
        case FORMAT_R:
            this->parseR(bytes);
            break;
        case FORMAT_I:
            this->parseI(bytes);
            break;
        case FORMAT_J:
            this->parseJ(bytes);
            break;
        default:
            assert(0);
    }
    // ...
}
Byte parser

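A good technique for parsing the bytes is a C-style union of structures with bit fields. Note that the order of bit fields within a word is implementation-defined; the layout below assumes that the first declared field occupies the lowest bits, as GCC does on little-endian hosts.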

class FuncInstr
{
    // ...
    union
    {
        struct
        {
            unsigned imm:16;
            unsigned t:5;
            unsigned s:5;
            unsigned opcode:6;
        } asI;
        struct
        {
            // ...
        } asR;
        struct
        {
            // ...
        } asJ;
        uint32 raw;
    } bytes;
};
// ...
    this->tReg = this->bytes.asI.t;
// ...
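
If you prefer not to rely on the compiler's bit-field layout, the same fields can be extracted with shifts and masks. The sketch below does this for the I format; the names IFields and splitI are hypothetical, and uint32_t from <cstdint> stands in for the project's uint32 type.

```cpp
#include <cstdint>

// Fields of an I-type instruction, named as in the asI structure above.
// IFields and splitI are illustrative names, not part of the assignment.
struct IFields
{
    uint32_t opcode;
    uint32_t s;
    uint32_t t;
    uint32_t imm;
};

// Extract I-type fields from a raw word using shifts and masks.
// Layout, from high to low bits: opcode:6 | s:5 | t:5 | imm:16
IFields splitI( uint32_t raw)
{
    IFields f;
    f.imm    =   raw         & 0xFFFF;
    f.t      = ( raw >> 16) & 0x1F;
    f.s      = ( raw >> 21) & 0x1F;
    f.opcode = ( raw >> 26) & 0x3F;
    return f;
}
```

For the word 0x21880020 (addi $t0, $t4, 0x20) this gives opcode 0x8, s = 12, t = 8 and imm = 0x20.
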
ISA storage

To store ISA information, we suggest using a static array. An example of such an array is shown below.

class FuncInstr
{
    // ...
    struct ISAEntry
    {
        const char* name;

        uint8 opcode;
        uint8 func;

        FuncInstr::Format format;
        FuncInstr::Type type;
        // ...
    };
    static const ISAEntry isaTable[];
    // ...
};

// ...

const FuncInstr::ISAEntry FuncInstr::isaTable[] =
{
    // name   opcode    func      format              type
    { "add  ", 0x0,     0x20, FuncInstr::FORMAT_R, FuncInstr::ADD /*...*/ },
    // more instructions ...
};

You are free to add as many fields to ISAEntry as you wish.
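
One possible way to use such a table is a linear search keyed on the opcode and, for R-type instructions, on the func field. The sketch below is a standalone simplification: the free function findEntry, the three-entry table, and the reduced ISAEntry are assumptions made for illustration, not the required design.

```cpp
#include <cstdint>

enum Format { FORMAT_R, FORMAT_I, FORMAT_J };

// Reduced version of ISAEntry, kept outside the class for brevity.
struct ISAEntry
{
    const char* name;
    uint8_t opcode;
    uint8_t func;
    Format format;
};

static const ISAEntry isaTable[] =
{
    // name    opcode func  format
    { "add",   0x0,   0x20, FORMAT_R },
    { "addi",  0x8,   0x0,  FORMAT_I },
    { "j",     0x2,   0x0,  FORMAT_J },
};

// Return the table entry matching the raw word,
// or nullptr if the instruction is not in the table.
const ISAEntry* findEntry( uint32_t raw)
{
    const uint32_t opcode = raw >> 26;   // top 6 bits
    const uint32_t func   = raw & 0x3F;  // low 6 bits, meaningful for R-type only
    for ( const ISAEntry& entry : isaTable)
    {
        if ( entry.opcode != opcode)
            continue;
        // R-type entries share an opcode and are told apart by func.
        if ( entry.format != FORMAT_R || entry.func == func)
            return &entry;
    }
    return nullptr;
}
```

Returning nullptr for unknown words lets the caller decide how to report an unsupported instruction.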

Dump

The Dump method should return the disassembly of the instruction with the correct instruction name and register names. The output format is as follows:

<indent><instr name> <reg1>[, <reg2>][, <reg3>][, const]

Examples:

add $t0, $t1, $t2

addi $t0, $t4, 0x20

Constants are printed in hexadecimal format.
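
As a rough sketch of how such strings could be assembled from already-parsed fields, the helpers below format R-type and I-type instructions with std::ostringstream. The names regNames, formatR and formatI are hypothetical, and the register names follow the common MIPS convention, which is assumed to match the MIPS register page.

```cpp
#include <cstdint>
#include <sstream>
#include <string>

// Conventional MIPS register names, indexed by register number.
// Assumption: register 30 is shown as $fp (some tables call it $s8).
static const char* const regNames[32] =
{
    "$zero", "$at", "$v0", "$v1", "$a0", "$a1", "$a2", "$a3",
    "$t0",   "$t1", "$t2", "$t3", "$t4", "$t5", "$t6", "$t7",
    "$s0",   "$s1", "$s2", "$s3", "$s4", "$s5", "$s6", "$s7",
    "$t8",   "$t9", "$k0", "$k1", "$gp", "$sp", "$fp", "$ra"
};

// Format "<indent><name> <rd>, <rs>, <rt>" for a three-register instruction.
std::string formatR( const std::string& indent, const std::string& name,
                     unsigned d, unsigned s, unsigned t)
{
    std::ostringstream oss;
    oss << indent << name << " " << regNames[d & 0x1F] << ", "
        << regNames[s & 0x1F] << ", " << regNames[t & 0x1F];
    return oss.str();
}

// Format "<indent><name> <rt>, <rs>, 0x<imm>" with a hexadecimal constant.
std::string formatI( const std::string& indent, const std::string& name,
                     unsigned t, unsigned s, uint32_t imm)
{
    std::ostringstream oss;
    oss << indent << name << " " << regNames[t & 0x1F] << ", "
        << regNames[s & 0x1F] << ", 0x" << std::hex << imm;
    return oss.str();
}
```

With these helpers, formatR( "", "add", 8, 9, 10) produces add $t0, $t1, $t2, and formatI( "", "addi", 8, 12, 0x20) produces addi $t0, $t4, 0x20.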

Note: The Dump function must not parse any input bytes! It should use only the internal variables of the class.
Hint: You may build the output string in the constructor and save it.
Output operator

The output operator should just call the Dump method with an empty indent string.

std::ostream& operator<< ( std::ostream& out, const FuncInstr& instr)
{
     return out << instr.Dump("");
}

Disassembler

The disassembler should be stored in the disasm.cpp file. This file should have a main function taking two arguments: the ELF filename and the section name. It launches ElfParser, reads every 32-bit word in this section, and prints its disassembly on standard output with four-space indentation.
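
The word-by-word loop can be sketched independently of ElfParser, whose interface is not shown here. The helper below, with the hypothetical name sectionWords, assumes that the section contents are already available as a byte vector and that words are stored little-endian; both are assumptions to check against your ELF files and the func_memory implementation.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Reassemble 32-bit words from the raw bytes of a section.
// Assumptions: little-endian byte order; trailing bytes that do not
// form a complete word are ignored. sectionWords is an illustrative name.
std::vector<uint32_t> sectionWords( const std::vector<uint8_t>& bytes)
{
    std::vector<uint32_t> words;
    for ( std::size_t i = 0; i + 4 <= bytes.size(); i += 4)
    {
        const uint32_t word = static_cast<uint32_t>( bytes[i])
                            | static_cast<uint32_t>( bytes[i + 1]) << 8
                            | static_cast<uint32_t>( bytes[i + 2]) << 16
                            | static_cast<uint32_t>( bytes[i + 3]) << 24;
        words.push_back( word);
    }
    return words;
}
```

In disasm.cpp, each resulting word would then be passed to the FuncInstr constructor, and the result of Dump called with a four-space indent would be written to std::cout.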


Validation

A Makefile and unit tests already exist. Your disassembler should be built by make disasm, and the tests on the FuncInstr class are run by make test.

Note that the tests cover only a few simple cases. Please make sure that your own coverage is sufficient. You can use MIPS binutils to generate sample tests for your disassembler implementation. Test programs and instructions are available in the <workspace>/tests/samples directory.
