Mantras - arizvisa/ida-minsc GitHub Wiki

This plugin tries to follow a couple of mantras and so it might help to share them with users. Mumble these to yourself quietly as you use this plugin.

  • Object-oriented programming and its schemas suck as they require repeatedly referencing documentation. Therefore, I will avoid that style of programming as often as possible.
  • Things such as long module names, namespaces, or function names take longer to type and thus are inherently designed to slow me down. Therefore, I will use aliases and shortcuts as often as possible.
  • If I'm not sure where to find the function that I need in order to perform some action on something that I am currently reversing, I will resolve this by identifying the action I'm trying to perform and using it to determine the namespace that contains the functionality that I desire. Auto-completion is my friend.
  • I don't care to remember what parameters a particular function needs. Therefore, I will give the function whatever I currently have (prioritized in order of scope) and allow the function to fill in what is missing according to what I have currently selected.
  • I dislike repeating actions to find things while I'm reversing. Therefore, I will tag them so that I can query them later and avoid having to revisit some area entirely.
  • Numbers, lists, dictionaries, sets, and names are also legitimate annotations in an IDA database. I will tag these values in certain contexts using a name whose semantics can quickly communicate their purpose to me or to someone else that I am sharing my tags with.
  • The cost of a (mental) context-switch for development (or similar) is expensive and distracts me from my current reverse-engineering endeavors. Therefore, I will attempt to perform the minimal amount of development as necessary in order to still remain focused on my original goal.

Object-oriented programming and schemas suck due to me having to memorize them along with parameter order.

"Therefore, I will avoid relying on them at all costs and mentally prioritize the order by their scope (larger-encompassing to smaller-details)."

This plugin aims to steer away from the user having to keep track of object types and their relationships in order to allow the user to focus on what they're doing. The IDAPython api expects the user to know certain flags, types, or structures when interacting with it. The plugin attempts to abstract away this required knowledge by allowing the user to use addresses, ranges, names, indexes, etc. as parameters where they're needed.
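
As a rough illustration of this idea, the sketch below accepts an address, a name, or a range and normalizes whatever it was given. It does not use the plugin or IDA at all; the `SYMBOLS` table and the `resolve` and `describe` helpers are hypothetical names made up for this example.

```python
# Hypothetical symbol table standing in for an IDA database.
SYMBOLS = {'main': 0x401000, 'parse_header': 0x401200}

def resolve(location):
    """Normalize an address, a symbol name, or a (start, stop) range to an address."""
    if isinstance(location, int):
        return location                 # already an address
    if isinstance(location, str):
        return SYMBOLS[location]        # look the name up
    if isinstance(location, tuple) and len(location) == 2:
        start, _ = location             # a range is identified by its start
        return resolve(start)
    raise TypeError('unsupported location: {!r}'.format(location))

def describe(location):
    """The caller never has to say which kind of parameter was passed."""
    return 'address {:#x}'.format(resolve(location))

print(describe(0x401000))
print(describe('main'))
print(describe(('parse_header', 0x401300)))
```

The point of the design is that the type checks live in one place, so every function built on top of `resolve` inherits the same flexibility.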

Examples

  • I don't remember how to interact with the first operand of this instruction.

    • But, I need to change the first operand's type to a stack variable.
      instruction.op_stack(0)
      
    • I want to find all the other references to this operand.
      instruction.op_refs(0)
      
    • I want to make the first operand a hexadecimal number.
      instruction.op_hex(0)
      
    • I want to output the hexadecimal value of this operand.
      print(instruction.op_hex(0))
      
    • I want to output the type of this operand.
      print(instruction.op_type(0))
      
    • Is this operand being written to?
      'w' in instruction.op_state(0)
      
    • How is IDA rendering this operand?
      print(instruction.op_repr(0))
      
  • I want to change the type of a structure member, but I don't remember anything about IDA's structure API in order to determine the structure's identifier, or even its complete name.

    • How should I identify a structure that I want to use?
      1. First I'll list them.
        • I don't remember what I have, so I'll just list everything.
          structure.list()
          
        • I know it starts with an underscore, so I'll list everything prefixed with an underscore.
          structure.list('_*')
          
        • I have a regex for the structures I want to list.
          structure.list(regex='_?LIST.*')
          
      2. Now I can assign it to a variable.
        • I know the exact name of the structure I want to fetch.
          st = structure.by('_LIST_ENTRY')
          
        • I want the structure at index 5 in my structure list.
          st = structure.by(index=5)
          
    • I want to display this structure's members in order to locate a particular one.
      • I don't remember what fields are defined, so I'll just display it.
        print(st.members)
        
      • I want to know the member at offset 0x10.
        print(st.members.by_offset(0x10))
        
      • My field is the third member of the structure (index 2, counting from zero).
        print(st.members[2])
        
    • I'm pretty sure the member I want has "ptr" somewhere in its name.
      • Let me save the member to a variable.
        mem = st.members.by(like='*ptr*')
        
    • This member needs some of its attributes to be modified.
      • I want to modify the name of the member by prefixing it with "p".
        mem.name = 'p', mem.name
        
      • It needs to be a 3-element array of 32-bit integers, but I have no idea what magic spell of flag combinations to use to get IDA to understand me.
        1. I know that 32 bits is 4 bytes in size, so my type should look like the following.
          mytype = (int, 4)
          
        2. This array has 3 elements, so I need to wrap my type in a list.
          [mytype, 3]
          
        3. I'll just assign the full type directly to the member.
          mem.type = [(int, 4), 3]
          
      • I want the member type to be a void pointer.
        mem.typeinfo = 'void*'
        
    • The second instruction operand of the current instruction is actually pointing into this member.
      • I'll just inform IDA that the second operand (1) of the instruction is referencing the member.
        instruction.op_structure(ea, 1, mem)
        
      • To display the structure offset of this operand, I'll print it.
        print(instruction.op_structure(1))
        

Long modules/namespaces/functions slow down my typing because I don't know how to touch-type and have a terrible WPM.

"Therefore, I will use aliases and shortcuts as often as possible."

The plugin aims to accomplish this by taking the most common function names, and adding aliases that reference them so that if the user doesn't remember the full name, they can take a guess and likely find what they're looking for. Module names (representing contexts) like "database" or "function", are abbreviated as "db" or "func" (respectively). Likewise, the common namespaces and even functions within these modules are also abbreviated. So, the database.address namespace can be abbreviated as db.a, database.xref has an abbreviation as db.x, and even the database.here() function has an abbreviation as db.h().
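
A minimal sketch of the aliasing idea follows. It assumes nothing about how the plugin is actually implemented: the plugin uses real modules, whereas this example uses a hypothetical `database` class with made-up contents purely to show that a short name can point at the very same object as the long one.

```python
class database(object):
    """Hypothetical stand-in for a context; the plugin uses real modules."""
    _current = 0x401000                 # pretend cursor position

    @classmethod
    def here(cls):
        return cls._current
    h = here                            # short alias for the same callable

    class address(object):
        @staticmethod
        def next(ea):
            return ea + 1
        n = next                        # short alias for the same callable
    a = address                         # short alias for the namespace

db = database                           # short alias for the context

# The long and short spellings reach the same functionality.
print(hex(database.address.next(database.here())))
print(hex(db.a.n(db.h())))
```

Because an alias is just another name bound to the same object, it costs nothing to maintain and can never drift out of sync with the full name.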

On startup, the plugin will reset the global Python namespace and load the contents of the user's ".idapythonrc.py" file into it. This way the user can save any common snippets or tools they write, and make them available throughout all of their reversing sessions.
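
The general technique of loading an rc-file into a namespace can be sketched as below. This is not the plugin's own loader; `load_rcfile` is a hypothetical helper, and a throwaway temporary file stands in for the real ".idapythonrc.py".

```python
import os
import tempfile

def load_rcfile(path, namespace):
    """Execute the file at `path` so that its definitions land in `namespace`."""
    with open(path) as infile:
        source = infile.read()
    exec(compile(source, path, 'exec'), namespace)
    return namespace

# Write a throwaway rc-file instead of touching a real one.
with tempfile.NamedTemporaryFile('w', suffix='.py', delete=False) as rc:
    rc.write("def greet(name):\n    return 'hello, ' + name\n")

try:
    namespace = load_rcfile(rc.name, {})
finally:
    os.remove(rc.name)

# Anything defined in the rc-file is now available by name.
print(namespace['greet']('reverser'))
```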

Examples

  • I like to enumerate all the cases for the switches within a function, but I don't want to have to type the same code all the time. To avoid this, I'll add some code to my $HOME/.idapythonrc.py file so that it's always available.
    1. I'll first add a function that dumps a switch to my rc-file.
      • Let's define a function called "dump_switch" that will just print out every case number and address.
        def dump_switch(switch):
            print('\n'.join('%#x: %x'% (case, switch.case(case)) for case in switch.cases))
        
    2. Now I'll use that function to define something that dumps all the switches in a function.
      • I'll start writing a "dump_switches" function.
        def dump_switches(ea):
        
      • First I'll need to iterate through all the switches in a function ea.
        def dump_switches(ea):
            for switch in function.switches(ea):
        
      • Now I can call "dump_switch" for every switch in my function.
        def dump_switches(ea):
            for switch in function.switches(ea):
                dump_switch(switch)
            return
        
    3. Whenever I want to display a switch, I can just call the function directly.
      dump_switches(h())
      
      • I'm going to use this a lot, so I might as well map it to a hotkey.
        ui.keyboard.map('ctrl+a', lambda: dump_switches(h()))
        

I will avoid referring to documentation because I already have hundreds of tabs that I need to go through.

"Thus, I will always keep track of the context that I'm trying to work in. I will use this context when considering what namespace contains the function that I desire."

The namespaces provided by this plugin are organized in a hierarchy, and the path that a user types descends through it to reach the desired functionality. This hierarchy starts with the "context" of what the user desires to interact with. Things like "functions", "structures", "enumerations", etc. get their own module. Within these modules are sub-contexts such as function.frame, function.chunk(s), function.block(s), database.config, database.xref, enumeration.members, database.functions, database.type, database.exports, or function.frame.args.

Once the context/sub-context has been determined, the next aspect will be the action or verb such as database.name, database.get, function.comment, function.blocks.iterate, database.imports.list, function.block.color, function.frame.args.size, etc.

In the majority of cases if the action is a callable function and its name can also be considered an attribute of the context/sub-context it is associated with, then it can be used to either read or modify the attribute within the database. Some examples could be database.color which if given an RGB value will apply the color to an address, whereas if it was not given a color will end up returning the color for an address. Another example would be function.type which will return the function's type information, but when called with a parameter can change the function's type.
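
The read-or-modify behaviour can be sketched with a single function whose meaning depends on whether a value was supplied. The `COLORS` dictionary and this `color` function are hypothetical and only imitate the calling convention; returning the previous value when writing is a choice made for this sketch.

```python
COLORS = {}                             # hypothetical store: address -> RGB value

def color(ea, *rgb):
    """Return the color at `ea`; if an RGB value is given, apply it as well."""
    previous = COLORS.get(ea)
    if rgb:
        COLORS[ea] = rgb[0]             # called with a value: write it
    return previous                     # always report what was there before

print(color(0x401000))                  # nothing has been set yet
color(0x401000, 0x0000ff)               # called with a value, so it writes
print(hex(color(0x401000)))             # called without a value, so it reads
```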

Some actions, such as database.get or database.set, contain further types to descend into, which one can use to read from or write to an address in a database. In these cases, the database.get.integer or database.set.float namespaces provide even more sub-types, such as the database.get.float.single and the database.set.integer.uint32_t functions.

(Remember: verbs/actions come before types and sub-types)
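
The verb, then type, then sub-type ordering can be imitated with nested namespaces. The sketch below is an assumption-laden toy: `MEMORY` is a hypothetical byte buffer standing in for the database, little-endian encoding is assumed, and the `get` class only mirrors the naming, not the plugin's implementation.

```python
import struct

MEMORY = bytearray(16)                  # hypothetical bytes at addresses 0..15

class get(object):
    """Verb first, then the type, then the sub-type."""
    class integer(object):
        @staticmethod
        def uint16_t(ea):
            return struct.unpack_from('<H', MEMORY, ea)[0]

        @staticmethod
        def uint32_t(ea):
            return struct.unpack_from('<I', MEMORY, ea)[0]

    class float(object):
        @staticmethod
        def single(ea):
            return struct.unpack_from('<f', MEMORY, ea)[0]

# Place some sample data so there is something to read.
struct.pack_into('<H', MEMORY, 0, 0x1234)
struct.pack_into('<f', MEMORY, 4, 1.5)

print(hex(get.integer.uint16_t(0)))
print(get.float.single(4))
```

With this layout, auto-completion on `get.` lists the available types, and auto-completion on `get.integer.` lists the available sizes.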

Examples

  • I want to set a uint16_t at an address in the "database".
    • Context is "database"
    • Verb/action is "set"
    • Type is an integral "uint16_t"
    • The function I'll need to call should set an "uint16_t".
      database.set.integer.uint16_t(ea)
      
  • I want to get a single at an address in the "database".
    • Context is "database"
    • Verb/action is "get"
    • Type is a float of type "single"
    • The function I'll need to call should get a floating-point "single".
      database.get.float.single(ea)
      

I'm not sure which parameter goes where, but I'm currently at a particular address or have a basic block selected that should be used.

This is represented in the plugin via examination of the parameters that the user gave a function. If a necessary parameter is found missing, then the implementation will try to take a guess at what the user is trying to act upon. If the user did not give the function an address that is required by it, then the plugin will look at what the user has selected and attempt to fill it in.

As each function defined by the plugin can take most arbitrary types as their parameters, the plugin will convert its input into whatever the function needs in order to perform the action requested by the user.
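
One way to picture the "fill in what is missing" behaviour is a decorator that substitutes the current selection when no address is passed. This is only a sketch of the idea: `CURRENT`, `defaults_to_current`, and `describe` are hypothetical names, and the plugin's real dispatching is not shown here.

```python
import functools

CURRENT = {'ea': 0x401000}              # hypothetical cursor position

def defaults_to_current(function):
    """Call `function` with the current address whenever none was given."""
    @functools.wraps(function)
    def wrapper(*args):
        if not args:
            args = (CURRENT['ea'],)     # fall back to what is selected
        return function(*args)
    return wrapper

@defaults_to_current
def describe(ea):
    return 'looking at {:#x}'.format(ea)

print(describe())                       # no address given: uses the cursor
print(describe(0x402000))               # explicit address wins
```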

Examples

  • I want to disassemble the current basic block that I've selected in a function.
    • I'll need to get the disassembly of the current block and print it.
      print(function.block.disassemble())
      
    • I want to disassemble every basic block and print it.
      for block in function.blocks():
          print(function.block.disassemble(block))
      
    • I want to just figure out their boundaries and write them to the screen.
      for left, right in function.blocks():
          print(hex(left), hex(right))
      
  • I want to know the entrypoint address for some function.
    • To get the address of a function I can just grab it from the "function" module.
      print(function.address(name))
      
    • I'm not sure where the function is at, so I'll just use any address within the function.
      print(function.address(ea))
      

I know I'll need to recall some list of names when looking at this function, but I don't remember what for.

"Therefore, I'll tag it so that I can query it later whenever I remember what that need was."

Some annotations that a reverse-engineer may write into their database can be completely arbitrary or even temporary, as the user may currently be debugging something and the address will change when the process terminates. In order to facilitate these needs, a tag can contain an arbitrary Python type (that must be serializable). These tags will be displayed as "repeatable" comments so that they are readily available as a user is navigating through their database.

When using a more recent version of IDA, these tags can also be modified just like regular comments. In this way, the user does not need to always script the tagging of addresses with their values. Thus if there's something their script missed, the user can manually assign or repair a tag if necessary. By default, all comments the user notes in their database are indexed and are queryable.
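
To make the idea of storing Python values inside a human-editable comment concrete, here is a small sketch. The "[key] value" line format used below is an assumption chosen for illustration and is not claimed to match the plugin's real encoding; `encode` and `decode` are hypothetical helpers.

```python
import ast

def encode(tags):
    """Render a dictionary of tags as comment text, one '[key] value' per line."""
    return '\n'.join('[{}] {!r}'.format(key, tags[key]) for key in sorted(tags))

def decode(comment):
    """Parse comment text produced by encode() back into a dictionary."""
    tags = {}
    for line in comment.split('\n'):
        if not line:
            continue                    # an empty comment has no tags
        key, _, value = line.partition('] ')
        tags[key[1:]] = ast.literal_eval(value)
    return tags

tags = {'imports': ['malloc', 'free'], 'size': 0x20}
comment = encode(tags)
print(comment)
print(decode(comment) == tags)
```

Since the text stays readable, a value that a script got wrong can be repaired by hand and still be parsed afterwards. Keys containing "] " would confuse this toy parser.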

Examples

  • I'm looking at a function that references a bunch of imports within it, and I don't want to have to re-decompile to identify them whenever I see it referenced.
    1. First I'll grab the addresses that I'm associating with the function.
      • I'll assign all the addresses I want to a list.
        addresses = [ea_of_import1, ea_of_import2, ea_of_import3]
        
    2. Next I'll convert them to import names so they're easy to identify.
      • This way I can use a list-comprehension to transform each address to a name.
        imports = [ database.imports.name(ea) for ea in addresses ]
        
    3. Now I can just tag the function with the import names.
      • To save my list of imports, I'll just tag them to the function with the key "imports".
        function.tag(ea, 'imports', imports)
        

I will use my database as a notepad for all of my RE tools.

"This way I can treat it as an actual datastore, and ask it for addresses to feed into another tool."

The plugin aims to simplify the exchange of data between multiple instances of IDA as well as different tools within the user's arsenal. Tags are able to be imported, exported, or rendered in a variety of ways. This way a user can perform analyses however they deem fit, and still be able to query them later in order to use as input for something else.
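
Exchanging tags between two databases can be sketched with a plain serialization round-trip. JSON is used here only because it is simple to show; the dictionaries standing in for databases and the `export_tags` and `import_tags` helpers are hypothetical and do not reflect the plugin's own import/export interface.

```python
import json

def export_tags(database):
    """Serialize {address: {tag: value}} into a JSON string."""
    return json.dumps({'{:#x}'.format(ea): tags for ea, tags in database.items()},
                      sort_keys=True)

def import_tags(blob, database):
    """Merge tags from a JSON string into `database`, keeping existing ones."""
    for ea, tags in json.loads(blob).items():
        database.setdefault(int(ea, 16), {}).update(tags)
    return database

source = {0x401000: {'breakpoint(1)': 'g'}}
target = {0x401000: {'note': 'parser entry'}}

import_tags(export_tags(source), target)
print(sorted(target[0x401000]))
```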

Examples

  • I have a great breakpoint into some parser, but the breakpoint is pretty complex and is the second set of a bunch of breakpoints that I'm using.
    • First I'll need to tag it.
      1. I have a breakpoint that I commonly use, so I'll assign it into a variable.
        value = '.printf "Hit offset: %%x\\n", %#x; .printf "Function name: %s\\n"; g'% (database.offset(), function.name())
        
      2. Now that it's in a variable, I'll tag it to the current address I've selected.
        database.tag('breakpoint(1)', value)
        
    • Now if I want to query the entire database for it.
      1. First I'll need to select all functions that reference the "breakpoint(1)" tag in their contents.
        • I'll need to query all my functions for any instance of the "breakpoint(1)" tag in its contents.
          for fn, found in database.selectcontents('breakpoint(1)'):
          
      2. Now I'll need to select the addresses within the function that are tagged with "breakpoint(1)".
        • Now that I've got all the functions, I need to query their contents for "breakpoint(1)".
          for fn, found in database.selectcontents('breakpoint(1)'):
              for ea, tags in function.select(fn, 'breakpoint(1)'):
          
      3. Finally I can just print out the breakpoint to copy+paste into my debugger.
        • Now I'll go through each address in the functions' contents, and output the breakpoint.
          for fn, found in database.selectcontents('breakpoint(1)'):
              for ea, tags in function.select(fn, 'breakpoint(1)'):
                  print('bp %#x "%s"'% (ea, tags['breakpoint(1)']))
          

I will use keyboard shortcuts and hooks to automate as much of myself as possible.

There are occasionally many mindless tasks that a reverse-engineer may find themselves performing when examining a particular binary. As IDA does not support macros (as of yet), the user is left to having to script their particular tasks. IDA does allow a user to hook various parts of its analyses or actions that the user may perform, however, these hooks may affect other tools that use them.

In order to allow multiple things to be applied during a user's reversing session, this plugin exposes IDA's various hooking, notification, and binding APIs via namespaces that the user can interact with or assign within their rc-file. These interfaces then allow the user to manage them or stack their callables in any way that they deem fit.
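
The notion of stacking several callables on one hook can be sketched as a small registry. The `Hooks` class and the 'renamed' event name are hypothetical and exist only for this example; they are not the plugin's hook interface.

```python
import collections

class Hooks(object):
    """Keep a list of callables per event so several tools can share one hook."""
    def __init__(self):
        self._targets = collections.defaultdict(list)

    def add(self, event, target):
        self._targets[event].append(target)

    def discard(self, event, target):
        self._targets[event].remove(target)

    def notify(self, event, *args):
        # Iterate over a copy so a callable may remove itself while running.
        return [target(*args) for target in list(self._targets[event])]

hooks = Hooks()
log = []
first = lambda ea, name: log.append(('first', ea, name))
second = lambda ea, name: log.append(('second', ea, name))

hooks.add('renamed', first)
hooks.add('renamed', second)
hooks.notify('renamed', 0x401000, 'parse_header')
hooks.discard('renamed', first)
hooks.notify('renamed', 0x401200, 'parse_body')
print(log)
```

Because each callable is tracked individually, one tool can be removed without disturbing the others attached to the same event.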

Examples

  • It seems that the second operand of this instruction always references a string that points to the current function's actual name.
    • I want to bind this to "Ctrl+A".
      1. First I'll want to create a callable that renames the current function with the string pointed to by the second operand.
        • I need to define a function that renames the current "function" using the string pointed to by the second (1) operand.
          def F():
              function.name(database.get.string(instruction.op(1).offset))
          
      2. I need to map this function definition to a hotkey.
        ui.keyboard.map('Ctrl+A', F)
        
    • I don't remember what keys I've currently bound:
      • Let me list the hotkeys that are currently bound.
        ui.keyboard.list()
        
    • I'm done with that keybinding:
      • I'm done with that hotkey for "Ctrl+A" so I can just remove it.
        ui.keyboard.unmap('Ctrl+A')
        

The cost of a (mental) context-switch for development (or similar) is expensive and is a distraction from my end goals.

"Thus, I will avoid development as much as possible unless I have to."

When performing analysis (be it major or minor), the reverse-engineer will typically need to transition into the mindset of a developer. This plugin tries to reduce that requirement by reducing the IDAPython API into individual single-line commands that perform the actions requested by the user. Within the plugin's API, every function performs the user's requested action and can take its input in a variety of formats. This is intended to allow the user to write one-off scripts within the IDAPython REPL, and not have to invest too much time in debugging the side-effects of their script.

This combined with tagging support should allow the user to quickly prototype the collection of relevant attributes, incrementally improve upon them through either refinement or manual adjustment, and then eventually use them to identify the relevant pieces of code or data that concerns them.
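
The collect-then-query workflow can be sketched over a plain dictionary. The `select` function below only imitates the shape of a tag query with `And` and `Or` keywords; the `tagged` data and the exact matching rules are assumptions of this sketch rather than the plugin's documented behaviour.

```python
def select(tagged, And=(), Or=()):
    """Yield (address, tags) for entries with every `And` tag and, if given, any `Or` tag."""
    for ea in sorted(tagged):
        tags = tagged[ea]
        if not all(name in tags for name in And):
            continue
        if Or and not any(name in tags for name in Or):
            continue
        # Report only the requested tags, or everything when no filter was given.
        wanted = (set(And) | set(Or)) or set(tags)
        yield ea, {name: tags[name] for name in wanted if name in tags}

# Hypothetical results of an earlier tagging pass.
tagged = {
    0x401000: {'malloc': 'calls malloc at 401020', 'signed-multiply': 0x401030},
    0x402000: {'malloc': 'calls malloc at 402010'},
    0x403000: {'signed-multiply': 0x403008},
}

for ea, tags in select(tagged, And=('malloc', 'signed-multiply')):
    print(hex(ea), sorted(tags))
```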

Examples

  • I want to focus only on functions that make allocations, and include signed multiplications.
    • I'll first tag functions that use "malloc".
      1. I need to find all the xrefs to an import that has "alloc" in its name.
        for ea in database.xref.up(database.imports.search('*alloc*')):
        
      2. Now I can iterate through all of them, and tag the function with a comment for me to read.
        for ea in database.xref.up(database.imports.search('*alloc*')):
            function.tag(ea, 'malloc', 'calls malloc at %x'% ea)
        
    • I'll then tag signed multiplications that use "imul" as their mnemonic.
      1. I need to iterate through all the functions in the database.
        for ea in database.functions():
        
      2. Then I need to iterate through all their instructions.
        for ea in database.functions():
            for ea in function.iterate(ea):
        
      3. This way I can then check the instruction's mnemonic for "imul".
        for ea in database.functions():
            for ea in function.iterate(ea):
                if instruction.mnemonic(ea) == 'imul':
        
      4. Now that I've found one, I'll just tag the function with this knowledge.
        for ea in database.functions():
            for ea in function.iterate(ea):
                if instruction.mnemonic(ea) == 'imul':
                    function.tag(ea, 'signed-multiply', ea)
            continue
        
    • Now I can query for any functions that use both.
      1. I used a couple of tags, so I'll want to search for ones that use both "malloc" and "signed-multiply".
        for ea, tags in database.select(And=('malloc', 'signed-multiply')):
        
      2. Now I can iterate through them, and output their address and function name.
        for ea, tags in database.select(And=('malloc', 'signed-multiply')):
            print(hex(ea), function.name(ea))