Asynchronous Simulation

Asynchronous simulation (ASIM) allows the game physics simulation to run in parallel with the 3D graphics rendering on a CPU with two or more cores. This improves performance and lets the engine maintain a high rendering frame rate (FPS) even in big battles.

Spring disables ASIM by default unless a game has been specifically designed to support it. The reason is that ASIM gives rise to Lua-related side effects that can be hard to understand, especially for a game designer who has little experience with threaded programming.

Steps required to support ASIM

  • 1. Choose a Lua threading model greater than 2 from the list below
  • 2. Rewrite the Lua code to conform with the rules imposed by that threading model
  • 3. Learn about the side effects of the chosen model and adjust your code to work around them

Lua Threading Models

These modes can be set for a game using luaThreadingModel in Modrules.lua. The engine setting MultiThreadLua can override the game's choice. A higher mode generally means that the game will perform better, with fewer side effects, but also requires more work. A game that specifies luaThreadingModel > 2 will be considered ASIM compatible, and will have ASIM enabled by default.
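
As a configuration sketch, a game could opt in to mode 3 roughly as follows. The placement of luaThreadingModel inside a `system` subtable is an assumption; this page only names the key and the file, so verify the layout against your engine version.

```lua
-- Modrules.lua (sketch). The `system` subtable is an ASSUMPTION: this page
-- only states that luaThreadingModel is set in Modrules.lua.
local modrules = {
  system = {
    -- Any value > 2 marks the game as ASIM compatible.
    luaThreadingModel = 3,
  },
}

return modrules
```

Users can still override this through the engine setting MultiThreadLua, so game code should not assume that the requested mode is the one actually running.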

  • 0. No MT, single threaded

    For reference only; this mode should never be used in practice.
    
  • 1. Single state

    This mode has a single Lua environment for everything. It will
    cause a significant slowdown because simulation and rendering
    threads block each other trying to access the shared Lua
    environment. Especially problematic is Lua rendering; being
    inherently time consuming (looping through large amounts of
    units etc.) it will force the simulation thread to wait. This
    gives rise to the classic phenomenon of having incredibly high
    FPS but still lagging behind (high ping) since simulation cannot
    keep up with the normal pace.
    
  • 2. Single state, batching of unsynced events (default mode)

    This mode is considerably better than mode(1) since it has a
    separate Lua environment for LuaUI.
    All simulation events sent to LuaUI are also batched/delayed to
    reduce the need for the threads to block each other. For
    instance, if simulation triggers UnitMoved(), the event will not
    reach LuaUI right away, but is put in a batch and handled as
    soon as possible by the rendering thread. The batching system is
    managed, i.e. if the unit that triggered UnitMoved is about to
    be deleted (and thus invalidated) the batch is forcefully run
    before deletion occurs. Because the forced batch run is
    performed by the simulation thread, locking is required and this
    can degrade simulation performance for the same reasons as in
    mode(1).
    The other means of unsynced communication, SendToUnsynced and
    Script.LuaUI.XXX() are also batched and managed to make sure any
    object ID or such sent are still valid once the message arrives.
    Note: because of the batching, Script.LuaUI.XXX() does NOT have
    a return value.
    This mode unfortunately suffers the same problems as mode(1) for
    rendering gadgets, since simulation events must be sent to the
    gadgets directly (without batching) in order not to desync the
    game.
    With this mode, it is illegal to invoke LuaUI from another Lua
    environment (LuaGaia or LuaRules). This can happen depending on
    what call-ins the game and any user-enabled widgets implement.
    If such an attempt to invoke LuaUI is detected, it will be
    skipped to prevent a deadlock, and a warning will also be
    printed in the console. Unfortunately there is no other solution
    than to stop using the call-in that throws the error, or to use
    a higher threading model.
    
  • 3. Dual states for synced, batching of unsynced events, synced/unsynced communication via EXPORT table and SendToUnsynced

    Same as mode(2), but the gadgets have separate Lua environments
    depending on whether the simulation or rendering thread is
    invoking it. This alleviates the negative performance impact for
    rendering gadgets, but makes it impossible to directly share
    synced gadget data to the unsynced part via _G --> SYNCED. To
    work around this limitation, this mode introduces _G.EXPORT
    that is automatically copied to SYNCED.EXPORT. Only small
    amounts of data should be stored in the EXPORT table, because
    the copying is performed very frequently. Cyclical and/or very
    deeply nested tables are not allowed. Larger data can be stored
    temporarily, e.g.
    _G.EXPORT.myhugetable = huge_table
    SendToUnsynced("huge_table_available")
    _G.EXPORT.myhugetable = nil
    
  • 4. Dual states for synced, batching of unsynced events, synced/unsynced communication via SendToUnsynced only

    Same as mode(3) but with the EXPORT table disabled to increase
    performance further. EXPORT is not needed; SendToUnsynced alone
    is sufficient.
    
  • 5. Dual states for all, all synced/unsynced communication (widgets included) via SendToUnsynced only

    Same as mode(4), but adds separate Lua environments for LuaUI
    depending on whether the simulation or rendering thread is
    invoking it. With dual LuaUI states, batching is no longer
    needed and therefore disabled. This makes it even faster, since
    batching overhead is eliminated and forced batch runs (see
    mode(2)) are not needed. However, this mode adds significant
    complexity for the programmer, since essentially two instances
    of the same widget are run, and only the calling thread
    determines which one is invoked. As a rule of thumb, all
    simulation events (GameFrame etc.) are invoked by the simulation
    thread. All UI and rendering events and Update() are invoked by
    the rendering thread. So, if a variable is modified in
    GameFrame(), that change cannot be seen in Update(). The widget
    must instead use SendToUnsynced to communicate any data from the
    simulation widget instance to the rendering widget instance.
    Design-wise, it may be suitable to write the widget similarly to
    a gadget, with clearly separated "synced" and "unsynced" parts,
    each with their own variables, functions and call-ins.
    
  • 6. Dual states for all, all synced/unsynced communication (widgets included) is unmanaged and via SendToUnsynced only

    Same as mode(5) but the SendToUnsynced and Script.LuaUI.XXX()
    communication is unmanaged. The Lua coder must manually check if
    object references sent are still valid when the message arrives.
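
For mode(3), the temporary EXPORT handoff described above can be sketched end to end. This is a standalone model, not engine code: `synced_G`, `SYNCED`, and `SendToUnsynced` are stubs standing in for what Spring provides. The stub copies EXPORT at send time, which is an assumption about timing inferred from the snippet in the mode(3) description.

```lua
-- Stand-ins for the engine so the sketch runs in a plain Lua interpreter.
local synced_G = { EXPORT = {} }  -- plays the role of _G in the synced state
local SYNCED = { EXPORT = {} }    -- plays the role of SYNCED in the unsynced state
local unsyncedGadget = {}
local received                    -- what the unsynced part kept

-- Stub: shallow-copies EXPORT, then delivers the message immediately.
local function SendToUnsynced(msg)
  local copy = {}
  for k, v in pairs(synced_G.EXPORT) do copy[k] = v end
  SYNCED.EXPORT = copy
  unsyncedGadget:RecvFromSynced(msg)
end

-- Unsynced part: grab the table when notified, because the synced part
-- clears the EXPORT entry right after sending.
function unsyncedGadget:RecvFromSynced(msg)
  if msg == "huge_table_available" then
    received = SYNCED.EXPORT.myhugetable
  end
end

-- Synced part: publish, notify, then clear so that the frequent automatic
-- copying of EXPORT stays cheap.
local function PublishHugeTable(huge_table)
  synced_G.EXPORT.myhugetable = huge_table
  SendToUnsynced("huge_table_available")
  synced_G.EXPORT.myhugetable = nil
end

PublishHugeTable({ 10, 20, 30 })
```

The point of the pattern is that large data lives in EXPORT only for the duration of one notification.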
    
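
For mode(5), the "two instances of the same widget" behaviour can be modelled in plain Lua by loading the same widget body twice. This is a toy sketch: `LoadWidget` is a hypothetical helper, `SendToUnsynced` is stubbed, and the receiving call-in name `RecvFromSynced` is an assumption, since this page does not name it for widgets.

```lua
-- Toy model of mode(5): one widget source, two independent instances.
local renderInstance  -- forward declaration so the stub can deliver messages

-- Stub: hands the message straight to the rendering instance.
local function SendToUnsynced(...)
  renderInstance:RecvFromSynced(...)
end

-- Everything inside this function stands for the contents of one widget file.
local function LoadWidget()
  local widget = {}
  local lastFrame = 0   -- written by the simulation side
  local shownFrame = 0  -- written by the rendering side

  function widget:GameFrame(n)            -- invoked by the simulation thread
    lastFrame = n
    SendToUnsynced("frame", n)
  end

  function widget:RecvFromSynced(tag, n)  -- invoked by the rendering thread
    if tag == "frame" then shownFrame = n end
  end

  -- Invoked by the rendering thread. It returns values only so the demo
  -- can inspect this instance's private state.
  function widget:Update()
    return lastFrame, shownFrame
  end

  return widget
end

local simInstance = LoadWidget()
renderInstance = LoadWidget()

simInstance:GameFrame(42)
local direct, viaMessage = renderInstance:Update()
-- direct is 0: the rendering instance never saw the GameFrame write.
-- viaMessage is 42: the value arrived through SendToUnsynced.
```

Keeping the two sides' variables visibly separate, as here, is the same discipline the mode(5) text recommends when it suggests writing widgets like gadgets.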

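For mode(6), where messages are unmanaged, the receiver has to validate object IDs itself. The sketch below assumes `Spring.ValidUnitID` is the check used; it is stubbed here so the code runs standalone. `aliveUnits` and `OnHighlightMessage` are hypothetical names for illustration.

```lua
-- Stub of the engine API for standalone running; inside Spring the global
-- `Spring` table already exists. aliveUnits is a made-up fixture.
local aliveUnits = { [101] = true }
local Spring = Spring or {
  ValidUnitID = function(unitID) return aliveUnits[unitID] == true end,
}

local highlighted = {}

-- Receiver for a hypothetical "highlight this unit" message.
local function OnHighlightMessage(unitID)
  -- mode(6): nothing guarantees the unit still exists on arrival.
  if not Spring.ValidUnitID(unitID) then
    return false
  end
  highlighted[unitID] = true
  return true
end

local okAlive = OnHighlightMessage(101)  -- unit exists: accepted
local okDead = OnHighlightMessage(202)   -- unit gone: dropped
```

A valid ID may by then belong to a different unit, so a validity check alone is only a first guard.
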
Common side effects

  • 1. Out-of-order execution, applies to all threading models

    All rendering call-ins (Draw*, Update, Mouse/Keyboard) must be
    designed so that they work correctly even if objects are deleted
    (and possibly replaced by another object with same ID) in
    between the rendering call-ins. This means rendering code that
    stores lists of e.g. units in global variables must update these
    according to UnitCreated/UnitDestroyed events or re-check the
    units for validity before use. Keep in mind that if an object
    has been replaced, the operation that you intended to perform
    may no longer be valid for the new object.
    
  • 2. Performance hit for rendering call-ins in gadgets, applies to threading models < 3

    Avoid writing any time-consuming rendering code in a gadget. The
    easy way out is to let a widget do the rendering instead. The
    gadget can send a message to the widget by calling
    Script.LuaUI.XXX(), but the value returned from this call is
    undefined if ASIM is enabled. Similarly, the result obtained
    when checking for the existence of a Script.LuaUI.XXX function
    can be unpredictable. Instead, the widget may send a LuaRules
    message to indicate that the function now exists or has ceased
    to exist. If the LUA-SYNC-CPU indicator appears in the upper
    right part of the window, it usually indicates a performance
    problem.
    
  • 3. Batching glitches, applies to threading models 2 - 4

    For example, a widget that processes a UnitEnteredLOS event
    cannot safely make any assumptions about the unit actually being
    in LOS. Because of batching delay, it may have left LOS already
    when the message is processed, and an attempt to get further
    info about the unit may therefore fail, since access to
    out-of-LOS units is restricted. Numerous events are more or less
    affected by this type of issue.
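
The defensive habits from side effects 1 and 3 can be combined in one small widget sketch. It assumes `Spring.GetUnitDefID` returns nil for units that cannot be accessed, and the call-in is written here as `UnitEnteredLos`; adjust the spelling if your engine version differs. The `Spring` table is stubbed, and `visible`, `seenDefs`, and `StillSameUnit` are hypothetical names.

```lua
-- Stub engine API for standalone running. `visible` is a made-up fixture
-- mapping unitID -> unitDefID for units currently in LOS.
local visible = { [7] = 33 }
local Spring = Spring or {
  GetUnitDefID = function(unitID) return visible[unitID] end,
}

local widget = {}
local seenDefs = {}  -- unitID -> unitDefID, filled only when the lookup works

function widget:UnitEnteredLos(unitID)
  -- Batched delivery: the unit may already have left LOS or died, in which
  -- case the restricted query returns nil.
  local defID = Spring.GetUnitDefID(unitID)
  if defID == nil then return end
  seenDefs[unitID] = defID
end

function widget:UnitDestroyed(unitID)
  seenDefs[unitID] = nil  -- keep the cache in step with deletions
end

-- For use in rendering call-ins: re-check before use, since the ID may have
-- been freed or reused after it was cached. Comparing def IDs is a coarse
-- check; it cannot detect reuse by a unit of the same type.
local function StillSameUnit(unitID)
  local cached = seenDefs[unitID]
  return cached ~= nil and Spring.GetUnitDefID(unitID) == cached
end

widget:UnitEnteredLos(7)  -- still visible: cached
widget:UnitEnteredLos(8)  -- already gone: ignored
visible[7] = 99           -- simulate ID 7 being reused by another unit type
```

After the simulated reuse, `StillSameUnit(7)` reports false, so a Draw call-in would skip the stale entry instead of acting on the wrong unit.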