Performance guidelines - FAForever/fa GitHub Wiki
Supreme Commander: Forged Alliance ships with a Lua 5.0 interpreter. A compiler for a language such as C can optimize away common inefficiencies; the Lua compiler performs almost no optimization, so the virtual machine executes your code essentially as you wrote it. This makes 'code hygiene' particularly important. This article introduces several considerations that help you write high-performing code.
For references and additional reading:
- (1) Lua 5.0 references
- (2) Lua 5.0 implementation (chapters 4, 5 and 7 are most relevant)
- (3) Lua 5.0 performance tips
References (2) and (3) in particular are worth reading and will help you understand this article.
Concepts
Scope of a value
For this section it is useful to read chapter 7 'The virtual machine' of The implementation of Lua 5.0
Understanding the scope of a value is critical in Lua. A value can live in the global, table, upvalue, or local scope, and each scope is read differently: GETGLOBAL, GETTABLE and GETUPVAL are dedicated opcodes, while a local lives in a register and needs at most a MOVE. Notably, GETGLOBAL is the most expensive of these, while reading a local is essentially free.
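As a minimal sketch of the four scopes, the snippet below reads one value from each. All names are hypothetical and not taken from the repository; the comments note which kind of access each read involves.

```lua
-- Hypothetical names throughout; only the access patterns matter.
Gravity = 10                       -- a global, read with GETGLOBAL

local Settings = { Scale = 2 }     -- a local at file level

local function Weight(mass)
    -- `Settings` is an upvalue of this function (GETUPVAL),
    -- and `.Scale` is a table read on top of that (GETTABLE).
    local scale = Settings.Scale

    -- `mass` and `scale` are locals and already sit in registers;
    -- `Gravity` has to be looked up in the globals table.
    return mass * scale * Gravity
end

print(Weight(3)) -- 60
```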
A relatively simple way to 'upgrade' the scope of a value to the upvalue scope is by writing local X = X, where X is a value in the global scope. An excellent example can be found in the chapter 'Basic facts' of reference (3).
This technique is to be applied consistently throughout the repository, as demonstrated in the FactoryUnit class. External references are localized at the top of the file, so they are captured as upvalues in the closures created for the class functions. As mentioned before, an upvalue is faster to access than a global or a table field, and therefore the code executes faster.
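A minimal sketch of this pattern, assuming a hypothetical file that is not part of the repository: `math.floor` and `ipairs` stand in for any external reference. The sketch uses `ipairs` so that it also runs in a stock Lua interpreter.

```lua
-- Localize external references once, at the top of the file.
-- The global and table lookups are paid a single time, when the file loads.
local ipairs = ipairs
local MathFloor = math.floor

-- Functions defined below this point capture both names as upvalues,
-- so each call reads them without touching the globals table.
local function RoundDown(values)
    local result = {}
    for index, value in ipairs(values) do
        result[index] = MathFloor(value)
    end
    return result
end

local rounded = RoundDown({ 1.5, 2.25, 3 })
print(rounded[1], rounded[2], rounded[3])
```

The trade-off is that a localized reference is a snapshot: if the global is replaced later, functions that captured the local keep using the old value.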
Table access
For this section it is useful to read chapter 4 'Tables' of The implementation of Lua 5.0
Retrieving a value from a table is relatively expensive: it requires a GETTABLE opcode, even when the value sits in the array part of the table. Where possible, cache the result in the local or upvalue scope. As a simple example, take the following code snippet:
---@param self FactoryUnit
---@param builder Unit
---@param layer Layer
OnStopBeingBuilt = function(self, builder, layer)
    StructureUnitOnStopBeingBuilt(self, builder, layer)

    local brain = self.Brain
    local blueprint = self.Blueprint

    if blueprint.CategoriesHash["RESEARCH"] then
        -- update internal state
        brain:AddHQ(blueprint.FactionCategory, blueprint.LayerCategory, blueprint.TechCategory)
        brain:SetHQSupportFactoryRestrictions(blueprint.FactionCategory, blueprint.LayerCategory)

        -- update all units affected by this
        local affected = brain:GetListOfUnits(categories.SUPPORTFACTORY - categories.EXPERIMENTAL, false)
        for _, unit in affected do
            unit:UpdateBuildRestrictions()
        end
    end
end,
The references to the brain and the blueprint are each stored in a local variable. The generated bytecode looks like the following:
{
"( 154) 0 - GETUPVAL 3 0 0", -- StructureUnitOnStopBeingBuilt(self, builder, layer)
"( 154) 1 - MOVE 4 0 0", -- StructureUnitOnStopBeingBuilt(self, builder, layer)
"( 154) 2 - MOVE 5 1 0", -- StructureUnitOnStopBeingBuilt(self, builder, layer)
"( 154) 3 - MOVE 6 2 0", -- StructureUnitOnStopBeingBuilt(self, builder, layer)
"( 154) 4 - CALL 3 4 1", -- StructureUnitOnStopBeingBuilt(self, builder, layer)
"( 156) 5 - GETTABLE 3 0 250", -- local brain = self.Brain
"( 157) 6 - GETTABLE 4 0 251", -- local blueprint = self.Blueprint
"( 159) 7 - GETTABLE 5 4 252", -- if blueprint.CategoriesHash["RESEARCH"] then
"( 159) 8 - GETTABLE 5 5 253", -- if blueprint.CategoriesHash["RESEARCH"] then
"( 159) 9 - TEST 5 5 0", -- if blueprint.CategoriesHash["RESEARCH"] then
"( 159) 10 - JMP 0 24", -- if blueprint.CategoriesHash["RESEARCH"] then
"( 161) 11 - SELF 5 3 254", -- brain:AddHQ(blueprint.FactionCategory, blueprint.LayerCategory, blueprint.TechCategory)
"( 161) 12 - GETTABLE 7 4 255", -- brain:AddHQ(blueprint.FactionCategory, blueprint.LayerCategory, blueprint.TechCategory)
"( 161) 13 - GETTABLE 8 4 256", -- brain:AddHQ(blueprint.FactionCategory, blueprint.LayerCategory, blueprint.TechCategory)
"( 161) 14 - GETTABLE 9 4 257", -- brain:AddHQ(blueprint.FactionCategory, blueprint.LayerCategory, blueprint.TechCategory)
"( 161) 15 - CALL 5 5 1", -- brain:AddHQ(blueprint.FactionCategory, blueprint.LayerCategory, blueprint.TechCategory)
"( 162) 16 - SELF 5 3 258", -- brain:SetHQSupportFactoryRestrictions(blueprint.FactionCategory, blueprint.LayerCategory)
"( 162) 17 - GETTABLE 7 4 255", -- brain:SetHQSupportFactoryRestrictions(blueprint.FactionCategory, blueprint.LayerCategory)
"( 162) 18 - GETTABLE 8 4 256", -- brain:SetHQSupportFactoryRestrictions(blueprint.FactionCategory, blueprint.LayerCategory)
"( 162) 19 - CALL 5 4 1", -- brain:SetHQSupportFactoryRestrictions(blueprint.FactionCategory, blueprint.LayerCategory)
"( 165) 20 - SELF 5 3 259", -- local affected = brain:GetListOfUnits(categories.SUPPORTFACTORY - categories.EXPERIMENTAL, false)
"( 165) 21 - GETGLOBAL 7 10", -- local affected = brain:GetListOfUnits(categories.SUPPORTFACTORY - categories.EXPERIMENTAL, false)
"( 165) 22 - GETTABLE 7 7 261", -- local affected = brain:GetListOfUnits(categories.SUPPORTFACTORY - categories.EXPERIMENTAL, false)
"( 165) 23 - GETGLOBAL 8 10", -- local affected = brain:GetListOfUnits(categories.SUPPORTFACTORY - categories.EXPERIMENTAL, false)
"( 165) 24 - GETTABLE 8 8 262", -- local affected = brain:GetListOfUnits(categories.SUPPORTFACTORY - categories.EXPERIMENTAL, false)
"( 165) 25 - SUB 7 7 8", -- local affected = brain:GetListOfUnits(categories.SUPPORTFACTORY - categories.EXPERIMENTAL, false)
"( 165) 26 - LOADBOOL 8 0 0", -- local affected = brain:GetListOfUnits(categories.SUPPORTFACTORY - categories.EXPERIMENTAL, false)
"( 165) 27 - CALL 5 4 2", -- local affected = brain:GetListOfUnits(categories.SUPPORTFACTORY - categories.EXPERIMENTAL, false)
"( 166) 28 - MOVE 6 5 0", -- for _, unit in affected do
"( 166) 29 - LOADNIL 7 9 0", -- for _, unit in affected do
"( 166) 30 - TFORPREP 6 2", -- for _, unit in affected do
"( 167) 31 - SELF 10 9 263", -- unit:UpdateBuildRestrictions()
"( 167) 32 - CALL 10 2 1", -- unit:UpdateBuildRestrictions()
"( 166) 33 - TFORLOOP 6 0 1", -- end
"( 167) 34 - JMP 0 -4", -- for _, unit in affected do
"( 170) 35 - RETURN 0 1 0", -- end
maxstack=13,
numparams=3
}
This function can still use some attention. In the first sweep we tackled the low-hanging fruit: caching the access to the blueprint and the brain. Doing so avoids a lot of additional table operations. As a small example, take the following line of code:
brain:AddHQ(blueprint.FactionCategory, blueprint.LayerCategory, blueprint.TechCategory)
If we revert the optimization, it would look like this:
self.Brain:AddHQ(self.Blueprint.FactionCategory, self.Blueprint.LayerCategory, self.Blueprint.TechCategory)
And the generated bytecode would look like the following. We marked the table operations to make them easier to see.
{ table: 1FB29E88
"( 156) 5 - GETTABLE 3 0 250", -- self.Brain:AddHQ(self.Blueprint.FactionCategory, self.Blueprint.LayerCategory, self.Blueprint.TechCategory)
^^^^^^^^^^
"( 156) 6 - SELF 3 3 251", -- self.Brain:AddHQ(self.Blueprint.FactionCategory, self.Blueprint.LayerCategory, self.Blueprint.TechCategory)
^^^^^^^^^^^
"( 156) 7 - GETTABLE 5 0 252", -- self.Brain:AddHQ(self.Blueprint.FactionCategory, self.Blueprint.LayerCategory, self.Blueprint.TechCategory)
^^^^^^^^^^^^^^
"( 156) 8 - GETTABLE 5 5 253", -- self.Brain:AddHQ(self.Blueprint.FactionCategory, self.Blueprint.LayerCategory, self.Blueprint.TechCategory)
^^^^^^^^^^^^^^^^^^^^^^^^^^
"( 156) 9 - GETTABLE 6 0 252", -- self.Brain:AddHQ(self.Blueprint.FactionCategory, self.Blueprint.LayerCategory, self.Blueprint.TechCategory)
^^^^^^^^^^^^^^
"( 156) 10 - GETTABLE 6 6 254", -- self.Brain:AddHQ(self.Blueprint.FactionCategory, self.Blueprint.LayerCategory, self.Blueprint.TechCategory)
^^^^^^^^^^^^^^^^^^^^^^^
"( 156) 11 - GETTABLE 7 0 252", -- self.Brain:AddHQ(self.Blueprint.FactionCategory, self.Blueprint.LayerCategory, self.Blueprint.TechCategory)
^^^^^^^^^^^^^^
"( 156) 12 - GETTABLE 7 7 255", -- self.Brain:AddHQ(self.Blueprint.FactionCategory, self.Blueprint.LayerCategory, self.Blueprint.TechCategory)
^^^^^^^^^^^^^^^^^^^^^^^
"( 156) 13 - CALL 3 5 1",
}
And in contrast:
INFO: { table: 1F9D35F0
INFO: "( 159) 7 - SELF 5 3 252", -- brain:AddHQ(blueprint.FactionCategory, blueprint.LayerCategory, blueprint.TechCategory)
^^^^^^^^^^^
INFO: "( 159) 8 - GETTABLE 7 4 253", -- brain:AddHQ(blueprint.FactionCategory, blueprint.LayerCategory, blueprint.TechCategory)
^^^^^^^^^^^^^^^^^^^^^^^^^
INFO: "( 159) 9 - GETTABLE 8 4 254", -- brain:AddHQ(blueprint.FactionCategory, blueprint.LayerCategory, blueprint.TechCategory)
^^^^^^^^^^^^^^^^^^^^^^^
INFO: "( 159) 10 - GETTABLE 9 4 255", -- brain:AddHQ(blueprint.FactionCategory, blueprint.LayerCategory, blueprint.TechCategory)
^^^^^^^^^^^^^^^^^^^^^^
INFO: "( 159) 11 - CALL 5 5 1",
INFO: }
This means that by caching the results we reduced the number of GETTABLE opcodes from seven to three, less than half of the original count, for just this one line! It matters in practice, as shown by the optimizations in #5524. The most relevant changes are on lines 804 - 821: the author caches the result of the table access in a local value instead of traversing several tables just to update another property of the same value.
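The difference can also be observed at runtime. The sketch below is an illustration only, not engine code: it wraps stand-in tables in a hypothetical `Counted` helper whose `__index` metamethod counts every field read, then runs the uncached and the cached form of the `AddHQ` call. The field values and the empty `AddHQ` stub are assumptions made for the example.

```lua
-- Counts every field read made through a wrapped table.
local reads = 0

-- Hypothetical helper: returns an empty proxy whose reads are
-- forwarded to `fields` and counted along the way.
local function Counted(fields)
    return setmetatable({}, {
        __index = function(_, key)
            reads = reads + 1
            return fields[key]
        end,
    })
end

-- Stand-ins for the unit, its brain and its blueprint.
local blueprint = Counted({ FactionCategory = "UEF", LayerCategory = "LAND", TechCategory = "TECH2" })
local brain = Counted({ AddHQ = function(self, faction, layer, tech) end })
local unit = Counted({ Brain = brain, Blueprint = blueprint })

-- Uncached: every argument walks through `unit` again.
reads = 0
unit.Brain:AddHQ(unit.Blueprint.FactionCategory, unit.Blueprint.LayerCategory, unit.Blueprint.TechCategory)
local uncachedReads = reads

-- Cached: two reads up front, shared by every later use.
reads = 0
local cachedBrain = unit.Brain
local cachedBlueprint = unit.Blueprint
local setupReads = reads

reads = 0
cachedBrain:AddHQ(cachedBlueprint.FactionCategory, cachedBlueprint.LayerCategory, cachedBlueprint.TechCategory)
local cachedReads = reads

print(uncachedReads, setupReads, cachedReads) -- 8    2    4
```

The eight reads of the uncached call match the seven GETTABLE opcodes plus the one SELF in the first listing, and the four reads of the cached call match the three GETTABLE opcodes plus the one SELF in the second. The two setup reads are paid once per function, so the saving grows with every additional use of the cached values.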