Coding guidelines (dirty draft) - We-the-People-civ4col-mod/Mod GitHub Wiki

coding

  • VERBOSELY COMMENT ON YOUR CODE! This is most likely the most important coding guideline in a project with various teammembers, who join and leave at random and don't know each other. Always think of a person, reading your section of the code the first time. Can it immediately understand what the code does and why it is done that way from the code and the comments alone?

  • Try to write code, which doesn't need comments if possible. Function names can be self explanatory while more hidden implementation issues needs to be commented. This is particularly true for interactions with the EXE, which has a whole lot of hidden undocumented requirements, which we discover one by one when we fail to meet such requirements and introduce a bug that way.

  • Use verbose variable names

  • Prefix variable names with character that indicates the type. Eg bName, pName and aiName for boolean, pointer and array of integers.

  • additional m_ Prefix for members

  • Inline simple getters. Simple means that it just and only returns a member variable.

  • big O notation in comment on function definition

  • initialize members in constructor call

  • use pointers if value can be NULL, otherwise use references -- never assume a pointer isn't NULL without explicitly testing it

  • use asserts -- the absolute minimum usage is to check function parameters

variable naming

Vanilla uses Hungarian notation to indicate the type of each variable. It is used as follows:

prefix meaning comment
m_ member variable of a class
a array (usually C style) used after m_, but before any other prefix
b bool
e enum
em_ enum map mod added type
i integer
ui unsigned integer
ia_ info array mod added type
info_ info array mod added type
k reference no idea why vanilla picked k for this
p pointer
sz string usually CvString or CvWString

List is not complete as various people haven't used this consistently due to not remembering all of them and/or not caring enough.

After the prefix, the variable is named as a sentence where each word starts with uppercase and no spaces, underscores or similar, example: m_aiExampleArray.

Proposed name change

Hungarian notation helps when the types aren't type strict. For instance char* can be a string, char array and a pointer to a single byte variable. We can't tell which one based on the variable, so more strict naming is needed to indicate the purpose.

In the modern era, such arbitrary variables aren't recommended. Instead we should tell variable types apart based on which enum or class instance they represent. C style arrays are generally prone to bugs, so they should be replaced anyway, if nothing else, then with a class containing the pointer, so we have a deconstructor to free memory, access functions to test array bounce etc. EnumMap works based on this principle.

m_ seems like a good idea for private member variables in classes. Public variables in structs can obmit them as such structs shouldn't really be anything other than a collection of public variables and the names are used elsewhere for access.

p should likely be kept to remind that it can be NULL. Use of naked pointers should be reduced if not completely eliminated, through it might be hard to get rid of some of them.

a should be kept, but made to apply to only C style arrays. It will then slowly die as we replace those arrays with more safe types of arrays.

i should likely be kept, but also in many cases replaced. We should have classes for movement points and other stuff like that to make the code more readable and easier for the compiler to detect mistakes. It also makes it easier to use clean code by moving certain code aspects into variable classes, like a movement points class can have everything with MOVE_DENOMINATOR internally, resulting in better use of code encapsulation.

The rest of the prefixes can likely be omitted as they indicate enum of class use, in which case the type tells everything we need to know.

Names should likely use all lowercase and underscores as "spaces" in names makes it easier to read them, like eDefendingUnit vs defending_unit. We might make the same considerations for functions, like CvPlot::isValidDomainForLocation vs CvPlot::is_valid_domain_for_location.

Issues with not using Hungarian notation

Say we have a function, which contains

name meaning
pUnit pointer to CvUnit instance
iUnit unit ID (through this is sometimes UnitTypes as int)
eUnit UnitClass of the unit
kUnit CvUnitInfo reference
eProfession enum value of the current profession
kProfession reference to CvProfessionInfo

Getting rid of prefixes means we will have multiple variables just called unit. We need to consider a naming convention, which can handle a case like this.

File and class prefixes

prefix meaning
Cv DLL internals or DLL interacting with the EXE
Cy DLL functions exposed to python and related to code to this (like code to expose enums)
no prefix basically the same as Cv, but modders have not been consistent in adding that to new files or functions
EXE EXE_interface.cpp is the only file with this prefix. It has wrapper functions to avoid the exe calling Cv functions directly, hence freeing up the Cv functions to change arguments if needed

Cy is used consistently. Cv is not and we may consider how we will handle the DLL internals in the future, particularly in terms of classes. Many classes will NOT benefit from a Cv prefix, like CvEnumMap just looks stupid.

committing

  • do not merge commits into the main branches, which by themselves leave the code broken
  • if you work on a large issue, create your own branch
  • when you are finished with your implementation squash commits together which belong together
  • important! reference resolved issues in the final commit message. make an amendment if you forgot to do that
  • keep the commit history on the main branches clean and slim (use rebase merges)
  • close issues which are resolved and reference the commit hash if necessary