Skip to content

Coding Standards (java, cpp)

Gary edited this page Mar 4, 2023 · 3 revisions

Managing Dependencies

  • Use forward declarations wherever possible to avoid depending on #include files
  • Make the header file directly associated with its implementation the first one in the implementation file so that the header is compiled in at least one context without other implicit includes (verifies it can be compiled independently)
  • Use pimpl_ with scoped_ptr or shared_ptr as an additional tool to limit header dependencies
  • Move private methods into cpp files
    • Free function in unnamed namespace
    • Free function taking reference to object for additional dependencies
    • Free function taking privates as params
    • Free function as member of Impl struct

Construction/Destruction/Copying/Assignment

Construction

  • Declare all single-argument constructors explicit unless you need implicit conversion
  • Initialize members using member-initialization lists (listing members in the same order they are defined in the class)
  • Use an Init method if construction can fail (consequence of no exceptions)

Destruction

  • All classes should have a public virtual destructor save for exceptional performance cases
  • If no public virtual destructor then should have protected non-virtual destructor
  • Destructor should never throw (consider wrapping with try { ... } CATCH_UNEXPECTED_EXCEPTION)

Copying

  • Derive from boost::noncopyable (or equivalent) for all classes which should not be copied
  • For all other objects use either natively copyable primitive types or pimpl idiom
  • If you allow compiler generated copying put a comment indicating you thought about this
  • If this isn't suitable be sure to implement copying correctly! (e.g. assignment to self in =)

Memory/Pointers/References

Smart Pointers

  • Every dynamically allocated object should be owned by a smart pointer
  • Use the smart pointer type corresponding to the pointer's lifetime
    • unique_ptr when only one owner exists,
    • scoped_ptr for simple stack-like behavior,
    • shared_ptr for containers and reference-based lifetime, etc.

Arguments to functions

  • Immutable object arguments to methods should be passed as const reference
  • Mutable arguments should be passed as dumb pointer (implying external ownership)
  • Non-const reference should never be passed
  • Output arguments should be the last ones in the function, and passed as a raw pointer

Subclassing

General

  • Make spare use of subclassing for re-use. Prefer composition.
  • Be wary of protected methods. Is there a way to make what is needed part of the public interface?
  • Make virtual functions non-public (preferably private if subclasses don't need to call them directly) and public functions non-virtual
  • When redefining a virtual function always mark it virtual as documentation that it is an override
  • Don't call virtual functions in constructors or destructors

Interfaces

  • Prefer deriving from interfaces to deriving from base classes
  • Add a non-pure virtual destructor to all Interface classes

Static Data

  • NEVER use static in a header file. Use extern instead.
  • Do not use static data of class types in threaded code (c++ order of initialization issues)
  • Use const char* for global strings
  • If you need a class type to be static new it up in main (and never free it) rather than allowing it to constructed as part of static initialization
  • Prefix static data with s_
  • Global functions that do not depend upon external variables should be implemented as free functions in a namespace rather than as part of a "static" class or a singleton.
  • Global functions that depend upon static data should be defined as static members of a class which owns the static data and arranges for its synchronized access.

Constness

  • Make as many member functions and variables const as possible
  • Prefix constants (including enums) with k
  • Keep enums and constants within implementation files wherever possible
  • Place constants in the anonymous namespace rather than using the static prefix
  • Use const char* for strings (see above)

Design

  • Prefer composition to subclassing for re-use.
  • Consider using more free functions to increase encapsulation of classes (less code depends upon internals)
  • Use boost::function and boost::bind to implement polymorphism by function not type
  • Avoid initialization dependencies across compilation units

Source Code

Naming (consistent with Java naming)

 namespace my::lower_case::name
 class MyClass
 myMemberFunction
 myLocalVariable
 myMemberVariable_
 pImpl_
 pMyPointerVariable
 kMyConstant
 MyEnum / kMyEnumValue
 MY_MACRO
 myvar() / setMyvar
  • Use is/has/can/should for booleans

Indenting

  • Use spaces, not tabs, for indentation
  • 3 space indent
  • Double (6 space) indent when continuing long lines
  • Beginning brace goes on newline for types, functions, lambdas, and blocks
    • Exception: simple lambdas are allowed to be on one line

Line width

Maximum line width of 100 characters.

Comment headers

All C++ and Java source files should start with the following comment header (adapt comment style for other file types).

/*
 * NameOfFile.cpp
 *
 * Copyright (C) 2023 by Posit Software, PBC
 *
 * Unless you have received this program directly from Posit Software pursuant
 * to the terms of a commercial license agreement with Posit Software, then
 * this program is licensed to you under the terms of version 3 of the
 * GNU Affero General Public License. This program is distributed WITHOUT
 * ANY EXPRESS OR IMPLIED WARRANTY, INCLUDING THOSE OF NON-INFRINGEMENT,
 * MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. Please refer to the
 * AGPL (http://www.gnu.org/licenses/agpl-3.0.txt) for more details.
 *
 */

Set the copyright date to the current year when creating a new file.

Java

Unless otherwise specified based on all of the above C++ standards

  • Use ALL_CAPS for constants (rather than k prefix with mixed case)
     public static final RWebContentFileType RMARKDOWN =
           new RWebContentFileType("r_markdown", "R Markdown", EditorLanguage.LANG_RMARKDOWN,
                                ".Rmd", new ImageResource2x(ICONS.iconRmarkdown2x()), true, true, true);
  • Use get prefix for instance accessors for consistency with Java libraries and GWT
     private String path;
     public String getPath()
     {
        return path;
     }
  • private members should appear at the bottom of the class declaration
  • Anonymous inner classes should start on their own line
     openCommands.addCommand(new SerializedCommand()
     {
        @Override
        public void onExecute(final Command continuation)
        {
           paneManager_.openFileInNewColumn(item, continuation);
        }
     });
  • Avoid using wildcard (org.path.to.package.*) imports
  • Where possible, compare objects for equality using their equals method rather than the == operator. The latter delegates to JavaScript’s == operator, which for non-intrinsic types checks whether two objects are the same instance rather than the same value.
     if (ext.equals(".htm") || ext.equals(".html") || ext.equals(".nb.html"))
     {
        ...
     }
  •  // Non-block style comments are preferred. 
    
     // In multi-line comments, each line 
     // should be commented without using a block.
  • If commenting a function, then prefer javadoc-style blocks.

Iterating

When iterating over a collection that implements the Iterable interface it is preferred to use Java 8's foreach syntax.

   for (Object obj : Collection c) 
   {
      ...
   }

Regular Expressions

Avoid regular expressions that may perform excessive backtracking -- these can lead to poor runtime performance and, in extreme cases, runtime exceptions that can cause an application crash.

This is especially important when constructing regular expressions intended to run over arbitrary user text, as it's possible that the regular expression engine could inadvertently become stuck backtracking over regions of text where it was not intended to. Here are a few main anti-patterns:

Greedy Intersections

Avoid sequential greedy atoms, where there is some intersection between the characters matched by each atom (example)

In this case, the portion of the regular expression (\w+_\w+) behaves poorly, as _ is actually part of the \w character class, and so matching something like

abc123_def456@

can lead to excessive backtracking within abc123_def456 after failing to match at @.

Greedy Ending

Avoid using greedy regular expressions when attempting to match text at the end of a string (example).

Although it's unintuitive, a regular expression like ---\s*$ can lead to excessive backtracking, e.g. in matching

---                    b

Once the \s* section hits b, the regular expression will attempt to backtrack and, in effect, keep looking for the end of the line (even though it's obvious that we can't find the end of the line by backtracking!). This leads to quadratic behavior in the regular expression engine, effectively backtracking 0.5 * n2 times, where n is the length of the greedy match. In situations like this, we likely want to use a possessive / non-backtracking class, e.g. ---\s*+$.

Tools like Regex101 can be useful when diagnosing the performance of a particular regular expression.

Logging

When logging messages, use the following levels:

  • DEBUG level for diagnostics and information.
  • WARNING level when recovering from an unexpected state and continuing execution.
  • ERROR level when an unexpected state is unrecoverable and will result in an operation failing to complete.

TODOs

Use TODO comments as a tool to track work that needs to be done prior to merging a development branch.

We generally avoid leaving TODO comments on building/shipping branches so that it's possible to use a TODO search effectively on development branches.

Philosophy

We prefer using functions and closures (boost::function() and boost::bind()) to classes.

We use a rich error object (Error) rather than exceptions. This is to force reflection about what to do with each error (end the world, log silently, take another code path, warn the user, etc.). This sometimes results in extra code but it makes the overall system much easier to understand (no parallel code path which you can't see when reading the code for a function). More rationale is found in the Google C++ Style Guide.

Pointers are always managed using boost::shared_ptr or boost::scoped_ptr. The only exception to this is occasionally we'll use a naked pointer for static data to work around problems with C++ order of static construction/destruction.

When passing arguments by reference to functions, we always put them at the end of the signature and use a "naked" pointer (e.g. std::string*). This might seem counter to #3, but having established that memory is never managed with explicit new and delete there's no semantic ambiguity to std::string* (i.e. no uncertainty about who deletes it). Whenever you see a naked pointer in our code it is always owned by someone else so you never delete it.

The major benefit to this is that it's very clear when reading code at the call site that a side effect of the function is to modify an input parameter. If we were were just passing by C++ reference then there's no way to "see" that there is a side effect at the call site. This idea also came from the Google Style Guide.

Clone this wiki locally