Code formatting - NetLogo/NetLogo GitHub Wiki
This document does not concern itself with Java code style. Ideally, you should not be writing new Java code. If you absolutely must write new Java, try to glean the code style from the plethora of existing Java files and emulate it.
For anything else that you can't find the answer to here, please see the Scala Documentation site for their suggestions. If you can't find any advice about a particular something anywhere, do it your way and see what happens.
In version control (Git), aim to make your commits small, orthogonal chunks of functionality. A commit should neither contain huge swathes of unrelated changes nor a small set of changes that cannot stand alone. Commits that end up on the main branch of a GitHub repo are each expected to compile and pass the tests. Occasional lapses in this are permissible, but don't make a habit of it.
Commit messages should be descriptive and fit within 72 characters (the Git standard). Any information you decide to include that isn't entirely concerned in describing what the commit does (e.g. explanation of caveats, personal woes, or contracts made with Satan) should be placed in the area below the main commit message.
Clearly, these rules don't apply to commits on work-in-progress (WIP) branches (neither local nor remote). However, do make sure to rebase the WIP branch to clean up the history before bringing it over to the main branch.
Bite-sized commits are easy to revert and cherry-pick. They are also easier to understand when looking through old version history and trying to understand what was being done and why.
Commits are expected to keep the repo in a good state (i.e. compiling; tests passing) because it makes bisecting a lot easier, and also for the same reasons given above for having small commits (e.g. If commit A doesn't pass the tests, but commit B does, and I revert commit B, I likely now have a test-failing repo on my hands, and I probably don't know why they're failing).
Commit messages are expected to fit within 72 characters, with a suggestion of keeping it to around 50 characters, because that's simply the Git standard.
Lines are limited to 120 characters in length.
Use your judgment.
The era of widescreen monitors is a wondrous thing.
When not awkward to do so, code should be padded with extra spaces to make clear the relations of parts across lines.
"When not awkward to do so" is the operative phrase. If doing this would require padding the code with a large block of spaces (e.g. 20+ spaces), probably don't worry about it.
When the common parts of a block of code line up, it is much easier to read. This is not something that is unique to code; how pleased would you be if each cell in a spreadsheet had a unique size? How easy would that be to read? Very often, code takes the form of applying very similar operation operations to multiple pieces of partially-related data, and it's very convenient to be able to look at one row and one column of a block of code and immediately know what the entire block does.
One of the most common complaints about code alignment is that it makes diffs unreadable. The get around this problem in Git, simply perform your diff
or show
or whatever with the -w
flag to not show whitespace changes. Whoa.
2-space indents. No tab characters ever. No trailing spaces. Ensure that your files all end in newlines. (git diff
can easily detect many of these errors.)
From the official Scala style documentation: "There should be no space between parentheses and the code they contain. Curly braces should be separated from the code within them by a one-space gap, to give the visually busy braces “breathing room”."
Adding extra spaces for your indentation in order to make code line up is perfectly acceptable—encouraged, even.
2-space indentation is the Scala standard way.
Tabs are obnoxious and cause the code to be displayed differently on different machines.
Trailing spaces serve no use and benefit no one.
Files that don't end in newlines can wreak havoc on certain Linux tools.
Package names should not be mentioned outside of import declarations.
Bad code:
val x = scala.collection.Seq[Int]()
Less-bad code:
import scala.collection.Seq
val x = Seq[Int]()
In whatever scope you choose to bring in your imports, the imports should be done as soon in the scope as possible.
Bad code:
def foo(x: String): Int = {
val sentinel = "f"
import x._
val bool = contains(sentinel)
val b2 = !bool
import bool._
if (||(b2)) 1 else 0
}
Less-bad code:
def foo(x: String): Int = {
import x._
val sentinel = "f"
val bool = contains(sentinel)
import bool._
val b2 = !bool
if (||(b2)) 1 else 0
}
Imports should be listed explicitly (no wildcards) and somewhat alphabetized. Unused imports should not be.
Bad code:
import scala.collection.{ Set => SSet }
import java.util._
import java.net.URL
import scala.collection.Map
import scala.collection.Seq
import scala.collection.Seq.empty
import scala.util.Try
object Derp extends App {
val set = SSet(new Date, new Date, new Date)
val seq = Seq(new URL("http://yahoo.com"), new URL("http://google.com"), new URL("http://duckduckgo.com")) ++ empty
Try(set.zip(seq))
}
Less-bad code:
import java.net.URL
import java.util.Date
import scala.collection.{ Seq, Set }
import scala.collection.Seq.empty
import scala.util.Try
object Derp extends App {
val set = Set(new Date, new Date, new Date)
val seq = Seq(new URL("http://yahoo.com"), new URL("http://google.com"), new URL("http://duckduckgo.com")) ++ empty
Try(set.zip(seq))
}
The tree style is only required at the top-level of a file. These rules are not strictly enforced for small files or small collections of imports.
The goal is to keep files from becoming entirely cryptic. In Scala, lots of things can be imported (e.g. packages, classes, objects, methods, variables, implicit methods, implicit classes, implicit variables, types), and it becomes problematic when a plethora of different stuff is imported at the top of the scope in wildcards, and the programmer is then unable to easily determine where unfamiliar terms are coming from.
Avoiding wildcard imports and unused imports is useful because doing so improves compilation times and makes the code more explicit. It can also help in quickly learning a file's purpose simply by looking at what it depends upon.
Having an explicit, centralized, ordered list of imports is handy when refactoring code. It's particularly useful when moving files into new packages, since it makes it easier for the programmer to rename/remove a whole package of imports easily, and it makes it easier for the programmer to find particular import declarations.
The proposed rules are not without their problems, but I (Jason) find it better than the standard way of declaring imports. Some people complain about the time that it takes to set up the imports in this way. I feel that it's not particularly difficult to write the imports in this way, and it takes no more than a few minutes per commit. If you find that it somehow derails your train of thought to deal with import style while trying to solve programming problems, simply clean up the import style right before making the commit, or during a rebase before pushing.
Scala's match
should only be used conservatively—essentially, only as a last resort to calling isInstanceOf
, asInstanceOf
, or accessor methods.
Bad code:
Option(3) match {
case Some(x) => x*2
case None => 0
}
Less-bad code:
Option(3) map (_ * 2) getOrElse 0
If your data type provides methods for avoiding pattern matching, you should (generally) use them. These methods keep your data coherent and allow you to let the type system and the API guide you to correct code. Pattern matching gives you greater power, but it is not an amount of power that you need when you have data-coherent methods at your disposal, and to use pattern matching in such cases is merely to tempt fate with code that might be subtly broken.
Bad code:
(1, 5) match { case (x, y) => x + y }
Less-bad code:
val (x, y) = (1, 5)
x + y
The point here is that match
should not be used if you're only trying to do a destructuring bind (or, basically, if you only have one possible case that you're matching against). As with Example 1, match
gives you too much expressive power for such a simple idea.
Bad code:
val msg =
if (myList.length == 3) {
val List(x, y, z) = myList
s"This list is a triple: $x and $y and $z"
}
else if (myList.length == 2) {
val List(x, y) = myList
s"A two-item list—a personal favorite of mine: $x and $y"
}
else if (myList.isEmpty)
"That list was empty, you sneaky devil!"
else
"Error: Improperly-sized list!"
Less-bad code:
val msg = myList match {
case List(x, y, z) => s"This list is a triple: $x and $y and $z"
case List(x, y) => s"A two-item list—a personal favorite of mine: $x and $y"
case Nil => "That list was empty, you sneaky devil!"
case _ => "Error: Improperly-sized list!"
}
match
can be applied tastefully. This example demonstrates this; here we need the full power of match
in order to write the most-elegant code that Scala will allow us to write, since we need to be checking conditions and binding values in a way that the API for our data type (List
) doesn't—realistically can't—provide.
Unknown.
Conciseness, clarity, and declarativity are the keys. The examples above explain how to pursue these values.
It's not quite a style guideline, but Scala's many different sorts of classes should be used tastefully.
-
object
: For creating a singleton object (possibly to contain some members similar to Java'sstatic
members) -
type
: A simple alias for an existing type -
trait
: A stackable multiple inheritance construct -
class
: The default choice; a concrete, single inheritance construct -
case class
: Similar to a class, but intended be more like a record than a class; it's mostly just for grouping data together and pattern matching, and usually lacking any additional methods -
abstract class
: Basically, never use this; it's an abstract, single inheritance sort of class that has little use beyond Java compatibility
Judgment calls all around.
All of these different constructs communicate different intentions. Choose the one that best signals your intended use of the construct.
Method return types should be explicitly written unless you feel certain that the return type of your method is entirely obvious.
Obvious:
def x = 3
def x = myFavoriteInt
def myString = {
val y = shibbity
val x = shobbity
x wibbity x
}
def myJabberwocky = {
val v = vorpal
val f = fluon
123
}
Non-obvious:
def derp = getAwesomeValue
The type that I have in mind is Int
.
def runRubeGoldberg() = {
val x = lightFire()
val y = boilWater()
val z = spinTurbine()
(x |@| y |@| z) { z + y + x }
}
The type that I have in mind is ValidationNEL[String, Int]
.
None. If you're not sure whether or not the type is obvious, allow me to help you: It is not obvious.
Knowing a method's return type is monumentally useful when attempting to use that method. Eliding the return type declaration can save you some typing, yes, but, if the return type isn't obvious, it makes extra work for anyone attempting to use the method (sometimes, yourself included).
Favor x: Int
over x : Int
.
Whitespace padding for lining up code.
Consistency; it's the Scala standard style for type ascriptions.
Avoid the string concatenation operator +
. Instead, favor string interpolation (or <myString>.format(<myFormat>)
if need be). Favor Scala's multiline string literals over concatenating strings and newline characters.
If +
really leads to the clearest code, use it, but such situations don't seem to come up often.
As the "Exceptions" text suggestions, it's about clear code. The very idea of adding strings is kind of nonsense to begin with, but, on top of that, it leads to code that is harder to follow than is a lone string that is just a literal with values embedded into it (rather than a grab bag of string literals and string values "added" together).
Opening curly braces must not be placed on a new line. Closing curly braces are flexible.
Favor parentheses for lambdas that are single expressions. Use curly braces for lambdas whose bodies contain multiple expressions (not that Scala gives you a choice on this).
Avoid unnecessary braces and parentheses, especially around single expressions and case
statement bodies. This violates standard Java code style, but this is a rule that makes more sense in Scalaland, where basically everything is an expression.
Bad code (single expressions):
def x(y: Int, z: Boolean): Int =
if (z) {
(y)
}
else {
y match {
case a | if a > 3 => {
println("Big number!")
a / 2
}
case a => {
println("Small number...")
a
}
}
}
}
Less-bad code:
def x(y: Int, z: Boolean): Int =
if (z)
y
else
y match {
case a | if a > 3 =>
println("Big number!")
a / 2
case a =>
println("Small number...")
a
}
Sometimes, extra braces and parens can be used to help to make precedence clearer (e.g. def myFunc(x: Int, y: Int): Int = x + (3 * y)
).
Regarding placement of opening curly braces, a brace on a newline causes problems with Scala's semicolon inference, and can be interpreted as a separate block that isn't tied to the intended code.
If the bounding characters are performing no effectual/communicative purpose, they are unnecessary and simply syntactic noise that distracts readers and makes code more verbose. This should be avoided.
Standard Scala code style rules draw a distinction between def foo: Int
and def foo(): Int
. The former is for pure/referentially transparent functions, and the latter is for impure functions.
Sometimes, an abstract method is declared as empty-paren, but its implementation is pure. Despite its purity, the implementing method should be empty-paren, like its abstract parent.
Knowing whether or not a value is pure informs the programmer as to whether or not the value can be freely reused/composed. You don't want to be using some foo
as if it's a mere value, only to discover that it's having a subtle side-effect in an unrelated portion of your program.
Methods of return type Unit
should always use an =
right before the method body.
Bad code:
def printX(x: String) { println(x) }
Less-bad code:
def printX(x: String) = println(x)
Years ago, the style of the "bad code" was preferred in the Scala community, but that is no longer the case. The =
should always be used, and the other rules for declaring the return type of your method should be used in determining whether or not to declare the return type. If you're not sure whether or not the Unit
return type is clear, favor explicitness.
Whether or not to wrap the body of a Unit
method in curly braces is a judgment call.
You'll likely find many functions laying around that use old the style. Those functions aren't evil, but it would be nice if you were to convert any you come across to the "modern" style.
It is an edict from on high (by the great Odersky himself). The idea is basically to have consistency with the syntax for non-Unit
methods.
Objects, classes, types, type parameters, and constants should have their names in title case. Packages should be named in all lowercase. Everything else should be camelcased.
This is more of a suggestion than a rule. Use your best judgment for the "language"/API you're trying to build.
Title case is used for constants so they can be pattern matched against (which is a nuance of Scala). Packages inherit their naming conventions from Java (for lack of a better suggestion, and for interop). Most of the other conventions also come from Java.
Methods/functions should generally have names that are verbs or imperatives. Objects should have names that are nouns. Prefer the shortest name you can provide that both accurately and clearly describe the purpose of the value. If your name takes more than 3 camelcased words to describe, you likely either need to think of a better name, or rethink the nature of this value for which no sensible word exists.
There are always exceptions. These suggestions should be treated simply as a starting point.
When deciding whether or not to bind a value into a variable (e.g. a val
), prefer the path of clarity. If you have a function call that you can't fit onto a single line, that's often a sign that you should try binding some values to meaningful names.
Bad code:
throw new EngineException(
context,
this,
I18N.errorsJ.getN(
"org.nlogo.prim.$common.expectedBooleanValue",
displayName,
Dump.logoObject(tester),
Dump.logoObject(value)))
Less-bad code:
val key = "org.nlogo.prim.$common.expectedBooleanValue"
val testerDump = Dump.logoObject(tester)
val valueDump = Dump.logoObject(value)
val i18nStr = I18N.errorsJ.getN(key, displayName, testerDump, valueDump)
throw new EngineException(context, this, i18nStr)
This is yet another "rule" that is more suggestion than rule.
When function calls go trailing off and nesting off into oblivion, it can be tough to tell at a glance what exactly is going on—what arguments are being passed in for what, and why. Flattening things out and giving them names helps to understand the thought-process that went into a programmer's writing of a function call.
Semicolons should not be placed at the end of lines.
Semicolons may be used for combining multiple short statements onto the same line.
Scala infers semicolons at newlines, so explicitly placing them is needless. For the sake of consistency with the rich collection of already-existent code that doesn't use them, you should also avoid them.
Don't check in commented-out code.
Other than that, comments should be "signed" with your initials and the date.
Really small/incontrovertible comments might not require a signature, but, if you have a comment like that, you can usually communicate the same idea without a comment by simply writing clearer code (e.g. instead of val c = "%20" // Space char
, you could write val spaceChar = "%20"
).
We don't check commented-out code into version control, because, umm... that's what version control's for. You can use Git to quickly return to the version of the code before you commented it out/deleted it, so commenting it out usually very unnecessary.
The reason that comments should be initialled is because that helps any reader of the comment to easily track down the commenter for further explanation of the comment. The reason that comments should be dated is because the date adds a bit of context to the comment, and, also, the more recent a comment is, the more likely it is to still be true.