Clean Code by Robert C. Martin

Most impactful concepts for me

Refactoring as a continual and incremental process
The importance of names
- clear, concise, meaningful, consistent, don't "encode" (e.g. Hungarian), long isn't necessarily bad
Design classes with high cohesion and DI
Functions as one level of abstraction and the Stepdown Rule
- e.g.
  To do X, first we do A, then we B, then we C
  - To do A, we ...
  - To do B, we ...
  - To do C, we ...
  To do Y, first we ...
Keep it short
- minimize length of functions, of classes, and of parameter lists
Clean boundaries such as 3rd party interfaces and unknown interfaces
- some applicable design patterns to help accomplish below concepts: Adapter, Facade, Bridge
- 3rd party: don't let the 3rd party API control you, integrate it with an interface customized to your app
- unknown: create the interface you wish you had, mock it in tests, then integrate real interface (e.g. use adaptor)
Do not BDUF (big design up front)
- simple design up front, delay decisions as long as possible--you'll theoretically be more informed later
the complimentary nature of objects and data structures (<-- copy/paste-search to read the notes)

What is clean code?

Robert C. Martin's Collection of thoughts from:

Bjarne Stroustrup: inventor of C++; author of The C++ Programming Language
Grady Booch: author of Object Oriented Analysis and Design with Applications
"Big" Dave Thomas: founder of OTI, godfather of the Eclipse strategy
Michael Feathers: author of Working Effectively with Legacy Code
Ron Jeffries: author of Extreme Programming Installed and Extreme Programming Adventures in C#
Ward Cunningham: inventor of Wiki; inventor of Fit; coinventor of eXtreme Programming; Motive force behind Design Patterns; Smalltalk and OO thought leader; the godfather of all those who care about code
Uncle Bob

clean code is:

Does one thing well
logic straight forward--bugs hard to hide
Error handling is complete
performance close to optimal--doesn't tempt adding messy unprincipled optimizations
reads like well-written prose
full of crisp abstractions and straightforward lines of control
can be read and advanced by a developer other than its original author
has unit and acceptance tests
provides one way rather than many ways to do one thing
minimal dependencies which are explicitly defined
provides a clear and minimal API
should be literate
looks like it was written by someone who cares
runs all the tests
contains no duplication
expresses all the design ideas that are in the system
minimizes the number of entities such as classes, methods, functions, and the like
meaningful names
do one thing
early use of simple abstractions to
- enable future change
- maintain focus on the purpose (guide away from extraneous implementations)
expressiveness
each routine you read turns out to be pretty much what you expected
makes the language look like it was made for the problem
easy to read

ALL BELOW ARE THE UN-EDITED COMPLETE NOTES

Forward:

God is in the details

- Ludwig mies van der Rohe

TPM (Total Productive Maintenance)

a 1951 Japanese quality approach
5S principles
1. Seiri
  - think "sort" in English
  - Knowing where things are
  - Suitable naming
2. Seiton
  - think "systematize" in English
  - A place for everything, and everything in its place
3. Seiso
  - think "shine" in English
  - get rid of extraneous comments, commented out code, etc.
4. Seiketsu
  - standardization
  - group must agree on coding style and set of practices
5. Shutsuke
  - self-discipline
  - having the discipline to follow the practices and frequently reflect on one's work and be willing to change

Continually maintain and refactor code. Ship code that is more readable and more maintainable from the start.

architect Christopher Alexander views every act of design itself as a small, local act of repair.

French poet Paul Valery: a poem is never done and bears continual rework, and to stop working on it is abandonment.

The design lives in the code; design is a process, not a static endpoint

The code is the design
Simple code

Introduction

The comic illustration about WTFs/minute as the only valid measurement of code quality

Craftsmanship

There are two parts to learning craftsmanship: knowledge and work. You must gain the knowledge of principles, patterns, practices, and heuristics that a craftsman knows, and you must also grind that knowledge into your fingers, eyes, and gut by working hard and practicing.

(Page 27).

Ch. 1 Clean Code

Some have expressed that one day there will be no need for programmers

this is nonsense, because there will always be a need to translate requirements into machine language
humans can't even master this translation of vague requirements
their will always be a need for programmers to perform this translation--and it takes an expert to do this effectively; a machine cannot do this, a human must be the translator

Bad code can bring a company down

a company had a killer app that became popular quick
eventually the company died bc users stopped using it due to excessive bugs, crashes, etc.
an employee confirmed that this was due to creating a messy, un-maintainable code base while rushing to production

Wading through bad code--being impeded, slowed down by bad code

We commonly allow bad code saying that we'll clean it up later

but LeBlanc's law says

later means never

Programmer's responsibility to defend the quality of the code against the boss's schedule-based demands

The only way to go fast is to keep the code as clean as possible.

What is clean code?

Collected thoughts from:

Bjarne Stroustrup: inventor of C++; author of The C++ Programming Language
Grady Booch: author of Object Oriented Analysis and Design with Applications
"Big" Dave Thomas: founder of OTI, godfather of the Eclipse strategy
Michael Feathers: author of Working Effectively with Legacy Code
Ron Jeffries: author of Extreme Programming Installed and Extreme Programming Adventures in C#
Ward Cunningham: inventor of Wiki; inventor of Fit; coinventor of eXtreme Programming; Motive force behind Design Patterns; Smalltalk and OO thought leader; the godfather of all those who care about code
Uncle Bob

clean code is:

Does one thing well
logic straight forward--bugs hard to hide
Error handling is complete
performance close to optimal--doesn't tempt adding messy unprincipled optimizations
reads like well-written prose
full of crisp abstractions and straightforward lines of control
can be read and advanced by a developer other than its original author
has unit and acceptance tests
provides one way rather than many ways to do one thing
minimal dependencies which are explicitly defined
provides a clear and minimal API
should be literate
looks like it was written by someone who cares
runs all the tests
contains no duplication
expresses all the design ideas that are in the system
minimizes the number of entities such as classes, methods, functions, and the like
meaningful names
do one thing
early use of simple abstractions to
- enable future change
- maintain focus on the purpose (guide away from extraneous implementations)
expressiveness
each routine you read turns out to be pretty much what you expected
makes the language look like it was made for the problem
easy to read

The boyscout rule:

Leave the campground cleaner than you found it.

always refactor

Ch. 2 Meaningful Names

Use Intention-Revealing Names

example, instead of a variable d with a comment, use a meaningful variable name like daysSinceCreation

Avoid disinformation

don't use names that already have entrenched meanings which are different from the intended meaning
don't use cryptic names that look like something else
- e.g. l (lower case "L" looks like an Arabic numeral "1")

Make meaningful distinctions

don't make a minor name change to satisfy a compiler requirement
- e.g. 2 different things in the same scope that you think warrant the same name; you just misspell one of them
  - later, a misspelling correction could cause a compiler error
  - instead, these 2 different things should have 2 different names
don't use number series naming!!!
- e.g. a1, a2, a3, ...
don't use "noise words"
- e.g. a Product class that has ProcuctInfo and ProcuctData properties
  - "Info" and "Data" are meaningless noise words

Use pronounceable words

compare

class DtaRcrd102
{
    private Date genymdhms;
    private Date modymdhms;
    private final String pszqint = "102";
}

class Customer
{
    private Date generationTimeStamp;
    private Date modificationTimeStamp;
    private final String recordId = "102";
}

Use searchable names

single-letter names can ONLY be used as local variables inside short methods
The length of a name should correspond to the size of its scope

Avoid encodings

don't encode type or scope into names
Hungarian Notation was common due to old language requirements and limitations
- this is no longer necessary and should be avoided
don't use member prefixes (e.g. m_description "m_" is unnecessary clutter and a marker of old code)

Interfaces and Implementations

scenario: building an abstract factory for the creation of shapes
- the factory will be an interface and will be implemented by a concrete class
it's common to see IShapeFactory
Uncle bob prefers encoding implementation rather than enconding abstraction
- encoding the interface is a distraction at best and TMI at worst
- prefers the user not know they are working with an interface
- just want them to know it's a shape factory
- if I must encode either the interface or the impementation, I choose the implementation.
  - prefer ShapeFactoryImp or even CShapeFactory over IShapeFactory
my thoughts here:
- isn't there an overarching principle to abstract away and/or hide implementation details?
- and if so, isn't IShapeFactory more aligned with that principle than ShapeFactoryImp or CShapeFactory??

Avoid mental mapping

common to see variable names that are neither from the problem domain nor the solution domain
- e.g. single letter variable name that we must mentally map to the concept
"smart" vs professional programmers
- to professionals, clarity is king

Class names

a class name should not be a verb
a class name should be a noun or noun phrase
- e.g. Customer, WikiPage, AddressParser

Method names

method names should be verb or verb phrases
- e.g. postPayment, deletePage, save
Accessors, mutators, and predicates
- should be named for their value and prefixed with "get", "set", and "is", according to the javabean standard

string name = employee.getName();
customer.setName("mike");
if(paycheck.isPosted()) // ...

Overloaded constructors

use static factory methods with names that describe the arguments

Complex fulcrumPoint = Complex.FromRealNumber(23.0);

// is generally better than

Complex fulcrumPoint = new Complex(23.0);

consider enforcing their use by making the corresponding constructors private

Don't be cute

say what you mean. mean what you say
don't use "HolyHandGrenade" instead of "deleteItems" for entertainment value

Pick one word per concept

be consistent, choose the word for the concept and stick with it
- "fetch", "retrieve", "get" as equivalent methods for different classes is confusing
"controller", "manager", "driver" in same code base is confusing

Don't pun

example "add" method
- using "add" for a method that adds 2 items
- and then using "add" when adding an item to a list
  - maybe instead use "append" or "insert"

Use solution domain names

programmers know technical terms and programming terms like pattern names and algorithm names
e.g. AccountVisitor (visitor pattern)

Use problem domain names

when solution domain names aren't sufficient, use domain names
can at least consult a domain expert
the code that has more to do with the problem domain concepts should have names drawn from the problem domain

Add meaningful context

names which are meaningful on their own are uncommon
commonly it's necessary to add context
use well-named namespaces, classes, and functions to provide context
in rare cases, prefixes might be necessary

Don't add gratuitous context

example: "Gas Stataion Deluxe" application
- you have a "GSDAccountAddress"
- later you need another mailing address, do you reuse that specific class?
- the "GSD" is pointless, and worse, makes it less re-usable
- another downside in this scenario, is that searching for GSD in the app pulls up more results than you want due to overusing "GSD"
shorter names are preferred over longer as long as they are still clear
- "Address" is a good class name and can have instances named "AccountAddress" and "CustomerAddress"
- if you need to distinguish between MAC address, port address, and web address, you might consider "PostalAddress", "MAC", and "URI"

Ch. 3 Functions

Small!

1st rule of functions is that they should be small
2nd rule of functions is that they should be smaller than that
should hardly ever be even 20 lines long
shoot for less than a handful of lines
and short lines--not taking up the width of the screen

Blocks and indenting

if and loop blocks should be one line long--a function call
that function should have a nicely descriptive name--documentation value
no nesting (e.g. ifs inside ifs)
all this keeps functions short and readable

Do one thing

functions should do ONE THING
how do we know if a function is doing one thing??
- we ensure that the body of the function remain at ONE LEVEL OF ABSTRACTION lower
- the function's main concept (hopefully communicated by its name) is broken into steps
- those steps should be describable in a "TO" paragraph
- here is an example function followed by it's "TO" paragraph

// this example function from the book is a refactored version of the original
// - two versions of refactoring later actually

public static string renderPageWithSetupsAndTeardowns(PageData pageData, boolean isSuite)
throws Exception
{
    if (isTestPage(pageData)) includeSetupAndTeardownPages(pageData, isSuite);
    return pageData.getHtml();
}

// TO paragraph:
// TO RenderPageWithSetupsAndTeardowns, we check to see whether the page is a test
// page and if so, we include the setups and teardowns. In either case we render
// the page in HTML.

note: one way to know a function is doing just one thing is if you have a hard time extracting another function out of it who's name isn't merely a restatement of its implementation.

Sections within functions

a function that can be divided into sections (e.g. declarations, initializations, etc.) is a sign that it's doing more than one thing
functions that do one thing cannot be reasonably divided into sections

One level of abstraction per function

to make sure that a function is doing just one thing, make sure that the statements within it are all at the same level of abstraction

The Stepdown Rule: Reading code from top to bottom

the code should read like a top-down narrative
- each function should be followed by those at the next level of abstraction so that we can read the program descending one level of abstraction at a time as we read down the list of functions
- i.e. we want to be able to read the program as though it were a set of TO paragraphs, each of which is describing the current level of abstraction and referencing subsequent TO paragraphs at the next level down
- e.g.
  - To print all prime factors under user input n we request user input for n, then we get all prime factors under n, and then we print these prime factors
    - to request user input for n ...
      - to ...
    - to get all prime factors under n ...
      - to ...
    - to print these prime factors ...
      - to ...

Switch statements

these are hard to keep small
they also are difficult to make do one thing, given that by their nature they do N things
we can't always avoid switch statements but we can bury them at a low-level and never repeat them
general rule for switch statements
- they can be tolerated if they appear only once, are used to create polymorphic objects, and are hidden behind an inheritance relationship so that the rest of the system can't see them.

Use descriptive names

don't be afraid to use a long name
- a long descriptive name is better than a long descriptive comment
don't be afraid to spend time choosing a name, even try several different names and read the code with each
choosing a descriptive name can help clarify the design
- it's not uncommon that hunting for a good name results in a favorable restructuring of code

Function arguments

the ideal number of arguments for a function is zero (niladic)
next is one (monadic), followed closely by 2 (dyadic)
3 (triadic) should be avoided where possible
more than 3 (polyadic) requires very special justification--and then shouldn't be used anyway

why is it so important to minimize # of arguments?

arguments make it more difficult to understand the function
increase testing effort the more the arguments, the argument combinations to test for

Common monadic forms

common
- asking a question about an argument: bool fileExists("MyFile")
- operating on an argument: InputStream fileOpen("MyFile")
less common but still useful
- event: void passwordAttemptFailedNTimes(int attempts)
  - system interpret is to interpret this function as an event and use the argument to alter state
  - use this form with care, make it clear that this is an event
avoid other forms
- void includeSetupPageInto(StringBuffer pageText)
  - using an output arg instead of return value for transformation is confusing
  - if a function is going to transform its input arg, the transformation should be the return value

Flag arguments

passing a boolean into a function is a truly terrible practice

Dyadic functions

Point p = new Point(0,0) is obviously a valid dyadic function
- a Cartesian point naturally takes 2 arguments
- in fact the 2 arguments are ordered components of a single value
assertEquals(expected, actual)
- this is obviously fine, but even this has a problem
- what order to pass in these arguments? you just have to learn it
just be aware that dyadic functions come at a cost
and be aware of the mechanisms that might be available to convert them to monads

Triads

always try to avoid if possible

Argument objects

when a function looks like it should take more than 2-3 args, these args could likely be wrapped in a class

Circle makeCircle(double x, double y, double radius);
Circle makeCircle(Point center, double radius);

is reducing the # of args by creating objects out of them cheating? NO!
when groups of variables are passed together (like the above Point for the circle) they're likely part of a concept that deserves a name of its own

Argument lists

some functions take a variable # of arguments

String.Format("%s worked %.2f hours.", name, hours);

if the variable arguments are all treated identically as in the above string format, then they are equivalent to a single argument
thus String.Format() is dyadic

// another way to think of this function
String.Format(String format, Object... args);

all the same rules apply to that dyad
same is true for monads, dyads, and triads that take variable arguments

void monad(Integer... args);
void dyad(String name, Integer... args);
void triad(String name, int count, Integer... args);

Verbs and keywords

monads should be verb/noun pairs
- e.g. write(name) or perhaps writeField(name) is even better due to clarifying what name is
encoding argument names into function names can be helpful
- e.g.
  - assertEquals(expected, actual) ... what order to we put our arguments?
  - assertExpectedEqualsActual(expected, actual) ... this fixes the question of ordering

Have no side effects

side effects are unexpected changes to function arguments, instance properties, scope variables, globals, etc.
these can be very damaging, because the function caller has no idea of or even reason to expect these effects
example
- an example was given of a function called checkPassword which
  - not only checked the password, but also
  - initialized Session if password was correct
- the side affect is initializing Session--the function name did not tell the caller about Session
- fix by changing name to checkPasswordAndInitializeSession (... lol, but no, bc SRP)

Output arguments

in general, avoid output arguments, if a function must change the state of something, have it change the state of its owning object

Command query separation

functions should either do something or answer something, but not both
negative example: public boolean set(String attribute, String value);
- leads to odd code like this: if(set("username", "uncle bob")){...}

Prefer exceptions to returning error codes

returning error codes from command functions is a subtle violation of command query separation
read this section on page 46 (77 of PDF)
- the whole section is short and fits on this one page
- the section is a clear explanation with a wonderful code illustration/clarification
interesting note:
- James Coplien seems to have the complete opposite opinion
  - source: https://henrikwarne.com/2014/09/04/a-response-to-why-most-unit-testing-is-waste/#comment-5502
  - on exceptions
    - "In general, I hold exceptions to be a bad idea"
    - "Exceptions are just hyper galactic GOTOs"
    - "If you’re throwing an exception across more than two activation records, you probably can’t reason about the code"
    - the paragraph he wrote that those quotes are from also mentions C++ though
      - so idk if his opinion written there is limited to exceptions in C++, but i doubt it??
  - on return codes
    - "with good programming discipline they [exceptions] can be replaced by return codes ... and handled locally"

Extract try/catch blocks

if you have multiple function calls in a try block, refactor/extract them into a new function, and call just that single function in the try block instead
easier to read and modify

Error handling is one thing

functions should do just one thing
error handling is one thing
if a function has a try block in it:
- the keyword try should be the first word in the function body, and
- nothing after the try/catch block should exist in that function body

Error codes are dependency magnets - BAD!!!

dependency magnets are classes or enums which many other classes must reference
example: an error code enum ... add a new error code, all modules that reference this must re-compile
better: exceptions - a new exception inherits from the base Exception class and won't have that issue

Don't repeat yourself

don't repeat yourself

How to write good functions?

start ugly, refactor, refactor, refactor
TDD: don't just start ugly and refactor
so start ugly but confident due to thorough unit test coverage
then in the safety of continually passing unit tests, refactor, refactor, refactor

Ch. 4 Comments

Comments are at best a necessary evil.

Examples of worth while comments

authorship and copyright at top of file
others were mentioned but didn't really seem worth noting, bc they were still noted as debatably worth using

Examples of bad comments

a bunch of these examples, but not worth taking note of
he especially liked showing examples similar to this:

// this is where we get the name
public string getName() {}

BOTTOM LINE of this chapter: don't use comments almost ever

Ch. 5 Formatting

a source file's vertical formatting

try to keep it short
- example solutions like fitness and junit keep most files under 100 lines
vertical openness between concepts - effective use of blank lines to separate concepts
vertical density - grouping lines of a concept together
vertical distance
- closely related concepts should be vertically closer to each other

declare variables as close to their usage as possible

local function variables should be declared at the top of the function
instance variables should be declared at the top of the class
- this should not increase the vertical distance of these variables, because in a well-designed class, they are used by many, if not all, of the methods of the class

dependent functions

if one function calls another, they should be vertically close
caller above callee

conceptual affinity

code that is closely related to other code should have little vertical distance
caller/callee is just one example
another example would be a group of functions that perform variations of a basic task

vertical ordering

high-level functions first
low-level details after

horizontal formatting

according to the distribution of his example solutions
- 20-60 chars represents 40%
"Hollerith" limit of 80 is "a bit arbitrary"
100-120 should be the max
uncle bob sets his limit at 120

horizontal openness and density

white space can help to associate or disassociate concepts
e.g. myFunction(parm1, param2)
- no space between function name and parenthesis
- space between params bc they are separate things
e.g. return (-b + Math.sqrt(determinant)) / (2*a) and return b*b - 4*a*c
- no space in the higher precedence multiplied factors
- space between the lower precedence added terms

horizontal alignment

var numbVar                 = 1;
var numbVarLonger           = 1;
var numbVarRidiculouslyLong = 1;

e.g. above shows aligning variable assignments
don't do this
if you have a long list of declarations, you may need to refactor something(s) out

indentation

very important
this helps show the hierarchy of scopes (namespace, class, method, block, etc.)
don't break these indentations for short statements that normally are expressed in blocks

// prefer this
public class CommentWidget extends TextWidget
{
    public static final String REGEXP = "example regex pattern";

    public CommentWidget (ParentWidget parent, String text)
    {
        super(parent, text);
    }

    public String render() throws Exception
    {
        return "";
    }
}

// over this
public class CommentWidget extends TextWidget
{
    public static final String REGEXP = "example regex pattern";

    public CommentWidget (ParentWidget parent, String text){super(parent, text);}

    public String render() throws Exception{return "";}
}

dummy scopes

sometimes a for and while loops have dummy scope bodies
try to avoid these or at least make them very obvious

// prefer this
while (dis.read(buf, 0, readBufferSize) != -1)
    ;

// over this
while (dis.read(buf, 0, readBufferSize) != -1);

team rules

decide on formatting rules for the team, and each team member needs to follow these rules
the entire code base must be consistent

Uncle Bob's formatting rules

Listing 5-6, pages 91, 92 (pp. 122, 123 of the PDF)

Ch. 6 Objects and Data Structures

We keep our variables private, because we don't want anyone else depending on them.
We want the freedom to change their type or implementation.
Why do so many programmers automatically add getters and setters to their objects, exposing their private variables as if they were public?

Data abstraction

it's better to not just add auto-getter/setters
- this does not hide implementation
use abstraction to hide data implementation while still allowing users to manipulate

example:

// this reveals implementation
public interface Vehicle
{
    double getFuelTankCapacityInGallons();
    double getGallonsOfGasoline();
}

// this hides implementation
public interface Vehicle
{
    double getPercentFuelRemaining();
}

Data/Object anti-symmetry

Procedural Shapes

public class Square
{
    public Point topLeft;
    public double side;
}

public class Rectangle
{
    public Point topLeft;
    public double height;
    public double width;
}

public class Circle
{
    public Point center;
    public double radius;
}

public class Geometry
{
    public final double PI = 3.14;

    public double area(Object shape) throws NoSuchShapeException
    {
        if (shape instanceof Square)
        {
            Square s = (Square)shape;
            return s.side * s.side;
        }
        else if (shape instanceof Rectangle)
        {
            Rectangle r = (Rectangle)shape;
            return r.height * r.width;
        }
        else if (shape instanceof Circle)
        {
            Circle c = (Circle)shape;
            return PI * c.radius * c.radius;
        }
        throw new NoSuchShapeException();
    }
}

Polymorphic Shapes

public class Square implements Shape
{
    private Point topLeft;
    private double side;

    public double area()
    {
        return side*side;
    }
}

public class Rectangle implements Shape
{
    private Point topLeft;
    private double height;
    private double width;

    public double area()
    {
        return height * width;
    }
}

public class Circle implements Shape
{
    private Point center;
    private double radius;
    public final double PI = 3.14;

    public double area()
    {
        return PI * radius * radius;
    }
}

objects vs data structures

objects hide their data behind abstractions and expose functions that operate on that data
- see polymorphic shapes code above
- the objects have hidden data, and public functions operating on that data
data structures expose their data and have no meaningful functions
- see procedural shapes code above
- the objects are really just data structures - public data, no functions

the complimentary nature of objects and data structures

Procedural code makes it easy to add new functions without changing the existing data structures
- see procedural shapes code above
- could easily add a new perimeter function to the Geometry class, and all "shape" data structures would be unaffected
OO code makes it easy to add new classes without changing existing functions
- see polymorphic shapes code above
- can easily add a new shape class, and the existing shape classes' area functions would be unaffected

the compliment is also true:

Procedural code makes it hard to add new data structures, because all the functions must change
- adding a new "shape" data structure would require updating the Geometry.area function (and a perimeter function, and so on)
OO code makes it hard to add new functions because all the classes must change
- if we wanted to add a perimeter function to shapes, we'd have to add the new function to the super class and implement that in all inheriting classes

So, the things that are hard for OO are easy for procedures, and the things that are hard for procedures are easy for OO!

In any complex system there are going to be times when we want to add new data types rather than new functions. For these cases objects and OO are most appropriate. On the other hand, there will also be times when we'll want to add new functions as opposed to data types. In that case procedural code and data structures will be more appropriate.

Mature programmers know that the idea that everything is an object is a myth. Sometimes you really do want simple data structures with procedures operating on them.

Law of Demeter (LOD)

a module should not know about "the innards" of the objects it manipulates
an object should not expose its internal structure through accessors, because to do so is to expose, rather than to hide, its internal structure
the Law states that a method f of class C should only call the methods of these:
- C
- an object created by f
- an object passed as an argument to f
- an object held in an instance variable of C
the method f should not invoke methods on objects that are returned by any of the allowed functions
example possible violation:

final String outputDir = ctxt.getOptions().getScratchDir().getAbsolutePath();

"train wreck" ... code like the above is often called a train wreck bc it looks like a bunch of coupled train cars

Options opts = ctxt.getOptions();
File scratchDir = opts.getScratchDir();
final String outputDir = scratchDir.getAbsolutePath();

the above snippets (train wreck style or the second) are in a function that has a lot of knowledge
- knowledge of options, of a scratch directory, and of an absolute path
whether the code violates the LOD depends on whether ctxt, Options, and ScratchDir are objects or data structures
- if they are objects, then exposing their "innards" is a violation
- if they are data structures, then LOD doesn't apply
  - bc data structures naturally expose data and have not functionality
the use of accessor functions confuses the issue
- LOD would not be a question if the code had been written like this:

final String outputDir = ctxt.options.scratchDir.absolutePath;

this issue would be a lot less confusing if data structures simply had public variables and no functions, and objects had private variables and public functions
but there are frameworks and standards (e.g. "beans") that demand that even simple data structures have accessors and mutators

Hybrids

avoid these
- they are indicative of a muddled design whos authors are unsure or (worse) ignorant of whether they need protection from functions or types
what are they
- part object, part data structure
- have functions that do significant things
- have public variables or accessors and mutators that basically make the private variables public
these hybrids are the worst of both worlds
- they make it hard to add new functions AND hard to add new data structures

Hiding structure

make sure that higher level classes aren't aware of lower level functionality
make sure that classes aren't breaking the LOD - make sure they're not aware of internals of classes that they shouldn't know the internals of

DTOs

a quintessential data structure, a class with public variables and no functions
very useful, for I/O communication (e.g. database, sockets, etc.)
often the first in a series of translation stages converting raw data in a database into objects in the application code
the "bean"
- somewhat more common (at least in Java i assume--my thought)
- have private variables manipulated by getters/setters
- the "quasi-encapsulation of beans seems to make some OO purists feel better but usually provides no other benefit
active record
- a specific form of DTO
- in addition to the public (or publicly accessible) variables, they have navigational methods
  - e.g. save() and find()
- typically direct translations from database tables or other data sources
- developers often treat these as objects by putting business rules in them - don't
- instead, use an object that hides it's internals, and have these internals be instances of these active records (this is an example approach, there are other ways)

conclusion

objects expose behavior and hide data
- makes it easy to add new kinds of objects without changing existing behaviors
- makes it hard to add new behaviors to existing objects
data structures expose data and have no significant behavior
- makes it easy to add new behaviors to existing data structures
  - (my thought: idk if "better" wording here might be:)
    - ("makes it easy to add new behaviors that operate on existing data structures" or)
    - ("makes it easy to add new functions that operate on existing data structures")
- makes it hard to add new data structures to existing functions

Ch. 7 Error Handling

many code bases are dominated by error handling
error handling is important, but if it obscures logic, it's wrong

Use exceptions rather than return codes

return codes are used in languages that don't have exceptions
these languages are still around but if you have exceptions, you should use them
problems with working with flags or return codes
- clutter the caller
- require that the caller check for errors immediately after the call
exceptions allow cleaner code with logic unobscured by error handling

without exceptions

public class DeviceController
{
    ...
    public void sendShutDown()
    {
        DeviceHandle handle = getHandle(DEV1);
        // Check the state of the device
        if (handle != DeviceHandle.INVALID)
        {
            // Save the device status to the record field
            retrieveDeviceRecord(handle);
            // if not suspended, shut down
            if (record.getStatus() != DEVICE_SUSPENDED)
            {
                pauseDevice(handle);
                clearDeviceWorkQueue(handle);
                closeDevice(handle);
            }
            else
            {
                logger.log("Device suspended. Unable to shut down");
            }
        }
        else
        {
            logger.log("Invalid handle for: " + DEV1.toString());
        }
    }
    ...
}

with exceptions

public class DeviceController
{
    ...
    public void sendShutDown()
    {
        try
        {
            tryToShutDown();
        }
        catch (DeviceShutDownError e)
        {
            logger.log(e);
        }
    }

    private void tryToShutDown() throws DeviceShutDownError
    {
        DeviceHandle handle = getHandle(DEV1);
        DeviceRecord record = retrieveDeviceRecord(handle);

        pauseDevice(handle);
        clearDeviceWorkQueue(handle);
        closeDevice(handle);
    }

    private DeviceHandle getHandle(DeviceID id)
    {
        ...
        throw new DeviceShutDownError("Invalid handle for: " + id.toString());
        ...
    }
    ...
}

the code with exceptions untangled the 2 concerns - 1. algorithm for device shutdown 2. error handling

Write your try-catch-finally statement first

t-c-f => try-catch-finally statement

one of the most interesting things about exceptions is that they define a scope within your program
when you execute code in the try block of a t-c-f you are stating that execution can abort at any point and then resume in the catch
a t-c-f is, in a way, like a transaction - the catch block must leave the program in a consistent state regardless of what happens in the try
good practice to start with a t-c-f whenever writing code that could throw exceptions
this helps define what users of the code should expect if something goes wrong

Use unchecked exceptions

java has "checked exceptions" where a method's signature includes a list of exception types it can pass to its caller
examples of languages that have no such thing: C#, C++, Python, Ruby
robust systems can be created without checked exceptions - so does there benefit justify their price?
- the price: violating the OCP
  - if you throw a checked exception who's catch is 3 methods above, the signatures of the 2 methods under the catching method must declare that exception type
  - this means changes to low level methods' signatures for checked methods will force signature changes to higher level methods
- worth the price? basically no (but it was mentioned maybe for a critical library)

Provide context with exceptions

create informative error messages and pass them along with your exceptions
mention the operation that failed and the type of failure

Define exception classes in terms of a caller's needs

many ways to classify errors
- by source - what component did they come from
- by type - device failures, network failures, programming errors, etc.
when defining exception classes in an app, most important concern should be how they are caught

poor exception classification

ACMEPort port = new ACMEPort(12);

try
{
    port.open();
}
catch (DeviceResponseException e)
{
    reportPortError(e);
    logger.log("Device response exception", e);
}
catch (ATM1212UnlockedException e)
{
    reportPortError(e);
    logger.log("Unlock exception", e);
}
catch (GMXError e)
{
    reportPortError(e);
    logger.log("Device response exception", e);
}
finally
{
    ...
}

refactored

LocalPort port = new LocalPort(12);

try
{
    port.open();
}
catch (PortDeviceFailure e)
{
    reportError(e);
    logger.log(e.getMessage(), e);
}
finally
{
    ...
}

public class LocalPort
{
    private ACMEPort innerPort;

    public LocalPort(int portNumber)
    {
        innerPort = new ACMEPort(portNumber);
    }

    public void open()
    {
        try
        {
            innerPort.open();
        }
        catch (DeviceResponseException e)
        {
            throw new PortDeviceFailure(e);
        }
        catch (ATM1212UnlockedException e)
        {
            throw new PortDeviceFailure(e);
        }
        catch (GMXError e)
        {
            throw new PortDeviceFailure(e);
        }
    }
}

why is the refactored version better
simplifying the code by wrapping the API we're calling and making sure it returns a common exception
- since we are doing basically the same thing for each exception type

benefits of wrapping APIs

wrapping 3rd-party APIs is considered best practice, here's why
wrapping it minimizes your dependency on it
- you can choose to use a different library with minimal penalty
makes it easier to mock
not tied to the API vendors design choices
- your wrapper is the API you want, and if the vendor changes something, you have minimal changes to make

define the normal flow

following the above discussed exception handling practices pushes error detection to the edges of the program resulting in less cluttered code
- wrap APIs to throw your own exceptions
- handler above your code to handle aborted computations
this is good but sometimes you may not want to abort a computation

awkward code

try
{
    MealExpenses expenses = expenseReportDAO.getMeals(employee.getID());
    m_total += expenses.getTotal();
}
catch (MealExpensesNotFound e)
{
    m_total += getMealPerDiem();
}

in this business, if meals are expensed, they're part of the total, otherwise, a per diem is expensed for the employee
the exception handling is cluttering the logic
would rather the code simply look like this:

MealExpenses expenses = expenseReportDAO.getMeals(employee.getID());
m_total += expenses.getTotal();

we can make it that simple
change the ExpenseReportDAO to always return a MealExpenses object
if there are no meal expenses, it returns a MealExpenses object that returns the per diem as its total

public class PerDiemMealExpenses implements MealExpenses
{
    public it getTotal()
    {
        // return the default per diem
    }
}

this is the "Special Case Pattern" (book references Fowler)
- create a class or configure an object to handle a special case
- doing so relieves client code of the burden of dealing with exceptional behavior

Null

this is an error handling chapter, so we will discuss things we do that invite errors

Don't return null

it's very common to see lines of code all over a program checking for null
this is bad!
returning null is creating problems for yourself - for the caller of what returns null
if you are tempted to return null from a method, consider throwing an exception or returning a special case object instead
- similarly with 3rd party APIs, wrap an API method that returns null in your own method that either returns a special case object or throws
an example: a method that returns a list that can return null
- instead of checking for null returned from this method, make the method return an empty list instead of null
- now you don't have to check for null before iterating the list, just iterate it

Don't pass null

avoid passing null into methods whenever possible - this is even worse than returning null
- working with an API method that expects you to pass null is an example where you can't avoid this
in most programming languages, there isn't a good way to handle null passed in to a method
- you can check for null and have an exception handler for InvalidArgumentException, but what should the handler do?
just avoid passing null into methods

Ch. 8 Boundaries

we seldom control all the software in our system
- we may buy 3rd-party packages, use open source, depend on components from other teams, etc.
we have to integrate the foreign code with our own
we have to keep the boundaries of our software clean

Using 3rd-Party code

there's a natural tension between an interface provider and the user of that interface
interface providers strive for broad applicability, users want the interface to focus on their needs
this tension can lead to problems at system boundaries
example: java.util.Map
- what if we want to insure that no one deletes items in a map we want to pass around?
  - there's a clear function in the Map's interface
- what if we want to limit the types of objects put in a map?
  - Map is generic

// say we have an app that needs a Map of Sensors
Map sensors = new HashMap();

// now some other part of the code needs to access sensor
// - we notice that this tends to be all over the place in the application
// - the client is carrying the responsibility of casting
Sensor s = (Sensor)sensors.get(sensorId);

// let's improve the readability of the code
Map<Sensor> sensors = new HashMap<Sensor>();
// now the client no longer needs to handle casting in all those places in the code
Sensor s = sensors.get(sensorId);

// Map<Sensor> is still providing more capability than we need or want
// And if the Map interface ever changes, we may have many places in the code to change
// ...
// This class could be an good approach
// this can evolve with very little impact on the rest of the app
// the use of generics is no longer a big issue, bc casting is handled in the class
// this interface is tailored to the needs of the app
// the code is easier to read, and harder to misuse
// this class can enforce design and business rules
public class Sensors
{
    private Map sensors = new HashMap();

    public Sensors getById(String id)
    {
        return (Sensor)sensors.get(id);
    }

    // snip
}

// use
Sensors sensors = new Sensors();
Sensor s = sensors.getById(sensorId);

this is not a suggestion to always encapsulate the use of Map in this form
this is advice that you NOT pass Maps (or any other interface at the boundary) around your system
if you use a boundary interface like Map, keep it inside the class, or close family of classes, where it's used
avoid returning Map from, or accepting Map as an argument to, public APIs

Exploring and learning boundaries

learning and integrating a 3rd-party API can take a lot of time and effort
- common to spend long debugging sessions trying to figure out how the 3rd-party code works and determining whether an exception is being thrown by a bug in your code or by code that is simply using the API wrong
using "learning tests" can help
create tests that focus on what you want out of the API
these are essentially controlled experiments that check your understanding of the API
learning tests not only help us learn the API, they verify that the package works as we expect
- if an update comes later, our learning tests show whether the package still works as we expect
these learning tests can serve as "boundary tests"
- it's always good to support a clean boundary with a set of outbound tests that exercise the interface in a way that production code does
boundary tests helps us migrate to newer versions rather than feeling like we have to stay with an old version longer than we maybe should

Using code that does not yet exist

there's another kind of boundary, one that separates the known from the unknown
example:
- sometimes you have to write code for an application that relies on 3rd-party code for which you don't know the interface yet
- how do you start writing you code in this scenario??
- start working far away from the boundary
- you'll find yourself getting close to that boundary and even bumping up against it
- as you do this, you'll learn more about what you WANT the interface to do for you
- define the interface you WISH you had
- you can mock the API in tests
- once the real interface is ready
  - create an Adapter to bridge the gap between your application interface and the other interface
    - this Adapter encapsulates the interaction with the API and provides a single place to change when the API changes
  - you can create boundary tests once you have the API

principles

design to accommodate change without huge investments and rework
avoid letting to much of your code know about external APIs
have clear separation and tests that define expectations

Ch. 9 Unit Tests

3 laws of TDD given
writing tests this way results in a bulk of test code that can present a daunting management problem

Keeping tests clean

dirty test code is as bad as--or worse than--no test code
test code has to change as production code changes, and the dirtier the code the greater the maintenance burden
with dirty test code, tests will become viewed as an ever-increasing liability
test code is just as important as production code

Tests enable the -ilities

unit tests are what keep your code flexible, maintainable, and reusable
- if you have unit tests, you do not fear making changes to the code!
- unit tests allow you to make changes with confidence
  - confidence that you have not broken anything
- this means you can make refactor and improve the architecture and design without fear
tests enable all the -ilities, bc tests enable change

Clean tests

what makes a clean test? readability
- you want to say a lot with as few expressions as possible

Domain-specific testing language

design a testing API that helps the tests be simple and readable
this testing API is not designed up-front, but evolves over time as a result of continually refactoring the tests

Dual standard

this dual standard is related to things like efficiency in memory and CPU
unit tests don't have to be as efficient as prod code
(but tests MUST still be clean and expressive)

One assert per test

this helps readability and is a good guideline
but doesn't necessarily always have to be followed
definitely minimize the number of asserts

Single concept per test

test a single concept in each test function
this is probably a better rule than the one-assert-per-test rule

F.I.R.S.T.

Fast - tests should run fast
Independent - tests should not depend on each other
Repeatable - tests should be repeatable in any environment
- you should be able to run the tests in the production environment, in the QA environment, and on your laptop while riding home on the train without a network
Self-Validating - tests should have a boolean output (pass or fail)
- you shouldn't have read a log file or compare output files
Timely - tests should be written in a timely fashion
- tests should be written ahead of the production code that makes them pass

Ch. 10 Classes

class organization

standard Java convention says ...
a class should begin with a list of variables
- public static constants, then
- private static variables, then
- private instance variables
- (there is seldom a good reason to have a public variable)
public functions should follow the variables
put the private functions called by a public function immediately after than public function

Encapsulation (emphasis in this quite is mine)

We like to keep our variables and utility functions private, but we're not fanatic about it. Sometimes we need to make a variable or utility function protected so that it can be accessed by a test. For us, tests rule. If a test in the same package needs to call a function or access a variable, we'll make it protected or package scope. However, we'll first look for a way to maintain privacy. Loosening encapsulation is always the last resort.

p. 136 (p. 167 of PDF)

my thoughts from reading this - and from other things he has said
- uncle bob apparently sometimes puts unit tests in the same package as the production code they're testing??
  - protected, in Java, means accessible in the containing type and derived types, and in the containing package
  - packages, in Java, are basically Namespaces
    - source: Oracle here - https://docs.oracle.com/javase/tutorial/java/concepts/package.html
    - source: Jon Skeet here - https://stackoverflow.com/a/25160624/9533368
  - i assume that this choice is not common?
    - the last sentence in that paragraph emphasizes that "loosening of encapsulation is always a last resort"
    - so that may mean that even if they have a test in the same package, it likely won't affect the SUT's encapsulation
    - it also could maybe indicate that it's not very common for him to put unit tests in the same package? (but this is a guess/assumption)
- on our team, we never put tests in the same project as our production code
  - we have separate designated test projects where we put all tests

Classes should be small!

how small?
measure in # of responsibilities - 1
- with functions, we measured in # of lines
a class name should describe its responsibility
- it should be concise and unambiguous

Single Responsibility Principle (SRP)

a class or module should have one, and only one, reason to change
many developers fear that a large number of small, single-purpose classes makes it more difficult to understand the bigger picture
- concerned about having to navigate from class to class to figure out how a larger piece of work gets accomplished
just as much to learn about the app regardless of organization
- analogy of a toolbox:
  - you want everything thrown in a couple drawers?
  - or everything organized into smaller, labeled drawers?

Cohesion

classes should have a small # of instance variables
each method of a class should manipulate one or more of these instance variables
in general, the more instance variables a method manipulates, the more cohesive that method is to its class
a class in which each variable is used by each method is maximally cohesive
in general it is neither advisable nor possible to create such maximally cohesive classes
but we do want high cohesion
having high cohesion means that the methods and variables of the class make logical sense to be together
the strategy of keeping functions small and parameter lists short can sometimes lead to a proliferation of instance variables that are used by a subset of methods
- when this happens, it normally means you can split it into smaller classes

Maintaining cohesion results in many small classes

example given of an original code version and a refactored version
original version was one long method
new version was many small classes and functions
- more readable, understandable, maintainable

Organizing for change

any time that you have to modify a class, you are introducing a risk that the system no longer works as expected
- open/closed principle was alluded to here
what to do when we spot a class that seems to violate SRP
- we should primarily consider system change over "fixing" a violation of SRP
- if we have no current reason to change a class that violates SRP and no need for change in the forseeable future, then we leave the class alone
  - introducing the change at all is still introducing risk
- if we do have a reason to change a class that violates SRP, it's time to strongly consider refactoring
  - now introducing the change is happening regardless, so we want to change it the right way

Isolating from change

client classes of concrete objects depend on those concrete objects' concrete details
these client classes are at risk when those concrete details change
minimize the impact of these changes with interfaces and abstract classes
- DIP (dependency inversion principle) says that our classes should depend on abstractions, not concrete details

Ch. 11 Systems

Separate constructing a system from using it

Software systems should separate the startup process, when the application objects are constructed and the dependencies are "wired" together, from the runtime logic that takes over after startup.

The startup process is a concern that the app must address
- the separation of concerns is important in software design
an example of NOT separating the startup concern:

public Service getService()
{
    if(service = null)
        service = new MyServiceImpl(...); // Good enough default for most cases?
    return service;
}

that's the lazy initialization/evaluation idiom
- merits:
  - you only incur construction overhead if you use the object
  - this can result in faster startup times
  - you ensure null isn't returned
- cons:
  - hard-coded dependency on MyServiceImpl and everything its constructor requires
  - you can't even compile without resolving these dependencies--even if you never use it
  - testing can be a problem especially if MyServiceImpl is a heavy object
  - construction logic is mixed in with normal runtime processing
    - breaking SRP
    - requires testing for more execution paths
one occurrence of lazy initialization isn't a serious problem
but too many occurrences of this can hurt modularity and cause duplication
we should have a global, consistent strategy for resolving major dependencies

Separation of Main

One way to separate construction from use is simply to move all aspects of construction to main, or modules called by main, and to design the rest of the system assuming that all objects have been constructed and wired up appropriately.

the flow of control is easy to follow
- all objects the system needs are constructed in main
- all dependency arrows go the same way crossing the barrier between main and the app
- they all point away from main
the app doesn't know about main, it just expects that everything it needs has been built

Factories

sometimes the app needs to control when objects are created
Abstract Factory pattern gives the app control of when but hides the details of how

Dependency Injection

DI is a mechanism for separating construction from use
DI is the application of IoC to dependency management
IoC moves secondary responsibilities from an object to other objects dedicated to those purposes
an object should not instantiate its dependencies
- this should be handled by another "authoritative" mechanism - inverting control
- setup is a global concern, so the authoritative mechanism should be "main" routine or a special-purpose container
example of "partial" DI:

MyService myService = (MyService)(jndiContext.lookup("NameOfMyService"));

this is "partial" DI, bc:
- the invoking object does not control what type of object is returned
- but, the invoking object actively resolves the dependency
True DI goes a step further
- the class takes no direct steps to resolve the dependency, it's passive
- it provides a setter method or constructor argument that are used to inject the D
example of True DI

public MyInvokingObject(MyService myService){...}

Scaling Up

it is a myth that we can get systems "right the first time"
scale at the code level
- implement what you need today, then refactor and expand tomorrow
scale at system level
- thinking in comparison to the approach at the code level
  - doesn't system architecture require preplanning?
  - certainly system architecture can't grow incrementally from simple to complex?
- the ephemeral nature of software systems allows incremental growth
  
  Software systems are unique compared to physical systems. Their architectures can grow incrementally, if we maintain the proper separation of concerns.

Cross-Cutting Concerns

example, persistence concerns
even though we want to achieve modularity - separation of concerns, implementations of persistency strategies must be spread across many many objects
AOP (Aspect-Oriented Programming)
- an approach intended to restore modularity for cross-cutting concerns
- modular constructs called "aspects" specify which points in the system should have their behavior modified in some consistent way to support a particular concern
- this specification is done using a succinct declarative or programmatic mechanism
in AOP, with persistence, you would declare which objects and attributes (or patterns thereof) should be persisted and then delegate the persistence tasks to your persistence framework
AOP Frameworks
- Spring AOP and JBoss AOP are examples of Pure Java AOP frameworks
- these tools automate the implementation of complicated proxy boilerplate (see Java Proxies section on p. 161, PDF p. 192)
- this allows the programmer to focus on their domain and implement business logic in POJOs
- the frameworks are incorporated via declarative configuration files or APIs
- this enables defining more consistent system-wide behavior
- the frameworks create nested decorator objects around the POJO

BDUF (big design up front) - bad

this is not only unnecessary, it's potentially harmful due to inhibiting adaptation to change
- psychologically difficult to discard work that resulted from a lot of previous effort
- architectural choices influence subsequent thinking about the design
building architects have to BDUF bc radical architectural changes aren't economically feasible after significant construction
it can be economically feasible for software architects to make radical changes after significant development if their application's structure has effectively separated concerns

Simple Design up front - good

a "naively simple" but nicely decoupled architecture allows for quicker delivery and future scalability

Optimize decision making

delay decisions as long as possible to ensure making the most informed choices

Use standards wisely - when they add demonstrable value

there are many mature standards and standards-based tools available, not all situations call for them though
i thought of WCF in this brief section

Ch. 12 Emergence - getting clean via emergent design

A design is "simple" if it: (Kent Beck's 4 rules of simple design - in order of importance)

Runs all the tests
Contains no duplication
Expresses the intent of the programmer
Minimizes the number of classes and methods

And, couched in TDD

Runs all the tests
- this enables refactoring safely

Refactoring: Contains no duplication

not only exact duplicate lines, but also this type of example:

int size() {}
boolean isEmpty() {}

// can eliminate duplication by something like this:
//  - calling size rather than repeating what size does and checking the result
boolean isEmpty()
{
    return 0 == size();
}

Refactoring: Expresses the intent of the programmer
Refactoring: Minimizes the number of classes and methods

Ch. 13 Concurrency

Objects are abstractions of processes. Threads are abstractions of schedule.

James O. Coplien

Concurrency is hard.

Concurrency is a decoupling strategy

it helps us decouple what gets done from when it gets done
a single-threaded app has highly coupled what and when
- can determine app's state by just looking the stack backtrace
- can set breakpoints and know the state by the breakpoint that's hit
decoupling what from when can dramatically improve both throughput and structures of an app
- structurally, this kind of app can look like a bunch of tiny computers collaborating rather than one main loop
- this offers powerful ways to separate concerns
- this can make the system easier to understand

example scenario where concurrency could improve a system

info aggregator that retrieves data from websites and reports a daily summary
- a lot of network I/O waiting - the websites list grows, combing all websites can take too long
a system handles one user at-a-time and requires 1 second per user
- fine for low number of users, but what about if users have to get in line of 150 users
a system that interprets large data sets but can only give complete solution after processing all sets
- perhaps each set should be processed by a different machine for parallel processing

Myths and misconceptions

concurrency always improves performance
- it can sometimes improve performance, but only when there is a lot of wait time that can be shared between multiple threads or multiple processors - neither situation is trivial
design does not change when designing concurrent programs
- the design of a concurrent algorithm can be remarkably different from a single-threaded
- the decoupling of what from when usually has a huge effect on the structure of the system
understanding concurrency issues is not important when working with a container such as a Web or EJB container
- you should know about concurrent update and deadlock (described later in book)

Some more balanced statements regarding concurrency

concurrency incurs some overhead, both in performance as well as writing additional code
correct concurrency is complex, even for simple problems
concurrency bugs aren't usually repeatable, so they are often ignored as one-offs instead of the true defects they are
concurrency often requires a fundamental change in design strategy

Challenges

consider class X

public class X
{
    private int lastIdUsed;

    public int getNextId()
    {
        return ++lastIdUsed;
    }
}

say we
- create an instance of X,
- then set the lastIdUsed field to 42,
- and then share the instance between 2 threads
now suppose both threads call getNextId() - there are 3 possible outcomes
1. thread 1 gets value 43; thread 2 gets value 44; lastIdUsed is 44
2. thread 1 gets value 44; thread 2 gets value 43; lastIdUsed is 44
3. thread 1 gets value 43; thread 2 gets value 43; lastIdUsed is 43
to understand how that 3rd result is possible, you need to understand the JIT compiler and Java memory model
- in short, there are 1,000's of possible paths for the threads to take within that getNextId method - most of the paths produce the right result
- the problem is that some of those paths produce the wrong result

Concurrency defence principles

Principles and techniques for defending your systems from problems of concurrent code

SRP

Concurrency is complex enough to be a reason to change in its own right, and therefore should be separated from the rest of the code.
unfortunately it is common for concurrency implementation details to be embedded directly into other production code
consider these things
- concurrency-related code has its own life cycle of development, change and tuning
- concurrency-related code has its own challenges, which are different from and often more difficult than nonconcurrency-related code
- the number of ways in which miswritten concurrency-based code can fail makes it challenging enough without the added burden of surrounding application code
recommendation: keep your concurrency-related code separate from other code

Corollary: limit the scope of data

2 threads modifying a shared object can interfere with each other
one solution is to use the synchronized keyword to protect a critical section in the code that uses the shared object
these critical sections must be minimized:
- you have to remember to protect each of these sections
- you want to avoid duplicating effort to protect multiple sections
- more of these sections makes it more difficult to determine sources of failures
recommendation: take data encapsulation to heart; severely limit the access of any data that may be shared

Corollary: use copies of data

try to avoid sharing data all together
sometimes it's possible to copy objects and treat them as read-only
sometimes it's possible to copy objects, collect results from multiple threads, and then merge the results in a single thread
the extra object creation is a possible concern
- experiment to determine if this is an issue
- more than likely, the savings in avoiding the intrinsic lock will make up for additional creation and GC overhead

Corollary: threads should be as independent as possible

consider writing your threaded code such that each thread is isolated, sharing no data with other threads
- each thread processes one client request with all of its required data coming from an unshared source and stored as local variables
it's common to eventually run into shared resources (e.g. db connections) despite this effort
recommendation: attempt to partition data into independent subsets that can be operated on by independent threads, possibly in different processors

Know your library

mention of java-specific libraries that are designed for use in concurrency

Thread-safe collections

recommendation: review the classes available to you. in the case of java, become familiar with (list of specific java packages)

Know your execution models

key definitions

bound resources: resources of a fixed size or number used in a concurrent environment (e.g. db connectiosn, fixed-size r/w buffers)
mutual exclusion: only one thread can access shared data or a shared resource at a time
starvation: one thread or a group of threads is prohibited from proceeding for an excessively long time or forever (e.g. always letting fast-running threads first could mean long-running threads never run if there's no end to fast-running threads)
deadlock: two or more threads waiting for each other to finish--each thread has a resource that the other requires and neither can finish until it gets the other resource
livelock: threads in lockstep, each trying to do work but finding another "in the way." due to resonance, threads continue trying to make progress but are unable to for an excessively long time or forever

Producer-consumer

what it is
- producer thread(s) create work and place on queue (bound resource) and signal that the queue is no longer empty
- consumer thread(s) get work from queue and complete it and signal that the queue is no longer full
- both potentially wait to be notified that they can continue
personal note: this sounds like what the filesystemwatcher library does
- the one that i found online and modified

Readers-writers

this is about finding a balance between the needs of writer's vs those of readers
- writers tend to block many readers, you could require the writer to wait until there are no readers
  - this emphasizes throughput at the expense of potentially starving writers
- prioritizing writers too much can hurt throughput by blocking readers for a long period of time
the challenge is finding the right balance and avoiding concurrent update issues

Dining philosophers

a metaphor for threads competing for resources
the philosophers--sitting at a circular table with single-forks between them--think util hungry and require a fork-per-hand to eat
this can result in deadlock, livelock, throughput, and efficiency degradation
recommendation: learn the basic algorithms and understand their solutions

Beware dependencies between synchronized methods

recommendation: avoid using more than one method on a shared object
sometimes you just need to do this, 3 ways to handle correctly:
1. client-based locking
  - client locks server before calling 1st method and releases after last method returns
2. server-based locking
  - server has method (which client calls) that locks server, calls all methods, then unlocks
3. adapted server
  - intermediary performs locking; this is server-based locking where original server can't be changed

Keep synchronized sections small

the synchronized keyword creates a lock to ensure a section of code only ever has 1 thread executing
locks create delays, add overhead, can result in contention and degraded performance
so use judiciously, and keep the sections small
recommendation: keep your synchronized sections as small as possible

Writing correct shut-down code is hard

writing graceful shutdown code is hard
common issues involve deadlock, with threads waiting for a signal to continue that never comes
e.g. parent thread spawns child threads, then waits for all to finish before shutting down
- what if a child thread is deadlocked? the system will never shut down
e.g. similar situation but system has been instructed to shutdown
- parent tells children to abandon work and finish--what if there was a pair of children working as producer/condsumer, the producer shuts down, and the consumer gets stuck waiting forever for a signal it expected from the producer
  - this would prevent the parent thread from finishing and the system would not shut down
recommendation: think about shut-down early and get it working early. it's going to take longer than you expect. review existing algorithms, bc this is probably harder than you think

Testing threaded code

recommendations:
- Treat spurious failures as candidate threading issues
  - do not ignore system failures as one-offs
- Get your nonthreaded code working first
  - don't chase nonthreading and threading bugs simultaneously
- Make your threaded code pluggable
  - make it runnable with one, several, and varied # of threads during execution
  - make it runnable with test doubles and test it with test doubles that are fast/slow/varied
  - test to run a number of iterations
- Make your threaded code tuneable
  - early on, find ways to test under many different configurations, and ensure the # of threads can be easily tuned
  - consider allowing that # to change while the system is running
  - consider self-tuning based on throughput and system utilization
- Run with more threads than processors
  - running with more threads than processors forces task swapping
  - more task swapping will cause defective code to the surface (like a chaos monkey scenario)
- Run on different platforms
  - run your code on all target platforms early and often
- Instrument your code to try and force failures
  - Hand-coded
    - inserting statements like Thread.yield(), wait(), etc.
    - this approach isn't great
      - odds still aren't with you that you choose the right statement or location to insert it
      - might accidentally leave this code in prod
  - Automated
    - use jiggling strategies to ferret out errors
    - this is a unit testing-based approach
    - the simplistic suggestion to create the below example class with static method used to instrument the code with "jiggle points"
    - inserting ThreadJigglePoint.jiggle(); in a number of places
    - then write one implementation that simply does nothing for prod
    - then write another implementation that randomly chooses between different thread jiggling statements (e.g. Thread.yield(), wait(), etc.) or nothing
    - then have testing automatically run this a 1,000 times
    - then you either find an error or at least demonstrate due diligence

public class ThreadJigglePoint
{
    public static void jiggle(){}
}

Ch. 14 Successive Refinement

Incrementalism

big changes bad
- it's tempting to make massive changes to a program structure to "improve" it
- it's very hard to get a program working the same way it worked before the "improvement"
incremental changes good
- use TDD to help
- make the small changes, run the tests to ensure everything still works
- always keep the system running!
refactoring is like solving a Rubik's cube

Ch. 15 JUnit Internals

Refactoring can take a long time, will be iterative, can involve reversals of previous changes before being done.

Ch. 16 Refactoring SerialDate

SerialDate is a class within a Java package.

Unit Tests

existing test suite covered about 50%
Before beginning to refactor this class, he created a suite of tests to provide full coverage of the class's functionality
- These tests did not all pass from the start--they defined the behavior he thought the class should have
New test suite started at 92% coverage (with some tests commented out to start)
run tests after every change to keep it working as you refactor

Ch. 17 Smells and Heuristics

A helpful summary chapter!

This was more or less a summarizing of previous points in the book under a helpful set of categories.

Appendix A: Concurrency II

Concurrency is really hard.

Appendix B: org.jfree.date.SerialDate

This is just reference code for an earlier chapter in the book.

Appendix C: Cross References of Heuristics

Reference for earlier chapter in the book.

Clean Code - egnomerator/misc GitHub Wiki

Clean Code by Robert C. Martin

Most impactful concepts for me

What is clean code?

Forward:

Introduction

Craftsmanship

Ch. 1 Clean Code

What is clean code?

Ch. 2 Meaningful Names

Ch. 3 Functions

Small!

One level of abstraction per function

Function arguments

Have no side effects

Ch. 4 Comments

Ch. 5 Formatting

Ch. 6 Objects and Data Structures

Data/Object anti-symmetry

Ch. 7 Error Handling

Null

Ch. 8 Boundaries

Ch. 9 Unit Tests

Ch. 10 Classes

Ch. 11 Systems

Ch. 12 Emergence - getting clean via emergent design

Ch. 13 Concurrency

Challenges

Concurrency defence principles

Ch. 14 Successive Refinement

Ch. 15 JUnit Internals

Ch. 16 Refactoring SerialDate

Ch. 17 Smells and Heuristics

Appendix A: Concurrency II

Appendix B: org.jfree.date.SerialDate

Appendix C: Cross References of Heuristics

⚠️ GitHub.com Fallback ⚠️

Clean Code - egnomerator/misc GitHub Wiki

Clean Code by Robert C. Martin

Most impactful concepts for me

What is clean code?

Forward:

Introduction

Craftsmanship

Ch. 1 Clean Code

What is clean code?

Ch. 2 Meaningful Names

Ch. 3 Functions

Small!

One level of abstraction per function

Function arguments

Have no side effects

Ch. 4 Comments

Ch. 5 Formatting

Ch. 6 Objects and Data Structures

Data/Object anti-symmetry

Ch. 7 Error Handling

Null

Ch. 8 Boundaries

Ch. 9 Unit Tests

Ch. 10 Classes

Ch. 11 Systems

Ch. 12 Emergence - getting clean via emergent design

Ch. 13 Concurrency

Challenges

Concurrency defence principles

Ch. 14 Successive Refinement

Ch. 15 JUnit Internals

Ch. 16 Refactoring SerialDate

Ch. 17 Smells and Heuristics

Appendix A: Concurrency II

Appendix B: org.jfree.date.SerialDate

Appendix C: Cross References of Heuristics

⚠️ **GitHub.com Fallback** ⚠️

⚠️ GitHub.com Fallback ⚠️