Reusable Component Success Factors - jmadison222/knowledge GitHub Wiki


Reusable components are units of code that follow the write-once-use-many rule. The value is clear to see, but making it happen is very difficult. The following aspirations increase the odds of successfully building reusable components.

Corporate Context - This document is intended for internal corporate reusable components. Not all points will apply to industry-level reusable components, but most will.

1. Planning
2. Building
3. Delivery
- 3.1. Make All Source Code Easily Discoverable
- 3.2. Phone Home for Usage and Errors

1. Planning

1.1. Manage the Personal Ownership

This is an anti-pattern that comes up often, so it is worth addressing first. Nearly every engineer with sufficient experience will start making reusable components. If they’re very skilled, the components will be very good. But such "reusable components" will be far more specific to their original context than their authors will want to acknowledge. Further, other very skilled engineers will often have competing "reusable components". What follows is a conflict over whose is best. Usually none of the competing options adheres to most of the rules here, so none of them is really ready for broad consumption. All of them should be considered suspect, and the process below should be followed instead.

1.2. Plan to Spend More Time Supporting Than Developing

Building reusable components is fun, exciting, and cool on the resume. Supporting components is not as glamorous. The success of components must be judged by their long-term adoption and support. That support will vary depending on the nature of the components, but in general, assume that the support will be far more than anyone thinks at the start, and plan for it accordingly. Within a few years, you’ll likely be spending more time supporting existing components than developing new ones.

1.3. Drive Adoption Through Persuasion Not Mandate

It is harder to get people to use reusable components than it is to make them—by far. Faced with this difficulty, the owners of the reusable components may resort to mandating the use. Resist this urge. Get others to use the components through the value of the components and selling that value. Once more than half of the target population adopts the components by persuasion, then consider mandates for those who might be resisting without justification.

1.4. Include Consumers as You Design and Code

When you do the iterative, design-driven coding, include the consumer base in the iterations. Such consumers should use the components in their own context to verify that things are working as desired. Their feedback should be used to constantly improve the design and implementation.

2. Building

2.1. Design Before You Code (Incrementally)

Up-front design and continuous delivery of working software both have their place. Moving quickly to working software is best when the customer base is well-defined and immediately available. Reusable components typically lack these two properties, instead tending to serve a wide array of consumers who may be far-flung. In such situations, it becomes more valuable to gather requirements and design the system before building working software. Not too much before, of course, lest we revert to waterfall. But be prepared to find value in rounding up a more representative user base in advance and having more precise design sessions than you would with normal development.

For example, FORTRAN is arguably the most influential programming language of all time, as measured by its influence on the programming language family tree, with FORTRAN’s influences flowing through Algol, C, C++, Java, Python, even Smalltalk and Ruby. FORTRAN was very much designed before it was implemented. Per Wikipedia, consider the years shown:

[Image not preserved: Wikipedia excerpt showing the years for FORTRAN]

The vision for FORTRAN was laid out thoughtfully, and to this day, we use many of its designs in our modern languages.

2.2. Provide an Environment Checking Function

When the component moves into the wild, the component owners won’t be able to control the environment. The component therefore must have functionality that allows it to check its environment proactively and report whether the environment is capable of supporting the component.

For example, a bash script that provides a view on all needed tools being present:

[Image not preserved: bash script reporting which needed tools are present]
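The same idea can be sketched in Python. This is a minimal illustration, not the original script: the tool list, the minimum Python version, and the function names `check_environment` and `print_report` are all assumptions made for the example.

```python
import shutil
import sys

# Hypothetical list of command-line tools the component depends on.
REQUIRED_TOOLS = ["python3", "git", "curl"]
MIN_PYTHON = (3, 8)  # assumed minimum interpreter version


def check_environment(required_tools=REQUIRED_TOOLS, min_python=MIN_PYTHON):
    """Return a dictionary describing whether the environment can support the component."""
    tools = {}
    for tool in required_tools:
        # shutil.which returns the full path of the executable, or None if it is not on PATH.
        tools[tool] = shutil.which(tool)

    missing = sorted(name for name, path in tools.items() if path is None)
    python_ok = sys.version_info[:2] >= min_python

    return {
        "ok": python_ok and not missing,
        "python_version": "{}.{}".format(*sys.version_info[:2]),
        "python_ok": python_ok,
        "tools": tools,
        "missing": missing,
    }


def print_report(report):
    """Print one line per check so a consumer can see exactly what is missing."""
    status = "ok" if report["python_ok"] else "TOO OLD"
    print("Python {}: {}".format(report["python_version"], status))
    for name, path in report["tools"].items():
        print("{:<10} {}".format(name, path if path else "MISSING"))
    print("Environment ready: {}".format(report["ok"]))
```

A consumer would call `print_report(check_environment())` before first use. Returning a dictionary rather than printing directly lets installers and other tooling act on the result as well.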

2.3. Produce an Instant-On Reference Implementation

The component must be able to install itself with a minimum number of instructions and no assistance from experts. The installation should then do a "hello world" level of execution to show that it is installed correctly and capable of operation. This immediate feedback will give confidence to the component consumers.

For example, the default installation of Tomcat provides a working web application around Tomcat itself with a surprising amount of functionality:

[Image not preserved: default Tomcat web application]
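For a code library rather than a server, a "hello world" check might look something like the sketch below. It is only an illustration: the function name `hello_world` is hypothetical, and the write-then-read of a temporary file stands in for whatever the real component's smallest useful operation is.

```python
import tempfile
from pathlib import Path


def hello_world(work_dir=None):
    """Exercise the simplest end-to-end path and report whether it worked.

    Hypothetical stand-in: the "component" here only writes a file and reads
    it back. A real component would run its own smallest useful operation.
    """
    result = {"ok": False, "steps": []}
    try:
        with tempfile.TemporaryDirectory(dir=work_dir) as tmp:
            probe = Path(tmp) / "hello.txt"
            probe.write_text("hello world")
            result["steps"].append("write")
            if probe.read_text() == "hello world":
                result["steps"].append("read")
                result["ok"] = True
    except OSError as error:
        # Report the failure instead of raising, so the installer can show it.
        result["error"] = str(error)
    return result
```

Running this as the last step of installation gives the consumer the immediate feedback described above, and the list of completed steps shows how far it got if something fails.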

2.4. Build As Much Test Code As Component Code

When we think of the "component", we think of the code that directly implements it. But a good part of any code base is the test code that goes around it at all levels: unit, smoke, integration, user acceptance, QA, performance. This need for testing code is even more pronounced for reusable components since they will encounter more variance of use in the wild. When building reusable components, err heavily on the side of more testing code. It’s not unreasonable to have as much testing code as component code for important components.

For example, running at a logical date rather than the system date is both important and tricky. Here is some code we wrote to achieve that outcome:

import calendar
import os
from datetime import datetime, timedelta

# Module-level flag; assumed to be set elsewhere in util when running under Jupyter.
bUsingJupyter = False

def getProcessingDates(mode='month_previous', verbose=False):

    # Use the logical date from the TODAY environment variable (YYYY-MM-DD).
    # Under Jupyter, use midnight of the system date instead.
    if bUsingJupyter:
        processing_date = datetime.now().replace(minute=0, hour=0, second=0, microsecond=0)
    else:
        processing_date = datetime.strptime(os.getenv('TODAY'), '%Y-%m-%d')

    start_date = None
    end_date = None

    if mode == 'month_current':
        # First through last day of the processing date's month.
        start_date = processing_date.replace(day=1)
        end_date = start_date.replace(day = calendar.monthrange(start_date.year, start_date.month)[1])

    elif mode == 'month_previous':
        # Step back one day from the first of this month to land in the previous month.
        end_date = processing_date.replace(day=1)
        end_date = end_date - timedelta(days=1)
        start_date = end_date.replace(day=1)

    if verbose:
        print('Using Jupyter:   {}'.format(bUsingJupyter))
        print('Mode:            {}'.format(mode))
        print('Processing date: {}'.format(processing_date))
        print('Start date:      {}'.format(start_date))
        print('End date:        {}'.format(end_date))

    dictReturn = {
        "processing_date" : processing_date,
        "start_date" : start_date,
        "end_date" : end_date,
    }

    return dictReturn

That is about three dozen lines.

Here is the code we have to test it in Python and Bash:

>cat getProcessingDates.py
################################################################################
# getProcessingDates.py
################################################################################

from util import util

dictDates = util.getProcessingDates('month_previous', True)
print ("Within, prev:    {}, {}, {}".format(dictDates['processing_date'], 
    dictDates['start_date'], dictDates['end_date']))

dictDates = util.getProcessingDates('month_current', True)
print ("Within, curr:    {}, {}, {}".format(dictDates['processing_date'], 
    dictDates['start_date'], dictDates['end_date']))

dictDates = util.getProcessingDates()
print ("Within, empty:   {}, {}, {}".format(dictDates['processing_date'], 
    dictDates['start_date'], dictDates['end_date']))


> cat getProcessingDates.sh
################################################################################
# getProcessingDates.sh
################################################################################

echo "===== Do a default run."
echo "----- Doing: $TODAY";
python getProcessingDates.py

echo "===== Do several days around today.  Adjust these for future testing."
export TODAY=2024-12-01; echo "----- Doing: $TODAY"; python getProcessingDates.py
export TODAY=2024-12-02; echo "----- Doing: $TODAY"; python getProcessingDates.py
export TODAY=2024-12-03; echo "----- Doing: $TODAY"; python getProcessingDates.py
export TODAY=2024-12-04; echo "----- Doing: $TODAY"; python getProcessingDates.py
export TODAY=2024-12-05; echo "----- Doing: $TODAY"; python getProcessingDates.py
export TODAY=2024-12-06; echo "----- Doing: $TODAY"; python getProcessingDates.py

echo "===== Do things around year end as a special case."
export TODAY=2023-12-31; echo "----- Doing: $TODAY"; python getProcessingDates.py
export TODAY=2024-01-01; echo "----- Doing: $TODAY"; python getProcessingDates.py

echo "===== Make sure it works well into the future."
export TODAY=2050-01-01; echo "----- Doing: $TODAY"; python getProcessingDates.py

echo "===== Back to a default run."
. util # Puts things back to normal.
echo "----- Doing: $TODAY";
python getProcessingDates.py

################################################################################

That is about four dozen lines of code, even more than the main functionality.

The more critical a piece of code is, the shorter it tends to be, and the more important it is that the ratio of testing code to component code be high.
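The bash driver shown earlier prints the dates for a person to inspect. A complementary approach is to assert the expected values so that a machine can detect regressions. The sketch below is not the original test code: `month_range` is a hypothetical helper that repeats the month arithmetic from the function above but takes the date as a parameter, which makes it testable without setting `TODAY`.

```python
import calendar
from datetime import datetime, timedelta


def month_range(processing_date, mode="month_previous"):
    """Hypothetical helper: the same month arithmetic, with the date passed in."""
    if mode == "month_current":
        start_date = processing_date.replace(day=1)
        last_day = calendar.monthrange(start_date.year, start_date.month)[1]
        end_date = start_date.replace(day=last_day)
    elif mode == "month_previous":
        end_date = processing_date.replace(day=1) - timedelta(days=1)
        start_date = end_date.replace(day=1)
    else:
        raise ValueError("Unknown mode: {}".format(mode))
    return {"start_date": start_date, "end_date": end_date}


def run_checks():
    """Assert the expected ranges for ordinary, year-end, and leap-year dates."""
    cases = [
        # (logical date, mode, expected start, expected end)
        ("2024-12-03", "month_current", "2024-12-01", "2024-12-31"),
        ("2024-12-03", "month_previous", "2024-11-01", "2024-11-30"),
        ("2024-01-01", "month_previous", "2023-12-01", "2023-12-31"),  # crosses year end
        ("2024-03-15", "month_previous", "2024-02-01", "2024-02-29"),  # leap year
        ("2050-01-01", "month_current", "2050-01-01", "2050-01-31"),
    ]
    for today, mode, start, end in cases:
        dates = month_range(datetime.strptime(today, "%Y-%m-%d"), mode)
        assert dates["start_date"].strftime("%Y-%m-%d") == start, (today, mode)
        assert dates["end_date"].strftime("%Y-%m-%d") == end, (today, mode)
    return len(cases)
```

Calling `run_checks()` raises an `AssertionError` naming the failing case, or returns the number of cases that passed. Each new edge case costs one line in the table, which keeps the ratio of test code high without much effort.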

2.5. Use Industry Libraries

Industry libraries are essentially reusable components written by someone else and given to the industry. Obviously, if one suits your needs, you use it; and if it doesn’t, you don’t. But there is a gray area in the middle. Because writing new code is more fun than learning someone else’s, the temptation is to not use industry libraries in the gray area. Avoid this mistake. Give extra consideration to industry libraries that almost meet your needs, and consider that they might actually meet your needs, then use them, even if it might be more inspiring to build your own. Remember that if you don’t like the last 10% of an industry library, you’re going to take on the other 90% that works if you attempt to make your own.

2.6. Favor Fine-Grain Over Coarse-Grain (Then Compose)

When coding for a specific context, it can be fairly low-risk to glob lots of functionality into coarse-grain units of work and use them directly. Reusable components tend to find themselves trying to serve a broader array of purposes, some of which may be hard to determine in advance. Such needs are better served by making fine-grain functionality in the components so that consumers can compose them in ways that meet their needs, which may not be predictable by the owners of the reusable components in advance. In an attempt to capture both flexibility and usability, component owners can also release a coarse-grain layer in the component library so that consumers who truly do just need the large units of functionality can use them directly.
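As a small illustration of the layering described above, here is a sketch with four fine-grain functions and one convenience function composed from them. All names and the sample data are hypothetical.

```python
def parse_line(line, separator=","):
    """Fine-grain: split one delimited line into stripped fields."""
    return [field.strip() for field in line.split(separator)]


def to_record(fields, names):
    """Fine-grain: pair field values with column names."""
    return dict(zip(names, fields))


def keep_if(records, predicate):
    """Fine-grain: keep only the records for which predicate returns True."""
    return [record for record in records if predicate(record)]


def total(records, key):
    """Fine-grain: sum one numeric column."""
    return sum(float(record[key]) for record in records)


def total_for_region(lines, region):
    """Convenience layer composed entirely from the pieces above."""
    names = parse_line(lines[0])
    records = [to_record(parse_line(line), names) for line in lines[1:]]
    matching = keep_if(records, lambda record: record["region"] == region)
    return total(matching, "amount")
```

With `lines = ["region, amount", "east, 10.5", "west, 4", "east, 2"]`, the call `total_for_region(lines, "east")` returns `12.5`. A consumer who needs something the owners did not anticipate, such as a total of all amounts above a threshold, can combine `parse_line`, `to_record`, `keep_if`, and `total` directly without asking for a new release.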

2.7. Use Environment Decoupling Mechanisms

Operating systems provide a number of mechanisms that allow applications to be decoupled from the environment. Utilize them heavily. These include environment variables, paths to the system and libraries, well-known commands and endpoints, and so on. For environment variables, there are those defined by the system, but there is a whole layer of environment variables the component itself can provide in its install process. These should be logically designed and easily understood by component users.

For example, the Snowflake database can be connected to from Python with something like this in code:

import snowflake.connector

# pkb holds the private key bytes, loaded elsewhere (not shown).
ctx = snowflake.connector.connect(
    user='foobar',
    account='foobar.us-east-1.private',
    private_key=pkb,
    warehouse='FOOBAR_WHS',
    database='FOOBAR_DB',
    schema='FOOBAR_DM',
    role='foobar_role'
    )

Or you can have zero code in Python, and instead use the following config file:

> cat ~/.snowsql/config
[connections.foobar]
accountname      = foobar.us-east-1.private
username         = foobar
rolename         = foobar_role
warehousename    = foobar_whs
dbname           = foobar_db
schemaname       = foobar
private_key_path = /export/home/foobar/.ssh/id_foobar

Then use the config file on the command line like this:

snowsql -c foobar -f 'foobar.sql'
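The layer of component-defined environment variables mentioned at the start of this section can be sketched as follows. The variable names (`FOOBAR_HOME`, `FOOBAR_CONNECTION`, `FOOBAR_LOG_LEVEL`) and their defaults are invented for illustration; a real component would document its own.

```python
import os

# Hypothetical variable names that a component's installer might define.
DEFAULTS = {
    "FOOBAR_HOME": os.path.join(os.path.expanduser("~"), ".foobar"),
    "FOOBAR_CONNECTION": "foobar",
    "FOOBAR_LOG_LEVEL": "INFO",
}


def load_settings(environ=None):
    """Resolve each setting from the environment, falling back to its default."""
    environ = os.environ if environ is None else environ
    settings = {}
    for name, default in DEFAULTS.items():
        settings[name] = {
            "value": environ.get(name, default),
            # Recording the source makes it easy to see why a value is in effect.
            "source": "environment" if name in environ else "default",
        }
    return settings
```

Because nothing about the environment is hard-coded, the same component code runs unchanged on a laptop, a build server, or in production. Only the variables differ.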

2.8. Use Variable Parameters and Return Values

Functions in reusable components need an above-average level of flexibility. This includes both the input parameters and the return values. For both, favor techniques that are highly flexible. Use fixed parameters only if it’s obvious they will be mandatory in all contexts, and keep them to a minimum. Favor dictionary objects heavily (or dictionary-like, depending on language and context). This includes sending in most parameters in extensible formats like XML (yuk!), JSON, TOML, or YAML. Return values should also be dictionaries. Within the dictionaries, favor dictionaries, for the same reason. Absolutely no "magic numbers" in any parameters or return values: give everything a descriptive name, even if all it does is map to some integer.
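A minimal sketch of these conventions in Python follows. The `Status` names, the `get_customer` function, and the in-memory customer data are all hypothetical, chosen only to show dictionary-in, dictionary-out, and named values instead of bare integers.

```python
from enum import IntEnum


class Status(IntEnum):
    """Descriptive names for values that would otherwise be magic numbers."""
    OK = 0
    NOT_FOUND = 1
    INVALID_REQUEST = 2


# Hypothetical in-memory data standing in for whatever the component looks up.
_CUSTOMERS = {"c-100": {"name": "Acme", "region": "east"}}


def get_customer(request):
    """Take a dictionary of parameters and return a dictionary of results.

    Unknown keys in request are ignored, so callers can pass extra context and
    new optional parameters can be added later without breaking old callers.
    """
    customer_id = request.get("customer_id")
    if customer_id is None:
        return {"status": Status.INVALID_REQUEST, "error": "customer_id is required"}

    customer = _CUSTOMERS.get(customer_id)
    if customer is None:
        return {"status": Status.NOT_FOUND, "customer_id": customer_id}

    # Optional parameter: which fields to return. Defaults to all of them.
    fields = request.get("fields", sorted(customer))
    return {
        "status": Status.OK,
        "customer_id": customer_id,
        "customer": {name: customer[name] for name in fields if name in customer},
    }
```

A caller compares against `Status.OK` rather than `0`, and a later release can add keys to either dictionary without changing the function signature.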

3. Delivery

3.1. Make All Source Code Easily Discoverable

Code is the ultimate documentation, since it is what the component actually does. Thorough higher-level documentation is obviously needed since not everyone has the time or skill to dig into the code behind the components. But when the component is in the field, there will be users of it who have the skill and interest to know exactly what the component is doing, and it should be easy for such people to find the code. This also means that your code should be well organized, well designed, and kept clean. If your code is a mess, it will be harder for power-consumers to understand it. All the test cases should be immediately discoverable in the code base as well, so that all aspects of the components can be thoroughly understood.

3.2. Phone Home for Usage and Errors

Once your components are released into the wild, you will benefit more than ever from knowing what they are doing. Build functionality into your components to have them report their usage and errors back to the owning organization. Provide a mechanism for the consumer to turn this off in case they are uncomfortable with it. Use these metrics to get a read on how your components are actually used, and use that knowledge in future releases.
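One possible shape for this, sketched under several assumptions: the opt-out variable name `FOOBAR_NO_TELEMETRY` is invented, the event fields are only examples, and the transport is passed in as a plain callable because the right choice (HTTP, message queue, log file) depends on the owning organization.

```python
import json
import os
import platform
import time

# Hypothetical opt-out switch: the consumer sets it to any non-empty value.
OPT_OUT_VARIABLE = "FOOBAR_NO_TELEMETRY"


def build_event(component, version, kind, detail=None):
    """Assemble one usage or error event as a plain dictionary."""
    return {
        "component": component,
        "version": version,
        "kind": kind,  # for example "usage" or "error"
        "detail": detail or {},
        "python": platform.python_version(),
        "timestamp": int(time.time()),
    }


def report(event, send, environ=None):
    """Send the event unless the consumer opted out. Never raise into the caller.

    send is any callable that accepts a JSON string.
    """
    environ = os.environ if environ is None else environ
    if environ.get(OPT_OUT_VARIABLE):
        return {"sent": False, "reason": "opted_out"}
    try:
        send(json.dumps(event))
        return {"sent": True}
    except Exception as error:  # reporting must not break the component itself
        return {"sent": False, "reason": str(error)}
```

The two design points worth keeping, whatever the transport, are that the opt-out is checked before anything leaves the machine and that a reporting failure is swallowed rather than surfaced to the consumer's code.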