Soft Delete And Cascade - NFSandbox/sh_trade_backend GitHub Wiki

This project use Soft Delete pattern when handling row deletions in database. However, SQLAlchemy, the ORM Framework we are currently using, does not support well with such pattern.

This article will briefly introduce the code pattern we used to deal with soft delete in our backend system.

Query

After searching, I noticed that it's possible to use the SQLAlchemy built-in feature Adding Global WHERE Criteria, However, I didn't dive into it because of limitation of time, and continuing the method of manually adding Entities.deleted == False as the WHERE clause when querying.

We are now using a Python package called sqlalchemy-easy-softdelete to automatically handle the soft delete behaviour in our ORM query. (GitHub Repo of this package)

The feature includes:

  • Add global WHERE criteria (Could be overriden by .execution_options(include_deleted=True))
  • delete() and undelete() function

Check out the GitHub repo of that package for more info.

Cascade Deletion

The cascade behaviour with soft deletion need to be implemented by ourselves, this article determine the basic standard of the implementation patterns.

Creating Cascade Function

For any entity that may need to perform Cascade Deletion, we create a function with name starts with: get_cascade_[...]

For example, there is One-To-Many relation between User and Item class, then we should create a function called:

async def get_cascade_items_from_users(ss: Session, users: Sequence[User]):
    # check database for all items have relationship with the users in the list
    return items

Here is an example in production:

async def get_cascade_questions_from_items(
    ss: SessionDep, items: Sequence[orm.Item]
):
    stmt = (
        select(orm.Question)
        .join(orm.Question.item)
        .where(orm.Item.item_id.in_([i.item_id for i in items]))
    )

    res = await ss.scalars(stmt)
    return res.all()

We should keep in mind that this function will later be used to determine the entities to be delete in cascade, so the Join Condition should be concur with the logic of Cascade Deletion.

For entities that have more than one relationships, there may be a need to write several different cascade functions for different relationships.

Cascade Delete Function

With the help with the utility function above, the cascade deletion will become much more easier.

async def remove_items_cascade(
    ss: SessionDep, items: Sequence[orm.Item], commit: bool = True,
) -> List[gene_sche.BlukOpeartionInfo]:
    # record the effective delete count of all entities
    q_count = gene_sche.BlukOpeartionInfo(operation="Cascading delete questions")
    i_count = gene_sche.BlukOpeartionInfo(operation="Delete items")
    
    try:
        # delete question cascade of item
        questions = await get_cascade_questions_from_items(ss, items)
        # notice, if questions also have cascade deletion behaviuor, 
        # we may need to call remove_questions_cascade() function 
        # instead of delete it directly.
        for q in questions:
            if not q.deleted:
                q_count.inc()
            q.delete() # Entity.delete() function is provided by soft delete plug Mixin.

        # delete items itself
        for i in items:
            if not i.deleted:
                i_count.inc()
            i.delete()
        
        # commit if flag enabled
        if commit:
            await ss.commit()

        # return info
        return [i_count, q_count]
    except:
        await ss.rollback()
        raise

We will create a function called delete_[entity_name]_cascade() function.

In this function:

  1. Use get_cascade_[...] function to determine all entities in other tables that should be deleted
  2. Perform cascade delete
  3. Finally, delete entities in this table
  4. Commit the change if needed

You could consider we are doing DFS iteration through these tables to perform deletions.

Note that since we are using Soft Delete, so the sequence is not matter in SQL level, and no Foreign Key Constraint error will occurred even if the order is changed. But we still recommend delete the cascade entities first.

Commit Flag

Here we use a parameter flag commit (default to True) to decide if we need to try committing the change before function return. This is because there are two different use cases of these Cascade Delete function:

  • Used as root delete entry point: This means user want to delete the entities of current table, so we call cascade delete function of this table. In this case, we should commit the changes after operation finished, so commit=True.
  • Called by other delete function as multilevel cascade delete: This function is called because user are trying to delete entities in other tables, and this function being called to delete the cascade entities because those tables has relationship with this table. In this case, the final commit work should be done by the entry cascade delete function but not this one, so commit=False.

Cascade Soft Delete

The image above illustrates an how nested cascade deletion works in our system by taking User Removal as an example.

Operation Feedback

Notice the return of the Cascade Delete function is with type of List[BlukOpeartionInfo].

BlukOpeartionInfo is a Pydantic Class used to return the bulk operation info to the requester of the API. Below is the basic structure of this class:

class BlukOpeartionInfo(BaseModel):
    success: bool = True
    operation: str | None = None
    total: NonNegativeInt = 0
	
	# member functions
	def inc(self, delta: int = 1):
	    ...

For requester, they will finally got a list of json object with the schema below:

[
    {
        "success": true,
        "operation": "Delete items",
        "total": 0
    },
    {
        "success": true,
        "operation": "Cascading delete questions",
        "total": 0
    }
]

Notice that we should always make the first object of the list to be the deletion info for current tables, not cascade tables. Some of the project test flows depends on this attributes.