Database Transaction Management - mkleehammer/pyodbc GitHub Wiki

TL;DR Database transactions are controlled at the connection level, not the cursor level.

What is a Database Transaction?

Those of you who come from a database background will be familiar with the idea of database transactions, i.e. where a series of SQL statements are committed together (or rolled-back) in one operation. Transactions are crucial if you need to make multiple updates to a database where each update in isolation would leave the database in an invalid or inconsistent state, albeit temporarily. The classic example of this is processing a check, where money is transferred from one bank account to another, i.e. a debit from one account and a credit to another account. It is important that both the debit and credit are committed together otherwise it will appear as if money has been (temporarily) created or destroyed.

Note, this whole article is relevant only when autocommit is set to False on the pyodbc connection (fyi, False is the default). When autocommit is set to True, the database executes a commit automatically after every SQL statement, so transaction management by the client is not possible (and it should be noted those automatic commits are executed by the database itself, not pyodbc). With autocommit set to True, the database effectively runs every SQL statement within its own transaction, so any calls by pyodbc to commit or rollback are essentially pointless.

Database Transactions in pyodbc

When using pyodbc with autocommit=False, it is important to understand that you never explicitly open a database transaction in your Python code. Instead, a database transaction is implicitly opened when a Connection object is created with pyodbc.connect(). That database transaction is then either committed or rolled-back by explicitly calling commit() or rollback(), at which point a new database transaction is implicitly opened. SQL statements are executed using the Cursor execute() function, hence the equivalent of the following SQL:

BEGIN TRANSACTION
  UPDATE T1 SET ...
  DELETE FROM T1 WHERE ...
  INSERT INTO T1 VALUES ...
COMMIT TRANSACTION
BEGIN TRANSACTION
  INSERT INTO T2 VALUES ...
  INSERT INTO T3 VALUES ...
COMMIT TRANSACTION

in Python would be:

cnxn = pyodbc.connect('mydsn', autocommit=False)
crsr = cnxn.cursor()
crsr.execute("UPDATE T1 SET ...")
crsr.execute("DELETE FROM T1 WHERE ...")
crsr.execute("INSERT INTO T1 VALUES ...")
cnxn.commit()
crsr.execute("INSERT INTO T2 VALUES ...")
crsr.execute("INSERT INTO T3 VALUES ...")
cnxn.commit()
cnxn.close()

As you can see, no database transaction is ever explicitly opened using pyodbc but they are explicitly committed.

Cursors Do NOT Control Database Transactions

To be clear, database transactions are managed through connections, not cursors. This is important to understand. Cursors are merely vehicles to execute SQL statements and manage their results, nothing more. Database transactions are committed and/or rolled back at the connection level. Yes, there is a convenience function commit() on the Cursor object but that simply calls commit() on the cursor's parent Connection object. Bear in mind too that when commit() is called on a connection, ALL the updates from ALL the cursors on that connection are committed together (ditto for rollback()). For example:

cnxn = pyodbc.connect('mydsn', autocommit=False)

# create two cursors on the same connection
crsr1 = cnxn.cursor()
crsr2 = cnxn.cursor()

# execute statements on both cursors
crsr1.execute("DELETE FROM T1 WHERE...")
crsr1.execute("UPDATE T1 SET ...")
crsr2.execute("DELETE FROM T2 WHERE...")
crsr2.execute("UPDATE T2 SET ...")

# call "commit()" on the first cursor (note, not directly on the connection)
crsr1.commit()  # this commits the results of ALL FOUR crsr1 and crsr2 "execute" statements as one database transaction

cnxn.close()

Concurrent Database Transactions

If you want to have separate concurrent transactions, you will probably need to create a separate connection object for each transaction. For example:

# create two connections (with cursors), one for each database transaction
cnxn1 = pyodbc.connect('mydsn', autocommit=False)
crsr1 = cnxn1.cursor()
cnxn2 = pyodbc.connect('mydsn', autocommit=False)
crsr2 = cnxn2.cursor()

# execute the first transaction, but don't commit
crsr1.execute("DELETE FROM T1 WHERE...")
crsr1.execute("UPDATE T1 SET ...")

# whilst keeping the first transaction open, execute and commit a second transaction:
crsr2.execute("DELETE FROM T2 WHERE...")
crsr2.execute("UPDATE T2 SET ...")
cnxn2.commit()  # fyi here, crsr2.commit() would have exactly the same effect

# finally, commit the first transaction
cnxn1.commit()

cnxn2.close()
cnxn1.close()

Always Remember to Commit!

When autocommit is False, you must positively commit a transaction otherwise the transaction will almost certainly get rolled back eventually. For example, when a connection is closed with the close() function, a rollback is always issued on the connection. When a Connection object goes out of scope before it's closed (e.g. because an exception occurs), the Connection object is automatically deleted by Python, and a rollback is issued as part of the deletion process. The default database behavior is to rollback transactions so always remember to commit your transactions.

Specifying a Transaction Isolation level

Database management systems that support transactions often support several levels of transaction isolation to control the effects of multiple processes performing simultaneous operations within their own transactions. ODBC supports four (4) levels of transaction isolation:

  • SQL_TXN_READ_UNCOMMITTED
  • SQL_TXN_READ_COMMITTED
  • SQL_TXN_REPEATABLE_READ
  • SQL_TXN_SERIALIZABLE

You can specify one of these in your Python code using the Connection set_attr() method on SQL_ATTR_TXN_ISOLATION, e.g.,

cnxn = pyodbc.connect(conn_str, autocommit=True)
cnxn.set_attr(pyodbc.SQL_ATTR_TXN_ISOLATION, pyodbc.SQL_TXN_SERIALIZABLE)
cnxn.autocommit = False  # enable transactions

Note that a particular database engine may not support all four isolation levels. For example, Microsoft Access only supports SQL_TXN_READ_COMMITTED.