Tips and Tricks by Database Platform
The following is a collection of tips for working with specific database platforms. For additional tips that apply to pyodbc as a whole, see Features beyond the DB API.
Microsoft Access | Microsoft SQL Server | Sybase
Microsoft Access
- Databases containing ODBC Linked Tables
- Creating a new database file
- Limitations of the Access ODBC driver
- UnicodeDecodeError when calling Cursor.columns
Databases containing ODBC Linked Tables
pyodbc can work with Access databases that contain ODBC Linked Tables if we disable connection pooling before connecting to the Access database:
import pyodbc
pyodbc.pooling = False
cnxn = pyodbc.connect(r"DRIVER={Microsoft Access Driver (*.mdb, *.accdb)};DBQ= ... ")
Creating a new database file
Access DDL does not support CREATE DATABASE (or similar). If you need to create a new, empty database file you can use the (free) third-party msaccessdb module.
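For example, a minimal sketch, assuming the msaccessdb module is installed (pip install msaccessdb) and using an illustrative file path:

import msaccessdb
import pyodbc

db_path = r"C:\temp\new.accdb"  # illustrative path
msaccessdb.create(db_path)  # create a new, empty .accdb file

cnxn = pyodbc.connect(
    r"DRIVER={Microsoft Access Driver (*.mdb, *.accdb)};"
    "DBQ=" + db_path
)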
Limitations of the Access ODBC driver
The Access ODBC driver is not able to fully support some of the "complex" column types like "(multi-valued) Lookup" fields and "Attachment" fields. In some cases the ODBC driver can read values from such a column, but is unable to perform INSERT/UPDATE/DELETE operations. If you have to perform such tasks then you may need to use Access DAO (Data Access Objects) along with pywin32.
There is a list of more general Microsoft Access specifications and limitations here.
UnicodeDecodeError when calling Cursor.columns
There is an issue with the Access ODBC driver that can cause a UnicodeDecodeError when trying to use the Cursor.columns method if the table definition in Access includes optional column "Description" information:
Field Name  Data Type   Description
----------  ----------  --------------------
ID          AutoNumber  identity primary key
LastName    Text        Family name
FirstName   Text        Given name(s)
DOB         Date/Time   Date of Birth
For a discussion of the issue and possible workarounds, see issue #328.
Microsoft SQL Server
- Authenticate using an Access Token from Azure Active Directory
- Stored Procedures with output parameters and/or return values
- Connecting to a named instance of SQL Server from a Linux client
- DATETIMEOFFSET columns (e.g., "ODBC SQL type -155 is not yet supported")
- TIME columns
- SQL Server Numeric Precision vs. Python Decimal Precision
- Using fast_executemany with a #temporary table
- Always Encrypted
- Non-numeric float values
- Passing row-oriented parameter data as a JSON string
- Speeding up filtering of indexed varchar columns
- unixODBC connection pooling
Authenticate using an Access Token from Azure Active Directory
In addition to the authentication methods built into the driver (username/password, Windows Integrated (Kerberos), Azure Active Directory Password, Managed Service Identity, etc.), the ODBC Driver for SQL Server also supports authenticating to Azure SQL and DW directly using an access token obtained from any of the authentication methods provided by Azure Active Directory. Since pyODBC 4.0.24, the token can be supplied as the driver-specific pre-connect attribute 1256 (SQL_COPT_SS_ACCESS_TOKEN), formatted as shown in the following examples:
(Python 2.x)
import struct
import pyodbc

token = "eyJ0eXAiOi..."
# expand the token: each byte is followed by a zero byte (UCS-2 LE encoding)
exptoken = ""
for i in token:
    exptoken += i
    exptoken += chr(0)
# prefix the expanded token with its length, packed as a 4-byte integer
tokenstruct = struct.pack("=i", len(exptoken)) + exptoken
conn = pyodbc.connect(connstr, attrs_before={1256: bytearray(tokenstruct)})
(Python 3.x)
import struct
import pyodbc

token = b"eyJ0eXAiOi..."
# expand the token: each byte is followed by a zero byte (UCS-2 LE encoding)
exptoken = b"".join(bytes([i, 0]) for i in token)
# prefix the expanded token with its length, packed as a 4-byte integer
tokenstruct = struct.pack("=i", len(exptoken)) + exptoken
conn = pyodbc.connect(connstr, attrs_before={1256: tokenstruct})
(See also: original issue where this feature was requested)
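The examples above assume an access token is already in hand. As a sketch of one way to obtain one, assuming the third-party azure-identity package is installed (pip install azure-identity):

import struct
import pyodbc
from azure.identity import DefaultAzureCredential

# request a token scoped to Azure SQL
credential = DefaultAzureCredential()
token = credential.get_token("https://database.windows.net/.default").token
# pack the token as described above
exptoken = b"".join(bytes([i, 0]) for i in token.encode("ascii"))
tokenstruct = struct.pack("=i", len(exptoken)) + exptoken
conn = pyodbc.connect(connstr, attrs_before={1256: tokenstruct})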
Stored Procedures with output parameters and/or return values
See the Calling Stored Procedures page for an example of how to use a bit of T-SQL to retrieve these values.
Connecting to a named instance of SQL Server from a Linux client
Microsoft's SQL Server ODBC Driver for Linux is unable to resolve SQL Server instance names. However, if the SQL Browser service is running on the target machine we can use the (free) third-party sqlserverport module to look up the TCP port based on the instance name.
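For example, a minimal sketch, assuming the sqlserverport module is installed (pip install sqlserverport) and using illustrative server/instance names:

import pyodbc
import sqlserverport

servername = "192.168.1.144"
instancename = "SQLEXPRESS"
# ask the SQL Browser service on the target machine for the instance's TCP port
port = sqlserverport.lookup(servername, instancename)
cnxn = pyodbc.connect(
    "DRIVER=ODBC Driver 17 for SQL Server;"
    f"SERVER={servername},{port};"
    "UID=scott;PWD=tiger;"  # illustrative credentials
)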
DATETIMEOFFSET columns (e.g., "ODBC SQL type -155 is not yet supported")
Use an Output Converter function to retrieve such values. See the examples on the Using an Output Converter function wiki page.
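For convenience, here is a minimal sketch of such a converter, adapted from that wiki page (the struct format assumes SQL Server's binary DATETIMEOFFSET layout):

import struct
from datetime import datetime, timedelta, timezone

def handle_datetimeoffset(dto_value):
    # unpack year, month, day, hour, minute, second (int16 each),
    # fraction in nanoseconds (uint32), and offset hours/minutes (int16 each)
    tup = struct.unpack("<6hI2h", dto_value)
    return datetime(
        tup[0], tup[1], tup[2], tup[3], tup[4], tup[5], tup[6] // 1000,
        timezone(timedelta(hours=tup[7], minutes=tup[8])),
    )

cnxn.add_output_converter(-155, handle_datetimeoffset)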
Query parameters for DATETIMEOFFSET columns currently must be sent as strings. Note that SQL Server is rather fussy about the format of the string:
from datetime import datetime, timedelta, timezone

# sample data
dto = datetime(2018, 8, 2, 0, 28, 12, 123456, tzinfo=timezone(timedelta(hours=-6)))
dto_string = dto.strftime("%Y-%m-%d %H:%M:%S.%f %z")  # 2018-08-02 00:28:12.123456 -0600
# Trying to use the above will fail with
# "Conversion failed when converting date and/or time from character string."
# We need to add the colon for SQL Server to accept it
dto_string = dto_string[:30] + ":" + dto_string[30:] # 2018-08-02 00:28:12.123456 -06:00
TIME columns
Due to legacy considerations, pyodbc uses the ODBC TIME_STRUCT structure for datetime.time query parameters. TIME_STRUCT does not support fractional seconds, so datetime.time values have their fractional seconds truncated when passed to SQL Server.
crsr.execute("CREATE TABLE #tmp (id INT, t TIME)")
t = datetime.time(hour=12, minute=23, second=34, microsecond=567890)
crsr.execute("INSERT INTO #tmp (id, t) VALUES (1, ?)", t)
rtn = crsr.execute("SELECT CAST(t AS VARCHAR) FROM #tmp WHERE id=1").fetchval()
print(rtn) # 12:23:34.0000000
The workaround is to pass the query parameter as a string:
crsr.execute("INSERT INTO #tmp (id, t) VALUES (1, ?)", str(t))
rtn = crsr.execute("SELECT CAST(t AS VARCHAR) FROM #tmp WHERE id=1").fetchval()
print(rtn) # 12:23:34.5678900
Note that TIME columns retrieved by pyodbc have their microseconds intact:
rtn = crsr.execute("SELECT t FROM #tmp WHERE id=1").fetchval()
print(repr(rtn)) # datetime.time(12, 23, 34, 567890)
SQL Server Numeric Precision vs. Python Decimal Precision
Python's decimal.Decimal type can represent numbers with more than 38 digits of precision, but 38 is the maximum precision supported by SQL Server. Binding a parameter that exceeds this precision will result in an invalid precision error from the driver ("HY104 [Microsoft][...]Invalid precision value").
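For example, a sketch (the exact error text can vary by driver version):

from decimal import Decimal

ok = Decimal("1" * 38)        # 38 digits: within SQL Server's maximum precision
too_long = Decimal("1" * 39)  # 39 digits: exceeds it

crsr.execute("SELECT ?", ok).fetchval()  # works
crsr.execute("SELECT ?", too_long)       # pyodbc.Error: ('HY104', ... Invalid precision value ...)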
Using fast_executemany with a #temporary table
fast_executemany can have difficulty identifying the column types of a local #temporary table under some circumstances (#295).
-- Workaround 1: Use "ODBC Driver 17 for SQL Server" (or newer) and include UseFMTONLY=Yes or ColumnEncryption=Enabled in the connection string, e.g.,
cnxn_str = (
"Driver=ODBC Driver 17 for SQL Server;"
"Server=192.168.1.144,49242;"
"UID=sa;PWD=_whatever_;"
"Database=myDb;"
"UseFMTONLY=Yes;"
)
or
cnxn_str = (
"Driver=ODBC Driver 17 for SQL Server;"
"Server=192.168.1.144,49242;"
"UID=sa;PWD=_whatever_;"
"Database=myDb;"
"ColumnEncryption=Enabled;"
)
-- Workaround 2: Upgrade to pyodbc 4.0.24 (or newer) and use Cursor.setinputsizes to specify the parameter type, size, and precision, e.g.,
crsr.execute("""\
CREATE TABLE #issue295 (
id INT IDENTITY PRIMARY KEY,
txt NVARCHAR(50),
dec DECIMAL(18,4)
)""")
sql = "INSERT INTO #issue295 (txt, dec) VALUES (?, ?)"
params = [('Ώπα', 3.141)]
# explicitly set parameter type/size/precision
crsr.setinputsizes([(pyodbc.SQL_WVARCHAR, 50, 0), (pyodbc.SQL_DECIMAL, 18, 4)])
crsr.fast_executemany = True
crsr.executemany(sql, params)
-- Workaround 3: Use a global ##temporary table
If neither of the previous workarounds is feasible, simply use a global ##temporary table instead of a local #temporary table.
Always Encrypted
SQL Server 2016 and later support client-side encryption and decryption of data, when used with the ODBC Driver 13.1 for SQL Server or later. Retrieving data from encrypted columns through pyODBC has no special considerations; however, inserting data into encrypted columns is subject to some limitations of the ODBC driver itself, the two most important in pyODBC usage being:
- Data to be inserted into or compared with encrypted columns must be passed through a parameterised query; literals placed directly in the T-SQL will not work.
- Due to the lack of server-side type conversion, it may be necessary to use the setinputsizes() function extension (see https://github.com/mkleehammer/pyodbc/pull/280 for details) on specific columns to cause pyODBC to send the correct data type to the server for insertion or comparison with encrypted data. If you receive 'Operand type clash' errors, this is likely to be the case; see the sketch after this list.
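As a sketch of the second point, assuming a hypothetical table patients with an encrypted ssn CHAR(11) column and ColumnEncryption=Enabled in the connection string:

# force the parameter to be sent as CHAR(11) to match the encrypted column
crsr.setinputsizes([(pyodbc.SQL_CHAR, 11, 0)])
# parameterised query, as required for encrypted columns
crsr.execute("INSERT INTO patients (ssn) VALUES (?)", "123-45-6789")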
Non-numeric float values
SQL Server float columns cannot store non-numeric floating point values like "Infinity". Attempting to do so …
crsr.execute("CREATE TABLE #tmp (id int primary key, f float)")
crsr.execute("INSERT INTO #tmp (id, f) VALUES (?, ?)", (1, float("Infinity")))
… results in an error like this:
pyodbc.ProgrammingError: ('42000', '[42000] [Microsoft][ODBC Driver 17 for SQL Server][SQL Server]The incoming tabular data stream (TDS) remote procedure call (RPC) protocol stream is incorrect. Parameter 5 (""): The supplied value is not a valid instance of data type float. Check the source data for invalid values. An example of an invalid value is data of numeric type with scale greater than precision. (8023) (SQLExecDirectW)')
This is not a deficiency of pyodbc or ODBC; the same error occurs if we try to save float.PositiveInfinity to a float column using C# and System.Data.SqlClient.
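If the source data may contain such values, one possible workaround (an assumption, not the only option) is to map non-finite floats to NULL before inserting:

import math

rows = [(1, float("Infinity")), (2, 3.14)]
# replace inf, -inf, and nan with None so they are stored as NULL
cleaned = [(i, f if math.isfinite(f) else None) for i, f in rows]
crsr.executemany("INSERT INTO #tmp (id, f) VALUES (?, ?)", cleaned)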
Passing row-oriented parameter data as a JSON string
With SQL Server 2016+ we can use OPENJSON() and pass row-oriented parameter data as a JSON string instead of passing a list of tuples. Not only can this be considerably faster, but it can also work around issues with tuple-based TVP data like the one discussed here.
For example, given a user-defined table type
CREATE TYPE dbo.my_table_type AS TABLE
(
id int NOT NULL,
txt nvarchar(50) NULL,
PRIMARY KEY (id)
)
and a stored procedure that consumes it
CREATE PROCEDURE echo_tvp
@tvp dbo.my_table_type READONLY
AS
BEGIN
SET NOCOUNT ON;
SELECT * FROM @tvp;
END
this works (no None in the first row)
tvp_data = [(1, "Alfa"), (2, "Bravo")]
results = crsr.execute("EXEC echo_tvp ?", (tvp_data,)).fetchall()
print(results)
# [(1, 'Alfa'), (2, 'Bravo')]
but this fails
tvp_data = [(1, None), (2, "Bravo")]
results = crsr.execute("EXEC echo_tvp ?", (tvp_data,)).fetchall()
# Invalid SQL data type (0) (SQLBindParameter)
However, with SQL Server 2016+ this works
import json

tvp_data = [(1, None), (2, "Bravo")]
tvp_json = [dict(zip(["id", "txt"], row)) for row in tvp_data]
sql = """\
SET NOCOUNT ON;
DECLARE @tvp dbo.my_table_type;
INSERT INTO @tvp (id, txt)
SELECT id, txt FROM OPENJSON(?)
WITH (
id int '$.id',
txt nvarchar(50) '$.txt'
);
EXEC echo_tvp @tvp
"""
results = crsr.execute(sql, json.dumps(tvp_json, default=str)).fetchall()
print(results)
# [(1, None), (2, 'Bravo')]
Speeding up filtering of indexed varchar columns
By default, pyodbc sends string parameter values as Unicode. Therefore
results = crsr.execute(
"SELECT id FROM million_rows WHERE varchar_col = ?", "row012345"
).fetchall()
sends the equivalent of
SELECT id FROM million_rows WHERE varchar_col = N'row012345'
Unfortunately, this causes SQL Server to perform an "index scan" because it cannot compare Unicode (nvarchar) and SBCS (varchar) characters directly.
SQL Server will be able to perform an "index seek" (much faster) if we use setinputsizes() to tell pyodbc to send the string parameter values as varchar:
crsr.setinputsizes([(pyodbc.SQL_VARCHAR, 255)])
results = crsr.execute(
"SELECT id FROM million_rows WHERE varchar_col = ?", "row012345"
).fetchall()
which sends the equivalent of
SELECT id FROM million_rows WHERE varchar_col = 'row012345'
unixODBC connection pooling
The most recent (as of August 2024) updates to both unixODBC and pyodbc have changed the behaviour of connection pooling under unixODBC. With libmsodbcsql-18.4.so.1.1 (ODBC Driver 18 for SQL Server) and pyodbc >= 5.1.0:
- unixODBC <= 2.3.9 — Connection pooling does not work at all. pyodbc uses wide (Unicode) connection functions, and unixODBC did not start supporting those for connection pooling until 2.3.10.
- unixODBC == 2.3.11 — pyodbc defaults to pyodbc.pooling = True, but unixODBC ignores that setting, so connection pooling is disabled by default. Adding Pooling = Yes to the [ODBC] section of odbcinst.ini enables connection pooling (see the sketch after this list). Details in the unixODBC documentation here.
- unixODBC >= 2.3.12 — unixODBC now respects the pyodbc.pooling setting, so connection pooling is enabled by default. Pooling can be completely disabled by setting pyodbc.pooling = False before the first connection. Pooling can be disabled for a specific driver by adding CPTimeout = -1 (not 0, as currently documented) to the driver's section of odbcinst.ini. Note that pyodbc.pooling = True takes precedence over the Pooling = Yes/No setting in odbcinst.ini.
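For reference, a sketch of the relevant odbcinst.ini entries (driver name, library path, and timeout value are illustrative; check your own installation):

[ODBC]
Pooling = Yes

[ODBC Driver 18 for SQL Server]
Description = Microsoft ODBC Driver 18 for SQL Server
Driver = /opt/microsoft/msodbcsql18/lib64/libmsodbcsql-18.4.so.1.1
CPTimeout = 120

Setting CPTimeout = -1 in the driver's section would instead disable pooling for that driver, as noted above.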
For more detailed discussion, see Issue 774.
Sybase
Parameterized INSERTs and @@IDENTITY
Sybase ASE treats parameterized queries as LightWeight Procedures (LWPs), which are basically stored procedures with a cached query plan. Hence, the session-global @@IDENTITY value is reset to zero when the LWP exits:
cnxn = pyodbc.connect('DSN=mysybasedsn', autocommit=True)
crsr = cnxn.cursor()
crsr.execute("INSERT INTO dbo.mytable (foo) VALUES (?)", 'bar')
crsr.execute("SELECT @@IDENTITY AS id")
id = crsr.fetchone()[0] # always returns zero
To get the @@IDENTITY value, add the query after the INSERT, separated by a newline character:
cnxn = pyodbc.connect('DSN=mysybasedsn', autocommit=True)
crsr = cnxn.cursor()
crsr.execute("INSERT INTO dbo.mytable (foo) VALUES (?)\nSELECT @@IDENTITY", 'bar')
id = crsr.fetchone()[0]
Ref: https://github.com/FreeTDS/freetds/issues/337#issuecomment-640070962