CommonNotes - x-ian/dha-mis-migration GitHub Wiki

General differences between Access and MSSQL

Features to be lost / Not migrate-able

Automatic 'Push' for data updates between different Access-only users. In MSSQL an explicit pull/refresh is required if data on the server changed. At least within one Access DB data changes are automatically available to other users. Doesn't appear to be relevant here and unclear if this even works with the current split in frontend and backend DBs.
The completely-offline-scenario by simply copying the DB files doesn't work. This could be useful for performance reasons, presentation, testing or dev purposes, but requires a local MSSQL installation and some custom scripting/processes.
Relationships Diagram; now mostly the auto-defining of links between tables in Query Wizard which doesn't work anymore

Lookup Columns

In Access Lookup columns with its RowSource setting serve these multiple purposes from below and blurry the line between client (UI) and server (backend) with convenient features:

Define foreign key relationship optionally define a constraint to ensure only valid foreign keys are used, e.g. map District ID to District name
Define a 'filtered' foreign key relationship with a WHERE clause, e.g. only allow relationship to certain parent concepts
Define the UI elements/component to be used for edit, e.g. a Combobox
Define a static set of possible values
Define formatting and behavior of the UI element, e.g. size of dropdown box and restriction to values of underlying query
'Derived forms' inherit/copy these settings as defaults to set up standard UI elements

Lookup column rowsource properties can be defined in different flavours' (TODO: describe what the SSMA is doing in each case, if at all)

Lookup columns referring to pre-defined Access (Select) queries
Lookup columns using SQL SELECT statements (in Access SQL dialect)
Lookup columns with static set of possible values (e.g. 'N', 'Y', '')

Current usage of Lookup columns:

109 Lookup columns
17 Lookups make use of 5 predefined select queries
18 have static mappings like 0 for N, 1 for Y, 9 for blank
- 5 'categories' of mappings, could be worthwhile to also refer to an externalised definition to avoid future copy&paste and increase consistency (e.g. always representing 0 with N instead of No in one case)
74 have direct SQL SELECT statements
- ~30 queries across 6 categories are exact clones, for maintenance & consistency already candidates to be externalised/unified in dedicated queries/views
- ~13 overall 'categories' of mappings (assuming minor differences can be unified), worthwhile to try and unify the ones with minor differences

(Notes to myself: The Lookup definition can be summarized and partially analysed by custom CSV dump of Access table definitions and some scripting: tr '|' '\n' < TableCreateDefinition.csv | grep Lookup | sort)

Configuring Lookup Columns and in general Field properties for Query columns works, but it is unclear why sometimes the link between the query column and the underlying properties of the table column breaks, e.g. in psm_relocate_report. The lookup columns for Sending and Receiving District work for a newly created query. Also specifying the lookup properties for the column Transaction directly at this query works. But giving an additional Caption in the field properties seems to break these connections between query and underlying linked table for all existing lookup columns of this query. Weirdly these changes are working as expected as long as Access is not closed. But once it is closed (compacted?) and re-opened, only the IDs for the facilities are shown. Once this happens, there seems no way back to the previous state where the lookup columns were working. So for now any properties of lookup columns (including its Caption) appear to non-setable without breaking these lookups.

Establish linked tables

The Linked Table manager is still the main tool to manually establish links between an Access frontend and a MSSQL backend. This time an ODBC DSN (datasource name) is used. DSNs can be defined system-wide for every user (via registry), just configured for a specific user (also via registry), taken from a file or set up programmatically (DSN-less).

Unlike the system or user DSNs the File-based DSNs does not require any upfront system configuration and therefore can be easily passed around. Furthermore Access 'imports' the contents from the File-DSN (just a text/ini file) and basically uses a DSN-less connection after it was used. With this the frontend is 'self-contained' and can be used by other users/workstations without re-config as it knows by itself how and where to connect.

However it needs to be carefully used as the password for the connection can be (and sometimes needs to be saved) somewhere. It will be cleartext in the DSN-file (which should be ok as only 'database admins' need access to it), but it seems that with enough effort it could also be possible to extract this password from the Linked of the Access frontend.

Additionally MSSQL offers Trusted connections where the current Windows Domain User (from an Active Directory) is used to authenticate against MSSQL without exposing the passwords. Given the current understanding of both the environment at the office and this technical implementation, it seems like the best way forward.

MS offers several ODBC drivers to connect to a MSSQL database:

SQL Server: Pre-installed with Windows and extremely outdated
SQL Server Native Client 11.0: Used to be the primary ODBC driver back in the days; still comes with SQL Server 2014 installation; but according to https://msdn.microsoft.com/library/ms131415.aspx no longer supported
ODBC Driver 11 for SQL Server: Nowadays the preferred choice of the day https://msdn.microsoft.com/library/mt654048.aspx. If MSSQL is not installed, it requires an upfront installation of the ODBC driver package on every system to be connected from.

When using the Linked Table manager to link to MSSQL tables for the first time, every tables is prefixed with the schema name (dbo_ is the default). Tables can be manually renamed to match the current names, but this is a tedious activity. The following VBA script automates this process

Sub CDDB_ChopTableName()
	Dim tdf As DAO.TableDef
	For Each tdf In CurrentDb.TableDefs
		If Mid(tdf.Name, 1, 4) = "dbo_" Then
			tdf.Name = Mid(tdf.Name, 5, 64)
		End If
	Next
	Debug.Print "done"
End Sub

Even more convenient is the script 'RelinkAllTables' which automatically links (and re-links) to all known tables by simply removing (if necessary) and creating even the initial setup of the linked tables in any DB. With this script the File-based DSN is no longer needed and it can quickly establish the relevant links in any Access DB. (But when using username/password for the ODBC connection they need to be hard-coded in the script.) This approach could also be used to switch between different databases, e.g. for production, development, or testing.

Linked Tables need to be updated whenever the definition of the underlying tables changes in MSSQL (columns and/or indexes).

Keys used in Access-to-MSSQL

"When you create an ODBC-linked table, Office Access looks first for a clustered unique index. Each SQL Server table can only have one clustered index, which determines the order in which the data is physically stored. The clustered index does not have to be the primary key and does not even have to be a unique index (if not, SQL Server adds a value for each row that makes it unique). If all the unique indexes are nonclustered, Office Access uses the one that happens to be listed first alphabetically, which may not be the most efficient choice."

"If you have an index that you want Office Access to use for the keyset, it should either be a clustered index or it should have a name that sorts to the top of an alphabetical list. To control the choice of index programmatically or when you don't have control over the index names on SQL Server, there is no Office Access property that you can set in code. However, you can do something that seems like it shouldn't be possible: you can execute a Data Definition Language (DDL) query in Office Access that creates an index on the Office Access link itself, without affecting the linked SQL Server table. If you want to change a link from using one unique index to using another, you first must drop the index being used, because Office Access prevents you from creating a second unique index for a link that already has one assigned." (from https://technet.microsoft.com/en-us/library/bb188204(v=sql.90).aspx)

The conclusion is that for ease of use and consistency a linked table should always a dedicated unique, not nullable Primary Key.

Workaround to bring back some UI-related conveniences:

Unlike Access states when opening the Design view of a linked table, it still allows to change some of its settings for fields. E.g. Format patterns for a field like phone number or Lookup Columns to be automatically used in the default data sheet view. However these changes will get overwritten after re-linking the table. It is unclear if such UI-related configs on Linked Tables will always work the same way as in the Access-only environment (e.g. automatically being used in derived forms, ...). As these settings get overwritten during each (re)link (as there is no default place on the MSSQL table definitions), custom VBA code to store and retrieve Field properties of tables to and from a dedicated table in the DB was written (see VBA module ReapplyFieldProperties).

SQL differences between Access and MSSQL

String concatenation in Access is done through &, e.g. person.lastname & " " & person.firstname. If in a result column of a query one of the concatenated values is NULL, then an empty string '' is used instead. In MSSQL however a concatenation is done with + (like person.lastname + ' ' + person.firstname) and will result in a NULL value itself for the whole field, if one or more concatenated values contain NULL. Instead of using + (as it is also done through the migration assistant), the function CONCAT can be used to avoid testing for IsNull on every nullable field. This leads to CONCAT(person.lastname, ' ', person.firstname).
In SQL 'ordinary' views can't have an order criteria. As a workaround, a view definition with TOP can be used.
In general Access SQL is a superset of MSSQL SQL, therefore there are some expressions within Access that do not (easily) match to a MSSQL statement. These statements either need to be rewritten, or are simply executed on the frontend side (with most likely a severe performance bottleneck because more data needs to be fetched to the local DB).
Performance limitations: As Access SQL is more expressive (e.g. with the Iif statements, Date arithmetic, Trim) and the SQL parser of Access will not always find a way to efficiently create pure SQL statements for MSSQL, it simply retrieves the whole data set and does the processing client-side on the frontend. This can be very inefficient.
SELECT columns in SQL can't refer to other (named) columns, whereas in Access it is possible to 'label' a complex calculation in a Select column and reuse this calculation by its column name in another column. This is kind of inline user-defined functions that are not supported in MSSQL. Here a stored procedure needs to be written.

Security

The MSSQL backend (as well as newer Access versions) doesn't work with MDW (workgroup) files; permission/security needs to be removed for the migration:

Remove workgroup security from MDB file
Create empty DB
Import external data (all tables except the internal MSys... & queries) from another DB

Replacement thoughts:

A couple of ODBC connection choices:
- Trusted Connection through current Windows provider
- Dedicated user created on SQL Server
  - Share one SQL Server account for all ODBC connections and have a Access-local setup to verify users
  - Create a SQL Server user account for every valid DB user
Create/update user/timestamp fields are currently managed in VBA code for every form. So these fields only get updated if this code is executed. Updates to the generic datasheet view won't automatically pick up the relevant values.
If the SQL Server knows about all valid users, then it might automatically update these values on server side with triggers http://stackoverflow.com/questions/31634918/how-do-i-add-a-last-modified-and-created-column-in-a-sql-server-table
But the current VBA code should continue to work with most likely only minor modification to replace the CurrentUser calls with something else.

Querying the currently logged in user from the Access frontend is done with the query 'select system_user'. It returns either the current local SQL Server username or the Windows AD user (as domain\username). Within a SQL passthrough query this will return the current user login of MSSQL.

A custom public function CurrentUser() can overwrite the built-in Access Security function.

Queries

Access SQL is a superset of MSSQL in terms of available expressions and function calls. Furthermore also comparable functionality is sometimes/often(?) expressed in different ways (e.g. use of wildcards, Iif statement, Access string concatenation is done with &, in MSSQL with + or the CONCAT function). So Access needs to parse and potentially convert its query statements before handing over to MSSQL. For cases where Access doesn't know how to convert, the whole processing is done in the frontend. If subset of a dataset is required by the frontend and it can't properly 'ask' MSSQL to pre-filter it, then the whole dataset is pulled by the frontend and the filtering is done client-side. For big datasets this introduces a potentially big penalty on execution time. This is one of the major reasons why (maybe against common sense/wisdom), an Access-only environment with an Access backend DB is faster than with a dedicated MSSQL backend: The Access backend/optimizer has more chances to reduce the amount of data transferred from backend to frontend because it stays in its own environment.

Unfortunately it is difficult (impossible?) to automatically detect if the query processing is done in the Access frontend or the MSSQL backend. Among the known limitations where Access is not outsourcing the work are the following situations ()https://technet.microsoft.com/en-us/library/bb188204(v=sql.90).aspx):

Outer joins
Group by
Date functions
String concatenation
Aggregate expressions like count()
Nested named queries where one Access query refers to the result of another Access query
Passing row values to VBA functions
Queries that use Office Access–specific SQL syntax, such as PIVOT...TRANSFORM

In other words: Don't use query constructs that cause processing to be done by Access on the local computer. The following query operations force the Jet database engine to perform local data processing:

Join operations between table that are linked to different data source (I.e. a join between a SQL table and a linked Access table)
Join operations based on query that uses the DISTINCT keyword, or a query that contains a GROUP BY clause.
Outer joins that contain syntax that is not directly supported by the remote database server.
The LIKE operator used with Text or Memo fields
Multi-level grouping and totaling operations
GROUP BY clauses that are based on a query with the DISTINCT keyword, or the GROUP BY clause.
Crosstab queries that have more than one aggregate, or that have field, row, or column headings that contain aggregates, or that have an ORDER by clause
User-defined functions, or functions that are not supported by the remote server
Complex combinations of INNER JOIN, LEFT JOIN, or RIGHT JOIN operations in nested queries.

In some cases 'SQL passthrough queries' can be helpful. They completely bypass the Access parser and execution engine and are directly invoked on MSSQL. In most cases this should provide the highest performance as it is bypassing some layers within the Access stack. But often it requires changes in the query to be MSSQL-compliant (e.g. date calculations). Passthrough queries can not be defined through the Query Designer, they only work with the SQL view. Results of passthrough queries are readonly. Plus passthrough queries don't support input parameters from Access through the UI; they would need to be coded in VBA and then programmatically invoked. Additionally every passthrough query is configured with its own ODBC datasource. When using filebased DSNs or DSN-less connections and a change is required, every passthrough query also needs to be updated (just like with Linked Tables). However the Linked Table manager doesn't include queries, therefore custom VBA code to change these settings is required. Passthrough queries seem to be particular helpful when invoking mass updates via INSERT/UPDATE statements.

Creating views on MSSQL can be another approach. With this the server-side processing can be enforced to reduce the number of recordset processed by client-side Access. However views on MSSQL can not be parametrised and therefore can also not provide performance improvements in every case. Views will appear as normal linked tables in Access and can also only provide the starting point with simple/simpler calculations. However if access to frontend-local tables is required, views can't be used either. Lastly the setup and administration is higher (compared to the current Access-only approach) and views tend to be more 'heavyweight' in terms of setup effort.

So it seems that the best approach is to try and identify the most problematic queries first, e.g. the ones that are called most often, the ones that take most time and/or where the speed of the frontend is most crucial. From there a rewrite of the query/queries to make sure the most 'time expensive' operations to break down a bigger query into subqueries to help filtering the majority of recordsets in MSSQL. Afterwards have other queries doing the Access-specific things with the already limited dataset.

Debug with ODBC SQL logging (via sqlout.log) https://msdn.microsoft.com/en-us/library/bb177636(v=office.12).aspx http://blog.dafran.ca/post/2011/09/20/MSAccess-(2003)-SQL-Trace-(TraceSQLMode).aspx

pseudo index in access for linked tables:

apparently just to indicate access which 'identify' column to use on linked tables. otherwise they are only readonly.
apparently no speed impact
should be always a clustered index (the PK in SQL Server), but could be another one if not available (no PK) or not exposed through the ODBC driver (as with oracle): unexpected auto choose of pseudo index http://support.microsoft.com/default.aspx?scid=KB;EN-US;207745

Open Datasheet in Excel

Opening an Access datasheet in Excel changed a bit from Access 2003 to 2016. Now go to External Data - Export to Excel spreadsheet - Export data with formatting & Open destination file.

In Excel mark all (hitting CTRL-A or clicking top left cell) and hit 'Wrap text' in Ribbon bar twice.

It is possible to save these export steps, but stupiditly requires an absolute path (with therefore most likely the username in it). Thus making this less applicable for different computers and users. Additionally the saved export didn't automatically open Excel (but maybe I made a mistake while testing).

Updateable views

With MSSQL tables Access looses the implicit feature of having RowSource Lookup Column (semi-) automatically populate the generic datasheet with 'proper names' for relations with foreign keys (instead of displaying the IDs).

Through updateable views there might be a way to gain back some/most of this convenience: View and Edit with RowSource Lookup columns appear to be possible; Create and Delete most likely not. However when then editing a row make sure that only columns from one underlying table of the view is changed. Another problem is that the RowSource property of a LinkedTable disappears, when relinking/updating the links. So these RowSource properties would need to be redefined as well. Maybe there is a clever way to automate this setup.

Setup:

A materialised view in MSSQL (https://msdn.microsoft.com/en-us/library/ms180800.aspx) with linking the Lookup values together
Linking this view as a linked table into Access
Define the identifying column for the 'base table'
Configure RowSource properties for every column accordingly

An alternative approach could be to create a 'continuation form' for every relevant linked table where a RowSource Lookup feature could be helpful.

Query sorting

The order of records within the queries seem to differ at time. In the SQL standard an ORDER BY can not be provided during view creation. MSSQL offers a workaround, but even with Access queries using linked tables appear to have (minor) differences in the sorting. Current assumption is that this doesn't matter too much.

Creation of AutoNumber IDs

Within Access AutoNumber IDs are already generated when a new record is about to be created. In Forms and also datasheet tablegrid views these numbers already show up as soon as the data entry is ready to enter data for the new record. With MSSQL IDs are only assigned once the data is saved (via ODBC).

Todo: Check all place with AddNew code.

ODBC call failed (error 3146):

Maybe old version of jet engine? https://support.microsoft.com/en-us/kb/303257. But seems that even with Access 2007 this error persists.

Validation messages are not user friendly

Often/always? quite cryptic and generic messages are displayed from the ODBC connection, e.g. when a constraint violation occurred on MSSQL.

Possible custom error handler

Sub Update_Temp()
On Error GoTo ErrorTrap
    ' Execute connect code at this point
Exit_errortrap:
    Exit Sub
ErrorTrap:
    Dim myerror As Error
    For Each myerror In DBEngine.Errors
        With myerror
            If .Number <> 3146 Then
                MsgBox .Description
            End If
        End With
    Next
    Resume Exit_errortrap

End Sub

Datatypes Float vs. Real vs. Decimal/numeric

In general prefer Decimal whenever less precision than 38 digits is needed http://stackoverflow.com/questions/1056323/difference-between-numeric-float-and-decimal-in-sql-server.

When comparing data through the Access frontend technology with a direct MSSQL connection (through its internal TDS protocol from a nodejs Javascript client), different representations of numbers occurred. In particular the MSSQL type float tends to return slightly different raw numbers. However when comparing the output of the Access frontend linked to the Access backend and an Access frontend linked to the MSSQL database, both Access frontends behave the same. Therefore no action points arise from this, just documented for reference.

Float representation of 0:

rounding differences in table obs
-0.00 instead of 0.00
-many 'close-to-0' floats like 1.0e-03 (0.001)

Datatype Currency:

used 3x in pop_map and pop_sex_district
will be converted to MSSQL datatype Money

YesNo Datatype

SQL doesn't have a dedicated boolean type; often datatype bit is used and queries need to explicitly test for =0 (false) or =1 (true)
Yes/No fields are not-nullable in Access
SSMA message: "The bit column 'Voided' has no default value. To avoid problems with null values in Access applications, the default value '0' was added to target column."
Converted to nullable bit with default value 0, TODO: why not nullable in the first place?
In VBA code 0: true, -1: false; In Access export and SQL 1: true, 0: false; Impact on VBA code or other expressions?
Used in 12 tables

Datatype Date/Time

Per default project setting the SMAA converts the Access Date/Time into the SQL type datetime2. In MSSQL datetime2 was introduced to be SQL-compliant and increases both precision (below 1/3 second) and time range (for dates below year 1751 or so). However a simple addition of dates with '+' doesn't work anymore (most likely also a Non-Standard-SQL expression). Instead the dateadd function needs to be used. But as it looks the Access SQL parser doesn't (always?) translates the expressions.

SMAA - Default Project Settings - Type Mappings - date: Change from datetime2 to datetime.

DoCmd.RunSQL ("delete from art_drug_stocks where art_drug_stocks.expiry_date_min < #01/01/2000#") DoCmd.RunSQL ("delete from code_hdepartment where code_hdepartment.service_start < #01/01/1900#")

[[Section: Working with SQL Server datetime data ranges][https://blogs.msdn.microsoft.com/ssma/2011/03/06/access-to-sql-server-migration-understanding-data-type-conversions/]] http://stackoverflow.com/questions/31963130/sql-server-migration-assistant-for-access-date-data-type

Time-only date fields

Access can use “time-only datetime fields” where the date part is hidden with an Input mask. Still Access stores a full timestamp including the date part, but defaults the date fraction automatically to '12/30/1899' (default minimum date value in Access?). Validation Rules in Access for these fields do not need to specify this 'default' date fraction and the automatic migration also works, but constraints within MSSQL need to explicitly include this date. Like: (([start_time]>='12/30/1899 6:30:0' AND [start_time]<='12/30/1899 18:0:0' OR [start_time] IS NULL)) Time-only fields are only actively used in table art_clinic_obs for the fields start_time and end_time and defined for table htc_site_obs which doesn't have any records.

SSMA_TimeStamp

Due to different locking and concurrency behaviour, some tables (depending on the used columns) need an additional readonly column as a 'rowversion'. This is used by Access to detect and potentially cancel concurrent edits.

Relevant for Access Frontends for concurrency as for some columns it can't detect whether its local dataset is still the most recent one on the server; apparently Access automatically detects such a column in linked tables and includes it in its UPDATE and DELETE statements.
Useful to add a TimeStamp column as a default to every MSSQL table.
MS recommends to switch to datatype rowversion, but the SSMS doesn't support it. Appears as if rowversion and timestamp are fully synonymic.
Should not/never be updated manually, e.g. via INSERT/UPDATE statements or (partial) data imports.
http://stackoverflow.com/a/12943961
https://support.office.com/en-us/article/Move-Access-data-to-a-SQL-Server-database-by-using-the-Upsizing-Wizard-5d74c0df-c8cd-4867-8d07-e6e759d72924?CorrelationId=6463e63a-3477-4526-8b62-5ae4f6c4b788&ui=en-US&rs=en-US&ad=US&ocmsassetID=HA010275537

Alternative Migration tools

MySQL Workbench Migration

Access 2007
Import external data
Grant access to MSysRelationship
Show hidden and system objects in navigation pane
Save as 2007 (accdb)
ODBC User DSN, no system database
MySQL Workbench migration

Outcome:

No views
Table migration without warnings, errors (!!!!)

Access Upsizing Wizard

Upsizing wizard within Access only seems to know about older-ish SQL Server version. At least Access 2007 doesn't want to connect to SQL Server 2014 and according to some resource it only works with SQL Server 2005.

CommonNotes - x-ian/dha-mis-migration GitHub Wiki

General differences between Access and MSSQL

Features to be lost / Not migrate-able

Lookup Columns

Establish linked tables

Keys used in Access-to-MSSQL

Workaround to bring back some UI-related conveniences:

SQL differences between Access and MSSQL

Security

Queries

Open Datasheet in Excel

Updateable views

Query sorting

Creation of AutoNumber IDs

ODBC call failed (error 3146):

Validation messages are not user friendly

Datatypes Float vs. Real vs. Decimal/numeric

YesNo Datatype

Datatype Date/Time

Time-only date fields

SSMA_TimeStamp

Alternative Migration tools

MySQL Workbench Migration

Access Upsizing Wizard

Resources

⚠️ GitHub.com Fallback ⚠️

CommonNotes - x-ian/dha-mis-migration GitHub Wiki

General differences between Access and MSSQL

Features to be lost / Not migrate-able

Lookup Columns

Establish linked tables

Keys used in Access-to-MSSQL

Workaround to bring back some UI-related conveniences:

SQL differences between Access and MSSQL

Security

Queries

Open Datasheet in Excel

Updateable views

Query sorting

Creation of AutoNumber IDs

ODBC call failed (error 3146):

Validation messages are not user friendly

Datatypes Float vs. Real vs. Decimal/numeric

YesNo Datatype

Datatype Date/Time

Time-only date fields

SSMA_TimeStamp

Alternative Migration tools

MySQL Workbench Migration

Access Upsizing Wizard

Resources

⚠️ **GitHub.com Fallback** ⚠️

⚠️ GitHub.com Fallback ⚠️