0400 Consistent Authentication Issue - microsoft/CSS_SQL_Networking_Tools GitHub Wiki
0400 Consistent Authentication Issue
0400.1 Typical Error Messages
- Cannot generate SSPI context
- Login failed for user '(null)'
- Login failed for user ''
- Login failed for user 'NT AUTHORITY\ANONYMOUS LOGON'
- Login failed for user 'JohnDoe'
- Login failed for user 'CONTOSO\JohnDoe'
- Login failed. The login is from an untrusted domain and cannot be used with Windows authentication.
- SQL Server does not exist or access denied (this can also be a network error)
0400.2 Moving Parts
The initial goal is to try to isolate which of the moving parts is causing the problem.
0400.3 Pre-Work
0400.3.1 Please perform the initial data collection and narrowing steps: 0100 Initial Data Collection and Scoping Questions. This will help get a macro perspective of the scope of an issue, such as whether the issue affects multiple computers or just one, or whether only those computers in a specific data center are facing issues. This can help focus the troubleshooting steps. It will also make you prepared for discussing the issue with Microsoft Support should you choose to do so.
0400.3.2 Make sure you understand the application architecture. Make a summary in a succinct form, similar to the below description:
- There are two domains involved: CONTOSO and FABRIKAM.
- The client (SPARKY.CONTOSO.COM) is Windows 2012.
- The user (CONTOSO\JOHNDOE) runs EDGE and connects to a web server (_HTTP://WEB01.CONTOSO.COM/Accounting) using Integrated security.
- The IIS app pool runs as (CONTOSO\WEB_SVC).
- The web server connects to SQL Server 2014 (SQLProd01.FABRIKAM.COM\Accounting on port 1433) using the SqlClient .NET 4.6.2 Provider and delegates the user credentials to SQL Server via integrated security.
- The SQL Server service account is FABRIKAM\SQL_SVC_01.
0400.3.3 Collect the SPN information based on the service accounts identified in the description. e.g.
SETSPN -L CONTOSO\WEB_SVC > c:\temp\spns.txt > creates a new file
SETSPN -L FABRIKAM\SQL_SVC_01 >> c:\temp\spns.txt >> appends to a file
0400.4 Directory Services Specific Error Messages
If the SQL Server ERRORLOG file contains the following messages and you have confirmed that this is the issue you are dealing with:
Error -2146893039 (0x80090311): No authority could be contacted for authentication.
Error -2146893052 (0x80090304): The Local Security Authority cannot be contacted.
then this is an Active Directory issue. The domain controller cannot be contacted by Windows on the SQL Server machine, or the local security service (LSASS) is having a problem.
If you need assistance, the Microsoft Active Directory team can assist in diagnosing this issue.
0400.5 Login Failed Error Codes
If you are troubleshooting a Login Failed error message, the SQL Server ERRORLOG file can give more information in the SQL State value with Error 18456 (Login Failed).
State | Description |
---|---|
1 | Error information is not available. This state usually means you do not have permission to receive the error details. Contact your SQL Server administrator for more information. |
2 | User ID is not valid. |
5 | User ID is not valid. |
6 | An attempt was made to use a Windows login name with SQL Server Authentication. |
7 | Login is disabled, and the password is incorrect. |
8 | The password is incorrect. |
9 | Password must be changed. |
11 | Login is valid, but server access failed. One possible cause of this error is when the Windows user has access to SQL Server as a member of the local administrators group, but Windows is not providing administrator credentials. To connect, start the connecting program using the Run as administrator option, and then add the Windows user to SQL Server as a specific login. |
12 | Login is valid login, but server access failed. |
18 | Password must be changed. |
These errors should be relatively easy to correct.
0400.6 Special Login Failed Errors
0400.6.1 Login Failed for user '' (empty string)
Empty string means that SQL tried to hand-off the credentials to LSASS but there was some problem. Either LSASS was not available or the domain controller could not be contacted. You may also see the corresponding SSPI error codes logged as noted in section 0400.4 above.:
SSPI handshake failed with error code 0x80090311 while establishing a connection with integrated security; the connection has been closed.
SSPI handshake failed with error code 0x80090304 while establishing a connection with integrated security; the connection has been closed.
These codes translate as following:
Error -2146893039 (0x80090311): No authority could be contacted for authentication.
Error -2146893052 (0x80090304): The Local Security Authority cannot be contacted.
You may also see a NETLOGON event posted in the System event log on the SQL machine. e.g.
Source: NETLOGON
Date: 8/12/2012 8:22:16 PM
Event ID: 5719
Task Category: None
Level: Error
Keywords: Classic
User: N/A
Computer: SQLProd01
Description:
This computer was not able to set up a secure session with a domain controller in domain NORTHAMERICA due to the following: The remote procedure call was cancelled. This may lead to authentication problems. Make sure that this computer is connected to the network. If the problem persists, please contact your domain administrator.
You can take the offending DC off-line or use NLTEST.EXE to switch DCs on the fly.
The query the DC: NLTEST /SC_QUERY:CONTOSO
To change the DC: NLTEST /SC_RESET:CONTOSO\DC03
If you need assistance, the Microsoft Active Directory team can assist in diagnosing this issue.
0400.6.2 Login Failed for user '(null)'
This means that LSASS was not able to decrypt the security token using the SQL Server service account credentials. The main reason for this is that the SPN is associated with the wrong account.
SETSPN -X and -Q are good commands to check for duplicate or misplaced SPNs.
You can also download the Kerberos Configuration Manager.
Also: https://support.microsoft.com/en-us/help/2985455/kerberos-configuration-manager-for-sql-server-is-available
0400.6.3 Login failed for user 'NT AUTHORITY\ANONYMOUS LOGIN'
There are at least three scenarios for this issue:
- The application using NTLM credentials and trying to do a double-hop. For double-hop, if on the same machine, you can set the DisableLoopback or BackConnectionHostNames registry values. If you intend on using Kerberos, then the SPN may be specifying the wrong server name, DNS-suffix, or port number, or you may have to create an SPN. The service account may not be trusted for delegation. The user account may be marked as Sensitive in Active Directory. There are many more reasons why Kerberos may not function as intended.
- If no double-hop is involved, then it could also mean that there are Duplicate or misplaced SPNs and the client is running as LocalSystem or other machine account that gets NTLM credentials instead of Kerberos credentials.
- The Local Security Policy | Security Options | Network Security can be configured not to use the machine account for local accounts going off-box; it would use Anonymous credentials instead.
0400.6.4 Login failed for 'username' or 'domain\username'
If the domain name is not specified, then it is a failing SQL login. If it is specified, then it is a failing Windows Integrated login.
The two main causes for this message are either (a) the database requested is off-line or otherwise not available or (b) the user does not have permissions to the requested database. In either case, this is generally an internal SQL permissions issue and not related to issues with NTLM or Kerberos infrastructure.
0400.7 SSPI Context Messages
- Download and run the SSPIClient tool from the client to the SQL Server machine.
- Use the following KLIST command to manually try to retrieve the SQL Server SPN. e.g.
KLIST GET MSSQLSvc\SQLProd01.contoso.com:1433
0400.8 Get a Security Network Capture
The following is if you want to engage Microsoft to assist with troubleshooting. This is a combination of network capture, Problem Steps Recorder (PSR.EXE) capture, and NETLOGON capture.
Run on the client and server machines at the same time. If the application is a 3-tier or n-tier architecture, run on intermediate servers, as well.
- Install NETMON or WIRESHARK on all affected machines, or you can use the built-in NETSH command (Windows 2008 or newer). No reboot is required.
- Enable NETLOGON debug logging on the client and all servers: NLTEST /DBFLAG:2080FFFF
- If possible, do ONE of the following: (a) reboot the client machine, (b) have the user log off and back on again, or (c) close the client application and re-open it.
- On the client, start the Problem Steps Recorder (psr.exe) and click "Start Record". This will accurately capture all user actions that lead up to the problem and save to a ZIP file at the end.
- Start the network capture on all machines. If using NETSH, use: NETSH TRACE START CAPTURE=YES TRACEFILE=C:\TEMP%computername%.ETL (use an appropriate file/path name).
- Flush the DNS cache on all machines: IPCONFIG /FLUSHDNS
- Clear the NETBIOS cache on all machines: NBTSTAT /RR
- Purge client Kerberos tickets: KLIST purge
- Clear tickets on each server: KLIST -li 0x3e7 purge (Note: Please type the command and do not use copy/paste into the command-line. The dash may get converted to a hyphen and break the command. KLIST is case-sensitive.)
- Reproduce the issue.
- Stop the PSR.EXE recording.
- Stop the network captures and save with a meaningful name, e.g. SQLProd01.netmon.cap. For NETSH: NETSH TRACE STOP. Wait for the command prompt to reappear. Do not close the command window until this happens.
- Copy the NETLOGON log (c:\windows\debug\netlogon.log) and rename to a meaningful name, e.g. SQLProd01.netlogon.log.
- Disable logging: NLTEST /DBFLAG:0x0