HDP 2.6.5 / 3.1 and Active Directory

Inspiration

Enabling Kerberos/Active Directory for HDP is a straightforward task once you know the steps, but it can be a nightmare the first time. One can spend a lot of time guessing what to do next or why a problem came up. Below I describe, step by step, how to secure an HDP cluster with Active Directory.

Prerequisites

The procedure is described here: https://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.6.5/bk_security/content/_kerberos_overview.html
Test environment:

Create container and administrator account in Active Directory

To avoid mixing Hadoop principals with regular Windows users, it is good practice to create a separate container in the Active Directory tree and a dedicated admin account to manage it. The admin account should have delegated control of "Create, delete, and manage user accounts" in the container. The account is needed only while Kerberos is being enabled and whenever an HDP service is added or removed; if the HDP layout is stable, it can be disabled to reduce security risk.
(screenshot: the hadoop container and the hadoopadmin account in Active Directory)
In the example above:

  • Container CN : CN=hadoop,DC=fyre,DC=net
  • Admin account : CN=hadoopadmin,CN=hadoop,DC=fyre,DC=net
  • Admin Kerberos principal: hadoopadmin@FYRE.NET

Important
If the container administrator is created, make sure the permissions for the container are granted for "This object and all descendant objects".

(screenshot: delegated permissions applied to this object and all descendant objects)

Configure secure LDAP connection between Ambari host and Active Directory

Test non-secure connection

Check that the AD server is responding on the non-secure port (389):

nc -zv verse1.fyre.ibm.com 389

Ncat: Version 7.50 ( https://nmap.org/ncat )
Ncat: Connected to 9.30.54.109:389.
Ncat: 0 bytes sent, 0 bytes received in 0.01 seconds.

Test that the non-secure LDAP connection works. Run the command:

ldapsearch -b "cn=hadoop,dc=fyre,dc=net" -W -D hadoopadmin@fyre.net -H ldap://verse1.fyre.net

Get AD certificate

openssl s_client -connect verse1.fyre.net:636 -showcerts

Obtain the Active Directory domain certificate authority (CA) certificate. On Windows:
Certificate Authority -> (domain) -> Properties -> General -> View Certificates -> Details -> Copy to file -> Base-64 encoded X.509. The certificate is a text file and can be safely copied and pasted.
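If there is no access to the Windows console, the certificate chain presented on port 636 can also be dumped from the CentOS side. A minimal sketch (check the subject and issuer of every extracted file and pick the issuing CA, not the server certificate):

# split every certificate returned by the server into cert1.pem, cert2.pem, ...
echo | openssl s_client -connect verse1.fyre.net:636 -showcerts 2>/dev/null \
  | awk '/BEGIN CERTIFICATE/{i++} /BEGIN CERTIFICATE/,/END CERTIFICATE/{print > ("cert" i ".pem")}'

# inspect each extracted certificate
for f in cert*.pem; do openssl x509 -in "$f" -noout -subject -issuer; done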

-----BEGIN CERTIFICATE-----
MIIF0zCCBLugAwIBAgITSAAAAAS50zoLbfaRKwAAAAAABDANBwwwkdleoe12l2l3
.............
1qaw23w044kfkBh/XIac6H4qHuYmH9cHX3wl3IVpUx8R/Mharls0SRMpvG2hk8x0
/pseem9sgQ==
-----END CERTIFICATE-----

Configure CentOS secure LDAP

Copy the Active Directory CA certificate to /etc/openldap/ad.cert (any other location can be used)

vi /etc/openldap/ad.cert

(copy and paste)
Modify the /etc/openldap/ldap.conf file

vi /etc/openldap/ldap.conf

...
#TLS_CACERTDIR  /etc/openldap/certs
TLS_CACERT /etc/openldap/ad.cert
...

Run ldapsearch again using the secure connection (the only difference from the non-secure command is the ldaps scheme):

ldapsearch -b "cn=hadoop,dc=fyre,dc=net" -W -D hadoopadmin@fyre.net -H ldaps://verse1.fyre.net

The command should yield the same result as the non-secure one.

LDAPS troubleshooting

If the non-secure connection works but the secure one fails, the problem is most likely an invalid certificate. Run ldapsearch again with the debug option and try to figure out the reason:

ldapsearch -b "cn=hadoop,dc=fyre,dc=net" -W -D hadoopadmin@fyre.net -H ldaps://verse1.fyre.net -d 1
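Before blaming the certificate, it may also be worth confirming that the LDAPS port itself is reachable, mirroring the earlier check of port 389:

nc -zv verse1.fyre.net 636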

Prepare Ambari server for Active Directory

Create a keystore and import the Active Directory CA

keytool -import -file /etc/openldap/ad.cert -alias ambari-server -keystore /etc/openldap/truststore.jks

Instead of /etc/openldap/truststore.jks, any other location can be used.
Verify the content of the keystore with:

keytool -list -keystore /etc/openldap/truststore.jks

Enter keystore password:  
Keystore type: jks
Keystore provider: SUN

Your keystore contains 1 entry

ambari-server, 2019-02-04, trustedCertEntry, 
Certificate fingerprint (SHA1): 67:E2:01:ED:36:1C:1F:4B:AA:2C:B5:07:D1:92:E6:5E:B3:70:ED:8E

Register the truststore with the Ambari server

ambari-server setup-security

Using python  /usr/bin/python
Security setup options...
===========================================================================
Choose one of the following options: 
  [1] Enable HTTPS for Ambari server.
  [2] Encrypt passwords stored in ambari.properties file.
  [3] Setup Ambari kerberos JAAS configuration.
  [4] Setup truststore.
  [5] Import certificate to truststore.
===========================================================================
Enter choice, (1-5): 4
Do you want to configure a truststore [y/n] (y)? y
TrustStore type [jks/jceks/pkcs12] (jks):
Path to TrustStore file :/etc/openldap/truststore.jks
Password for TrustStore:
Re-enter password: 
Ambari Server 'setup-security' completed successfully.

Restart ambari-server

ambari-server restart

Check Active Directory prerequisites

https://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.6.5/bk_security/content/_installing_the_jce.html
https://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.6.5/bk_security/content/_use_an_existing_active_directory_domain.html
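A quick way to confirm that the unlimited strength JCE policy is active for the JDK used by Ambari (assuming jrunscript from that JDK is on the PATH; it should print 2147483647):

jrunscript -e 'print(javax.crypto.Cipher.getMaxAllowedKeyLength("AES"))'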

Enable Kerberos Security

Collect all information

  • KDC host (AD hostname): verse1.fyre.net
  • Realm name (AD/Kerberos realm name): FYRE.NET
  • Secure LDAP URL (AD/LDAP URL): ldaps://verse1.fyre.net
  • AD container (DN of the previously created AD container): CN=hadoop,DC=fyre,DC=net
  • KAdmin host (AD hostname): verse1.fyre.net
  • Admin principal (admin account for the AD container): hadoopadmin@FYRE.NET
  • Admin password

Run Kerberos Wizard

https://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.6.5/bk_security/content/_launching_the_kerberos_wizard_automated_setup.html

Troubleshooting

Lesson learned: if the Kerberos Wizard is stopped or cancelled in the middle of the process, the cluster can be wrongly reported as "Kerberos enabled". In that state the "Disable Kerberos" wizard does not work either. The solution is to remove the security marker directly in the Postgres/MySQL Ambari database. Run the command:

 update clusters set security_type='NONE';

and restart the Ambari server.
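A sketch for the default embedded PostgreSQL database shipped with Ambari (database name and credentials may differ in your installation):

# connect to the Ambari database as the postgres OS user
sudo -u postgres psql ambari

update clusters set security_type='NONE';
\q

ambari-server restart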

If the Wizard fails, details can be found in /var/log/ambari-server/ambari-server.log.
Possible problems:

Caused by: javax.naming.NoPermissionException: [LDAP: error code 50 - 00000005: SecErr: DSID-03152870, problem 4003 (INSUFF_ACCESS_RIGHTS), data 0
^@]; remaining name 'cn=mycluster-020419,CN=hadoop,DC=fyre,DC=net'
        at com.sun.jndi.ldap.LdapCtx.mapErrorCode(LdapCtx.java:3162)
        at com.sun.jndi.ldap.LdapCtx.processReturnCode(LdapCtx.java:3100)
        at com.sun.jndi.ldap.LdapCtx.processReturnCode(LdapCtx.java:2891)
        at com.sun.jndi.ldap.LdapCtx.c_createSubcontext(LdapCtx.java:812)
        at com.sun.jndi.toolkit.ctx.ComponentDirContext.p_createSubcontext(ComponentDirContext.java:341)
        at com.sun.jndi.toolkit.ctx.PartialCompositeDirContext.createSubcontext(PartialCompositeDirContext.java:268)
        at javax.naming.directory.InitialDirContext.createSubcontext(InitialDirContext.java:202)
        at org.apache.ambari.server.serveraction.kerberos.ADKerberosOperationHandler.createPrincipal(ADKerberosOperationHandler.java:336)
        ... 8 more

Solution: verify that the container administrator has the required privileges (see the Important note above about "this object and all descendant objects").

04 lut 2019 12:37:16,169  INFO [ambari-client-thread-2974] AmbariManagementControllerImpl:4173 - Received action execution request, clusterName=MyCluster, request=isCommand :true, action :null, command :KERBEROS_SERVICE_CHECK, inputs :{HAS_RESOURCE_FILTERS=true}, resourceFilters: [RequestResourceFilter{serviceName='KERBEROS', componentName='null', hostNames=[]}], exclusive: false, clusterName :MyCluster
04 lut 2019 12:37:16,303  WARN [ambari-client-thread-2974] ADKerberosOperationHandler:470 - Failed to communicate with the Active Directory at ldaps://verse1.fyre.net: simple bind failed: verse1.fyre.net:636
javax.naming.CommunicationException: simple bind failed: verse1.fyre.net:636 [Root exception is javax.net.ssl.SSLHandshakeException: sun.security.validator.ValidatorException: PKIX path building failed: sun.security.provider.certpath.SunCertPathBuilderException: unable to find valid certification path to requested target]
        at com.sun.jndi.ldap.LdapClient.authenticate(LdapClient.java:219)
        at com.sun.jndi.ldap.LdapCtx.connect(LdapCtx.java:2791)
        at com.sun.jndi.ldap.LdapCtx.<init>(LdapCtx.java:319)

Cause: incorrect AD certificate.
Solution: verify the Ambari server truststore, import a valid CA certificate again and restart ambari-server.
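A sketch of re-importing the certificate, assuming the truststore and alias created earlier in this guide:

keytool -delete -alias ambari-server -keystore /etc/openldap/truststore.jks
keytool -import -file /etc/openldap/ad.cert -alias ambari-server -keystore /etc/openldap/truststore.jks

# re-run setup-security, choose option [4] Setup truststore and point it at the same file
ambari-server setup-security
ambari-server restart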

Test1, authentication

Authenticate user1 on remote client

kinit user1@FYRE.NET

Log into HDP host

ssh user1@host1

Verify group membership; it should reflect the group membership in the Active Directory tree.

hdfs groups sb

sb@FYRE.NET : domain users

hdfs groups user1

user1 : domain users centosgroup

List HDFS directory

hdfs dfs -ls /tmp

Found 6 items
drwx------   - ambari-qa hdfs          0 2019-02-03 20:57 /tmp/ambari-qa
drwxr-xr-x   - hdfs      hdfs          0 2019-02-03 20:19 /tmp/entity-file-history
-rwxr-xr-x   3 hdfs      hdfs       1450 2019-02-04 13:48 /tmp/id10ac4abf_date480419
-rwxr-xr-x   3 hdfs      hdfs       1515 2019-02-03 20:53 /tmp/id10acdb97_date530319
drwxr-xr-x   - ambari-qa hdfs          0 2019-02-03 20:57 /tmp/tezsmokeinput
drwxr-xr-x   - ambari-qa hdfs          0 2019-02-04 13:49 /tmp/tezsmokeoutput

Invalidate Kerberos ticket

kdestroy

Try to access HDFS again

hdfs dfs -ls /tmp

9/02/04 19:25:12 WARN ipc.Client: Exception encountered while connecting to the server : javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]
ls: Failed on local exception: java.io.IOException: javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]; Host Details : local host is: "host1.fyre.ibm.com/172.16.158.91"; destination host is: "host2.fyre.ibm.com":8020; 

Test2, authorization

Test description

In Active Directory, create a group datascience and make user3 a member of the group. Then create the HDFS directory /apps/datalake, make user2 the owner of the directory, and allow the datascience group read-only access to the data.
Expected result

  • user1 - does not belong to the datascience group and has no access to data in /apps/datalake
  • user2 - data owner, can upload, modify or delete data in /apps/datalake
  • user3 - proud datascience member, can read data but is not allowed to modify it in any way.

Prepare datalake directory

As the hdfs user:

hdfs dfs -mkdir /datalake
hdfs dfs -chown user2:datascience /datalake
hdfs dfs -chmod 750 /datalake

Data owner, upload data

As the user2 user:

echo "Hello, confidential data" >data.txt
hdfs dfs -copyFromLocal data.txt /datalake
hdfs dfs -ls /datalake

Found 1 items
-rw-r--r--   3 user2 datascience         25 2019-02-10 20:19 /datalake/data.txt

Unauthorized access

As the malicious user1 user, try to steal the data:

hdfs dfs -cat /datalake/data.txt

cat: Permission denied: user=user1, access=EXECUTE, inode="/datalake/data.txt":user2:datascience:drwxr-x---

hdfs dfs -copyToLocal /datalake/data.txt

copyToLocal: Permission denied: user=user1, access=EXECUTE, inode="/datalake/data.txt":user2:datascience:drwxr-x---

Authorized read-only access

As the trusted user3 user:

hdfs dfs -cat /datalake/data.txt

Hello, confidential data

Try to delete data

hdfs dfs -rm -skipTrash /datalake/data.txt

rm: Permission denied: user=user3, access=EXECUTE, inode="/datalake/data.txt":hdfs:hdfs:drwx------

Try to upload new data

echo "Hello, confidential data" >data1.txt
hdfs dfs -copyFromLocal data1.txt /datalake

copyFromLocal: Permission denied: user=user3, access=WRITE, inode="/apps/datalake/data1.txt._COPYING_":user2:datascience:drwxr-x---

Test3, Java client and authorization

Java application

To prove that the cluster is also protected against raw Java API access, I created a simple Java application.

https://github.com/stanislawbartkowski/JavaHadoopClient

Just clone and build the project:

git clone https://github.com/stanislawbartkowski/JavaHadoopClient.git
cd JavaHadoopClient
mvn package

Review the env.rc source file and adjust it to your environment if necessary. The application should be executed inside the Hadoop cluster.

Data owner, upload data

As the user2 user, run upload.sh to create a text file in the confidential /apps/datalake directory:

./upload.sh

Deleted /apps/datalake/uploaded.txt
19/03/14 20:51:27 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Current user:user2@FYRE.NET (auth:KERBEROS)
19/03/14 20:51:30 WARN shortcircuit.DomainSocketFactory: The short-circuit local reads feature cannot be used because libhadoop cannot be loaded.
Create file /apps/datalake/uploaded.txt

Now read the file back using the hdfs command line.
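Assuming the path printed by upload.sh above:

hdfs dfs -cat /apps/datalake/uploaded.txt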

I was created directly but Hadoop Java Clinet
I'm good
How are you? How things are going?

Unauthorized access

As the malicious user1 user, try to download the confidential file /apps/datalake/uploaded.txt created by the data owner user2:

./download.sh

9/03/14 20:58:41 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Current user:user1@FYRE.NET (auth:KERBEROS)
19/03/14 20:58:44 WARN shortcircuit.DomainSocketFactory: The short-circuit local reads feature cannot be used because libhadoop cannot be loaded.
Output file /apps/datalake/uploaded.txt
Exception in thread "main" org.apache.hadoop.security.AccessControlException: Permission denied: user=user1, access=EXECUTE, inode="/apps/datalake/uploaded.txt":hdfs:hdfs:drwx------
	at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.check(FSPermissionChecker.java:353)
	at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkTraverse(FSPermissionChecker.java:292)
	at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:238)
	at org.apache.ranger.authorization.hadoop.RangerHdfsAuthorizer$RangerAccessControlEnforcer.checkDefaultEnforcer(RangerHdfsAuthorizer.java:538)

./list.sh

19/03/14 21:00:17 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Current user:user1@FYRE.NET (auth:KERBEROS)
19/03/14 21:00:20 WARN shortcircuit.DomainSocketFactory: The short-circuit local reads feature cannot be used because libhadoop cannot be loaded.
List directory content :/apps/datalake
Exception in thread "main" org.apache.hadoop.security.AccessControlException: Permission denied: user=user1, access=READ_EXECUTE, inode="/apps/datalake":hdfs:hdfs:drwx------
	at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.check(FSPermissionChecker.java:353)
	at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:252)

Authorized read-only access

As the trusted user3 user, try to read or download the data:

./download.sh

19/03/14 21:02:57 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Current user:user3@FYRE.NET (auth:KERBEROS)
19/03/14 21:03:00 WARN shortcircuit.DomainSocketFactory: The short-circuit local reads feature cannot be used because libhadoop cannot be loaded.
Output file /apps/datalake/uploaded.txt
I was created directly but Hadoop Java Clinet
I'm good
How are you? How things are going?

Try to create data. Beforehand, remove the previously created /apps/datalake/uploaded.txt file.

./upload.sh

rm: `/apps/datalake/uploaded.txt': No such file or directory
19/03/14 21:08:56 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Current user:user3@FYRE.NET (auth:KERBEROS)
19/03/14 21:08:59 WARN shortcircuit.DomainSocketFactory: The short-circuit local reads feature cannot be used because libhadoop cannot be loaded.
Create file /apps/datalake/uploaded.txt
Exception in thread "main" org.apache.hadoop.security.AccessControlException: Permission denied: user=user3, access=EXECUTE, inode="/apps/datalake/uploaded.txt":hdfs:hdfs:drwx------
	at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.check(FSPermissionChecker.java:353)
	at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkTraverse(FSPermissionChecker.java:292)
	at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:238)

Good news! Your HDP cluster is guarded by the combined forces of Kerberos and Active Directory.

HDP and MIT Kerberos

Securing the cluster with MIT Kerberos requires almost the same steps as Active Directory and provides the same level of protection. MIT Kerberos is only a KDC service provider; to get LDAP services, an additional solution has to be configured, for instance OpenLDAP.
Information on how to enable a RedHat/CentOS machine for MIT Kerberos and OpenLDAP: https://github.com/stanislawbartkowski/wikis/wiki/CentOS---Kerberos---LDAP

Server not found in Kerberos database while getting initial credentials

This problem happens when the Kerberos server hostname is not identical to the hostname used by the client machine, which can happen, for example, with a Kerberos server running in Docker. The client expects a kadmin/{hostname}@{REALM} principal in the KDC database. To fix the problem, add the expected kadmin principal manually. Example, assuming the Kerberos machine hostname kerberos.sb.com and the realm CENTOS.COM.REALM:

kadmin -p admin/admin

Authenticating as principal admin/admin with password.
Password for admin/admin@CENTOS.COM.REALM: 

addprinc -randkey kadmin/kerberos.sb.com@CENTOS.COM.REALM

WARNING: no policy specified for kadmin/kerberos.sb.com@CENTOS.COM.REALM; defaulting to no policy
Principal "kadmin/kerberos.sb.com@CENTOS.COM.REALM" created.

kinit troubleshooting

To get a detailed trace of what kinit is doing, set the KRB5_TRACE environment variable before running it:

export KRB5_TRACE=/dev/stdout
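For example, tracing a single kinit attempt (user1@FYRE.NET is the principal used in Test1; any principal will do):

KRB5_TRACE=/dev/stdout kinit user1@FYRE.NET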

Zookeeper cannot start after Kerberization

java.io.IOException: Could not configure server because SASL configuration did not allow the  ZooKeeper server to authenticate itself properly

Add the following property to the krb5.conf template:

udp_preference_limit = 1
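In the Ambari krb5.conf template this setting belongs in the [libdefaults] section; a minimal sketch of the resulting fragment:

[libdefaults]
  # ...existing defaults rendered from the template...
  udp_preference_limit = 1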
