HDP 2.6.5/3.1 and Active Directory
Enabling HDP for Kerberos/Active Directory is an easy and straightforward task, but it can be a nightmare the first time. One can spend a lot of time guessing what to do next or why problems come up. Below I describe, step by step, how to secure an HDP cluster with Active Directory.
The procedure is described here: https://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.6.5/bk_security/content/_kerberos_overview.html
Test environment:
- Windows 2016 and Active Directory (hostname: verse1.fyre.net, domain name: FYRE.NET)
- CentOS 7.6 cluster, already controlled by Active Directory (https://github.com/stanislawbartkowski/wikis/wiki/CentOS---Active-Directory)
- HDP 2.6.5/3.1 installed
To avoid mixing Hadoop principals with regular Windows users, it is good practice to create a separate container in the Active Directory tree and a dedicated admin account to manage the container. The admin account should have delegated control of "Create, delete, and manage user accounts" in the container. The admin account is needed only while enabling Kerberos and whenever an HDP service is added or removed. If the HDP layout is stable, the account can be disabled to avoid any security risk.
In this example:
- Container DN : CN=hadoop,DC=fyre,DC=net
- Admin account : CN=hadoopadmin,CN=hadoop,DC=fyre,DC=net
- Admin Kerberos principal: hadoopadmin@FYRE.NET
Important
If the container administrator is created, make sure the permissions for the container are granted "for this object and all descendant objects".
Check that the AD server is responding on the non-secure LDAP port (389):
nc -zv verse1.fyre.ibm.com 389
Ncat: Version 7.50 ( https://nmap.org/ncat )
Ncat: Connected to 9.30.54.109:389.
Ncat: 0 bytes sent, 0 bytes received in 0.01 seconds.
Test that a non-secure LDAP connection is working. Run the command:
ldapsearch -b "cn=hadoop,dc=fyre,dc=net" -W -D hadoopadmin@fyre.net -H ldap://verse1.fyre.net
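If the full container listing is too verbose, the search can be narrowed with an LDAP filter. A minimal sketch, binding with the same admin account and filtering on its cn:
ldapsearch -b "cn=hadoop,dc=fyre,dc=net" -W -D hadoopadmin@fyre.net -H ldap://verse1.fyre.net "(cn=hadoopadmin)"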
Check that the AD server also responds on the secure LDAPS port (636) and inspect the certificate chain it presents:
openssl s_client -connect verse1.fyre.net:636 -showcerts
Obtain the Active Directory domain certificate authority (CA) certificate.
Windows:
Certificate Authority -> (domain) -> Properties -> General -> View Certificates -> Details -> Copy to File -> Base-64 encoded X.509
The certificate is a text file and can be safely copied and pasted.
-----BEGIN CERTIFICATE-----
MIIF0zCCBLugAwIBAgITSAAAAAS50zoLbfaRKwAAAAAABDANBwwwkdleoe12l2l3
.............
1qaw23w044kfkBh/XIac6H4qHuYmH9cHX3wl3IVpUx8R/Mharls0SRMpvG2hk8x0
/pseem9sgQ==
-----END CERTIFICATE-----
Copy the Active Directory CA certificate to /etc/openldap/ad.cert (any other location can be used).
vi /etc/openldap/ad.cert
(copy and paste)
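Before pointing any tools at the file, it is worth checking that the pasted certificate decodes cleanly. A quick sanity check with openssl, assuming the file was saved as /etc/openldap/ad.cert as above:
openssl x509 -in /etc/openldap/ad.cert -noout -subject -issuer -dates
The subject and issuer should match the AD domain CA and the validity dates should cover the current date.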
Modify /etc/openldap/ldap.conf file
vi /etc/openldap/ldap.conf
...
#TLS_CACERTDIR /etc/openldap/certs
TLS_CACERT /etc/openldap/ad.cert
...
Run ldapsearch again using a secure connection (the only difference from the non-secure command is the ldaps:// scheme):
ldapsearch -b "cn=hadoop,dc=fyre,dc=net" -W -D hadoopadmin@fyre.net -H ldaps://verse1.fyre.net
The command should yield the same result as the non-secure one.
If the non-secure connection works but the secure one fails, the problem is most likely an invalid certificate. Run ldapsearch again with the debug option and try to figure out the reason:
ldapsearch -b "cn=hadoop,dc=fyre,dc=net" -W -D hadoopadmin@fyre.net -H ldaps://verse1.fyre.net -d 1
Import the AD CA certificate into a Java truststore to be used by Ambari:
keytool -import -file /etc/openldap/ad.cert -alias ambari-server -keystore /etc/openldap/truststore.jks
Instead of /etc/openldap/truststore.jks, any other location can be used.
Verify the content of the keystore using the command:
keytool -list -keystore /etc/openldap/truststore.jks
Enter keystore password:
Keystore type: jks
Keystore provider: SUN
Your keystore contains 1 entry
ambari-server, 2019-02-04, trustedCertEntry,
Certificate fingerprint (SHA1): 67:E2:01:ED:36:1C:1F:4B:AA:2C:B5:07:D1:92:E6:5E:B3:70:ED:8E
Register the truststore with Ambari:
ambari-server setup-security
Using python /usr/bin/python
Security setup options...
===========================================================================
Choose one of the following options:
[1] Enable HTTPS for Ambari server.
[2] Encrypt passwords stored in ambari.properties file.
[3] Setup Ambari kerberos JAAS configuration.
[4] Setup truststore.
[5] Import certificate to truststore.
===========================================================================
Enter choice, (1-5): 4
Do you want to configure a truststore [y/n] (y)? y
TrustStore type [jks/jceks/pkcs12] (jks):
Path to TrustStore file :/etc/openldap/truststore.jks
Password for TrustStore:
Re-enter password:
Ambari Server 'setup-security' completed successfully.
Restart the Ambari server:
ambari-server restart
Before running the Enable Kerberos Wizard, install the JCE unlimited strength policy files on all hosts and review the Active Directory prerequisites:
https://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.6.5/bk_security/content/_installing_the_jce.html
https://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.6.5/bk_security/content/_use_an_existing_active_directory_domain.html
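Enabling Kerberos can fail later with cryptic GSS errors if the JDK used by Ambari and the cluster has restricted cryptography. A quick check, assuming the cluster JDK is the one on the PATH (a sketch, not part of the official procedure):
jrunscript -e 'print(javax.crypto.Cipher.getMaxAllowedKeyLength("AES"))'
A value of 2147483647 means the unlimited strength policy is in effect; a value of 128 means the JCE policy files still have to be installed.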
The Enable Kerberos Wizard asks for the following information:

Information | Description | Example |
---|---|---|
KDC host | AD hostname | verse1.fyre.net |
Realm name | AD/Kerberos realm name | FYRE.NET |
Secure LDAP URL | AD/LDAP URL | ldaps://verse1.fyre.net |
AD container | DN of the previously created AD container | CN=hadoop,DC=fyre,DC=net |
KAdmin host | AD hostname | verse1.fyre.net |
Admin principal | Admin account for the AD container | hadoopadmin@FYRE.NET |
Admin password | Password of the admin account | |
Lesson learned: if the Kerberos Wizard is stopped or cancelled in the middle of the process, the cluster can be reported as "Kerberos enabled", which is wrong, and the "Disable Kerberos" Wizard does not work either. The solution is to reset the security marker directly in the Postgres/MySQL Ambari database. Run the command:
update clusters set security_type='NONE';
and restart the Ambari server.
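For example, with the default embedded Postgres database shipped with Ambari (a sketch; the database name and authentication may differ in a custom installation):
sudo -u postgres psql ambari -c "update clusters set security_type='NONE';"
ambari-server restart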
If the Wizard fails, the log files can be found in /var/log/ambari-server/ambari-server.log.
Possible problems:
Caused by: javax.naming.NoPermissionException: [LDAP: error code 50 - 00000005: SecErr: DSID-03152870, problem 4003 (INSUFF_ACCESS_RIGHTS), data 0
^@]; remaining name 'cn=mycluster-020419,CN=hadoop,DC=fyre,DC=net'
at com.sun.jndi.ldap.LdapCtx.mapErrorCode(LdapCtx.java:3162)
at com.sun.jndi.ldap.LdapCtx.processReturnCode(LdapCtx.java:3100)
at com.sun.jndi.ldap.LdapCtx.processReturnCode(LdapCtx.java:2891)
at com.sun.jndi.ldap.LdapCtx.c_createSubcontext(LdapCtx.java:812)
at com.sun.jndi.toolkit.ctx.ComponentDirContext.p_createSubcontext(ComponentDirContext.java:341)
at com.sun.jndi.toolkit.ctx.PartialCompositeDirContext.createSubcontext(PartialCompositeDirContext.java:268)
at javax.naming.directory.InitialDirContext.createSubcontext(InitialDirContext.java:202)
at org.apache.ambari.server.serveraction.kerberos.ADKerberosOperationHandler.createPrincipal(ADKerberosOperationHandler.java:336)
... 8 more
Solution: verify that the container administrator has the required privileges.
04 lut 2019 12:37:16,169 INFO [ambari-client-thread-2974] AmbariManagementControllerImpl:4173 - Received action execution request, clusterName=MyCluster, request=isCommand :true, action :null, command :KERBEROS_SERVICE_CHECK, inputs :{HAS_RESOURCE_FILTERS=true}, resourceFilters: [RequestResourceFilter{serviceName='KERBEROS', componentName='null', hostNames=[]}], exclusive: false, clusterName :MyCluster
04 lut 2019 12:37:16,303 WARN [ambari-client-thread-2974] ADKerberosOperationHandler:470 - Failed to communicate with the Active Directory at ldaps://verse1.fyre.net: simple bind failed: verse1.fyre.net:636
javax.naming.CommunicationException: simple bind failed: verse1.fyre.net:636 [Root exception is javax.net.ssl.SSLHandshakeException: sun.security.validator.ValidatorException: PKIX path building failed: sun.security.provider.certpath.SunCertPathBuilderException: unable to find valid certification path to requested target]
at com.sun.jndi.ldap.LdapClient.authenticate(LdapClient.java:219)
at com.sun.jndi.ldap.LdapCtx.connect(LdapCtx.java:2791)
at com.sun.jndi.ldap.LdapCtx.<init>(LdapCtx.java:319)
Cause: incorrect AD certificate.
Solution: verify the ambari-server truststore, import a valid CA certificate again, and restart the ambari-server.
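A possible repair sequence, reusing the file and alias from the steps above (a sketch; the keystore password is prompted for):
keytool -delete -alias ambari-server -keystore /etc/openldap/truststore.jks
keytool -import -file /etc/openldap/ad.cert -alias ambari-server -keystore /etc/openldap/truststore.jks
ambari-server restart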
Once Kerberos is enabled, test the cluster from a client. Authenticate as user1 on a remote client:
kinit user1@FYRE.NET
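Verify that the ticket was granted; the default principal should be user1@FYRE.NET and a krbtgt/FYRE.NET@FYRE.NET entry should be listed:
klist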
Log into HDP host
ssh user1@host1
Verify group membership; it should reflect the group membership in the Active Directory tree.
hdfs groups sb
sb@FYRE.NET : domain users
hdfs groups user1
user1 : domain users centosgroup
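The mapping from a Kerberos principal to the local short name is controlled by the hadoop.security.auth_to_local rules generated by the Kerberos Wizard. If group resolution looks wrong, the rules can be inspected with a quick check:
hdfs getconf -confKey hadoop.security.auth_to_local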
List HDFS directory
hdfs dfs -ls /tmp
Found 6 items
drwx------ - ambari-qa hdfs 0 2019-02-03 20:57 /tmp/ambari-qa
drwxr-xr-x - hdfs hdfs 0 2019-02-03 20:19 /tmp/entity-file-history
-rwxr-xr-x 3 hdfs hdfs 1450 2019-02-04 13:48 /tmp/id10ac4abf_date480419
-rwxr-xr-x 3 hdfs hdfs 1515 2019-02-03 20:53 /tmp/id10acdb97_date530319
drwxr-xr-x - ambari-qa hdfs 0 2019-02-03 20:57 /tmp/tezsmokeinput
drwxr-xr-x - ambari-qa hdfs 0 2019-02-04 13:49 /tmp/tezsmokeoutput
Invalidate Kerberos ticket
kdestroy
Try to access HDFS again
hdfs dfs -ls /tmp
9/02/04 19:25:12 WARN ipc.Client: Exception encountered while connecting to the server : javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]
ls: Failed on local exception: java.io.IOException: javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]; Host Details : local host is: "host1.fyre.ibm.com/172.16.158.91"; destination host is: "host2.fyre.ibm.com":8020;
In Active Directory, create a group datascience and make user3 a member of the group. Then create an HDFS directory /apps/datalake, make user2 the owner of the directory, and give the datascience group read-only access to the data.
Expected result
- user1 - does not belong to the datascience group and has no access to data in /apps/datalake
- user2 - the data owner; can upload, modify, or delete data in /apps/datalake
- user3 - a proud datascience member; can read data but is not allowed to modify it in any way
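Before testing, it may be worth confirming that the new AD group is already visible on the cluster nodes. A sketch, assuming the nodes are joined to AD via SSSD as in the linked CentOS/Active Directory setup (sss_cache is part of the sssd tools and needs root):
sudo sss_cache -E    # flush the SSSD cache so the new AD group is picked up
id user3             # the datascience group should appear in the group list
hdfs groups user3    # the Hadoop-side view of the same membership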
As hdfs user:
hdfs dfs -mkdir /datalake
hdfs dfs -chown user2:datascience /datalake
hdfs dfs -chmod 750 /datalake
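To confirm the ownership and mode just set, a quick check:
hdfs dfs -ls -d /datalake
The entry should show user2 as the owner, datascience as the group, and mode drwxr-x---.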
As user2 user:
echo "Hello, confidential data" >data.txt
hdfs dfs -copyFromLocal data.txt /datalake
hdfs dfs -ls /datalake
Found 1 items
-rw-r--r-- 3 user2 datascience 25 2019-02-10 20:19 /datalake/data.txt
As malicious user1 user, try to steal data:
hdfs dfs -cat /datalake/data.txt
cat: Permission denied: user=user1, access=EXECUTE, inode="/datalake/data.txt":user2:datascience:drwxr-x---
hdfs dfs -copyToLocal /datalake/data.txt
copyToLocal: Permission denied: user=user1, access=EXECUTE, inode="/datalake/data.txt":user2:datascience:drwxr-x---
As a trusted user3 user:
hdfs dfs -cat /datalake/data.txt
Hello, confidential data
Try to delete data
hdfs dfs -rm -skipTrash /datalake/data.txt
rm: Permission denied: user=user3, access=EXECUTE, inode="/datalake/data.txt":hdfs:hdfs:drwx------
Try to upload new data
echo "Hello, confidential data" >data1.txt
hdfs dfs -copyFromLocal data1.txt /datalake
copyFromLocal: Permission denied: user=user3, access=WRITE, inode="/apps/datalake/data1.txt._COPYING_":user2:datascience:drwxr-x---
To prove that the cluster is also protected against raw Java API access, I created a simple Java application.
https://github.com/stanislawbartkowski/JavaHadoopClient
Clone and build the project:
git clone https://github.com/stanislawbartkowski/JavaHadoopClient.git
cd JavaHadoopClient
mvn package
Review the env.rc source file and adjust it to your environment if necessary. The application should be executed inside the Hadoop cluster.
As user2 user, run upload.sh to create a text file in the confidential /apps/datalake directory:
./upload.sh
Deleted /apps/datalake/uploaded.txt
19/03/14 20:51:27 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Current user:user2@FYRE.NET (auth:KERBEROS)
19/03/14 20:51:30 WARN shortcircuit.DomainSocketFactory: The short-circuit local reads feature cannot be used because libhadoop cannot be loaded.
Create file /apps/datalake/uploaded.txt
Now read the file back using the hdfs command line (hdfs dfs -cat /apps/datalake/uploaded.txt):
I was created directly but Hadoop Java Clinet
I'm good
How are you? How things are going?
As malicious user1 user, try to download confidential data /apps/datalake/uploaded.txt created by data owner user2
./download.sh
9/03/14 20:58:41 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Current user:user1@FYRE.NET (auth:KERBEROS)
19/03/14 20:58:44 WARN shortcircuit.DomainSocketFactory: The short-circuit local reads feature cannot be used because libhadoop cannot be loaded.
Output file /apps/datalake/uploaded.txt
Exception in thread "main" org.apache.hadoop.security.AccessControlException: Permission denied: user=user1, access=EXECUTE, inode="/apps/datalake/uploaded.txt":hdfs:hdfs:drwx------
at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.check(FSPermissionChecker.java:353)
at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkTraverse(FSPermissionChecker.java:292)
at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:238)
at org.apache.ranger.authorization.hadoop.RangerHdfsAuthorizer$RangerAccessControlEnforcer.checkDefaultEnforcer(RangerHdfsAuthorizer.java:538)
Try to list the directory content:
./list.sh
19/03/14 21:00:17 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Current user:user1@FYRE.NET (auth:KERBEROS)
19/03/14 21:00:20 WARN shortcircuit.DomainSocketFactory: The short-circuit local reads feature cannot be used because libhadoop cannot be loaded.
List directory content :/apps/datalake
Exception in thread "main" org.apache.hadoop.security.AccessControlException: Permission denied: user=user1, access=READ_EXECUTE, inode="/apps/datalake":hdfs:hdfs:drwx------
at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.check(FSPermissionChecker.java:353)
at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:252)
As a trusted user3 user, try to read or download data.
./download.sh
19/03/14 21:02:57 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Current user:user3@FYRE.NET (auth:KERBEROS)
19/03/14 21:03:00 WARN shortcircuit.DomainSocketFactory: The short-circuit local reads feature cannot be used because libhadoop cannot be loaded.
Output file /apps/datalake/uploaded.txt
I was created directly but Hadoop Java Clinet
I'm good
How are you? How things are going?
Try to create data. Beforehand, remove the previously created /apps/datalake/uploaded.txt file.
./upload.sh
rm: `/apps/datalake/uploaded.txt': No such file or directory
19/03/14 21:08:56 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Current user:user3@FYRE.NET (auth:KERBEROS)
19/03/14 21:08:59 WARN shortcircuit.DomainSocketFactory: The short-circuit local reads feature cannot be used because libhadoop cannot be loaded.
Create file /apps/datalake/uploaded.txt
Exception in thread "main" org.apache.hadoop.security.AccessControlException: Permission denied: user=user3, access=EXECUTE, inode="/apps/datalake/uploaded.txt":hdfs:hdfs:drwx------
at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.check(FSPermissionChecker.java:353)
at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkTraverse(FSPermissionChecker.java:292)
at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:238)
Securing the cluster with MIT Kerberos requires almost the same steps as for Active Directory and provides the same level of protection. MIT Kerberos is only a KDC service provider; to get LDAP services, an additional solution, for instance OpenLDAP, has to be configured.
Information on how to enable a RedHat/CentOS machine for MIT Kerberos and OpenLDAP: https://github.com/stanislawbartkowski/wikis/wiki/CentOS---Kerberos---LDAP
A possible problem: kadmin operations fail when the Kerberos machine hostname is not identical to the hostname used by the client machine. This can happen, for example, with a Dockerized Kerberos. The client machine expects a kadmin/{hostname}@{REALM} principal in the KDC database.
To sort out this problem, the expected kadmin principal should be added manually.
Example assuming kerberos.sb.com as the Kerberos machine hostname and CENTOS.COM.REALM as the realm:
kadmin -p admin/admin
Authenticating as principal admin/admin with password.
Password for admin/admin@CENTOS.COM.REALM:
addprinc -randkey kadmin/kerberos.sb.com@CENTOS.COM.REALM
WARNING: no policy specified for kadmin/kerberos.sb.com@CENTOS.COM.REALM; defaulting to no policy
Principal "kadmin/kerberos.sb.com@CENTOS.COM.REALM" created.
To trace Kerberos client calls while debugging such problems, set the KRB5_TRACE environment variable:
export KRB5_TRACE=/dev/stdout
Another possible problem (ZooKeeper fails to start):
java.io.IOException: Could not configure server because SASL configuration did not allow the ZooKeeper server to authenticate itself properly
Solution: add the following property to the krb5.conf template.
udp_preference_limit = 1
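In the generated krb5.conf the property belongs in the [libdefaults] section; a minimal sketch of the relevant fragment (the remaining settings are left as produced by Ambari):
[libdefaults]
  ...
  udp_preference_limit = 1
Forcing TCP this way avoids large Kerberos tickets being truncated over UDP.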