HBase Ranger - stanislawbartkowski/hdpactivedirectory GitHub Wiki
Make sure that AD test users are prepared according to https://github.com/stanislawbartkowski/hdpactivedirectory/blob/master/README.md#ad-users-and-groups-used-for-testing.
| User | Group | Role |
|---|---|---|
| user1 | - | Malicious user, access blocked |
| user2 | dataadmin | Data administrator, can read and modify the data |
| user3 | datascience | Data consumer, can read but cannot modify the data |
Enable Ranger plugin for HBase.
Cloudera.

As the hbase superuser, create the HBase namespace datalake.
In a CPD cluster, the hbase user cannot be accessed directly. Create an alternative hbase superuser: https://github.com/stanislawbartkowski/wikis/wiki/IBM-BigSQL-and-Cloudera#hbase
su - hbase
kinit
hbase shell
create_namespace 'datalake'
- user2:dataadmin - can create tables and load data into any HBase table in the datalake namespace
- user3:datascience - is allowed only to read data in the datalake namespace, cannot modify anything
- user1:(no group) - is denied any access to the datalake namespace
In the policy, enter datalake:* as the HBase table. The policy should be defined at the group level.

hbase shell
create 'datalake:testdata','cf1'
put 'datalake:testdata',1,'cf1:name','Hello'
put 'datalake:testdata',1,'cf1:number',1
put 'datalake:testdata',2,'cf1:name','Hello2'
put 'datalake:testdata',2,'cf1:number',2
put 'datalake:testdata',3,'cf1:name','Hello3'
put 'datalake:testdata',3,'cf1:number',3
scan 'datalake:testdata'
ROW COLUMN+CELL
1 column=cf1:name, timestamp=1561061404909, value=Hello
1 column=cf1:number, timestamp=1561061417353, value=1
2 column=cf1:name, timestamp=1561061498304, value=Hello2
2 column=cf1:number, timestamp=1561061498347, value=2
3 column=cf1:name, timestamp=1561061498419, value=Hello3
3 column=cf1:number, timestamp=1561061500005, value=3
3 row(s)
Took 0.8145 seconds
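The interactive session above can also be replayed without typing each command. The sketch below is one possible way, assuming the `hbase` client is on the PATH and a valid Kerberos ticket for user2 is present. The function name `gen_testdata_script` is made up for this example. It only prints the shell commands, so the output can be inspected before it is piped to `hbase shell -n` (non-interactive mode).

```shell
#!/bin/sh
# Print the HBase shell commands that create and populate datalake:testdata.
# gen_testdata_script is a hypothetical helper name, not part of HBase.
gen_testdata_script() {
    cat <<'EOF'
create 'datalake:testdata','cf1'
put 'datalake:testdata',1,'cf1:name','Hello'
put 'datalake:testdata',1,'cf1:number',1
put 'datalake:testdata',2,'cf1:name','Hello2'
put 'datalake:testdata',2,'cf1:number',2
put 'datalake:testdata',3,'cf1:name','Hello3'
put 'datalake:testdata',3,'cf1:number',3
scan 'datalake:testdata'
EOF
}

# Usage (as user2, after kinit):
#   gen_testdata_script | hbase shell -n
```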
hbase shell
scan 'datalake:testdata'
ROW COLUMN+CELL
1 column=cf1:name, timestamp=1561061404909, value=Hello
1 column=cf1:number, timestamp=1561061417353, value=1
2 column=cf1:name, timestamp=1561061498304, value=Hello2
2 column=cf1:number, timestamp=1561061498347, value=2
3 column=cf1:name, timestamp=1561061498419, value=Hello3
3 column=cf1:number, timestamp=1561061500005, value=3
3 row(s)
Took 0.8145 seconds
Try to modify data
put 'datalake:testdata',3,'cf1:number',3
ERROR: org.apache.hadoop.hbase.security.AccessDeniedException: Insufficient permissions for user ‘[email protected]',action: put, tableName:datalake:testdata, family:cf1, column: number
Try to disable table
disable 'datalake:testdata'
ERROR: org.apache.hadoop.hbase.security.AccessDeniedException: Insufficient permissions for user '[email protected]' (action=create)
Try to create another table
create 'datalake:mytable','cf1'
ERROR: org.apache.hadoop.hbase.security.AccessDeniedException: Insufficient permissions for user '[email protected]' (action=create)
hbase shell
scan 'datalake:testdata'
ROW COLUMN+CELL
ERROR: org.apache.hadoop.hbase.security.AccessDeniedException: Insufficient permissions for user ‘[email protected]',action: scannerOpen, tableName:datalake:testdata, family:cf1.
There is a bug in HDP 3.1: the HBase REST API does not impersonate users, and all operations are executed as the hbase user. As a result, any user with access to the HBase REST API server has full privileges in HBase regardless of any security settings. The fix is to replace the hbase-rest jar delivered with the standard HDP payload with the latest hbase-rest version. The test was conducted using rel/2.2.0 GitHub version.
The same problem persists in Cloudera, CDP 7.1.4. The workaround below was tested only for HDP.
git clone https://github.com/apache/hbase.git -b branch-2.0
cd hbase
mvn package -DskipTests
(as root user)
cd /usr/hdp/3.1.0.0-78/hbase/lib
(archive the existing hbase-rest jar file)
mkdir arch
mv hbase-rest-2.0.2.3.1.0.0-78.jar arch/
unlink hbase-rest.jar
(assuming /home/hbase/hbase as cloned Git repository)
ln -s /home/hbase/hbase/hbase-rest/target/hbase-rest-2.0.6-SNAPSHOT.jar hbase-rest.jar
Custom hbase-site.xml
| Parameter | Value |
|---|---|
| hbase.rest.support.proxyuser | true |
| hbase.rest.authentication.type | kerberos |
| hbase.rest.authentication.kerberos.keytab | /etc/security/keytabs/spnego.service.keytab |
| hbase.rest.authentication.kerberos.principal | <appropriate principal> |
| hbase.rest.keytab.file | /etc/security/keytabs/hbase.service.keytab |
| hbase.rest.kerberos.principal | <appropriate principal> |
HDFS, custom core-site.xml
| Parameter | Value |
|---|---|
| hadoop.proxyuser.hbase.hosts | * |
| hadoop.proxyuser.hbase.groups | * |
Restart all affected services.
- HDP:
The HBase REST API is not enabled by default. It must be started manually whenever the cluster is started.
Run as the hbase user on the host where HBase Master is installed.
/usr/hdp/current/hbase-master/bin/hbase-daemon.sh start rest -p 9090
Wait several minutes until the service is ready.
HDP: Verify that the server is responding.
nc -vz localhost 9090
Ncat: Version 7.50 ( https://nmap.org/ncat )
Ncat: Connected to 127.0.0.1:9090.
Ncat: 0 bytes sent, 0 bytes received in 0.01 seconds.
CDP: default port is 20550
nc -zv pimiento3 20550
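Instead of waiting a fixed time, the port check can be repeated until the REST server answers. A minimal sketch, assuming `nc` supports the `-z` option as in the commands above. The function name `wait_for_port` and its default retry values are invented for this example.

```shell
#!/bin/sh
# Poll host:port with nc until it accepts a connection or attempts run out.
# Arguments: host, port, number of attempts (default 30), delay in seconds (default 10).
# wait_for_port is a hypothetical helper name.
wait_for_port() {
    host=$1
    port=$2
    attempts=${3:-30}
    delay=${4:-10}
    i=0
    while [ "$i" -lt "$attempts" ]; do
        if nc -z "$host" "$port" 2>/dev/null; then
            return 0
        fi
        i=$((i + 1))
        sleep "$delay"
    done
    # Port never opened within the allowed attempts.
    return 1
}

# Usage (HDP example from this page):
#   wait_for_port localhost 9090 && echo "HBase REST is up"
```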
curl -ik --negotiate -u : -H "Accept: text/xml" -X GET "http://hurds1.fyre.ibm.com:9090/version"
...............
<?xml version="1.0" encoding="UTF-8" standalone="yes"?><Version JVM="Oracle Corporation 1.8.0_201-25.201-b09" Jersey="" OS="Linux 3.10.0-957.10.1.el7.x86_64 amd64" REST="0.0.3" Server="jetty/9.3.25.v20180904"/>
List all tables.
curl -ik --negotiate -u : -H "Accept: text/xml" -X GET "http://hurds1.fyre.ibm.com:9090"
Stop or restart
/usr/hdp/current/hbase-master/bin/hbase-daemon.sh stop rest -p 9090
/usr/hdp/current/hbase-master/bin/hbase-daemon.sh restart rest -p 9090
The code samples below assume that the HBase REST API server node is hurds1.fyre.ibm.com and that the server is listening on port 9090. Replace them with the values corresponding to your environment.
User1 is a malicious user and should be denied any access to datalake tables.
Authenticate as user1
curl -ik --negotiate -u : -X GET -H "Accept: text/xml" "http://hurds1.fyre.ibm.com:9090/datalake:testdata/*"
............
<body>
<h2>HTTP ERROR: 500</h2>
<p>Problem accessing /datalake:testdata/*. Reason:
<pre> Request failed.</pre></p>
<hr />
</body>
User2 is a data administrator and should be able to read and modify the data.
Authenticate as user2.
Read datalake:testdata. The returned data is Base64-encoded.
curl -ik --negotiate -u : -X GET -H "Accept: text/xml" "http://hurds1:9090/datalake:testdata/*"
............
HTTP/1.1 200 OK
........
<?xml version="1.0" encoding="UTF-8" standalone="yes"?><CellSet><Row key="MQ=="><Cell column="Y2YxOm5hbWU=" timestamp="1561061404909">SGVsbG8=</Cell><Cell column="Y2YxOm51bWJlcg==" timestamp="1561061417353">MQ==</Cell></Row><Row key="Mg=="><Cell column="Y2YxOm5hbWU=" timestamp="1561061498304">SGVsbG8y</Cell><Cell column="Y2YxOm51bWJlcg==" timestamp="1561061498347">Mg==</Cell></Row><Row key="Mw=="><Cell column="Y2YxOm5hbWU=" timestamp="1561061498419">SGVsbG8z</Cell><Cell column="Y2YxOm51bWJlcg==" timestamp="1561063342967">Mw==</Cell></Row><Row key="NA=="><Cell column="Y2YxOm5hbWU=" timestamp="1564786098120">SGVsbG80</Cell></Row><Row key="NAo="><Cell column="Y2YxOm5hbWUK" timestamp="1564785434372">SGVsbG80Cg==</Cell></Row></CellSet>
.........
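Row keys, column names and values in the response can be decoded on the command line to compare them with the hbase shell scan. A small sketch, assuming the GNU coreutils `base64` tool is available. The helper name `decode_b64` is made up for this example.

```shell
#!/bin/sh
# Decode a single Base64 string; decode_b64 is a hypothetical helper name.
decode_b64() {
    printf '%s' "$1" | base64 -d
}

# Values taken from the first row of the response above.
decode_b64 'MQ=='; echo            # row key -> 1
decode_b64 'Y2YxOm5hbWU='; echo    # column  -> cf1:name
decode_b64 'SGVsbG8='; echo        # value   -> Hello
```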
Modify the data. The curl command below is the equivalent of the hbase shell command: put 'datalake:testdata',4,'cf1:name','Hello4'
curl -ik --negotiate -u : -X PUT -H "Accept: text/xml" -H "Content-Type: text/xml" -d '<?xml version="1.0" encoding="UTF-8" standalone="yes"?><CellSet><Row key="NA=="><Cell column="Y2YxOm5hbWU=">SGVsbG80</Cell></Row></CellSet>' "http://hurds1:9090/datalake:testdata/1"
........
HTTP/1.1 200 OK
WWW-Authenticate: Negotiate
........
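The Base64 strings in the request body do not have to be prepared by hand. The sketch below shows one way to build the same CellSet payload from plain values, assuming the `base64` tool is available. The names `encode_b64` and `build_cellset` are invented for this example.

```shell
#!/bin/sh
# Encode a string to Base64. printf is used instead of echo so that no
# trailing newline is encoded; tr removes line wrapping for long values.
encode_b64() {
    printf '%s' "$1" | base64 | tr -d '\n'
}

# Build the CellSet XML for a single cell.
# Arguments: row key, column (family:qualifier), value.
# build_cellset is a hypothetical helper name.
build_cellset() {
    printf '<?xml version="1.0" encoding="UTF-8" standalone="yes"?><CellSet><Row key="%s"><Cell column="%s">%s</Cell></Row></CellSet>' \
        "$(encode_b64 "$1")" "$(encode_b64 "$2")" "$(encode_b64 "$3")"
}

# Usage: pass the generated payload to curl with -d
#   curl -ik --negotiate -u : -X PUT -H "Accept: text/xml" -H "Content-Type: text/xml" \
#        -d "$(build_cellset 4 cf1:name Hello4)" "http://hurds1:9090/datalake:testdata/1"
```

Encoding with `echo 4 | base64` instead would yield `NAo=`, because the trailing newline is encoded as well. This is likely the origin of the extra row with key `NAo=` visible in the responses on this page.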
Using hbase shell, verify that the expected data was added or modified.
hbase shell
scan 'datalake:testdata'
ROW COLUMN+CELL
1 column=cf1:name, timestamp=1561061404909, value=Hello
1 column=cf1:number, timestamp=1561061417353, value=1
2 column=cf1:name, timestamp=1561061498304, value=Hello2
2 column=cf1:number, timestamp=1561061498347, value=2
3 column=cf1:name, timestamp=1561061498419, value=Hello3
3 column=cf1:number, timestamp=1561063342967, value=3
4 column=cf1:name, timestamp=1564828239454, value=Hello4
User3 is a data scientist and should be allowed to read the datalake but not to modify the data.
Authenticate as user3.
curl -ik --negotiate -u : -X GET -H "Accept: text/xml" "http://hurds1:9090/datalake:testdata/*"
......
<?xml version="1.0" encoding="UTF-8" standalone="yes"?><CellSet><Row key="MQ=="><Cell column="Y2YxOm5hbWU=" timestamp="1561061404909">SGVsbG8=</Cell><Cell column="Y2YxOm51bWJlcg==" timestamp="1561061417353">MQ==</Cell></Row><Row key="Mg=="><Cell column="Y2YxOm5hbWU=" timestamp="1561061498304">SGVsbG8y</Cell><Cell column="Y2YxOm51bWJlcg==" timestamp="1561061498347">Mg==</Cell></Row><Row key="Mw=="><Cell column="Y2YxOm5hbWU=" timestamp="1561061498419">SGVsbG8z</Cell><Cell column="Y2YxOm51bWJlcg==" timestamp="1561063342967">Mw==</Cell></Row><Row key="NA=="><Cell column="Y2YxOm5hbWU=" timestamp="1564828239454">SGVsbG80</Cell></Row><Row key="NAo="><Cell column="Y2YxOm5hbWUK" timestamp="1564785434372">SGVsbG80Cg==</Cell></Row></CellSet>
.....
Try to modify the data.
curl -ik --negotiate -u : -X PUT -H "Accept: text/xml" -H "Content-Type: text/xml" -d '<?xml version="1.0" encoding="UTF-8" standalone="yes"?><CellSet><Row key="NA=="><Cell column="Y2YxOm5hbWU=">SGVsbG80</Cell></Row></CellSet>' "http://hurds1:9090/datalake:testdata/1"
........
Forbidden
org.apache.hadoop.hbase.security.AccessDeniedException: org.apache.hadoop.hbase.security.AccessDeniedException: Insufficient permissions for user ‘user3',action: put, tableName:datalake:testdata, family:cf1, column: name
at org.apache.ranger.authorization.hbase.RangerAuthorizationCoprocessor.requirePermission(RangerAuthorizationCoprocessor.java:584
...
Knox Gateway is the recommended means of interacting with HDP/Hadoop UIs and REST web services. It acts as a proxy between the service and the user or developer, so the HBase service and its node are not exposed directly.
To configure Knox for HBase, go to Ambari->Knox->Configs->Advanced Topology. The default template is incorrect: the port number should point to the HBase REST API server, not to the HBase Master port.
<service>
<role>WEBHBASE</role>
<url>http://{{hbase_master_host}}:{{hbase_master_port}}</url>
</service>
Replace with a valid host name and port number.
<service>
<role>WEBHBASE</role>
<url>http://hurds1.fyre.ibm.com:9090</url>
</service>
Restart the Knox service.
Verify that Knox HBase is active and responding.
curl -ik --negotiate -u : -H "Accept: text/xml" -X GET "https://a1.fyre.ibm.com:8443/gateway/default/hbase/version"
Server: Jetty(9.4.12.v20180830)
<?xml version="1.0" encoding="UTF-8" standalone="yes"?><Version JVM="Oracle Corporation 1.8.0_212-25.212-b04" Jersey="" OS="Linux 3.10.0-957.21.3.el7.x86_64 amd64" REST="0.0.3" Server="jetty/9.3.25.v20180904"/>
The tests are exactly the same as for the direct HBase REST API. The only difference is that the Knox URL is used instead of the HBase REST API URL.
The tests assume that the Knox hostname is a1.fyre.ibm.com and the Knox port is 8443.
Cloudera: the topology name is cdp-proxy-api. An example curl call:
curl -ik --negotiate -u : -H "Accept: text/xml" -X GET "https://pimiento1.fyre.ibm.com:8443/gateway/cdp-proxy-api/hbase/version"
Cloudera: during my test, I was unable to authenticate using a Kerberos ticket (--negotiate -u :). Only providing user credentials gave access to the service.
curl -ik --negotiate -u user1:password -H "Accept: text/xml" -X GET "https://pimiento1.fyre.ibm.com:8443/gateway/cdp-proxy-api/hbase/version"
Authenticate as user1. The access should be denied.
curl -ik --negotiate -u : -H "Accept: text/xml" -X GET "https://a1.fyre.ibm.com:8443/gateway/default/hbase/datalake:testdata/*"
Authenticate as user2. Both tests, modify data and read data, should pass.
curl -ik --negotiate -u : -X PUT -H "Accept: text/xml" -H "Content-Type: text/xml" -d '<?xml version="1.0" encoding="UTF-8" standalone="yes"?><CellSet><Row key="NA=="><Cell column="Y2YxOm5hbWU=">SGVsbG80</Cell></Row></CellSet>' "https://a1.fyre.ibm.com:8443/gateway/default/hbase/datalake:testdata/1"
curl -ik --negotiate -u : -H "Accept: text/xml" -X GET "https://a1.fyre.ibm.com:8443/gateway/default/hbase/datalake:testdata/*"
Authenticate as user3. The request to modify data should fail.
curl -ik --negotiate -u : -X PUT -H "Accept: text/xml" -H "Content-Type: text/xml" -d '<?xml version="1.0" encoding="UTF-8" standalone="yes"?><CellSet><Row key="NA=="><Cell column="Y2YxOm5hbWU=">SGVsbG80</Cell></Row></CellSet>' "https://a1.fyre.ibm.com:8443/gateway/default/hbase/datalake:testdata/1"
Reading the data should succeed.
curl -ik --negotiate -u : -H "Accept: text/xml" -X GET "https://a1.fyre.ibm.com:8443/gateway/default/hbase/datalake:testdata/*"
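The checks above rely on reading the curl output by eye. As a possible shortcut, the sketch below captures only the HTTP status code and compares it with the expected one. The helper names `http_status` and `report` are made up for this example, and the expected codes are whatever you observe in your environment for an allowed or a denied request.

```shell
#!/bin/sh
# Return only the HTTP status code of a Kerberos-authenticated request.
# All arguments (method, headers, URL) are passed to curl unchanged.
# http_status is a hypothetical helper name.
http_status() {
    curl -sk --negotiate -u : -o /dev/null -w '%{http_code}' "$@"
}

# Compare expected and actual status codes and print a one-line verdict.
# Arguments: label, expected code, actual code.
report() {
    if [ "$2" = "$3" ]; then
        echo "PASS $1"
    else
        echo "FAIL $1 (expected $2, got $3)"
        return 1
    fi
}

# Usage (after kinit as user3), with the Knox URL used on this page:
#   URL='https://a1.fyre.ibm.com:8443/gateway/default/hbase/datalake:testdata/*'
#   report "user3 read" 200 "$(http_status -H 'Accept: text/xml' -X GET "$URL")"
```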