Hadoop - animeshtrivedi/notes GitHub Wiki
First, set up the native environment. You can check which native libraries Hadoop is able to load with
$ hadoop checknative -a
You need to set java.library.path, so first put the Hadoop native lib path in LD_LIBRARY_PATH:
export LD_LIBRARY_PATH="/home/your_hdfs/lib/native/":$LD_LIBRARY_PATH
Then pass it to the JVM in the ./bin/hadoop script:
HADOOP_OPTS="$HADOOP_OPTS $HADOOP_CLIENT_OPTS -Djava.library.path=$LD_LIBRARY_PATH:"
This is enough to start with.
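One caveat with the export above: if LD_LIBRARY_PATH starts out empty, the result ends in a stray colon, i.e. an empty entry, which the dynamic linker treats as the current directory. Below is a minimal sketch of a helper that avoids this. The function name prepend_path and the variable HADOOP_NATIVE_DIR are made up for illustration, and the path is the same placeholder as above.

```shell
# Hypothetical helper (not part of Hadoop): prepend a directory to a
# colon-separated search path without leaving an empty entry behind.
prepend_path() {
    # $1: directory to add, $2: existing path list (may be empty)
    if [ -z "$2" ]; then
        printf '%s\n' "${1%/}"
    else
        printf '%s:%s\n' "${1%/}" "$2"
    fi
}

# Placeholder location of the Hadoop native libraries; adjust to your install.
HADOOP_NATIVE_DIR="/home/your_hdfs/lib/native"
LD_LIBRARY_PATH="$(prepend_path "$HADOOP_NATIVE_DIR" "${LD_LIBRARY_PATH:-}")"
export LD_LIBRARY_PATH
```

After this, running hadoop checknative -a again should show whether the libraries are picked up.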
Next, short-circuit local reads need a UNIX domain socket. I created a folder at /var/lib/hadoop-hdfs/ and gave my user access to it, and that was it. Then put these properties in hdfs-site.xml:
<configuration>
<property>
<name>dfs.client.read.shortcircuit</name>
<value>true</value>
</property>
<property>
<name>dfs.domain.socket.path</name>
<value>/var/lib/hadoop-hdfs/dn_socket</value>
</property>
</configuration>
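The folder step described above can be sketched as a small shell function. This is only an illustration under a few assumptions: the function name prepare_socket_dir is made up, the mode 755 is my guess at a safe default (as far as I know Hadoop refuses socket paths inside directories that other users can write to), and for /var/lib you will most likely need to run it as root and then chown the folder to the user that runs the DataNode.

```shell
# Hypothetical helper: create the parent folder of the domain socket path.
# The socket file itself is created by the DataNode, not by this function.
prepare_socket_dir() {
    # $1: value of dfs.domain.socket.path
    socket_dir="$(dirname "$1")"
    # Assumed permissions: writable by the owner only.
    mkdir -p "$socket_dir" && chmod 755 "$socket_dir" || return 1
    printf '%s\n' "$socket_dir"
}
```

For the configuration above the call would be prepare_socket_dir /var/lib/hadoop-hdfs/dn_socket. To confirm which path the client actually sees, hdfs getconf -confKey dfs.domain.socket.path prints the configured value.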
https://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-hdfs/ShortCircuitLocalReads.html
https://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-common/NativeLibraries.html
https://hortonworks.com/blog/dmmq/
https://hortonworks.com/blog/ddm/
To get the block locations of a file in Crail:
./bin/crail fsck -t getLocations -f /sql/parquet-100m/part-00002-fc266a6a-663b-4ece-a2c8-453d54f784b9.parquet -y 0 -l 1280255769
And the same in HDFS:
./bin/hdfs fsck /sql/parquet-100m/part-00000-505ae4a0-0f0a-4210-a27c-bd854d95787e.parquet -files -blocks -locations