Alluxio - animeshtrivedi/notes GitHub Wiki
Download the Alluxio 1.5 build for Hadoop 2.7 from: http://www.alluxio.org/download
Extract it. The files in the conf directory contain the following:
$cat alluxio-env.sh
[...]
ALLUXIO_MASTER_HOSTNAME=${ALLUXIO_MASTER_HOSTNAME:-"flex11-40g0"}
ALLUXIO_WORKER_MEMORY_SIZE=${ALLUXIO_WORKER_MEMORY_SIZE:-"128GB"}
ALLUXIO_RAM_FOLDER=${ALLUXIO_RAM_FOLDER:-"/mnt/tmpfs/alluxio"}
ALLUXIO_UNDERFS_ADDRESS="/mnt/tmpfs/alluxio/underFSStorage"
$cat alluxio-site.properties
# Common properties
alluxio.master.hostname=flex11-40g0
alluxio.underfs.address=/mnt/tmpfs/alluxio/underFSStorage
# Security properties
# alluxio.security.authorization.permission.enabled=true
# alluxio.security.authentication.type=SIMPLE
# Worker properties
alluxio.worker.memory.size=128GB
alluxio.worker.tieredstore.levels=1
alluxio.worker.tieredstore.level0.alias=MEM
alluxio.worker.tieredstore.level0.dirs.path=/mnt/tmpfs/alluxio/
# User properties
# alluxio.user.file.readtype.default=CACHE_PROMOTE
# alluxio.user.file.writetype.default=MUST_CACHE
$cat core-site.xml
<configuration>
<!--
<property>
<name>fs.defaultFS</name>
<value>alluxio://flex11-40g0:19998</value>
</property>
-->
<property>
<name>fs.alluxio.impl</name>
<value>alluxio.hadoop.FileSystem</value>
<description>The Alluxio FileSystem (Hadoop 1.x and 2.x)</description>
</property>
<property>
<name>fs.alluxio-ft.impl</name>
<value>alluxio.hadoop.FaultTolerantFileSystem</value>
<description>The Alluxio FileSystem (Hadoop 1.x and 2.x) with fault tolerant support</description>
</property>
<property>
<name>fs.AbstractFileSystem.alluxio.impl</name>
<value>alluxio.hadoop.AlluxioFileSystem</value>
<description>The Alluxio AbstractFileSystem (Hadoop 2.x)</description>
</property>
</configuration>
The masters file contains the fault-tolerant configuration; I left it as localhost. The workers file contains the hostnames of the workers.
To start the master (-w tells the script to wait for the process to end):
./bin/alluxio-start.sh -w master
Then go to a worker and start it by hand:
./bin/alluxio-start.sh -w worker NoMount
I use NoMount because /mnt/tmpfs is already mounted. Once these processes are up (that is, they did not quit unexpectedly), check the logs and copy a local file into Alluxio as a sanity test.
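The sanity test mentioned above could look roughly like the sketch below. It uses the Alluxio shell commands fs copyFromLocal, fs ls and fs copyToLocal. The helper name alluxio_sanity_check, the ALLUXIO variable and the /sanity-... target path are my own assumptions for illustration, not part of Alluxio.

```shell
# Assumption: commands run from the Alluxio install directory; set ALLUXIO
# to the full path of bin/alluxio when running from elsewhere.
ALLUXIO="${ALLUXIO:-./bin/alluxio}"

# Hypothetical helper (not part of Alluxio): round-trip one local file
# through Alluxio and compare the copy that comes back with the original.
alluxio_sanity_check() {
  src="$1"                               # local file to upload
  dst="/sanity-$(basename "$src")"       # illustrative path inside Alluxio
  tmpdir="$(mktemp -d)" || return 1      # scratch space for the copy read back
  "$ALLUXIO" fs copyFromLocal "$src" "$dst" &&
    "$ALLUXIO" fs ls / &&
    "$ALLUXIO" fs copyToLocal "$dst" "$tmpdir/copy" &&
    cmp -s "$src" "$tmpdir/copy"
  status=$?                              # 0 only if every step succeeded
  rm -rf "$tmpdir"
  return "$status"
}

# Example (needs a running master and worker):
# alluxio_sanity_check /etc/hostname && echo "round trip OK"
```

The uploaded file is left in Alluxio on purpose so it can also be seen in the web UI. A second run with the same file will probably fail at copyFromLocal because the target already exists.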
When everything looks normal, you can start the whole cluster:
./bin/alluxio-start.sh master
./bin/alluxio-start.sh workers NoMount
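The NoMount argument in the commands above assumes /mnt/tmpfs is already mounted on the worker. A minimal sketch of how that could be checked first on a Linux host is given here. The helper name is_mounted is hypothetical, and the check simply looks for the mount point in the second column of /proc/mounts.

```shell
# Hypothetical helper: succeed if the given path is a mount point.
# $1 = mount point to look for
# $2 = mounts table to read (defaults to /proc/mounts on Linux)
is_mounted() {
  # Lines look like "tmpfs /mnt/tmpfs tmpfs rw,... 0 0";
  # the second field is the mount point.
  awk -v mp="$1" '$2 == mp { found = 1 } END { exit !found }' "${2:-/proc/mounts}"
}

# Example: only start the worker with NoMount when the tmpfs is there.
# is_mounted /mnt/tmpfs && ./bin/alluxio-start.sh -w worker NoMount
```

When starting all workers at once, the check would have to run on each worker host, since the start script only reports what happens after the fact in the logs.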
You can browse the current file system state at: http://your_host:19999/home
Error: I see this error in the browser:
Inconsistent Files on Startup (run fs checkConsistency for details):
[...]
On Startup, 1 inconsistent files were found. This check is only checked once at startup, and you can restart the Alluxio Master for the latest information.
The following files may be corrupted:
\
As far as I can tell all seems fine. So I am ignoring this error for now.
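The message itself points at fs checkConsistency for details. A small sketch of running it follows. The wrapper name check_consistency and the ALLUXIO variable are assumptions of mine; the only Alluxio command used is the one named in the error message.

```shell
# Assumption: commands run from the Alluxio install directory; set ALLUXIO
# to the full path of bin/alluxio when running from elsewhere.
ALLUXIO="${ALLUXIO:-./bin/alluxio}"

# Hypothetical wrapper: ask Alluxio which files under a path are reported
# as inconsistent. With no argument the whole namespace (/) is checked.
check_consistency() {
  "$ALLUXIO" fs checkConsistency "${1:-/}"
}

# Example (needs a running master):
# check_consistency /
```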
A few changes are needed to use Alluxio with Spark. First, you need to tell core-site.xml about Alluxio. My Hadoop core-site.xml now contains the Crail and Alluxio details:
$cat core-site.xml
<configuration>
<property>
<name>fs.defaultFS</name>
<value>hdfs://flex11-40g0:9000</value>
</property>
<property>
<name>io.file.buffer.size</name>
<value>1048576</value>
</property>
<property>
<name>hadoop.tmp.dir</name>
<value>/mnt/tmpfs/tmp</value>
</property>
<property>
<name>fs.crail.impl</name>
<value>com.ibm.crail.hdfs.CrailHadoopFileSystem</value>
</property>
<property>
<name>fs.AbstractFileSystem.crail.impl</name>
<value>com.ibm.crail.hdfs.CrailHDFS</value>
</property>
<property>
<name>fs.alluxio.impl</name>
<value>alluxio.hadoop.FileSystem</value>
<description>The Alluxio FileSystem (Hadoop 1.x and 2.x)</description>
</property>
<property>
<name>fs.AbstractFileSystem.alluxio.impl</name>
<value>alluxio.hadoop.AlluxioFileSystem</value>
<description>The Alluxio AbstractFileSystem (Hadoop 2.x)</description>
</property>
</configuration>
Then you have to copy the client jar file into the Spark class path. I have an extra-jars path set, so I copied the file there:
cp ~/alluxio/client/default/alluxio-1.5.0-default-client.jar ./extra-jars/
For more details, follow these instructions: http://www.alluxio.org/docs/master/en/Debugging-Guide.html#usage-faq
That should be all.
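For readers without an extra-jars path already configured, one way to get the same effect might be to pass the directory through Spark's spark.driver.extraClassPath and spark.executor.extraClassPath settings. The sketch below is an assumption about how such a setup could look, not a description of my actual configuration. The helper name spark_shell_with_jars and the SPARK_SHELL variable are hypothetical.

```shell
# Assumption: spark-shell is on PATH; set SPARK_SHELL to its full path otherwise.
SPARK_SHELL="${SPARK_SHELL:-spark-shell}"

# Hypothetical helper: start spark-shell with every jar in a directory
# (for example ./extra-jars) on the driver and executor class paths.
# $1 = directory holding the jars; remaining arguments go to spark-shell.
spark_shell_with_jars() {
  jar_dir="$1"
  shift
  # The trailing /* is quoted so the JVM, not the shell, expands it.
  "$SPARK_SHELL" \
    --conf "spark.driver.extraClassPath=$jar_dir/*" \
    --conf "spark.executor.extraClassPath=$jar_dir/*" \
    "$@"
}

# Example:
# spark_shell_with_jars "$PWD/extra-jars"
```

The directory has to exist at the same path on every executor host. Once the shell is up, reading a file through an alluxio://flex11-40g0:19998/ URI is a reasonable first check that the client jar was picked up.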