Alluxio - animeshtrivedi/notes GitHub Wiki
Download the Alluxio 1.5 build for Hadoop 2.7 here: http://www.alluxio.org/download
Extract it. Here are the contents of the files in the conf directory:
$cat alluxio-env.sh
[...]
ALLUXIO_MASTER_HOSTNAME=${ALLUXIO_MASTER_HOSTNAME:-"flex11-40g0"}
ALLUXIO_WORKER_MEMORY_SIZE=${ALLUXIO_WORKER_MEMORY_SIZE:-"128GB"}
ALLUXIO_RAM_FOLDER=${ALLUXIO_RAM_FOLDER:-"/mnt/tmpfs/alluxio"}
ALLUXIO_UNDERFS_ADDRESS="/mnt/tmpfs/alluxio/underFSStorage"
$cat alluxio-site.properties
# Common properties
alluxio.master.hostname=flex11-40g0
alluxio.underfs.address=/mnt/tmpfs/alluxio/underFSStorage
# Security properties
# alluxio.security.authorization.permission.enabled=true
# alluxio.security.authentication.type=SIMPLE
# Worker properties
alluxio.worker.memory.size=128GB
alluxio.worker.tieredstore.levels=1
alluxio.worker.tieredstore.level0.alias=MEM
alluxio.worker.tieredstore.level0.dirs.path=/mnt/tmpfs/alluxio/
# User properties
# alluxio.user.file.readtype.default=CACHE_PROMOTE
# alluxio.user.file.writetype.default=MUST_CACHE
$cat core-site.xml
<configuration>
<!--
<property>
<name>fs.defaultFS</name>
<value>alluxio://flex11-40g0:19998</value>
</property>
-->
<property>
<name>fs.alluxio.impl</name>
<value>alluxio.hadoop.FileSystem</value>
<description>The Alluxio FileSystem (Hadoop 1.x and 2.x)</description>
</property>
<property>
<name>fs.alluxio-ft.impl</name>
<value>alluxio.hadoop.FaultTolerantFileSystem</value>
<description>The Alluxio FileSystem (Hadoop 1.x and 2.x) with fault tolerant support</description>
</property>
<property>
<name>fs.AbstractFileSystem.alluxio.impl</name>
<value>alluxio.hadoop.AlluxioFileSystem</value>
<description>The Alluxio AbstractFileSystem (Hadoop 2.x)</description>
</property>
</configuration>
The masters file contains the fault-tolerant configuration; I left it as localhost. The workers file contains the hostnames of the workers.
To start the master (-w waits for the process to end):
./bin/alluxio-start.sh -w master
Then go to a worker and start it by hand:
./bin/alluxio-start.sh -w worker NoMount
I use NoMount because /mnt/tmpfs is already mounted. Once these processes are up (that is, they did not quit unexpectedly), check the logs and copy a local file into Alluxio as a sanity test.
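A minimal sketch of such a sanity test, assuming the `alluxio fs` subcommands `copyFromLocal`, `ls`, and `cat` behave as in the stock 1.5 command line and that `cat` writes only the file bytes to stdout. The function name, the `ALLUXIO_BIN` variable, and the destination path are hypothetical, chosen for illustration.

```shell
#!/usr/bin/env bash
# Path to the Alluxio CLI; defaults to the relative path used in these notes.
# ALLUXIO_BIN is a hypothetical override, not an Alluxio setting.
ALLUXIO_BIN="${ALLUXIO_BIN:-./bin/alluxio}"

# Copy a local file into Alluxio, list it, and read it back.
#   $1 = local file
#   $2 = destination path inside Alluxio (hypothetical, e.g. /sanity-test.txt)
# Returns 0 only if the bytes read back match the local file.
alluxio_sanity_check() {
  local src="$1" dst="$2"
  "$ALLUXIO_BIN" fs copyFromLocal "$src" "$dst" || return 1
  "$ALLUXIO_BIN" fs ls "$dst" || return 1
  # Stream the file out of Alluxio and compare it byte-for-byte with the original.
  "$ALLUXIO_BIN" fs cat "$dst" | cmp -s - "$src"
}
```

Run it from the Alluxio install directory, for example `alluxio_sanity_check /etc/hostname /sanity-test.txt && echo OK`. The destination should be a path that does not exist yet, since copyFromLocal is expected to refuse to overwrite an existing file.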
When all seems normal, you can start the whole cluster:
./bin/alluxio-start.sh master
./bin/alluxio-start.sh workers NoMount
You can browse the current file system state at: http://your_host:19999/home
Error
I get this error in the browser:
Inconsistent Files on Startup (run fs checkConsistency for details):
[...]
On Startup, 1 inconsistent files were found. This check is only checked once at startup, and you can restart the Alluxio Master for the latest information.
The following files may be corrupted:
\
As far as I can tell all seems fine. So I am ignoring this error for now.
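The message itself points at `fs checkConsistency` for details. A small sketch of that invocation, assuming it takes an Alluxio path as its argument and is run from the Alluxio install directory; the wrapper function and the `ALLUXIO_BIN` variable are hypothetical conveniences, not part of Alluxio.

```shell
#!/usr/bin/env bash
# Path to the Alluxio CLI; ALLUXIO_BIN is a hypothetical override.
ALLUXIO_BIN="${ALLUXIO_BIN:-./bin/alluxio}"

# Ask the master which files under the given Alluxio path are inconsistent.
#   $1 = Alluxio path to check (defaults to the root, /)
alluxio_check_consistency() {
  "$ALLUXIO_BIN" fs checkConsistency "${1:-/}"
}
```

Calling `alluxio_check_consistency` with no argument checks the whole namespace; passing a subdirectory narrows the check.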
A few changes are needed to use Alluxio with Spark. First, register Alluxio in Hadoop's core-site.xml. My Hadoop core-site.xml now contains both the Crail and the Alluxio details:
$cat core-site.xml
<configuration>
<property>
<name>fs.defaultFS</name>
<value>hdfs://flex11-40g0:9000</value>
</property>
<property>
<name>io.file.buffer.size</name>
<value>1048576</value>
</property>
<property>
<name>hadoop.tmp.dir</name>
<value>/mnt/tmpfs/tmp</value>
</property>
<property>
<name>fs.crail.impl</name>
<value>com.ibm.crail.hdfs.CrailHadoopFileSystem</value>
</property>
<property>
<name>fs.AbstractFileSystem.crail.impl</name>
<value>com.ibm.crail.hdfs.CrailHDFS</value>
</property>
<property>
<name>fs.alluxio.impl</name>
<value>alluxio.hadoop.FileSystem</value>
<description>The Alluxio FileSystem (Hadoop 1.x and 2.x)</description>
</property>
<property>
<name>fs.AbstractFileSystem.alluxio.impl</name>
<value>alluxio.hadoop.AlluxioFileSystem</value>
<description>The Alluxio AbstractFileSystem (Hadoop 2.x)</description>
</property>
</configuration>
Then copy the Alluxio client jar into the Spark classpath. I have an extra-jars path set, so I copied the file there.
cp ~/alluxio/client/default/alluxio-1.5.0-default-client.jar ./extra-jars/
For more details, follow these instructions: http://www.alluxio.org/docs/master/en/Debugging-Guide.html#usage-faq
That should be all.
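One way to check that the core-site.xml entries resolve alluxio:// URIs before involving Spark is to list the Alluxio root through the Hadoop CLI. This is a sketch under assumptions: it presumes `hadoop` is on the PATH and picks up the core-site.xml shown above, that the client jar has to be added to `HADOOP_CLASSPATH` for the CLI, and that the master listens on port 19998 as in the commented-out fs.defaultFS entry. The function names and the `HADOOP_BIN` and `ALLUXIO_CLIENT_JAR` variables are hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical overrides: the Hadoop CLI to use and the client jar from the cp step.
HADOOP_BIN="${HADOOP_BIN:-hadoop}"
ALLUXIO_CLIENT_JAR="${ALLUXIO_CLIENT_JAR:-$HOME/alluxio/client/default/alluxio-1.5.0-default-client.jar}"

# Build an alluxio:// URI.
#   $1 = master host, $2 = RPC port, $3 = absolute path inside Alluxio
alluxio_uri() {
  printf 'alluxio://%s:%s%s\n' "$1" "$2" "$3"
}

# List an Alluxio path through the Hadoop FileSystem shell.
#   $1 = master host, $2 = port (default 19998), $3 = path (default /)
hadoop_ls_alluxio() {
  local uri
  uri="$(alluxio_uri "$1" "${2:-19998}" "${3:-/}")"
  # Prepend the Alluxio client jar to the classpath for this one command only.
  HADOOP_CLASSPATH="${ALLUXIO_CLIENT_JAR}${HADOOP_CLASSPATH:+:$HADOOP_CLASSPATH}" \
    "$HADOOP_BIN" fs -ls "$uri"
}
```

For the setup in these notes the call would be `hadoop_ls_alluxio flex11-40g0`. If the listing works, Spark jobs should be able to use the same alluxio://flex11-40g0:19998/ prefix for their input and output paths.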