Hadoop
| Website | http://hadoop.apache.org/ |
| Supported versions | 2.7.1 |
| | 3.12, 5.18 |
| Current responsible(s) | Ivan Ermilov @ InfAI -- [email protected] |
| Docker image(s) | organization/name:tag |
| | bde2020/hadoop-base:1.0.0-hadoop2.7.1 |
| | bde2020/hadoop-namenode:1.0.0-hadoop2.7.1 |
| | bde2020/hadoop-datanode:1.0.0-hadoop2.7.1 |
| | bde2020/hadoop-resourcemanager:1.0.0-hadoop2.7.1 |
| | bde2020/hadoop-historyserver:1.0.0-hadoop2.7.1 |
| | bde2020/hadoop-nodemanager:1.0.0-hadoop2.7.1 |
| More info | https://github.com/big-data-europe/docker-hadoop |
Short description
Hadoop is an open-source software framework for storing data and running applications on clusters of commodity hardware. It provides massive storage for any kind of data, enormous processing power and the ability to handle virtually limitless concurrent tasks or jobs.
Example usage
To deploy an example HDFS cluster, refer to the instructions in the GitHub repository: https://github.com/big-data-europe/docker-hadoop
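Once a cluster is running, one way to check that HDFS responds is to query the namenode over the WebHDFS REST API. The following is a minimal sketch, not part of the repository's instructions. It assumes that WebHDFS is enabled, that the namenode's HTTP port is the Hadoop 2.x default (50070) and is reachable from where the script runs, and that the host name passed in (for example `localhost` or `namenode`) matches your deployment. The function names are illustrative only.

```python
import json
from urllib.parse import quote, urlencode
from urllib.request import urlopen


def webhdfs_url(host: str, path: str, op: str, port: int = 50070) -> str:
    """Build a WebHDFS REST URL for an absolute HDFS path and an operation.

    Assumption: 50070 is the namenode HTTP port (the Hadoop 2.x default).
    """
    if not path.startswith("/"):
        raise ValueError("HDFS path must be absolute")
    query = urlencode({"op": op})
    return f"http://{host}:{port}/webhdfs/v1{quote(path)}?{query}"


def list_directory(host: str, path: str = "/") -> list:
    """Return the entry names under `path`; needs a reachable namenode."""
    with urlopen(webhdfs_url(host, path, "LISTSTATUS"), timeout=10) as response:
        payload = json.load(response)
    # LISTSTATUS wraps its entries as FileStatuses -> FileStatus -> [...]
    return [entry["pathSuffix"] for entry in payload["FileStatuses"]["FileStatus"]]
```

Calling `list_directory("localhost")` against a running cluster should return the names of the entries in the HDFS root directory. A connection error usually means the namenode port is not published or the host name is wrong.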
Scaling
Hadoop datanodes can be scaled by deploying additional hadoop-datanode Docker containers on Docker Swarm nodes.
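If the datanodes are deployed as a Docker Swarm service, scaling can be scripted around the `docker service scale` command. The sketch below is one possible approach under that assumption. The service name `datanode` is hypothetical and must be replaced with the name used in your deployment, and the script must run on a Swarm manager node with the Docker CLI installed.

```python
import subprocess


def scale_command(service: str, replicas: int) -> list:
    """Build the argv for `docker service scale <service>=<replicas>`."""
    if replicas < 0:
        raise ValueError("replicas must be non-negative")
    return ["docker", "service", "scale", f"{service}={replicas}"]


def scale_datanodes(replicas: int, service: str = "datanode") -> None:
    """Run the scale command; `datanode` is a hypothetical service name."""
    # check=True raises CalledProcessError if the Docker CLI reports a failure
    subprocess.run(scale_command(service, replicas), check=True)
```

For example, `scale_datanodes(3)` asks Swarm to run three replicas of the service. Each datanode replica still needs its own storage and must be able to reach the namenode, so the service definition should account for both.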