Home - nave91/miner GitHub Wiki
Miner
Command line tool for data mining.
Structure
The module is arranged in:
- tools/*.py - command line tools for data mining
- data/*.csv - input data for tools
Requirements
For miner to work properly, please install the following:
Python
Install from command line using:
sudo apt-get install python  # on Ubuntu
sudo yum install python  # on Fedora
To install from deb packages or source: https://www.python.org/download
Scikit-learn
Install python pip first:
sudo apt-get install python-pip
Check the dependencies and installation instructions at http://scikit-learn.org/stable/install.html
Then install scikit-learn for the current user (the --user flag makes sudo unnecessary):
pip install --user --install-option="--prefix=" -U scikit-learn
Testing Requirements
To check that Python is working, run:
python --version
To check that scikit-learn is installed and working, run:
python -c 'import sklearn'
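The two checks above can also be combined into one small script. This is a minimal sketch and not part of miner: the function name check_requirements is hypothetical, and the script only reports the Python version and whether each listed module can be imported.

```python
import importlib
import sys


def check_requirements(modules=("sklearn",)):
    """Map each requirement to its version string, or None if it cannot be imported.

    check_requirements is a hypothetical helper, not something shipped with miner.
    """
    report = {"python": sys.version.split()[0]}
    for name in modules:
        try:
            module = importlib.import_module(name)
        except ImportError:
            # The module is missing or broken; record that instead of crashing.
            report[name] = None
        else:
            report[name] = getattr(module, "__version__", "unknown")
    return report


if __name__ == "__main__":
    for name, version in check_requirements().items():
        status = version if version is not None else "NOT INSTALLED"
        print("{}: {}".format(name, status))
```

If sklearn is reported as NOT INSTALLED, repeat the scikit-learn installation steps above.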
Preprocessing
Datasets must be preprocessed manually before miner can use them. The user must specify the type of each input (independent) variable in the dataset and which output (dependent) variables are to be predicted.
This is explained in detail in the "Preprocessing" section.