Tutorials

How to deploy the pipeline on your own cluster

Here we explain how to deploy the S3QA pipeline on a cluster. The different modules (feature extraction, learning, and classification) have to be deployed before they can be called to perform the corresponding computations.

Client machines communicate with the pipeline through a Broker. After compiling and packaging the project files (see the installation instructions in the README.md file), follow the steps below. Note that, regardless of what you want to do, the feature extraction module must be deployed.

Deploying the broker

This is the machine responsible for the communication between the service and the client. To deploy it, run the following command:

java -jar target/s3qa-uima-pipeline-core-0.0.1-SNAPSHOT.jar --start-broker -ip [machine_ip]

where [machine_ip] should be the IP address of the computer hosting the pipeline (do not include the brackets).

With the broker up, we are now ready to communicate with the server. Note that [machine_ip] is the value we will use in the rest of the instructions.
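For instance, assuming the broker machine's IP address is 192.168.0.10 (a hypothetical value used in the examples throughout this page), the command would be:

java -jar target/s3qa-uima-pipeline-core-0.0.1-SNAPSHOT.jar --start-broker -ip 192.168.0.10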

Deploying the feature extraction module

Deploying this module is necessary, regardless of what computation you want to perform. To do so, run the following command:

java -jar target/s3qa-uima-pipeline-core-0.0.1-SNAPSHOT.jar --deploy-feature-extraction -ip [machine_ip] --queue-name feature_extraction --scaleout [threads] -s -t -r

where [threads] is the number of threads (e.g., 5), each running one instance of the pipeline on the machine where the command is launched. To distribute the computation, the same command can be executed on other machines in the cluster (keeping the same queue name). Parameters -s, -t, and -r correspond to the available similarities, parse trees, and ranking features. Details on these representations are available here.
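As an illustration, and again assuming the hypothetical broker address 192.168.0.10, you could run the command on two different machines of the cluster so that 5 + 3 pipeline instances serve the same feature_extraction queue:

java -jar target/s3qa-uima-pipeline-core-0.0.1-SNAPSHOT.jar --deploy-feature-extraction -ip 192.168.0.10 --queue-name feature_extraction --scaleout 5 -s -t -r

java -jar target/s3qa-uima-pipeline-core-0.0.1-SNAPSHOT.jar --deploy-feature-extraction -ip 192.168.0.10 --queue-name feature_extraction --scaleout 3 -s -t -r

(the first command on one machine, the second on another)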

Deploying the learning module

If you want to learn a model, deploy the corresponding module as follows:

java -jar target/s3qa-uima-pipeline-core-0.0.1-SNAPSHOT.jar --deploy-learning -ip [machine_ip] --queue-name learning --feature-extraction-queue-names feature_extraction -s -t -r

In this case, the queue is "learning", and --feature-extraction-queue-names tells the module where to look to extract the features from the dataset.
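For instance, with the hypothetical broker at 192.168.0.10 and the feature extraction module already serving the feature_extraction queue as above, the learning module would be deployed as:

java -jar target/s3qa-uima-pipeline-core-0.0.1-SNAPSHOT.jar --deploy-learning -ip 192.168.0.10 --queue-name learning --feature-extraction-queue-names feature_extraction -s -t -r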

Deploying the classification module

Now, if you have a previously-computed model, you can classify new entries. To deploy the classification module, run:

java -jar target/s3qa-uima-pipeline-core-0.0.1-SNAPSHOT.jar -dc -ip [machine_ip] --queue-name classification --feature-extraction-queue-names feature_extraction --model-file [path.to.local.file]

where [path.to.local.file] is a file on the client machine containing a single line: the name of the previously-computed model file on the server.
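To make this concrete, assume the server already stores a trained model named svm_model.bin (a hypothetical file name). The local descriptor file can then be created and used as follows:

echo "svm_model.bin" > model.txt

java -jar target/s3qa-uima-pipeline-core-0.0.1-SNAPSHOT.jar -dc -ip 192.168.0.10 --queue-name classification --feature-extraction-queue-names feature_extraction --model-file model.txt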


Now that the cluster is up and the required modules are running, you can execute exactly the same commands as in Getting Started, the difference being that all the processing load is now carried out on your cluster. Keep in mind that you need to use the broker's IP address ([machine_ip]) so that the jobs are sent there.