
This tutorial walks through, step by step, how to create a DimmWitted application in C++. It assumes that you have already gone through the installation guide. Apart from C++, you can also write your application in Julia; you can find the corresponding tutorial here.

The application we are going to build trains a logistic regression model on a dense data set. You can find the complete code here, but we will walk through it step by step.

## A Primer for Logistic Regression

Before we start writing C++ code, let's go over some basic concepts of logistic regression to make sure we are on the same page.

In this example, we will encode a logistic regression model over Boolean random variables, each of which can take the value 0 or 1. We have a set of random variables {y_1,...,y_n}. For each random variable y_i, we have a set of features denoted X_i = {x_i1,...,x_im}. We also have a set of real-valued weights {w_1,...,w_m}. Given this setup, we can define the probability that each y_i takes a certain value (0 or 1).
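Concretely, this tutorial uses the standard logistic model, under which

```math
\Pr(y_i = 1 \mid X_i) = \frac{1}{1 + \exp\left(-\sum_{j=1}^{m} w_j x_{ij}\right)}
```

and Pr(y_i = 0 | X_i) = 1 - Pr(y_i = 1 | X_i). This is the distribution whose gradient the code below computes.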

Assume that we have already observed the value each y_i should take, denoted y_i*. Training a logistic regression model then means finding the set of weights that minimizes the negative log-likelihood. To solve this mathematical optimization problem, we will implement an approach called Stochastic Gradient Descent (SGD), which repeats the following steps:

  1. Pick an example i.
  2. Calculate the gradient at example i.
  3. Update the weights with the gradient, scaled by a constant step size (the exact update is shown below).
  4. Repeat from step 1.
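Concretely, for logistic regression the step-3 update for each weight w_j is the following, where α is the constant step size (the gradient function below hard-codes α = 0.00001):

```math
w_j \leftarrow w_j - \alpha \left( \frac{1}{1 + e^{-w \cdot X_i}} - y_i^* \right) x_{ij}
```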

We now show how to implement this simple SGD algorithm inside DimmWitted.

## Implementing Logistic Regression in DimmWitted

Before we start writing any code, we need to include the header file that contains the DimmWitted-related functions:

#include "dimmwitted.h"

#### Define the Workspace

In DimmWitted, the workspace contains the set of objects that will be changed during execution. In our case, the workspace contains all the weights (pointed to by p) that we are going to update. We define a class for the workspace:

```cpp
class GLMModelExample{
public:
  double * const p;
  int n;
  
  GLMModelExample(int _n):
    n(_n), p(new double[_n]){}

  GLMModelExample( const GLMModelExample& other ) :
     n(other.n), p(new double[other.n]){
    for(int i=0;i<n;i++){
      p[i] = other.p[i];
    }
  }
};
```

We can see that this class contains three components:

  1. Line 3-4: These two lines define a double-typed pointer and the number of elements it points to. One can think of each double here as corresponding to one weight w_j.
  2. Line 6-7: These two lines define a constructor for the workspace. In this simple example, it takes as input the number of elements and allocates the memory.
  3. Line 9-14: These six lines define a copy constructor. Implementing this function is highly recommended because DimmWitted will use it when it decides to replicate your workspace for better performance.
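As a quick (hypothetical) illustration of the third point: because the copy constructor performs a deep copy, each replica that DimmWitted creates gets its own independent weight array:

```cpp
GLMModelExample a(3);  // workspace with three weights
a.p[0] = 1.0;
GLMModelExample b(a);  // deep copy via the copy constructor above
b.p[0] = 2.0;          // modifies only b's copy; a.p[0] is still 1.0
```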

#### Define the Gradient Function

We then define the gradient function, which we will later pass to register_row to let DimmWitted know that this is the function to execute. This function takes as input a vector (i.e., one example X_i, with its label stored as the last element) and a pointer to the model (i.e., w_1,...,w_m), and updates the model. The interface for DenseVector can be found here.

```cpp
#include <cmath>  // for exp()

// Gradient function: takes one example and the model, and performs one
// SGD step on the weights in place.
double f_lr_grad(const DenseVector<double>* const ex,
                 GLMModelExample* const p_model){

  double * model = p_model->p;
  // The label y_i* is stored as the last element of the example vector.
  double label = ex->p[ex->n-1];

  // Dot product between the features and the current weights.
  double dot = 0.0;
  for(int i=0;i<ex->n-1;i++){
    dot += ex->p[i] * model[i];
  }

  // Z = step size * (sigmoid(dot) - label), with the constant step size
  // 0.00001.
  const double d = exp(-dot);
  const double Z = 0.00001 * (-label + 1.0/(1.0+d));

  // SGD step: w_j -= x_ij * Z for each feature j.
  for(int i=0;i<ex->n-1;i++){
    model[i] -= ex->p[i] * Z;
  }

  return 1.0;
}
```

#### Create a DimmWitted Object and Execute

Finally, we create a DenseDimmWitted object over our examples, register the gradient function for row-wise access, and execute it:

```cpp
// nexp rows, each with nfeat features plus the label column; the model is
// the workspace we defined above.
DenseDimmWitted<double, GLMModelExample, DW_DEBUG, DW_SHARDING, DW_ROW>
    dw(examples, nexp, nfeat+1, &model);
unsigned int f_handle_grad = dw.register_row(f_lr_grad);
dw.exec(f_handle_grad);
```
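The snippet above assumes that examples, nexp, nfeat, and model have already been set up. For completeness, here is a minimal sketch of one possible driver; the sizes, the synthetic data (all-one features with a random Boolean label), and the multi-epoch loop are illustrative assumptions rather than part of the original example.

```cpp
#include <cstdlib>
#include "dimmwitted.h"

int main() {
  const long nexp = 1000;  // number of examples (assumed for illustration)
  const long nfeat = 10;   // number of features (assumed for illustration)

  // Each row stores nfeat features followed by the label in the last
  // column, which is the layout f_lr_grad expects.
  double** examples = new double*[nexp];
  for (long i = 0; i < nexp; i++) {
    examples[i] = new double[nfeat + 1];
    for (long j = 0; j < nfeat; j++) {
      examples[i][j] = 1.0;                               // dummy feature value
    }
    examples[i][nfeat] = (rand() % 5 == 0) ? 0.0 : 1.0;   // dummy label
  }

  // One weight per feature, initialized to zero.
  GLMModelExample model(nfeat);
  for (int j = 0; j < model.n; j++) {
    model.p[j] = 0.0;
  }

  DenseDimmWitted<double, GLMModelExample, DW_DEBUG, DW_SHARDING, DW_ROW>
      dw(examples, nexp, nfeat + 1, &model);
  unsigned int f_handle_grad = dw.register_row(f_lr_grad);

  // Each exec() makes one pass over all rows; run a few epochs of SGD.
  for (int epoch = 0; epoch < 10; epoch++) {
    dw.exec(f_handle_grad);
  }
  return 0;
}
```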