How To - JpEncausse/SARAH-Documentation GitHub Wiki

The official documentation has been moved to http://wiki.sarah.encausse.net/

Sending an HTTP Request

SARAH relies on the NodeJS API. The tutorial "Demo 5" explains how to build a plugin that sends a request to a third-party HTTP API.

exports.action = function(data, callback, config, SARAH){
  var url = 'http://.../';
  var request = require('request');
  request({ 'uri' : url }, function (err, response, body){
    if (err || response.statusCode != 200) {
      return callback({'tts': "Action failed"});
    }
    // ... Here you should parse body ...
    // var json = JSON.parse(body);
    callback({'tts': 'The answer' });
  });
};
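
The commented-out JSON.parse step in the callback above could be sketched as a small helper. This is only an illustration under assumptions: the function name bodyToTts and the "answer" field are hypothetical, and a real third-party API will return its own structure.

```javascript
// Hypothetical helper: turns a raw HTTP body into the object passed to callback().
// The "answer" field is an assumption about the remote API, not part of SARAH.
function bodyToTts(body) {
  var json;
  try {
    // JSON.parse throws on malformed input, so guard it
    json = JSON.parse(body);
  } catch (e) {
    return { tts: 'Action failed' };
  }
  // Reject valid JSON that does not have the expected shape
  if (!json || typeof json.answer !== 'string') {
    return { tts: 'Action failed' };
  }
  return { tts: json.answer };
}
```

Inside the request callback this would be used as callback(bodyToTts(body)); so the plugin answers even when the remote server returns something unexpected.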

In this sample the body is parsed with JavaScript's built-in JSON.parse. You can also parse XML, or process the body as any other text format. For example, with the xml2js module:

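// Runs inside the request callback, where body and callback are available
var xml2js = require('xml2js');
var parser = new xml2js.Parser({trim: true});
parser.parseString(body, function (err, xml) {
  if (err) {
    return callback({'tts': "Action failed"});
  }
  var root = xml.root; // see documentation
  // Remember code is asynchronous
  // So you should call callback({ ... }) here
});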

Scraping with Cheerio or PhantomJS

Sometimes you can't simply call a JSON or XML API. You have to work with a web browser and HTML.

Cheerio

Cheerio is a very lightweight HTML parser that handles common issues. As with JSON or XML, it parses the result of an HTTP request.

    var $ = require('cheerio').load(body, { 
      xmlMode: true, ignoreWhitespace: false, lowerCaseTags: false });

    // The $ works like jQuery to navigate through the HTML
    var alt = $('#prevision > H2').find('img').attr('alt');

Remember that Cheerio can't handle JavaScript clicks or HTML generated client-side.

PhantomJS

PhantomJS is a third-party headless WebKit browser that is scripted in JavaScript, much like NodeJS. Your plugin's script runs inside the PhantomJS VM and sends a small amount of data back. PhantomJS is very heavy, but it is sometimes the only way to fill in HTML forms, click buttons, and so on.

See Architecture for more info.

Rewriting grammar

Scheduling actions