External Services - TrentoCrowdAI/slr-api GitHub Wiki

Services deployment and configuration

Here you can find instructions regarding the external services used by our API.

Configuration

You can modify all relevant service URLs (e.g. the pdf-parsing endpoint) in this config file.

Once you change the endpoints, you will probably have to edit some parts of the code to make the calls work.

pdf parse (currently not working because the service is offline)

The endpoint we call for PDF parsing (POST http://scienceparse.allenai.org/v1) is rather unstable: it may take quite some time to return a result (we set a timeout of 10 seconds, but you can change it; sometimes the API returns the result only after 1 or 2 minutes). We therefore suggest finding another service or deploying this one locally using the instructions shown below.
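A minimal sketch of how such a call can be wired up with an explicit timeout, using an AbortController. The endpoint URL and 10-second default come from this page; the helper name and request shape are assumptions:

```javascript
// Sketch: build a POST request to the science-parse endpoint with a
// configurable timeout. buildParseRequest is a hypothetical helper;
// the 10-second default mirrors the timeout mentioned above.
const DEFAULT_TIMEOUT_MS = 10 * 1000;

function buildParseRequest(pdfBuffer, timeoutMs = DEFAULT_TIMEOUT_MS) {
  const controller = new AbortController();
  const timer = setTimeout(() => controller.abort(), timeoutMs);
  return {
    url: "http://scienceparse.allenai.org/v1",
    options: {
      method: "POST",
      headers: { "Content-Type": "application/pdf" },
      body: pdfBuffer,
      signal: controller.signal, // fetch rejects with AbortError on timeout
    },
    cancel: () => clearTimeout(timer), // call after the response arrives
  };
}

// Usage (not executed here, since the service is currently offline):
// const { url, options, cancel } = buildParseRequest(pdfBuffer);
// const res = await fetch(url, options); cancel();
```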

similar-paper search (fake service)

The endpoint we call for the similar-paper search is POST https://crowdai-slr-api-dev.herokuapp.com/external/similar. As of right now, the similarity search doesn't work properly: it just returns the Scopus results for a search on the title of the query paper.

The input parameters are:

  • paperData {object} the object containing all the property data of the paper to query
  • start {integer} offset position from which results are returned
  • count {integer} number of papers to return

The output is:

  • results {array[object]} the list of papers found
  • totalResults {integer} the number of results found

To modify the callback operation on the response, edit src/delegates/papers.delegate.js.
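The input parameters above can be assembled into a request body like this. The endpoint is the fake service from this page; the helper name and the validation it performs are assumptions:

```javascript
// Sketch: assemble the body for POST /external/similar from the
// parameters documented above. buildSimilarSearchBody is a
// hypothetical helper name.
function buildSimilarSearchBody(paperData, start = 0, count = 10) {
  if (typeof paperData !== "object" || paperData === null) {
    throw new TypeError("paperData must be an object");
  }
  return { paperData, start, count };
}

// Usage (not executed here):
// const body = buildSimilarSearchBody({ title: "Some paper" }, 0, 25);
// await fetch("https://crowdai-slr-api-dev.herokuapp.com/external/similar",
//   { method: "POST", headers: { "Content-Type": "application/json" },
//     body: JSON.stringify(body) });
```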

automated search (fake service)

The endpoint is POST https://crowdai-slr-api-dev.herokuapp.com/external/automated.

The input parameters are:

  • title {string} name of the topic
  • description {string} description of the topic
  • arrayFilter {array[object]} an array of filter objects, where each filter contains id, date_created, date_last_modified, date_deleted, and data (project_id, predicate, inclusion_description, exclusion_description)
  • min_confidence {float} minimum confidence value of a paper (between 0 and 1)
  • max_confidence {float} maximum confidence value of a paper (between 0 and 1)
  • start {integer} offset position from which results are returned
  • count {integer} number of papers to return

The output is:

  • results {array[object]} the list of papers found
  • totalResults {integer} the number of results found

To modify the callback operation on the response, edit src/delegates/papers.delegate.js.
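A sketch of assembling the automated-search body from the parameters above. The helper name and defaults are assumptions; min/max confidence are clamped to the documented [0, 1] range:

```javascript
// Sketch: build the body for POST /external/automated. The field names
// match the parameter list above; everything else is hypothetical.
function buildAutomatedSearchBody({
  title,
  description,
  arrayFilter = [],
  min_confidence = 0,
  max_confidence = 1,
  start = 0,
  count = 10,
}) {
  // keep confidence bounds inside the documented [0, 1] range
  const clamp = (x) => Math.min(1, Math.max(0, x));
  return {
    title,
    description,
    arrayFilter,
    min_confidence: clamp(min_confidence),
    max_confidence: clamp(max_confidence),
    start,
    count,
  };
}
```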

automated screening (fake service)

The endpoint is POST https://crowdai-slr-api-dev.herokuapp.com/external/automatedEvaluation.

The input parameters are:

  • arrayPaper {array[object]} an array of paper objects to evaluate
  • arrayFilter {array[object]} an array of filter objects, where each filter contains id, date_created, date_last_modified, date_deleted, and data (project_id, predicate, inclusion_description, exclusion_description)
  • project_id {int} the id of the project, used to mark progress

The output is an object with one property per paper_id; each property contains another object with "value" (the final confidence value of the paper) and "filters" (an array of {filter_id: filter_value} tuples).

Example output: { paper1: { value: 0.50, filters: [ {filter1: 0.40}, {filter2: 0.60} ] }, paper2: { value: 0.50, filters: [ {filter1: 0.40}, {filter2: 0.60} ] } }
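A sketch of walking that output shape on the client side — each paper_id maps to { value, filters }. The helper name and the flattened summary shape are assumptions:

```javascript
// Sketch: summarize the screening output documented above.
// summarizeScreening is a hypothetical helper name.
function summarizeScreening(output) {
  return Object.entries(output).map(([paperId, result]) => ({
    paperId,
    confidence: result.value,
    // flatten [{filter1: 0.40}, {filter2: 0.60}] into one lookup object
    filters: Object.assign({}, ...result.filters),
  }));
}
```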

The endpoint to get the progress is GET https://crowdai-slr-api-dev.herokuapp.com/external/automatedEvaluation.

All requests arriving within 3 seconds of calling automatedEvaluation will return false; those arriving after 3 seconds will return true. The input parameters are:

  • project_id {int}

To modify the callback operation on the response, edit src/delegates/screening.delegate.js.
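A sketch of assembling the screening request body from the parameters above. The helper name and its validation are assumptions:

```javascript
// Sketch: build the body for POST /external/automatedEvaluation.
// The field names match the parameter list above; buildScreeningBody
// is a hypothetical helper name.
function buildScreeningBody(arrayPaper, arrayFilter, project_id) {
  if (!Array.isArray(arrayPaper) || !Array.isArray(arrayFilter)) {
    throw new TypeError("arrayPaper and arrayFilter must be arrays");
  }
  if (!Number.isInteger(project_id)) {
    throw new TypeError("project_id must be an integer");
  }
  return { arrayPaper, arrayFilter, project_id };
}

// Progress can then be polled with GET on the same path, passing the
// project_id; per the note above, the fake service answers false for
// the first 3 seconds and true afterwards.
```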

Localhost deployment

pdf parse

  1. Clone the project from https://github.com/allenai/science-parse.
  2. Install JDK and SBT on your machine.
  3. Run sbt server/assembly from the root of the project to compile the server into a super-jar file.
  4. Find the generated jar file.
  5. Execute the command java -Xmx6g -jar jarfile.jar to start the service on http://localhost:8080.

On first startup, the service will download ~2GB of model files.

The recommended amount of free RAM is >= 8 GB: 6 GB of heap memory plus 2 GB of off-heap memory.

If you want to change the port, edit 'server/src/main/scala/org/allenai/scienceparse/SPServer.java' at line 97.
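The steps above can be sketched as a command sequence. The jar location depends on the science-parse and Scala versions, so this sketch locates it with find rather than hard-coding a path:

```shell
# Sketch of the localhost deployment steps above (not a tested script).
git clone https://github.com/allenai/science-parse.git
cd science-parse
sbt server/assembly                              # step 3: build the super-jar
JAR=$(find server/target -name '*assembly*.jar' | head -n 1)  # step 4
java -Xmx6g -jar "$JAR"                          # step 5: serves on localhost:8080
```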

similar-paper search

https://github.com/allenai/citeomatic/blob/master/README.md