Spark Server API - pfriesch/onepercent GitHub Wiki

Jobs

JobSignature

{"jobID":"<jobIDAsHash>","name":"<name>","params":["<value>","<value>",...],"time":"<yyyy-mm-dd hh:mm:ss>"}

  • jobID: Unique Job ID
  • name: Identifier for the Job
  • params: Parameters for this Job, each passed as a string
  • time: Timestamp at which this Job was initiated
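
The JobSignature described above can be assembled on the client side roughly as follows. This is a minimal sketch, not the project's own client code: the function name build_job_signature is hypothetical, and the wiki only says the job ID is a hash, so the SHA-1 over name, params and time used here is an assumption.

```python
import hashlib
import json
from datetime import datetime


def build_job_signature(name, params, now=None):
    """Return a JobSignature as a compact JSON string."""
    now = now or datetime.now()
    # Matches the "<yyyy-mm-dd hh:mm:ss>" template of the time field.
    time_str = now.strftime("%Y-%m-%d %H:%M:%S")
    # All params are sent as strings, like the "<value>" placeholders.
    str_params = [str(p) for p in params]
    # Assumption: the job ID is a SHA-1 over name, params and time.
    # The wiki only specifies "<jobIDAsHash>", not the hash function.
    raw = json.dumps([name, str_params, time_str], ensure_ascii=False)
    job_id = hashlib.sha1(raw.encode("utf-8")).hexdigest()
    signature = {
        "jobID": job_id,
        "name": name,
        "params": str_params,
        "time": time_str,
    }
    return json.dumps(signature, separators=(",", ":"), ensure_ascii=False)
```

For example, build_job_signature("TopHashtagJob", ["2015-01-01 12:00:00", 10]) yields a signature whose params are both strings and whose time is the moment of the call.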

Results

Success Case:

  • jobID: Unique Job ID
  • jobResult: Result Object of this Job

{"jobID":"<jobIDAsHash>","jobResult":{...see Jobs...}}

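Failure Case:

  • jobID: Unique Job ID
  • jobResult: Error object containing errorMessage (the error message) and errorCode (one of the error codes listed below)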

{"jobID":"<jobIDAsHash>","jobResult":{"errorMessage":"<message>", "errorCode":<Code>}}
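
Since success and failure share the same envelope, a client has to inspect jobResult to tell them apart. The following is a minimal sketch of one way to do that. The function name parse_result is hypothetical, and treating "both errorMessage and errorCode are present" as the failure marker is an assumption drawn from the two templates above, not a documented rule.

```python
import json


def parse_result(raw):
    """Split a result message into (job_id, result, error).

    Exactly one of result and error is not None.
    """
    message = json.loads(raw)
    job_id = message["jobID"]
    job_result = message["jobResult"]
    # Assumption: a failure is recognised by the presence of both
    # errorMessage and errorCode inside jobResult.
    if (
        isinstance(job_result, dict)
        and "errorMessage" in job_result
        and "errorCode" in job_result
    ):
        error = (job_result["errorCode"], job_result["errorMessage"])
        return job_id, None, error
    return job_id, job_result, None
```

Returning the error as a (code, message) tuple keeps the caller free to decide whether to raise, retry or log.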

Error Codes

  • 100: Wrong job parameter
  • 101: Job execution failed
  • 400: Job not known or found
  • 404: JSON not parseable
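
On the client side, the error codes can be kept in a small lookup table. This is only a sketch: the names ERROR_DESCRIPTIONS and describe_error are hypothetical, and the fallback text for unlisted codes is an assumption, since the wiki does not say whether other codes can occur.

```python
# Error codes and descriptions as listed in this section.
ERROR_DESCRIPTIONS = {
    100: "Wrong job parameter",
    101: "Job execution failed",
    400: "Job not known or found",
    404: "JSON not parseable",
}


def describe_error(code):
    """Return the description for an errorCode, or a generic fallback."""
    # Assumption: codes not listed in the wiki get a generic message.
    return ERROR_DESCRIPTIONS.get(code, f"Unknown error code {code}")
```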

Jobs

Each job is listed by name with its params, its result object and, where applicable, additional notes.

TopHashtagJob

  • Params: "<yyyy-mm-dd hh:mm:ss>","<topX>"
  • Result: {"topHashtags":[{"hashtag":"<hashtag>","count":<count>},{"hashtag":"<hashtag>","count":<count>}],"countAllHashtags":<countAllCountedHashtags>}

ApacheFlumeJob

  • Params: "<method>"
  • Result: {"output":"<method output>"}
  • Additional: Available methods: start, stop, restart, status, log

ApacheSparkJob

  • Params: "<method>"
  • Result: {"output":"<method output>"}
  • Additional: Available methods: restart, status, log, debug, update

TweetsAtDaytimeJob

  • Params: "<daytime as timestamp>"
  • Result: {"countedTweets":[{"timestamp":"<time as timestamp>","count":<tweet count>},...]}
  • Additional: Allowed hours: 0-23. The earliest possible time is the current time minus 13 h.

WordSearchJob

  • Params: "<word to look for>"
  • Result: {"searchWord":"<word>","countedTweets":[{"timestamp":"<timestamp>","count":"<tweetcount>"},{"timestamp":"<timestamp>","count":"<tweetcount>"},...],"tweetIds":["<id>","<id>",...]}
  • Additional: Input is evaluated by the Node.js server. Allowed characters: [#]?[a-zA-Zäüö0-9]+. The timestamp marks the start of the analysis.

OriginTweetsJob

  • Params: "<timestamp>"
  • Result: {"timestamp":"<timestamp>","originTweetCount":<originTweetCount>,"retweetCount":<retweetCount>}

LanguageDistributionJob

  • Params: "<timestamp>"
  • Result: {"languages":[{"language":"<language>","count":<count>},...]}

LearnClassifierJob

  • Params: none
  • Result: {"msg":"<Success message>"}

CategoryDistributionJob

  • Params: "<timestamp>"
  • Result: {"distribution":[{"category":"<category>","count":<count>},...],"totalCount":<count>}
  • Additional: Each amount is calculated over one hour.

CategoryDistributionJob (with sample tweets)

  • Params: "<timestamp>","<sampleTweetCount>"
  • Result: {"distribution":[{"category":"<category>","count":<count>},...],"totalCount":<count>,"tweets":[{"text":"<tweetText>","categoryProb":[{"category":"<category1>","prob":<float 0..1>},{"category":"<category2>","prob":<float 0..1>}]}]}
  • Additional: Each amount is calculated over one hour. In addition, <sampleTweetCount> sample tweets are returned. Each tweet has a text and the probability of belonging to each category. Tweets below a certain threshold are disregarded.
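
The character rule given for WordSearchJob can be checked before a job is submitted. The sketch below copies the pattern from the job notes and applies it to the whole input. The function name is_valid_search_word is hypothetical, and whether the Node.js server matches the entire string in the same way is an assumption.

```python
import re

# Pattern copied from the WordSearchJob notes: an optional leading '#',
# followed by one or more letters (including ä, ü, ö) or digits.
SEARCH_WORD_PATTERN = re.compile(r"[#]?[a-zA-Zäüö0-9]+")


def is_valid_search_word(word):
    """Return True if the whole word consists of allowed characters."""
    # fullmatch requires the entire input to match, not just a prefix.
    return SEARCH_WORD_PATTERN.fullmatch(word) is not None
```

Note that the pattern as documented lists only the lowercase umlauts, so a word containing Ä, Ü, Ö or ß is rejected by this check, as is any input containing whitespace.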