Features - Sablayrolles/debates GitHub Wiki
Debates wiki -- Features
Requirements
A Stanford CoreNLP server must be running. Start it with the following command:
java -mx4g -cp "*" edu.stanford.nlp.pipeline.StanfordCoreNLPServer -annotators "tokenize,ssplit,pos,lemma,parse,sentiment" -port 9000 -timeout 30000
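To check that the server started by the command above responds, a minimal sketch using only the Python standard library could look like the following. It assumes the server listens on `localhost` at port 9000 as configured above; the helper names `build_request` and `annotate` are hypothetical and not part of this project.

```python
import json
import urllib.parse
import urllib.request

# Assumptions matching the command above: local server, port 9000, same annotators.
SERVER_URL = "http://localhost:9000"
ANNOTATORS = "tokenize,ssplit,pos,lemma,parse,sentiment"


def build_request(text, server_url=SERVER_URL, annotators=ANNOTATORS):
    """Build a POST request asking the server to annotate `text` as JSON."""
    properties = {"annotators": annotators, "outputFormat": "json"}
    # The server reads its settings from the `properties` URL parameter.
    query = urllib.parse.urlencode({"properties": json.dumps(properties)})
    return urllib.request.Request(
        f"{server_url}/?{query}", data=text.encode("utf-8"), method="POST"
    )


def annotate(text, timeout=30):
    """Send `text` to the server and return the parsed JSON response."""
    with urllib.request.urlopen(build_request(text), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Calling `annotate("My cat is eating the mouse!")` while the server is running should return a dictionary of annotations; if the server is not reachable, `urllib` raises an error instead.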
Feature Extraction
Segmentation of the discourse into a table of sentences
import sys
sys.path.append("..")
import my_coreNLP.parseNLP as parseNLP
import features.saveData as saveData
import features.computeFeatures as computeFeatures
NLP = parseNLP.StanfordNLP()
txt = {"num": 1, "question": 1, "edu": "My cat is eating the mouse!"}
# num: number of the EDU
# question: number of the associated question
# edu: text of the EDU
s = saveData.compute(txt, NLP)
f = computeFeatures.returnFeatures(s, ["as?", "as!", "nb1stPers", "nb2ndPers"])
print(f)
List of features
default
- num : number of the EDU in corpus
- edu : text of EDU
- question : number of the associated question in the debate
optional
- "as?" : returns 1 if the text contains a '?' character, 0 otherwise
- "as!" : returns 1 if the text contains a '!' character, 0 otherwise
- "as..." : returns 1 if the text contains the sequence '...', 0 otherwise
- "nb1stPers" : returns the number of first-person personal pronouns (singular and plural)
- "nb2ndPers" : returns the number of second-person personal pronouns (singular and plural)
- "nb3rdSingPers" : returns the number of third-person singular personal pronouns
- "nb3rdPluPers" : returns the number of third-person plural personal pronouns
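To make the optional features concrete, here is a rough, standalone sketch of how they could be computed from the raw EDU text. It is not the project's implementation: the function `surface_features` and the English pronoun lists are assumptions for illustration, and the project may rely on CoreNLP annotations instead.

```python
import re

# Assumed English personal pronouns (subject and object forms only).
# Note: "her" can also be a possessive, which this sketch does not distinguish.
FIRST_PERSON = {"i", "me", "we", "us"}
SECOND_PERSON = {"you"}
THIRD_SINGULAR = {"he", "him", "she", "her", "it"}
THIRD_PLURAL = {"they", "them"}


def surface_features(edu):
    """Return the optional features listed above for one EDU string."""
    # Lowercase alphabetic tokens; "it's" becomes "it" and "s".
    tokens = re.findall(r"[a-z]+", edu.lower())

    def count(pronouns):
        return sum(1 for token in tokens if token in pronouns)

    return {
        "as?": int("?" in edu),
        "as!": int("!" in edu),
        "as...": int("..." in edu),
        "nb1stPers": count(FIRST_PERSON),
        "nb2ndPers": count(SECOND_PERSON),
        "nb3rdSingPers": count(THIRD_SINGULAR),
        "nb3rdPluPers": count(THIRD_PLURAL),
    }


print(surface_features("Do you think they saw us... or me?"))
```

Matching against a fixed word list keeps the sketch free of dependencies, at the cost of ignoring context that a part-of-speech tagger would capture.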
Examples
The following example files can help you use this project:
Files | Description |
---|---|
example_features_extract.py | Extraction of a feature from a sentence |