Machine Learning: Advanced features - QEDK/clarity GitHub Wiki
If you've read everything above and are ready to tinker, we have additional features so you can get the maximum performance. The default `nlp.process()` makes `asyncio` run the analysis functions concurrently; however, this is bottlenecked by the Python GIL. You can easily get better performance by using `multiprocessing` and spinning off each `async` function into a separate process. The API allows for this:
```python
import asyncio

import spacy
from textblob import TextBlob

from ml.processtext import ProcessText

sp = spacy.load("en_core_web_md")
nlp = ProcessText()

async def main():
    doc = sp("Some text to analyze")
    blob = TextBlob(doc.text)
    result1 = await nlp.get_formatted_entities(doc)
    # You can then fork this using multiprocessing
    # result2 = await nlp.get_sentiment(doc, blob)
    # And so on....

asyncio.run(main())
```
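As a minimal sketch of the multiprocessing approach, the snippet below runs CPU-bound analysis functions in separate processes via `ProcessPoolExecutor` and gathers their results with `asyncio`. The `get_entities` and `get_sentiment` functions here are hypothetical stand-ins, not the real `ProcessText` methods:

```python
import asyncio
from concurrent.futures import ProcessPoolExecutor

# Stand-in analysis functions: placeholders for the real CPU-bound
# ProcessText methods, which the GIL would otherwise serialize.
def get_entities(text):
    # Naive "entity" extraction: capitalized words.
    return [word for word in text.split() if word.istitle()]

def get_sentiment(text):
    # Placeholder sentiment score.
    return 0.0

async def analyze(text):
    loop = asyncio.get_running_loop()
    with ProcessPoolExecutor() as pool:
        # Each function runs in its own worker process, so the main
        # interpreter's GIL no longer limits the CPU-bound work.
        entities, sentiment = await asyncio.gather(
            loop.run_in_executor(pool, get_entities, text),
            loop.run_in_executor(pool, get_sentiment, text),
        )
    return entities, sentiment

if __name__ == "__main__":
    print(asyncio.run(analyze("Alice met Bob in Paris")))
```

The same pattern applies to the real `ProcessText` functions: submit each one to the pool and `await` the gathered results.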
Similar to how `makemodel.py` can be run as a script, you can also use `processtext.py` as a script:

```shell
$ python3 processtext.py "<Write the text you want to analyze>"
```
You'll get a JSON output similar to what's shown above, along with the time taken to produce it, for utility purposes.