Oct 26, 2023 - UTMediaCAT/mediacat-docs GitHub Wiki

Crawler/server

try other domains on the Graham instance - Gy
restart the small domain crawler on Graham - Gy
add counter for IA crawler - Ra
see how many results from the Mondoweiss IA crawl were actual articles - Ra
enter Mondoweiss IA Crawl into the Crawl index - Ra
on Saturday, create a 200-item result set from the Mondoweiss IA crawl and email Francisco and Alejandro - Ra
start developing unit tests for postprocessing - Ar
postprocess Washington Post Twitter results - Fr
check whether it is difficult for the postprocessor to accept article length - Ar
remove vulnerable files/libraries from archived postprocessor - Fr
look at adding article length (not crucial) - Ra
check if postprocessor applies tags to citations and citing articles - Fr
delete data on small instance that's running with local storage - Gy
check postprocessed result of small data set from Raazia - Fr