July 29, 2021 - UTMediaCAT/mediacat-docs GitHub Wiki
Agenda
- set up politics subdomain under different instance
- update Apify
- look at postprocessor
- any ideas for speeding up: write down for next devs
- look at Compute Canada questionnaire
- for meeting with Kirsta: SWPP answers
- update on crawls
- question: where are the results for the crawls being stored -- please add to MVP
Apify update
- updating code to Apify 1.0
Politics subdomain
- logs showing politics articles
Instances:
- spoke to Jacqueline about how the instances were set up; she suggested merging them, but we already have a large Graham instance
Postprocessor
Crawler update
- NYT/Mid E crawl is still going: 100,000+ JSONs
- NYT Twitter crawl: all done