Sep 6, 2023 - UTMediaCAT/mediacat-docs GitHub Wiki

Agenda

  • Update on training
  • Update on server/crawls
  • Finding a new meeting time

Training

  • dug into postprocessor with Francisco
    • different directories and scripts
    • visualizations that were started
  • crawler:
    • need another session with Gy to get into the nitty-gritty
    • Raazia & Aryan both gained access to crawler

Weekly meeting

  • two time slots identified; need to check with Francisco

Crawler/server update

  • graham server is now working!
  • different methods to install the NPM package on the new graham server
    • adjusting the code for each crawl to work around the block
    • so far not blocked
    • crawls are running, not super slow, need to wait two weeks
  • NPM installation: need to install puppeteer as well
    • need to keep troubleshooting
  • disk image malformed errors: Nat has suggestions
    • Gy will attempt some of this troubleshooting, but prefers to do it on the new server rather than the old one
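
On the disk image malformed errors: if these are SQLite's "database disk image is malformed" errors (an assumption; the notes do not say which database or file is affected), a quick integrity check can confirm whether a given file is corrupted before trying any fixes. A minimal Python sketch, where `check_sqlite_integrity` and the path passed to it are hypothetical names used only for illustration:

```python
import sqlite3


def check_sqlite_integrity(db_path):
    """Return a list of problems found in the SQLite file (empty list if none).

    Assumption: the affected file is a SQLite database; db_path is hypothetical.
    """
    conn = sqlite3.connect(db_path)
    try:
        # PRAGMA integrity_check returns a single row "ok" for a healthy file,
        # otherwise one row per problem it finds.
        rows = conn.execute("PRAGMA integrity_check").fetchall()
    except sqlite3.DatabaseError as exc:
        # Badly damaged files can fail before the check completes.
        return [str(exc)]
    finally:
        conn.close()
    messages = [row[0] for row in rows]
    return [] if messages == ["ok"] else messages
```

An empty list means the check passed. Note that `sqlite3.connect` creates an empty database if the path does not exist, so the path should be verified first.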

Action Items

  • continue to read code & think about role - Ar/Ra
  • meet with Gy on Friday at 4pm to see how the crawler is set up - Ar/Ra/Gy
    • consider some crawl issues that have come up and some of Nat's solutions
    • explain the troubleshooting needed to install the NPM library on the new server
  • continue troubleshooting the new server - Gy
  • will try to work on the disk image malformed error - Gy