February 23, 2023 - UTMediaCAT/mediacat-docs GitHub Wiki
Agenda:
- next week 4pm
- update notes from last day
- restarted crawl?
- add documentation about the json-csv conversion and re-starting crawl
- do a bit of code review of the re-start function
- question about key pair
- postprocessor
restarted crawl
- URL count still increasing: checked the count a few days ago, a few thousand more URLs for some domains
- had to restart a few times: too many failed requests and saw a break
- cycles through the domains
- email function not working
adding documentation?
- update readme files: push to github repository
- push the documentation
new small domain
key-pair
postprocessor
- spent a lot of time
- error handling: errors fail silently and aren't getting logged
- no error-logging code in certain places
- thinks the error has to do with combining the CSV files
- recombining after the meeting
- otherwise the documentation has been good enough
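Sketch for the silent-error issue above: a minimal example of combining CSV files while logging every failure instead of letting it pass quietly. This is not the postprocessor's actual code; the function name `combine_csv_files`, the logger name, and the assumption that all files share one header row are hypothetical.

```python
import csv
import logging

# Hypothetical logger name; the real postprocessor may configure logging differently.
logger = logging.getLogger("postprocessor")


def combine_csv_files(paths, out_path):
    """Concatenate CSV files that share a header row.

    Returns the list of input paths that could not be combined, so the
    caller can see failures instead of having them disappear.
    """
    failed = []
    header = None
    with open(out_path, "w", newline="", encoding="utf-8") as out:
        writer = csv.writer(out)
        for path in paths:
            try:
                with open(path, newline="", encoding="utf-8") as f:
                    reader = csv.reader(f)
                    file_header = next(reader, None)
                    if file_header is None:
                        raise ValueError("empty file")
                    if header is not None and file_header != header:
                        raise ValueError(f"header mismatch: {file_header!r}")
                    # Read the whole file before writing, so a parse error
                    # partway through does not leave partial rows in the output.
                    rows = list(reader)
                if header is None:
                    header = file_header
                    writer.writerow(header)
                writer.writerows(rows)
            except (OSError, ValueError, csv.Error):
                # Log with traceback and keep going with the remaining files.
                logger.exception("could not combine %s", path)
                failed.append(path)
    return failed
```

The returned list gives a quick way to report which files need recombining after a run.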
Action Items
- Alejandro: write to Globus support
- check on logs to see which domains are giving us the rejections
- slow down restarted crawl
- look into email function
- add Irfan's key-pair and Alejandro's
- Alejandro: change password
- add more logging of errors to postprocessor
- Alejandro: come up with domain crawl scope
- Alejandro: write to security person about adding Shawn to Graham
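Sketch for the log-check item above: one possible way to tally rejections per domain. The crawler's log format is not recorded in these notes, so this assumes the log has already been parsed into hypothetical `(url, status)` pairs, and it treats any HTTP status of 400 or higher as a rejection (also an assumption).

```python
from collections import Counter
from urllib.parse import urlparse


def count_rejections(records):
    """Count failed requests per domain.

    `records` is an iterable of (url, status) pairs (a hypothetical shape,
    not the crawler's real log format). Status >= 400 counts as a rejection.
    """
    counts = Counter()
    for url, status in records:
        if status >= 400:
            counts[urlparse(url).netloc] += 1
    return counts
```

Calling `count_rejections(records).most_common()` lists the domains with the most rejections first, which may help decide where to slow the crawl down.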