February 2, 2023 - UTMediaCAT/mediacat-docs GitHub Wiki
Agenda
- Alejandro's tasks - today!!!
- some meetings moving to 4pm: Feb 16th and Mar 2nd
- check in with Nat
- Globus non-cloud storage
- domain crawler
- look at numbers Shawn sent
- new small domain crawler working
Comms with Shengsong
- needs a chance to go over documentation
Backing up using Globus
- Digital Alliance documentation: basically nearline space; files go into a nearline directory, every 24 hours goes to ; min 10 GB, max 1 TB at a time
- how to find nearline space
Domain crawl
- working fine
- Shawn will check numbers
Action Items
- need to ask Digital Alliance about /nearline
- Nat: try to figure out how many webpages
- Shawn: will send 100 urls crawled from small domain crawl