Home - UTMediaCAT/mediacat-docs GitHub Wiki

About

Contact info

  • The Slack for this project is teammediacat.slack.com

Installation and Use

  • MediaCat code is stored in three repositories. Each repository documents how to run and manage its component of the MediaCat stack.

mediacat-twitter-API-crawler

mediacat-twitter-API-crawler takes in a scope document in a prescribed format and crawls Twitter handles, bringing back the contents of their tweets. The end result is one or more .csv files containing all the tweets for the target Twitter users. Detailed information on how to run and troubleshoot this application is available in the repository at:

mediacat-domain-crawler

mediacat-domain-crawler takes in a scope document in a prescribed format and crawls domains, bringing back the HTML contents of individual domains. Detailed information on how to run and troubleshoot this application is available in the repository at:

Post-processor

The post-processor takes in the results from both the Twitter and domain crawlers and produces a .csv file in a prescribed format, from which a user can determine citation practices and approaches between the in-scope Twitter and news media sources. Detailed information on how to run and troubleshoot this application is available in the repository at:

Troubleshooting

  • Additional developer documentation is linked in the README

Credit

  • All students that worked on it and grant funding that supported it