Retrieving Traffic Data - dlsucomet/WebCrawler GitHub Wiki

To retrieve traffic data, export it to a CSV file and push the file to the repository, where anyone who needs it can access it.

Extract from console

  1. Access the server (How to access server here)
  2. In the console, run the following command: psql mmda_traffic
  3. Use one of the \COPY command formats below to export a specific set of data to CSV
  4. Enter \q to exit the psql console
  5. Navigate to the CSV data folder: cd /home/direksyon/WebCrawler/data/
  6. Push the CSV data to the repository**

**(You may use a new repository when retrieving large amounts of data, such as '2017 traffic data', but I suggest breaking it down by month to reduce the download time.)
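
The steps above can also be scripted. The following is only a sketch, not something from the repository: DRY_RUN, run and export_and_push are hypothetical names, the commit message is made up, and it assumes the data folder is inside a git working tree with a remote already configured. It relies on psql -c, which runs a single backslash command and exits, so no interactive session or \q is needed.

```shell
#!/bin/sh
# Sketch of steps 2-6. The database name and data folder come from this page;
# DRY_RUN, run and export_and_push are hypothetical helpers.
DATA_DIR="/home/direksyon/WebCrawler/data"
DRY_RUN="${DRY_RUN:-1}"   # 1 = only print the commands, 0 = execute them

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# $1 = CSV file name inside DATA_DIR, $2 = complete \COPY command
export_and_push() {
    file="$1"
    copy_cmd="$2"
    # Steps 2-4: run the export without opening an interactive psql session.
    run psql mmda_traffic -c "$copy_cmd" || return 1
    # Steps 5-6: assumes DATA_DIR is inside a git working tree with a remote.
    run git -C "$DATA_DIR" add "$file" || return 1
    run git -C "$DATA_DIR" commit -m "Add traffic export $file" || return 1
    run git -C "$DATA_DIR" push
}
```

After sourcing the file, calling export_and_push with a file name and one of the \COPY commands below prints the four commands it would run. Setting DRY_RUN=0 beforehand executes them instead.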

Extract specific month and year

\COPY (SELECT * FROM entries WHERE EXTRACT(YEAR FROM update_timestamp) = |_year_| AND EXTRACT(MONTH FROM update_timestamp) = |_month_|) TO '/home/direksyon/WebCrawler/data/|_filename_|.csv' WITH CSV DELIMITER ',' HEADER;

Extract specific year

\COPY (SELECT * FROM entries WHERE EXTRACT(YEAR FROM update_timestamp) = |_year_|) TO '/home/direksyon/WebCrawler/data/|_filename_|.csv' WITH CSV DELIMITER ',' HEADER;

Sample:

  • Extract data from December 2016: \COPY (SELECT * FROM entries WHERE EXTRACT(YEAR FROM update_timestamp) = 2016 AND EXTRACT(MONTH FROM update_timestamp) = 12) TO '/home/direksyon/WebCrawler/data/2016-12.csv' WITH CSV DELIMITER ',' HEADER;
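
To break a full year down by month, the month-and-year format can be filled in by a small shell helper. This is a sketch under assumptions: build_copy_cmd and build_year_cmds are hypothetical names, and the year-month file naming simply follows the December 2016 sample.

```shell
#!/bin/sh
# The data folder, table and column names come from this page;
# the function names are hypothetical.
DATA_DIR="/home/direksyon/WebCrawler/data"

# Print the \COPY command for one month: $1 = year, $2 = month (1-12).
build_copy_cmd() {
    year="$1"
    month="${2#0}"    # drop one leading zero so "08" is not parsed as octal
    # Zero-pad the month in the file name so the files sort chronologically.
    file=$(printf '%s/%s-%02d.csv' "$DATA_DIR" "$year" "$month")
    printf '%s\n' "\\COPY (SELECT * FROM entries WHERE EXTRACT(YEAR FROM update_timestamp) = $year AND EXTRACT(MONTH FROM update_timestamp) = $month) TO '$file' WITH CSV DELIMITER ',' HEADER;"
}

# Print twelve commands, one per month of the year given in $1.
build_year_cmds() {
    for m in 1 2 3 4 5 6 7 8 9 10 11 12; do
        build_copy_cmd "$1" "$m"
    done
}
```

For example, build_copy_cmd 2016 12 prints the sample command above. The output of build_year_cmds 2017 can be pasted into the psql console, or saved to a file and run with psql mmda_traffic -f followed by that file's name.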