TestingRedirectionsToNewSite - greenpeace/gpes-check-my-pages GitHub Wiki

Testing www.greenpeace.org redirections to the new site

This page describes how to test the redirections from the old site to the new one. It requires working in the terminal.

Install the scripts you need

Install the following:

  • ecounter, used below to extract the urls from the sitemap
  • check-my-pages, the testing script

Download the current sitemap

Use the browser or the wget command, as in the example below:

$ wget www.greenpeace.org/espana/sitemap.xml

Convert the sitemap to a simple file with the urls

Ecounter searches for urls in your sitemap.xml file and saves them, one per line, in sitemap.csv. Use the command:

$ ./ecounter -count=urls -input=sitemap.xml -output=sitemap.csv
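As a cross-check on the number of urls in sitemap.csv, the sitemap can also be scanned with standard tools. The following is a minimal sketch, not how ecounter works: `extract_sitemap_urls` is a hypothetical helper, and it assumes each `<loc>…</loc>` element sits on a single line.

```shell
# Hypothetical helper, not part of ecounter.
# Prints every <loc> url found in a sitemap file, one per line.
# Assumption: no <loc>...</loc> element is split across lines.
extract_sitemap_urls() {
    grep -o '<loc>[^<]*</loc>' "$1" | sed -e 's/^<loc>//' -e 's/<\/loc>$//'
}
```

Running `extract_sitemap_urls sitemap.xml | grep -c ''` prints the number of urls found, which can be compared with the number of lines in sitemap.csv.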

Create a www.greenpeace.org redirects site

If the old Planet 3 site is still working you will need to create a redirect to your redirection subdomain.

First ensure that you have a server somewhere listening for requests to www.greenpeace.org and forwarding them to your redirects subdomain. If it's an Nginx server, your .conf file will look something like:

$ cat /etc/nginx/sites-available/www.greenpeace.org
server {
        listen 80;
        listen [::]:80;
        server_name www.greenpeace.org;
        return 301 $scheme://redirects-es.greenpeace.org$request_uri;
}

Of course this will not work for normal users, because their requests still reach the real www.greenpeace.org server. But it will work for you if you change the hosts file on your own computer.

Change your hosts file

This is done differently on each operating system. On a Mac:

Edit the /etc/hosts file:

$ sudo pico /etc/hosts

adding a line at the end of the hosts file like:

35.195.30.204 www.greenpeace.org

replacing 35.195.30.204 with the IP of your fake www.greenpeace.org server.

Flush your computer's DNS cache with:

$ sudo dscacheutil -flushcache;sudo killall -HUP mDNSResponder

(Example for Mac OSX)

Then close all the www.greenpeace.org pages in your browser and clear the browser's cache.

Now if you visit your www.greenpeace.org site, you'll be forwarded to your redirects script.

Test the urls

Assuming you have:

  1. The testing script check-my-pages installed on your computer
  2. The file sitemap.csv created from your sitemap.xml as described above.

you can start the tests. Depending on the size of your site, some tests can take a long time. Expect to check about 5 urls per second, or 300 urls per minute.
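To get a rough idea of the waiting time before starting, the rate above can be applied to the number of lines in the url list. This is only a sketch: `estimate_minutes` is a hypothetical helper, and the real rate depends on your server and connection.

```shell
# Hypothetical helper: estimates the run time, in whole minutes, for a url list,
# using the rough rate of 300 urls per minute quoted above.
estimate_minutes() {
    urls=$(grep -c '' "$1")          # number of lines (one url per line) in the file
    echo $(( (urls + 299) / 300 ))   # divide by 300, rounding up
}
```

For example, `estimate_minutes sitemap.csv` prints 2 for a list of 600 urls.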

Test the http responses and final urls

The first test is to check the final urls of your redirects and the http response codes. In the folder where you downloaded check-my-pages and sitemap.csv, run:

$ ./check-my-pages -urls=sitemap.csv -http -miliseconds=100

This will check the http responses, mime-type, file size and final url for each url in sitemap.csv. It will create a file named httpResponses.csv in the same folder.

Now open the csv and inspect the results:

  • Does each url have a 200 (OK) response (in the new site and the archive site)?
  • Are the redirects correct?
  • Are the mime types correct?
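For a large site, a quick summary in the terminal can help before opening the file in a spreadsheet. The sketch below counts the rows per distinct value of one column. `count_by_column` is a hypothetical helper, and both the column number and the presence of a header row are assumptions: check the first line of httpResponses.csv to see which column holds the response code.

```shell
# Hypothetical helper: counts the rows of a csv file per distinct value of one column.
# $1 = csv file, $2 = 1-based column number (an assumption, check the file's first line).
# Assumes the first row is a header and skips it.
# Naive comma splitting: a field that itself contains a comma will shift the columns.
count_by_column() {
    awk -F',' -v col="$2" 'NR > 1 { count[$col]++ } END { for (v in count) print count[v], v }' "$1" | sort -rn
}
```

If the response code turned out to be in the second column, `count_by_column httpResponses.csv 2` would list each code with the number of urls that returned it, most frequent first.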

Urls that are going to be dead soon

You should check that your archive site doesn't contain links to, or load resources from, your soon-to-be-dead site.

First, create a file named archiveurls.csv with all the urls from the archive.

Then work out the regular expression that matches the urls to avoid. For our site it is https?://(\w|-)+.greenpeace.org/espana/.+

Finally, run the command with the checks:

$ ./check-my-pages -urls=archiveurls.csv -linkpattern -cssjspattern -mediapattern -pattern='https?://(\w|-)+.greenpeace.org/espana/.+'

After your archive is scanned you can find the results in linkpattern.csv, cssjspattern.csv and mediapattern.csv.
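Before scanning a large archive, it can be worth trying the pattern on a couple of sample urls. The following is a minimal sketch using grep, which only approximates whatever regular expression engine check-my-pages uses. `flagged_urls` is a hypothetical helper, the sample urls are illustrative, and it assumes a grep that accepts `\w` in extended regular expressions (GNU grep does).

```shell
# The pattern from the example above, unchanged.
pattern='https?://(\w|-)+.greenpeace.org/espana/.+'

# Hypothetical helper: reads urls on stdin, one per line,
# and prints only those that the pattern matches.
flagged_urls() {
    grep -E "$pattern"
}
```

For example, `printf '%s\n' 'http://www.greenpeace.org/espana/es/' 'https://example.org/page' | flagged_urls` prints only the first url. Note that the unescaped dots in the pattern match any character, so it is slightly broader than it looks.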