www - EranOfek/AstroPack GitHub Wiki

Description

The www package contains functions for reading and manipulating data on the World Wide Web.

Some of the functions in this package are old and no longer work properly.

Functions:

  • www.allFunList - Functions and Classes list for the www package
  • www.cgibin_parse_query_str - Break a URL query string into parameter names and values.
  • www.find_urls - Read the content of a URL and extract all the links it contains
  • www.find_urls_ftp - Find files in an FTP link
  • www.ftp_dir_list - Return file URLs from an FTP location containing a file listing.
  • www.html_page - Create an HTML file
  • www.html_table - Create an HTML table
  • www.isURL - Return a vector of logicals indicating if each element in an array is a URL string
  • www.mwget - A wrapper around the wget command
  • www.parse_html_table - Parse columns from an HTML table into MATLAB
  • www.parse_html_table_old - Parse columns from an HTML table into MATLAB
  • www.pwget - Parallel wget to retrieve multiple files simultaneously
  • www.r_files_url - Recursively get links to all files in a www directory listing.
  • www.rftpget - Recursively retrieve the entire directory tree in an FTP site
  • www.unitTest - Package Unit-Test
  • www.url2url_key_val - SHORT DESCRIPTION HERE
  • www.wget - A wrapper around wget. See also www.pwget
  • www.write_content_indexhtml - SHORT DESCRIPTION HERE

Selected examples:

www.find_urls

Given a URL, read its content, extract all the links it contains, and return them as a cell array. Optionally, the function can filter the links by substring ('strfind') or by regular expression ('match').

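URL  = 'http://www.weizmann.ac.il/home/eofek/matlab/';
List = www.find_urls(URL);                        % all links found in the page
List = www.find_urls(URL,'strfind','.m');         % keep links containing the substring '.m'
List = www.find_urls(URL,'match','http.*?\.m');   % keep links matching a regular expression
List = www.find_urls(URL,'match','.*?\.fits');    % keep links matching '.*?\.fits'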

The output is a cell array of all the URLs found in the page.
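
To illustrate what the two filter styles do, here is a minimal, self-contained sketch that works on an in-memory HTML string. It is not the implementation of www.find_urls; the `Html` string, the simplified `href` pattern, and the variable names are assumptions made for illustration only.

```matlab
% Hypothetical stand-in for downloaded page content (placeholder URLs).
Html = ['<a href="http://example.com/a.m">a</a> ' ...
        '<a href="http://example.com/b.fits">b</a> ' ...
        '<a href="http://example.com/c.txt">c</a>'];

% Extract the href targets (simplified pattern; real HTML can be messier).
Tok  = regexp(Html, 'href="([^"]+)"', 'tokens');
List = cellfun(@(c) c{1}, Tok, 'UniformOutput', false);

% Substring filter, similar in spirit to the 'strfind' option.
IsM   = ~cellfun(@isempty, strfind(List, '.m'));
ListM = List(IsM);

% Regular-expression filter, similar in spirit to the 'match' option.
IsFits   = ~cellfun(@isempty, regexp(List, '.*?\.fits$', 'once'));
ListFits = List(IsFits);
```

Note that a plain substring filter such as '.m' also keeps any link that merely contains those characters somewhere in its path, so a regular expression anchored at the end of the string is the safer choice when selecting by file extension.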

www.pwget

Execute multiple parallel sessions of wget to retrieve the data in a list of URLs. When there are many files, this is usually much faster than retrieving them one at a time with a single wget session.

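% Links holds the URLs to retrieve, e.g. the cell array returned by www.find_urls.
% The second argument is a string of extra options passed to wget.
www.pwget(Links,'--no-check-certificate -U Mozilla',10);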
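
The idea of running several wget sessions at once can be sketched as follows. This is only one possible approach under stated assumptions, not necessarily how www.pwget works internally: the placeholder links, the names `Extra`, `MaxGet`, `Nsess`, `Group`, and `Cmd`, and the round-robin split are all hypothetical.

```matlab
% Hypothetical inputs; the URLs below are placeholders, not real files.
Links  = {'http://example.com/f1.fits', 'http://example.com/f2.fits', ...
          'http://example.com/f3.fits', 'http://example.com/f4.fits', ...
          'http://example.com/f5.fits'};
Extra  = '--no-check-certificate -U Mozilla';   % extra wget options
MaxGet = 2;                                     % requested number of parallel sessions

Nlink = numel(Links);
Nsess = min(MaxGet, Nlink);                     % never create an empty session
Group = mod(0:Nlink-1, Nsess) + 1;              % round-robin session index per link

Cmd = cell(1, Nsess);
for I = 1:Nsess
    % wget accepts several URLs in one call and fetches them one after another
    Cmd{I} = sprintf('wget %s %s', Extra, strjoin(Links(Group == I), ' '));
end

% '&' starts each session in the background; 'wait' blocks until all finish
FullCmd = [strjoin(Cmd, ' & '), ' & wait'];
disp(FullCmd);
% system(FullCmd);   % uncomment to download (needs wget and a POSIX shell)
```

The speed-up comes from overlapping network latency across sessions, so it tends to level off once the connection or the remote server becomes the bottleneck.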