PhantomJS script - seagatesoft/webdext GitHub Wiki
- Install PhantomJS
- Install NodeJS and NPM
- Clone Webdext repository
git clone git@github.com:seagatesoft/webdext.git
- Enter the Webdext directory and run
npm install
- Run
gulp build-phantom
and the required files will be built into the build directory
phantomjs intellextract.js <page_url> <output_path>
- page_url: URL of the web page containing the list of data records
- output_path: Path to store the extraction result (JSON format)
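To run the extraction over several pages, the command above can be wrapped in a small shell loop. The following is only a sketch: the function name extract_all, the URL list file, and the page-N.json output naming are hypothetical choices, and it assumes phantomjs is on the PATH and that it is run from the directory containing intellextract.js.

```shell
#!/bin/sh
# Hypothetical helper: run intellextract.js once per URL listed in a file.
# Assumes phantomjs is on PATH and the current directory contains
# intellextract.js (for example, the build directory).
extract_all() {
    url_file=$1   # text file with one page URL per line
    out_dir=$2    # directory that receives one JSON file per URL
    mkdir -p "$out_dir" || return 1
    n=0
    # The "|| [ -n ... ]" part also handles a last line without a newline.
    while IFS= read -r url || [ -n "$url" ]; do
        # Skip blank lines in the URL list.
        [ -n "$url" ] || continue
        n=$((n + 1))
        # Output names (page-1.json, page-2.json, ...) are an arbitrary choice.
        # stdin is redirected so the command cannot consume the URL list.
        phantomjs intellextract.js "$url" "$out_dir/page-$n.json" </dev/null \
            || echo "extraction failed for $url" >&2
    done < "$url_file"
}
```

For example, extract_all urls.txt results would write results/page-1.json for the first URL in urls.txt, results/page-2.json for the second, and so on.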
phantomjs wrapperextract.js <wrapper_path> <page_url> <output_path>
- wrapper_path: Path to the wrapper file. You can create it using the Chrome extension
- page_url: URL of the web page containing the list of data records
- output_path: Path to store the extraction result (JSON format)
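Since wrapperextract.js depends on an existing wrapper file, it can help to check the inputs before calling it. The sketch below is one possible way to do that, not part of Webdext itself: the function name run_wrapper and its return codes are hypothetical, and treating an empty or missing output file as a failure is an assumption of this example.

```shell
#!/bin/sh
# Hypothetical helper around wrapperextract.js.
# Assumes phantomjs is on PATH and the current directory contains
# wrapperextract.js (for example, the build directory).
run_wrapper() {
    wrapper_path=$1
    page_url=$2
    output_path=$3
    # Fail early if the wrapper file is missing.
    if [ ! -f "$wrapper_path" ]; then
        echo "wrapper file not found: $wrapper_path" >&2
        return 2
    fi
    phantomjs wrapperextract.js "$wrapper_path" "$page_url" "$output_path" || return 1
    # Assumption of this sketch: an empty or missing output file is a failure.
    if [ ! -s "$output_path" ]; then
        echo "no output written to $output_path" >&2
        return 1
    fi
}
```

A call then mirrors the command line above: run_wrapper <wrapper_path> <page_url> <output_path>.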