How To Use The Executable - michaeltelford/wgit GitHub Wiki
When you install the wgit gem you also get an executable by the same name. The wgit executable starts an interactive shell session (pry if installed, irb if not) with the Wgit gem already required.
Start the executable with:
$ wgit
Skipping .env load because 'dotenv' isn't installed
Searching for .wgit.rb file in local and home directories...
Using 'irb' REPL because 'pry' isn't installed
wgit v0.11.0
------------
irb(main):001> url = Wgit::Url.new 'http://example.com'
=> #<Wgit::Url url="http://example.com" crawled=false>
irb(main):002> doc = Wgit::Crawler.new.crawl url
=> #<Wgit::Document url="http://example.com" html_size=1255>
Type exit or press Ctrl+D to finish and exit your session.
Connecting to a Database
When connecting to a database with Wgit, you can either pass the connection string directly to methods like Wgit::Database.new or set ENV['WGIT_CONNECTION_STRING'], which lets you omit the connection string param.
Therefore, you can set the environment variable when you start the executable:
$ WGIT_CONNECTION_STRING="<your_connection_string>" wgit
[1] pry(main)> db = Wgit::Database.new
Because the connection string is set in ENV, you need not pass a connection string parameter.
Instead of providing the connection string on the command line every time, you can use .env and .wgit.rb files to set and store it in ENV (see below).
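The fallback described above can be pictured with a small Ruby sketch. The helper name resolve_connection_string is hypothetical and is not part of Wgit. It assumes that an explicit argument takes precedence over ENV['WGIT_CONNECTION_STRING'], and that having neither is an error.

```ruby
# Hypothetical helper (not part of Wgit): prefer an explicit connection
# string and fall back to the WGIT_CONNECTION_STRING environment variable.
def resolve_connection_string(explicit = nil, env = ENV)
  value = explicit || env['WGIT_CONNECTION_STRING']
  raise ArgumentError, 'no connection string given or set in ENV' if value.nil? || value.empty?

  value
end
```

The env parameter defaults to the real ENV but accepts any Hash-like object, which makes the fallback easy to try out without touching your shell environment.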
Using .env to set the connection string
Start by creating a .env file. You'll also need to install the dotenv gem. For example:
$ gem install dotenv
$ touch .env
$ echo "WGIT_CONNECTION_STRING='<connection_string>'" >> .env
[Optional] Using .wgit.rb to connect to the database
The wgit executable will look for and eval a .wgit.rb file, if one can be found. The two locations that are searched (in order) are the local directory and the home directory.
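The search order can be sketched in a few lines of Ruby. This is an illustration of the documented behaviour, not the executable's actual code, and the method name find_wgit_rb is hypothetical.

```ruby
# Hypothetical sketch of the documented search order, not the executable's
# actual code: check the local directory first, then the home directory.
# Returns the first matching path, or nil if neither directory has the file.
def find_wgit_rb(dirs = [Dir.pwd, Dir.home])
  dirs.map { |dir| File.join(dir, '.wgit.rb') }
      .find { |path| File.file?(path) }
end
```

Because the local directory comes first in the list, a project-specific .wgit.rb wins over the one in your home directory whenever both exist.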
You can therefore use a .wgit.rb file to store fixtures and configuration, and to define helper functions that make it easy to index and search the web.
Start by creating a .wgit.rb file in either your local or home directory:
$ touch .wgit.rb
Save the following in your .wgit.rb file to connect to a database instance:
def db
  # We omit the <connection_string> param because it's set in ENV
  @db ||= Wgit::Database.new
end
Now, as soon as you start a shell session, you can access your database with calls like db.search(...).
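The @db ||= idiom in the helper above memoizes the connection, so repeated calls to db reuse one object. A minimal sketch of that behaviour follows. FakeDatabase is a hypothetical stand-in for Wgit::Database so that the example runs without the gem or a live database.

```ruby
# Stand-in for Wgit::Database so the sketch runs without the gem or a
# database; it only counts how many times it has been instantiated.
class FakeDatabase
  @instances = 0
  class << self
    attr_accessor :instances
  end

  def initialize
    self.class.instances += 1
  end
end

# Same shape as the helper above: create the object on first use only.
def db
  @db ||= FakeDatabase.new
end

first = db
second = db
puts first.equal?(second)    # whether both calls returned the same object
puts FakeDatabase.instances  # how many times the constructor ran
```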
Tips & Tricks
- Because of scoping rules, any variables defined in .wgit.rb should be instance variables (e.g. @url) or be accessed via a getter method (e.g. def url; ...; end).
- require 'wgit/core_ext' in your .wgit.rb file so you can use methods like String#to_url etc.
- Remove the Wgit namespace around its classes by adding include Wgit to your .wgit.rb file.
- Include the Wgit::DSL which provides convenience methods for crawling, indexing and searching.
- If you find yourself doing the same thing regularly e.g. indexing the same site, then define a helper function in your .wgit.rb file to execute with a single call.
- It can be helpful to keep your personal .wgit.rb file in the home directory and override it with a local .wgit.rb file when working on a specific project.
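The scoping tip can be seen with plain Ruby and no Wgit code at all. The sketch below evaluates a string standing in for a .wgit.rb file. Using instance_eval on a plain object is an assumption made for illustration; the executable may evaluate the file differently, but Ruby drops local variables defined in evaluated code in a similar way.

```ruby
# Stand-in for the contents of a .wgit.rb file.
config = <<~RUBY
  local_url = 'http://example.com' # local variable: gone once eval returns
  @url = 'http://example.com'      # instance variable: stays on the receiver
  def url                          # getter method: stays callable
    @url
  end
RUBY

# Assumption: instance_eval on a plain object approximates how the file is
# evaluated; the real executable may evaluate it differently.
session = Object.new
session.instance_eval(config)

puts session.instance_variable_get(:@url)                  # instance variable is still set
puts session.url                                           # getter still works
puts session.instance_eval('defined?(local_url)').inspect  # local variable is not defined
```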