# How to query Google with Hpricot from your command line

The other day I found myself sitting in front of my command line, about to
look something up on Google. No big deal: a Google search box is part of every
major browser, and plenty of OS-integrated search options are available as
well. But I'm old school. The command line is where I spend most of my time,
which means that switching to the web browser is a context switch. Mac OS X
became so appealing exactly because it is basically Unix with a command-line
shell **and** a nice GUI in front of it, and not a GUI **instead** of a shell.

I put together a little Ruby script (called `g`) to run some of the
most basic Google queries from the command line, like:

    $ g ruby command line tools <return>
    -> shows index page with results found for further query

or a direct jump to the "I'm Feeling Lucky" result:

    $ g :lucky ruby -python google <return>
    -> opens the browser directly on the first link found

I use `':'` to _modify_ the default behaviour of the script. Normally you would
do this with options, but I preferred `':'` over `'-'` so that the dash stays
reserved for the query itself. `:lucky` simply overrides the default behaviour
here. The next modifiers to implement were `:count` and `:fight`, like:

    $ g :count your search here
    about 1,330,000,000 results for <your search here> (0.09 seconds)

    $ g :fight "left" "right"
    http://www.googlefight.com

`:count` is obvious, and `:fight` of course got its inspiration from the
awesome <http://www.googlefight.com/>.
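
Here is a minimal sketch of how the `':'` modifiers could be separated from the query words. `parse_args` is a hypothetical helper written for illustration, not the actual code of `g`; it only assumes that every argument starting with `':'` is a modifier and everything else belongs to the query:

```ruby
# Hypothetical helper (not taken from the g script): split the command-line
# arguments into ':' modifiers and the words of the actual query.
def parse_args(argv)
  modifiers, words = argv.partition { |arg| arg.start_with?(":") }
  # Strip the leading ':' and turn each modifier into a symbol.
  modifiers = modifiers.map { |m| m[1..-1].to_sym }
  [modifiers, words]
end

modifiers, words = parse_args(%w[:lucky ruby -python google])
# modifiers now holds the symbol :lucky, and words holds the three
# query terms, including "-python" with its dash intact.
```

Because only the colon is treated specially, a term like `-python` passes through untouched and keeps its usual meaning in the Google query.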

I used [Hpricot][1] because I got used to it and it does everything I need
in an elegant and concise way. The script does **NOT** use the [Google
Ajax Search API][2]; it scrapes its results from the _normal_ HTML
response page. I simply didn't want to make my little script dependent on
signing up for a [Google API Key][2].

[1]: http://code.whytheluckystiff.net/hpricot/
[2]: http://code.google.com/apis/ajaxsearch/signup.html

With the right stuff in place,

    require 'rubygems'
    require 'cgi'
    require 'open-uri'
    require 'hpricot'

the `:lucky` screen scraping for example basically boils down to:

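    # escape each search word ("kleine suchanfrage" is German for "little search query")
    q = %w{kleine suchanfrage}.map { |w| CGI.escape(w) }.join("+")
    url = "http://www.google.com/search?q=#{q}"
    doc = Hpricot(open(url).read)
    # the first link inside a result block is the "lucky" one
    lucky_url = (doc/"div[@class='g'] a").first["href"]
    system "open #{lucky_url}"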

and you can easily spot a problem here: `system 'open ...'` is hardly
cross-platform, though on Mac OS X it opens the default browser with the given
URL. To give users a chance to customize things a little, I put the settings in
a defaults hash, which is overwritten at startup by values loaded from a user's
preference file in their home directory. My built-in default values are:

    C = { :count => 4,                       # number of results shown
          :indent => 8, :tw => 70,           # indentation and description width
          :goog => "http://www.google.com",  # where to ask?
          :open => "system 'open ${url}'",   # loads into HTTP browser on Mac OS X
        }

and the user preferences are loaded from: `~/.g`
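
The override step could look roughly like the sketch below. This post does not say which format `~/.g` uses, so the sketch assumes a plain YAML mapping (for example a line like `count: 10`); `load_settings` is a hypothetical name, not the function used in `g`:

```ruby
require 'yaml'

# Hypothetical loader (not taken from the g script): merge the user's
# preferences over the built-in defaults.
# Assumption: the preference file is a YAML mapping such as "count: 10".
def load_settings(defaults, path = File.expand_path("~/.g"))
  # No preference file: fall back to the defaults unchanged.
  return defaults.dup unless File.exist?(path)

  prefs = YAML.safe_load(File.read(path))
  # Ignore empty files or anything that is not a key/value mapping.
  return defaults.dup unless prefs.is_a?(Hash)

  # YAML keys arrive as strings, while the defaults hash uses symbols.
  prefs = prefs.each_with_object({}) { |(key, value), h| h[key.to_sym] = value }
  defaults.merge(prefs)
end
```

Calling `load_settings(C)` at startup would then return the defaults with any user-supplied values taking precedence, while keys missing from the preference file keep their built-in values.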