Chimps on the Command Line
Infochimps is an online data marketplace and repository where anyone can find, share, and sell data.
Infochimps offers two APIs for users to access and modify data:
a Dataset API to list, show, create, update, and destroy datasets and other resources
a Query API to query data from particular rows of these datasets
Chimps is a Ruby wrapper for these APIs that makes interacting with them simple. You can embed Chimps inside your web application or any other software you write.
But if you find yourself wishing that you could make queries, create datasets, &c. from your command line, where you already live, where you already keep your data…then Chimps CLI is for you:
# See your datasets
Installing Chimps CLI
Assuming you've already set up your Gem sources, just run
gem install chimps-cli
This will also install Chimps if it's not already present on your system.
Chimps CLI is just a command-line wrapper for the Chimps library. If Chimps is already properly configured with your API credentials then Chimps CLI will read them just fine without you having to do anything.
If you need to obtain API keys for either the Dataset API or the Query API then sign up at Infochimps.
You'll need to put your API keys into one of two files, either /etc/chimps/chimps.yaml or ~/.chimps. See the README for Chimps for more details on how to set up these configuration files.
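Before going further it can help to confirm that one of the two configuration files actually exists. The following is a minimal sketch, not part of Chimps itself: the helper name find_chimps_config is hypothetical, and checking the per-user file first is an assumption made for illustration (see the Chimps README for how the two files are really combined).

```shell
#!/bin/sh
# Print the first path among the arguments that is an existing file.
# Returns 1 if none of them exist.
# NOTE: the search order used below is an assumption of this sketch,
# not documented Chimps behaviour.
find_chimps_config() {
  for candidate in "$@"; do
    if [ -f "$candidate" ]; then
      printf '%s\n' "$candidate"
      return 0
    fi
  done
  return 1
}

# Check the two locations named in this README.
if config=$(find_chimps_config "$HOME/.chimps" /etc/chimps/chimps.yaml); then
  echo "Found Chimps configuration: $config"
else
  echo "No Chimps configuration found in ~/.chimps or /etc/chimps/chimps.yaml"
fi
```

This only checks that a file is present; whether the keys inside it are valid is a separate question.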
Try running
$ chimps help
to make sure you can run the chimps command and to see an overview of what subcommands are available. You can get more detailed help as well as example usage on COMMAND by running
$ chimps help COMMAND
You can test and see whether your credentials are valid using the test command:
$ chimps test
Authenticated as user 'Infochimps' for Infochimps Dataset API at http://www.infochimps.com
Authenticated for Infochimps Query API at http://api.infochimps.com
If you get messages about missing keys and so on go back and read the Chimps installation instructions.
If you get messages about not being able to authenticate, double-check that the API keys in your configuration file (either ~/.chimps or /etc/chimps/chimps.yaml) match the credentials listed in your profile.
Commands to chimps accept arguments as well as options. Options always begin with two dashes and some options have single-letter flags as well.
Some options work for every chimps command. --verbose (-v), for example, is a great way to see what underlying HTTP request(s) a given command is making.
Operating on a Dataset, Source, License, &c.
Many requests can operate on a particular resource. The show command, for example, can be used to show a dataset (the default choice), a license, a source, or a user.
You can see what resources COMMAND can operate on with chimps help COMMAND. Two examples
# Will attempt to show the Dataset 'an-example'
Providing Data to a Command
Some commands (typically those that result in HTTP GET and DELETE requests) don't require you to pass any data to Infochimps.
Other commands (typically those that result in HTTP POST and PUT requests) do. These commands usually create or modify a dataset or other resource at Infochimps.
Say you wanted to create a new dataset on Infochimps with the title “List of hottest Salsas” and with description “All salsas were tried personally by me.”
There are two methods you can use to pass this data to a Chimps CLI command:
1) You can put the data you need to pass into a file on disk. Chimps understands YAML and JSON file formats and will automatically parse and serialize them properly when making a request. You could create the following file:
# in salsa_dataset.yml
---
title: "List of hottest Salsas"
description: |-
  All salsas were tried personally by me.
and you can create the dataset with
$ chimps create --data=salsa_dataset.yml
2) You can pass parameters and values directly on the command line. You could create the same dataset as above with
$ chimps create title="List of hottest Salsas" description="All salsas were tried personally by me."
This will only work for a flat collection of parameters and values, as in this example. If you need to pass a nested data structure you should use a file and the --data option above.
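If you would rather not open an editor, the YAML file from method 1 can be written straight from the shell. This is only a sketch: it writes salsa_dataset.yml into the current directory (overwriting any file of that name) and prints it back so you can check it before passing it to the --data option.

```shell
#!/bin/sh
# Write the example dataset description to salsa_dataset.yml.
# The quoted 'EOF' delimiter stops the shell from expanding anything
# inside the here-document, so the YAML is written exactly as typed.
cat > salsa_dataset.yml <<'EOF'
---
title: "List of hottest Salsas"
description: |-
  All salsas were tried personally by me.
EOF

# Show the file so it can be inspected before use.
cat salsa_dataset.yml
```

The two-space indentation under description matters: YAML uses it to mark the text as the body of the block scalar.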
Another example, which makes a query to the Query API and returns demographics on an IP address
$ chimps query web/an/ip_census ip=22.214.171.124
Basic HTTP Verbs
Infochimps' Dataset API is RESTful so it respects the semantics of HTTP verbs. You can use this “lower-level” interface to make simple GET, POST, PUT, and DELETE requests.
Here's how to return information on a Yahoo! Stocks dataset
$ chimps get /datasets/yahoo-stock-search
The default response will be in JSON but you can change the response format by explicitly passing a different one of xml, json, or yaml. This works for (almost) all Dataset API requests.
$ chimps get /datasets/yahoo-stock-search --response_format=yaml
Try running chimps help for the get command (chimps help get) as well as for the post, put, and delete commands.
Signed vs. Unsigned Requests
Some requests, like the GET request above, don't need to be signed in any way: using chimps to make a simple unsigned GET request is no different from doing it with curl.
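To make the comparison with curl concrete, here is a rough sketch of the same unsigned GET. The host is the one reported by chimps test, but joining it directly with the request path is an assumption, and the helper name dataset_api_url is hypothetical. The curl call itself is left commented out so the script has no side effects.

```shell
#!/bin/sh
# Build a full URL from a Dataset API path.
# ASSUMPTION: the API is served from the host that `chimps test` reports
# and the path is appended unchanged.
dataset_api_url() {
  printf 'http://www.infochimps.com%s\n' "$1"
}

url=$(dataset_api_url /datasets/yahoo-stock-search)
echo "$url"

# Unsigned GET with curl (uncomment to actually send the request):
# curl --silent "$url"
```

Running chimps get with --verbose is the reliable way to see the exact URL that Chimps requests.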
All POST, PUT, and DELETE requests, however, need to be signed and using Chimps to do it makes it easy. Here's how you might create a dataset
$ chimps post --sign /datasets title="My dataset" description="Some text..."
If you leave out the --sign option then the request will fail with a 401 Authentication error.
The above request is really just the same as
$ chimps create title="My dataset" description="Some text..."
which is a little simpler because create understands what you're trying to do and internally constructs the appropriate POST request.
You can find a list of all available requests, the correct HTTP verb to use, whether the request needs to be signed, and what parameters it accepts at www.infochimps.com/apis.
To round out this section, here's an example of a PUT request and a DELETE request (both of which must be signed):
# Update your existing dataset
Most things you might want to do with this “low-level” HTTP verb interface can be done with specialized chimps commands. Read on.
Core REST Actions
Since the Infochimps Dataset API is RESTful, it implements list, show, create, update, and destroy actions for all resources. Each of these actions has a corresponding Chimps command.
Here's how to list datasets:
$ chimps list
The list command is one of a few (search being another) that accepts the --my (-m) option. This will restrict the output to only datasets (or whatever resource you're listing) that are owned by you.
$ chimps list --my
$ chimps list --my licenses
Here's how to show a dataset:
$ chimps show my-dataset
This returns YAML by default, but you can specify a different response format by passing the --response_format option:
$ chimps show my-dataset --response_format=json
You've already seen create in action a few times so here's update instead
$ chimps update my-dataset title="A new title"
And of course destroy
$ chimps destroy my-dataset
If you're curious about the underlying HTTP requests being sent, try running these commands with the --verbose (-v) flag.
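The create, show, update, and destroy commands above can be strung together into one small script. This is a sketch under a few assumptions: the run and dataset_lifecycle helpers and the DRY_RUN variable are hypothetical names invented here, and my-dataset is the same placeholder used in the examples above (the identifier Infochimps actually assigns to a new dataset may differ). By default the script only prints the commands; set DRY_RUN=0 to execute them.

```shell
#!/bin/sh
# Print a command in dry-run mode (the default), otherwise execute it.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Walk one dataset through the core REST actions shown in this README.
# ASSUMPTION: the dataset created in the first step can afterwards be
# referred to as "my-dataset".
dataset_lifecycle() {
  run chimps create title="My dataset" description="Some text..."
  run chimps show my-dataset
  run chimps update my-dataset title="A new title"
  run chimps destroy my-dataset
}

dataset_lifecycle
```

Keeping the dry run as the default means nothing is created or destroyed until you have read the printed commands and opted in.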
Chimps CLI has a few special commands which aren't HTTP verbs or core REST actions.
Here's how to search Infochimps for datasets about music:
$ chimps search music
Here's the same search restricted to only datasets you own and pretty-printed:
$ chimps search --my music --pretty
If a dataset on Infochimps has a downloadable package then the download command can be used to download the data:
$ chimps download daily-1970-2010-open-close-hi-low-and-volume-nyse-exchange
The dataset must be free, you must own it, or you must have purchased it (through the website) before you can download it with Chimps.
You may want to include the --verbose (-v) flag so that you can see the progress of the download, especially if it is a large file.
Infochimps does not presently allow you to upload data by using an API. Please create a dataset first (you can do this with Chimps) and then go to that dataset's page in a browser and upload any data you wish.
This feature will be coming very, very soon!
chimps help and chimps help COMMAND should carry you a good ways with the examples and usage they output.
chimps test should confirm that your API keys are properly configured.
Chimps CLI is an open source project created by the Infochimps team to encourage adoption of the Infochimps APIs. The official repository is hosted on GitHub.
Feel free to clone it and send pull requests.