earth

Earth is a collection of data models that represent various things found here on Earth, such as countries, automobiles, aircraft, zip codes, and pet breeds.

By default the data that these models represent is pulled from Brighter Planet's open reference data site using the taps gem. The data can also be imported directly from preconfigured authoritative sources.

Usage

require 'earth'
require 'earth/automobile/automobile_fuel'

Earth.init
ft = AutomobileFuel.first
# ...

Earth.init prepares the environment to load and download data for each data model. You can load all data models at once with Earth.init :all. There are several other options to init that configure data mining sources and database connections. See the rdocs for more details on the Earth module.

Data model categories

Category Models
:air Aircraft, Airline, Airport ...
:automobile AutomobileFuel, AutomobileMake, AutomobileModel ...
:bus BusClass, BusFuel ...
:computation ComputationCarrier, ComputationCarrierInstanceClass ...
:diet DietClass, FoodGroup ...
:fuel Fuel, FuelPrice, GreenhouseGas ...
:hospitality LodgingClass, CommercialBuildingEnergyConsumptionSurveyResponse ...
:industry Industry, CbecsEnergyIntensity ...
:locality CensusDivision, Country, ZipCode ...
:pet Breed, Gender, Species ...
:rail RailClass, RailFuel, RailCompany ...
:residence Urbanity, ResidenceClass, AirConditionerUse
:shipping Carrier, ShipmentMode ...

Data storage

You can store Earth data in any relational database. On your very first run, you will need to create the tables for data each model. You can either use the Rails standard rake tasks (see below) or with a call to Earth.reset_schemas!

Pulling data from data.brighterplanet.com

By default, Earth will pull data from data.brighterplanet.com, which continuously (and transparently) refreshes its data from authoritative sources. Simply call #run_data_miner! on whichever data model class you need. If there are any Earth classes that the chosen class depends on, they will be downloaded as well automatically:

require 'earth'
require 'earth/locality/zip_code'

Earth.init
ZipCode.run_data_miner!

Pulling data from the original sources

If you'd like to bypass the data.brighterplanet.com proxy and pull data directly from authoritative sources (e.g., automobile data from EPA), simply specify the :mine_original_sources option to Earth.init

require 'earth'
Earth.init :mine_original_sources => true

require 'earth/automobile'
AutomobileMake.run_data_miner!

Rake tasks

Earth provides handy rails tasks for creating, migrating, and data mining models whether you're using it from a Rails app or a standalone Ruby app.

In your Rakefile, add:

require 'earth/tasks'
Earth::Tasks.new

If you're using Earth outside of Rails, all of the default rake db:* tasks will now be available. Within rails, certain tasks are augmented to help manage your Earth models using data_miner and active_record_inline_schema in addition to standard migrations.

Of note are the following tasks:

  • rake db:migrate runs .create_table! on each Earth resource model.
  • rake db:seed runs .run_data_miner! on each Earth resource model.

Collaboration cycle

Brighter Planet vigorously encourages collaborative improvement.

You

  1. Fork the earth repository on GitHub.
  2. Write a test proving the existing implementation's inadequacy. Ensure that the test fails. Commit the test.
  3. Improve the code until your new test passes and commit your changes.
  4. Push your changes to your GitHub fork.
  5. Submit a pull request to brighterplanet.

Brighter Planet

  1. Receive a pull request.
  2. Pull changes from forked repository.
  3. Ensure tests pass.
  4. Review changes for scientific accuracy.
  5. Merge changes to master repository and publish.
  6. Direct production environment to use new library version.