sequel_pg

sequel_pg overwrites the inner loop of the Sequel postgres adapter row fetching code with a C version. The C version is significantly faster (2-6x) than the pure ruby version that Sequel uses by default.

Real world difference

The speed up that sequel_pg gives you depends on what you are selecting, but it should be noticable whenever many rows are selected. Here’s an example that shows the difference it makes on a couple of models:

$ irb -r model -r benchmark 
irb(main):001:0> Track.count
=> 140854
irb(main):002:0> Album.count
=> 5579
irb(main):003:0> puts Benchmark.measure{Track.each{}}
 10.740000   0.190000  10.930000 ( 11.875343)
=> nil
irb(main):004:0> puts Benchmark.measure{10.times{Album.each{}}}
  7.920000   0.070000   7.990000 (  8.482130)
=> nil
irb(main):005:0> require '/data/code/sequel_pg/ext/sequel_pg/sequel_pg'
=> true
irb(main):006:0> puts Benchmark.measure{Track.each{}}
  2.360000   0.400000   2.760000 (  3.723098)
=> nil
irb(main):007:0> puts Benchmark.measure{10.times{Album.each{}}}
  1.300000   0.190000   1.490000 (  2.001393)
=> nil

Here’s an example that uses a modified version of swift’s benchmarks (github.com/shanna/swift/tree/master/benchmarks/):

benchmark         sys       user      total     real     rss
sequel #select    0.090000  2.020000  2.110000  2.246688 46.54m
sequel_pg #select 0.000000  0.250000  0.250000  0.361999  7.33m

sequel_pg also has code to speed up the map, to_hash, to_hash_groups, select_hash, select_hash_groups, select_map, and select_order_map Dataset methods, which is on by default. It also has code to speed up the loading of model objects, which is off by default as it isn’t fully compatible. It doesn’t handle overriding Model.call, Model#set_values, or Model#after_initialize, which may cause problems with the following plugins that ship with Sequel:

  • class_table_inheritance

  • force_encoding

  • serializiation

  • single_table_interitance

  • typecast_on_load

  • update_primary_key

If you want to extract that last ounce of performance when loading model objects and you can live with the limitations, you can enable the model optimization via:

# All datasets
DB.optimize_model_load = true

# Specific dataset
Artist.dataset.optimize_model_load = true

Streaming

If you are using PostgreSQL 9.2beta3 or higher on the client, then sequel_pg should enable streaming support. This allows you to stream returned rows one at a time, instead of collecting the entire result set in memory (which is how PostgreSQL works by default). You can check if streaming is supported by:

Sequel::Postgres.supports_streaming?

If streaming is supported, you can load the streaming support into the database:

DB.extension(:pg_streaming)

Then you can call the Dataset#stream method to have the dataset use the streaming support:

DB[:table].stream.each{|row| ...}

If you want to enable streaming for all of a database’s datasets, you can do the following:

DB.stream_all_queries = true

Note that pg 0.14.1+ is required for streaming to work. This is not required by the gem, as it is only a requirement for streaming, not for general use.

Installing the gem

gem install sequel_pg

The standard gem requires compiling from source, so you need a working compiler toolchain. Since few Windows users have a working compiler toolchain, a windows binary gem is available that works on both 1.8 and 1.9.

Running the specs

sequel_pg doesn’t ship with it’s own specs. It’s designed to replace a part of Sequel, so it just uses Sequel’s specs. Specifically, the spec_postgres rake task from Sequel.

Reporting issues/bugs

sequel_pg uses GitHub Issues for tracking issues/bugs:

http://github.com/jeremyevans/sequel_pg/issues

Contributing

The source code is on GitHub:

http://github.com/jeremyevans/sequel_pg

To get a copy:

git clone git://github.com/jeremyevans/sequel_pg.git

There are only a few requirements, which you should probably have before considering use of the library:

  • Rake

  • Sequel

  • pg

  • libpq headers and library

Building

To build the library from a git checkout, after installing the requirements:

rake build

Platforms Supported

sequel_pg has been tested on the following:

Operating Systems/Platforms

  • Linux (i386)

  • OpenBSD (amd64, i386)

  • Windows XP (i386)

Compiler Versions

  • gcc (3.3.5, 4.2.1, 4.4.3)

Supported Ruby Versions

  • ruby 1.8.7

  • ruby 1.9.2

  • ruby 1.9.3

If your platform, compiler version, or ruby version is not listed above, please test and send me a report including:

* Your operating system and platform (e.g. i386, x86_64/amd64)
* Your compiler
* Your ruby version

Known Issues

  • You must be using the ISO PostgreSQL date format (which is the default). Using the SQL, POSTGRESQL, or GERMAN date formats will result in incorrect date/timestamp handling. In addition to PostgreSQL defaulting to ISO, Sequel also manually sets the date format to ISO by default, so unless you are overriding that setting (via Sequel::Postgres.use_iso_date_format = false), you should be OK.

  • Adding your own type conversion procs only has an effect if those types are not handled by default.

  • The named_timezones plugin does not integrate with sequel_pg, since sequel_pg does it’s own timestamp conversions. The :local and :utc settings for database_timestamp and application_timestamp do work, as does setting the datetime_class to DateTime.

  • You do not need to require the library, the sequel postgres adapter will require it automatically. If you are using bundler, you should add it to your Gemfile like so:

    gem 'sequel_pg', :require=>'sequel'
    

Author

Jeremy Evans <[email protected]>