sequel_pg

sequel_pg overwrites the inner loop of the Sequel postgres adapter row fetching code with a C version. The C version is significantly faster (2-6x) than the pure ruby version that Sequel uses by default.

Real world difference

The speed up that sequel_pg gives you depends on what you are selecting, but it should be noticable whenever many rows are selected. Here’s an example that shows the difference it makes on a couple of models:

$ irb -r model -r benchmark 
irb(main):001:0> Track.count
=> 140854
irb(main):002:0> Album.count
=> 5579
irb(main):003:0> puts Benchmark.measure{Track.each{}}
 10.740000   0.190000  10.930000 ( 11.875343)
=> nil
irb(main):004:0> puts Benchmark.measure{10.times{Album.each{}}}
  7.920000   0.070000   7.990000 (  8.482130)
=> nil
irb(main):005:0> require '/data/code/sequel_pg/ext/sequel_pg/sequel_pg'
=> true
irb(main):006:0> puts Benchmark.measure{Track.each{}}
  2.360000   0.400000   2.760000 (  3.723098)
=> nil
irb(main):007:0> puts Benchmark.measure{10.times{Album.each{}}}
  1.300000   0.190000   1.490000 (  2.001393)
=> nil

Here’s an example that uses a modified version of swift’s benchmarks (github.com/shanna/swift/tree/master/benchmarks/):

benchmark         sys       user      total     real     rss
sequel #select    0.090000  2.020000  2.110000  2.246688 46.54m
sequel_pg #select 0.000000  0.250000  0.250000  0.361999  7.33m

sequel_pg also has code to speed up the map, to_hash, to_hash_groups, select_hash, select_hash_groups, select_map, and select_order_map Dataset methods, which is on by default. It also has code to speed up the loading of model objects, which is off by default as it isn’t fully compatible. It doesn’t handle overriding Model.call, Model#set_values, or Model#after_initialize, which may cause problems with the following plugins that ship with Sequel:

  • class_table_inheritance

  • force_encoding

  • serializiation

  • single_table_interitance

  • typecast_on_load

  • update_primary_key

If you want to extract that last ounce of performance when loading model objects and you can live with the limitations, you can enable the model optimization via:

# All datasets
DB.optimize_model_load = true

# Specific dataset
Artist.dataset.optimize_model_load = true

Streaming

If you are using PostgreSQL 9.2beta3 or higher on the client, then sequel_pg should enable streaming support. This allows you to stream returned rows one at a time, instead of collecting the entire result set in memory (which is how PostgreSQL works by default). You can check if streaming is supported by:

Sequel::Postgres.supports_streaming?

If streaming is supported, you can load the streaming support into the database:

DB.extension(:pg_streaming)

Then you can call the Dataset#stream method to have the dataset use the streaming support:

DB[:table].stream.each{|row| ...}

If you want to enable streaming for all of a database’s datasets, you can do the following:

DB.stream_all_queries = true

Note that pg 0.14.1+ is required for streaming to work. This is not required by the gem, as it is only a requirement for streaming, not for general use.

Installing the gem

gem install sequel_pg

Note that by default sequel_pg only supports result sets with up to 256 columns. If you will have a result set with more than 256 columns, you should modify the maximum supported number of columns via:

gem install sequel_pg -- --with-cflags=\"-DSPG_MAX_FIELDS=512\"

Make sure the pg_config binary is in your PATH so the installation can find the PostgreSQL shared library and header files. Alternatively, you can use the POSTGRES_LIB and POSTGRES_INCLUDE environment variables to specify the shared library and header directories.

While previous versions of this gem supported Windows, the current version does not, due to the need to call C functions defined in the pg gem.

Running the specs

sequel_pg doesn’t ship with it’s own specs. It’s designed to replace a part of Sequel, so it just uses Sequel’s specs. Specifically, the spec_postgres rake task from Sequel.

Reporting issues/bugs

sequel_pg uses GitHub Issues for tracking issues/bugs:

http://github.com/jeremyevans/sequel_pg/issues

Contributing

The source code is on GitHub:

http://github.com/jeremyevans/sequel_pg

To get a copy:

git clone git://github.com/jeremyevans/sequel_pg.git

There are only a few requirements, which you should probably have before considering use of the library:

  • Rake

  • Sequel

  • pg

  • libpq headers and library

Building

To build the library from a git checkout, after installing the requirements:

rake build

Platforms Supported

sequel_pg has been tested on the following:

  • ruby 1.8.7

  • ruby 1.9.3

  • ruby 2.0.0

  • ruby 2.1.4

  • rbx 2.2.9

Known Issues

  • You must be using the ISO PostgreSQL date format (which is the default). Using the SQL, POSTGRESQL, or GERMAN date formats will result in incorrect date/timestamp handling. In addition to PostgreSQL defaulting to ISO, Sequel also manually sets the date format to ISO by default, so unless you are overriding that setting (via Sequel::Postgres.use_iso_date_format = false), you should be OK.

  • Adding your own type conversion procs only has an effect if those types are not handled by default.

  • You do not need to require the library, the sequel postgres adapter will require it automatically. If you are using bundler, you should add it to your Gemfile like so:

    gem 'sequel_pg', :require=>'sequel'
    
  • sequel_pg currently calls functions defined in the pg gem, which does not work on Windows and does not work in some unix-like operating systems that disallow undefined functions in shared libraries. If RbConfig::CONFIG['LDFLAGS'] contains -Wl,--no-undefined, you’ll probably have issues installing sequel_pg. You should probably fix RbConfig::CONFIG['LDFLAGS'] in that case.

Author

Jeremy Evans <[email protected]>