serializable_proc

As the name suggests, SerializableProc is a proc that can be serialized (marshalled). A proc is a closure, which consists of the code block defining it and the binding of the variables it references. SerializableProc achieves serializability by extracting:

  1. the code from the proc (using ParseTree or RubyParser), and

  2. the local, instance, class & global variables referenced within the proc from the proc’s binding, using deep copy via Marshal.load(Marshal.dump(var))
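The deep-copy step in item 2 can be tried with plain Ruby and no gem installed. The following is only a sketch of the Marshal round trip itself, not of SerializableProc's internals; the variable names are made up for illustration:

```ruby
# A value as it might be captured from a proc's binding (hypothetical data).
x = { :name => 'lx', :tags => ['a', 'b'] }

deep    = Marshal.load(Marshal.dump(x)) # deep copy via a Marshal round trip
shallow = x.dup                         # shallow copy, for contrast

# Mutate a nested object of the original after both copies were taken.
x[:tags] << 'c'

deep_isolated  = (deep[:tags] == ['a', 'b'])         # nested array was copied
shallow_leaked = (shallow[:tags] == ['a', 'b', 'c']) # nested array is shared

puts deep_isolated  # prints "true"
puts shallow_leaked # prints "true"
```

The deep copy keeps its own nested objects, which is what makes the snapshot effect described below possible.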

A SerializableProc differs from the vanilla Proc in the following 2 ways:

1. Isolated variables

By default, upon initialization, all variables (local, instance, class & global) referenced within the proc are extracted from the proc’s binding and isolated from changes outside the proc’s scope, thus achieving a snapshot effect.

require 'rubygems'
require 'serializable_proc'

x, @x, @@x, $x = 'lx', 'ix', 'cx', 'gx'

s_proc = SerializableProc.new { [x, @x, @@x, $x].join(', ') }
v_proc = Proc.new { [x, @x, @@x, $x].join(', ') }

x, @x, @@x, $x = 'ly', 'iy', 'cy', 'gy'

s_proc.call # >> "lx, ix, cx, gx"
v_proc.call # >> "ly, iy, cy, gy"

Sometimes we may want global variables to behave as truly global, meaning we don’t want to isolate globals at all. This can be done by declaring @@_not_isolated_vars within the code block:

s_proc = SerializableProc.new do
  @@_not_isolated_vars = :global # globals won't be isolated
  $stdout << "WakeUp !!"         # $stdout is the $stdout in the execution context
end

Supported values are :global, :class, :instance, :local & :all, with :all overriding all others. The following declares all variables as not isolated:

s_proc = SerializableProc.new do
  @@_not_isolated_vars = :all
  ...
end

When invoking, Kernel.binding should be passed in to avoid unpleasant surprises:

s_proc.call(binding)

(take a look at SerializableProc’s rdoc for more details)

2. Marshallable

Marshalling a SerializableProc does not raise TypeError:

Marshal.load(Marshal.dump(s_proc)).call # >> "lx, ix, cx, gx"
Marshal.load(Marshal.dump(v_proc)).call # >> TypeError (cannot dump Proc)
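The vanilla half of this comparison can be checked without the gem. The sketch below only shows that Marshal refuses a plain Proc, while a code string plus a hash of variables (the two pieces SerializableProc is described as extracting) round-trips fine. The `parts` hash is a hypothetical stand-in, not the library's actual storage format:

```ruby
v_proc = Proc.new { 'hello' }

# Marshal cannot dump a vanilla Proc.
error = nil
begin
  Marshal.dump(v_proc)
rescue TypeError => e
  error = e
end
puts error.class # prints "TypeError"

# Plain data (code as a string, variables as a hash) dumps without trouble.
parts    = { :code => 'proc { x.upcase }', :vars => { :x => 'lx' } }
restored = Marshal.load(Marshal.dump(parts))
puts restored == parts # prints "true"
```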

Installing It

The religiously standard way:

$ gem install ParseTree serializable_proc

Or on 1.9.* or JRuby:

$ gem install ruby_parser serializable_proc

By default, SerializableProc attempts to load ParseTree, which gives better performance & offers much dynamic goodness. If ParseTree cannot be found, SerializableProc falls back to the RubyParser-based implementation, which suffers from some gotchas due to its static-analysis nature (see the ‘Gotchas’ section).

Performance

SerializableProc relies on ParseTree or RubyParser to do code extraction. When running in ParseTree mode, thanks to the goodness of dynamic code analysis, SerializableProc runs about 6 times faster on the same ruby, as illustrated by the following benchmark results:

MRI & implementation    user      system    total      real
1.8.7p299 (ParseTree)   0.000000  0.000000  3.510000   3.660623
1.8.7p299 (RubyParser)  0.000000  0.000000  20.780000  21.328566
1.9.1p376 (RubyParser)  0.010000  0.000000  16.990000  17.370586

Note:

  • the above were obtained from running the spec suite of 393 specifications with 1330 requirements

  • hardware & OS specs: x86_64 Intel® Core(TM)2 Duo CPU P8600 @ 2.40GHz
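The "about 6 times" figure can be re-derived from the table. This is just arithmetic on the two 1.8.7p299 rows, copied from the results above:

```ruby
# Timings for 1.8.7p299, taken from the benchmark table.
parse_tree  = { :total => 3.51,  :real => 3.660623 }
ruby_parser = { :total => 20.78, :real => 21.328566 }

total_ratio = ruby_parser[:total] / parse_tree[:total]
real_ratio  = ruby_parser[:real]  / parse_tree[:real]

puts format('total: %.2fx, real: %.2fx', total_ratio, real_ratio)
```

Both ratios come out a little under 6, consistent with the claim.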

Gotchas

As RubyParser does only static code analysis, quite a bit of regexp matching is needed to get SerializableProc to work in RubyParser mode. However, as our regexp kungfu is not perfect (yet), please take note of the following:

1. Cannot have multiple initializing code blocks per line

The following initializations raise SerializableProc::CannotAnalyseCodeError:

# Multiple SerializableProc.new per line
SerializableProc.new { ... } ; SerializableProc.new { ... }

# Multiple lambda per line (the same applies to proc & Proc.new)
x_proc = lambda { ... } ; y_proc = lambda { ... }
SerializableProc.new(&x_proc)

# Mixed lambda, proc & Proc.new per line
x_proc = proc { ... } ; y_proc = lambda { ... }
SerializableProc.new(&x_proc)
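A likely reason for this restriction, assuming the static parser can locate a proc only by its file and line number (the JRuby note under ‘Supported Rubies’ suggests the line number is read from Proc#inspect), is that two procs created on one line are indistinguishable by that information alone. The sketch below uses Proc#source_location, available on Ruby 1.9 and later, purely to illustrate the ambiguity; it is not how the library itself looks procs up:

```ruby
# Two lambdas created on the same line, as in the failing cases above.
x_proc = lambda { :x } ; y_proc = lambda { :y }

# source_location starts with the file and the line; compare just those two.
x_file, x_line = x_proc.source_location
y_file, y_line = y_proc.source_location

same_origin = (x_file == y_file) && (x_line == y_line)
puts same_origin # prints "true"
```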

2. Limited ways to initialize code blocks

Code blocks must be initialized with lambda, proc, Proc.new or SerializableProc.new. The following will raise SerializableProc::CannotAnalyseCodeError:

def create_magic_proc(&block)
  SerializableProc.new(&block)
end

create_magic_proc { ... }

But the following will work as expected:

x_proc = lambda { ... }
create_magic_proc(&x_proc)

There are several strategies to work around this limitation:

2.1. Subclassing SerializableProc

Any subclass of SerializableProc shows traits of a SerializableProc:

class MagicProc < SerializableProc ; end
m_proc = MagicProc.new { ... } # m_proc walks & quacks like a SerializableProc

2.2. Adding custom matcher(s)

To support more match cases, we can declare new matchers:

def create_magic_proc(&block)
  s_proc = SerializableProc.new(&block)
  ...
end

# The matcher must name the method that receives the block
SerializableProc::Parsers::Static.matchers << 'create_magic_proc'
create_magic_proc { ... }

Or if the method above takes arguments:

def create_magic_proc(*args, &block)
  s_proc = SerializableProc.new(&block)
  ...
end

SerializableProc::Parsers::Static.matchers << 'create_magic_proc\W+.*?\W+'
create_magic_proc(1, :a => 2, :b => 3) { ... }
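To see why the second matcher needs the extra `\W+.*?\W+` part, the two matcher strings can be tried against sample source lines. The helper below is hypothetical: it assumes a matcher is interpolated into a regexp that then expects a block opener (‘{’ or ‘do’), which may differ from what SerializableProc::Parsers::Static actually does:

```ruby
# Hypothetical helper, NOT the library's code: treats the matcher string as
# regexp source and requires a block opener ('{' or 'do') right after it.
def opens_block?(matcher, line)
  !!(line =~ /#{matcher}\s*(\{|do\b)/)
end

plain     = 'create_magic_proc'
with_args = 'create_magic_proc\W+.*?\W+'

call_without_args = 'create_magic_proc { ... }'
call_with_args    = 'create_magic_proc(1, :a => 2, :b => 3) { ... }'

puts opens_block?(plain, call_without_args)  # prints "true"
puts opens_block?(plain, call_with_args)     # prints "false"
puts opens_block?(with_args, call_with_args) # prints "true"
```

Under this assumption, the plain matcher stops at the opening parenthesis, while the longer one skips over the argument list before reaching the block.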

3. One liner for …

This is embarrassing, but being flexible can aggravate performance even more. Currently, the declaration (e.g. lambda, proc, SerializableProc.new, Proc.new, subclasses of SerializableProc & any user-defined matcher(s)) and the start of the code block (the ‘do’ & ‘{’ tokens) must be on the same line. This means the following won’t work:

SerializableProc.new \
  do
    ...
  end

create_magic_proc(
  1, :a => 2
) { ... }

Supported Rubies

SerializableProc has been tested to work on the following rubies:

  1. MRI 1.8.6, 1.8.7 & 1.9.1

  2. JRuby (partial: the more conservative usages work, but quite a bit of the specs fail due to JRuby’s bug in dumping a proc’s line number when we do Proc#inspect)

TODO (just brain-dumping)

  1. The RubyParser-based implementation probably needs a lot more optimization to catch up with the ParseTree-based one

  2. Implement alternative means (if possible) of extracting the code block without requiring the help of ParseTree or RubyParser

  3. Implement a workaround to tackle the line-numbering bug in JRuby, which causes the RubyParser-based implementation to fail. For more info about JRuby’s line-numbering bug, see stackoverflow.com/questions/3454838/jruby-line-numbering-problem & jira.codehaus.org/browse/JRUBY-5014

Note on Patches/Pull Requests

  • Fork the project.

  • Make your feature addition or bug fix.

  • Add tests for it. This is important so I don’t break it in a future version unintentionally.

  • Commit, do not mess with rakefile, version, or history. (if you want to have your own version, that is fine but bump version in a commit by itself I can ignore when I pull)

  • Send me a pull request. Bonus points for topic branches.

Copyright © 2010 NgTzeYang. See LICENSE for details.