serializable_proc
As the name suggests, SerializableProc is a proc that can be serialized (marshalled). A proc is a closure, which consists of the code block defining it and the binding of its local variables. SerializableProc’s approach to serializability is to extract:
-
the code from the proc (using ParseTree or RubyParser), and
-
the local, instance, class & global variables referenced within the proc from the proc’s binding, using deep copy via Marshal.load(Marshal.dump(var))
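The deep copy mentioned in the second step can be tried with nothing but the standard library. The snippet below is a minimal sketch, not SerializableProc’s own code; the `snapshot` helper is a hypothetical name introduced here purely for illustration.

```ruby
# Hypothetical helper (not part of SerializableProc): deep copy a value by
# round-tripping it through Marshal.
def snapshot(value)
  Marshal.load(Marshal.dump(value))
end

original = { :names => ['lx', 'ix'] }
copy     = snapshot(original)

# Mutating the original afterwards leaves the copy untouched.
original[:names] << 'cx'

copy[:names]     # >> ["lx", "ix"]
original[:names] # >> ["lx", "ix", "cx"]
```

Because the copy shares no objects with the original, later changes to the original cannot leak into it.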
A SerializableProc differs from the vanilla Proc in the following 2 ways:
1. Isolated variables
By default, upon initialization, all variables (local, instance, class & global) referenced within the proc are extracted from the proc’s binding and isolated from changes outside the proc’s scope, achieving a snapshot effect.
require 'rubygems'
require 'serializable_proc'
x, @x, @@x, $x = 'lx', 'ix', 'cx', 'gx'
s_proc = SerializableProc.new { [x, @x, @@x, $x].join(', ') }
v_proc = Proc.new { [x, @x, @@x, $x].join(', ') }
x, @x, @@x, $x = 'ly', 'iy', 'cy', 'gy'
s_proc.call # >> "lx, ix, cx, gx"
v_proc.call # >> "ly, iy, cy, gy"
Sometimes we may want global variables to behave as truly global, meaning we don’t want to isolate globals at all. This can be done by declaring @@_not_isolated_vars within the code block:
s_proc = SerializableProc.new do
  @@_not_isolated_vars = :global # globals won't be isolated
  $stdout << "WakeUp !!" # $stdout is the $stdout in the execution context
end
Supported values are :global, :class, :instance, :local & :all, with :all overriding all others. The following declares that no variables are to be isolated:
s_proc = SerializableProc.new do
  @@_not_isolated_vars = :all
  ...
end
When invoking, Kernel.binding should be passed in to avoid unpleasant surprises:
s_proc.call(binding)
(take a look at SerializableProc’s rdoc for more details)
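For readers unfamiliar with bindings, the following stdlib-only sketch shows what a Binding object exposes. It does not reproduce how SerializableProc uses the binding passed to `call`; it only illustrates that a binding gives live access to the variables of the scope it was captured in. The variable names are made up for this example.

```ruby
greeting = 'hello'
context  = binding                   # captures the scope where it is called

before = eval('greeting', context)   # >> "hello"

# The binding is live: a later reassignment is visible through it.
greeting = 'bye'
after = eval('greeting', context)    # >> "bye"
```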
2. Marshallable
Marshalling a SerializableProc does not raise TypeError:
Marshal.load(Marshal.dump(s_proc)).call # >> "lx, ix, cx, gx"
Marshal.load(Marshal.dump(v_proc)).call # >> TypeError (cannot dump Proc)
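The failure for a vanilla Proc can be reproduced without the gem. A minimal sketch using only the standard library (the `outcome` variable is just for illustration):

```ruby
v_proc = Proc.new { 'lx' }

begin
  Marshal.dump(v_proc)
  outcome = :dumped
rescue TypeError
  # A vanilla Proc has no marshallable representation.
  outcome = :type_error
end

outcome # >> :type_error
```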
Installing It
The religiously standard way:
$ gem install ParseTree serializable_proc
Or on 1.9.* or JRuby:
$ gem install ruby_parser serializable_proc
By default, SerializableProc attempts to load ParseTree, which offers better performance & plenty of dynamic goodness. If ParseTree cannot be found, SerializableProc falls back to the RubyParser-based implementation, which suffers some gotchas due to its static analysis nature (see the ‘Gotchas’ section).
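The general "try one library, fall back to another" pattern can be sketched as below. This is not SerializableProc’s actual loading code; `first_loadable` is a hypothetical helper, and the feature names `parse_tree` and `ruby_parser` are assumed to be what each gem expects in a `require`.

```ruby
# Hypothetical helper, for illustration only: require the first library in
# +candidates+ that can be loaded and return its name, or nil if none load.
def first_loadable(candidates)
  candidates.find do |name|
    begin
      require name
      true
    rescue LoadError
      false
    end
  end
end

# Prefer ParseTree, fall back to RubyParser (feature names are assumptions).
parser_lib = first_loadable(%w[parse_tree ruby_parser])
```

`parser_lib` ends up as the name of whichever library loaded first, or nil when neither is installed.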
Performance
SerializableProc relies on ParseTree or RubyParser to do code extraction. When running in ParseTree mode, thanks to the goodness of dynamic code analysis, SerializableProc performs about 6 times faster on the same ruby, as illustrated by the following benchmark results:
MRI & implementation           user     system      total       real
1.8.7p299 (ParseTree)      0.000000   0.000000   3.510000   3.660623
1.8.7p299 (RubyParser)     0.000000   0.000000  20.780000  21.328566
1.9.1p376 (RubyParser)     0.010000   0.000000  16.990000  17.370586
Note:
-
the above were obtained by running the spec suite of 393 specifications with 1330 requirements
-
hardware & OS specs: x86_64 Intel® Core(TM)2 Duo CPU P8600 @ 2.40GHz
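The "about 6 times" figure follows from the "total" column for MRI 1.8.7p299. A quick check of the arithmetic, using only the numbers reported above:

```ruby
# "total" column (CPU seconds) for MRI 1.8.7p299, taken from the results above.
parse_tree_total  = 3.51
ruby_parser_total = 20.78

speedup = ruby_parser_total / parse_tree_total
format('%.1fx', speedup) # >> "5.9x"
```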
Gotchas
As RubyParser does only static code analysis, quite a bit of regexp matching is needed to get SerializableProc to work in RubyParser mode. However, as our regexp kungfu is not perfect (yet), please take note of the following:
1. Cannot have multiple initializing code blocks per line
The following initializations throw SerializableProc::CannotAnalyseCodeError:
# Multiple SerializableProc.new per line
SerializableProc.new { ... } ; SerializableProc.new { ... }
# Multiple lambda per line (the same applies to proc & Proc.new)
x_proc = lambda { ... } ; y_proc = lambda { ... }
SerializableProc.new(&x_proc)
# Mixed lambda, proc & Proc.new per line
x_proc = proc { ... } ; y_proc = lambda { ... }
SerializableProc.new(&x_proc)
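To see why such lines are ambiguous for a static, regexp-based analysis, consider the sketch below. It presumes the parser has to locate a proc’s source from the line it was created on; the pattern and the `initializers_on` helper are illustrative assumptions, not SerializableProc’s real matchers.

```ruby
# Count the proc initializers that start on a given source line.
# The pattern is illustrative only; the gem's real matchers may differ.
def initializers_on(line)
  pattern = /\b(?:SerializableProc\.new|Proc\.new|lambda|proc)\s*(?:do\b|\{)/
  line.scan(pattern).size
end

initializers_on('x_proc = lambda { 1 }')                       # >> 1
initializers_on('x_proc = proc { 1 } ; y_proc = lambda { 2 }') # >> 2
```

With more than one hit on a line, a line number alone cannot tell which block belongs to the proc being analysed.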
2. Limited ways to initialize code blocks
Code blocks must be initialized with lambda, proc, Proc.new or SerializableProc.new. The following will throw SerializableProc::CannotAnalyseCodeError:
def create_magic_proc(&block)
  SerializableProc.new(&block)
end
create_magic_proc { ... }
But the following will work as expected:
x_proc = lambda { ... }
create_magic_proc(&x_proc)
There are several strategies to work around this limitation:
2.1. Subclassing SerializableProc
Any subclass of SerializableProc shows traits of a SerializableProc:
class MagicProc < SerializableProc ; end
m_proc = MagicProc.new { ... } # m_proc walks & quacks like a SerializableProc
2.2. Adding custom matcher(s)
To support more match cases, we can declare new matchers:
def create_magic_proc(&block)
  s_proc = SerializableProc.new(&block)
  ...
end
SerializableProc::Parsers::Static.matchers << 'create_magic_proc'
create_magic_proc { ... }
Or if the method above takes arguments:
def create_magic_proc(*args, &block)
  s_proc = SerializableProc.new(&block)
  ...
end
SerializableProc::Parsers::Static.matchers << 'create_magic_proc\W+.*?\W+'
create_magic_proc(1, :a => 2, :b => 3) { ... }
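The difference between the two matcher strings can be checked with plain Regexp. The sketch below assumes a matcher is followed by a block opener when compiled; how SerializableProc actually combines matchers internally is not shown in this document, so `opens_block?` is a hypothetical helper for illustration only.

```ruby
# Hypothetical composition: append a block opener to a matcher string and
# test it against a source line. The gem may combine them differently.
def opens_block?(matcher, line)
  !!(Regexp.new(matcher + '\s*(?:do\b|\{)') =~ line)
end

line = 'create_magic_proc(1, :a => 2, :b => 3) { 42 }'

opens_block?('create_magic_proc', line)                       # >> false
opens_block?('create_magic_proc\W+.*?\W+', line)              # >> true
opens_block?('create_magic_proc', 'create_magic_proc { 42 }') # >> true
```

The bare method name is enough when the block follows immediately, while the extended pattern also skips over an argument list.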
3. One liner for …
This is embarrassing, but being flexible can aggravate performance even more. Currently, the declaration (eg. lambda, proc, SerializableProc.new, Proc.new, subclasses of SerializableProc, & any user-defined matcher(s)) & the start of the code block (the ‘do’ or ‘{’) must be on the same line, meaning the following won’t work:
SerializableProc.new \
  do
    ...
  end
create_magic_proc(
  1, :a => 2
) { ... }
Supported Rubies
SerializableProc has been tested to work on the following rubies:
-
MRI 1.8.6, 1.8.7 & 1.9.1
-
JRuby (partial: the more conservative usages work, but quite a bit of the specs are failing due to a JRuby bug in dumping a proc’s line number when we do Proc#inspect)
TODO (just brain-dumping)
-
The RubyParser-based implementation probably needs a lot more optimization to catch up with the ParseTree-based one
-
Implementing alternative means (if possible) of extracting the code block without requiring help of ParseTree or RubyParser
-
Implement a workaround to tackle the line-numbering bug in JRuby, which causes the RubyParser-based implementation to fail. For more info about JRuby’s line-numbering bug, see stackoverflow.com/questions/3454838/jruby-line-numbering-problem & jira.codehaus.org/browse/JRUBY-5014
Note on Patches/Pull Requests
-
Fork the project.
-
Make your feature addition or bug fix.
-
Add tests for it. This is important so I don’t break it in a future version unintentionally.
-
Commit, do not mess with rakefile, version, or history. (if you want to have your own version, that is fine but bump version in a commit by itself I can ignore when I pull)
-
Send me a pull request. Bonus points for topic branches.
Copyright
Copyright © 2010 NgTzeYang. See LICENSE for details.