Module: Wukong
- Defined in:
- lib/wukong.rb,
lib/wukong/store.rb,
lib/wukong/logger.rb,
lib/wukong/schema.rb,
lib/wukong/script.rb,
lib/wukong/encoding.rb,
lib/wukong/streamer.rb,
lib/wukong/datatypes.rb,
lib/wukong/decorator.rb,
lib/wukong/store/base.rb,
lib/wukong/streamer/base.rb,
lib/wukong/datatypes/enum.rb,
lib/wukong/store/cassandra.rb,
lib/wukong/streamer/filter.rb,
lib/wukong/filename_pattern.rb,
lib/wukong/streamer/reducer.rb,
lib/wukong/script/emr_command.rb,
lib/wukong/datatypes/fake_types.rb,
lib/wukong/extensions/hash_like.rb,
lib/wukong/script/local_command.rb,
lib/wukong/streamer/set_reducer.rb,
lib/wukong/script/hadoop_command.rb,
lib/wukong/store/cassandra_model.rb,
lib/wukong/store/flat_file_store.rb,
lib/wukong/streamer/list_reducer.rb,
lib/wukong/streamer/line_streamer.rb,
lib/wukong/streamer/record_streamer.rb,
lib/wukong/streamer/struct_streamer.rb,
lib/wukong/streamer/summing_reducer.rb,
lib/wukong/extensions/hashlike_class.rb,
lib/wukong/streamer/counting_reducer.rb,
lib/wukong/store/chunked_flat_file_store.rb,
lib/wukong/streamer/accumulating_reducer.rb,
lib/wukong/streamer/rank_and_bin_reducer.rb,
lib/wukong/streamer/uniq_by_last_reducer.rb,
lib/wukong/script/cassandra_loader_script.rb,
lib/wukong/store/chh_chunked_flat_file_store.rb
Defined Under Namespace
Modules: Datatypes, EmrCommand, HadoopCommand, HashLike, HashlikeClass, LocalCommand, Schema, Store, Streamer
Classes: CassandraScript, Decorator, FilenamePattern, Script
Constant Summary
- RESOURCE_CLASS_MAP =
{ }
Class Method Summary
- .class_from_resource(rsrc) ⇒ Object
Find the class from its underscored name.
- .decode_str(str, strategy = :xml) ⇒ Object
Decode string from its encode_str representation.
- .encode_components(hsh, *fields) ⇒ Object
Replace each given field in the hash with its encoded value.
- .encode_str(str, strategy = :xml) ⇒ Object
By default (or explicitly with the :xml strategy), convert string to XML-encoded ASCII.
- .html_encoder ⇒ Object
HTMLEntities encoder instance.
- .logger ⇒ Object
Common logger.
- .logger=(logger) ⇒ Object
- .run(mapper, reducer = nil, options = {}) ⇒ Object
Class Method Details
.class_from_resource(rsrc) ⇒ Object
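Find the class from its underscored name. Note that the class name is non-modularized: the name is camelized and resolved as a bare class name, with no module prefix. You can also pre-seed RESOURCE_CLASS_MAP with your own name-to-class entries.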
# File 'lib/wukong/datatypes.rb', line 8

def self.class_from_resource rsrc
  # This method has been profiled, so don't go making it more elegant unless you're doing same.
  klass_name = rsrc.to_s
  return RESOURCE_CLASS_MAP[klass_name] if RESOURCE_CLASS_MAP.include?(klass_name)
  # kill off all but the non-modularized class name and camelize
  klass_name.gsub!(/(?:^|_)(.)/){ $1.upcase }
  begin
    # convert it to class name
    klass = klass_name.constantize
  rescue Exception => e
    warn "Bogus class name '#{klass_name}'? #{e}"
    klass = nil
  end
  RESOURCE_CLASS_MAP[klass_name] = klass
end
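The lookup in the listing above can be approximated in plain Ruby. This is a minimal sketch, not Wukong's code: ResourceLookupSketch, CLASS_MAP and TwitterUser are hypothetical names, and Object.const_get stands in for String#constantize, which is not a core Ruby method. Unlike the listing, the sketch caches under the original underscored key and leaves its argument unmodified.

```ruby
# Sketch only: a gem-free stand-in for the lookup shown above.
module ResourceLookupSketch
  CLASS_MAP = {} # plays the role of RESOURCE_CLASS_MAP

  def self.class_from_resource(rsrc)
    key = rsrc.to_s
    return CLASS_MAP[key] if CLASS_MAP.include?(key)

    # Camelize: "twitter_user" becomes "TwitterUser"
    camelized = key.gsub(/(?:^|_)(.)/) { $1.upcase }
    klass = begin
      Object.const_get(camelized) # assumption: replaces String#constantize
    rescue NameError => e
      warn "Bogus class name '#{camelized}'? #{e}"
      nil
    end
    CLASS_MAP[key] = klass
  end
end

# Hypothetical record class, defined only for this example.
TwitterUser = Struct.new(:id, :screen_name)

ResourceLookupSketch.class_from_resource(:twitter_user)

# Pre-seeding: map a name that camelizing alone would not resolve.
ResourceLookupSketch::CLASS_MAP['legacy_user'] = TwitterUser
ResourceLookupSketch.class_from_resource('legacy_user')
```

Pre-seeding is useful when a resource name does not camelize to the class you want, since a seeded entry is returned before any constant lookup is attempted.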
.decode_str(str, strategy = :xml) ⇒ Object
Decode string from its encode_str representation. The decoded string can include dangerous things such as tabs, newlines, backslashes and cryptofascist propaganda.
# File 'lib/wukong/encoding.rb', line 69

def self.decode_str str, strategy=:xml
  case strategy
  when :xml then self.html_encoder.decode(str)
  when :url then Addressable::URI.unencode_component(str)
  else raise "Don't know how to decode with strategy #{strategy}"
  end
end
.encode_components(hsh, *fields) ⇒ Object
Replace each given field in the hash with its encoded value. The hash is modified in place, and fields whose value is nil or false are left untouched.
# File 'lib/wukong/encoding.rb', line 81

def self.encode_components hsh, *fields
  fields.each do |field|
    hsh[field] = hsh[field].to_s.wukong_encode if hsh[field]
  end
end
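The in-place behavior of the listing above can be illustrated as follows. This is a sketch under stated assumptions: encode_components_sketch is a hypothetical name, and the standard library's CGI.escapeHTML stands in for String#wukong_encode, which the listing calls but this page does not define.

```ruby
require 'cgi'

# Sketch only: CGI.escapeHTML is an assumed stand-in for String#wukong_encode.
def encode_components_sketch(hsh, *fields)
  fields.each do |field|
    # nil or false values are skipped, as in the listing above
    hsh[field] = CGI.escapeHTML(hsh[field].to_s) if hsh[field]
  end
  hsh # returned for convenience; the listing above returns the field list
end

row = { id: 7, name: 'Tom & "Jerry"', bio: nil }
encode_components_sketch(row, :name, :bio)
row[:name] # => "Tom &amp; &quot;Jerry&quot;"
row[:bio]  # => nil
```

Only the fields named in the call are touched, so numeric or already-safe columns such as :id keep their original values and types.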
.encode_str(str, strategy = :xml) ⇒ Object
By default (or explicitly with the :xml strategy), convert string to
- XML-encoded ASCII,
- with a guarantee that the characters " quote, ' apos, \ backslash, carriage-return \r, newline \n and tab \t (as well as all other control characters) are encoded.
- Any XML-encoding in the original text is encoded with no introspection:
encode_str("&lt;a href=\"foo\"&gt;") # => "&amp;lt;a href=&quot;foo&quot;&amp;gt;"
With the :url strategy,
- URL-encode the string
- This is as strict as possible: encodes all but alphanumeric and _ underscore. The resulting string is thus XML- and URL-safe. addressable.rubyforge.org/api/classes/Addressable/URI.html#M000010
Wukong.decode_str(Wukong.encode_str(str)) returns the original str.
If you're seeing bad_encoding errors, try
$KCODE='u' unless "1.9".respond_to?(:encoding)
at the start of your script.
# File 'lib/wukong/encoding.rb', line 48

def self.encode_str str, strategy=:xml
  begin
    case strategy
    when :xml then self.html_encoder.encode(str, :basic, :named, :decimal).gsub(/\\/, '&#x5C;')
    when :url then Addressable::URI.encode_component(str, /[^\w]/)
    else raise "Don't know how to encode with strategy #{strategy}"
    end
  rescue ArgumentError => e
    '!bad_encoding!! ' + str.gsub(/[^\w\s\.\-@#%]+/, '')
  end
end
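The strictness of the :url strategy, and the round trip it allows, can be sketched with core Ruby alone. The code below is an illustration rather than Wukong's implementation, which delegates to Addressable::URI. UrlStrategySketch and its two methods are hypothetical names, and the sketch assumes UTF-8 input.

```ruby
# Sketch only: core-Ruby stand-ins for the :url strategy (assumes UTF-8 input).
module UrlStrategySketch
  # Percent-encode every byte of each character outside A-Z, a-z, 0-9 and _.
  def self.encode(str)
    str.gsub(/[^A-Za-z0-9_]/) do |ch|
      ch.bytes.map { |b| format('%%%02X', b) }.join
    end
  end

  # Turn each run of %XX escapes back into the bytes it stands for.
  def self.decode(str)
    str.gsub(/((?:%\h\h)+)/) do
      [$1.delete('%')].pack('H*').force_encoding(Encoding::UTF_8)
    end
  end
end

original = "tab\there, slash \\ and caf\u00e9"
encoded  = UrlStrategySketch.encode(original)
UrlStrategySketch.decode(encoded) == original # => true
```

Because every character other than letters, digits and underscore becomes a %XX escape, the encoded form contains no tabs, newlines, quotes or angle brackets, which is what makes it safe to embed in tab-separated records as well as in XML or URLs.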
.html_encoder ⇒ Object
HTMLEntities encoder instance
# File 'lib/wukong/encoding.rb', line 60

def self.html_encoder
  @html_encoder ||= HTMLEntities.new
end
.logger ⇒ Object
Common logger
Set your own at any time with
Wukong.logger = YourAwesomeLogger.new(...)
If you have log4r installed you can use
Wukong.logger = Wukong.default_log4r_logger
If Wukong.logger is too much typing for you, use the Log constant
Default format:
I, [2009-07-26T19:58:46-05:00 #12332]: Up to 2000 char message
# File 'lib/wukong/logger.rb', line 15

def self.logger
  return @logger if defined?(@logger)
  require 'logger'
  @logger = Logger.new STDERR
  @logger.instance_eval do
    def dump *args
      debug args.inspect
    end
  end
  @logger
end
.logger=(logger) ⇒ Object
# File 'lib/wukong/logger.rb', line 27

def self.logger= logger
  @logger = logger
end
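The reader and writer pair above follows a common memoize-and-override pattern. A minimal sketch using only the standard library is shown here; LoggerHolderSketch is a hypothetical module standing in for Wukong, and the in-memory buffer exists only so the output can be inspected.

```ruby
require 'logger'
require 'stringio'

# Sketch only: a hypothetical module mirroring the reader/writer pair above.
module LoggerHolderSketch
  # Build a stderr logger on first use, then keep returning the same object.
  def self.logger
    @logger ||= Logger.new($stderr)
  end

  # Replace the shared logger at any time.
  def self.logger=(logger)
    @logger = logger
  end
end

# Swap in a logger that writes to an in-memory buffer.
buffer = StringIO.new
LoggerHolderSketch.logger = Logger.new(buffer)
LoggerHolderSketch.logger.info('Up to 2000 char message')
buffer.string # one formatted log line ending with the message
```

Any object that responds to the usual Logger methods (debug, info, warn and so on) can be assigned this way, which is how a log4r logger or a test double would be plugged in.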