Method: Wukong::Schema::ClassMethods#sql_load_mysql
- Defined in:
- lib/wukong/schema.rb
#sql_load_mysql(filename = nil) ⇒ Object
A mysql snippet to bulk load the tab-separated-values file emitted by a Wukong script.
Let’s say your class is ClickLog; its resource_name is “click_log” and thus its table_name is ‘click_logs’. sql_load_mysql will:
-
disable indexing on the table
-
import the file, replacing any existing rows. (Replacement is governed by primary key and unique index constraints – see the mysql docs).
-
re-enable indexing on that table
-
show the number of
The load portion will
-
Load into a table named click_logs
-
from a file named click_logs.tsv
-
where all rows have the string ‘click_logs’ in their first column
-
and all remaining fields in their #members order
-
assuming strings are wukong_encode’d and so shouldn’t be escaped or enclosed.
Why the “LINES STARTING BY” part? For map/reduce outputs that have many different objects jumbled together, you can just dump in the whole file, landing each object in its correct table.
216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 |
# File 'lib/wukong/schema.rb', line 216 def sql_load_mysql(filename=nil) filename ||= ":resource_name.tsv" filename.gsub!(/:resource_name/, self.table_name) str = [] # disable indexing during bulk load str << %Q{ALTER TABLE `#{self.table_name}` DISABLE KEYS; } # Bulk load the tab-separated-values file. str << %Q{LOAD DATA LOCAL INFILE '#{filename}'} str << %Q{ REPLACE INTO TABLE `#{self.table_name}` } str << %Q{ COLUMNS } str << %Q{ TERMINATED BY '\\t' } str << %Q{ OPTIONALLY ENCLOSED BY '' } str << %Q{ ESCAPED BY '' } str << %Q{ LINES STARTING BY '#{self.resource_name}' } str << %Q{ ( @dummy,\n } str << ' '+self.sql_members str << %Q{\n ); } # Re-enable indexing str << %Q{ALTER TABLE `#{self.table_name}` ENABLE KEYS ; } # Show it loaded correctly str << %Q{SELECT NOW(), COUNT(*), '#{self.table_name}' FROM `#{self.table_name}`; } str.join("\n") end |