Method: Wukong::Schema::ClassMethods#sql_load_mysql

Defined in:
lib/wukong/schema.rb

#sql_load_mysql(filename = nil) ⇒ Object

A mysql snippet to bulk load the tab-separated-values file emitted by a Wukong script.

Let’s say your class is ClickLog; its resource_name is “click_log” and thus its table_name is ‘click_logs’. sql_load_mysql will:

  • disable indexing on the table

  • import the file, replacing any existing rows. (Replacement is governed by primary key and unique index constraints – see the mysql docs).

  • re-enable indexing on that table

  • show the number of

The load portion will

  • Load into a table named click_logs

  • from a file named click_logs.tsv

  • where all rows have the string ‘click_logs’ in their first column

  • and all remaining fields in their #members order

  • assuming strings are wukong_encode’d and so shouldn’t be escaped or enclosed.

Why the “LINES STARTING BY” part? For map/reduce outputs that have many different objects jumbled together, you can just dump in the whole file, landing each object in its correct table.



216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
# File 'lib/wukong/schema.rb', line 216

def sql_load_mysql(filename=nil)
  filename ||= ":resource_name.tsv"
  filename.gsub!(/:resource_name/, self.table_name)
  str = []
  # disable indexing during bulk load
  str << %Q{ALTER TABLE            `#{self.table_name}` DISABLE KEYS; }
  # Bulk load the tab-separated-values file.
  str << %Q{LOAD DATA LOCAL INFILE '#{filename}'}
  str << %Q{  REPLACE INTO TABLE   `#{self.table_name}`    }
  str << %Q{  COLUMNS                                         }
  str << %Q{    TERMINATED BY           '\\t'                 }
  str << %Q{    OPTIONALLY ENCLOSED BY  ''                    }
  str << %Q{    ESCAPED BY              ''                    }
  str << %Q{  LINES STARTING BY     '#{self.resource_name}'   }
  str << %Q{  ( @dummy,\n }
  str << '    '+self.sql_members
  str << %Q{\n  ); }
  # Re-enable indexing
  str << %Q{ALTER TABLE `#{self.table_name}` ENABLE KEYS ; }
  # Show it loaded correctly
  str << %Q{SELECT NOW(), COUNT(*), '#{self.table_name}' FROM `#{self.table_name}`; }
  str.join("\n")
end