Class: Marty::DataImporter
- Inherits: Object
  - Object
    - Marty::DataImporter
- Defined in: lib/marty/data_importer.rb
Class Method Summary
- .do_import(klass, data, dt = 'infinity', cleaner_function = nil, validation_function = nil, col_sep = "\t", allow_dups = false, preprocess_function = nil) ⇒ Object
  Given a Mcfly klass and CSV data, import data into the database and report on affected rows.
- .do_import_summary(klass, data, dt = 'infinity', cleaner_function = nil, validation_function = nil, col_sep = "\t", allow_dups = false, preprocess_function = nil) ⇒ Object
  Perform cleaning and do_import, then summarize the results.
Class Method Details
.do_import(klass, data, dt = 'infinity', cleaner_function = nil, validation_function = nil, col_sep = "\t", allow_dups = false, preprocess_function = nil) ⇒ Object
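Given a Mcfly klass and CSV data, import data into the database and report on affected rows. The result is an array with one entry per data row. Each entry is a tuple of the form [tag, id], where tag is one of :same, :update, or :create and id is the id of the affected row. As the source below shows, a row whose values are all nil yields the bare symbol :blank instead of a tuple, and one [:clean, id] tuple is appended for each row that is deleted because the cleaner function returned its id but the import did not reference it.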
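To make the shape of the return value concrete, here is a minimal sketch that groups affected ids by tag. The `result` array is hypothetical sample data shaped like the tuples described above; it was not produced by a real import, and `ids_by_tag` is a name introduced only for this illustration.

```ruby
# Hypothetical result, shaped like the [tag, id] tuples described above.
result = [[:create, 101], [:same, 57], [:update, 58], [:create, 102]]

# Group the affected ids by their tag. The hash's default block creates
# an empty array the first time a tag is seen.
ids_by_tag = result.each_with_object(Hash.new { |h, k| h[k] = [] }) do |(tag, id), h|
  h[tag] << id
end

puts ids_by_tag.inspect
```

A caller could use a grouping like this to report which rows were newly created and which were left unchanged.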
# File 'lib/marty/data_importer.rb', line 46

def self.do_import(klass,
                   data,
                   dt = 'infinity',
                   cleaner_function = nil,
                   validation_function = nil,
                   col_sep = "\t",
                   allow_dups = false,
                   preprocess_function = nil
                  )

  parsed = data.is_a?(Array) ? data :
    CSV.new(data, headers: true, col_sep: col_sep)

  # run preprocessor
  parsed = klass.send(preprocess_function.to_sym, parsed) if
    preprocess_function

  klass.transaction do
    cleaner_ids = cleaner_function ? klass.send(cleaner_function.to_sym) : []

    raise "bad cleaner function result" unless
      cleaner_ids.all? {|id| id.is_a?(Integer) }

    eline = 0

    begin
      res = parsed.each_with_index.map do |row, line|
        eline = line

        # skip lines which are all nil
        next :blank if row.to_hash.values.none?

        Marty::DataConversion.create_or_update(klass, row, dt)
      end
    rescue => exc
      # to find problems with the importer, comment out the rescue block
      raise Marty::DataImporterError.new(exc.to_s, [eline])
    end

    ids = {}
    # raise an error if record referenced more than once.
    res.each_with_index do |(op, id), line|
      raise Marty::DataImporterError.
        new("record referenced more than once", [ids[id], line]) if
        op != :blank && ids.member?(id) && !allow_dups

      ids[id] = line
    end

    begin
      # Validate affected rows if necessary
      klass.send(validation_function.to_sym, ids.keys) if
        validation_function
    rescue => exc
      raise Marty::DataImporterError.new(exc.to_s, [])
    end

    remainder_ids = cleaner_ids - ids.keys

    raise Marty::DataImporterError.
      new("Missing import data. " +
          "Please provide header line and at least one data line.", [1]) if
      ids.keys.compact.count == 0

    klass.delete(remainder_ids)
    res + remainder_ids.map {|id| [:clean, id]}
  end
end
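The parsing and blank-row steps in the source above can be tried in isolation with Ruby's standard CSV library. The following is only a sketch: the input string and its column names are hypothetical, and the `:data` tag stands in for the call to Marty::DataConversion.create_or_update, which is not reproduced here.

```ruby
require 'csv'

# Hypothetical tab-separated input: a header line, a data row,
# a row whose fields are all empty, and another data row.
data = "name\tamount\nalpha\t10\n\t\nbeta\t20\n"

# Same parser options as do_import uses with its default col_sep.
rows = CSV.new(data, headers: true, col_sep: "\t")

# Unquoted empty fields parse as nil, so a row of empty fields has
# no truthy values and is tagged :blank, mirroring the check above.
tags = rows.map do |row|
  row.to_hash.values.none? ? :blank : :data
end

puts tags.inspect
```

Because headers: true is set, the first line is consumed as the header and never reaches the block, which is why do_import asks for a header line plus at least one data line.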
.do_import_summary(klass, data, dt = 'infinity', cleaner_function = nil, validation_function = nil, col_sep = "\t", allow_dups = false, preprocess_function = nil) ⇒ Object
Perform cleaning and do_import, then summarize the results as a Hash that maps each tag to the number of result entries carrying it.
# File 'lib/marty/data_importer.rb', line 16

def self.do_import_summary(klass,
                           data,
                           dt = 'infinity',
                           cleaner_function = nil,
                           validation_function = nil,
                           col_sep = "\t",
                           allow_dups = false,
                           preprocess_function = nil
                          )

  recs = self.do_import(klass,
                        data,
                        dt,
                        cleaner_function,
                        validation_function,
                        col_sep,
                        allow_dups,
                        preprocess_function,
                       )

  recs.each_with_object(Hash.new(0)) {|(op, id), h|
    h[op] += 1
  }
end
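The counting step at the end of do_import_summary can be run on its own. In this sketch `recs` is hypothetical sample data shaped like a do_import result, including a bare :blank entry and a [:clean, id] tuple; no database or Marty classes are involved.

```ruby
# Hypothetical do_import result: tuples plus one bare :blank symbol.
recs = [[:create, 1], [:update, 2], :blank, [:create, 3], [:clean, 4]]

# Hash.new(0) makes every missing key default to 0, so each tag can be
# incremented without initialization. Destructuring a bare symbol with
# (op, _id) assigns the symbol to op and nil to _id.
summary = recs.each_with_object(Hash.new(0)) { |(op, _id), h| h[op] += 1 }

puts summary.inspect
```

Note that blank rows are counted under :blank alongside the other tags, so the values of the summary sum to the number of entries in the do_import result.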